]>
Commit | Line | Data |
---|---|---|
23a5b65a | 1 | @c Copyright (C) 2004-2014 Free Software Foundation, Inc. |
6de9cd9a DN |
2 | @c This is part of the GCC manual. |
3 | @c For copying conditions, see the file gcc.texi. | |
4 | ||
5 | @c --------------------------------------------------------------------- | |
6 | @c Tree SSA | |
7 | @c --------------------------------------------------------------------- | |
8 | ||
9 | @node Tree SSA | |
e6c99067 | 10 | @chapter Analysis and Optimization of GIMPLE tuples |
6de9cd9a DN |
11 | @cindex Tree SSA |
12 | @cindex Optimization infrastructure for GIMPLE | |
13 | ||
14 | GCC uses three main intermediate languages to represent the program | |
15 | during compilation: GENERIC, GIMPLE and RTL@. GENERIC is a | |
16 | language-independent representation generated by each front end. It | |
17 | is used to serve as an interface between the parser and optimizer. | |
18 | GENERIC is a common representation that is able to represent programs | |
19 | written in all the languages supported by GCC@. | |
20 | ||
21 | GIMPLE and RTL are used to optimize the program. GIMPLE is used for | |
22 | target and language independent optimizations (e.g., inlining, | |
23 | constant propagation, tail call elimination, redundancy elimination, | |
24 | etc). Much like GENERIC, GIMPLE is a language independent, tree based | |
8a36672b | 25 | representation. However, it differs from GENERIC in that the GIMPLE |
6de9cd9a DN |
26 | grammar is more restrictive: expressions contain no more than 3 |
27 | operands (except function calls), it has no control flow structures | |
28 | and expressions with side-effects are only allowed on the right hand | |
29 | side of assignments. See the chapter describing GENERIC and GIMPLE | |
30 | for more details. | |
31 | ||
32 | This chapter describes the data structures and functions used in the | |
33 | GIMPLE optimizers (also known as ``tree optimizers'' or ``middle | |
34 | end''). In particular, it focuses on all the macros, data structures, | |
35 | functions and programming constructs needed to implement optimization | |
36 | passes for GIMPLE@. | |
37 | ||
38 | @menu | |
e6c99067 | 39 | * Annotations:: Attributes for variables. |
02a9370c | 40 | * SSA Operands:: SSA names referenced by GIMPLE statements. |
6ccde948 RW |
41 | * SSA:: Static Single Assignment representation. |
42 | * Alias analysis:: Representing aliased loads and stores. | |
4d7a65ea | 43 | * Memory model:: Memory model used by the middle-end. |
6de9cd9a DN |
44 | @end menu |
45 | ||
6de9cd9a DN |
46 | @node Annotations |
47 | @section Annotations | |
48 | @cindex annotations | |
49 | ||
e6c99067 DN |
50 | The optimizers need to associate attributes with variables during the |
51 | optimization process. For instance, we need to know whether a | |
52 | variable has aliases. All these attributes are stored in data | |
53 | structures called annotations which are then linked to the field | |
54 | @code{ann} in @code{struct tree_common}. | |
6de9cd9a | 55 | |
e6c99067 | 56 | Presently, we define annotations for variables (@code{var_ann_t}). |
6de9cd9a DN |
57 | Annotations are defined and documented in @file{tree-flow.h}. |
58 | ||
59 | ||
e6c99067 DN |
60 | @node SSA Operands |
61 | @section SSA Operands | |
6de9cd9a DN |
62 | @cindex operands |
63 | @cindex virtual operands | |
64 | @cindex real operands | |
43ae1e1c | 65 | @findex update_stmt |
6de9cd9a DN |
66 | |
67 | Almost every GIMPLE statement will contain a reference to a variable | |
68 | or memory location. Since statements come in different shapes and | |
69 | sizes, their operands are going to be located at various spots inside | |
70 | the statement's tree. To facilitate access to the statement's | |
f47c96aa AM |
71 | operands, they are organized into lists associated inside each |
72 | statement's annotation. Each element in an operand list is a pointer | |
6de9cd9a DN |
73 | to a @code{VAR_DECL}, @code{PARM_DECL} or @code{SSA_NAME} tree node. |
74 | This provides a very convenient way of examining and replacing | |
f0eb93a8 | 75 | operands. |
6de9cd9a DN |
76 | |
77 | Data flow analysis and optimization is done on all tree nodes | |
78 | representing variables. Any node for which @code{SSA_VAR_P} returns | |
79 | nonzero is considered when scanning statement operands. However, not | |
80 | all @code{SSA_VAR_P} variables are processed in the same way. For the | |
81 | purposes of optimization, we need to distinguish between references to | |
82 | local scalar variables and references to globals, statics, structures, | |
83 | arrays, aliased variables, etc. The reason is simple, the compiler | |
84 | can gather complete data flow information for a local scalar. On the | |
85 | other hand, a global variable may be modified by a function call, it | |
86 | may not be possible to keep track of all the elements of an array or | |
87 | the fields of a structure, etc. | |
88 | ||
89 | The operand scanner gathers two kinds of operands: @dfn{real} and | |
90 | @dfn{virtual}. An operand for which @code{is_gimple_reg} returns true | |
91 | is considered real, otherwise it is a virtual operand. We also | |
92 | distinguish between uses and definitions. An operand is used if its | |
93 | value is loaded by the statement (e.g., the operand at the RHS of an | |
94 | assignment). If the statement assigns a new value to the operand, the | |
95 | operand is considered a definition (e.g., the operand at the LHS of | |
96 | an assignment). | |
97 | ||
98 | Virtual and real operands also have very different data flow | |
99 | properties. Real operands are unambiguous references to the | |
100 | full object that they represent. For instance, given | |
101 | ||
102 | @smallexample | |
103 | @{ | |
104 | int a, b; | |
105 | a = b | |
106 | @} | |
107 | @end smallexample | |
108 | ||
109 | Since @code{a} and @code{b} are non-aliased locals, the statement | |
110 | @code{a = b} will have one real definition and one real use because | |
f133d4a2 EF |
111 | variable @code{a} is completely modified with the contents of |
112 | variable @code{b}. Real definition are also known as @dfn{killing | |
113 | definitions}. Similarly, the use of @code{b} reads all its bits. | |
6de9cd9a | 114 | |
a32b97a2 | 115 | In contrast, virtual operands are used with variables that can have |
8a36672b JM |
116 | a partial or ambiguous reference. This includes structures, arrays, |
117 | globals, and aliased variables. In these cases, we have two types of | |
118 | definitions. For globals, structures, and arrays, we can determine from | |
f0eb93a8 | 119 | a statement whether a variable of these types has a killing definition. |
a32b97a2 | 120 | If the variable does, then the statement is marked as having a |
8a36672b | 121 | @dfn{must definition} of that variable. However, if a statement is only |
431ae0bf | 122 | defining a part of the variable (i.e.@: a field in a structure), or if we |
a32b97a2 | 123 | know that a statement might define the variable but we cannot say for sure, |
f0eb93a8 | 124 | then we mark that statement as having a @dfn{may definition}. For |
a32b97a2 | 125 | instance, given |
6de9cd9a DN |
126 | |
127 | @smallexample | |
128 | @{ | |
129 | int a, b, *p; | |
130 | ||
923158be | 131 | if (@dots{}) |
6de9cd9a DN |
132 | p = &a; |
133 | else | |
134 | p = &b; | |
135 | *p = 5; | |
136 | return *p; | |
137 | @} | |
138 | @end smallexample | |
139 | ||
140 | The assignment @code{*p = 5} may be a definition of @code{a} or | |
141 | @code{b}. If we cannot determine statically where @code{p} is | |
142 | pointing to at the time of the store operation, we create virtual | |
143 | definitions to mark that statement as a potential definition site for | |
144 | @code{a} and @code{b}. Memory loads are similarly marked with virtual | |
145 | use operands. Virtual operands are shown in tree dumps right before | |
146 | the statement that contains them. To request a tree dump with virtual | |
147 | operands, use the @option{-vops} option to @option{-fdump-tree}: | |
148 | ||
149 | @smallexample | |
150 | @{ | |
151 | int a, b, *p; | |
152 | ||
923158be | 153 | if (@dots{}) |
6de9cd9a DN |
154 | p = &a; |
155 | else | |
156 | p = &b; | |
38635499 DN |
157 | # a = VDEF <a> |
158 | # b = VDEF <b> | |
6de9cd9a DN |
159 | *p = 5; |
160 | ||
161 | # VUSE <a> | |
162 | # VUSE <b> | |
163 | return *p; | |
164 | @} | |
165 | @end smallexample | |
166 | ||
38635499 | 167 | Notice that @code{VDEF} operands have two copies of the referenced |
6de9cd9a DN |
168 | variable. This indicates that this is not a killing definition of |
169 | that variable. In this case we refer to it as a @dfn{may definition} | |
170 | or @dfn{aliased store}. The presence of the second copy of the | |
38635499 | 171 | variable in the @code{VDEF} operand will become important when the |
6de9cd9a DN |
172 | function is converted into SSA form. This will be used to link all |
173 | the non-killing definitions to prevent optimizations from making | |
174 | incorrect assumptions about them. | |
175 | ||
1e6a5d3c KH |
176 | Operands are updated as soon as the statement is finished via a call |
177 | to @code{update_stmt}. If statement elements are changed via | |
178 | @code{SET_USE} or @code{SET_DEF}, then no further action is required | |
1eaf20ec | 179 | (i.e., those macros take care of updating the statement). If changes |
1e6a5d3c | 180 | are made by manipulating the statement's tree directly, then a call |
c3689350 | 181 | must be made to @code{update_stmt} when complete. Calling one of the |
1e6a5d3c KH |
182 | @code{bsi_insert} routines or @code{bsi_replace} performs an implicit |
183 | call to @code{update_stmt}. | |
6de9cd9a | 184 | |
f47c96aa | 185 | @subsection Operand Iterators And Access Routines |
ff2ce160 | 186 | @cindex Operand Iterators |
f47c96aa AM |
187 | @cindex Operand Access Routines |
188 | ||
189 | Operands are collected by @file{tree-ssa-operands.c}. They are stored | |
190 | inside each statement's annotation and can be accessed through either the | |
191 | operand iterators or an access routine. | |
192 | ||
193 | The following access routines are available for examining operands: | |
194 | ||
195 | @enumerate | |
ff2ce160 MS |
196 | @item @code{SINGLE_SSA_@{USE,DEF,TREE@}_OPERAND}: These accessors will return |
197 | NULL unless there is exactly one operand matching the specified flags. If | |
198 | there is exactly one operand, the operand is returned as either a @code{tree}, | |
f47c96aa AM |
199 | @code{def_operand_p}, or @code{use_operand_p}. |
200 | ||
201 | @smallexample | |
202 | tree t = SINGLE_SSA_TREE_OPERAND (stmt, flags); | |
203 | use_operand_p u = SINGLE_SSA_USE_OPERAND (stmt, SSA_ALL_VIRTUAL_USES); | |
204 | def_operand_p d = SINGLE_SSA_DEF_OPERAND (stmt, SSA_OP_ALL_DEFS); | |
205 | @end smallexample | |
206 | ||
ff2ce160 | 207 | @item @code{ZERO_SSA_OPERANDS}: This macro returns true if there are no |
f47c96aa AM |
208 | operands matching the specified flags. |
209 | ||
210 | @smallexample | |
211 | if (ZERO_SSA_OPERANDS (stmt, SSA_OP_ALL_VIRTUALS)) | |
212 | return; | |
213 | @end smallexample | |
aca2bd7c | 214 | |
ff2ce160 MS |
215 | @item @code{NUM_SSA_OPERANDS}: This macro Returns the number of operands |
216 | matching 'flags'. This actually executes a loop to perform the count, so | |
f47c96aa AM |
217 | only use this if it is really needed. |
218 | ||
219 | @smallexample | |
220 | int count = NUM_SSA_OPERANDS (stmt, flags) | |
221 | @end smallexample | |
222 | @end enumerate | |
223 | ||
224 | ||
225 | If you wish to iterate over some or all operands, use the | |
226 | @code{FOR_EACH_SSA_@{USE,DEF,TREE@}_OPERAND} iterator. For example, to print | |
227 | all the operands for a statement: | |
aca2bd7c AM |
228 | |
229 | @smallexample | |
230 | void | |
231 | print_ops (tree stmt) | |
232 | @{ | |
233 | ssa_op_iter; | |
234 | tree var; | |
235 | ||
aca2bd7c | 236 | FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_ALL_OPERANDS) |
f47c96aa | 237 | print_generic_expr (stderr, var, TDF_SLIM); |
aca2bd7c AM |
238 | @} |
239 | @end smallexample | |
240 | ||
241 | ||
f47c96aa AM |
242 | How to choose the appropriate iterator: |
243 | ||
aca2bd7c AM |
244 | @enumerate |
245 | @item Determine whether you are need to see the operand pointers, or just the | |
6ccde948 | 246 | trees, and choose the appropriate macro: |
aca2bd7c AM |
247 | |
248 | @smallexample | |
f0eb93a8 JM |
249 | Need Macro: |
250 | ---- ------- | |
251 | use_operand_p FOR_EACH_SSA_USE_OPERAND | |
252 | def_operand_p FOR_EACH_SSA_DEF_OPERAND | |
253 | tree FOR_EACH_SSA_TREE_OPERAND | |
aca2bd7c AM |
254 | @end smallexample |
255 | ||
256 | @item You need to declare a variable of the type you are interested | |
6ccde948 RW |
257 | in, and an ssa_op_iter structure which serves as the loop controlling |
258 | variable. | |
aca2bd7c AM |
259 | |
260 | @item Determine which operands you wish to use, and specify the flags of | |
6ccde948 RW |
261 | those you are interested in. They are documented in |
262 | @file{tree-ssa-operands.h}: | |
aca2bd7c AM |
263 | |
264 | @smallexample | |
12bcfaa1 JM |
265 | #define SSA_OP_USE 0x01 /* @r{Real USE operands.} */ |
266 | #define SSA_OP_DEF 0x02 /* @r{Real DEF operands.} */ | |
267 | #define SSA_OP_VUSE 0x04 /* @r{VUSE operands.} */ | |
50e6a148 | 268 | #define SSA_OP_VDEF 0x08 /* @r{VDEF operands.} */ |
12bcfaa1 JM |
269 | |
270 | /* @r{These are commonly grouped operand flags.} */ | |
50e6a148 AH |
271 | #define SSA_OP_VIRTUAL_USES (SSA_OP_VUSE) |
272 | #define SSA_OP_VIRTUAL_DEFS (SSA_OP_VDEF) | |
273 | #define SSA_OP_ALL_VIRTUALS (SSA_OP_VIRTUAL_USES | SSA_OP_VIRTUAL_DEFS) | |
274 | #define SSA_OP_ALL_USES (SSA_OP_VIRTUAL_USES | SSA_OP_USE) | |
275 | #define SSA_OP_ALL_DEFS (SSA_OP_VIRTUAL_DEFS | SSA_OP_DEF) | |
276 | #define SSA_OP_ALL_OPERANDS (SSA_OP_ALL_USES | SSA_OP_ALL_DEFS) | |
aca2bd7c AM |
277 | @end smallexample |
278 | @end enumerate | |
279 | ||
280 | So if you want to look at the use pointers for all the @code{USE} and | |
f0eb93a8 | 281 | @code{VUSE} operands, you would do something like: |
aca2bd7c AM |
282 | |
283 | @smallexample | |
f0eb93a8 JM |
284 | use_operand_p use_p; |
285 | ssa_op_iter iter; | |
aca2bd7c AM |
286 | |
287 | FOR_EACH_SSA_USE_OPERAND (use_p, stmt, iter, (SSA_OP_USE | SSA_OP_VUSE)) | |
f0eb93a8 JM |
288 | @{ |
289 | process_use_ptr (use_p); | |
290 | @} | |
aca2bd7c AM |
291 | @end smallexample |
292 | ||
f47c96aa | 293 | The @code{TREE} macro is basically the same as the @code{USE} and |
aca2bd7c | 294 | @code{DEF} macros, only with the use or def dereferenced via |
8a36672b | 295 | @code{USE_FROM_PTR (use_p)} and @code{DEF_FROM_PTR (def_p)}. Since we |
f0eb93a8 | 296 | aren't using operand pointers, use and defs flags can be mixed. |
aca2bd7c AM |
297 | |
298 | @smallexample | |
299 | tree var; | |
300 | ssa_op_iter iter; | |
301 | ||
e9dc9c4e | 302 | FOR_EACH_SSA_TREE_OPERAND (var, stmt, iter, SSA_OP_VUSE) |
aca2bd7c AM |
303 | @{ |
304 | print_generic_expr (stderr, var, TDF_SLIM); | |
305 | @} | |
306 | @end smallexample | |
307 | ||
38635499 | 308 | @code{VDEF}s are broken into two flags, one for the |
e9dc9c4e | 309 | @code{DEF} portion (@code{SSA_OP_VDEF}) and one for the USE portion |
50e6a148 | 310 | (@code{SSA_OP_VUSE}). |
aca2bd7c | 311 | |
50e6a148 AH |
312 | There are many examples in the code, in addition to the documentation |
313 | in @file{tree-ssa-operands.h} and @file{ssa-iterators.h}. | |
aca2bd7c | 314 | |
f47c96aa AM |
315 | There are also a couple of variants on the stmt iterators regarding PHI |
316 | nodes. | |
317 | ||
ff2ce160 MS |
318 | @code{FOR_EACH_PHI_ARG} Works exactly like |
319 | @code{FOR_EACH_SSA_USE_OPERAND}, except it works over @code{PHI} arguments | |
f47c96aa AM |
320 | instead of statement operands. |
321 | ||
322 | @smallexample | |
323 | /* Look at every virtual PHI use. */ | |
324 | FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_VIRTUAL_USES) | |
325 | @{ | |
326 | my_code; | |
327 | @} | |
328 | ||
329 | /* Look at every real PHI use. */ | |
330 | FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_USES) | |
331 | my_code; | |
332 | ||
d1facce0 | 333 | /* Look at every PHI use. */ |
f47c96aa AM |
334 | FOR_EACH_PHI_ARG (use_p, phi_stmt, iter, SSA_OP_ALL_USES) |
335 | my_code; | |
336 | @end smallexample | |
337 | ||
ff2ce160 | 338 | @code{FOR_EACH_PHI_OR_STMT_@{USE,DEF@}} works exactly like |
f47c96aa AM |
339 | @code{FOR_EACH_SSA_@{USE,DEF@}_OPERAND}, except it will function on |
340 | either a statement or a @code{PHI} node. These should be used when it is | |
ff2ce160 | 341 | appropriate but they are not quite as efficient as the individual |
f47c96aa AM |
342 | @code{FOR_EACH_PHI} and @code{FOR_EACH_SSA} routines. |
343 | ||
344 | @smallexample | |
345 | FOR_EACH_PHI_OR_STMT_USE (use_operand_p, stmt, iter, flags) | |
346 | @{ | |
347 | my_code; | |
348 | @} | |
349 | ||
350 | FOR_EACH_PHI_OR_STMT_DEF (def_operand_p, phi, iter, flags) | |
351 | @{ | |
352 | my_code; | |
353 | @} | |
354 | @end smallexample | |
6de9cd9a | 355 | |
43ae1e1c AM |
356 | @subsection Immediate Uses |
357 | @cindex Immediate Uses | |
358 | ||
ff2ce160 | 359 | Immediate use information is now always available. Using the immediate use |
43ae1e1c | 360 | iterators, you may examine every use of any @code{SSA_NAME}. For instance, |
6c00f606 AM |
361 | to change each use of @code{ssa_var} to @code{ssa_var2} and call fold_stmt on |
362 | each stmt after that is done: | |
43ae1e1c AM |
363 | |
364 | @smallexample | |
c3689350 AM |
365 | use_operand_p imm_use_p; |
366 | imm_use_iterator iterator; | |
6c00f606 | 367 | tree ssa_var, stmt; |
c3689350 | 368 | |
6c00f606 AM |
369 | |
370 | FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var) | |
371 | @{ | |
372 | FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator) | |
373 | SET_USE (imm_use_p, ssa_var_2); | |
374 | fold_stmt (stmt); | |
375 | @} | |
43ae1e1c AM |
376 | @end smallexample |
377 | ||
1eaf20ec RW |
378 | There are 2 iterators which can be used. @code{FOR_EACH_IMM_USE_FAST} is |
379 | used when the immediate uses are not changed, i.e., you are looking at the | |
ff2ce160 | 380 | uses, but not setting them. |
43ae1e1c | 381 | |
ff2ce160 MS |
382 | If they do get changed, then care must be taken that things are not changed |
383 | under the iterators, so use the @code{FOR_EACH_IMM_USE_STMT} and | |
384 | @code{FOR_EACH_IMM_USE_ON_STMT} iterators. They attempt to preserve the | |
385 | sanity of the use list by moving all the uses for a statement into | |
386 | a controlled position, and then iterating over those uses. Then the | |
6c00f606 | 387 | optimization can manipulate the stmt when all the uses have been |
ff2ce160 MS |
388 | processed. This is a little slower than the FAST version since it adds a |
389 | placeholder element and must sort through the list a bit for each statement. | |
390 | This placeholder element must be also be removed if the loop is | |
391 | terminated early. The macro @code{BREAK_FROM_IMM_USE_SAFE} is provided | |
6c00f606 | 392 | to do this : |
43ae1e1c AM |
393 | |
394 | @smallexample | |
6c00f606 | 395 | FOR_EACH_IMM_USE_STMT (stmt, iterator, ssa_var) |
43ae1e1c | 396 | @{ |
6c00f606 | 397 | if (stmt == last_stmt) |
43ae1e1c | 398 | BREAK_FROM_SAFE_IMM_USE (iter); |
6c00f606 AM |
399 | |
400 | FOR_EACH_IMM_USE_ON_STMT (imm_use_p, iterator) | |
401 | SET_USE (imm_use_p, ssa_var_2); | |
402 | fold_stmt (stmt); | |
43ae1e1c AM |
403 | @} |
404 | @end smallexample | |
405 | ||
406 | There are checks in @code{verify_ssa} which verify that the immediate use list | |
ff2ce160 MS |
407 | is up to date, as well as checking that an optimization didn't break from the |
408 | loop without using this macro. It is safe to simply 'break'; from a | |
43ae1e1c AM |
409 | @code{FOR_EACH_IMM_USE_FAST} traverse. |
410 | ||
411 | Some useful functions and macros: | |
412 | @enumerate | |
413 | @item @code{has_zero_uses (ssa_var)} : Returns true if there are no uses of | |
414 | @code{ssa_var}. | |
ff2ce160 | 415 | @item @code{has_single_use (ssa_var)} : Returns true if there is only a |
43ae1e1c AM |
416 | single use of @code{ssa_var}. |
417 | @item @code{single_imm_use (ssa_var, use_operand_p *ptr, tree *stmt)} : | |
418 | Returns true if there is only a single use of @code{ssa_var}, and also returns | |
d1facce0 | 419 | the use pointer and statement it occurs in, in the second and third parameters. |
43ae1e1c | 420 | @item @code{num_imm_uses (ssa_var)} : Returns the number of immediate uses of |
c3689350 | 421 | @code{ssa_var}. It is better not to use this if possible since it simply |
43ae1e1c AM |
422 | utilizes a loop to count the uses. |
423 | @item @code{PHI_ARG_INDEX_FROM_USE (use_p)} : Given a use within a @code{PHI} | |
424 | node, return the index number for the use. An assert is triggered if the use | |
425 | isn't located in a @code{PHI} node. | |
c3689350 | 426 | @item @code{USE_STMT (use_p)} : Return the statement a use occurs in. |
43ae1e1c AM |
427 | @end enumerate |
428 | ||
429 | Note that uses are not put into an immediate use list until their statement is | |
ff2ce160 | 430 | actually inserted into the instruction stream via a @code{bsi_*} routine. |
43ae1e1c | 431 | |
ff2ce160 MS |
432 | It is also still possible to utilize lazy updating of statements, but this |
433 | should be used only when absolutely required. Both alias analysis and the | |
434 | dominator optimizations currently do this. | |
43ae1e1c | 435 | |
ff2ce160 | 436 | When lazy updating is being used, the immediate use information is out of date |
c3689350 | 437 | and cannot be used reliably. Lazy updating is achieved by simply marking |
ff2ce160 MS |
438 | statements modified via calls to @code{mark_stmt_modified} instead of |
439 | @code{update_stmt}. When lazy updating is no longer required, all the | |
440 | modified statements must have @code{update_stmt} called in order to bring them | |
441 | up to date. This must be done before the optimization is finished, or | |
c3689350 | 442 | @code{verify_ssa} will trigger an abort. |
43ae1e1c AM |
443 | |
444 | This is done with a simple loop over the instruction stream: | |
445 | @smallexample | |
446 | block_stmt_iterator bsi; | |
447 | basic_block bb; | |
448 | FOR_EACH_BB (bb) | |
449 | @{ | |
450 | for (bsi = bsi_start (bb); !bsi_end_p (bsi); bsi_next (&bsi)) | |
451 | update_stmt_if_modified (bsi_stmt (bsi)); | |
452 | @} | |
453 | @end smallexample | |
454 | ||
6de9cd9a DN |
455 | @node SSA |
456 | @section Static Single Assignment | |
457 | @cindex SSA | |
458 | @cindex static single assignment | |
459 | ||
460 | Most of the tree optimizers rely on the data flow information provided | |
461 | by the Static Single Assignment (SSA) form. We implement the SSA form | |
462 | as described in @cite{R. Cytron, J. Ferrante, B. Rosen, M. Wegman, and | |
8a36672b JM |
463 | K. Zadeck. Efficiently Computing Static Single Assignment Form and the |
464 | Control Dependence Graph. ACM Transactions on Programming Languages | |
6de9cd9a DN |
465 | and Systems, 13(4):451-490, October 1991}. |
466 | ||
467 | The SSA form is based on the premise that program variables are | |
468 | assigned in exactly one location in the program. Multiple assignments | |
469 | to the same variable create new versions of that variable. Naturally, | |
470 | actual programs are seldom in SSA form initially because variables | |
471 | tend to be assigned multiple times. The compiler modifies the program | |
472 | representation so that every time a variable is assigned in the code, | |
473 | a new version of the variable is created. Different versions of the | |
474 | same variable are distinguished by subscripting the variable name with | |
475 | its version number. Variables used in the right-hand side of | |
476 | expressions are renamed so that their version number matches that of | |
477 | the most recent assignment. | |
478 | ||
479 | We represent variable versions using @code{SSA_NAME} nodes. The | |
480 | renaming process in @file{tree-ssa.c} wraps every real and | |
481 | virtual operand with an @code{SSA_NAME} node which contains | |
482 | the version number and the statement that created the | |
483 | @code{SSA_NAME}. Only definitions and virtual definitions may | |
484 | create new @code{SSA_NAME} nodes. | |
485 | ||
b814cc0a BF |
486 | @cindex PHI nodes |
487 | Sometimes, flow of control makes it impossible to determine the | |
6de9cd9a DN |
488 | most recent version of a variable. In these cases, the compiler |
489 | inserts an artificial definition for that variable called | |
490 | @dfn{PHI function} or @dfn{PHI node}. This new definition merges | |
491 | all the incoming versions of the variable to create a new name | |
492 | for it. For instance, | |
493 | ||
494 | @smallexample | |
923158be | 495 | if (@dots{}) |
6de9cd9a | 496 | a_1 = 5; |
923158be | 497 | else if (@dots{}) |
6de9cd9a DN |
498 | a_2 = 2; |
499 | else | |
500 | a_3 = 13; | |
501 | ||
502 | # a_4 = PHI <a_1, a_2, a_3> | |
503 | return a_4; | |
504 | @end smallexample | |
505 | ||
506 | Since it is not possible to determine which of the three branches | |
507 | will be taken at runtime, we don't know which of @code{a_1}, | |
508 | @code{a_2} or @code{a_3} to use at the return statement. So, the | |
509 | SSA renamer creates a new version @code{a_4} which is assigned | |
510 | the result of ``merging'' @code{a_1}, @code{a_2} and @code{a_3}. | |
511 | Hence, PHI nodes mean ``one of these operands. I don't know | |
512 | which''. | |
513 | ||
c960732f | 514 | The following functions can be used to examine PHI nodes |
6de9cd9a | 515 | |
c960732f | 516 | @defun gimple_phi_result (@var{phi}) |
6de9cd9a | 517 | Returns the @code{SSA_NAME} created by PHI node @var{phi} (i.e., |
8a36672b | 518 | @var{phi}'s LHS)@. |
c960732f | 519 | @end defun |
6de9cd9a | 520 | |
c960732f | 521 | @defun gimple_phi_num_args (@var{phi}) |
6de9cd9a DN |
522 | Returns the number of arguments in @var{phi}. This number is exactly |
523 | the number of incoming edges to the basic block holding @var{phi}@. | |
c960732f | 524 | @end defun |
6de9cd9a | 525 | |
c960732f ML |
526 | @defun gimple_phi_arg (@var{phi}, @var{i}) |
527 | Returns @var{i}th argument of @var{phi}@. | |
528 | @end defun | |
6de9cd9a | 529 | |
c960732f | 530 | @defun gimple_phi_arg_edge (@var{phi}, @var{i}) |
6de9cd9a | 531 | Returns the incoming edge for the @var{i}th argument of @var{phi}. |
c960732f | 532 | @end defun |
6de9cd9a | 533 | |
c960732f | 534 | @defun gimple_phi_arg_def (@var{phi}, @var{i}) |
6de9cd9a | 535 | Returns the @code{SSA_NAME} for the @var{i}th argument of @var{phi}. |
c960732f | 536 | @end defun |
6de9cd9a DN |
537 | |
538 | ||
539 | @subsection Preserving the SSA form | |
0bca51f0 | 540 | @findex update_ssa |
6de9cd9a DN |
541 | @cindex preserving SSA form |
542 | Some optimization passes make changes to the function that | |
543 | invalidate the SSA property. This can happen when a pass has | |
0bca51f0 DN |
544 | added new symbols or changed the program so that variables that |
545 | were previously aliased aren't anymore. Whenever something like this | |
ff2ce160 | 546 | happens, the affected symbols must be renamed into SSA form again. |
0bca51f0 DN |
547 | Transformations that emit new code or replicate existing statements |
548 | will also need to update the SSA form@. | |
549 | ||
550 | Since GCC implements two different SSA forms for register and virtual | |
551 | variables, keeping the SSA form up to date depends on whether you are | |
552 | updating register or virtual names. In both cases, the general idea | |
553 | behind incremental SSA updates is similar: when new SSA names are | |
554 | created, they typically are meant to replace other existing names in | |
555 | the program@. | |
556 | ||
557 | For instance, given the following code: | |
558 | ||
559 | @smallexample | |
6ccde948 RW |
560 | 1 L0: |
561 | 2 x_1 = PHI (0, x_5) | |
562 | 3 if (x_1 < 10) | |
563 | 4 if (x_1 > 7) | |
564 | 5 y_2 = 0 | |
565 | 6 else | |
566 | 7 y_3 = x_1 + x_7 | |
567 | 8 endif | |
568 | 9 x_5 = x_1 + 1 | |
0bca51f0 | 569 | 10 goto L0; |
6ccde948 | 570 | 11 endif |
0bca51f0 DN |
571 | @end smallexample |
572 | ||
573 | Suppose that we insert new names @code{x_10} and @code{x_11} (lines | |
574 | @code{4} and @code{8})@. | |
575 | ||
576 | @smallexample | |
6ccde948 RW |
577 | 1 L0: |
578 | 2 x_1 = PHI (0, x_5) | |
579 | 3 if (x_1 < 10) | |
580 | 4 x_10 = @dots{} | |
581 | 5 if (x_1 > 7) | |
582 | 6 y_2 = 0 | |
583 | 7 else | |
584 | 8 x_11 = @dots{} | |
585 | 9 y_3 = x_1 + x_7 | |
586 | 10 endif | |
587 | 11 x_5 = x_1 + 1 | |
588 | 12 goto L0; | |
589 | 13 endif | |
0bca51f0 DN |
590 | @end smallexample |
591 | ||
592 | We want to replace all the uses of @code{x_1} with the new definitions | |
593 | of @code{x_10} and @code{x_11}. Note that the only uses that should | |
594 | be replaced are those at lines @code{5}, @code{9} and @code{11}. | |
595 | Also, the use of @code{x_7} at line @code{9} should @emph{not} be | |
596 | replaced (this is why we cannot just mark symbol @code{x} for | |
597 | renaming)@. | |
598 | ||
599 | Additionally, we may need to insert a PHI node at line @code{11} | |
600 | because that is a merge point for @code{x_10} and @code{x_11}. So the | |
601 | use of @code{x_1} at line @code{11} will be replaced with the new PHI | |
602 | node. The insertion of PHI nodes is optional. They are not strictly | |
603 | necessary to preserve the SSA form, and depending on what the caller | |
604 | inserted, they may not even be useful for the optimizers@. | |
605 | ||
606 | Updating the SSA form is a two step process. First, the pass has to | |
607 | identify which names need to be updated and/or which symbols need to | |
608 | be renamed into SSA form for the first time. When new names are | |
609 | introduced to replace existing names in the program, the mapping | |
610 | between the old and the new names are registered by calling | |
611 | @code{register_new_name_mapping} (note that if your pass creates new | |
612 | code by duplicating basic blocks, the call to @code{tree_duplicate_bb} | |
1491b564 | 613 | will set up the necessary mappings automatically). |
0bca51f0 DN |
614 | |
615 | After the replacement mappings have been registered and new symbols | |
616 | marked for renaming, a call to @code{update_ssa} makes the registered | |
617 | changes. This can be done with an explicit call or by creating | |
618 | @code{TODO} flags in the @code{tree_opt_pass} structure for your pass. | |
0fa2e4df | 619 | There are several @code{TODO} flags that control the behavior of |
0bca51f0 DN |
620 | @code{update_ssa}: |
621 | ||
622 | @itemize @bullet | |
623 | @item @code{TODO_update_ssa}. Update the SSA form inserting PHI nodes | |
6ccde948 RW |
624 | for newly exposed symbols and virtual names marked for updating. |
625 | When updating real names, only insert PHI nodes for a real name | |
626 | @code{O_j} in blocks reached by all the new and old definitions for | |
627 | @code{O_j}. If the iterated dominance frontier for @code{O_j} | |
628 | is not pruned, we may end up inserting PHI nodes in blocks that | |
629 | have one or more edges with no incoming definition for | |
630 | @code{O_j}. This would lead to uninitialized warnings for | |
631 | @code{O_j}'s symbol@. | |
0bca51f0 DN |
632 | |
633 | @item @code{TODO_update_ssa_no_phi}. Update the SSA form without | |
6ccde948 RW |
634 | inserting any new PHI nodes at all. This is used by passes that |
635 | have either inserted all the PHI nodes themselves or passes that | |
636 | need only to patch use-def and def-def chains for virtuals | |
637 | (e.g., DCE)@. | |
0bca51f0 DN |
638 | |
639 | ||
640 | @item @code{TODO_update_ssa_full_phi}. Insert PHI nodes everywhere | |
6ccde948 RW |
641 | they are needed. No pruning of the IDF is done. This is used |
642 | by passes that need the PHI nodes for @code{O_j} even if it | |
643 | means that some arguments will come from the default definition | |
644 | of @code{O_j}'s symbol (e.g., @code{pass_linear_transform})@. | |
0bca51f0 | 645 | |
6ccde948 RW |
646 | WARNING: If you need to use this flag, chances are that your |
647 | pass may be doing something wrong. Inserting PHI nodes for an | |
648 | old name where not all edges carry a new replacement may lead to | |
649 | silent codegen errors or spurious uninitialized warnings@. | |
0bca51f0 DN |
650 | |
651 | @item @code{TODO_update_ssa_only_virtuals}. Passes that update the | |
6ccde948 RW |
652 | SSA form on their own may want to delegate the updating of |
653 | virtual names to the generic updater. Since FUD chains are | |
654 | easier to maintain, this simplifies the work they need to do. | |
655 | NOTE: If this flag is used, any OLD->NEW mappings for real names | |
656 | are explicitly destroyed and only the symbols marked for | |
657 | renaming are processed@. | |
0bca51f0 DN |
658 | @end itemize |
659 | ||
d9d93d96 DB |
660 | @subsection Preserving the virtual SSA form |
661 | @cindex preserving virtual SSA form | |
662 | ||
663 | The virtual SSA form is harder to preserve than the non-virtual SSA form | |
664 | mainly because the set of virtual operands for a statement may change at | |
38635499 DN |
665 | what some would consider unexpected times. In general, statement |
666 | modifications should be bracketed between calls to | |
667 | @code{push_stmt_changes} and @code{pop_stmt_changes}. For example, | |
668 | ||
669 | @smallexample | |
670 | munge_stmt (tree stmt) | |
671 | @{ | |
672 | push_stmt_changes (&stmt); | |
923158be | 673 | @dots{} rewrite STMT @dots{} |
38635499 DN |
674 | pop_stmt_changes (&stmt); |
675 | @} | |
676 | @end smallexample | |
677 | ||
678 | The call to @code{push_stmt_changes} saves the current state of the | |
679 | statement operands and the call to @code{pop_stmt_changes} compares | |
680 | the saved state with the current one and does the appropriate symbol | |
681 | marking for the SSA renamer. | |
682 | ||
683 | It is possible to modify several statements at a time, provided that | |
684 | @code{push_stmt_changes} and @code{pop_stmt_changes} are called in | |
685 | LIFO order, as when processing a stack of statements. | |
686 | ||
687 | Additionally, if the pass discovers that it did not need to make | |
688 | changes to the statement after calling @code{push_stmt_changes}, it | |
689 | can simply discard the topmost change buffer by calling | |
690 | @code{discard_stmt_changes}. This will avoid the expensive operand | |
691 | re-scan operation and the buffer comparison that determines if symbols | |
692 | need to be marked for renaming. | |
6de9cd9a DN |
693 | |
694 | @subsection Examining @code{SSA_NAME} nodes | |
695 | @cindex examining SSA_NAMEs | |
696 | ||
697 | The following macros can be used to examine @code{SSA_NAME} nodes | |
698 | ||
699 | @defmac SSA_NAME_DEF_STMT (@var{var}) | |
700 | Returns the statement @var{s} that creates the @code{SSA_NAME} | |
701 | @var{var}. If @var{s} is an empty statement (i.e., @code{IS_EMPTY_STMT | |
702 | (@var{s})} returns @code{true}), it means that the first reference to | |
703 | this variable is a USE or a VUSE@. | |
704 | @end defmac | |
705 | ||
706 | @defmac SSA_NAME_VERSION (@var{var}) | |
707 | Returns the version number of the @code{SSA_NAME} object @var{var}. | |
708 | @end defmac | |
709 | ||
710 | ||
6de9cd9a DN |
711 | @subsection Walking the dominator tree |
712 | ||
713 | @deftypefn {Tree SSA function} void walk_dominator_tree (@var{walk_data}, @var{bb}) | |
714 | ||
715 | This function walks the dominator tree for the current CFG calling a | |
716 | set of callback functions defined in @var{struct dom_walk_data} in | |
717 | @file{domwalk.h}. The call back functions you need to define give you | |
718 | hooks to execute custom code at various points during traversal: | |
719 | ||
720 | @enumerate | |
721 | @item Once to initialize any local data needed while processing | |
6ccde948 RW |
722 | @var{bb} and its children. This local data is pushed into an |
723 | internal stack which is automatically pushed and popped as the | |
724 | walker traverses the dominator tree. | |
6de9cd9a DN |
725 | |
726 | @item Once before traversing all the statements in the @var{bb}. | |
727 | ||
728 | @item Once for every statement inside @var{bb}. | |
729 | ||
730 | @item Once after traversing all the statements and before recursing | |
6ccde948 | 731 | into @var{bb}'s dominator children. |
6de9cd9a DN |
732 | |
733 | @item It then recurses into all the dominator children of @var{bb}. | |
734 | ||
735 | @item After recursing into all the dominator children of @var{bb} it | |
6ccde948 RW |
736 | can, optionally, traverse every statement in @var{bb} again |
737 | (i.e., repeating steps 2 and 3). | |
6de9cd9a DN |
738 | |
739 | @item Once after walking the statements in @var{bb} and @var{bb}'s | |
6ccde948 RW |
740 | dominator children. At this stage, the block local data stack |
741 | is popped. | |
6de9cd9a DN |
742 | @end enumerate |
743 | @end deftypefn | |
744 | ||
745 | @node Alias analysis | |
746 | @section Alias analysis | |
747 | @cindex alias | |
748 | @cindex flow-sensitive alias analysis | |
749 | @cindex flow-insensitive alias analysis | |
750 | ||
5006671f RG |
751 | Alias analysis in GIMPLE SSA form consists of two pieces. First |
752 | the virtual SSA web ties conflicting memory accesses and provides | |
753 | a SSA use-def chain and SSA immediate-use chains for walking | |
754 | possibly dependent memory accesses. Second an alias-oracle can | |
755 | be queried to disambiguate explicit and implicit memory references. | |
6de9cd9a DN |
756 | |
757 | @enumerate | |
5006671f | 758 | @item Memory SSA form. |
c75ab022 | 759 | |
5006671f RG |
760 | All statements that may use memory have exactly one accompanied use of |
761 | a virtual SSA name that represents the state of memory at the | |
762 | given point in the IL. | |
763 | ||
764 | All statements that may define memory have exactly one accompanied | |
765 | definition of a virtual SSA name using the previous state of memory | |
766 | and defining the new state of memory after the given point in the IL. | |
c75ab022 DB |
767 | |
768 | @smallexample | |
5006671f RG |
769 | int i; |
770 | int foo (void) | |
c75ab022 | 771 | @{ |
5006671f RG |
772 | # .MEM_3 = VDEF <.MEM_2(D)> |
773 | i = 1; | |
774 | # VUSE <.MEM_3> | |
775 | return i; | |
c75ab022 DB |
776 | @} |
777 | @end smallexample | |
778 | ||
5006671f RG |
779 | The virtual SSA names in this case are @code{.MEM_2(D)} and |
780 | @code{.MEM_3}. The store to the global variable @code{i} | |
781 | defines @code{.MEM_3} invalidating @code{.MEM_2(D)}. The | |
782 | load from @code{i} uses that new state @code{.MEM_3}. | |
783 | ||
784 | The virtual SSA web serves as constraints to SSA optimizers | |
785 | preventing illegitimate code-motion and optimization. It | |
786 | also provides a way to walk related memory statements. | |
c75ab022 | 787 | |
6ccde948 | 788 | @item Points-to and escape analysis. |
6de9cd9a | 789 | |
5006671f RG |
790 | Points-to analysis builds a set of constraints from the GIMPLE |
791 | SSA IL representing all pointer operations and facts we do | |
792 | or do not know about pointers. Solving this set of constraints | |
793 | yields a conservatively correct solution for each pointer | |
794 | variable in the program (though we are only interested in | |
795 | SSA name pointers) as to what it may possibly point to. | |
796 | ||
797 | This points-to solution for a given SSA name pointer is stored | |
798 | in the @code{pt_solution} sub-structure of the | |
799 | @code{SSA_NAME_PTR_INFO} record. The following accessor | |
800 | functions are available: | |
6de9cd9a | 801 | |
6ccde948 | 802 | @itemize @bullet |
5006671f RG |
803 | @item @code{pt_solution_includes} |
804 | @item @code{pt_solutions_intersect} | |
6ccde948 | 805 | @end itemize |
6de9cd9a | 806 | |
5006671f RG |
807 | Points-to analysis also computes the solution for two special |
808 | set of pointers, @code{ESCAPED} and @code{CALLUSED}. Those | |
809 | represent all memory that has escaped the scope of analysis | |
810 | or that is used by pure or nested const calls. | |
6de9cd9a | 811 | |
5006671f | 812 | @item Type-based alias analysis |
6de9cd9a | 813 | |
5006671f RG |
814 | Type-based alias analysis is frontend dependent though generic |
815 | support is provided by the middle-end in @code{alias.c}. TBAA | |
816 | code is used by both tree optimizers and RTL optimizers. | |
d2927bd5 JC |
817 | |
818 | Every language that wishes to perform language-specific alias analysis | |
819 | should define a function that computes, given a @code{tree} | |
820 | node, an alias set for the node. Nodes in different alias sets are not | |
821 | allowed to alias. For an example, see the C front-end function | |
822 | @code{c_get_alias_set}. | |
6de9cd9a | 823 | |
5006671f | 824 | @item Tree alias-oracle |
6de9cd9a | 825 | |
5006671f RG |
826 | The tree alias-oracle provides means to disambiguate two memory |
827 | references and memory references against statements. The following | |
828 | queries are available: | |
6de9cd9a | 829 | |
5006671f RG |
830 | @itemize @bullet |
831 | @item @code{refs_may_alias_p} | |
832 | @item @code{ref_maybe_used_by_stmt_p} | |
833 | @item @code{stmt_may_clobber_ref_p} | |
834 | @end itemize | |
6de9cd9a | 835 | |
5006671f RG |
836 | In addition to those two kind of statement walkers are available |
837 | walking statements related to a reference ref. | |
838 | @code{walk_non_aliased_vuses} walks over dominating memory defining | |
839 | statements and calls back if the statement does not clobber ref | |
840 | providing the non-aliased VUSE. The walk stops at | |
841 | the first clobbering statement or if asked to. | |
842 | @code{walk_aliased_vdefs} walks over dominating memory defining | |
843 | statements and calls back on each statement clobbering ref | |
844 | providing its aliasing VDEF. The walk stops if asked to. | |
6de9cd9a | 845 | |
6de9cd9a | 846 | @end enumerate |
5006671f | 847 | |
4d7a65ea RG |
848 | |
849 | @node Memory model | |
850 | @section Memory model | |
851 | @cindex memory model | |
852 | ||
853 | The memory model used by the middle-end models that of the C/C++ | |
854 | languages. The middle-end has the notion of an effective type | |
855 | of a memory region which is used for type-based alias analysis. | |
856 | ||
857 | The following is a refinement of ISO C99 6.5/6, clarifying the block copy case | |
858 | to follow common sense and extending the concept of a dynamic effective | |
859 | type to objects with a declared type as required for C++. | |
860 | ||
861 | @smallexample | |
862 | The effective type of an object for an access to its stored value is | |
863 | the declared type of the object or the effective type determined by | |
864 | a previous store to it. If a value is stored into an object through | |
865 | an lvalue having a type that is not a character type, then the | |
866 | type of the lvalue becomes the effective type of the object for that | |
867 | access and for subsequent accesses that do not modify the stored value. | |
868 | If a value is copied into an object using @code{memcpy} or @code{memmove}, | |
869 | or is copied as an array of character type, then the effective type | |
870 | of the modified object for that access and for subsequent accesses that | |
871 | do not modify the value is undetermined. For all other accesses to an | |
872 | object, the effective type of the object is simply the type of the | |
873 | lvalue used for the access. | |
874 | @end smallexample | |
875 |