]>
Commit | Line | Data |
---|---|---|
6de9cd9a | 1 | @c -*-texinfo-*- |
23a5b65a | 2 | @c Copyright (C) 2001-2014 Free Software Foundation, Inc. |
6de9cd9a DN |
3 | @c This is part of the GCC manual. |
4 | @c For copying conditions, see the file gcc.texi. | |
5 | ||
6 | @c --------------------------------------------------------------------- | |
7 | @c Control Flow Graph | |
8 | @c --------------------------------------------------------------------- | |
9 | ||
10 | @node Control Flow | |
11 | @chapter Control Flow Graph | |
12 | @cindex CFG, Control Flow Graph | |
13 | @findex basic-block.h | |
14 | ||
15 | A control flow graph (CFG) is a data structure built on top of the | |
532aafad | 16 | intermediate code representation (the RTL or @code{GIMPLE} instruction |
6de9cd9a DN |
17 | stream) abstracting the control flow behavior of a function that is |
18 | being compiled. The CFG is a directed graph where the vertices | |
19 | represent basic blocks and edges represent possible transfer of | |
20 | control flow from one basic block to another. The data structures | |
21 | used to represent the control flow graph are defined in | |
22 | @file{basic-block.h}. | |
23 | ||
532aafad SB |
24 | In GCC, the representation of control flow is maintained throughout |
25 | the compilation process, from constructing the CFG early in | |
2769de23 | 26 | @code{pass_build_cfg} to @code{pass_free_cfg} (see @file{passes.def}). |
532aafad SB |
27 | The CFG takes various different modes and may undergo extensive |
28 | manipulations, but the graph is always valid between its construction | |
29 | and its release. This way, transfer of information such as data flow, | |
30 | a measured profile, or the loop tree, can be propagated through the | |
31 | passes pipeline, and even from @code{GIMPLE} to @code{RTL}. | |
32 | ||
33 | Often the CFG may be better viewed as integral part of instruction | |
34 | chain, than structure built on the top of it. Updating the compiler's | |
35 | intermediate representation for instructions can not be easily done | |
36 | without proper maintenance of the CFG simultaneously. | |
37 | ||
6de9cd9a DN |
38 | @menu |
39 | * Basic Blocks:: The definition and representation of basic blocks. | |
40 | * Edges:: Types of edges and their representation. | |
41 | * Profile information:: Representation of frequencies and probabilities. | |
42 | * Maintaining the CFG:: Keeping the control flow graph and up to date. | |
43 | * Liveness information:: Using and maintaining liveness information. | |
44 | @end menu | |
45 | ||
46 | ||
47 | @node Basic Blocks | |
48 | @section Basic Blocks | |
49 | ||
50 | @cindex basic block | |
51 | @findex basic_block | |
52 | A basic block is a straight-line sequence of code with only one entry | |
53 | point and only one exit. In GCC, basic blocks are represented using | |
54 | the @code{basic_block} data type. | |
55 | ||
532aafad SB |
56 | @findex ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR |
57 | Special basic blocks represent possible entry and exit points of a | |
58 | function. These blocks are called @code{ENTRY_BLOCK_PTR} and | |
59 | @code{EXIT_BLOCK_PTR}. These blocks do not contain any code. | |
6de9cd9a DN |
60 | |
61 | @findex BASIC_BLOCK | |
62 | The @code{BASIC_BLOCK} array contains all basic blocks in an | |
63 | unspecified order. Each @code{basic_block} structure has a field | |
64 | that holds a unique integer identifier @code{index} that is the | |
65 | index of the block in the @code{BASIC_BLOCK} array. | |
66 | The total number of basic blocks in the function is | |
67 | @code{n_basic_blocks}. Both the basic block indices and | |
68 | the total number of basic blocks may vary during the compilation | |
69 | process, as passes reorder, create, duplicate, and destroy basic | |
70 | blocks. The index for any block should never be greater than | |
532aafad SB |
71 | @code{last_basic_block}. The indices 0 and 1 are special codes |
72 | reserved for @code{ENTRY_BLOCK} and @code{EXIT_BLOCK}, the | |
73 | indices of @code{ENTRY_BLOCK_PTR} and @code{EXIT_BLOCK_PTR}. | |
6de9cd9a | 74 | |
532aafad SB |
75 | @findex next_bb, prev_bb, FOR_EACH_BB, FOR_ALL_BB |
76 | Two pointer members of the @code{basic_block} structure are the | |
77 | pointers @code{next_bb} and @code{prev_bb}. These are used to keep | |
78 | doubly linked chain of basic blocks in the same order as the | |
79 | underlying instruction stream. The chain of basic blocks is updated | |
80 | transparently by the provided API for manipulating the CFG@. The macro | |
81 | @code{FOR_EACH_BB} can be used to visit all the basic blocks in | |
82 | lexicographical order, except @code{ENTRY_BLOCK} and @code{EXIT_BLOCK}. | |
83 | The macro @code{FOR_ALL_BB} also visits all basic blocks in | |
84 | lexicographical order, including @code{ENTRY_BLOCK} and @code{EXIT_BLOCK}. | |
85 | ||
86 | @findex post_order_compute, inverted_post_order_compute, walk_dominator_tree | |
87 | The functions @code{post_order_compute} and @code{inverted_post_order_compute} | |
88 | can be used to compute topological orders of the CFG. The orders are | |
89 | stored as vectors of basic block indices. The @code{BASIC_BLOCK} array | |
90 | can be used to iterate each basic block by index. | |
91 | Dominator traversals are also possible using | |
92 | @code{walk_dominator_tree}. Given two basic blocks A and B, block A | |
93 | dominates block B if A is @emph{always} executed before B@. | |
6de9cd9a DN |
94 | |
95 | Each @code{basic_block} also contains pointers to the first | |
96 | instruction (the @dfn{head}) and the last instruction (the @dfn{tail}) | |
97 | or @dfn{end} of the instruction stream contained in a basic block. In | |
98 | fact, since the @code{basic_block} data type is used to represent | |
532aafad | 99 | blocks in both major intermediate representations of GCC (@code{GIMPLE} |
6de9cd9a | 100 | and RTL), there are pointers to the head and end of a basic block for |
532aafad SB |
101 | both representations, stored in intermediate representation specific |
102 | data in the @code{il} field of @code{struct basic_block_def}. | |
103 | ||
104 | @findex CODE_LABEL | |
105 | @findex NOTE_INSN_BASIC_BLOCK | |
106 | For RTL, these pointers are @code{BB_HEAD} and @code{BB_END}. | |
6de9cd9a | 107 | |
532aafad SB |
108 | @cindex insn notes, notes |
109 | @findex NOTE_INSN_BASIC_BLOCK | |
6de9cd9a | 110 | In the RTL representation of a function, the instruction stream |
532aafad SB |
111 | contains not only the ``real'' instructions, but also @dfn{notes} |
112 | or @dfn{insn notes} (to distinguish them from @dfn{reg notes}). | |
6de9cd9a DN |
113 | Any function that moves or duplicates the basic blocks needs |
114 | to take care of updating of these notes. Many of these notes expect | |
532aafad SB |
115 | that the instruction stream consists of linear regions, so updating |
116 | can sometimes be tedious. All types of insn notes are defined | |
117 | in @file{insn-notes.def}. | |
118 | ||
119 | In the RTL function representation, the instructions contained in a | |
120 | basic block always follow a @code{NOTE_INSN_BASIC_BLOCK}, but zero | |
121 | or more @code{CODE_LABEL} nodes can precede the block note. | |
122 | A basic block ends with a control flow instruction or with the last | |
123 | instruction before the next @code{CODE_LABEL} or | |
124 | @code{NOTE_INSN_BASIC_BLOCK}. | |
125 | By definition, a @code{CODE_LABEL} cannot appear in the middle of | |
6de9cd9a DN |
126 | the instruction stream of a basic block. |
127 | ||
128 | @findex can_fallthru | |
129 | @cindex table jump | |
130 | In addition to notes, the jump table vectors are also represented as | |
131 | ``pseudo-instructions'' inside the insn stream. These vectors never | |
132 | appear in the basic block and should always be placed just after the | |
133 | table jump instructions referencing them. After removing the | |
134 | table-jump it is often difficult to eliminate the code computing the | |
135 | address and referencing the vector, so cleaning up these vectors is | |
136 | postponed until after liveness analysis. Thus the jump table vectors | |
137 | may appear in the insn stream unreferenced and without any purpose. | |
138 | Before any edge is made @dfn{fall-thru}, the existence of such | |
139 | construct in the way needs to be checked by calling | |
140 | @code{can_fallthru} function. | |
141 | ||
532aafad SB |
142 | @cindex GIMPLE statement iterators |
143 | For the @code{GIMPLE} representation, the PHI nodes and statements | |
144 | contained in a basic block are in a @code{gimple_seq} pointed to by | |
145 | the basic block intermediate language specific pointers. | |
146 | Abstract containers and iterators are used to access the PHI nodes | |
147 | and statements in a basic blocks. These iterators are called | |
148 | @dfn{GIMPLE statement iterators} (GSIs). Grep for @code{^gsi} | |
149 | in the various @file{gimple-*} and @file{tree-*} files. | |
7004cce9 DM |
150 | There is a @code{gimple_stmt_iterator} type for iterating over |
151 | all kinds of statement, and a @code{gphi_iterator} subclass for | |
152 | iterating over PHI nodes. | |
532aafad SB |
153 | The following snippet will pretty-print all PHI nodes the statements |
154 | of the current function in the GIMPLE representation. | |
6de9cd9a | 155 | |
cd1a8088 | 156 | @smallexample |
532aafad SB |
157 | basic_block bb; |
158 | ||
6de9cd9a DN |
159 | FOR_EACH_BB (bb) |
160 | @{ | |
7004cce9 | 161 | gphi_iterator pi; |
532aafad SB |
162 | gimple_stmt_iterator si; |
163 | ||
7004cce9 | 164 | for (pi = gsi_start_phis (bb); !gsi_end_p (pi); gsi_next (&pi)) |
532aafad | 165 | @{ |
7004cce9 | 166 | gphi *phi = pi.phi (); |
532aafad SB |
167 | print_gimple_stmt (dump_file, phi, 0, TDF_SLIM); |
168 | @} | |
169 | for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si)) | |
170 | @{ | |
171 | gimple stmt = gsi_stmt (si); | |
172 | print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM); | |
173 | @} | |
6de9cd9a | 174 | @} |
cd1a8088 | 175 | @end smallexample |
6de9cd9a DN |
176 | |
177 | ||
178 | @node Edges | |
179 | @section Edges | |
180 | ||
181 | @cindex edge in the flow graph | |
182 | @findex edge | |
183 | Edges represent possible control flow transfers from the end of some | |
8a36672b JM |
184 | basic block A to the head of another basic block B@. We say that A is |
185 | a predecessor of B, and B is a successor of A@. Edges are represented | |
6de9cd9a | 186 | in GCC with the @code{edge} data type. Each @code{edge} acts as a |
532aafad | 187 | link between two basic blocks: The @code{src} member of an edge |
6de9cd9a | 188 | points to the predecessor basic block of the @code{dest} basic block. |
92083932 BE |
189 | The members @code{preds} and @code{succs} of the @code{basic_block} data |
190 | type point to type-safe vectors of edges to the predecessors and | |
191 | successors of the block. | |
192 | ||
193 | @cindex edge iterators | |
194 | When walking the edges in an edge vector, @dfn{edge iterators} should | |
195 | be used. Edge iterators are constructed using the | |
196 | @code{edge_iterator} data structure and several methods are available | |
197 | to operate on them: | |
198 | ||
199 | @ftable @code | |
200 | @item ei_start | |
201 | This function initializes an @code{edge_iterator} that points to the | |
202 | first edge in a vector of edges. | |
203 | ||
204 | @item ei_last | |
205 | This function initializes an @code{edge_iterator} that points to the | |
206 | last edge in a vector of edges. | |
207 | ||
208 | @item ei_end_p | |
209 | This predicate is @code{true} if an @code{edge_iterator} represents | |
210 | the last edge in an edge vector. | |
211 | ||
212 | @item ei_one_before_end_p | |
213 | This predicate is @code{true} if an @code{edge_iterator} represents | |
214 | the second last edge in an edge vector. | |
215 | ||
216 | @item ei_next | |
217 | This function takes a pointer to an @code{edge_iterator} and makes it | |
218 | point to the next edge in the sequence. | |
219 | ||
220 | @item ei_prev | |
221 | This function takes a pointer to an @code{edge_iterator} and makes it | |
222 | point to the previous edge in the sequence. | |
223 | ||
224 | @item ei_edge | |
225 | This function returns the @code{edge} currently pointed to by an | |
226 | @code{edge_iterator}. | |
227 | ||
228 | @item ei_safe_safe | |
229 | This function returns the @code{edge} currently pointed to by an | |
230 | @code{edge_iterator}, but returns @code{NULL} if the iterator is | |
231 | pointing at the end of the sequence. This function has been provided | |
232 | for existing code makes the assumption that a @code{NULL} edge | |
233 | indicates the end of the sequence. | |
234 | ||
235 | @end ftable | |
236 | ||
237 | The convenience macro @code{FOR_EACH_EDGE} can be used to visit all of | |
238 | the edges in a sequence of predecessor or successor edges. It must | |
239 | not be used when an element might be removed during the traversal, | |
240 | otherwise elements will be missed. Here is an example of how to use | |
241 | the macro: | |
242 | ||
243 | @smallexample | |
244 | edge e; | |
245 | edge_iterator ei; | |
246 | ||
247 | FOR_EACH_EDGE (e, ei, bb->succs) | |
248 | @{ | |
249 | if (e->flags & EDGE_FALLTHRU) | |
250 | break; | |
251 | @} | |
252 | @end smallexample | |
6de9cd9a DN |
253 | |
254 | @findex fall-thru | |
255 | There are various reasons why control flow may transfer from one block | |
256 | to another. One possibility is that some instruction, for example a | |
257 | @code{CODE_LABEL}, in a linearized instruction stream just always | |
258 | starts a new basic block. In this case a @dfn{fall-thru} edge links | |
259 | the basic block to the first following basic block. But there are | |
260 | several other reasons why edges may be created. The @code{flags} | |
261 | field of the @code{edge} data type is used to store information | |
262 | about the type of edge we are dealing with. Each edge is of one of | |
263 | the following types: | |
264 | ||
265 | @table @emph | |
266 | @item jump | |
267 | No type flags are set for edges corresponding to jump instructions. | |
268 | These edges are used for unconditional or conditional jumps and in | |
269 | RTL also for table jumps. They are the easiest to manipulate as they | |
270 | may be freely redirected when the flow graph is not in SSA form. | |
271 | ||
272 | @item fall-thru | |
273 | @findex EDGE_FALLTHRU, force_nonfallthru | |
274 | Fall-thru edges are present in case where the basic block may continue | |
275 | execution to the following one without branching. These edges have | |
8a36672b | 276 | the @code{EDGE_FALLTHRU} flag set. Unlike other types of edges, these |
6de9cd9a DN |
277 | edges must come into the basic block immediately following in the |
278 | instruction stream. The function @code{force_nonfallthru} is | |
279 | available to insert an unconditional jump in the case that redirection | |
280 | is needed. Note that this may require creation of a new basic block. | |
281 | ||
282 | @item exception handling | |
283 | @cindex exception handling | |
284 | @findex EDGE_ABNORMAL, EDGE_EH | |
285 | Exception handling edges represent possible control transfers from a | |
286 | trapping instruction to an exception handler. The definition of | |
287 | ``trapping'' varies. In C++, only function calls can throw, but for | |
532aafad | 288 | Java and Ada, exceptions like division by zero or segmentation fault are |
6de9cd9a DN |
289 | defined and thus each instruction possibly throwing this kind of |
290 | exception needs to be handled as control flow instruction. Exception | |
291 | edges have the @code{EDGE_ABNORMAL} and @code{EDGE_EH} flags set. | |
292 | ||
293 | @findex purge_dead_edges | |
294 | When updating the instruction stream it is easy to change possibly | |
295 | trapping instruction to non-trapping, by simply removing the exception | |
8a36672b | 296 | edge. The opposite conversion is difficult, but should not happen |
6de9cd9a DN |
297 | anyway. The edges can be eliminated via @code{purge_dead_edges} call. |
298 | ||
299 | @findex REG_EH_REGION, EDGE_ABNORMAL_CALL | |
300 | In the RTL representation, the destination of an exception edge is | |
301 | specified by @code{REG_EH_REGION} note attached to the insn. | |
302 | In case of a trapping call the @code{EDGE_ABNORMAL_CALL} flag is set | |
532aafad | 303 | too. In the @code{GIMPLE} representation, this extra flag is not set. |
6de9cd9a DN |
304 | |
305 | @findex may_trap_p, tree_could_trap_p | |
306 | In the RTL representation, the predicate @code{may_trap_p} may be used | |
307 | to check whether instruction still may trap or not. For the tree | |
308 | representation, the @code{tree_could_trap_p} predicate is available, | |
309 | but this predicate only checks for possible memory traps, as in | |
310 | dereferencing an invalid pointer location. | |
311 | ||
312 | ||
313 | @item sibling calls | |
314 | @cindex sibling call | |
315 | @findex EDGE_ABNORMAL, EDGE_SIBCALL | |
316 | Sibling calls or tail calls terminate the function in a non-standard | |
317 | way and thus an edge to the exit must be present. | |
318 | @code{EDGE_SIBCALL} and @code{EDGE_ABNORMAL} are set in such case. | |
319 | These edges only exist in the RTL representation. | |
320 | ||
321 | @item computed jumps | |
322 | @cindex computed jump | |
323 | @findex EDGE_ABNORMAL | |
324 | Computed jumps contain edges to all labels in the function referenced | |
325 | from the code. All those edges have @code{EDGE_ABNORMAL} flag set. | |
326 | The edges used to represent computed jumps often cause compile time | |
327 | performance problems, since functions consisting of many taken labels | |
328 | and many computed jumps may have @emph{very} dense flow graphs, so | |
329 | these edges need to be handled with special care. During the earlier | |
330 | stages of the compilation process, GCC tries to avoid such dense flow | |
331 | graphs by factoring computed jumps. For example, given the following | |
f0eb93a8 | 332 | series of jumps, |
6de9cd9a | 333 | |
cd1a8088 | 334 | @smallexample |
6de9cd9a | 335 | goto *x; |
923158be | 336 | [ @dots{} ] |
6de9cd9a DN |
337 | |
338 | goto *x; | |
923158be | 339 | [ @dots{} ] |
6de9cd9a DN |
340 | |
341 | goto *x; | |
923158be | 342 | [ @dots{} ] |
cd1a8088 | 343 | @end smallexample |
6de9cd9a DN |
344 | |
345 | @noindent | |
346 | factoring the computed jumps results in the following code sequence | |
347 | which has a much simpler flow graph: | |
348 | ||
cd1a8088 | 349 | @smallexample |
6de9cd9a | 350 | goto y; |
923158be | 351 | [ @dots{} ] |
6de9cd9a DN |
352 | |
353 | goto y; | |
923158be | 354 | [ @dots{} ] |
6de9cd9a DN |
355 | |
356 | goto y; | |
923158be | 357 | [ @dots{} ] |
6de9cd9a DN |
358 | |
359 | y: | |
360 | goto *x; | |
cd1a8088 | 361 | @end smallexample |
6de9cd9a | 362 | |
532aafad | 363 | @findex pass_duplicate_computed_gotos |
6de9cd9a DN |
364 | However, the classic problem with this transformation is that it has a |
365 | runtime cost in there resulting code: An extra jump. Therefore, the | |
532aafad SB |
366 | computed jumps are un-factored in the later passes of the compiler |
367 | (in the pass called @code{pass_duplicate_computed_gotos}). | |
6de9cd9a DN |
368 | Be aware of that when you work on passes in that area. There have |
369 | been numerous examples already where the compile time for code with | |
370 | unfactored computed jumps caused some serious headaches. | |
371 | ||
372 | @item nonlocal goto handlers | |
373 | @cindex nonlocal goto handler | |
374 | @findex EDGE_ABNORMAL, EDGE_ABNORMAL_CALL | |
375 | GCC allows nested functions to return into caller using a @code{goto} | |
376 | to a label passed to as an argument to the callee. The labels passed | |
377 | to nested functions contain special code to cleanup after function | |
378 | call. Such sections of code are referred to as ``nonlocal goto | |
379 | receivers''. If a function contains such nonlocal goto receivers, an | |
380 | edge from the call to the label is created with the | |
381 | @code{EDGE_ABNORMAL} and @code{EDGE_ABNORMAL_CALL} flags set. | |
382 | ||
383 | @item function entry points | |
384 | @cindex function entry point, alternate function entry point | |
385 | @findex LABEL_ALTERNATE_NAME | |
386 | By definition, execution of function starts at basic block 0, so there | |
387 | is always an edge from the @code{ENTRY_BLOCK_PTR} to basic block 0. | |
532aafad | 388 | There is no @code{GIMPLE} representation for alternate entry points at |
6de9cd9a DN |
389 | this moment. In RTL, alternate entry points are specified by |
390 | @code{CODE_LABEL} with @code{LABEL_ALTERNATE_NAME} defined. This | |
391 | feature is currently used for multiple entry point prologues and is | |
392 | limited to post-reload passes only. This can be used by back-ends to | |
393 | emit alternate prologues for functions called from different contexts. | |
394 | In future full support for multiple entry functions defined by Fortran | |
395 | 90 needs to be implemented. | |
396 | ||
397 | @item function exits | |
398 | In the pre-reload representation a function terminates after the last | |
399 | instruction in the insn chain and no explicit return instructions are | |
400 | used. This corresponds to the fall-thru edge into exit block. After | |
401 | reload, optimal RTL epilogues are used that use explicit (conditional) | |
402 | return instructions that are represented by edges with no flags set. | |
403 | ||
404 | @end table | |
405 | ||
406 | ||
407 | @node Profile information | |
408 | @section Profile information | |
409 | ||
410 | @cindex profile representation | |
411 | In many cases a compiler must make a choice whether to trade speed in | |
412 | one part of code for speed in another, or to trade code size for code | |
413 | speed. In such cases it is useful to know information about how often | |
414 | some given block will be executed. That is the purpose for | |
415 | maintaining profile within the flow graph. | |
416 | GCC can handle profile information obtained through @dfn{profile | |
532aafad | 417 | feedback}, but it can also estimate branch probabilities based on |
6de9cd9a DN |
418 | statics and heuristics. |
419 | ||
420 | @cindex profile feedback | |
421 | The feedback based profile is produced by compiling the program with | |
422 | instrumentation, executing it on a train run and reading the numbers | |
423 | of executions of basic blocks and edges back to the compiler while | |
424 | re-compiling the program to produce the final executable. This method | |
425 | provides very accurate information about where a program spends most | |
426 | of its time on the train run. Whether it matches the average run of | |
427 | course depends on the choice of train data set, but several studies | |
428 | have shown that the behavior of a program usually changes just | |
429 | marginally over different data sets. | |
430 | ||
431 | @cindex Static profile estimation | |
432 | @cindex branch prediction | |
433 | @findex predict.def | |
434 | When profile feedback is not available, the compiler may be asked to | |
435 | attempt to predict the behavior of each branch in the program using a | |
436 | set of heuristics (see @file{predict.def} for details) and compute | |
437 | estimated frequencies of each basic block by propagating the | |
438 | probabilities over the graph. | |
439 | ||
440 | @findex frequency, count, BB_FREQ_BASE | |
441 | Each @code{basic_block} contains two integer fields to represent | |
442 | profile information: @code{frequency} and @code{count}. The | |
443 | @code{frequency} is an estimation how often is basic block executed | |
444 | within a function. It is represented as an integer scaled in the | |
445 | range from 0 to @code{BB_FREQ_BASE}. The most frequently executed | |
446 | basic block in function is initially set to @code{BB_FREQ_BASE} and | |
447 | the rest of frequencies are scaled accordingly. During optimization, | |
448 | the frequency of the most frequent basic block can both decrease (for | |
449 | instance by loop unrolling) or grow (for instance by cross-jumping | |
450 | optimization), so scaling sometimes has to be performed multiple | |
451 | times. | |
452 | ||
453 | @findex gcov_type | |
454 | The @code{count} contains hard-counted numbers of execution measured | |
455 | during training runs and is nonzero only when profile feedback is | |
456 | available. This value is represented as the host's widest integer | |
457 | (typically a 64 bit integer) of the special type @code{gcov_type}. | |
458 | ||
459 | Most optimization passes can use only the frequency information of a | |
460 | basic block, but a few passes may want to know hard execution counts. | |
461 | The frequencies should always match the counts after scaling, however | |
462 | during updating of the profile information numerical error may | |
463 | accumulate into quite large errors. | |
464 | ||
465 | @findex REG_BR_PROB_BASE, EDGE_FREQUENCY | |
466 | Each edge also contains a branch probability field: an integer in the | |
467 | range from 0 to @code{REG_BR_PROB_BASE}. It represents probability of | |
468 | passing control from the end of the @code{src} basic block to the | |
431ae0bf | 469 | @code{dest} basic block, i.e.@: the probability that control will flow |
532aafad | 470 | along this edge. The @code{EDGE_FREQUENCY} macro is available to |
8a36672b | 471 | compute how frequently a given edge is taken. There is a @code{count} |
6de9cd9a DN |
472 | field for each edge as well, representing same information as for a |
473 | basic block. | |
474 | ||
475 | The basic block frequencies are not represented in the instruction | |
476 | stream, but in the RTL representation the edge frequencies are | |
477 | represented for conditional jumps (via the @code{REG_BR_PROB} | |
478 | macro) since they are used when instructions are output to the | |
479 | assembly file and the flow graph is no longer maintained. | |
480 | ||
481 | @cindex reverse probability | |
482 | The probability that control flow arrives via a given edge to its | |
483 | destination basic block is called @dfn{reverse probability} and is not | |
484 | directly represented, but it may be easily computed from frequencies | |
485 | of basic blocks. | |
486 | ||
487 | @findex redirect_edge_and_branch | |
488 | Updating profile information is a delicate task that can unfortunately | |
8a36672b | 489 | not be easily integrated with the CFG manipulation API@. Many of the |
6de9cd9a DN |
490 | functions and hooks to modify the CFG, such as |
491 | @code{redirect_edge_and_branch}, do not have enough information to | |
492 | easily update the profile, so updating it is in the majority of cases | |
493 | left up to the caller. It is difficult to uncover bugs in the profile | |
494 | updating code, because they manifest themselves only by producing | |
495 | worse code, and checking profile consistency is not possible because | |
496 | of numeric error accumulation. Hence special attention needs to be | |
8a36672b | 497 | given to this issue in each pass that modifies the CFG@. |
6de9cd9a DN |
498 | |
499 | @findex REG_BR_PROB_BASE, BB_FREQ_BASE, count | |
500 | It is important to point out that @code{REG_BR_PROB_BASE} and | |
501 | @code{BB_FREQ_BASE} are both set low enough to be possible to compute | |
502 | second power of any frequency or probability in the flow graph, it is | |
503 | not possible to even square the @code{count} field, as modern CPUs are | |
504 | fast enough to execute $2^32$ operations quickly. | |
505 | ||
506 | ||
507 | @node Maintaining the CFG | |
508 | @section Maintaining the CFG | |
509 | @findex cfghooks.h | |
510 | ||
511 | An important task of each compiler pass is to keep both the control | |
512 | flow graph and all profile information up-to-date. Reconstruction of | |
513 | the control flow graph after each pass is not an option, since it may be | |
514 | very expensive and lost profile information cannot be reconstructed at | |
515 | all. | |
516 | ||
517 | GCC has two major intermediate representations, and both use the | |
518 | @code{basic_block} and @code{edge} data types to represent control | |
519 | flow. Both representations share as much of the CFG maintenance code | |
520 | as possible. For each representation, a set of @dfn{hooks} is defined | |
521 | so that each representation can provide its own implementation of CFG | |
522 | manipulation routines when necessary. These hooks are defined in | |
523 | @file{cfghooks.h}. There are hooks for almost all common CFG | |
524 | manipulations, including block splitting and merging, edge redirection | |
525 | and creating and deleting basic blocks. These hooks should provide | |
526 | everything you need to maintain and manipulate the CFG in both the RTL | |
532aafad | 527 | and @code{GIMPLE} representation. |
6de9cd9a DN |
528 | |
529 | At the moment, the basic block boundaries are maintained transparently | |
530 | when modifying instructions, so there rarely is a need to move them | |
531 | manually (such as in case someone wants to output instruction outside | |
532 | basic block explicitly). | |
532aafad SB |
533 | |
534 | @findex BLOCK_FOR_INSN, gimple_bb | |
6de9cd9a DN |
535 | In the RTL representation, each instruction has a |
536 | @code{BLOCK_FOR_INSN} value that represents pointer to the basic block | |
532aafad SB |
537 | that contains the instruction. In the @code{GIMPLE} representation, the |
538 | function @code{gimple_bb} returns a pointer to the basic block | |
6de9cd9a DN |
539 | containing the queried statement. |
540 | ||
532aafad SB |
541 | @cindex GIMPLE statement iterators |
542 | When changes need to be applied to a function in its @code{GIMPLE} | |
543 | representation, @dfn{GIMPLE statement iterators} should be used. These | |
6de9cd9a | 544 | iterators provide an integrated abstraction of the flow graph and the |
d1facce0 | 545 | instruction stream. Block statement iterators are constructed using |
532aafad | 546 | the @code{gimple_stmt_iterator} data structure and several modifier are |
d1facce0 | 547 | available, including the following: |
6de9cd9a | 548 | |
d7889929 | 549 | @ftable @code |
532aafad SB |
550 | @item gsi_start |
551 | This function initializes a @code{gimple_stmt_iterator} that points to | |
6de9cd9a DN |
552 | the first non-empty statement in a basic block. |
553 | ||
532aafad SB |
554 | @item gsi_last |
555 | This function initializes a @code{gimple_stmt_iterator} that points to | |
6de9cd9a DN |
556 | the last statement in a basic block. |
557 | ||
532aafad SB |
558 | @item gsi_end_p |
559 | This predicate is @code{true} if a @code{gimple_stmt_iterator} | |
6de9cd9a DN |
560 | represents the end of a basic block. |
561 | ||
532aafad SB |
562 | @item gsi_next |
563 | This function takes a @code{gimple_stmt_iterator} and makes it point to | |
6de9cd9a DN |
564 | its successor. |
565 | ||
532aafad SB |
566 | @item gsi_prev |
567 | This function takes a @code{gimple_stmt_iterator} and makes it point to | |
6de9cd9a DN |
568 | its predecessor. |
569 | ||
532aafad SB |
570 | @item gsi_insert_after |
571 | This function inserts a statement after the @code{gimple_stmt_iterator} | |
6de9cd9a DN |
572 | passed in. The final parameter determines whether the statement |
573 | iterator is updated to point to the newly inserted statement, or left | |
574 | pointing to the original statement. | |
575 | ||
532aafad SB |
576 | @item gsi_insert_before |
577 | This function inserts a statement before the @code{gimple_stmt_iterator} | |
6de9cd9a DN |
578 | passed in. The final parameter determines whether the statement |
579 | iterator is updated to point to the newly inserted statement, or left | |
580 | pointing to the original statement. | |
581 | ||
532aafad SB |
582 | @item gsi_remove |
583 | This function removes the @code{gimple_stmt_iterator} passed in and | |
6de9cd9a | 584 | rechains the remaining statements in a basic block, if any. |
d7889929 | 585 | @end ftable |
6de9cd9a DN |
586 | |
587 | @findex BB_HEAD, BB_END | |
588 | In the RTL representation, the macros @code{BB_HEAD} and @code{BB_END} | |
589 | may be used to get the head and end @code{rtx} of a basic block. No | |
590 | abstract iterators are defined for traversing the insn chain, but you | |
38528d50 | 591 | can just use @code{NEXT_INSN} and @code{PREV_INSN} instead. @xref{Insns}. |
6de9cd9a DN |
592 | |
593 | @findex purge_dead_edges | |
594 | Usually a code manipulating pass simplifies the instruction stream and | |
595 | the flow of control, possibly eliminating some edges. This may for | |
596 | example happen when a conditional jump is replaced with an | |
597 | unconditional jump, but also when simplifying possibly trapping | |
598 | instruction to non-trapping while compiling Java. Updating of edges | |
599 | is not transparent and each optimization pass is required to do so | |
600 | manually. However only few cases occur in practice. The pass may | |
601 | call @code{purge_dead_edges} on a given basic block to remove | |
602 | superfluous edges, if any. | |
603 | ||
604 | @findex redirect_edge_and_branch, redirect_jump | |
605 | Another common scenario is redirection of branch instructions, but | |
606 | this is best modeled as redirection of edges in the control flow graph | |
607 | and thus use of @code{redirect_edge_and_branch} is preferred over more | |
608 | low level functions, such as @code{redirect_jump} that operate on RTL | |
609 | chain only. The CFG hooks defined in @file{cfghooks.h} should provide | |
8a36672b | 610 | the complete API required for manipulating and maintaining the CFG@. |
6de9cd9a | 611 | |
12eff7b7 | 612 | @findex split_block |
6de9cd9a DN |
613 | It is also possible that a pass has to insert control flow instruction |
614 | into the middle of a basic block, thus creating an entry point in the | |
615 | middle of the basic block, which is impossible by definition: The | |
431ae0bf | 616 | block must be split to make sure it only has one entry point, i.e.@: the |
12eff7b7 | 617 | head of the basic block. The CFG hook @code{split_block} may be used |
6de9cd9a DN |
618 | when an instruction in the middle of a basic block has to become the |
619 | target of a jump or branch instruction. | |
620 | ||
d7889929 BE |
621 | @findex insert_insn_on_edge |
622 | @findex commit_edge_insertions | |
532aafad SB |
623 | @findex gsi_insert_on_edge |
624 | @findex gsi_commit_edge_inserts | |
6de9cd9a DN |
625 | @cindex edge splitting |
626 | For a global optimizer, a common operation is to split edges in the | |
627 | flow graph and insert instructions on them. In the RTL | |
628 | representation, this can be easily done using the | |
629 | @code{insert_insn_on_edge} function that emits an instruction | |
630 | ``on the edge'', caching it for a later @code{commit_edge_insertions} | |
631 | call that will take care of moving the inserted instructions off the | |
632 | edge into the instruction stream contained in a basic block. This | |
633 | includes the creation of new basic blocks where needed. In the | |
532aafad SB |
634 | @code{GIMPLE} representation, the equivalent functions are |
635 | @code{gsi_insert_on_edge} which inserts a block statement | |
636 | iterator on an edge, and @code{gsi_commit_edge_inserts} which flushes | |
6de9cd9a DN |
637 | the instruction to actual instruction stream. |
638 | ||
532aafad SB |
639 | @findex verify_flow_info |
640 | @cindex CFG verification | |
641 | While debugging the optimization pass, the @code{verify_flow_info} | |
6de9cd9a DN |
642 | function may be useful to find bugs in the control flow graph updating |
643 | code. | |
644 | ||
6de9cd9a DN |
645 | |
646 | @node Liveness information | |
647 | @section Liveness information | |
648 | @cindex Liveness representation | |
649 | Liveness information is useful to determine whether some register is | |
431ae0bf | 650 | ``live'' at given point of program, i.e.@: that it contains a value that |
6de9cd9a DN |
651 | may be used at a later point in the program. This information is |
652 | used, for instance, during register allocation, as the pseudo | |
653 | registers only need to be assigned to a unique hard register or to a | |
654 | stack slot if they are live. The hard registers and stack slots may | |
ff2ce160 | 655 | be freely reused for other values when a register is dead. |
6fb5fa3c DB |
656 | |
657 | Liveness information is available in the back end starting with | |
658 | @code{pass_df_initialize} and ending with @code{pass_df_finish}. Three | |
659 | flavors of live analysis are available: With @code{LR}, it is possible | |
660 | to determine at any point @code{P} in the function if the register may be | |
661 | used on some path from @code{P} to the end of the function. With | |
662 | @code{UR}, it is possible to determine if there is a path from the | |
ff2ce160 | 663 | beginning of the function to @code{P} that defines the variable. |
6fb5fa3c DB |
664 | @code{LIVE} is the intersection of the @code{LR} and @code{UR} and a |
665 | variable is live at @code{P} if there is both an assignment that reaches | |
e4ae5e77 | 666 | it from the beginning of the function and a use that can be reached on |
6fb5fa3c DB |
667 | some path from @code{P} to the end of the function. |
668 | ||
669 | In general @code{LIVE} is the most useful of the three. The macros | |
670 | @code{DF_[LR,UR,LIVE]_[IN,OUT]} can be used to access this information. | |
671 | The macros take a basic block number and return a bitmap that is indexed | |
672 | by the register number. This information is only guaranteed to be up to | |
673 | date after calls are made to @code{df_analyze}. See the file | |
ff2ce160 | 674 | @code{df-core.c} for details on using the dataflow. |
6fb5fa3c | 675 | |
6de9cd9a DN |
676 | |
677 | @findex REG_DEAD, REG_UNUSED | |
6fb5fa3c DB |
678 | The liveness information is stored partly in the RTL instruction stream |
679 | and partly in the flow graph. Local information is stored in the | |
680 | instruction stream: Each instruction may contain @code{REG_DEAD} notes | |
681 | representing that the value of a given register is no longer needed, or | |
6de9cd9a DN |
682 | @code{REG_UNUSED} notes representing that the value computed by the |
683 | instruction is never used. The second is useful for instructions | |
684 | computing multiple values at once. | |
685 |