]> git.ipfire.org Git - people/ms/gcc.git/blame - gcc/doc/cfg.texi
**/*.texi: Reorder index entries
[people/ms/gcc.git] / gcc / doc / cfg.texi
CommitLineData
d77de738 1@c -*-texinfo-*-
83ffe9cd 2@c Copyright (C) 2001-2023 Free Software Foundation, Inc.
d77de738
ML
3@c This is part of the GCC manual.
4@c For copying conditions, see the file gcc.texi.
5
6@c ---------------------------------------------------------------------
7@c Control Flow Graph
8@c ---------------------------------------------------------------------
9
10@node Control Flow
11@chapter Control Flow Graph
12@cindex CFG, Control Flow Graph
13@findex basic-block.h
14
15A control flow graph (CFG) is a data structure built on top of the
16intermediate code representation (the RTL or @code{GIMPLE} instruction
17stream) abstracting the control flow behavior of a function that is
18being compiled. The CFG is a directed graph where the vertices
19represent basic blocks and edges represent possible transfer of
20control flow from one basic block to another. The data structures
21used to represent the control flow graph are defined in
22@file{basic-block.h}.
23
24In GCC, the representation of control flow is maintained throughout
25the compilation process, from constructing the CFG early in
26@code{pass_build_cfg} to @code{pass_free_cfg} (see @file{passes.def}).
27The CFG takes various different modes and may undergo extensive
28manipulations, but the graph is always valid between its construction
29and its release. This way, transfer of information such as data flow,
30a measured profile, or the loop tree, can be propagated through the
31passes pipeline, and even from @code{GIMPLE} to @code{RTL}.
32
33Often the CFG may be better viewed as integral part of instruction
34chain, than structure built on the top of it. Updating the compiler's
35intermediate representation for instructions cannot be easily done
36without proper maintenance of the CFG simultaneously.
37
38@menu
39* Basic Blocks:: The definition and representation of basic blocks.
40* Edges:: Types of edges and their representation.
41* Profile information:: Representation of frequencies and probabilities.
42* Maintaining the CFG:: Keeping the control flow graph and up to date.
43* Liveness information:: Using and maintaining liveness information.
44@end menu
45
46
47@node Basic Blocks
48@section Basic Blocks
49
50@cindex basic block
51@findex basic_block
52A basic block is a straight-line sequence of code with only one entry
53point and only one exit. In GCC, basic blocks are represented using
54the @code{basic_block} data type.
55
56@findex ENTRY_BLOCK_PTR, EXIT_BLOCK_PTR
57Special basic blocks represent possible entry and exit points of a
58function. These blocks are called @code{ENTRY_BLOCK_PTR} and
59@code{EXIT_BLOCK_PTR}. These blocks do not contain any code.
60
61@findex BASIC_BLOCK
62The @code{BASIC_BLOCK} array contains all basic blocks in an
63unspecified order. Each @code{basic_block} structure has a field
64that holds a unique integer identifier @code{index} that is the
65index of the block in the @code{BASIC_BLOCK} array.
66The total number of basic blocks in the function is
67@code{n_basic_blocks}. Both the basic block indices and
68the total number of basic blocks may vary during the compilation
69process, as passes reorder, create, duplicate, and destroy basic
70blocks. The index for any block should never be greater than
71@code{last_basic_block}. The indices 0 and 1 are special codes
72reserved for @code{ENTRY_BLOCK} and @code{EXIT_BLOCK}, the
73indices of @code{ENTRY_BLOCK_PTR} and @code{EXIT_BLOCK_PTR}.
74
75@findex next_bb, prev_bb, FOR_EACH_BB, FOR_ALL_BB
76Two pointer members of the @code{basic_block} structure are the
77pointers @code{next_bb} and @code{prev_bb}. These are used to keep
78doubly linked chain of basic blocks in the same order as the
79underlying instruction stream. The chain of basic blocks is updated
80transparently by the provided API for manipulating the CFG@. The macro
81@code{FOR_EACH_BB} can be used to visit all the basic blocks in
82lexicographical order, except @code{ENTRY_BLOCK} and @code{EXIT_BLOCK}.
83The macro @code{FOR_ALL_BB} also visits all basic blocks in
84lexicographical order, including @code{ENTRY_BLOCK} and @code{EXIT_BLOCK}.
85
86@findex post_order_compute, inverted_post_order_compute, walk_dominator_tree
87The functions @code{post_order_compute} and @code{inverted_post_order_compute}
88can be used to compute topological orders of the CFG. The orders are
89stored as vectors of basic block indices. The @code{BASIC_BLOCK} array
90can be used to iterate each basic block by index.
91Dominator traversals are also possible using
92@code{walk_dominator_tree}. Given two basic blocks A and B, block A
93dominates block B if A is @emph{always} executed before B@.
94
95Each @code{basic_block} also contains pointers to the first
96instruction (the @dfn{head}) and the last instruction (the @dfn{tail})
97or @dfn{end} of the instruction stream contained in a basic block. In
98fact, since the @code{basic_block} data type is used to represent
99blocks in both major intermediate representations of GCC (@code{GIMPLE}
100and RTL), there are pointers to the head and end of a basic block for
101both representations, stored in intermediate representation specific
102data in the @code{il} field of @code{struct basic_block_def}.
103
104@findex CODE_LABEL
105@findex NOTE_INSN_BASIC_BLOCK
106For RTL, these pointers are @code{BB_HEAD} and @code{BB_END}.
107
108@cindex insn notes, notes
109@findex NOTE_INSN_BASIC_BLOCK
110In the RTL representation of a function, the instruction stream
111contains not only the ``real'' instructions, but also @dfn{notes}
112or @dfn{insn notes} (to distinguish them from @dfn{reg notes}).
113Any function that moves or duplicates the basic blocks needs
114to take care of updating of these notes. Many of these notes expect
115that the instruction stream consists of linear regions, so updating
116can sometimes be tedious. All types of insn notes are defined
117in @file{insn-notes.def}.
118
119In the RTL function representation, the instructions contained in a
120basic block always follow a @code{NOTE_INSN_BASIC_BLOCK}, but zero
121or more @code{CODE_LABEL} nodes can precede the block note.
122A basic block ends with a control flow instruction or with the last
123instruction before the next @code{CODE_LABEL} or
124@code{NOTE_INSN_BASIC_BLOCK}.
125By definition, a @code{CODE_LABEL} cannot appear in the middle of
126the instruction stream of a basic block.
127
128@findex can_fallthru
129@cindex table jump
130In addition to notes, the jump table vectors are also represented as
131``pseudo-instructions'' inside the insn stream. These vectors never
132appear in the basic block and should always be placed just after the
133table jump instructions referencing them. After removing the
134table-jump it is often difficult to eliminate the code computing the
135address and referencing the vector, so cleaning up these vectors is
136postponed until after liveness analysis. Thus the jump table vectors
137may appear in the insn stream unreferenced and without any purpose.
138Before any edge is made @dfn{fall-thru}, the existence of such
139construct in the way needs to be checked by calling
140@code{can_fallthru} function.
141
142@cindex GIMPLE statement iterators
143For the @code{GIMPLE} representation, the PHI nodes and statements
144contained in a basic block are in a @code{gimple_seq} pointed to by
145the basic block intermediate language specific pointers.
146Abstract containers and iterators are used to access the PHI nodes
147and statements in a basic blocks. These iterators are called
148@dfn{GIMPLE statement iterators} (GSIs). Grep for @code{^gsi}
149in the various @file{gimple-*} and @file{tree-*} files.
150There is a @code{gimple_stmt_iterator} type for iterating over
151all kinds of statement, and a @code{gphi_iterator} subclass for
152iterating over PHI nodes.
153The following snippet will pretty-print all PHI nodes the statements
154of the current function in the GIMPLE representation.
155
156@smallexample
157basic_block bb;
158
159FOR_EACH_BB (bb)
160 @{
161 gphi_iterator pi;
162 gimple_stmt_iterator si;
163
164 for (pi = gsi_start_phis (bb); !gsi_end_p (pi); gsi_next (&pi))
165 @{
166 gphi *phi = pi.phi ();
167 print_gimple_stmt (dump_file, phi, 0, TDF_SLIM);
168 @}
169 for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si))
170 @{
171 gimple stmt = gsi_stmt (si);
172 print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
173 @}
174 @}
175@end smallexample
176
177
178@node Edges
179@section Edges
180
181@cindex edge in the flow graph
182@findex edge
183Edges represent possible control flow transfers from the end of some
184basic block A to the head of another basic block B@. We say that A is
185a predecessor of B, and B is a successor of A@. Edges are represented
186in GCC with the @code{edge} data type. Each @code{edge} acts as a
187link between two basic blocks: The @code{src} member of an edge
188points to the predecessor basic block of the @code{dest} basic block.
189The members @code{preds} and @code{succs} of the @code{basic_block} data
190type point to type-safe vectors of edges to the predecessors and
191successors of the block.
192
193@cindex edge iterators
194When walking the edges in an edge vector, @dfn{edge iterators} should
195be used. Edge iterators are constructed using the
196@code{edge_iterator} data structure and several methods are available
197to operate on them:
198
199@ftable @code
200@item ei_start
201This function initializes an @code{edge_iterator} that points to the
202first edge in a vector of edges.
203
204@item ei_last
205This function initializes an @code{edge_iterator} that points to the
206last edge in a vector of edges.
207
208@item ei_end_p
209This predicate is @code{true} if an @code{edge_iterator} represents
210the last edge in an edge vector.
211
212@item ei_one_before_end_p
213This predicate is @code{true} if an @code{edge_iterator} represents
214the second last edge in an edge vector.
215
216@item ei_next
217This function takes a pointer to an @code{edge_iterator} and makes it
218point to the next edge in the sequence.
219
220@item ei_prev
221This function takes a pointer to an @code{edge_iterator} and makes it
222point to the previous edge in the sequence.
223
224@item ei_edge
225This function returns the @code{edge} currently pointed to by an
226@code{edge_iterator}.
227
228@item ei_safe_edge
229This function returns the @code{edge} currently pointed to by an
230@code{edge_iterator}, but returns @code{NULL} if the iterator is
231pointing at the end of the sequence. This function has been provided
232for existing code makes the assumption that a @code{NULL} edge
233indicates the end of the sequence.
234
235@end ftable
236
237The convenience macro @code{FOR_EACH_EDGE} can be used to visit all of
238the edges in a sequence of predecessor or successor edges. It must
239not be used when an element might be removed during the traversal,
240otherwise elements will be missed. Here is an example of how to use
241the macro:
242
243@smallexample
244edge e;
245edge_iterator ei;
246
247FOR_EACH_EDGE (e, ei, bb->succs)
248 @{
249 if (e->flags & EDGE_FALLTHRU)
250 break;
251 @}
252@end smallexample
253
254@findex fall-thru
255There are various reasons why control flow may transfer from one block
256to another. One possibility is that some instruction, for example a
257@code{CODE_LABEL}, in a linearized instruction stream just always
258starts a new basic block. In this case a @dfn{fall-thru} edge links
259the basic block to the first following basic block. But there are
260several other reasons why edges may be created. The @code{flags}
261field of the @code{edge} data type is used to store information
262about the type of edge we are dealing with. Each edge is of one of
263the following types:
264
265@table @emph
266@item jump
267No type flags are set for edges corresponding to jump instructions.
268These edges are used for unconditional or conditional jumps and in
269RTL also for table jumps. They are the easiest to manipulate as they
270may be freely redirected when the flow graph is not in SSA form.
271
d77de738 272@findex EDGE_FALLTHRU, force_nonfallthru
f33d7a88 273@item fall-thru
d77de738
ML
274Fall-thru edges are present in case where the basic block may continue
275execution to the following one without branching. These edges have
276the @code{EDGE_FALLTHRU} flag set. Unlike other types of edges, these
277edges must come into the basic block immediately following in the
278instruction stream. The function @code{force_nonfallthru} is
279available to insert an unconditional jump in the case that redirection
280is needed. Note that this may require creation of a new basic block.
281
d77de738
ML
282@cindex exception handling
283@findex EDGE_ABNORMAL, EDGE_EH
f33d7a88 284@item exception handling
d77de738
ML
285Exception handling edges represent possible control transfers from a
286trapping instruction to an exception handler. The definition of
287``trapping'' varies. In C++, only function calls can throw, but for
288Ada exceptions like division by zero or segmentation fault are
289defined and thus each instruction possibly throwing this kind of
290exception needs to be handled as control flow instruction. Exception
291edges have the @code{EDGE_ABNORMAL} and @code{EDGE_EH} flags set.
292
293@findex purge_dead_edges
294When updating the instruction stream it is easy to change possibly
295trapping instruction to non-trapping, by simply removing the exception
296edge. The opposite conversion is difficult, but should not happen
297anyway. The edges can be eliminated via @code{purge_dead_edges} call.
298
299@findex REG_EH_REGION, EDGE_ABNORMAL_CALL
300In the RTL representation, the destination of an exception edge is
301specified by @code{REG_EH_REGION} note attached to the insn.
302In case of a trapping call the @code{EDGE_ABNORMAL_CALL} flag is set
303too. In the @code{GIMPLE} representation, this extra flag is not set.
304
305@findex may_trap_p, tree_could_trap_p
306In the RTL representation, the predicate @code{may_trap_p} may be used
307to check whether instruction still may trap or not. For the tree
308representation, the @code{tree_could_trap_p} predicate is available,
309but this predicate only checks for possible memory traps, as in
310dereferencing an invalid pointer location.
311
312
d77de738
ML
313@cindex sibling call
314@findex EDGE_ABNORMAL, EDGE_SIBCALL
f33d7a88 315@item sibling calls
d77de738
ML
316Sibling calls or tail calls terminate the function in a non-standard
317way and thus an edge to the exit must be present.
318@code{EDGE_SIBCALL} and @code{EDGE_ABNORMAL} are set in such case.
319These edges only exist in the RTL representation.
320
d77de738
ML
321@cindex computed jump
322@findex EDGE_ABNORMAL
f33d7a88 323@item computed jumps
d77de738
ML
324Computed jumps contain edges to all labels in the function referenced
325from the code. All those edges have @code{EDGE_ABNORMAL} flag set.
326The edges used to represent computed jumps often cause compile time
327performance problems, since functions consisting of many taken labels
328and many computed jumps may have @emph{very} dense flow graphs, so
329these edges need to be handled with special care. During the earlier
330stages of the compilation process, GCC tries to avoid such dense flow
331graphs by factoring computed jumps. For example, given the following
332series of jumps,
333
334@smallexample
335 goto *x;
336 [ @dots{} ]
337
338 goto *x;
339 [ @dots{} ]
340
341 goto *x;
342 [ @dots{} ]
343@end smallexample
344
345@noindent
346factoring the computed jumps results in the following code sequence
347which has a much simpler flow graph:
348
349@smallexample
350 goto y;
351 [ @dots{} ]
352
353 goto y;
354 [ @dots{} ]
355
356 goto y;
357 [ @dots{} ]
358
359y:
360 goto *x;
361@end smallexample
362
363@findex pass_duplicate_computed_gotos
364However, the classic problem with this transformation is that it has a
365runtime cost in there resulting code: An extra jump. Therefore, the
366computed jumps are un-factored in the later passes of the compiler
367(in the pass called @code{pass_duplicate_computed_gotos}).
368Be aware of that when you work on passes in that area. There have
369been numerous examples already where the compile time for code with
370unfactored computed jumps caused some serious headaches.
371
d77de738
ML
372@cindex nonlocal goto handler
373@findex EDGE_ABNORMAL, EDGE_ABNORMAL_CALL
f33d7a88 374@item nonlocal goto handlers
d77de738
ML
375GCC allows nested functions to return into caller using a @code{goto}
376to a label passed to as an argument to the callee. The labels passed
377to nested functions contain special code to cleanup after function
378call. Such sections of code are referred to as ``nonlocal goto
379receivers''. If a function contains such nonlocal goto receivers, an
380edge from the call to the label is created with the
381@code{EDGE_ABNORMAL} and @code{EDGE_ABNORMAL_CALL} flags set.
382
d77de738
ML
383@cindex function entry point, alternate function entry point
384@findex LABEL_ALTERNATE_NAME
f33d7a88 385@item function entry points
d77de738
ML
386By definition, execution of function starts at basic block 0, so there
387is always an edge from the @code{ENTRY_BLOCK_PTR} to basic block 0.
388There is no @code{GIMPLE} representation for alternate entry points at
389this moment. In RTL, alternate entry points are specified by
390@code{CODE_LABEL} with @code{LABEL_ALTERNATE_NAME} defined. This
391feature is currently used for multiple entry point prologues and is
392limited to post-reload passes only. This can be used by back-ends to
393emit alternate prologues for functions called from different contexts.
394In future full support for multiple entry functions defined by Fortran
39590 needs to be implemented.
396
397@item function exits
398In the pre-reload representation a function terminates after the last
399instruction in the insn chain and no explicit return instructions are
400used. This corresponds to the fall-thru edge into exit block. After
401reload, optimal RTL epilogues are used that use explicit (conditional)
402return instructions that are represented by edges with no flags set.
403
404@end table
405
406
407@node Profile information
408@section Profile information
409
410@cindex profile representation
411In many cases a compiler must make a choice whether to trade speed in
412one part of code for speed in another, or to trade code size for code
413speed. In such cases it is useful to know information about how often
414some given block will be executed. That is the purpose for
415maintaining profile within the flow graph.
416GCC can handle profile information obtained through @dfn{profile
417feedback}, but it can also estimate branch probabilities based on
418statics and heuristics.
419
420@cindex profile feedback
421The feedback based profile is produced by compiling the program with
422instrumentation, executing it on a train run and reading the numbers
423of executions of basic blocks and edges back to the compiler while
424re-compiling the program to produce the final executable. This method
425provides very accurate information about where a program spends most
426of its time on the train run. Whether it matches the average run of
427course depends on the choice of train data set, but several studies
428have shown that the behavior of a program usually changes just
429marginally over different data sets.
430
431@cindex Static profile estimation
432@cindex branch prediction
433@findex predict.def
434When profile feedback is not available, the compiler may be asked to
435attempt to predict the behavior of each branch in the program using a
436set of heuristics (see @file{predict.def} for details) and compute
437estimated frequencies of each basic block by propagating the
438probabilities over the graph.
439
440@findex frequency, count, BB_FREQ_BASE
441Each @code{basic_block} contains two integer fields to represent
442profile information: @code{frequency} and @code{count}. The
443@code{frequency} is an estimation how often is basic block executed
444within a function. It is represented as an integer scaled in the
445range from 0 to @code{BB_FREQ_BASE}. The most frequently executed
446basic block in function is initially set to @code{BB_FREQ_BASE} and
447the rest of frequencies are scaled accordingly. During optimization,
448the frequency of the most frequent basic block can both decrease (for
449instance by loop unrolling) or grow (for instance by cross-jumping
450optimization), so scaling sometimes has to be performed multiple
451times.
452
453@findex gcov_type
454The @code{count} contains hard-counted numbers of execution measured
455during training runs and is nonzero only when profile feedback is
456available. This value is represented as the host's widest integer
457(typically a 64 bit integer) of the special type @code{gcov_type}.
458
459Most optimization passes can use only the frequency information of a
460basic block, but a few passes may want to know hard execution counts.
461The frequencies should always match the counts after scaling, however
462during updating of the profile information numerical error may
463accumulate into quite large errors.
464
465@findex REG_BR_PROB_BASE, EDGE_FREQUENCY
466Each edge also contains a branch probability field: an integer in the
467range from 0 to @code{REG_BR_PROB_BASE}. It represents probability of
468passing control from the end of the @code{src} basic block to the
469@code{dest} basic block, i.e.@: the probability that control will flow
470along this edge. The @code{EDGE_FREQUENCY} macro is available to
471compute how frequently a given edge is taken. There is a @code{count}
472field for each edge as well, representing same information as for a
473basic block.
474
475The basic block frequencies are not represented in the instruction
476stream, but in the RTL representation the edge frequencies are
477represented for conditional jumps (via the @code{REG_BR_PROB}
478macro) since they are used when instructions are output to the
479assembly file and the flow graph is no longer maintained.
480
481@cindex reverse probability
482The probability that control flow arrives via a given edge to its
483destination basic block is called @dfn{reverse probability} and is not
484directly represented, but it may be easily computed from frequencies
485of basic blocks.
486
487@findex redirect_edge_and_branch
488Updating profile information is a delicate task that can unfortunately
489not be easily integrated with the CFG manipulation API@. Many of the
490functions and hooks to modify the CFG, such as
491@code{redirect_edge_and_branch}, do not have enough information to
492easily update the profile, so updating it is in the majority of cases
493left up to the caller. It is difficult to uncover bugs in the profile
494updating code, because they manifest themselves only by producing
495worse code, and checking profile consistency is not possible because
496of numeric error accumulation. Hence special attention needs to be
497given to this issue in each pass that modifies the CFG@.
498
499@findex REG_BR_PROB_BASE, BB_FREQ_BASE, count
500It is important to point out that @code{REG_BR_PROB_BASE} and
501@code{BB_FREQ_BASE} are both set low enough to be possible to compute
502second power of any frequency or probability in the flow graph, it is
503not possible to even square the @code{count} field, as modern CPUs are
504fast enough to execute $2^32$ operations quickly.
505
506
507@node Maintaining the CFG
508@section Maintaining the CFG
509@findex cfghooks.h
510
511An important task of each compiler pass is to keep both the control
512flow graph and all profile information up-to-date. Reconstruction of
513the control flow graph after each pass is not an option, since it may be
514very expensive and lost profile information cannot be reconstructed at
515all.
516
517GCC has two major intermediate representations, and both use the
518@code{basic_block} and @code{edge} data types to represent control
519flow. Both representations share as much of the CFG maintenance code
520as possible. For each representation, a set of @dfn{hooks} is defined
521so that each representation can provide its own implementation of CFG
522manipulation routines when necessary. These hooks are defined in
523@file{cfghooks.h}. There are hooks for almost all common CFG
524manipulations, including block splitting and merging, edge redirection
525and creating and deleting basic blocks. These hooks should provide
526everything you need to maintain and manipulate the CFG in both the RTL
527and @code{GIMPLE} representation.
528
529At the moment, the basic block boundaries are maintained transparently
530when modifying instructions, so there rarely is a need to move them
531manually (such as in case someone wants to output instruction outside
532basic block explicitly).
533
534@findex BLOCK_FOR_INSN, gimple_bb
535In the RTL representation, each instruction has a
536@code{BLOCK_FOR_INSN} value that represents pointer to the basic block
537that contains the instruction. In the @code{GIMPLE} representation, the
538function @code{gimple_bb} returns a pointer to the basic block
539containing the queried statement.
540
541@cindex GIMPLE statement iterators
542When changes need to be applied to a function in its @code{GIMPLE}
543representation, @dfn{GIMPLE statement iterators} should be used. These
544iterators provide an integrated abstraction of the flow graph and the
545instruction stream. Block statement iterators are constructed using
546the @code{gimple_stmt_iterator} data structure and several modifiers are
547available, including the following:
548
549@ftable @code
550@item gsi_start
551This function initializes a @code{gimple_stmt_iterator} that points to
552the first non-empty statement in a basic block.
553
554@item gsi_last
555This function initializes a @code{gimple_stmt_iterator} that points to
556the last statement in a basic block.
557
558@item gsi_end_p
559This predicate is @code{true} if a @code{gimple_stmt_iterator}
560represents the end of a basic block.
561
562@item gsi_next
563This function takes a @code{gimple_stmt_iterator} and makes it point to
564its successor.
565
566@item gsi_prev
567This function takes a @code{gimple_stmt_iterator} and makes it point to
568its predecessor.
569
570@item gsi_insert_after
571This function inserts a statement after the @code{gimple_stmt_iterator}
572passed in. The final parameter determines whether the statement
573iterator is updated to point to the newly inserted statement, or left
574pointing to the original statement.
575
576@item gsi_insert_before
577This function inserts a statement before the @code{gimple_stmt_iterator}
578passed in. The final parameter determines whether the statement
579iterator is updated to point to the newly inserted statement, or left
580pointing to the original statement.
581
582@item gsi_remove
583This function removes the @code{gimple_stmt_iterator} passed in and
584rechains the remaining statements in a basic block, if any.
585@end ftable
586
587@findex BB_HEAD, BB_END
588In the RTL representation, the macros @code{BB_HEAD} and @code{BB_END}
589may be used to get the head and end @code{rtx} of a basic block. No
590abstract iterators are defined for traversing the insn chain, but you
591can just use @code{NEXT_INSN} and @code{PREV_INSN} instead. @xref{Insns}.
592
593@findex purge_dead_edges
594Usually a code manipulating pass simplifies the instruction stream and
595the flow of control, possibly eliminating some edges. This may for
596example happen when a conditional jump is replaced with an
597unconditional jump. Updating of edges
598is not transparent and each optimization pass is required to do so
599manually. However only few cases occur in practice. The pass may
600call @code{purge_dead_edges} on a given basic block to remove
601superfluous edges, if any.
602
603@findex redirect_edge_and_branch, redirect_jump
604Another common scenario is redirection of branch instructions, but
605this is best modeled as redirection of edges in the control flow graph
606and thus use of @code{redirect_edge_and_branch} is preferred over more
607low level functions, such as @code{redirect_jump} that operate on RTL
608chain only. The CFG hooks defined in @file{cfghooks.h} should provide
609the complete API required for manipulating and maintaining the CFG@.
610
611@findex split_block
612It is also possible that a pass has to insert control flow instruction
613into the middle of a basic block, thus creating an entry point in the
614middle of the basic block, which is impossible by definition: The
615block must be split to make sure it only has one entry point, i.e.@: the
616head of the basic block. The CFG hook @code{split_block} may be used
617when an instruction in the middle of a basic block has to become the
618target of a jump or branch instruction.
619
620@findex insert_insn_on_edge
621@findex commit_edge_insertions
622@findex gsi_insert_on_edge
623@findex gsi_commit_edge_inserts
624@cindex edge splitting
625For a global optimizer, a common operation is to split edges in the
626flow graph and insert instructions on them. In the RTL
627representation, this can be easily done using the
628@code{insert_insn_on_edge} function that emits an instruction
629``on the edge'', caching it for a later @code{commit_edge_insertions}
630call that will take care of moving the inserted instructions off the
631edge into the instruction stream contained in a basic block. This
632includes the creation of new basic blocks where needed. In the
633@code{GIMPLE} representation, the equivalent functions are
634@code{gsi_insert_on_edge} which inserts a block statement
635iterator on an edge, and @code{gsi_commit_edge_inserts} which flushes
636the instruction to actual instruction stream.
637
638@findex verify_flow_info
639@cindex CFG verification
640While debugging the optimization pass, the @code{verify_flow_info}
641function may be useful to find bugs in the control flow graph updating
642code.
643
644
645@node Liveness information
646@section Liveness information
647@cindex Liveness representation
648Liveness information is useful to determine whether some register is
649``live'' at given point of program, i.e.@: that it contains a value that
650may be used at a later point in the program. This information is
651used, for instance, during register allocation, as the pseudo
652registers only need to be assigned to a unique hard register or to a
653stack slot if they are live. The hard registers and stack slots may
654be freely reused for other values when a register is dead.
655
656Liveness information is available in the back end starting with
657@code{pass_df_initialize} and ending with @code{pass_df_finish}. Three
658flavors of live analysis are available: With @code{LR}, it is possible
659to determine at any point @code{P} in the function if the register may be
660used on some path from @code{P} to the end of the function. With
661@code{UR}, it is possible to determine if there is a path from the
662beginning of the function to @code{P} that defines the variable.
663@code{LIVE} is the intersection of the @code{LR} and @code{UR} and a
664variable is live at @code{P} if there is both an assignment that reaches
665it from the beginning of the function and a use that can be reached on
666some path from @code{P} to the end of the function.
667
668In general @code{LIVE} is the most useful of the three. The macros
669@code{DF_[LR,UR,LIVE]_[IN,OUT]} can be used to access this information.
670The macros take a basic block number and return a bitmap that is indexed
671by the register number. This information is only guaranteed to be up to
672date after calls are made to @code{df_analyze}. See the file
673@code{df-core.cc} for details on using the dataflow.
674
675
676@findex REG_DEAD, REG_UNUSED
677The liveness information is stored partly in the RTL instruction stream
678and partly in the flow graph. Local information is stored in the
679instruction stream: Each instruction may contain @code{REG_DEAD} notes
680representing that the value of a given register is no longer needed, or
681@code{REG_UNUSED} notes representing that the value computed by the
682instruction is never used. The second is useful for instructions
683computing multiple values at once.
684