@c Copyright (C) 2006-2016 Free Software Foundation, Inc.
@c This is part of the GCC manual.
@c For copying conditions, see the file gcc.texi.

@c ---------------------------------------------------------------------
@c Loop Representation
@c ---------------------------------------------------------------------

@node Loop Analysis and Representation
@chapter Analysis and Representation of Loops

GCC provides extensive infrastructure for working with natural loops,
i.e., strongly connected components of the CFG with only one entry
block.  This chapter describes the representation of loops in GCC, both
on GIMPLE and in RTL, as well as the interfaces to loop-related analyses
(induction variable analysis and number of iterations analysis).

@menu
* Loop representation::         Representation and analysis of loops.
* Loop querying::               Getting information about loops.
* Loop manipulation::           Loop manipulation functions.
* LCSSA::                       Loop-closed SSA form.
* Scalar evolutions::           Induction variables on GIMPLE.
* loop-iv::                     Induction variables on RTL.
* Number of iterations::        Number of iterations analysis.
* Dependency analysis::         Data dependency analysis.
@end menu

@node Loop representation
@section Loop representation
@cindex Loop representation
@cindex Loop analysis

This chapter describes the representation of loops in GCC, and the
functions that can be used to build, modify and analyze this
representation.  Most of the interfaces and data structures are declared
in @file{cfgloop.h}.  Loop structures are analyzed, and this information
is discarded or updated at the discretion of individual passes.  Still,
most of the generic CFG manipulation routines are aware of loop
structures and try to keep them up-to-date.  By this means, an
increasing part of the compilation pipeline is set up to maintain loop
structures across passes, allowing meta information to be attached to
individual loops for consumption by later passes.

In general, a natural loop has one entry block (header) and possibly
several back edges (latches) leading to the header from the inside of
the loop.  Loops with several latches may appear if several loops share
a single header, or if there is a branching in the middle of the loop.
The representation of loops in GCC however allows only loops with a
single latch.  During loop analysis, headers of such loops are split and
forwarder blocks are created in order to disambiguate their structures.
A heuristic based on profile information and the structure of the
induction variables in the loops is used to determine whether the
latches correspond to sub-loops or to control flow in a single loop.
This means that the analysis sometimes changes the CFG, and if you run
it in the middle of an optimization pass, you must be able to deal with
the new blocks.  You may avoid CFG changes by passing the
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} flag to the loop discovery; note
however that most other loop manipulation functions will not work
correctly for loops with multiple latch edges (the functions that only
query membership of blocks to loops and subloop relationships, or
enumerate and test loop exits, can be expected to work).

The body of the loop is the set of blocks that are dominated by its
header, and reachable from its latch against the direction of edges in
CFG@.  The loops are organized in a containment hierarchy (tree) such
that all the loops immediately contained inside loop L are the children
of L in the tree.  This tree is represented by the @code{struct loops}
structure.  The root of this tree is a fake loop that contains all
blocks in the function.  Each of the loops is represented in a
@code{struct loop} structure.  Each loop is assigned an index
(@code{num} field of the @code{struct loop} structure), and the pointer
to the loop is stored in the corresponding field of the @code{larray}
vector in the loops structure.  The indices do not have to be
contiguous; there may be empty (@code{NULL}) entries in the
@code{larray} created by deleting loops.  Also, there is no guarantee on
the relative order of a loop and its subloops in the numbering.  The
index of a loop never changes.

The entries of the @code{larray} field should not be accessed directly.
The function @code{get_loop} returns the loop description for a loop
with the given index.  The @code{number_of_loops} function returns the
number of loops in the function.  To traverse all loops, use the
@code{FOR_EACH_LOOP} macro.  The @code{flags} argument of the macro is
used to determine the direction of traversal and the set of loops
visited.  Each loop is guaranteed to be visited exactly once, regardless
of the changes to the loop tree, and the loops may be removed during the
traversal.  Newly created loops are never traversed; if they need to be
visited, this must be done separately after their creation.  The
@code{FOR_EACH_LOOP} macro allocates temporary variables.  If the
@code{FOR_EACH_LOOP} loop were ended using @code{break} or @code{goto},
they would not be released; the @code{FOR_EACH_LOOP_BREAK} macro must be
used instead.

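As an illustration, iterating over all innermost loops of the current
function might look like the following sketch (the exact signature of
@code{FOR_EACH_LOOP} and the set of available flags have varied between
GCC versions, so treat this as illustrative):

@smallexample
struct loop *loop;

FOR_EACH_LOOP (loop, LI_ONLY_INNERMOST)
  @{
    /* Process each innermost loop here.  */
  @}
@end smallexample
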
Each basic block contains a reference to the innermost loop it belongs
to (@code{loop_father}).  For this reason, it is only possible to have
one @code{struct loops} structure initialized at the same time for each
CFG@.  The global variable @code{current_loops} contains the
@code{struct loops} structure.  Many of the loop manipulation functions
assume that dominance information is up-to-date.

The loops are analyzed through the @code{loop_optimizer_init} function.
The argument of this function is a set of flags represented in an
integer bitmask.  These flags specify what other properties of the loop
structures should be calculated/enforced and preserved later:

@itemize
@item @code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES}: If this flag is set, no
changes to the CFG will be performed in the loop analysis; in
particular, loops with multiple latch edges will not be disambiguated.
If a loop has multiple latches, its latch block is set to NULL@.  Most
of the loop manipulation functions will not work for loops in this
shape.  No other flags that require CFG changes can be passed to
@code{loop_optimizer_init}.
@item @code{LOOPS_HAVE_PREHEADERS}: Forwarder blocks are created in such
a way that each loop has only one entry edge, and additionally, the
source block of this entry edge has only one successor.  This creates a
natural place where the code can be moved out of the loop, and ensures
that the entry edge of the loop leads from its immediate super-loop.
@item @code{LOOPS_HAVE_SIMPLE_LATCHES}: Forwarder blocks are created to
force the latch block of each loop to have only one successor.  This
ensures that the latch of the loop does not belong to any of its
sub-loops, and makes manipulating the loops significantly easier.  Most
of the loop manipulation functions assume that the loops are in this
shape.  Note that with this flag, the ``normal'' loop without any
control flow inside and with one exit consists of two basic blocks.
@item @code{LOOPS_HAVE_MARKED_IRREDUCIBLE_REGIONS}: Basic blocks and
edges in the strongly connected components that are not natural loops
(have more than one entry block) are marked with
@code{BB_IRREDUCIBLE_LOOP} and @code{EDGE_IRREDUCIBLE_LOOP} flags.  The
flag is not set for blocks and edges that belong to natural loops that
are in such an irreducible region (but it is set for the entry and exit
edges of such a loop, if they lead to/from this region).
@item @code{LOOPS_HAVE_RECORDED_EXITS}: The lists of exits are recorded
and updated for each loop.  This makes some functions (e.g.,
@code{get_loop_exit_edges}) more efficient.  Some functions (e.g.,
@code{single_exit}) can be used only if the lists of exits are
recorded.
@end itemize

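A pass typically requests several of these properties at once; the
following sketch shows one plausible combination (which flags a real
pass requests depends on the transformations it performs):

@smallexample
/* Compute loop structures with preheaders, simple latches and
   recorded exits.  */
loop_optimizer_init (LOOPS_HAVE_PREHEADERS
                     | LOOPS_HAVE_SIMPLE_LATCHES
                     | LOOPS_HAVE_RECORDED_EXITS);

/* @dots{} perform the loop transformations @dots{}  */

loop_optimizer_finalize ();
@end smallexample
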
These properties may also be computed/enforced later, using the
functions @code{create_preheaders}, @code{force_single_succ_latches},
@code{mark_irreducible_loops} and @code{record_loop_exits}.  The
properties can be queried using @code{loops_state_satisfies_p}.

The memory occupied by the loop structures should be freed with the
@code{loop_optimizer_finalize} function.  When loop structures are set
up to be preserved across passes, this function reduces the information
to be kept up-to-date to a minimum (only
@code{LOOPS_MAY_HAVE_MULTIPLE_LATCHES} set).

The CFG manipulation functions in general do not update loop structures.
Specialized versions that additionally do so are provided for the most
common tasks.  On GIMPLE, the @code{cleanup_tree_cfg_loop} function can
be used to clean up the CFG while updating the loop structures, provided
that @code{current_loops} is set.

At the moment, the loop structure is preserved from the start of GIMPLE
loop optimizations until the end of RTL loop optimizations.  During this
time a loop can be tracked by its @code{struct loop} and number.

@node Loop querying
@section Loop querying
@cindex Loop querying

The functions to query information about loops are declared in
@file{cfgloop.h}.  Some of the information can be taken directly from
the structures.  The @code{loop_father} field of each basic block
contains the innermost loop to which the block belongs.  The most useful
fields of the loop structure (that are kept up-to-date at all times)
are:

@itemize
@item @code{header}, @code{latch}: Header and latch basic blocks of the
loop.
@item @code{num_nodes}: Number of basic blocks in the loop (including
the basic blocks of the sub-loops).
@item @code{outer}, @code{inner}, @code{next}: The super-loop, the first
sub-loop, and the sibling of the loop in the loops tree.
@end itemize

There are other fields in the loop structures, many of them used only by
some of the passes, or not updated during CFG changes; in general, they
should not be accessed directly.

The most important functions to query loop structures are:

@itemize
@item @code{loop_depth}: The depth of the loop in the loops tree, i.e.,
the number of super-loops of the loop.
@item @code{flow_loops_dump}: Dumps the information about loops to a
file.
@item @code{verify_loop_structure}: Checks consistency of the loop
structures.
@item @code{loop_latch_edge}: Returns the latch edge of a loop.
@item @code{loop_preheader_edge}: If loops have preheaders, returns
the preheader edge of a loop.
@item @code{flow_loop_nested_p}: Tests whether a loop is a sub-loop of
another loop.
@item @code{flow_bb_inside_loop_p}: Tests whether a basic block belongs
to a loop (including its sub-loops).
@item @code{find_common_loop}: Finds the common super-loop of two loops.
@item @code{superloop_at_depth}: Returns the super-loop of a loop with
the given depth.
@item @code{tree_num_loop_insns}, @code{num_loop_insns}: Estimates the
number of insns in the loop, on GIMPLE and on RTL.
@item @code{loop_exit_edge_p}: Tests whether an edge is an exit from a
loop.
@item @code{mark_loop_exit_edges}: Marks all exit edges of all loops
with the @code{EDGE_LOOP_EXIT} flag.
@item @code{get_loop_body}, @code{get_loop_body_in_dom_order},
@code{get_loop_body_in_bfs_order}: Enumerate the basic blocks in the
loop in depth-first search order in the reversed CFG, ordered by the
dominance relation, and in breadth-first search order, respectively.
@item @code{single_exit}: Returns the single exit edge of the loop, or
@code{NULL} if the loop has more than one exit.  You can only use this
function if the @code{LOOPS_HAVE_RECORDED_EXITS} property is used.
@item @code{get_loop_exit_edges}: Enumerates the exit edges of a loop.
@item @code{just_once_each_iteration_p}: Returns true if the basic block
is executed exactly once during each iteration of a loop (that is, it
does not belong to a sub-loop, and it dominates the latch of the loop).
@end itemize

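As a small example of combining these queries, the following sketch
walks the body of a loop in dominance order and counts the blocks that
are executed in every iteration (@code{loop} is assumed to be a valid
@code{struct loop *}):

@smallexample
basic_block *body = get_loop_body_in_dom_order (loop);
unsigned count = 0;

for (unsigned i = 0; i < loop->num_nodes; i++)
  if (just_once_each_iteration_p (loop, body[i]))
    count++;

free (body);
@end smallexample
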
@node Loop manipulation
@section Loop manipulation
@cindex Loop manipulation

The loops tree can be manipulated using the following functions:

@itemize
@item @code{flow_loop_tree_node_add}: Adds a node to the tree.
@item @code{flow_loop_tree_node_remove}: Removes a node from the tree.
@item @code{add_bb_to_loop}: Adds a basic block to a loop.
@item @code{remove_bb_from_loops}: Removes a basic block from loops.
@end itemize

Most low-level CFG functions update loops automatically.  The following
functions handle some more complicated cases of CFG manipulations:

@itemize
@item @code{remove_path}: Removes an edge and all blocks it dominates.
@item @code{split_loop_exit_edge}: Splits the exit edge of a loop,
ensuring that PHI node arguments remain in the loop (this ensures that
loop-closed SSA form is preserved).  Only useful on GIMPLE.
@end itemize

Finally, there are some higher-level loop transformations implemented.
While some of them are written so that they should work on non-innermost
loops, they are mostly untested in that case, and at the moment, they
are only reliable for the innermost loops:

@itemize
@item @code{create_iv}: Creates a new induction variable.  Only works on
GIMPLE@.  @code{standard_iv_increment_position} can be used to find a
suitable place for the iv increment.
@item @code{duplicate_loop_to_header_edge},
@code{tree_duplicate_loop_to_header_edge}: These functions (on RTL and
on GIMPLE) duplicate the body of the loop a prescribed number of times
on one of the edges entering the loop header, thus performing either
loop unrolling or loop peeling.  @code{can_duplicate_loop_p}
(@code{can_unroll_loop_p} on GIMPLE) must be true for the duplicated
loop.
@item @code{loop_version}, @code{tree_ssa_loop_version}: These functions
create a copy of a loop, and a branch before them that selects one of
them depending on the prescribed condition.  This is useful for
optimizations that need to verify some assumptions at runtime (one of
the copies of the loop is usually left unchanged, while the other one is
transformed in some way).
@item @code{tree_unroll_loop}: Unrolls the loop, including peeling the
extra iterations to make the number of iterations divisible by the
unroll factor, updating the exit condition, and removing the exits that
now cannot be taken.  Works only on GIMPLE.
@end itemize

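For instance, creating a simple counting IV might look like the
following sketch (the argument order shown here is illustrative; see
@file{tree-ssa-loop-manip.c} for the authoritative signature of
@code{create_iv}):

@smallexample
gimple_stmt_iterator incr_pos;
bool insert_after;
tree var_before, var_after;

/* Find a suitable position for the increment @dots{}  */
standard_iv_increment_position (loop, &incr_pos, &insert_after);

/* @dots{} and create an IV running from 0 with step 1.  */
create_iv (build_int_cst (integer_type_node, 0),
           build_int_cst (integer_type_node, 1),
           NULL_TREE, loop, &incr_pos, insert_after,
           &var_before, &var_after);
@end smallexample
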
@node LCSSA
@section Loop-closed SSA form
@cindex LCSSA
@cindex Loop-closed SSA form

Throughout the loop optimizations on tree level, one extra condition is
enforced on the SSA form: no SSA name is used outside of the loop in
which it is defined.  The SSA form satisfying this condition is called
``loop-closed SSA form'' -- LCSSA@.  To enforce LCSSA, PHI nodes must be
created at the exits of the loops for the SSA names that are used
outside of them.  Only the real operands (not virtual SSA names) are
held in LCSSA, in order to save memory.

There are various benefits of LCSSA:

@itemize
@item Many optimizations (value range analysis, final value
replacement) are interested in the values that are defined in the loop
and used outside of it, i.e., exactly those for which we create new PHI
nodes.
@item In induction variable analysis, it is not necessary to specify the
loop in which the analysis should be performed -- the scalar evolution
analysis always returns the results with respect to the loop in which
the SSA name is defined.
@item It makes updating of SSA form during loop transformations simpler.
Without LCSSA, operations like loop unrolling may force creation of PHI
nodes arbitrarily far from the loop, while in LCSSA, the SSA form can be
updated locally.  However, since we only keep real operands in LCSSA, we
cannot use this advantage (we could have local updating of real
operands, but it is not much more efficient than using generic SSA form
updating for it as well; the amount of changes to SSA is the same).
@end itemize

However, it also means LCSSA must be updated.  This is usually
straightforward, unless you create a new value in a loop and use it
outside, or unless you manipulate loop exit edges (functions are
provided to make these manipulations simple).
@code{rewrite_into_loop_closed_ssa} is used to rewrite the SSA form to
LCSSA, and @code{verify_loop_closed_ssa} checks that the invariant of
LCSSA is preserved.

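For example, a pass that has introduced new uses of loop values outside
their loops might restore the invariant like this (the arguments shown
are illustrative):

@smallexample
/* Create the PHI nodes needed at loop exits and update the SSA
   form accordingly.  */
rewrite_into_loop_closed_ssa (NULL, TODO_update_ssa);

/* With checking enabled, assert that the LCSSA invariant holds.  */
verify_loop_closed_ssa (true);
@end smallexample
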
@node Scalar evolutions
@section Scalar evolutions
@cindex Scalar evolutions
@cindex IV analysis on GIMPLE

Scalar evolutions (SCEV) are used to represent results of induction
variable analysis on GIMPLE@.  They enable us to represent variables
with complicated behavior in a simple and consistent way (we only use it
to express values of polynomial induction variables, but it is possible
to extend it).  The interfaces to the SCEV analysis are declared in
@file{tree-scalar-evolution.h}.  To use the scalar evolutions analysis,
@code{scev_initialize} must be called first; to stop using SCEV,
@code{scev_finalize} should be called.  SCEV analysis caches results in
order to save time and memory.  This cache however is invalidated by
most of the loop transformations, including removal of code.  If such a
transformation is performed, @code{scev_reset} must be called to clean
the caches.

Given an SSA name, its behavior in loops can be analyzed using the
@code{analyze_scalar_evolution} function.  The returned SCEV however
does not have to be fully analyzed and it may contain references to
other SSA names defined in the loop.  To resolve these (potentially
recursive) references, the @code{instantiate_parameters} or
@code{resolve_mixers} functions must be used.
@code{instantiate_parameters} is useful when you use the results of SCEV
only for some analysis, and when you work with a whole nest of loops at
once.  It will try replacing all SSA names by their SCEV in all loops,
including the super-loops of the current loop, thus providing complete
information about the behavior of the variable in the loop nest.
@code{resolve_mixers} is useful if you work with only one loop at a
time, and if you possibly need to create code based on the value of the
induction variable.  It will only resolve the SSA names defined in the
current loop, leaving the SSA names defined outside unchanged, even if
their evolution in the outer loops is known.

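A minimal use of this interface might look as follows (a sketch only;
@code{name} stands for any SSA name of interest):

@smallexample
tree chrec = analyze_scalar_evolution (loop, name);

/* Resolve references to other names defined in the loop nest.  */
chrec = instantiate_parameters (loop, chrec);

if (chrec == chrec_dont_know)
  /* The evolution of NAME could not be determined.  */;
@end smallexample
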
The SCEV is a normal tree expression, except for the fact that it may
contain several special tree nodes.  One of them is
@code{SCEV_NOT_KNOWN}, used for SSA names whose value cannot be
expressed.  The other one is @code{POLYNOMIAL_CHREC}.  A polynomial
chrec has three arguments -- base, step and loop (both base and step may
contain further polynomial chrecs).  The type of the expression and of
the base and step must be the same.  A variable has evolution
@code{POLYNOMIAL_CHREC(base, step, loop)} if it is (in the specified
loop) equivalent to @code{x_1} in the following example

@smallexample
while (@dots{})
  @{
    x_1 = phi (base, x_2);
    x_2 = x_1 + step;
  @}
@end smallexample

Note that this includes the language restrictions on the operations.
For example, if we compile C code and @code{x} has a signed type, then
overflow in the addition would cause undefined behavior, and we may
assume that this does not happen.  Hence, the value with this SCEV
cannot overflow (which restricts the number of iterations of such a
loop).

In many cases, one wants to restrict the attention just to affine
induction variables.  In this case, the extra expressive power of SCEV
is not useful, and may complicate the optimizations.  Then the
@code{simple_iv} function may be used to analyze a value -- the result
is a loop-invariant base and step.

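A sketch of using @code{simple_iv} (the argument list shown is
illustrative; see @file{tree-scalar-evolution.h} for the authoritative
declaration):

@smallexample
affine_iv iv;

/* Ask whether EXPR, used in LOOP, is an affine IV with respect to
   LOOP; a nonconstant step is allowed here.  */
if (simple_iv (loop, loop, expr, &iv, true))
  @{
    /* iv.base and iv.step are loop-invariant trees.  */
  @}
@end smallexample
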
@node loop-iv
@section IV analysis on RTL
@cindex IV analysis on RTL

The induction variable analysis on RTL is simpler: it only handles
affine induction variables, and only in one loop at a time.  The
interface is declared in @file{cfgloop.h}.  Before analyzing induction
variables in a loop L, the @code{iv_analysis_loop_init} function must be
called on L.  After the analysis (possibly calling
@code{iv_analysis_loop_init} for several loops) is finished,
@code{iv_analysis_done} should be called.  The following functions can
be used to access the results of the analysis:

@itemize
@item @code{iv_analyze}: Analyzes a single register used in the given
insn.  If no use of the register in this insn is found, the following
insns are scanned, so that this function can be called on the insn
returned by @code{get_condition}.
@item @code{iv_analyze_result}: Analyzes the result of the assignment in
the given insn.
@item @code{iv_analyze_expr}: Analyzes a more complicated expression.
All its operands are analyzed by @code{iv_analyze}, and hence they must
be used in the specified insn or one of the following insns.
@end itemize

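A typical use of the RTL interface might look like this sketch (the
argument list is illustrative and has varied across GCC versions;
@code{insn} and @code{reg} are assumed to come from the loop being
analyzed):

@smallexample
struct rtx_iv iv;

iv_analysis_loop_init (loop);

if (iv_analyze (insn, reg, &iv))
  @{
    /* iv now describes the evolution of REG at INSN.  */
  @}

iv_analysis_done ();
@end smallexample
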
The description of the induction variable is provided in @code{struct
rtx_iv}.  In order to handle subregs, the representation is a bit
complicated; if the value of the @code{extend} field is not
@code{UNKNOWN}, the value of the induction variable in the i-th
iteration is

@smallexample
delta + mult * extend_@{extend_mode@} (subreg_@{mode@} (base + i * step)),
@end smallexample

with the following exception: if @code{first_special} is true, then the
value in the first iteration (when @code{i} is zero) is @code{delta +
mult * base}.  However, if @code{extend} is equal to @code{UNKNOWN},
then @code{first_special} must be false, @code{delta} must be 0,
@code{mult} must be 1, and the value in the i-th iteration is

@smallexample
subreg_@{mode@} (base + i * step)
@end smallexample

The function @code{get_iv_value} can be used to perform these
calculations.

@node Number of iterations
@section Number of iterations analysis
@cindex Number of iterations analysis

Both on GIMPLE and on RTL, there are functions available to determine
the number of iterations of a loop, with a similar interface.  The
number of iterations of a loop in GCC is defined as the number of
executions of the loop latch.  In many cases, it is not possible to
determine the number of iterations unconditionally -- the determined
number is correct only if some assumptions are satisfied.  The analysis
tries to verify these conditions using the information contained in the
program; if it fails, the conditions are returned together with the
result.  The following information and conditions are provided by the
analysis:

@itemize
@item @code{assumptions}: If this condition is false, the rest of
the information is invalid.
@item @code{noloop_assumptions} on RTL, @code{may_be_zero} on GIMPLE: If
this condition is true, the loop exits in the first iteration.
@item @code{infinite}: If this condition is true, the loop is infinite.
This condition is only available on RTL@.  On GIMPLE, conditions for
finiteness of the loop are included in @code{assumptions}.
@item @code{niter_expr} on RTL, @code{niter} on GIMPLE: The expression
that gives the number of iterations.  The number of iterations is
defined as the number of executions of the loop latch.
@end itemize

Both on GIMPLE and on RTL, it is necessary for the induction variable
analysis framework to be initialized (SCEV on GIMPLE, loop-iv on RTL).
On GIMPLE, the results are stored in the @code{struct tree_niter_desc}
structure.  The number of iterations before the loop is exited through a
given exit can be determined using the @code{number_of_iterations_exit}
function.  On RTL, the results are returned in the @code{struct
niter_desc} structure.  The corresponding function is named
@code{check_simple_exit}.  There are also functions that pass through
all the exits of a loop and try to find one with an easy-to-determine
number of iterations -- @code{find_loop_niter} on GIMPLE and
@code{find_simple_exit} on RTL@.  Finally, there are functions that
provide the same information, but additionally cache it, so that
repeated queries for the number of iterations are not so costly --
@code{number_of_latch_executions} on GIMPLE and
@code{get_simple_loop_desc} on RTL.

Note that some of these functions may behave slightly differently than
others -- some of them return only the expression for the number of
iterations, and fail if there are some assumptions.  The function
@code{number_of_latch_executions} works only for single-exit loops.
The function @code{number_of_cond_exit_executions} can be used to
determine the number of executions of the exit condition of a
single-exit loop (i.e., the @code{number_of_latch_executions} increased
by one).

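On GIMPLE, a query for the number of iterations through a known exit
edge might look like this sketch (the final argument controls warnings
and is illustrative):

@smallexample
struct tree_niter_desc niter;

if (number_of_iterations_exit (loop, exit, &niter, false)
    && integer_zerop (niter.may_be_zero))
  @{
    /* niter.niter gives the iteration count, valid provided that
       niter.assumptions holds.  */
  @}
@end smallexample
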
On GIMPLE, the following constraint flags affect the semantics of some
interfaces of the number of iterations analyzer:

@itemize
@item @code{LOOP_C_INFINITE}: If this constraint flag is set, the loop
is known to be infinite.  APIs like @code{number_of_iterations_exit} can
return false directly without doing any analysis.
@item @code{LOOP_C_FINITE}: If this constraint flag is set, the loop is
known to be finite; in other words, the loop's number of iterations can
be computed with @code{assumptions} being true.
@end itemize

Generally, the constraint flags are set/cleared by consumers, which are
loop optimizers.  It is also the consumers' responsibility to set/clear
constraints correctly.  Failing to do that might result in hard to track
down bugs in scev/niter consumers.  One typical use case is the
vectorizer: it drives the number of iterations analyzer by setting
@code{LOOP_C_FINITE} and vectorizes a possibly infinite loop by
versioning the loop with the analysis result.  In return, constraints
set by consumers can also help the number of iterations analyzer in
subsequent optimizers.  For example, @code{niter} of a loop versioned
under @code{assumptions} is valid unconditionally.

Other constraints may be added in the future, for example, a constraint
indicating that a loop's latch must roll, so that @code{may_be_zero}
would be false unconditionally.

@node Dependency analysis
@section Data Dependency Analysis
@cindex Data Dependency Analysis

The code for the data dependence analysis can be found in
@file{tree-data-ref.c} and its interface and data structures are
described in @file{tree-data-ref.h}.  The function that computes the
data dependences for all the array and pointer references for a given
loop is @code{compute_data_dependences_for_loop}.  This function is
currently used by the linear loop transform and the vectorization
passes.  Before calling this function, one has to allocate two vectors:
a first vector will contain the set of data references that are
contained in the analyzed loop body, and the second vector will contain
the dependence relations between the data references.  Thus if the
vector of data references is of size @code{n}, the vector containing the
dependence relations will contain @code{n*n} elements.  However, if the
analyzed loop contains side effects, such as calls that potentially can
interfere with the data references in the current analyzed loop, the
analysis stops while scanning the loop body for data references, and
inserts a single @code{chrec_dont_know} in the dependence relation
array.

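A sketch of invoking the analysis and releasing its results (the exact
parameters are illustrative; see @file{tree-data-ref.h} for the
authoritative declarations):

@smallexample
vec<loop_p> loop_nest = vNULL;
vec<data_reference_p> datarefs = vNULL;
vec<ddr_p> dependences = vNULL;

compute_data_dependences_for_loop (loop, true, &loop_nest,
                                   &datarefs, &dependences);

/* @dots{} inspect each relation via DDR_ARE_DEPENDENT @dots{}  */

free_dependence_relations (dependences);
free_data_refs (datarefs);
@end smallexample
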
The data references are discovered in a particular order during the
scanning of the loop body: the loop body is analyzed in execution order,
and the data references of each statement are pushed at the end of the
data reference array.  Two data references syntactically occur in the
program in the same order as in the array of data references.  This
syntactic order is important in some classical data dependence tests,
and mapping this order to the elements of this array avoids costly
queries to the loop body representation.

Three types of data references are currently handled: @code{ARRAY_REF},
@code{INDIRECT_REF} and @code{COMPONENT_REF}@.  The data structure for
the data reference is @code{data_reference}, where
@code{data_reference_p} is the name of a pointer to the data reference
structure.  The structure contains the following elements:

@itemize
@item @code{base_object_info}: Provides information about the base
object of the data reference and its access functions.  These access
functions represent the evolution of the data reference in the loop
relative to its base, in keeping with the classical meaning of the data
reference access function for the support of arrays.  For example, for a
reference @code{a.b[i][j]}, the base object is @code{a.b} and the access
functions, one for each array subscript, are:
@code{@{i_init, +, i_step@}_1, @{j_init, +, j_step@}_2}.

@item @code{first_location_in_loop}: Provides information about the
first location accessed by the data reference in the loop and about the
access function used to represent evolution relative to this location.
This data is used to support pointers, and is not used for arrays (for
which we have base objects).  Pointer accesses are represented as a
one-dimensional access that starts from the first location accessed in
the loop.  For example:

@smallexample
for1 i
  for2 j
    *((int *)p + i + j) = a[i][j];
@end smallexample

The access function of the pointer access is @code{@{0, +, 4B@}_for2}
relative to @code{p + i}.  The access functions of the array are
@code{@{i_init, +, i_step@}_for1} and @code{@{j_init, +, j_step@}_for2}
relative to @code{a}.

Usually, the object the pointer refers to is either unknown, or we can't
prove that the access is confined to the boundaries of a certain object.

Two data references can be compared only if at least one of these two
representations has all its fields filled for both data references.

The current strategy for data dependence tests is as follows:
If both @code{a} and @code{b} are represented as arrays, compare
@code{a.base_object} and @code{b.base_object};
if they are equal, apply dependence tests (use access functions based on
base_objects).
Else if both @code{a} and @code{b} are represented as pointers, compare
@code{a.first_location} and @code{b.first_location};
if they are equal, apply dependence tests (use access functions based on
first location).
However, if @code{a} and @code{b} are represented differently, only try
to prove that the bases are definitely different.

@item Aliasing information.
@item Alignment information.
@end itemize

The structure describing the relation between two data references is
@code{data_dependence_relation} and the shorter name for a pointer to
such a structure is @code{ddr_p}.  This structure contains:

@itemize
@item a pointer to each data reference,
@item a tree node @code{are_dependent} that is set to @code{chrec_known}
if the analysis has proved that there is no dependence between these two
data references, @code{chrec_dont_know} if the analysis was not able to
determine any useful result and potentially there could exist a
dependence between these data references, and @code{are_dependent} is
set to @code{NULL_TREE} if there exists a dependence relation between
the data references, and the description of this dependence relation is
given in the @code{subscripts}, @code{dir_vects}, and @code{dist_vects}
arrays,
@item a boolean that determines whether the dependence relation can be
represented by a classical distance vector,
@item an array @code{subscripts} that contains a description of each
subscript of the data references.  Given two array accesses, a
subscript is the tuple composed of the access functions for a given
dimension.  For example, given @code{A[f1][f2][f3]} and
@code{B[g1][g2][g3]}, there are three subscripts: @code{(f1, g1), (f2,
g2), (f3, g3)}.
@item two arrays @code{dir_vects} and @code{dist_vects} that contain
classical representations of the data dependences under the form of
direction and distance dependence vectors,
@item an array of loops @code{loop_nest} that contains the loops to
which the distance and direction vectors refer.
@end itemize

Several functions for pretty printing the information extracted by the
data dependence analysis are available: @code{dump_ddrs} prints the
details of a data dependence relations array with maximum verbosity,
@code{dump_dist_dir_vectors} prints only the classical distance and
direction vectors for a data dependence relations array, and
@code{dump_data_references} prints the details of the data references
contained in a data reference array.