]> git.ipfire.org Git - thirdparty/gcc.git/blame - gcc/doc/gty.texi
toplevel/
[thirdparty/gcc.git] / gcc / doc / gty.texi
CommitLineData
d91f7526 1@c Copyright (C) 2002, 2003, 2004, 2007, 2008, 2009, 2010
1f3233d1 2@c Free Software Foundation, Inc.
3@c This is part of the GCC manual.
4@c For copying conditions, see the file gcc.texi.
5
6@node Type Information
7@chapter Memory Management and Type Information
8@cindex GGC
9@findex GTY
10
11GCC uses some fairly sophisticated memory management techniques, which
12involve determining information about GCC's data structures from GCC's
573aba85 13source code and using this information to perform garbage collection and
14implement precompiled headers.
1f3233d1 15
7035b2ab 16A full C parser would be too complicated for this task, so a limited
1f3233d1 17subset of C is interpreted and special markers are used to determine
7035b2ab 18what parts of the source to look at. All @code{struct} and
19@code{union} declarations that define data structures that are
20allocated under control of the garbage collector must be marked. All
21global variables that hold pointers to garbage-collected memory must
22also be marked. Finally, all global variables that need to be saved
23and restored by a precompiled header must be marked. (The precompiled
24header mechanism can only save static variables if they're scalar.
25Complex data structures must be allocated in garbage-collected memory
26to be saved in a precompiled header.)
27
28The full format of a marker is
29@smallexample
30GTY (([@var{option}] [(@var{param})], [@var{option}] [(@var{param})] @dots{}))
31@end smallexample
32@noindent
33but in most cases no options are needed. The outer double parentheses
34are still necessary, though: @code{GTY(())}. Markers can appear:
1f3233d1 35
1f3233d1 36@itemize @bullet
37@item
38In a structure definition, before the open brace;
39@item
7a0ad664 40In a global variable declaration, after the keyword @code{static} or
1f3233d1 41@code{extern}; and
42@item
43In a structure field definition, before the name of the field.
44@end itemize
45
7035b2ab 46Here are some examples of marking simple data structures and globals.
47
48@smallexample
fb1e4f4a 49struct GTY(()) @var{tag}
7035b2ab 50@{
51 @var{fields}@dots{}
52@};
53
fb1e4f4a 54typedef struct GTY(()) @var{tag}
7035b2ab 55@{
56 @var{fields}@dots{}
57@} *@var{typename};
58
59static GTY(()) struct @var{tag} *@var{list}; /* @r{points to GC memory} */
60static GTY(()) int @var{counter}; /* @r{save counter in a PCH} */
61@end smallexample
62
63The parser understands simple typedefs such as
09c8496f 64@code{typedef struct @var{tag} *@var{name};} and
7035b2ab 65@code{typedef int @var{name};}.
66These don't need to be marked.
67
68@menu
c24c5fac 69* GTY Options:: What goes inside a @code{GTY(())}.
70* GGC Roots:: Making global variables GGC roots.
71* Files:: How the generated files work.
4dbb1e0c 72* Invoking the garbage collector:: How to invoke the garbage collector.
77580f20 73* Troubleshooting:: When something does not work as expected.
7035b2ab 74@end menu
75
76@node GTY Options
77@section The Inside of a @code{GTY(())}
78
79Sometimes the C code is not enough to fully describe the type
80structure. Extra information can be provided with @code{GTY} options
81and additional markers. Some options take a parameter, which may be
82either a string or a type name, depending on the parameter. If an
83option takes no parameter, it is acceptable either to omit the
84parameter entirely, or to provide an empty string as a parameter. For
85example, @code{@w{GTY ((skip))}} and @code{@w{GTY ((skip ("")))}} are
86equivalent.
1f3233d1 87
7035b2ab 88When the parameter is a string, often it is a fragment of C code. Four
89special escapes may be used in these strings, to refer to pieces of
90the data structure being marked:
1f3233d1 91
92@cindex % in GTY option
93@table @code
94@item %h
7035b2ab 95The current structure.
1f3233d1 96@item %1
7035b2ab 97The structure that immediately contains the current structure.
1f3233d1 98@item %0
7035b2ab 99The outermost structure that contains the current structure.
c849df63 100@item %a
7a5a9c54 101A partial expression of the form @code{[i1][i2]@dots{}} that indexes
7035b2ab 102the array item currently being marked.
1f3233d1 103@end table
104
7035b2ab 105For instance, suppose that you have a structure of the form
106@smallexample
107struct A @{
7a5a9c54 108 @dots{}
7035b2ab 109@};
110struct B @{
111 struct A foo[12];
112@};
113@end smallexample
114@noindent
115and @code{b} is a variable of type @code{struct B}. When marking
116@samp{b.foo[11]}, @code{%h} would expand to @samp{b.foo[11]},
117@code{%0} and @code{%1} would both expand to @samp{b}, and @code{%a}
118would expand to @samp{[11]}.
119
120As in ordinary C, adjacent strings will be concatenated; this is
121helpful when you have a complicated expression.
122@smallexample
123@group
124GTY ((chain_next ("TREE_CODE (&%h.generic) == INTEGER_TYPE"
125 " ? TYPE_NEXT_VARIANT (&%h.generic)"
126 " : TREE_CHAIN (&%h.generic)")))
127@end group
128@end smallexample
129
1f3233d1 130The available options are:
131
132@table @code
133@findex length
7035b2ab 134@item length ("@var{expression}")
1f3233d1 135
136There are two places the type machinery will need to be explicitly told
137the length of an array. The first case is when a structure ends in a
138variable-length array, like this:
7035b2ab 139@smallexample
fb1e4f4a 140struct GTY(()) rtvec_def @{
c24c5fac 141 int num_elem; /* @r{number of elements} */
1f3233d1 142 rtx GTY ((length ("%h.num_elem"))) elem[1];
7035b2ab 143@};
144@end smallexample
145
1f3233d1 146In this case, the @code{length} option is used to override the specified
147array length (which should usually be @code{1}). The parameter of the
148option is a fragment of C code that calculates the length.
149
150The second case is when a structure or a global variable contains a
151pointer to an array, like this:
b724fad7 152@smallexample
ba72912a 153struct gimple_omp_for_iter * GTY((length ("%h.collapse"))) iter;
b724fad7 154@end smallexample
ba72912a 155In this case, @code{iter} has been allocated by writing something like
b724fad7 156@smallexample
ba72912a 157 x->iter = ggc_alloc_cleared_vec_gimple_omp_for_iter (collapse);
b724fad7 158@end smallexample
ba72912a 159and the @code{collapse} provides the length of the field.
1f3233d1 160
161This second use of @code{length} also works on global variables, like:
162@verbatim
ba72912a 163static GTY((length("reg_known_value_size"))) rtx *reg_known_value;
1f3233d1 164@end verbatim
165
166@findex skip
167@item skip
168
169If @code{skip} is applied to a field, the type machinery will ignore it.
170This is somewhat dangerous; the only safe use is in a union when one
171field really isn't ever used.
172
173@findex desc
174@findex tag
c849df63 175@findex default
7035b2ab 176@item desc ("@var{expression}")
177@itemx tag ("@var{constant}")
c849df63 178@itemx default
1f3233d1 179
180The type machinery needs to be told which field of a @code{union} is
de315068 181currently active. This is done by giving each field a constant
182@code{tag} value, and then specifying a discriminator using @code{desc}.
183The value of the expression given by @code{desc} is compared against
184each @code{tag} value, each of which should be different. If no
185@code{tag} is matched, the field marked with @code{default} is used if
186there is one, otherwise no field in the union will be marked.
187
188In the @code{desc} option, the ``current structure'' is the union that
189it discriminates. Use @code{%1} to mean the structure containing it.
7035b2ab 190There are no escapes available to the @code{tag} option, since it is a
191constant.
de315068 192
193For example,
b724fad7 194@smallexample
fb1e4f4a 195struct GTY(()) tree_binding
b724fad7 196@{
1f3233d1 197 struct tree_common common;
b724fad7 198 union tree_binding_u @{
1f3233d1 199 tree GTY ((tag ("0"))) scope;
200 struct cp_binding_level * GTY ((tag ("1"))) level;
de315068 201 @} GTY ((desc ("BINDING_HAS_LEVEL_P ((tree)&%0)"))) xscope;
1f3233d1 202 tree value;
b724fad7 203@};
204@end smallexample
1f3233d1 205
de315068 206In this example, the value of BINDING_HAS_LEVEL_P when applied to a
207@code{struct tree_binding *} is presumed to be 0 or 1. If 1, the type
208mechanism will treat the field @code{level} as being present and if 0,
209will treat the field @code{scope} as being present.
1f3233d1 210
211@findex param_is
212@findex use_param
7035b2ab 213@item param_is (@var{type})
1f3233d1 214@itemx use_param
215
216Sometimes it's convenient to define some data structure to work on
c849df63 217generic pointers (that is, @code{PTR}) and then use it with a specific
218type. @code{param_is} specifies the real type pointed to, and
219@code{use_param} says where in the generic data structure that type
220should be put.
1f3233d1 221
7035b2ab 222For instance, to have a @code{htab_t} that points to trees, one would
223write the definition of @code{htab_t} like this:
224@smallexample
225typedef struct GTY(()) @{
226 @dots{}
227 void ** GTY ((use_param, @dots{})) entries;
228 @dots{}
229@} htab_t;
230@end smallexample
231and then declare variables like this:
232@smallexample
b8f03392 233 static htab_t GTY ((param_is (union tree_node))) ict;
7035b2ab 234@end smallexample
1f3233d1 235
c849df63 236@findex param@var{n}_is
237@findex use_param@var{n}
7035b2ab 238@item param@var{n}_is (@var{type})
c849df63 239@itemx use_param@var{n}
240
241In more complicated cases, the data structure might need to work on
242several different types, which might not necessarily all be pointers.
243For this, @code{param1_is} through @code{param9_is} may be used to
244specify the real type of a field identified by @code{use_param1} through
245@code{use_param9}.
246
247@findex use_params
248@item use_params
249
a99e98db 250When a structure contains another structure that is parameterized,
42cc08b5 251there's no need to do anything special, the inner structure inherits the
c849df63 252parameters of the outer one. When a structure contains a pointer to a
a99e98db 253parameterized structure, the type machinery won't automatically detect
c849df63 254this (it could, it just doesn't yet), so it's necessary to tell it that
255the pointed-to structure should use the same parameters as the outer
256structure. This is done by marking the pointer with the
257@code{use_params} option.
258
1f3233d1 259@findex deletable
260@item deletable
261
262@code{deletable}, when applied to a global variable, indicates that when
263garbage collection runs, there's no need to mark anything pointed to
264by this variable, it can just be set to @code{NULL} instead. This is used
265to keep a list of free structures around for re-use.
266
267@findex if_marked
7035b2ab 268@item if_marked ("@var{expression}")
1f3233d1 269
270Suppose you want some kinds of object to be unique, and so you put them
271in a hash table. If garbage collection marks the hash table, these
272objects will never be freed, even if the last other reference to them
273goes away. GGC has special handling to deal with this: if you use the
274@code{if_marked} option on a global hash table, GGC will call the
275routine whose name is the parameter to the option on each hash table
276entry. If the routine returns nonzero, the hash table entry will
277be marked as usual. If the routine returns zero, the hash table entry
278will be deleted.
279
280The routine @code{ggc_marked_p} can be used to determine if an element
281has been marked already; in fact, the usual case is to use
282@code{if_marked ("ggc_marked_p")}.
283
b0a1d041 284@findex mark_hook
285@item mark_hook ("@var{hook-routine-name}")
286
287If provided for a structure or union type, the given
288@var{hook-routine-name} (between double-quotes) is the name of a
289routine called when the garbage collector has just marked the data as
290reachable. This routine should not change the data, or call any ggc
291routine. Its only argument is a pointer to the just marked (const)
292structure or union.
293
1f3233d1 294@findex maybe_undef
295@item maybe_undef
296
297When applied to a field, @code{maybe_undef} indicates that it's OK if
298the structure that this fields points to is never defined, so long as
299this field is always @code{NULL}. This is used to avoid requiring
300backends to define certain optional structures. It doesn't work with
301language frontends.
302
8ed01400 303@findex nested_ptr
304@item nested_ptr (@var{type}, "@var{to expression}", "@var{from expression}")
305
306The type machinery expects all pointers to point to the start of an
307object. Sometimes for abstraction purposes it's convenient to have
308a pointer which points inside an object. So long as it's possible to
309convert the original object to and from the pointer, such pointers
310can still be used. @var{type} is the type of the original object,
311the @var{to expression} returns the pointer given the original object,
312and the @var{from expression} returns the original object given
313the pointer. The pointer will be available using the @code{%h}
314escape.
315
573aba85 316@findex chain_next
317@findex chain_prev
ae093573 318@findex chain_circular
7035b2ab 319@item chain_next ("@var{expression}")
320@itemx chain_prev ("@var{expression}")
ae093573 321@itemx chain_circular ("@var{expression}")
573aba85 322
323It's helpful for the type machinery to know if objects are often
324chained together in long lists; this lets it generate code that uses
325less stack space by iterating along the list instead of recursing down
326it. @code{chain_next} is an expression for the next item in the list,
7035b2ab 327@code{chain_prev} is an expression for the previous item. For singly
328linked lists, use only @code{chain_next}; for doubly linked lists, use
329both. The machinery requires that taking the next item of the
ae093573 330previous item gives the original item. @code{chain_circular} is similar
331to @code{chain_next}, but can be used for circular single linked lists.
573aba85 332
333@findex reorder
7035b2ab 334@item reorder ("@var{function name}")
573aba85 335
336Some data structures depend on the relative ordering of pointers. If
7035b2ab 337the precompiled header machinery needs to change that ordering, it
338will call the function referenced by the @code{reorder} option, before
339changing the pointers in the object that's pointed to by the field the
340option applies to. The function must take four arguments, with the
341signature @samp{@w{void *, void *, gt_pointer_operator, void *}}.
342The first parameter is a pointer to the structure that contains the
343object being updated, or the object itself if there is no containing
344structure. The second parameter is a cookie that should be ignored.
345The third parameter is a routine that, given a pointer, will update it
346to its correct new value. The fourth parameter is a cookie that must
347be passed to the second parameter.
348
349PCH cannot handle data structures that depend on the absolute values
350of pointers. @code{reorder} functions can be expensive. When
351possible, it is better to depend on properties of the data, like an ID
352number or the hash of a string instead.
573aba85 353
ba72912a 354@findex variable_size
355@item variable_size
356
357The type machinery expects the types to be of constant size. When this
358is not true, for example, with structs that have array fields or unions,
77580f20 359the type machinery cannot tell how many bytes need to be allocated at
ba72912a 360each allocation. The @code{variable_size} is used to mark such types.
77580f20 361The type machinery then provides allocators that take a parameter
362indicating an exact size of object being allocated. Note that the size
363must be provided in bytes whereas the @code{length} option works with
364array lengths in number of elements.
ba72912a 365
366For example,
367@smallexample
368struct GTY((variable_size)) sorted_fields_type @{
369 int len;
370 tree GTY((length ("%h.len"))) elts[1];
371@};
372@end smallexample
373
15b474a2 374Then the objects of @code{struct sorted_fields_type} are allocated in GC
ba72912a 375memory as follows:
376@smallexample
377 field_vec = ggc_alloc_sorted_fields_type (size);
378@end smallexample
379
77580f20 380If @var{field_vec->elts} stores @var{n} elements, then @var{size}
381could be calculated as follows:
382@smallexample
383 size_t size = sizeof (struct sorted_fields_type) + n * sizeof (tree);
384@end smallexample
385
248c24dd 386@findex atomic
387@item atomic
388
389The @code{atomic} option can only be used with pointers. It informs
390the GC machinery that the memory that the pointer points to does not
391contain any pointers, and hence it should be treated by the GC and PCH
392machinery as an ``atomic'' block of memory that does not need to be
393examined when scanning memory for pointers. In particular, the
394machinery will not scan that memory for pointers to mark them as
395reachable (when marking pointers for GC) or to relocate them (when
396writing a PCH file).
397
398The @code{atomic} option differs from the @code{skip} option.
399@code{atomic} keeps the memory under Garbage Collection, but makes the
400GC ignore the contents of the memory. @code{skip} is more drastic in
401that it causes the pointer and the memory to be completely ignored by
402the Garbage Collector. So, memory marked as @code{atomic} is
403automatically freed when no longer reachable, while memory marked as
404@code{skip} is not.
405
406The @code{atomic} option must be used with great care, because all
407sorts of problem can occur if used incorrectly, that is, if the memory
408the pointer points to does actually contain a pointer.
409
410Here is an example of how to use it:
411@smallexample
412struct GTY(()) my_struct @{
413 int number_of_elements;
414 unsigned int GTY ((atomic)) * elements;
415@};
416@end smallexample
417In this case, @code{elements} is a pointer under GC, and the memory it
418points to needs to be allocated using the Garbage Collector, and will
419be freed automatically by the Garbage Collector when it is no longer
420referenced. But the memory that the pointer points to is an array of
421@code{unsigned int} elements, and the GC must not try to scan it to
422find pointers to mark or relocate, which is why it is marked with the
423@code{atomic} option.
424
425Note that, currently, global variables can not be marked with
426@code{atomic}; only fields of a struct can. This is a known
427limitation. It would be useful to be able to mark global pointers
428with @code{atomic} to make the PCH machinery aware of them so that
429they are saved and restored correctly to PCH files.
430
1f3233d1 431@findex special
7035b2ab 432@item special ("@var{name}")
1f3233d1 433
7035b2ab 434The @code{special} option is used to mark types that have to be dealt
435with by special case machinery. The parameter is the name of the
436special case. See @file{gengtype.c} for further details. Avoid
437adding new special cases unless there is no other alternative.
1f3233d1 438@end table
439
440@node GGC Roots
441@section Marking Roots for the Garbage Collector
442@cindex roots, marking
443@cindex marking roots
444
445In addition to keeping track of types, the type machinery also locates
7035b2ab 446the global variables (@dfn{roots}) that the garbage collector starts
447at. Roots must be declared using one of the following syntaxes:
1f3233d1 448
7035b2ab 449@itemize @bullet
1f3233d1 450@item
7035b2ab 451@code{extern GTY(([@var{options}])) @var{type} @var{name};}
1f3233d1 452@item
7035b2ab 453@code{static GTY(([@var{options}])) @var{type} @var{name};}
454@end itemize
455@noindent
456The syntax
457@itemize @bullet
458@item
459@code{GTY(([@var{options}])) @var{type} @var{name};}
460@end itemize
461@noindent
462is @emph{not} accepted. There should be an @code{extern} declaration
463of such a variable in a header somewhere---mark that, not the
464definition. Or, if the variable is only used in one file, make it
465@code{static}.
f191dff6 466
1f3233d1 467@node Files
468@section Source Files Containing Type Information
469@cindex generated files
470@cindex files, generated
471
7035b2ab 472Whenever you add @code{GTY} markers to a source file that previously
473had none, or create a new source file containing @code{GTY} markers,
474there are three things you need to do:
1f3233d1 475
476@enumerate
477@item
776c30b8 478You need to add the file to the list of source files the type
7035b2ab 479machinery scans. There are four cases:
776c30b8 480
481@enumerate a
482@item
483For a back-end file, this is usually done
eb1bd38b 484automatically; if not, you should add it to @code{target_gtfiles} in
7a0ad664 485the appropriate port's entries in @file{config.gcc}.
776c30b8 486
487@item
7035b2ab 488For files shared by all front ends, add the filename to the
489@code{GTFILES} variable in @file{Makefile.in}.
1f3233d1 490
7a0ad664 491@item
7035b2ab 492For files that are part of one front end, add the filename to the
493@code{gtfiles} variable defined in the appropriate
e53d55e7 494@file{config-lang.in}.
76ee6ef2 495Headers should appear before non-headers in this list.
776c30b8 496
7035b2ab 497@item
498For files that are part of some but not all front ends, add the
499filename to the @code{gtfiles} variable of @emph{all} the front ends
500that use it.
776c30b8 501@end enumerate
502
1f3233d1 503@item
f191dff6 504If the file was a header file, you'll need to check that it's included
505in the right place to be visible to the generated files. For a back-end
506header file, this should be done automatically. For a front-end header
507file, it needs to be included by the same file that includes
508@file{gtype-@var{lang}.h}. For other header files, it needs to be
509included in @file{gtype-desc.c}, which is a generated file, so add it to
7a0ad664 510@code{ifiles} in @code{open_base_file} in @file{gengtype.c}.
f191dff6 511
512For source files that aren't header files, the machinery will generate a
513header file that should be included in the source file you just changed.
514The file will be called @file{gt-@var{path}.h} where @var{path} is the
515pathname relative to the @file{gcc} directory with slashes replaced by
516@verb{|-|}, so for example the header file to be included in
f163dcb6 517@file{cp/parser.c} is called @file{gt-cp-parser.c}. The
f191dff6 518generated header file should be included after everything else in the
519source file. Don't forget to mention this file as a dependency in the
520@file{Makefile}!
1f3233d1 521
1f3233d1 522@end enumerate
523
524For language frontends, there is another file that needs to be included
525somewhere. It will be called @file{gtype-@var{lang}.h}, where
526@var{lang} is the name of the subdirectory the language is contained in.
4dbb1e0c 527
9dc75945 528Plugins can add additional root tables. Run the @code{gengtype}
ae8a3b92 529utility in plugin mode as @code{gengtype -P pluginout.h @var{source-dir}
9dc75945 530@var{file-list} @var{plugin*.c}} with your plugin files
ae8a3b92 531@var{plugin*.c} using @code{GTY} to generate the @var{pluginout.h} file.
532The GCC build tree is needed to be present in that mode.
9dc75945 533
534
4dbb1e0c 535@node Invoking the garbage collector
536@section How to invoke the garbage collector
537@cindex garbage collector, invocation
538@findex ggc_collect
539
540The GCC garbage collector GGC is only invoked explicitly. In contrast
541with many other garbage collectors, it is not implicitly invoked by
542allocation routines when a lot of memory has been consumed. So the
543only way to have GGC reclaim storage it to call the @code{ggc_collect}
77580f20 544function explicitly. This call is an expensive operation, as it may
545have to scan the entire heap. Beware that local variables (on the GCC
4dbb1e0c 546call stack) are not followed by such an invocation (as many other
547garbage collectors do): you should reference all your data from static
548or external @code{GTY}-ed variables, and it is advised to call
77580f20 549@code{ggc_collect} with a shallow call stack. The GGC is an exact mark
4dbb1e0c 550and sweep garbage collector (so it does not scan the call stack for
77580f20 551pointers). In practice GCC passes don't often call @code{ggc_collect}
4dbb1e0c 552themselves, because it is called by the pass manager between passes.
77580f20 553
554At the time of the @code{ggc_collect} call all pointers in the GC-marked
555structures must be valid or @code{NULL}. In practice this means that
556there should not be uninitialized pointer fields in the structures even
557if your code never reads or writes those fields at a particular
558instance. One way to ensure this is to use cleared versions of
559allocators unless all the fields are initialized manually immediately
560after allocation.
561
562@node Troubleshooting
563@section Troubleshooting the garbage collector
564@cindex garbage collector, troubleshooting
565
566With the current garbage collector implementation, most issues should
567show up as GCC compilation errors. Some of the most commonly
568encountered issues are described below.
569
570@itemize @bullet
571@item Gengtype does not produce allocators for a @code{GTY}-marked type.
572Gengtype checks if there is at least one possible path from GC roots to
573at least one instance of each type before outputting allocators. If
574there is no such path, the @code{GTY} markers will be ignored and no
575allocators will be output. Solve this by making sure that there exists
576at least one such path. If creating it is unfeasible or raises a ``code
577smell'', consider if you really must use GC for allocating such type.
578
579@item Link-time errors about undefined @code{gt_ggc_r_foo_bar} and
580similarly-named symbols. Check if your @file{foo_bar} source file has
581@code{#include "gt-foo_bar.h"} as its very last line.
582
583@end itemize