]> git.ipfire.org Git - thirdparty/gcc.git/blame - boehm-gc/doc/tree.html
re PR target/78594 (Bug in November 11th, 2016 change to rs6000.md)
[thirdparty/gcc.git] / boehm-gc / doc / tree.html
CommitLineData
61f38a77
TT
1<HTML>
2<HEAD>
3 <TITLE> Two-Level Tree Structure for Fast Pointer Lookup</TITLE>
ff6fe7a1 4 <AUTHOR> Hans-J. Boehm, Silicon Graphics (now at HP)</author>
61f38a77
TT
5</HEAD>
6<BODY>
7<H1>Two-Level Tree Structure for Fast Pointer Lookup</h1>
8<P>
9The conservative garbage collector described
ff6fe7a1
JS
10<A HREF="http://www.hpl.hp.com/personal/Hans_Boehm/gc/">here</a>
11uses a 2-level tree
61f38a77
TT
12data structure to aid in fast pointer identification.
13This data structure is described in a bit more detail here, since
14<OL>
15<LI> Variations of the data structure are more generally useful.
16<LI> It appears to be hard to understand by reading the code.
17<LI> Some other collectors appear to use inferior data structures to
18solve the same problem.
19<LI> It is central to fast collector operation.
20</ol>
21A candidate pointer is divided into three sections, the <I>high</i>,
22<I>middle</i>, and <I>low</i> bits. The exact division between these
23three groups of bits is dependent on the detailed collector configuration.
24<P>
25The high and middle bits are used to look up an entry in the table described
26here. The resulting table entry consists of either a block descriptor
27(<TT>struct hblkhdr *</tt> or <TT>hdr *</tt>)
28identifying the layout of objects in the block, or an indication that this
29address range corresponds to the middle of a large block, together with a
30hint for locating the actual block descriptor. Such a hint consist
31of a displacement that can be subtracted from the middle bits of the candidate
32pointer without leaving the object.
33<P>
34In either case, the block descriptor (<TT>struct hblkhdr</tt>)
35refers to a table of object starting addresses (the <TT>hb_map</tt> field).
36The starting address table is indexed by the low bits if the candidate pointer.
37The resulting entry contains a displacement to the beginning of the object,
38or an indication that this cannot be a valid object pointer.
39(If all interior pointer are recognized, pointers into large objects
40are handled specially, as appropriate.)
41
42<H2>The Tree</h2>
43<P>
44The rest of this discussion focuses on the two level data structure
45used to map the high and middle bits to the block descriptor.
46<P>
47The high bits are used as an index into the <TT>GC_top_index</tt> (really
48<TT>GC_arrays._top_index</tt>) array. Each entry points to a
49<TT>bottom_index</tt> data structure. This structure in turn consists
50mostly of an array <TT>index</tt> indexed by the middle bits of
51the candidate pointer. The <TT>index</tt> array contains the actual
52<TT>hdr</tt> pointers.
53<P>
54Thus a pointer lookup consists primarily of a handful of memory references,
55and can be quite fast:
56<OL>
57<LI> The appropriate <TT>bottom_index</tt> pointer is looked up in
58<TT>GC_top_index</tt>, based on the high bits of the candidate pointer.
59<LI> The appropriate <TT>hdr</tt> pointer is looked up in the
60<TT>bottom_index</tt> structure, based on the middle bits.
61<LI> The block layout map pointer is retrieved from the <TT>hdr</tt>
62structure. (This memory reference is necessary since we try to share
63block layout maps.)
64<LI> The displacement to the beginning of the object is retrieved from the
65above map.
66</ol>
67<P>
68In order to conserve space, not all <TT>GC_top_index</tt> entries in fact
69point to distinct <TT>bottom_index</tt> structures. If no address with
70the corresponding high bits is part of the heap, then the entry points
71to <TT>GC_all_nils</tt>, a single <TT>bottom_index</tt> structure consisting
72only of NULL <TT>hdr</tt> pointers.
73<P>
74<TT>Bottom_index</tt> structures contain slightly more information than
75just <TT>hdr</tt> pointers. The <TT>asc_link</tt> field is used to link
76all <TT>bottom_index</tt> structures in ascending order for fast traversal.
77This list is pointed to be <TT>GC_all_bottom_indices</tt>.
78It is maintained with the aid of <TT>key</tt> field that contains the
79high bits corresponding to the <TT>bottom_index</tt>.
80
81<H2>64 bit addresses</h2>
82<P>
83In the case of 64 bit addresses, this picture is complicated slightly
84by the fact that one of the index structures would have to be huge to
85cover the entire address space with a two level tree. We deal with this
86by turning <TT>GC_top_index</tt> into a chained hash table, instead of
87a simple array. This adds a <TT>hash_link</tt> field to the
88<TT>bottom_index</tt> structure.
89<P>
90The "hash function" consists of dropping the high bits. This is cheap to
91compute, and guarantees that there will be no collisions if the heap
92is contiguous and not excessively large.
93
94<H2>A picture</h2>
95<P>
96The following is an ASCII diagram of the data structure.
97This was contributed by Dave Barrett several years ago.
98<PRE>
99
100 Data Structure used by GC_base in gc3.7:
101 21-Apr-94
102
103
104
105
106 63 LOG_TOP_SZ[11] LOG_BOTTOM_SZ[10] LOG_HBLKSIZE[13]
107 +------------------+----------------+------------------+------------------+
108 p:| | TL_HASH(hi) | | HBLKDISPL(p) |
109 +------------------+----------------+------------------+------------------+
110 \-----------------------HBLKPTR(p)-------------------/
111 \------------hi-------------------/
112 \______ ________/ \________ _______/ \________ _______/
113 V V V
114 | | |
115 GC_top_index[] | | |
116 --- +--------------+ | | |
117 ^ | | | | |
118 | | | | | |
119 TOP +--------------+<--+ | |
120 _SZ +-<| [] | * | |
121(items)| +--------------+ if 0 < bi< HBLKSIZE | |
122 | | | | then large object | |
123 | | | | starts at the bi'th | |
124 v | | | HBLK before p. | i |
125 --- | +--------------+ | (word- |
126 v | aligned) |
127 bi= |GET_BI(p){->hash_link}->key==hi | |
128 v | |
129 | (bottom_index) \ scratch_alloc'd | |
130 | ( struct bi ) / by get_index() | |
131 --- +->+--------------+ | |
132 ^ | | | |
133 ^ | | | |
134 BOTTOM | | ha=GET_HDR_ADDR(p) | |
135_SZ(items)+--------------+<----------------------+ +-------+
136 | +--<| index[] | |
137 | | +--------------+ GC_obj_map: v
138 | | | | from / +-+-+-----+-+-+-+-+ ---
139 v | | | GC_add < 0| | | | | | | | ^
140 --- | +--------------+ _map_entry \ +-+-+-----+-+-+-+-+ |
141 | | asc_link | +-+-+-----+-+-+-+-+ MAXOBJSZ
142 | +--------------+ +-->| | | j | | | | | +1
143 | | key | | +-+-+-----+-+-+-+-+ |
144 | +--------------+ | +-+-+-----+-+-+-+-+ |
145 | | hash_link | | | | | | | | | | v
146 | +--------------+ | +-+-+-----+-+-+-+-+ ---
147 | | |<--MAX_OFFSET--->|
148 | | (bytes)
149HDR(p)| GC_find_header(p) | |<--MAP_ENTRIES-->|
150 | \ from | =HBLKSIZE/WORDSZ
151 | (hdr) (struct hblkhdr) / alloc_hdr() | (1024 on Alpha)
152 +-->+----------------------+ | (8/16 bits each)
153GET_HDR(p)| word hb_sz (words) | |
154 +----------------------+ |
155 | struct hblk *hb_next | |
156 +----------------------+ |
157 |mark_proc hb_mark_proc| |
158 +----------------------+ |
159 | char * hb_map |>-------------+
160 +----------------------+
161 | ushort hb_obj_kind |
162 +----------------------+
163 | hb_last_reclaimed |
164 --- +----------------------+
165 ^ | |
166 MARK_BITS| hb_marks[] | *if hdr is free, hb_sz + DISCARD_WORDS
167_SZ(words)| | is the size of a heap chunk (struct hblk)
168 v | | of at least MININCR*HBLKSIZE bytes (below),
169 --- +----------------------+ otherwise, size of each object in chunk.
170
171Dynamic data structures above are interleaved throughout the heap in blocks of
172size MININCR * HBLKSIZE bytes as done by gc_scratch_alloc which cannot be
173freed; free lists are used (e.g. alloc_hdr). HBLK's below are collected.
174
175 (struct hblk)
176 --- +----------------------+ < HBLKSIZE --- --- DISCARD_
177 ^ |garbage[DISCARD_WORDS]| aligned ^ ^ HDR_BYTES WORDS
178 | | | | v (bytes) (words)
179 | +-----hb_body----------+ < WORDSZ | --- ---
180 | | | aligned | ^ ^
181 | | Object 0 | | hb_sz |
182 | | | i |(word- (words)|
183 | | | (bytes)|aligned) v |
184 | + - - - - - - - - - - -+ --- | --- |
185 | | | ^ | ^ |
186 n * | | j (words) | hb_sz BODY_SZ
187 HBLKSIZE | Object 1 | v v | (words)
188 (bytes) | |--------------- v MAX_OFFSET
189 | + - - - - - - - - - - -+ --- (bytes)
190 | | | !All_INTERIOR_PTRS ^ |
191 | | | sets j only for hb_sz |
192 | | Object N | valid object offsets. | |
193 v | | All objects WORDSZ v v
194 --- +----------------------+ aligned. --- ---
195
196DISCARD_WORDS is normally zero. Indeed the collector has not been tested
197with another value in ages.
198</pre>
199</body>