Commit | Line | Data |
---|---|---|
9be90c68 NA |
1 | \input texinfo @c -*- Texinfo -*- |
2 | @setfilename ctf-spec.info | |
3 | @settitle The CTF File Format | |
4 | @ifnottex | |
9be90c68 NA |
5 | @xrefautomaticsectiontitle on |
6 | @end ifnottex | |
7 | @synindex fn cp | |
8 | @synindex tp cp | |
9 | @synindex vr cp | |
10 | ||
11 | @copying | |
fd67aa11 | 12 | Copyright @copyright{} 2021-2024 Free Software Foundation, Inc. |
9be90c68 NA |
13 | |
14 | Permission is granted to copy, distribute and/or modify this document | |
15 | under the terms of the GNU General Public License, Version 3 or any | |
16 | later version published by the Free Software Foundation. A copy of the | |
17 | license is included in the section entitled ``GNU General Public | |
18 | License''. | |
19 | ||
20 | @end copying | |
21 | ||
22 | @dircategory Software development | |
23 | @direntry | |
24 | * CTF: (ctf-spec). The CTF file format. | |
25 | @end direntry | |
26 | ||
27 | @titlepage | |
28 | @title The CTF File Format | |
29 | @subtitle Version 3 | |
30 | @author Nick Alcock | |
31 | ||
32 | @page | |
33 | @vskip 0pt plus 1filll | |
34 | @insertcopying | |
35 | @end titlepage | |
36 | @contents | |
37 | ||
38 | @ifnottex | |
39 | @node Top | |
40 | @top The CTF file format | |
41 | ||
42 | This manual describes version 3 of the CTF file format, which is | |
43 | intended to model the C type system in a fashion that C programs can | |
44 | consume at runtime. | |
45 | @end ifnottex | |
46 | ||
47 | @node Overview | |
48 | @unnumbered Overview | |
49 | @cindex Overview | |
50 | ||
51 | The CTF file format compactly describes C types and the association | |
52 | between function and data symbols and types: if embedded in ELF objects, | |
53 | it can exploit the ELF string table to reduce duplication further. | |
54 | There is no real concept of namespacing: only top-level types are | |
55 | described, not types scoped to within single functions. | |
56 | ||
57 | CTF dictionaries can be @dfn{children} of other dictionaries, in a | |
58 | one-level hierarchy: child dictionaries can refer to types in the | |
59 | parent, but the opposite is not sensible (since if you refer to a child | |
60 | type in the parent, the actual type you cited would vary depending on | |
61 | what child was attached). This parent/child definition is recorded in | |
62 | the child, but only as a recommendation: users of the API have to attach | |
63 | parents to children explicitly, and can choose to attach a child to any | |
64 | parent they like, or to none, though doing so might lead to unpleasant | |
65 | consequences like dangling references to types. @xref{Type indexes and | |
66 | type IDs}. Type lookups in child dicts that are not associated with a | |
67 | parent at all will fail with @code{ECTF_NOPARENT} if a parent type was | |
68 | needed. | |
69 | ||
70 | The associated API to generate, merge together, and query this file | |
71 | format will be described in the accompanying @code{libctf} manual once | |
72 | it is written. There is no API to modify dictionaries once they've been | |
73 | written out: CTF is a write-once file format. (However, it is always | |
74 | possible to dynamically create a new child dictionary on the fly and | |
75 | attach it to a pre-existing, read-only parent.) | |
76 | ||
77 | There are two major pieces to CTF: the @dfn{archive} and the | |
78 | @dfn{dictionary}. Some relatives and ancestors of CTF call dictionaries | |
79 | @dfn{containers}: the archive format is unique to this variant of CTF. | |
80 | (Much of the source code still uses the old term.) | |
81 | ||
82 | The archive file format is a very simple mmappable archive used to group | |
83 | multiple dictionaries together: it is expected to slowly go | |
84 | away and be replaced by other mechanisms, but right now it is an | |
85 | important part of the file format, used to group dictionaries containing | |
86 | types with conflicting definitions in different TUs with the overarching | |
87 | dictionary used to store all other types. (Even when archives go away, | |
88 | the @code{libctf} API used to access them will remain, and access the | |
89 | other mechanisms that replace them instead.) | |
90 | ||
91 | The CTF dictionary consists of a @dfn{preamble}, which does not vary | |
92 | between versions of the CTF file format, and a @dfn{header} and some | |
93 | number of @dfn{sections}, which can vary between versions. | |
94 | ||
95 | The rest of this specification describes the format of these sections, | |
96 | first for the latest version of CTF, then for all earlier versions | |
97 | supported by @code{libctf}: the earlier versions are defined in terms of | |
98 | their differences from the next later one. We describe each part of the | |
99 | format first by reproducing the C structure which defines that part, | |
100 | then describing it at greater length in terms of file offsets. | |
101 | ||
102 | The description of the file format ends with a description of relevant | |
103 | limits that apply to it. These limits can vary between file format | |
104 | versions. | |
105 | ||
106 | This document is quite young, so for now the C code in @file{ctf.h} | |
107 | should be presumed correct when this document conflicts with it. | |
108 | ||
109 | @node CTF archive | |
110 | @chapter CTF archives | |
111 | @cindex archive, CTF archive | |
112 | ||
113 | The CTF archive format maps names to CTF dictionaries. The names may | |
114 | contain any character other than \0, but for now archives containing | |
115 | slashes in the names may not extract correctly. It is possible to | |
116 | insert multiple members with the same name, but these are quite hard to | |
117 | access reliably (you have to iterate through all the members rather than | |
118 | opening by name) so this is not recommended. | |
119 | ||
120 | CTF archives are not themselves compressed: the constituent components, | |
121 | CTF dictionaries, can be compressed (@pxref{CTF header}). | |
122 | ||
123 | CTF archives usually contain a collection of related dictionaries, one | |
124 | parent and many children of that parent. CTF archives can have a member | |
125 | with a @dfn{default name}, @code{.ctf} (which can be represented as | |
126 | @code{NULL} in the API). If present, this member is usually the parent | |
127 | of all the children, but it is possible for CTF producers to emit | |
128 | parents with different names if they wish (usually for backward- | |
129 | compatibility purposes). | |
130 | ||
131 | @code{.ctf} sections in ELF objects consist of a single CTF dictionary | |
132 | rather than an archive of dictionaries if and only if the section | |
133 | contains no types with identical names but conflicting definitions: if | |
134 | two conflicting definitions exist, the deduplicator will place the type | |
135 | most commonly referred to by other types in the parent and will place | |
136 | the other type in a child named after the translation unit it is found | |
137 | in, and will emit a CTF archive containing both dictionaries instead of | |
138 | a raw dictionary. All types that refer to such conflicting types are | |
139 | also placed in the per-translation-unit child. | |
140 | ||
141 | The definition of an archive in @file{ctf.h} is as follows: | |
142 | ||
143 | @verbatim | |
144 | struct ctf_archive | |
145 | { | |
146 | uint64_t ctfa_magic; | |
147 | uint64_t ctfa_model; | |
148 | uint64_t ctfa_nfiles; | |
149 | uint64_t ctfa_names; | |
150 | uint64_t ctfa_ctfs; | |
151 | }; | |
152 | ||
153 | typedef struct ctf_archive_modent | |
154 | { | |
155 | uint64_t name_offset; | |
156 | uint64_t ctf_offset; | |
157 | } ctf_archive_modent_t; | |
158 | @end verbatim | |
159 | ||
160 | (Note one irregularity here: the @code{ctf_archive_t} is not a typedef | |
161 | to @code{struct ctf_archive}, but a different typedef, private to | |
162 | @code{libctf}, so that things that are not really archives can be made | |
163 | to appear as if they were.) | |
164 | ||
165 | All the above items are always in little-endian byte order, regardless | |
166 | of the machine endianness. | |
167 | ||
168 | The archive header has the following fields: | |
169 | ||
170 | @tindex struct ctf_archive | |
171 | @multitable {Offset} {@code{uint64_t ctfa_nfiles}} {The data model for this archive: an arbitrary integer} | |
172 | @headitem Offset @tab Name @tab Description | |
173 | @item 0x00 | |
174 | @tab @code{uint64_t ctfa_magic} | |
175 | @vindex ctfa_magic | |
176 | @vindex struct ctf_archive, ctfa_magic | |
177 | @tab The magic number for archives, @code{CTFA_MAGIC}: 0x8b47f2a4d7623eeb. | |
178 | @tindex CTFA_MAGIC | |
179 | ||
180 | @item 0x08 | |
181 | @tab @code{uint64_t ctfa_model} | |
182 | @vindex ctfa_model | |
183 | @vindex struct ctf_archive, ctfa_model | |
184 | @tab The data model for this archive: an arbitrary integer that serves no | |
185 | purpose but to be handed back by the libctf API. @xref{Data models}. | |
186 | ||
187 | @item 0x10 | |
188 | @tab @code{uint64_t ctfa_nfiles} | |
189 | @vindex ctfa_nfiles | |
190 | @vindex struct ctf_archive, ctfa_nfiles | |
191 | @tab The number of CTF dictionaries in this archive. | |
192 | ||
193 | @item 0x18 | |
194 | @tab @code{uint64_t ctfa_names} | |
195 | @vindex ctfa_names | |
196 | @vindex struct ctf_archive, ctfa_names | |
197 | @tab Offset of the name table, in bytes from the start of the archive. | |
198 | The name table is an array of @code{ctf_archive_modent_t[ctfa_nfiles]}. | |
199 | ||
200 | @item 0x20 | |
201 | @tab @code{uint64_t ctfa_ctfs} | |
202 | @vindex ctfa_ctfs | |
203 | @vindex struct ctf_archive, ctfa_ctfs | |
204 | @tab Offset of the CTF table. Each element starts with a @code{uint64_t} size, | |
205 | followed by a CTF dictionary. | |
206 | ||
207 | @end multitable | |
208 | ||
209 | The array pointed to by @code{ctfa_names} is an array of entries of | |
210 | @code{ctf_archive_modent}: | |
211 | ||
212 | @tindex struct ctf_archive_modent | |
213 | @tindex ctf_archive_modent_t | |
214 | @multitable {Offset} {@code{uint64_t name_offset}} {Offset of this name, in bytes from the start} | |
215 | @headitem Offset @tab Name @tab Description | |
216 | @item 0x00 | |
217 | @tab @code{uint64_t name_offset} | |
218 | @vindex name_offset | |
219 | @vindex struct ctf_archive_modent, name_offset | |
220 | @vindex ctf_archive_modent_t, name_offset | |
221 | @tab Offset of this name, in bytes from the start of the archive. | |
222 | ||
223 | @item 0x08 | |
224 | @tab @code{uint64_t ctf_offset} | |
225 | @vindex ctf_offset | |
226 | @vindex struct ctf_archive_modent, ctf_offset | |
227 | @vindex ctf_archive_modent_t, ctf_offset | |
228 | @tab Offset of this CTF dictionary, in bytes from the start of the archive. | |
229 | ||
230 | @end multitable | |
231 | ||
232 | The @code{ctfa_names} array is sorted into ASCIIbetical order by name | |
233 | (i.e. by the result of dereferencing the @code{name_offset}). | |
234 | ||
235 | The archive file also contains a name table and a table of CTF | |
236 | dictionaries: these are pointed to by the structures above. The name | |
237 | table is a simple strtab which is not required to be sorted; the | |
238 | dictionary array is described above in the entry for @code{ctfa_ctfs}. | |
239 | ||
240 | The relative order of these various parts is not defined, except that | |
241 | the header naturally always comes first. | |
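The header layout above can be decoded without relying on host endianness or structure padding. The following is a minimal sketch, not @code{libctf} code: every @code{ex_}/@code{EX_} name is hypothetical, and the only facts it relies on are the field offsets, the little-endian byte order and the value of @code{CTFA_MAGIC} given above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_CTFA_MAGIC  0x8b47f2a4d7623eebULL /* CTFA_MAGIC, per the table above.  */
#define EX_CTFA_HDRLEN 0x28                  /* Five uint64_t fields.  */

/* Hypothetical host-order copy of the archive header fields.  */
struct ex_archive_hdr
{
  uint64_t magic, model, nfiles, names, ctfs;
};

/* Read a little-endian uint64_t, whatever the host endianness.  */
static uint64_t
ex_read_le64 (const unsigned char *p)
{
  uint64_t v = 0;
  for (int i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* Decode the archive header at the start of BUF (LEN bytes long).
   Returns 0 on success, -1 if BUF is too short or the magic is wrong.  */
static int
ex_parse_archive_hdr (const unsigned char *buf, size_t len,
                      struct ex_archive_hdr *out)
{
  assert (buf != NULL && out != NULL);
  if (len < EX_CTFA_HDRLEN)
    return -1;
  out->magic  = ex_read_le64 (buf + 0x00);
  out->model  = ex_read_le64 (buf + 0x08);
  out->nfiles = ex_read_le64 (buf + 0x10);
  out->names  = ex_read_le64 (buf + 0x18);
  out->ctfs   = ex_read_le64 (buf + 0x20);
  return out->magic == EX_CTFA_MAGIC ? 0 : -1;
}
```

A caller would typically pass the start of an mmapped archive and check the return value before trusting @code{nfiles} or either offset.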
242 | ||
243 | @node CTF dictionaries | |
244 | @chapter CTF dictionaries | |
245 | @cindex dictionary, CTF dictionary | |
246 | ||
247 | CTF dictionaries consist of a header, starting with a preamble, and a | |
248 | number of sections. | |
249 | ||
250 | @node CTF Preamble | |
251 | @section CTF Preamble | |
252 | ||
253 | The preamble is the only part of the CTF dictionary whose format cannot | |
254 | vary between versions. It is never compressed. It is correspondingly | |
255 | simple: | |
256 | ||
257 | @verbatim | |
258 | typedef struct ctf_preamble | |
259 | { | |
260 | unsigned short ctp_magic; | |
261 | unsigned char ctp_version; | |
262 | unsigned char ctp_flags; | |
263 | } ctf_preamble_t; | |
264 | @end verbatim | |
265 | ||
266 | @code{#define}s are provided under the names @code{cth_magic}, | |
267 | @code{cth_version} and @code{cth_flags} to make the fields of the | |
268 | @code{ctf_preamble_t} appear to be part of the @code{ctf_header_t}, so | |
269 | consuming programs rarely need to consider the existence of the preamble | |
270 | as a separate structure. | |
271 | ||
272 | @tindex struct ctf_preamble | |
273 | @tindex ctf_preamble_t | |
274 | @multitable {Offset} {@code{unsigned char ctp_version}} {The magic number for CTF dictionaries} | |
275 | @headitem Offset @tab Name @tab Description | |
276 | @item 0x00 | |
277 | @tab @code{unsigned short ctp_magic} | |
278 | @vindex ctp_magic | |
279 | @vindex cth_magic | |
280 | @vindex ctf_preamble_t, ctp_magic | |
281 | @vindex struct ctf_preamble, ctp_magic | |
282 | @vindex ctf_header_t, cth_magic | |
283 | @vindex struct ctf_header, cth_magic | |
284 | @tab The magic number for CTF dictionaries, @code{CTF_MAGIC}: 0xdff2. | |
285 | @tindex CTF_MAGIC | |
286 | ||
287 | @item 0x02 | |
288 | @tab @code{unsigned char ctp_version} | |
289 | @vindex ctp_version | |
290 | @vindex cth_version | |
291 | @vindex ctf_preamble_t, ctp_version | |
292 | @vindex struct ctf_preamble, ctp_version | |
293 | @vindex ctf_header_t, cth_version | |
294 | @vindex struct ctf_header, cth_version | |
295 | @tab The version number of this CTF dictionary. | |
296 | ||
297 | @item 0x03 | |
298 | @tab @code{unsigned char ctp_flags} | |
299 | @vindex ctp_flags | |
300 | @vindex cth_flags | |
301 | @vindex ctf_preamble_t, ctp_flags | |
302 | @vindex struct ctf_preamble, ctp_flags | |
303 | @vindex ctf_header_t, cth_flags | |
304 | @vindex struct ctf_header, cth_flags | |
305 | @tab Flags for this CTF file. @xref{CTF file-wide flags}. | |
306 | @end multitable | |
307 | ||
308 | @cindex alignment | |
309 | Every element of a dictionary must be naturally aligned unless otherwise | |
310 | specified. (This restriction will be lifted in later versions.) | |
311 | ||
312 | @cindex endianness | |
313 | CTF dictionaries are stored in the native endianness of the system that | |
314 | generates them: the consumer (e.g., @code{libctf}) can detect whether to | |
315 | endian-flip a CTF dictionary by inspecting the @code{ctp_magic}. (If it | |
316 | appears as 0xf2df, endian-flipping is needed.) | |
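The endianness check described above can be sketched as follows. This is an illustration under stated assumptions rather than the @code{libctf} implementation: the @code{ex_} names are hypothetical, and the caller is assumed to have read the first two bytes of the dictionary as a host-order @code{uint16_t}.

```c
#include <stdint.h>

#define EX_CTF_MAGIC 0xdff2 /* CTF_MAGIC, per the preamble table.  */

/* Classify a ctp_magic value read in host byte order.
   Returns 0 if the dictionary is in native endianness, 1 if it needs
   endian-flipping, and -1 if this is not a CTF dictionary at all.  */
static int
ex_ctf_needs_flip (uint16_t magic)
{
  const uint16_t swapped
    = (uint16_t) ((EX_CTF_MAGIC >> 8) | (EX_CTF_MAGIC << 8));

  if (magic == EX_CTF_MAGIC)
    return 0;
  if (magic == swapped) /* 0xf2df */
    return 1;
  return -1;
}

/* Byte-swap one 32-bit field, as would be needed for each uint32_t
   in a dictionary that was written on a foreign-endian machine.  */
static uint32_t
ex_swap32 (uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u) | (v << 24);
}
```

The one-byte @code{ctp_version} and @code{ctp_flags} fields need no flipping, so they can be inspected before any swapping is done.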
317 | ||
318 | The version of the CTF dictionary can be determined by inspecting | |
319 | @code{ctp_version}. The following versions are currently valid, and | |
320 | @code{libctf} can read all of them: | |
321 | ||
322 | @tindex CTF_VERSION_3 | |
323 | @cindex CTF versions, versions | |
324 | @multitable {@code{CTF_VERSION_1_UPGRADED_3}} {Number} {First version, rare. Very similar to Solaris CTF.} | |
325 | @headitem Version @tab Number @tab Description | |
326 | @item @code{CTF_VERSION_1} | |
327 | @tab 1 @tab First version, rare. Very similar to Solaris CTF. | |
328 | ||
329 | @item @code{CTF_VERSION_1_UPGRADED_3} | |
330 | @tab 2 @tab First version, upgraded to v3 or higher and written out again. | |
331 | Name may change. Very rare. | |
332 | ||
333 | @item @code{CTF_VERSION_2} | |
334 | @tab 3 @tab Second version, with many range limits lifted. | |
335 | ||
336 | @item @code{CTF_VERSION_3} | |
337 | @tab 4 @tab Third and current version, documented here. | |
338 | @end multitable | |
339 | ||
340 | This section documents @code{CTF_VERSION_3}. | |
341 | ||
342 | @vindex ctp_flags | |
343 | @node CTF file-wide flags | |
344 | @subsection CTF file-wide flags | |
345 | ||
346 | The preamble contains bitflags in its @code{ctp_flags} field that | |
347 | describe various file-wide properties. Some of the flags are valid only | |
348 | for particular file-format versions, which means the flags can be used | |
349 | to fix file-format bugs. Consumers that see unknown flags should | |
350 | accordingly assume that the dictionary is not comprehensible, and | |
351 | refuse to open it. | |
352 | ||
353 | The following flags are currently defined. Many are bug workarounds, | |
354 | valid only in CTFv3, and will not be valid in any future versions: the | |
355 | same values may be reused for other flags in v4+. | |
356 | ||
357 | @multitable {@code{CTF_F_NEWFUNCINFO}} {Versions} {Value} {The external strtab is in @code{.dynstr} and the} | |
358 | @headitem Flag @tab Versions @tab Value @tab Meaning | |
359 | @tindex CTF_F_COMPRESS | |
360 | @item @code{CTF_F_COMPRESS} @tab All @tab 0x1 @tab Compressed with zlib | |
361 | @tindex CTF_F_NEWFUNCINFO | |
362 | @item @code{CTF_F_NEWFUNCINFO} @tab 3 only @tab 0x2 | |
363 | @tab ``New-format'' func info section. | |
364 | @tindex CTF_F_IDXSORTED | |
365 | @item @code{CTF_F_IDXSORTED} @tab 3+ @tab 0x4 @tab The index section is | |
366 | in sorted order | |
367 | @tindex CTF_F_DYNSTR | |
368 | @item @code{CTF_F_DYNSTR} @tab 3 only @tab 0x8 @tab The external strtab is | |
369 | in @code{.dynstr} and the symtab used is @code{.dynsym}. | |
370 | @xref{The string section}. | |
371 | @end multitable | |
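A consumer that follows the advice above about unknown flags might validate @code{ctp_flags} along these lines. This is a sketch only: the @code{EX_} constants mirror the values in the table for CTFv3, and the function name is hypothetical.

```c
#include <stdint.h>

/* Flag values as listed in the table above (CTFv3).  */
#define EX_CTF_F_COMPRESS    0x1
#define EX_CTF_F_NEWFUNCINFO 0x2
#define EX_CTF_F_IDXSORTED   0x4
#define EX_CTF_F_DYNSTR      0x8

#define EX_CTF_F_V3_KNOWN                                       \
  (EX_CTF_F_COMPRESS | EX_CTF_F_NEWFUNCINFO                     \
   | EX_CTF_F_IDXSORTED | EX_CTF_F_DYNSTR)

/* Return nonzero if FLAGS contains only flags defined for CTFv3.
   A dictionary that fails this check should not be opened.  */
static int
ex_v3_flags_ok (uint8_t flags)
{
  return (flags & ~EX_CTF_F_V3_KNOWN) == 0;
}
```

Because some flag values may be reused with other meanings in later versions, a check like this would have to be selected by @code{ctp_version} rather than applied to every dictionary.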
372 | ||
373 | @code{CTF_F_NEWFUNCINFO} and @code{CTF_F_IDXSORTED} relate to the | |
374 | function info and data object sections. @xref{The symtypetab sections}. | |
375 | ||
376 | Further flags (and further compression methods) will be added in future. | |
377 | ||
378 | @node CTF header | |
379 | @section CTF header | |
380 | @cindex CTF header | |
381 | @cindex Sections, header | |
382 | ||
383 | The CTF header is the first part of a CTF dictionary, including the | |
384 | preamble. All parts of it other than the preamble (@pxref{CTF Preamble}) | |
385 | can vary between CTF file versions and are never compressed. It | |
386 | contains things that apply to the dictionary as a whole, and a table of | |
387 | the sections into which the rest of the dictionary is divided. The | |
388 | sections tile the file: each section runs from the offset given until | |
389 | the start of the next section. Only the last section cannot follow this | |
390 | rule, so the header has a length for it instead. | |
391 | ||
392 | All section offsets, here and in the rest of the CTF file, are relative to the | |
393 | @emph{end} of the header. (This is annoyingly different to how offsets in CTF | |
394 | archives are handled.) | |
395 | ||
396 | This is the first structure to include offsets into the string table, which are | |
397 | not straight references because CTF dictionaries can include references into the | |
398 | ELF string table to save space, as well as into the string table internal to the | |
399 | CTF dictionary. @xref{The string section}, for more on these. Offset 0 is | |
400 | always the null string. | |
401 | ||
402 | @verbatim | |
403 | typedef struct ctf_header | |
404 | { | |
405 | ctf_preamble_t cth_preamble; | |
406 | uint32_t cth_parlabel; | |
407 | uint32_t cth_parname; | |
408 | uint32_t cth_cuname; | |
409 | uint32_t cth_lbloff; | |
410 | uint32_t cth_objtoff; | |
411 | uint32_t cth_funcoff; | |
412 | uint32_t cth_objtidxoff; | |
413 | uint32_t cth_funcidxoff; | |
414 | uint32_t cth_varoff; | |
415 | uint32_t cth_typeoff; | |
416 | uint32_t cth_stroff; | |
417 | uint32_t cth_strlen; | |
418 | } ctf_header_t; | |
419 | @end verbatim | |
420 | ||
421 | In detail: | |
422 | ||
423 | @tindex struct ctf_header | |
424 | @tindex ctf_header_t | |
425 | @multitable {Offset} {@code{ctf_preamble_t cth_preamble}} {The parent label, if deduplication happened against} | |
426 | @headitem Offset @tab Name @tab Description | |
427 | @item 0x00 | |
428 | @tab @code{ctf_preamble_t cth_preamble} | |
429 | @vindex cth_preamble | |
430 | @vindex struct ctf_header, cth_preamble | |
431 | @vindex ctf_header_t, cth_preamble | |
432 | @tab The preamble (conceptually embedded in the header). @xref{CTF Preamble}. | |
433 | ||
434 | @item 0x04 | |
435 | @tab @code{uint32_t cth_parlabel} | |
436 | @vindex cth_parlabel | |
437 | @vindex struct ctf_header, cth_parlabel | |
438 | @vindex ctf_header_t, cth_parlabel | |
439 | @tab The parent label, if deduplication happened against a specific label: a | |
440 | strtab offset. @xref{The label section}. Currently unused and always 0, but may | |
441 | be used in future when semantics are attached to the label section. | |
442 | ||
443 | @item 0x08 | |
444 | @tab @code{uint32_t cth_parname} | |
445 | @vindex cth_parname | |
446 | @vindex struct ctf_header, cth_parname | |
447 | @vindex ctf_header_t, cth_parname | |
448 | @tab The name of the parent dictionary deduplicated against: a strtab offset. | |
449 | Interpretation is up to the consumer (usually a CTF archive member name). 0 | |
450 | (the null string) if this is not a child dictionary. | |
451 | ||
452 | @item 0x0c | |
453 | @tab @code{uint32_t cth_cuname} | |
454 | @vindex cth_cuname | |
455 | @vindex struct ctf_header, cth_cuname | |
456 | @vindex ctf_header_t, cth_cuname | |
457 | @tab The name of the compilation unit, for consumers like GDB that want to | |
458 | know which CU a single-CU dictionary describes: a strtab offset. 0 if this | |
459 | dictionary describes types from many CUs. | |
460 | ||
461 | @item 0x10 | |
462 | @tab @code{uint32_t cth_lbloff} | |
463 | @vindex cth_lbloff | |
464 | @vindex struct ctf_header, cth_lbloff | |
465 | @vindex ctf_header_t, cth_lbloff | |
466 | @tab The offset of the label section, which tiles the type space into | |
467 | named regions. @xref{The label section}. | |
468 | ||
469 | @item 0x14 | |
470 | @tab @code{uint32_t cth_objtoff} | |
471 | @vindex cth_objtoff | |
472 | @vindex struct ctf_header, cth_objtoff | |
473 | @vindex ctf_header_t, cth_objtoff | |
474 | @tab The offset of the data object symtypetab section, which maps ELF data symbols to | |
475 | types. @xref{The symtypetab sections}. | |
476 | ||
477 | @item 0x18 | |
478 | @tab @code{uint32_t cth_funcoff} | |
479 | @vindex cth_funcoff | |
480 | @vindex struct ctf_header, cth_funcoff | |
481 | @vindex ctf_header_t, cth_funcoff | |
482 | @tab The offset of the function info symtypetab section, which maps ELF function | |
483 | symbols to a return type and arg types. @xref{The symtypetab sections}. | |
484 | ||
485 | @item 0x1c | |
486 | @tab @code{uint32_t cth_objtidxoff} | |
487 | @vindex cth_objtidxoff | |
488 | @vindex struct ctf_header, cth_objtidxoff | |
489 | @vindex ctf_header_t, cth_objtidxoff | |
490 | @tab The offset of the object index section, which maps ELF object symbols to | |
491 | entries in the data object section. @xref{The symtypetab sections}. | |
492 | ||
493 | @item 0x20 | |
494 | @tab @code{uint32_t cth_funcidxoff} | |
495 | @vindex cth_funcidxoff | |
496 | @vindex struct ctf_header, cth_funcidxoff | |
497 | @vindex ctf_header_t, cth_funcidxoff | |
498 | @tab The offset of the function info index section, which maps ELF function | |
499 | symbols to entries in the function info section. @xref{The symtypetab sections}. | |
500 | ||
501 | @item 0x24 | |
502 | @tab @code{uint32_t cth_varoff} | |
503 | @vindex cth_varoff | |
504 | @vindex struct ctf_header, cth_varoff | |
505 | @vindex ctf_header_t, cth_varoff | |
506 | @tab The offset of the variable section, which maps string names to types. | |
507 | @xref{The variable section}. | |
508 | ||
509 | @item 0x28 | |
510 | @tab @code{uint32_t cth_typeoff} | |
511 | @vindex cth_typeoff | |
512 | @vindex struct ctf_header, cth_typeoff | |
513 | @vindex ctf_header_t, cth_typeoff | |
514 | @tab The offset of the type section, the core of CTF, which describes types | |
515 | using variable-length array elements. @xref{The type section}. | |
516 | ||
517 | @item 0x2c | |
518 | @tab @code{uint32_t cth_stroff} | |
519 | @vindex cth_stroff | |
520 | @vindex struct ctf_header, cth_stroff | |
521 | @vindex ctf_header_t, cth_stroff | |
522 | @tab The offset of the string section. @xref{The string section}. | |
523 | ||
524 | @item 0x30 | |
525 | @tab @code{uint32_t cth_strlen} | |
526 | @vindex cth_strlen | |
527 | @vindex struct ctf_header, cth_strlen | |
528 | @vindex ctf_header_t, cth_strlen | |
529 | @tab The length of the string section (not an offset!). The CTF file ends | |
530 | at this point. | |
531 | ||
532 | @end multitable | |
533 | ||
534 | Everything from this point on (until the end of the file at @code{cth_stroff} + | |
535 | @code{cth_strlen}) is compressed with zlib if @code{CTF_F_COMPRESS} is set in | |
536 | the preamble's @code{ctp_flags}. | |
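Since the sections tile the file, the length of each one can be derived from the header alone. The sketch below assumes that the sections appear in the file in the same order as their offset fields appear in @code{ctf_header_t} (from @code{cth_lbloff} through @code{cth_stroff}); that ordering, and all @code{ex_}/@code{EX_} names, are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_NSECTIONS 8 /* cth_lbloff ... cth_stroff.  */

/* OFFS holds the eight section offsets in header order, relative to
   the end of the header; STR_LEN is cth_strlen.  Store the length of
   each section in LENS.  Returns 0 on success, or -1 if the offsets
   are not in nondecreasing order (so the sections cannot tile).  */
static int
ex_section_lengths (const uint32_t offs[EX_NSECTIONS], uint32_t str_len,
                    uint32_t lens[EX_NSECTIONS])
{
  for (size_t i = 0; i + 1 < EX_NSECTIONS; i++)
    {
      if (offs[i + 1] < offs[i])
        return -1;
      lens[i] = offs[i + 1] - offs[i];
    }

  /* Only the last section, the string section, has an explicit length.  */
  lens[EX_NSECTIONS - 1] = str_len;
  return 0;
}
```

An empty section simply shares its offset with the section that follows it, which yields a length of zero.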
537 | ||
538 | @node The type section | |
539 | @section The type section | |
540 | @cindex Type section | |
541 | @cindex Sections, type | |
542 | ||
543 | This section is the most important section in CTF, describing all the top-level | |
544 | types in the program. It consists of an array of type structures, each of which | |
545 | describes a type of some @dfn{kind}: each kind of type has some amount of | |
546 | variable-length data associated with it (some kinds have none). The amount of | |
547 | variable-length data associated with a given type can be determined by | |
548 | inspecting the type, so the reading code can walk through the types in sequence | |
549 | at opening time. | |
550 | ||
551 | Each type structure is one of a set of overlapping structures in a discriminated | |
552 | union of sorts: the variable-length data for each type immediately follows the | |
553 | type's type structure. Here's the largest of the overlapping structures, which | |
554 | is only needed for huge types and so is very rarely seen: | |
555 | ||
556 | @verbatim | |
557 | typedef struct ctf_type | |
558 | { | |
559 | uint32_t ctt_name; | |
560 | uint32_t ctt_info; | |
561 | __extension__ | |
562 | union | |
563 | { | |
564 | uint32_t ctt_size; | |
565 | uint32_t ctt_type; | |
566 | }; | |
567 | uint32_t ctt_lsizehi; | |
568 | uint32_t ctt_lsizelo; | |
569 | } ctf_type_t; | |
570 | @end verbatim | |
571 | ||
572 | Here's the much more common smaller form: | |
573 | ||
574 | @verbatim | |
575 | typedef struct ctf_stype | |
576 | { | |
577 | uint32_t ctt_name; | |
578 | uint32_t ctt_info; | |
579 | __extension__ | |
580 | union | |
581 | { | |
582 | uint32_t ctt_size; | |
583 | uint32_t ctt_type; | |
584 | }; | |
585 | } ctf_stype_t; | |
586 | @end verbatim | |
587 | ||
588 | If @code{ctt_size} is the #define @code{CTF_LSIZE_SENT}, 0xffffffff, this type | |
589 | is described by a @code{ctf_type_t}: otherwise, a @code{ctf_stype_t}. | |
590 | @tindex CTF_LSIZE_SENT | |
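The choice between the two structures can be sketched as follows. The code treats a type structure as a sequence of host-order 32-bit words (@code{ctt_name}, @code{ctt_info}, @code{ctt_size}, then optionally @code{ctt_lsizehi} and @code{ctt_lsizelo}); the @code{ex_} names are hypothetical, and the size result is only meaningful for kinds that use @code{ctt_size} rather than @code{ctt_type}.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_CTF_LSIZE_SENT 0xffffffffu /* CTF_LSIZE_SENT.  */

/* W points at the first word of a type structure.  Return the number
   of bytes the structure occupies before any variable-length data:
   20 for a ctf_type_t, 12 for a ctf_stype_t.  */
static size_t
ex_type_hdr_bytes (const uint32_t *w)
{
  return w[2] == EX_CTF_LSIZE_SENT ? 5 * sizeof (uint32_t)
                                   : 3 * sizeof (uint32_t);
}

/* Return the recorded size of the type at W as a 64-bit value,
   combining ctt_lsizehi and ctt_lsizelo for huge types.  */
static uint64_t
ex_type_size (const uint32_t *w)
{
  if (w[2] != EX_CTF_LSIZE_SENT)
    return w[2];
  return ((uint64_t) w[3] << 32) | w[4];
}
```

A reader walking the type section would advance by @code{ex_type_hdr_bytes} plus the size of the kind-specific variable-length data to reach the next type.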
591 | ||
592 | Here's what the fields mean: | |
593 | ||
594 | @tindex struct ctf_type | |
595 | @tindex struct ctf_stype | |
596 | @tindex ctf_type_t | |
597 | @tindex ctf_stype_t | |
598 | @multitable {0x1c (@code{ctf_type_t}} {@code{uint32_t ctt_lsizehi}} {The size of this type, if this type is of a kind for} | |
599 | @headitem Offset @tab Name @tab Description | |
600 | @item 0x00 | |
601 | @tab @code{uint32_t ctt_name} | |
602 | @vindex ctt_name | |
603 | @tab Strtab offset of the type name, if any (0 if none). | |
604 | ||
605 | @item 0x04 | |
606 | @tab @code{uint32_t ctt_info} | |
607 | @vindex ctt_info | |
608 | @vindex struct ctf_type, ctt_info | |
609 | @vindex ctf_type_t, ctt_info | |
610 | @vindex struct ctf_stype, ctt_info | |
611 | @vindex ctf_stype_t, ctt_info | |
612 | @tab The @dfn{info word}, containing information on the kind of this type, its | |
613 | variable-length data and whether it is visible to name lookup. @xref{The | |
614 | info word}. | |
615 | ||
616 | @item 0x08 | |
617 | @tab @code{uint32_t ctt_size} | |
618 | @vindex ctt_size | |
619 | @vindex struct ctf_type, ctt_size | |
620 | @vindex ctf_type_t, ctt_size | |
621 | @vindex struct ctf_stype, ctt_size | |
622 | @vindex ctf_stype_t, ctt_size | |
623 | @tab The size of this type, if this type is of a kind for which a size needs | |
624 | to be recorded (constant-size types don't need one). If this is | |
625 | @code{CTF_LSIZE_SENT}, this type is a huge type described by @code{ctf_type_t}. | |
626 | ||
627 | @item 0x08 | |
628 | @tab @code{uint32_t ctt_type} | |
629 | @vindex ctt_type | |
630 | @vindex struct ctf_stype, ctt_type | |
631 | @vindex ctf_stype_t, ctt_type | |
632 | @tab The type this type refers to, if this type is of a kind which refers to | |
633 | other types (like a pointer). All such types are fixed-size, and no types that | |
634 | are variable-size refer to other types, so @code{ctt_size} and @code{ctt_type} | |
635 | overlap. All type kinds that use @code{ctt_type} are described by | |
636 | @code{ctf_stype_t}, not @code{ctf_type_t}. @xref{Type indexes and type IDs}. | |
637 | ||
638 | @item 0x0c (@code{ctf_type_t} only) | |
639 | @tab @code{uint32_t ctt_lsizehi} | |
640 | @vindex ctt_lsizehi | |
641 | @vindex struct ctf_type, ctt_lsizehi | |
642 | @vindex ctf_type_t, ctt_lsizehi | |
643 | @tab The high 32 bits of the size of a very large type. The @code{CTF_TYPE_LSIZE} macro | |
644 | can be used to get a 64-bit size out of this field and the next one. | |
645 | @code{CTF_SIZE_TO_LSIZE_HI} splits the @code{ctt_lsizehi} out of it again. | |
646 | @findex CTF_TYPE_LSIZE | |
647 | @findex CTF_SIZE_TO_LSIZE_HI | |
648 | ||
649 | @item 0x10 (@code{ctf_type_t} only) | |
650 | @tab @code{uint32_t ctt_lsizelo} | |
651 | @vindex ctt_lsizelo | |
652 | @vindex struct ctf_type, ctt_lsizelo | |
653 | @vindex ctf_type_t, ctt_lsizelo | |
654 | @tab The low 32 bits of the size of a very large type. | |
655 | @code{CTF_SIZE_TO_LSIZE_LO} splits the @code{ctt_lsizelo} out of a 64-bit size. | |
656 | @findex CTF_SIZE_TO_LSIZE_LO | |
657 | @end multitable | |
658 | ||
659 | Two aspects of this need further explanation: the info word, and what exactly a | |
660 | type ID is and how you determine it. (Information on the various type-kind- | |
661 | dependent things, like whether @code{ctt_size} or @code{ctt_type} is used, | |
662 | is described in the section devoted to each kind.) | |
663 | ||
664 | @node The info word | |
665 | @subsection The info word, ctt_info | |
666 | ||
667 | The info word is a bitfield split into three parts. From MSB to LSB: | |
668 | ||
669 | @multitable {Bit offset} {@code{isroot}} {Length of variable-length data for this type (some kinds only).} | |
670 | @headitem Bit offset @tab Name @tab Description | |
671 | @item 26--31 | |
672 | @tab @code{kind} | |
673 | @tab Type kind: @pxref{Type kinds}. | |
674 | ||
675 | @item 25 | |
676 | @tab @code{isroot} | |
677 | @tab 1 if this type is visible to name lookup | |
678 | ||
679 | @item 0--24 | |
680 | @tab @code{vlen} | |
681 | @tab Length of variable-length data for this type (some kinds only). | |
682 | The variable-length data directly follows the @code{ctf_type_t} or | |
683 | @code{ctf_stype_t}. This is a kind-dependent array length value, | |
684 | not a length in bytes. Some kinds have no variable-length data, or | |
685 | fixed-size variable-length data, and do not use this value. | |
686 | @end multitable | |
687 | ||
688 | The most mysterious of these is undoubtedly @code{isroot}. This indicates | |
689 | whether types with names (nonzero @code{ctt_name}) are visible to name lookup: | |
690 | if zero, this type is considered a @dfn{non-root type} and you can't look it up | |
691 | by name at all. Multiple types with the same name in the same C namespace | |
692 | (struct, union, enum, other) can exist in a single dictionary, but only one of | |
693 | them may have a nonzero value for @code{isroot}. @code{libctf} validates this | |
694 | at open time and refuses to open dictionaries that violate this constraint. | |
695 | ||
696 | Historically, this feature was introduced for the encoding of bitfields | |
697 | (@pxref{Integer types}): for instance, int bitfields will all be named | |
698 | @code{int} with different widths or offsets, but only the full-width one at | |
699 | offset zero is wanted when you look up the type named @code{int}. With the | |
700 | introduction of slices (@pxref{Slices}) as a more general bitfield encoding | |
701 | mechanism, this is less important, but we still use non-root types to handle | |
702 | conflicts if the linker API is used to fuse multiple translation units into one | |
703 | dictionary and those translation units contain types with the same name and | |
704 | conflicting definitions. (We do not discuss this further here, because the | |
705 | linker never does this: only specialized type mergers do, like that used for the | |
706 | Linux kernel. The libctf documentation will describe this in more detail.) | |
707 | @c XXX update when libctf docs are written. | |
708 | ||
709 | The @code{CTF_TYPE_INFO} macro can be used to compose an info word from | |
710 | a @code{kind}, @code{isroot}, and @code{vlen}; @code{CTF_V2_INFO_KIND}, | |
711 | @code{CTF_V2_INFO_ISROOT} and @code{CTF_V2_INFO_VLEN} pick it apart again. | |
712 | @findex CTF_TYPE_INFO | |
713 | @findex CTF_V2_INFO_KIND | |
714 | @findex CTF_V2_INFO_ISROOT | |
715 | @findex CTF_V2_INFO_VLEN | |
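As a rough illustration of the layout in the table above, the following C sketch
packs and unpacks an info word by hand. The @code{sketch_} names are
hypothetical helpers written for this example only: they are not the definitions
from @file{ctf.h}, and real code should prefer @code{CTF_TYPE_INFO} and the
@code{CTF_V2_INFO_} macros.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers mirroring the bit layout in the table above:
   kind in bits 26-31, isroot in bit 25, vlen in bits 0-24.  */
#define SKETCH_VLEN_MAX 0x1ffffffu

static inline uint32_t
sketch_info (uint32_t kind, uint32_t isroot, uint32_t vlen)
{
  /* Six bits of kind, 25 bits of vlen.  */
  assert (kind < 64 && vlen <= SKETCH_VLEN_MAX);
  return (kind << 26) | ((isroot ? 1u : 0u) << 25) | vlen;
}

static inline uint32_t
sketch_info_kind (uint32_t info)
{
  return info >> 26;
}

static inline uint32_t
sketch_info_isroot (uint32_t info)
{
  return (info >> 25) & 1u;
}

static inline uint32_t
sketch_info_vlen (uint32_t info)
{
  return info & SKETCH_VLEN_MAX;
}
```

For instance, a root-visible type of kind 6 with three variable-length elements
would, under this layout, have the info word @code{0x1a000003}.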
716 | ||
717 | @node Type indexes and type IDs | |
718 | @subsection Type indexes and type IDs | |
719 | @cindex Type indexes | |
720 | @cindex Type IDs | |
721 | @cindex Type, IDs of | |
722 | @cindex Type, indexes of | |
723 | @cindex ctf_id_t | |
724 | ||
725 | @cindex Parent range | |
726 | @cindex Child range | |
727 | @cindex Type IDs, ranges | |
728 | Types are referred to within the CTF file via @dfn{type IDs}. A type ID is a | |
729 | number from 0 to @math{2^32-1}, from a space divided in half. Types @math{2^31-1} | |
730 | and below are in the @dfn{parent range}: these IDs are used for dictionaries | |
731 | that have not had any other dictionary @code{ctf_import}ed into them as a parent. | |
732 | Both completely standalone dictionaries and parent dictionaries with children | |
733 | hanging off them have types in this range. Types @math{2^31} and above are in | |
734 | the @dfn{child range}: only types in child dictionaries are in this range. | |
735 | ||
736 | These IDs appear in @code{ctf_type_t.ctt_type} (@pxref{The type section}), but | |
737 | the types themselves have no visible ID: quite intentionally, because adding an | |
738 | ID uses space, and every ID is different so they don't compress well. The IDs | |
739 | are implicit: at open time, the consumer walks through the entire type section | |
740 | and counts the types in it. The type section is an array of | |
741 | variable-length elements, so each entry could be considered as having an index, | |
742 | starting from 1. We count these indexes and associate each with its | |
743 | corresponding @code{ctf_type_t} or @code{ctf_stype_t}. | |
744 | ||
745 | Lookups of types with IDs in the parent space look in the parent dictionary if | |
746 | this dictionary has one associated with it; lookups of types with IDs in the | |
747 | child space error out if the dictionary does not have a parent, and otherwise | |
748 | convert the ID into an index by shaving off the top bit and look up the index | |
749 | in the child. | |
750 | ||
751 | These properties mean that the same dictionary can be used as a parent of child | |
752 | dictionaries and can also be used directly with no children at all, but a | |
753 | dictionary created as a child dictionary must always be associated with a parent | |
754 | --- usually, the same parent --- because its references to its own types have | |
755 | the high bit turned on and this is only flipped off again if this is a child | |
756 | dictionary. (This is not a problem, because if you @emph{don't} associate the | |
757 | child with a parent, any references within it to its parent types will fail, and | |
758 | there are almost certain to be many such references, or why is it a child at | |
759 | all?) | |
760 | ||
761 | This does mean that consumers should keep a close eye on the distinction between | |
762 | type IDs and type indexes: if you mix them up, everything will appear to work as | |
763 | long as you're only using parent dictionaries or standalone dictionaries, but as | |
764 | soon as you start using children, everything will fail horribly. | |
765 | ||
766 | Type index zero, and type ID zero, are used to indicate that this type cannot be | |
767 | represented in CTF as currently constituted: they are emitted by the compiler, | |
768 | but all type chains that terminate in the unknown type are erased at link time | |
769 | (structure fields that use them just vanish, etc). So you will probably never | |
770 | see a use of type zero outside the symtypetab sections, where they serve as | |
771 | sentinels of sorts, to indicate symbols with no associated type. | |
772 | ||
773 | The macros @code{CTF_V2_TYPE_TO_INDEX} and @code{CTF_V2_INDEX_TO_TYPE} may help | |
774 | in translation between types and indexes: @code{CTF_V2_TYPE_ISPARENT} and | |
775 | @code{CTF_V2_TYPE_ISCHILD} can be used to tell whether a given ID is in the | |
776 | parent or child range. | |
777 | @findex CTF_V2_TYPE_TO_INDEX | |
778 | @findex CTF_V2_INDEX_TO_TYPE | |
779 | @findex CTF_V2_TYPE_ISPARENT | |
780 | @findex CTF_V2_TYPE_ISCHILD | |
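The distinction between IDs and indexes described above can be sketched in C as
follows. This is a minimal illustration assuming only what this section states
(the child range starts at @math{2^31}, and the index is the ID with the top bit
removed); the @code{sketch_} names are hypothetical and are not the
@code{CTF_V2_} macros themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 2^31: the first ID in the child range (hypothetical name).  */
#define SKETCH_CHILD_BIT 0x80000000u

/* True if this type ID lies in the child range.  */
static inline bool
sketch_id_is_child (uint32_t id)
{
  return (id & SKETCH_CHILD_BIT) != 0;
}

/* Shave off the top bit to turn a type ID into a type index.  */
static inline uint32_t
sketch_id_to_index (uint32_t id)
{
  return id & ~SKETCH_CHILD_BIT;
}

/* Turn a type index back into an ID in the parent or child range.  */
static inline uint32_t
sketch_index_to_id (uint32_t index, bool child)
{
  assert (index < SKETCH_CHILD_BIT);
  return child ? (index | SKETCH_CHILD_BIT) : index;
}
```

Note that for parent-range IDs the ID and the index are numerically equal, which
is exactly why code that confuses the two appears to work until a child
dictionary is involved.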
781 | ||
782 | It is quite possible and indeed common for type IDs to point forward in the | |
783 | dictionary, as well as backward. | |
784 | ||
785 | @node Type kinds | |
786 | @subsection Type kinds | |
787 | @cindex Type kinds | |
788 | @cindex Type, kinds of | |
789 | ||
790 | Every type in CTF is of some @dfn{kind}. Each kind is some variety of C type: | |
791 | all structures are a single kind, as are all unions, all pointers, all arrays, | |
792 | all integers regardless of their bitfield width, etc. The kind of a type is | |
793 | given in the @code{kind} field of the @code{ctt_info} word (@pxref{The info | |
794 | word}). | |
795 | ||
796 | The space of type kinds is only a quarter full so far, so there is plenty of | |
797 | room for expansion. It is likely that in future versions of the file format, | |
798 | types with smaller kinds will be more efficiently encoded than types with larger | |
799 | kinds, so their numerical value will actually start to matter in future. (So | |
800 | these IDs will probably change their numerical values in a later release of this | |
801 | format, to move more frequently-used kinds like structures and cv-quals towards | |
802 | the top of the space, and move rarely-used kinds like integers downwards. Yes, | |
803 | integers are rare: how many kinds of @code{int} are there in a program? They're | |
804 | just very frequently @emph{referenced}.) | |
805 | ||
806 | Here's the set of kinds so far. Each kind has a @code{#define} associated with | |
807 | it, also given here. | |
808 | ||
809 | @multitable {Kind} {@code{CTF_K_VOLATILE}} {Indicates a type that cannot be represented in CTF, or that} {@xref{Pointers typedefs and cvr-quals}} | |
810 | @headitem Kind @tab Macro @tab Purpose | |
811 | @item 0 | |
812 | @tab @code{CTF_K_UNKNOWN} | |
813 | @tab Indicates a type that cannot be represented in CTF, or that is being skipped. | |
814 | It is very similar to type ID 0, except that you can have @emph{multiple}, distinct types | |
815 | of kind @code{CTF_K_UNKNOWN}. | |
816 | @tindex CTF_K_UNKNOWN | |
817 | ||
818 | @item 1 | |
819 | @tab @code{CTF_K_INTEGER} | |
820 | @tab An integer type. @xref{Integer types}. | |
821 | ||
822 | @item 2 | |
823 | @tab @code{CTF_K_FLOAT} | |
824 | @tab A floating-point type. @xref{Floating-point types}. | |
825 | ||
826 | @item 3 | |
827 | @tab @code{CTF_K_POINTER} | |
828 | @tab A pointer. @xref{Pointers typedefs and cvr-quals}. | |
829 | ||
830 | @item 4 | |
831 | @tab @code{CTF_K_ARRAY} | |
832 | @tab An array. @xref{Arrays}. | |
833 | ||
834 | @item 5 | |
835 | @tab @code{CTF_K_FUNCTION} | |
836 | @tab A function pointer. @xref{Function pointers}. | |
837 | ||
838 | @item 6 | |
839 | @tab @code{CTF_K_STRUCT} | |
840 | @tab A structure. @xref{Structs and unions}. | |
841 | ||
842 | @item 7 | |
843 | @tab @code{CTF_K_UNION} | |
844 | @tab A union. @xref{Structs and unions}. | |
845 | ||
846 | @item 8 | |
847 | @tab @code{CTF_K_ENUM} | |
848 | @tab An enumerated type. @xref{Enums}. | |
849 | ||
850 | @item 9 | |
851 | @tab @code{CTF_K_FORWARD} | |
852 | @tab A forward. @xref{Forward declarations}. | |
853 | ||
854 | @item 10 | |
855 | @tab @code{CTF_K_TYPEDEF} | |
856 | @tab A typedef. @xref{Pointers typedefs and cvr-quals}. | |
857 | ||
858 | @item 11 | |
859 | @tab @code{CTF_K_VOLATILE} | |
860 | @tab A volatile-qualified type. @xref{Pointers typedefs and cvr-quals}. | |
861 | ||
862 | @item 12 | |
863 | @tab @code{CTF_K_CONST} | |
864 | @tab A const-qualified type. @xref{Pointers typedefs and cvr-quals}. | |
865 | ||
866 | @item 13 | |
867 | @tab @code{CTF_K_RESTRICT} | |
868 | @tab A restrict-qualified type. @xref{Pointers typedefs and cvr-quals}. | |
869 | ||
870 | @item 14 | |
871 | @tab @code{CTF_K_SLICE} | |
872 | @tab A slice, a change of the bit-width or offset of some other type. @xref{Slices}. | |
873 | @end multitable | |
874 | ||
875 | Now we cover all type kinds in turn. Some are more complicated than others. | |
876 | ||
877 | @node Integer types | |
878 | @subsection Integer types | |
879 | @cindex Integer types | |
880 | @cindex Types, integer | |
881 | @tindex int | |
882 | @tindex long | |
883 | @tindex long long | |
884 | @tindex short | |
885 | @tindex char | |
886 | @tindex bool | |
887 | @tindex unsigned int | |
888 | @tindex unsigned long | |
889 | @tindex unsigned long long | |
890 | @tindex unsigned short | |
891 | @tindex unsigned char | |
892 | @tindex signed int | |
893 | @tindex signed long | |
894 | @tindex signed long long | |
895 | @tindex signed short | |
896 | @tindex signed char | |
897 | @cindex CTF_K_INTEGER | |
898 | ||
899 | Integral types are all represented as types of kind @code{CTF_K_INTEGER}. These | |
900 | types fill out @code{ctt_size} in the @code{ctf_stype_t} with the size in bytes | |
901 | of the integral type in question. They are always represented by | |
902 | @code{ctf_stype_t}, never @code{ctf_type_t}. Their variable-length data is one | |
903 | @code{uint32_t} in length: @code{vlen} in the info word should be disregarded | |
904 | and is always zero. | |
905 | ||
906 | The variable-length data for integers has multiple items packed into it much | |
907 | like the info word does. | |
908 | ||
909 | @multitable {Bit offset} {Encoding} {The integer encoding and desired display representation.} | |
910 | @headitem Bit offset @tab Name @tab Description | |
911 | @item 24--31 | |
912 | @tab Encoding | |
913 | @tab The desired display representation of this integer. You can extract this | |
914 | field with the @code{CTF_INT_ENCODING} macro. See below. | |
915 | @findex CTF_INT_ENCODING | |
916 | ||
917 | @item 16--23 | |
918 | @tab Offset | |
919 | @tab The offset of this integral type in bits from the start of its enclosing | |
920 | structure field, adjusted for endianness: @pxref{Structs and unions}. You can | |
921 | extract this field with the @code{CTF_INT_OFFSET} macro. | |
922 | @findex CTF_INT_OFFSET | |
923 | ||
924 | @item 0--15 | |
925 | @tab Bit-width | |
926 | @tab The width of this integral type in bits. You can extract this field with | |
927 | the @code{CTF_INT_BITS} macro. | |
928 | @findex CTF_INT_BITS | |
929 | @end multitable | |
930 | ||
931 | If you choose, bitfields can be represented using the things above as a sort of | |
932 | integral type with the @code{isroot} bit flipped off and the offset and bits | |
933 | values set in the vlen word: you can populate it with the @code{CTF_INT_DATA} | |
934 | macro. (But it may be more convenient to represent them using slices of a | |
935 | full-width integer: @pxref{Slices}.) | |
936 | @findex CTF_INT_DATA | |
937 | ||
938 | Integers that are bitfields usually have a @code{ctt_size} rounded up to the | |
939 | nearest power of two in bytes, for natural alignment (e.g. a 17-bit integer | |
940 | would have a @code{ctt_size} of 4). However, not all types are naturally | |
941 | aligned on all architectures: packed structures may in theory use integral | |
942 | bitfields with different @code{ctt_size}, though this is rarely observed. | |
943 | ||
944 | The @dfn{encoding} for integers is a bitfield composed of the values below, | |
945 | which consumers can use to decide how to display values of this type: | |
946 | ||
947 | @multitable {Offset} {@code{CTF_INT_VARARGS}} {If set, this is a char type. It is platform-dependent whether unadorned} | |
948 | @headitem Offset @tab Name @tab Description | |
949 | @item 0x01 | |
950 | @tab @code{CTF_INT_SIGNED} | |
951 | @tab If set, this is a signed int: if clear, unsigned. | |
952 | @tindex CTF_INT_SIGNED | |
953 | ||
954 | @item 0x02 | |
955 | @tab @code{CTF_INT_CHAR} | |
956 | @tab If set, this is a char type. It is platform-dependent whether unadorned | |
957 | @code{char} is signed or not: the @code{CTF_CHAR} macro produces an integral | |
958 | type suitable for the definition of @code{char} on this platform. | |
959 | @tindex CTF_INT_CHAR | |
960 | @findex CTF_CHAR | |
961 | ||
962 | @item 0x04 | |
963 | @tab @code{CTF_INT_BOOL} | |
964 | @tab If set, this is a boolean type. (It is theoretically possible to turn this | |
965 | and @code{CTF_INT_CHAR} on at the same time, but it is not clear what this would | |
966 | mean.) | |
967 | @tindex CTF_INT_BOOL | |
968 | ||
969 | @item 0x08 | |
970 | @tab @code{CTF_INT_VARARGS} | |
971 | @tab If set, this is a varargs-promoted value in a K&R function definition. | |
972 | This is not currently produced or consumed by anything that we know of: it is set | |
973 | aside for future use. | |
974 | @end multitable | |
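Putting the two tables above together, the integer data word can be packed and
unpacked as in the following C sketch. It assumes only the bit positions given
above; the @code{sketch_} names are hypothetical stand-ins for
@code{CTF_INT_DATA}, @code{CTF_INT_ENCODING}, @code{CTF_INT_OFFSET} and
@code{CTF_INT_BITS}, not their actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Same numeric value as CTF_INT_SIGNED in the table above.  */
#define SKETCH_INT_SIGNED 0x01u

/* Pack encoding (bits 24-31), offset (bits 16-23) and
   bit-width (bits 0-15) into one 32-bit word.  */
static inline uint32_t
sketch_int_data (uint32_t encoding, uint32_t offset, uint32_t bits)
{
  assert (encoding <= 0xff && offset <= 0xff && bits <= 0xffff);
  return (encoding << 24) | (offset << 16) | bits;
}

static inline uint32_t
sketch_int_encoding (uint32_t data)
{
  return data >> 24;
}

static inline uint32_t
sketch_int_offset (uint32_t data)
{
  return (data >> 16) & 0xff;
}

static inline uint32_t
sketch_int_bits (uint32_t data)
{
  return data & 0xffff;
}
```

Under this layout an ordinary signed 32-bit @code{int} would carry the data word
@code{0x01000020}, and a signed 3-bit bitfield at bit offset 5 would carry
@code{0x01050003}.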
975 | ||
976 | The GCC ``@code{Complex int}'' and fixed-point extensions are not yet supported: | |
977 | references to such types will be emitted as type 0. | |
978 | ||
979 | @node Floating-point types | |
980 | @subsection Floating-point types | |
981 | @cindex Floating-point types | |
982 | @cindex Types, floating-point | |
983 | @tindex float | |
984 | @tindex double | |
985 | @tindex signed float | |
986 | @tindex signed double | |
987 | @tindex unsigned float | |
988 | @tindex unsigned double | |
989 | @tindex Complex, float | |
990 | @tindex Complex, double | |
991 | @tindex Complex, signed float | |
992 | @tindex Complex, signed double | |
993 | @tindex Complex, unsigned float | |
994 | @tindex Complex, unsigned double | |
995 | @cindex CTF_K_FLOAT | |
996 | ||
997 | Floating-point types are all represented as types of kind @code{CTF_K_FLOAT}. | |
998 | Like integers, these types fill out @code{ctt_size} in the @code{ctf_stype_t} | |
999 | with the size in bytes of the floating-point type in question. They are always | |
1000 | represented by @code{ctf_stype_t}, never @code{ctf_type_t}. | |
1001 | ||
1002 | This part of CTF shows many rough edges in the more obscure corners of | |
1003 | floating-point handling, and is likely to change in format v4. | |
1004 | ||
1005 | The variable-length data for floats has multiple items packed into it just like | |
1006 | integers do: | |
1007 | ||
1008 | @multitable {Bit offset} {Encoding} {The floating-point encoding and desired display representation.} | |
1009 | @headitem Bit offset @tab Name @tab Description | |
1010 | @item 24--31 | |
1011 | @tab Encoding | |
1012 | @tab The desired display representation of this float. You can extract this | |
1013 | field with the @code{CTF_FP_ENCODING} macro. See below. | |
1014 | @findex CTF_FP_ENCODING | |
1015 | ||
1016 | @item 16--23 | |
1017 | @tab Offset | |
1018 | @tab The offset of this floating-point type in bits from the start of its enclosing | |
1019 | structure field, adjusted for endianness: @pxref{Structs and unions}. You can | |
1020 | extract this field with the @code{CTF_FP_OFFSET} macro. | |
1021 | @findex CTF_FP_OFFSET | |
1022 | ||
1023 | @item 0--15 | |
1024 | @tab Bit-width | |
1025 | @tab The width of this floating-point type in bits. You can extract this field with | |
1026 | the @code{CTF_FP_BITS} macro. | |
1027 | @findex CTF_FP_BITS | |
1028 | @end multitable | |
1029 | ||
1030 | The purpose of the floating-point offset and bit-width is somewhat opaque, since | |
1031 | there are no such things as floating-point bitfields in C: the bit-width should | |
1032 | be filled out with the full width of the type in bits, and the offset should | |
1033 | always be zero. It is likely that these fields will go away in the future. As | |
1034 | with integers, you can use @code{CTF_FP_DATA} to assemble one of these vlen | |
1035 | items from its component parts. | |
1036 | @findex CTF_FP_DATA | |
1037 | ||
1038 | The @dfn{encoding} for floats is not a bitfield but a simple value indicating | |
1039 | the display representation. Many of these relate to Solaris-specific compiler | |
1040 | extensions, are unused, and will be recycled in future: others are currently | |
1041 | unused but will come into use in future. | |
1042 | ||
1043 | @multitable {Offset} {@code{CTF_FP_LDIMAGRY}} {This is a @code{float} interval type, a Solaris-specific extension.} | |
1044 | @headitem Offset @tab Name @tab Description | |
1045 | @item 1 | |
1046 | @tab @code{CTF_FP_SINGLE} | |
1047 | @tab This is a single-precision IEEE 754 @code{float}. | |
1048 | @tindex CTF_FP_SINGLE | |
1049 | @item 2 | |
1050 | @tab @code{CTF_FP_DOUBLE} | |
1051 | @tab This is a double-precision IEEE 754 @code{double}. | |
1052 | @tindex CTF_FP_DOUBLE | |
1053 | @item 3 | |
1054 | @tab @code{CTF_FP_CPLX} | |
1055 | @tab This is a @code{Complex float}. | |
1056 | @tindex CTF_FP_CPLX | |
1057 | @item 4 | |
1058 | @tab @code{CTF_FP_DCPLX} | |
1059 | @tab This is a @code{Complex double}. | |
1060 | @tindex CTF_FP_DCPLX | |
1061 | @item 5 | |
1062 | @tab @code{CTF_FP_LDCPLX} | |
1063 | @tab This is a @code{Complex long double}. | |
1064 | @tindex CTF_FP_LDCPLX | |
1065 | @item 6 | |
1066 | @tab @code{CTF_FP_LDOUBLE} | |
1067 | @tab This is a @code{long double}. | |
1068 | @tindex CTF_FP_LDOUBLE | |
1069 | @item 7 | |
1070 | @tab @code{CTF_FP_INTRVL} | |
1071 | @tab This is a @code{float} interval type, a Solaris-specific extension. | |
1072 | Unused: will be recycled. | |
1073 | @tindex CTF_FP_INTRVL | |
1074 | @cindex Unused bits | |
1075 | @item 8 | |
1076 | @tab @code{CTF_FP_DINTRVL} | |
1077 | @tab This is a @code{double} interval type, a Solaris-specific extension. | |
1078 | Unused: will be recycled. | |
1079 | @tindex CTF_FP_DINTRVL | |
1080 | @cindex Unused bits | |
1081 | @item 9 | |
1082 | @tab @code{CTF_FP_LDINTRVL} | |
1083 | @tab This is a @code{long double} interval type, a Solaris-specific extension. | |
1084 | Unused: will be recycled. | |
1085 | @tindex CTF_FP_LDINTRVL | |
1086 | @cindex Unused bits | |
1087 | @item 10 | |
1088 | @tab @code{CTF_FP_IMAGRY} | |
1089 | @tab This is the imaginary part of a @code{Complex float}. Not currently | |
1090 | generated. May change. | |
1091 | @tindex CTF_FP_IMAGRY | |
1092 | @cindex Unused bits | |
1093 | @item 11 | |
1094 | @tab @code{CTF_FP_DIMAGRY} | |
1095 | @tab This is the imaginary part of a @code{Complex double}. Not currently | |
1096 | generated. May change. | |
1097 | @tindex CTF_FP_DIMAGRY | |
1098 | @cindex Unused bits | |
1099 | @item 12 | |
1100 | @tab @code{CTF_FP_LDIMAGRY} | |
1101 | @tab This is the imaginary part of a @code{Complex long double}. Not currently | |
1102 | generated. May change. | |
1103 | @tindex CTF_FP_LDIMAGRY | |
1104 | @cindex Unused bits | |
1105 | @end multitable | |
1106 | ||
1107 | The use of the complex floating-point encodings is obscure: it is possible that | |
1108 | @code{CTF_FP_CPLX} is meant to be used for only the real part of complex types, | |
1109 | and @code{CTF_FP_IMAGRY} et al for the imaginary part -- but for now, we are | |
1110 | emitting @code{CTF_FP_CPLX} to cover the entire type, with no way to get at its | |
1111 | constituent parts. There appear to be no uses of these encodings anywhere, so | |
1112 | they are quite likely to change incompatibly in future. | |
1113 | ||
1114 | @node Slices | |
1115 | @subsection Slices | |
1116 | @cindex Slices | |
1117 | @cindex Types, slices of integral | |
1118 | @tindex CTF_K_SLICE | |
1119 | ||
1120 | Slices, with kind @code{CTF_K_SLICE}, are an unusual CTF construct: they do not | |
1121 | directly correspond to any C type, but are a way to model other types in a more | |
1122 | convenient fashion for CTF generators. | |
1123 | ||
1124 | Slices are like pointers and other reference types in that they are always | |
1125 | represented by @code{ctf_stype_t}: but unlike pointers and other reference | |
1126 | types, they populate the @code{ctt_size} field just like integral types do, and | |
1127 | come with an attached encoding and transform the encoding of the underlying | |
1128 | type. The underlying type is described in the variable-length data, similarly | |
1129 | to structure and union fields: see below. Requests for the type size should | |
1130 | also chase down to the referenced type. | |
1131 | ||
1132 | Slices are always nameless: @code{ctt_name} is always zero for them. | |
1133 | ||
1134 | (The @code{libctf} API behaviour is unusual as well, and justifies the existence | |
1135 | of slices: @code{ctf_type_kind} never returns @code{CTF_K_SLICE} but always the | |
1136 | underlying type kind, so that consumers never need to know about slices: they | |
1137 | can tell if an apparent integer is actually a slice if they need to by calling | |
1138 | @code{ctf_type_reference}, which will uniquely return the underlying integral | |
1139 | type rather than erroring out with @code{ECTF_NOTREF} if this is actually a | |
1140 | slice. So slices act just like an integer with an encoding, but more closely | |
1141 | mirror DWARF and other debugging information formats by allowing CTF file | |
1142 | creators to represent a bitfield as a slice of an underlying integral type.) | |
1143 | @findex Slices, effect on ctf_type_kind | |
1144 | @findex Slices, effect on ctf_type_reference | |
1145 | @findex libctf, effect of slices | |
1146 | ||
1147 | The vlen in the info word for a slice should be ignored and is always zero. The | |
1148 | variable-length data for a slice is a single @code{ctf_slice_t}: | |
1149 | ||
1150 | @verbatim | |
1151 | typedef struct ctf_slice | |
1152 | { | |
1153 | uint32_t cts_type; | |
1154 | unsigned short cts_offset; | |
1155 | unsigned short cts_bits; | |
1156 | } ctf_slice_t; | |
1157 | @end verbatim | |
1158 | ||
1159 | @tindex struct ctf_slice | |
1160 | @tindex ctf_slice_t | |
1161 | @multitable {Offset} {@code{unsigned short cts_offset}} {The type this slice is a slice of. Must be an} | |
1162 | @headitem Offset @tab Name @tab Description | |
1163 | @item 0x0 | |
1164 | @tab @code{uint32_t cts_type} | |
1165 | @vindex cts_type | |
1166 | @vindex struct ctf_slice, cts_type | |
1167 | @vindex ctf_slice_t, cts_type | |
1168 | @tab The type this slice is a slice of. Must be an integral type (or a | |
1169 | floating-point type, but this nonsensical option will go away in v4.) | |
1170 | ||
1171 | @item 0x4 | |
1172 | @tab @code{unsigned short cts_offset} | |
1173 | @vindex cts_offset | |
1174 | @vindex struct ctf_slice, cts_offset | |
1175 | @vindex ctf_slice_t, cts_offset | |
1176 | @tab The offset of this integral type in bits from the start of its enclosing | |
1177 | structure field, adjusted for endianness: @pxref{Structs and unions}. Identical | |
1178 | semantics to the @code{CTF_INT_OFFSET} field: @pxref{Integer types}. This field | |
1179 | is much too long, because the maximum possible offset of an integral type would | |
1180 | easily fit in a char: this field is bigger just for the sake of alignment. This | |
1181 | will change in v4. | |
1182 | ||
1183 | @item 0x6 | |
1184 | @tab @code{unsigned short cts_bits} | |
1185 | @vindex cts_bits | |
1186 | @vindex struct ctf_slice, cts_bits | |
1187 | @vindex ctf_slice_t, cts_bits | |
1188 | @tab The bit-width of this integral type. Identical semantics to the | |
1189 | @code{CTF_INT_BITS} field: @pxref{Integer types}. As above, this field is | |
1190 | really too large and will shrink in v4. | |
1191 | @end multitable | |
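To make the ``transform the encoding of the underlying type'' wording above more
concrete, here is a hedged C sketch. It assumes that a slice keeps the
underlying type's encoding flags and substitutes its own offset and bit-width;
@code{sketch_encoding_t} and @code{sketch_slice_encoding} are hypothetical names
invented for this example, and the layout checks assume a 16-bit
@code{unsigned short}.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ctf_slice
{
  uint32_t cts_type;
  unsigned short cts_offset;
  unsigned short cts_bits;
} ctf_slice_t;

/* Check the field offsets given in the table above.  */
static_assert (offsetof (ctf_slice_t, cts_type) == 0x0, "cts_type at 0x0");
static_assert (offsetof (ctf_slice_t, cts_offset) == 0x4, "cts_offset at 0x4");
static_assert (offsetof (ctf_slice_t, cts_bits) == 0x6, "cts_bits at 0x6");

/* Hypothetical unpacked view of an integer encoding.  */
typedef struct sketch_encoding
{
  uint32_t format;   /* Encoding flags, e.g. signedness.  */
  uint32_t offset;   /* Offset in bits.  */
  uint32_t bits;     /* Width in bits.  */
} sketch_encoding_t;

/* Assumed behaviour: keep the underlying flags, take offset and
   width from the slice.  */
static inline sketch_encoding_t
sketch_slice_encoding (sketch_encoding_t underlying, const ctf_slice_t *slice)
{
  sketch_encoding_t out = underlying;
  out.offset = slice->cts_offset;
  out.bits = slice->cts_bits;
  return out;
}
```

Under these assumptions, a 3-bit slice at offset 5 of a signed 32-bit integer
yields a signed encoding with offset 5 and width 3.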
1192 | ||
1193 | @node Pointers typedefs and cvr-quals | |
1194 | @subsection Pointers, typedefs, and cvr-quals | |
1195 | @cindex Pointers | |
1196 | @cindex Typedefs | |
1197 | @cindex cvr-quals | |
1198 | @tindex typedef | |
1199 | @tindex const | |
1200 | @tindex volatile | |
1201 | @tindex restrict | |
1202 | @tindex CTF_K_POINTER | |
1203 | @tindex CTF_K_TYPEDEF | |
1204 | @tindex CTF_K_CONST | |
1205 | @tindex CTF_K_VOLATILE | |
1206 | @tindex CTF_K_RESTRICT | |
1207 | ||
1208 | Pointers, @code{typedef}s, and @code{const}, @code{volatile} and @code{restrict} | |
1209 | qualifiers are represented identically except for their type kind (though they | |
1210 | may be treated differently by consuming libraries like @code{libctf}, since | |
1211 | pointers affect assignment-compatibility in ways cvr-quals do not, and they may | |
1212 | have different alignment requirements, etc). | |
1213 | ||
1214 | All of these are represented by @code{ctf_stype_t}, have no variable data at | |
1215 | all, and populate @code{ctt_type} with the type ID of the type they point | |
1216 | to. These types can stack: a @code{CTF_K_RESTRICT} can point to a | |
1217 | @code{CTF_K_CONST} which can point to a @code{CTF_K_POINTER} etc. | |
1218 | ||
1219 | They are all unnamed: @code{ctt_name} is 0. | |
1220 | ||
1221 | The size of @code{CTF_K_POINTER} is derived from the data model (@pxref{Data | |
1222 | models}), i.e. in practice, from the target machine ABI, and is not explicitly | |
1223 | represented. The size of other kinds in this set should be determined by | |
1224 | chasing ctf_types as necessary until a non-typedef/const/volatile/restrict is | |
1225 | found, and using that. | |
1226 | ||
1227 | @node Arrays | |
1228 | @subsection Arrays | |
1229 | @cindex Arrays | |
1230 | ||
1231 | Arrays are encoded as types of kind @code{CTF_K_ARRAY} in a @code{ctf_stype_t}. | |
1232 | The @code{ctt_size} for arrays is zero. The variable-length data is a | |
1233 | @code{ctf_array_t}: @code{vlen} in the info word should be disregarded and is | |
1234 | always zero. | |
1235 | ||
1236 | @verbatim | |
1237 | typedef struct ctf_array | |
1238 | { | |
1239 | uint32_t cta_contents; | |
1240 | uint32_t cta_index; | |
1241 | uint32_t cta_nelems; | |
1242 | } ctf_array_t; | |
1243 | @end verbatim | |
1244 | ||
1245 | @tindex struct ctf_array | |
1246 | @tindex ctf_array_t | |
1247 | @multitable {Offset} {@code{unsigned short cta_contents}} {The type of the array index: a type ID of an} | |
1248 | @headitem Offset @tab Name @tab Description | |
1249 | @item 0x0 | |
1250 | @tab @code{uint32_t cta_contents} | |
1251 | @vindex cta_contents | |
1252 | @vindex struct ctf_array, cta_contents | |
1253 | @vindex ctf_array_t, cta_contents | |
1254 | @tab The type of the array elements: a type ID. | |
1255 | ||
1256 | @item 0x4 | |
1257 | @tab @code{uint32_t cta_index} | |
1258 | @vindex cta_index | |
1259 | @vindex struct ctf_array, cta_index | |
1260 | @vindex ctf_array_t, cta_index | |
1261 | @tab The type of the array index: a type ID of an integral type. | |
1262 | If this is a variable-length array, the index type ID will be 0 | |
1263 | (but the actual index type of this array is probably @code{int}). | |
1264 | Probably redundant and may be dropped in v4. | |
1265 | ||
1266 | @item 0x8 | |
1267 | @tab @code{uint32_t cta_nelems} | |
1268 | @vindex cta_nelems | |
1269 | @vindex struct ctf_array, cta_nelems | |
1270 | @vindex ctf_array_t, cta_nelems | |
1271 | @tab The number of array elements. 0 for VLAs, and also for | |
1272 | the historical variety of VLA which has explicit zero dimensions (which will | |
1273 | have a nonzero @code{cta_index}.) | |
1274 | @end multitable | |
1275 | ||
1276 | The size of an array can be computed by simple multiplication of the size of the | |
1277 | @code{cta_contents} type by the @code{cta_nelems}. | |
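As a small worked example of that multiplication, the following C sketch
computes an array size from an element size that the caller has already looked
up. The function name @code{sketch_array_size} is hypothetical, and the overflow
check is an addition of this example rather than anything the format requires.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of an array: size of the cta_contents type times
   cta_nelems.  VLAs, which have zero cta_nelems, come out as 0.  */
static inline uint64_t
sketch_array_size (uint64_t elem_size, uint32_t nelems)
{
  /* Guard against overflow of the 64-bit product.  */
  assert (nelems == 0 || elem_size <= UINT64_MAX / nelems);
  return elem_size * nelems;
}
```

An array of ten 4-byte elements is thus 40 bytes, and any array with zero
@code{cta_nelems} has size zero.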
1278 | ||
1279 | @node Function pointers | |
1280 | @subsection Function pointers | |
1281 | @cindex Function pointers | |
1282 | @cindex Pointers, to functions | |
1283 | ||
1284 | Function pointers are explicitly represented in the CTF type section by a type | |
1285 | of kind @code{CTF_K_FUNCTION}, always encoded with a @code{ctf_stype_t}. The | |
1286 | @code{ctt_type} is the function return type ID. The @code{vlen} in the info | |
1287 | word is the number of arguments, each of which is a type ID, a @code{uint32_t}: | |
1288 | if the last argument is 0, this is a varargs function and the number of | |
1289 | arguments is one less than indicated by the vlen. | |
1290 | ||
1291 | If the number of arguments is odd, a single @code{uint32_t} of padding is | |
1292 | inserted to maintain alignment. | |
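The rules above can be sketched in C as follows. This is an illustration only:
@code{sketch_func_t} and @code{sketch_func_args} are hypothetical names, and the
sketch assumes that the ``number of arguments'' used for the padding rule is the
raw @code{vlen}, including any trailing zero that marks a varargs function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct sketch_func
{
  uint32_t nargs;    /* Real argument count, varargs marker excluded.  */
  bool varargs;      /* True if the last recorded argument is type 0.  */
  size_t vbytes;     /* Bytes of variable-length data, padding included.  */
} sketch_func_t;

/* Interpret the vlen and argument array of a CTF_K_FUNCTION type.  */
static inline sketch_func_t
sketch_func_args (const uint32_t *args, uint32_t vlen)
{
  sketch_func_t r;

  assert (args != NULL || vlen == 0);
  r.varargs = vlen > 0 && args[vlen - 1] == 0;
  r.nargs = r.varargs ? vlen - 1 : vlen;
  /* Round an odd count up by one uint32_t of padding.  */
  r.vbytes = ((size_t) vlen + (vlen & 1)) * sizeof (uint32_t);
  return r;
}
```

With three recorded arguments, the last of which is 0, this reports two real
arguments, a varargs function, and 16 bytes of variable-length data.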
1293 | ||
1294 | @node Enums | |
1295 | @subsection Enums | |
1296 | @cindex Enums | |
1297 | @tindex enum | |
1298 | @tindex CTF_K_ENUM | |
1299 | ||
1300 | Enumerated types are represented as types of kind @code{CTF_K_ENUM} in a | |
1301 | @code{ctf_stype_t}. The @code{ctt_size} is always the size of an int from the | |
1302 | data model (enum bitfields are implemented via slices). The @code{vlen} is a | |
1303 | count of enumerations, each of which is represented by a @code{ctf_enum_t} in | |
1304 | the vlen: | |
1305 | ||
1306 | @verbatim | |
1307 | typedef struct ctf_enum | |
1308 | { | |
1309 | uint32_t cte_name; | |
1310 | int32_t cte_value; | |
1311 | } ctf_enum_t; | |
1312 | @end verbatim | |
1313 | ||
1314 | @tindex struct ctf_enum | |
1315 | @tindex ctf_enum_t | |
1316 | @multitable {Offset} {@code{int32_t cte_value}} {Strtab offset of the enumeration name.} | |
1317 | @headitem Offset @tab Name @tab Description | |
1318 | @item 0x0 | |
1319 | @tab @code{uint32_t cte_name} | |
1320 | @vindex cte_name | |
1321 | @vindex struct ctf_enum, cte_name | |
1322 | @vindex ctf_enum_t, cte_name | |
1323 | @tab Strtab offset of the enumeration name. Must not be 0. | |
1324 | ||
1325 | @item 0x4 | |
1326 | @tab @code{int32_t cte_value} | |
1327 | @vindex cte_value | |
1328 | @vindex struct ctf_enum, cte_value | |
1329 | @vindex ctf_enum_t, cte_value | |
1330 | @tab The enumeration value. | |
1331 | ||
1332 | @end multitable | |
1333 | ||
1334 | Enumeration values larger than @math{2^32} are not yet supported and are omitted | |
1335 | from the enumeration. (v4 will lift this restriction by encoding the value | |
1336 | differently.) | |
1337 | ||
1338 | Forward declarations of enums are not implemented with this kind: @pxref{Forward | |
1339 | declarations}. | |
1340 | ||
1341 | Enumerated type names, as usual in C, go into their own namespace, and do not | |
1342 | conflict with non-enums, structs, or unions with the same name. | |
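To show how the @code{ctf_enum_t} array might be consumed, here is a minimal C
sketch that searches it for a given value. The helper name
@code{sketch_enum_name} is hypothetical; it returns a strtab offset rather than
a string, because string table lookup is outside the scope of this example, and
it uses 0 as a ``not found'' result on the basis that @code{cte_name} is never
0.

```c
#include <assert.h>
#include <stdint.h>

typedef struct ctf_enum
{
  uint32_t cte_name;
  int32_t cte_value;
} ctf_enum_t;

/* Return the strtab offset of the first enumerator with the given
   value, or 0 if there is none.  VLEN is the enumerator count taken
   from the info word.  */
static inline uint32_t
sketch_enum_name (const ctf_enum_t *ep, uint32_t vlen, int32_t value)
{
  for (uint32_t i = 0; i < vlen; i++)
    {
      assert (ep[i].cte_name != 0);   /* Names must not be 0.  */
      if (ep[i].cte_value == value)
        return ep[i].cte_name;
    }
  return 0;
}
```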
1343 | ||
1344 | @node Structs and unions | |
1345 | @subsection Structs and unions | |
1346 | @cindex Structures | |
1347 | @cindex Unions | |
1348 | @tindex struct | |
1349 | @tindex union | |
1350 | @tindex CTF_K_STRUCT | |
1351 | @tindex CTF_K_UNION | |
1352 | ||
1353 | Structures and unions are represented as types of kind @code{CTF_K_STRUCT} and | |
1354 | @code{CTF_K_UNION}: their representation is otherwise identical, and it is | |
1355 | perfectly allowed for ``structs'' to contain overlapping fields etc, so we will | |
1356 | treat them together for the rest of this section. | |
1357 | ||
1358 | They fill out @code{ctt_size}, and use @code{ctf_type_t} in preference to | |
1359 | @code{ctf_stype_t} if the structure size is greater than @code{CTF_MAX_SIZE} | |
1360 | (0xfffffffe). | |
1361 | @tindex CTF_MAX_SIZE | |
1362 | ||
1363 | The vlen for structures and unions is a count of structure fields, but the type | |
1364 | used to represent a structure field (and thus the size of the variable-length | |
1365 | array element representing the type) depends on the size of the structure: truly | |
1366 | huge structures, greater than @code{CTF_LSTRUCT_THRESH} bytes in length, use a | |
1367 | different type. (@code{CTF_LSTRUCT_THRESH} is 536870912, so such structures are | |
1368 | vanishingly rare: in v4, this representation will change somewhat for greater | |
1369 | compactness. It's inherited from v1, where the limits were much lower.) | |
1370 | @tindex CTF_LSTRUCT_THRESH | |
1371 | ||
1372 | Most structures can get away with using @code{ctf_member_t}: | |
1373 | ||
1374 | @verbatim | |
1375 | typedef struct ctf_member_v2 | |
1376 | { | |
1377 | uint32_t ctm_name; | |
1378 | uint32_t ctm_offset; | |
1379 | uint32_t ctm_type; | |
1380 | } ctf_member_t; | |
1381 | @end verbatim | |
1382 | ||
1383 | Huge structures that are represented by @code{ctf_type_t} rather than | |
1384 | @code{ctf_stype_t} have to use @code{ctf_lmember_t}, which splits the offset as | |
1385 | @code{ctf_type_t} splits the size: | |
1386 | ||
1387 | @verbatim | |
1388 | typedef struct ctf_lmember_v2 | |
1389 | { | |
1390 | uint32_t ctlm_name; | |
1391 | uint32_t ctlm_offsethi; | |
1392 | uint32_t ctlm_type; | |
1393 | uint32_t ctlm_offsetlo; | |
1394 | } ctf_lmember_t; | |
1395 | @end verbatim | |
1396 | ||
1397 | Here's what the fields of @code{ctf_member_t} mean: | |
1398 | ||
1399 | @tindex struct ctf_member_v2 | |
1400 | @tindex ctf_member_t | |
1401 | @multitable {Offset} {@code{uint32_t ctm_offset}} {The offset of this field @emph{in bits}. (Usually, for bitfields, this is} | |
1402 | @headitem Offset @tab Name @tab Description | |
1403 | @item 0x00 | |
1404 | @tab @code{uint32_t ctm_name} | |
1405 | @vindex ctm_name | |
1406 | @vindex struct ctf_member_v2, ctm_name | |
1407 | @vindex ctf_member_t, ctm_name | |
1408 | @tab Strtab offset of the field name. | |
1409 | ||
1410 | @item 0x04 | |
1411 | @tab @code{uint32_t ctm_offset} | |
1412 | @vindex ctm_offset | |
1413 | @vindex struct ctf_member_v2, ctm_offset | |
1414 | @vindex ctf_member_t, ctm_offset | |
1415 | @tab The offset of this field @emph{in bits}. (Usually, for bitfields, this is | |
1416 | machine-word-aligned and the individual field has an offset in bits, but | |
1417 | the format allows for the offset to be encoded in bits here.) | |
1418 | ||
1419 | @item 0x08 | |
1420 | @tab @code{uint32_t ctm_type} | |
1421 | @vindex ctm_type | |
1422 | @vindex struct ctf_member_v2, ctm_type | |
1423 | @vindex ctf_member_t, ctm_type | |
1424 | @tab The type ID of the type of the field. | |
1425 | @end multitable | |
1426 | ||
1427 | Here's what the fields of the very similar @code{ctf_lmember_t} mean: | |
1428 | ||
1429 | @tindex struct ctf_lmember_v2 | |
1430 | @tindex ctf_lmember_t | |
1431 | @multitable {Offset} {@code{uint32_t ctlm_offsethi}} {The offset of this field @emph{in bits}. (Usually, for bitfields, this is} | |
1432 | @headitem Offset @tab Name @tab Description | |
1433 | @item 0x00 | |
1434 | @tab @code{uint32_t ctlm_name} | |
1435 | @vindex ctlm_name | |
1436 | @vindex struct ctf_lmember_v2, ctlm_name | |
1437 | @vindex ctf_lmember_t, ctlm_name | |
1438 | @tab Strtab offset of the field name. | |
1439 | ||
1440 | @item 0x04 | |
1441 | @tab @code{uint32_t ctlm_offsethi} | |
1442 | @vindex ctlm_offsethi | |
1443 | @vindex struct ctf_lmember_v2, ctlm_offsethi | |
1444 | @vindex ctf_lmember_t, ctlm_offsethi | |
1445 | @tab The high 32 bits of the offset of this field in bits. | |
1446 | ||
1447 | @item 0x08 | |
1448 | @tab @code{uint32_t ctlm_type} | |
1449 | @vindex ctlm_type | |
1450 | @vindex struct ctf_lmember_v2, ctlm_type | |
1451 | @vindex ctf_lmember_t, ctlm_type | |
1452 | @tab The type ID of the type of the field. | |
1453 | ||
1454 | @item 0x0c | |
1455 | @tab @code{uint32_t ctlm_offsetlo} | |
1456 | @vindex ctlm_offsetlo | |
1457 | @vindex struct ctf_lmember_v2, ctlm_offsetlo | |
1458 | @vindex ctf_lmember_t, ctlm_offsetlo | |
1459 | @tab The low 32 bits of the offset of this field in bits. | |
1460 | @end multitable | |
1461 | ||
1462 | Macros @code{CTF_LMEM_OFFSET}, @code{CTF_OFFSET_TO_LMEMHI} and | |
1463 | @code{CTF_OFFSET_TO_LMEMLO} serve to extract and install the values of the | |
1464 | @code{ctlm_offset} fields, much as with the split size fields in | |
1465 | @code{ctf_type_t}. | |
1466 | ||
1467 | Unnamed structure and union fields are simply implemented by collapsing the | |
1468 | unnamed field's members into the containing structure or union: this does mean | |
1469 | that a structure containing an unnamed union can end up being a ``structure'' | |
1470 | with multiple members at the same offset. (A future format revision may | |
1471 | collapse @code{CTF_K_STRUCT} and @code{CTF_K_UNION} into the same kind and | |
1472 | decide among them based on whether their members do in fact overlap.) | |
1473 | ||
1474 | Structure and union type names, as usual in C, go into their own namespace, | |
1475 | just as enum type names do. | |
1476 | ||
1477 | Forward declarations of structures and unions are not implemented with this | |
1478 | kind: @pxref{Forward declarations}. | |
1479 | ||
1480 | @node Forward declarations | |
1481 | @subsection Forward declarations | |
1482 | @cindex Forwards | |
1483 | @tindex enum | |
1484 | @tindex struct | |
1485 | @tindex union | |
1486 | @tindex CTF_K_FORWARD | |
1487 | ||
1488 | When the compiler encounters a forward declaration of a struct, union, or enum, | |
1489 | it emits a type of kind @code{CTF_K_FORWARD}. If it later encounters a | |
1490 | non-forward declaration of the same thing, it marks the forward as non-root-visible: | |
1491 | before link time, therefore, non-root-visible forwards indicate that a | |
1492 | non-forward is coming. | |
1493 | ||
1494 | After link time, forwards are fused with their corresponding non-forwards by the | |
1495 | deduplicator where possible. They are kept if there is no non-forward | |
1496 | definition (maybe it's not visible from any TU at all) or if @emph{multiple} | |
1497 | conflicting structures with the same name might match it. Otherwise, all other | |
1498 | forwards are converted to structures, unions, or enums as appropriate, even | |
1499 | across TUs if only one structure could correspond to the forward (after all, | |
1500 | all types across all TUs land in the same dictionary unless they conflict, | |
1501 | so promoting forwards to their concrete type seems most helpful). | |
1502 | ||
1503 | A forward has a rather strange representation: it is encoded with a | |
1504 | @code{ctf_stype_t} but the @code{ctt_type} is populated not with a type (if it's | |
1505 | a forward, we don't have an underlying type yet: if we did, we'd have promoted | |
1506 | it and this wouldn't be a forward any more) but with the @code{kind} of the | |
1507 | forward. This means that we can distinguish forwards to structs, enums and | |
1508 | unions reliably and ensure they land in the appropriate namespace even before | |
1509 | the actual struct, union or enum is found. | |
1510 | ||
1511 | @node The symtypetab sections | |
1512 | @section The symtypetab sections | |
1513 | @cindex Symtypetab section | |
1514 | @cindex Sections, symtypetab | |
1515 | @cindex Function info section | |
1516 | @cindex Sections, function info | |
1517 | @cindex Data object section | |
1518 | @cindex Sections, data object | |
1519 | @cindex Function info index section | |
1520 | @cindex Sections, function info index | |
1521 | @cindex Data object index section | |
1522 | @cindex Sections, data object index | |
1523 | @tindex CTF_F_IDXSORTED | |
1524 | @tindex CTF_F_DYNSTR | |
1525 | @cindex Bug workarounds, CTF_F_DYNSTR | |
1526 | ||
1527 | These are two very simple sections with identical formats, used by consumers to | |
1528 | map from ELF function and data symbols directly to their types. So they are | |
1529 | usually populated only in CTF sections that are embedded in ELF objects. | |
1530 | ||
1531 | Their format is very simple: an array of type IDs. Which symbol each type ID | |
1532 | corresponds to depends on whether the optional @emph{index section} associated | |
1533 | with this symtypetab section has any content. | |
1534 | ||
1535 | If the index section is nonempty, it is an array of @code{uint32_t} string table | |
1536 | offsets, each giving the name of the symbol whose type is at the same offset in | |
1537 | the corresponding non-index section: users can look up symbols in such a table | |
1538 | by name. The index section and corresponding symtypetab section are usually | |
1539 | ASCIIbetically sorted (indicated by the @code{CTF_F_IDXSORTED} flag in the | |
1540 | header): if they are sorted, they can be bsearched for a symbol name rather than | |
1541 | having to use a slower linear search. | |
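
A minimal sketch of such a lookup over a sorted index follows.  Everything named @code{example_*} is hypothetical and not part of @code{libctf}.  The sketch also assumes that every index entry refers to the internal strtab, that the index and type arrays have already been located in memory, and that returning type ID 0 is an acceptable way to report a missing symbol.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: binary-search a sorted index section for
   SYMNAME.  IDX holds N strtab offsets, and TYPES holds the N type
   IDs at the same positions in the corresponding symtypetab section.
   Returns the matching type ID, or 0 if the name is not present.  */
static uint32_t
example_lookup_by_name (const char *strtab, const uint32_t *idx,
                        const uint32_t *types, size_t n,
                        const char *symname)
{
  size_t lo = 0, hi = n;

  assert (strtab != NULL && symname != NULL);

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      /* strcmp compares bytes, matching ASCIIbetical ordering.  */
      int cmp = strcmp (symname, strtab + idx[mid]);

      if (cmp == 0)
        return types[mid];
      if (cmp < 0)
        hi = mid;
      else
        lo = mid + 1;
    }
  return 0;
}
```

If the sorted flag is not set, the same arrays have to be scanned linearly instead, since the binary search above is only correct on sorted input.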
1542 | ||
1543 | If the data object index section is empty, the entries in the data object and | |
1544 | function info sections are associated 1:1 with ELF symbols of type | |
1545 | @code{STT_OBJECT} (for data object) or @code{STT_FUNC} (for function info) with | |
1546 | a nonzero value: the linker shuffles the symtypetab sections to correspond with | |
1547 | the order of the symbols in the ELF file. Symbols with no name, undefined | |
1548 | symbols and symbols named ``@code{_START_}'' and ``@code{_END_}'' are skipped | |
1549 | and never appear in either section. Symbols that have no corresponding type are | |
1550 | represented by type ID 0. The section may have fewer entries than the symbol | |
1551 | table, in which case no later entries have associated types. This format is | |
1552 | more compact than an indexed form if most entries have types (since there is no | |
1553 | need to record any symbol names), but if the producer and consumer disagree even | |
1554 | slightly about which symbols are omitted, the types of all further symbols will | |
1555 | be wrong! | |
1556 | ||
1557 | The compiler always emits indexed symtypetab tables, because there is no symbol | |
1558 | table yet. The linker will always have to read them all in and always works | |
1559 | through them from start to end, so there is no benefit having the compiler sort | |
1560 | them either. The linker (actually, @code{libctf}'s linking machinery) will | |
1561 | automatically sort unsorted indexed sections, and convert indexed sections that | |
1562 | contain a lot of pads into the more compact, unindexed form. | |
1563 | ||
1564 | If child dicts are in use, only symbols that use types actually mentioned in the | |
1565 | child appear in the child's symtypetab: symbols that use only types in the | |
1566 | parent appear in the parent's symtypetab instead. So the child's symtypetab will | |
1567 | almost always be very sparse, and thus will usually use the indexed form even in | |
1568 | fully linked objects. (It is, of course, impossible for symbols to exist that | |
1569 | use types from multiple child dicts at once, since it's impossible to declare a | |
1570 | function in C that uses types that are only visible in two different, disjoint | |
1571 | translation units.) | |
1572 | ||
1573 | @node The variable section | |
1574 | @section The variable section | |
1575 | @cindex Variable section | |
1576 | @cindex Sections, variable | |
1577 | ||
1578 | The variable section is a simple array mapping names (strtab entries) to type | |
1579 | IDs, intended to provide a replacement for the data object section in dynamic | |
1580 | situations in which there is no static ELF strtab but the consumer instead hands | |
1581 | back names. The section is sorted into ASCIIbetical order by name for rapid | |
1582 | lookup, like the CTF archive name table. | |
1583 | ||
1584 | The section is an array of these structures: | |
1585 | ||
1586 | @verbatim | |
1587 | typedef struct ctf_varent | |
1588 | { | |
1589 | uint32_t ctv_name; | |
1590 | uint32_t ctv_type; | |
1591 | } ctf_varent_t; | |
1592 | @end verbatim | |
1593 | ||
1594 | @tindex struct ctf_varent | |
1595 | @tindex ctf_varent_t | |
1596 | @multitable {Offset} {@code{uint32_t ctv_name}} {Strtab offset of the name} | |
1597 | @headitem Offset @tab Name @tab Description | |
1598 | @item 0x00 | |
1599 | @tab @code{uint32_t ctv_name} | |
1600 | @vindex ctv_name | |
1601 | @vindex struct ctf_varent, ctv_name | |
1602 | @vindex ctf_varent_t, ctv_name | |
1603 | @tab Strtab offset of the name | |
1604 | ||
1605 | @item 0x04 | |
1606 | @tab @code{uint32_t ctv_type} | |
1607 | @vindex ctv_type | |
1608 | @vindex struct ctf_varent, ctv_type | |
1609 | @vindex ctf_varent_t, ctv_type | |
1610 | @tab Type ID of the variable's type | |
1611 | @end multitable | |
1612 | ||
1613 | There is no analogue of the function info section yet: v4 will probably drop | |
1614 | this section in favour of a way to put both indexed (thus, named) and nonindexed | |
1615 | symbols into the symtypetab sections at the same time. | |
1616 | ||
1617 | @node The label section | |
1618 | @section The label section | |
1619 | @cindex Label section | |
1620 | @cindex Sections, label | |
1621 | ||
1622 | The label section is a currently-unused facility allowing the tiling of the type | |
1623 | space with names taken from the strtab. The section is an array of these | |
1624 | structures: | |
1625 | ||
1626 | @verbatim | |
1627 | typedef struct ctf_lblent | |
1628 | { | |
1629 | uint32_t ctl_label; | |
1630 | uint32_t ctl_type; | |
1631 | } ctf_lblent_t; | |
1632 | @end verbatim | |
1633 | ||
1634 | @tindex struct ctf_lblent | |
1635 | @tindex ctf_lblent_t | |
1636 | @multitable {Offset} {@code{uint32_t ctl_label}} {Strtab offset of the label} | |
1637 | @headitem Offset @tab Name @tab Description | |
1638 | @item 0x00 | |
1639 | @tab @code{uint32_t ctl_label} | |
1640 | @vindex ctl_label | |
1641 | @vindex struct ctf_lblent, ctl_label | |
1642 | @vindex ctf_lblent_t, ctl_label | |
1643 | @tab Strtab offset of the label | |
1644 | ||
1645 | @item 0x04 | |
1646 | @tab @code{uint32_t ctl_type} | |
1647 | @vindex ctl_type | |
1648 | @vindex struct ctf_lblent, ctl_type | |
1649 | @vindex ctf_lblent_t, ctl_type | |
1650 | @tab Type ID of the last type covered by this label | |
1651 | @end multitable | |
1652 | ||
1653 | Semantics will be attached to labels soon, probably in v4 (the plan is to use | |
1654 | them to allow multiple disjoint namespaces in a single CTF file, removing many | |
1655 | uses of CTF archives, in particular in the @code{.ctf} section in ELF objects). | |
1656 | ||
1657 | @node The string section | |
1658 | @section The string section | |
1659 | @cindex String section | |
1660 | @cindex Sections, string | |
1661 | ||
1662 | This section is a simple ELF-format strtab, starting with a zero byte (thus | |
1663 | ensuring that the string with offset 0 is the null string, as assumed elsewhere | |
1664 | in this spec). The strtab is usually ASCIIbetically sorted to somewhat improve | |
1665 | compression efficiency. | |
1666 | ||
1667 | Where the strtab is unusual is the @emph{references} to it. CTF has two | |
1668 | string tables, the internal strtab and an external strtab associated | |
1669 | with the CTF dictionary at open time: usually, this is the ELF dynamic | |
1670 | strtab (@code{.dynstr}) of a CTF dictionary embedded in an ELF file. We | |
1671 | distinguish between these strtabs by the most significant bit, bit 31, | |
1672 | of the 32-bit strtab references: if it is 0, the offset is in the | |
1673 | internal strtab: if 1, the offset is in the external strtab. | |
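
The bit-31 rule can be sketched in C as follows.  The @code{EXAMPLE_*} constants and @code{example_*} functions are hypothetical names chosen for this illustration, not the identifiers used in the real headers.  The sketch assumes only what is stated above: bit 31 selects the strtab and the remaining 31 bits are the offset.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the two strtabs a reference can select.  */
#define EXAMPLE_STRTAB_INTERNAL 0
#define EXAMPLE_STRTAB_EXTERNAL 1

/* Which strtab does this 32-bit reference point into?  Bit 31 clear
   means internal, bit 31 set means external.  */
static int
example_name_strtab (uint32_t ref)
{
  return (int) (ref >> 31);
}

/* The offset within the selected strtab: the low 31 bits.  */
static uint32_t
example_name_offset (uint32_t ref)
{
  return ref & 0x7fffffffu;
}
```

Because one bit is reserved for the selector, no offset into either strtab can exceed 0x7fffffff.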
1674 | ||
1675 | @tindex CTF_F_DYNSTR | |
1676 | @cindex Bug workarounds, CTF_F_DYNSTR | |
1677 | There is a bug workaround in this area: in format v3 (the first version | |
1678 | to have working support for external strtabs), the external strtab is | |
1679 | @code{.strtab} unless the @code{CTF_F_DYNSTR} flag is set on the | |
1680 | dictionary (@pxref{CTF file-wide flags}). Format v4 will introduce a | |
1681 | header field that explicitly names the external strtab, making this flag | |
1682 | unnecessary. | |
1683 | ||
1684 | @node Data models | |
1685 | @section Data models | |
1686 | @cindex Data models | |
1687 | ||
1688 | The data model is a simple integer which indicates the ABI in use on this | |
1689 | platform. Right now, it is very simple, distinguishing only between 32- and | |
1690 | 64-bit types: a model of 1 indicates ILP32, 2 indicates LP64. The mapping from | |
1691 | ABI integer to type sizes is hardwired into @code{libctf}: currently, we use | |
1692 | this to hardwire the size of pointers, function pointers, and enumerated types. | |
1693 | ||
1694 | This is a very kludgy corner of CTF and will probably be replaced with explicit | |
1695 | header fields to record this sort of thing in future. | |
1696 | ||
1697 | @node Limits of CTF | |
1698 | @section Limits of CTF | |
1699 | @cindex Limits | |
1700 | ||
1701 | The following limits are imposed by various aspects of CTF version 3: | |
1702 | ||
1703 | @table @code | |
1704 | @item CTF_MAX_TYPE | |
1705 | Maximum type identifier (maximum number of types accessible with parent and | |
1706 | child containers in use): 0xfffffffe | |
1707 | @item CTF_MAX_PTYPE | |
1708 | Maximum type identifier in a parent dictionary: maximum number of types in any | |
1709 | one dictionary: 0x7fffffff | |
1710 | @item CTF_MAX_NAME | |
1711 | Maximum offset into a string table: 0x7fffffff | |
1712 | @item CTF_MAX_VLEN | |
1713 | Maximum number of members in a struct, union, or enum: maximum number of | |
1714 | function args: 0xffffff | |
1715 | @item CTF_MAX_SIZE | |
1716 | Maximum size of a @code{ctf_stype_t} in bytes before we fall back to | |
1717 | @code{ctf_type_t}: 0xfffffffe bytes | |
1718 | @end table | |
1719 | ||
1720 | Other maxima without associated macros: | |
1721 | @itemize | |
1722 | @item | |
1723 | Maximum value of an enumerated type: 2^32 | |
1724 | @item | |
1725 | Maximum size of an array element: 2^32 | |
1726 | @end itemize | |
1727 | ||
1728 | These maxima are generally considered to be too low, because C programs can and | |
1729 | do exceed them: they will be lifted in format v4. | |
1730 | ||
1731 | @node Index | |
1732 | @unnumbered Index | |
1733 | ||
1734 | @printindex cp | |
1735 | ||
1736 | @bye |