\input texinfo @c -*- Texinfo -*-
@setfilename ctf-spec.info
@settitle The CTF File Format
@ifnottex
@xrefautomaticsectiontitle on
@end ifnottex
@synindex fn cp
@synindex tp cp
@synindex vr cp

@copying
Copyright @copyright{} 2021-2024 Free Software Foundation, Inc.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU General Public License, Version 3 or any
later version published by the Free Software Foundation. A copy of the
license is included in the section entitled ``GNU General Public
License''.

@end copying

@dircategory Software development
@direntry
* CTF: (ctf-spec). The CTF file format.
@end direntry

@titlepage
@title The CTF File Format
@subtitle Version 3
@author Nick Alcock

@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage
@contents

@ifnottex
@node Top
@top The CTF file format

This manual describes version 3 of the CTF file format, which is
intended to model the C type system in a fashion that C programs can
consume at runtime.
@end ifnottex

@node Overview
@unnumbered Overview
@cindex Overview

The CTF file format compactly describes C types and the association
between function and data symbols and types: if embedded in ELF objects,
it can exploit the ELF string table to reduce duplication further.
There is no real concept of namespacing: only top-level types are
described, not types scoped to within single functions.

CTF dictionaries can be @dfn{children} of other dictionaries, in a
one-level hierarchy: child dictionaries can refer to types in the
parent, but the opposite is not sensible (since if you refer to a child
type in the parent, the actual type you cited would vary depending on
what child was attached). This parent/child definition is recorded in
the child, but only as a recommendation: users of the API have to attach
parents to children explicitly, and can choose to attach a child to any
parent they like, or to none, though doing so might lead to unpleasant
consequences like dangling references to types. @xref{Type indexes and
type IDs}. Type lookups in child dicts that are not associated with a
parent at all will fail with @code{ECTF_NOPARENT} if a parent type was
needed.

The associated API to generate, merge together, and query this file
format will be described in the accompanying @code{libctf} manual once
it is written. There is no API to modify dictionaries once they've been
written out: CTF is a write-once file format. (However, it is always
possible to dynamically create a new child dictionary on the fly and
attach it to a pre-existing, read-only parent.)

There are two major pieces to CTF: the @dfn{archive} and the
@dfn{dictionary}. Some relatives and ancestors of CTF call dictionaries
@dfn{containers}: the archive format is unique to this variant of CTF.
(Much of the source code still uses the old term.)

The archive file format is a very simple mmappable archive used to group
multiple dictionaries together: it is expected to slowly go
away and be replaced by other mechanisms, but right now it is an
important part of the file format, used to group dictionaries containing
types with conflicting definitions in different TUs with the overarching
dictionary used to store all other types. (Even when archives go away,
the @code{libctf} API used to access them will remain, and will access
the mechanisms that replace them instead.)

The CTF dictionary consists of a @dfn{preamble}, which does not vary
between versions of the CTF file format, and a @dfn{header} and some
number of @dfn{sections}, which can vary between versions.

The rest of this specification describes the format of these sections,
first for the latest version of CTF, then for all earlier versions
supported by @code{libctf}: the earlier versions are defined in terms of
their differences from the next later one. We describe each part of the
format first by reproducing the C structure which defines that part,
then describing it at greater length in terms of file offsets.

The description of the file format ends with a description of relevant
limits that apply to it. These limits can vary between file format
versions.

This document is quite young, so for now the C code in @file{ctf.h}
should be presumed correct when this document conflicts with it.

@node CTF archive
@chapter CTF archives
@cindex archive, CTF archive

The CTF archive format maps names to CTF dictionaries. The names may
contain any character other than \0, but for now archives containing
slashes in the names may not extract correctly. It is possible to
insert multiple members with the same name, but these are quite hard to
access reliably (you have to iterate through all the members rather than
opening by name) so this is not recommended.

CTF archives are not themselves compressed: the constituent components,
CTF dictionaries, can be compressed (@pxref{CTF header}).

CTF archives usually contain a collection of related dictionaries, one
parent and many children of that parent. CTF archives can have a member
with a @dfn{default name}, @code{.ctf} (which can be represented as
@code{NULL} in the API). If present, this member is usually the parent
of all the children, but it is possible for CTF producers to emit
parents with different names if they wish (usually for backward-
compatibility purposes).

@code{.ctf} sections in ELF objects consist of a single CTF dictionary
rather than an archive of dictionaries if and only if the section
contains no types with identical names but conflicting definitions: if
two conflicting definitions exist, the deduplicator will place the type
most commonly referred to by other types in the parent and will place
the other type in a child named after the translation unit it is found
in, and will emit a CTF archive containing both dictionaries instead of
a raw dictionary. All types that refer to such conflicting types are
also placed in the per-translation-unit child.

The definition of an archive in @file{ctf.h} is as follows:

@verbatim
struct ctf_archive
{
  uint64_t ctfa_magic;
  uint64_t ctfa_model;
  uint64_t ctfa_nfiles;
  uint64_t ctfa_names;
  uint64_t ctfa_ctfs;
};

typedef struct ctf_archive_modent
{
  uint64_t name_offset;
  uint64_t ctf_offset;
} ctf_archive_modent_t;
@end verbatim

(Note one irregularity here: the @code{ctf_archive_t} is not a typedef
to @code{struct ctf_archive}, but a different typedef, private to
@code{libctf}, so that things that are not really archives can be made
to appear as if they were.)

All the above items are always in little-endian byte order, regardless
of the machine endianness.

The archive header has the following fields:

@tindex struct ctf_archive
@multitable {Offset} {@code{uint64_t ctfa_nfiles}} {The data model for this archive: an arbitrary integer}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint64_t ctfa_magic}
@vindex ctfa_magic
@vindex struct ctf_archive, ctfa_magic
@tab The magic number for archives, @code{CTFA_MAGIC}: 0x8b47f2a4d7623eeb.
@tindex CTFA_MAGIC

@item 0x08
@tab @code{uint64_t ctfa_model}
@vindex ctfa_model
@vindex struct ctf_archive, ctfa_model
@tab The data model for this archive: an arbitrary integer that serves no
purpose but to be handed back by the libctf API. @xref{Data models}.

@item 0x10
@tab @code{uint64_t ctfa_nfiles}
@vindex ctfa_nfiles
@vindex struct ctf_archive, ctfa_nfiles
@tab The number of CTF dictionaries in this archive.

@item 0x18
@tab @code{uint64_t ctfa_names}
@vindex ctfa_names
@vindex struct ctf_archive, ctfa_names
@tab Offset of the name table, in bytes from the start of the archive.
The name table is an array of @code{ctf_archive_modent_t[ctfa_nfiles]}.

@item 0x20
@tab @code{uint64_t ctfa_ctfs}
@vindex ctfa_ctfs
@vindex struct ctf_archive, ctfa_ctfs
@tab Offset of the CTF table. Each element starts with a @code{uint64_t} size,
followed by a CTF dictionary.

@end multitable
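
As an illustration of the header layout above, here is a minimal sketch
in C that decodes the five header fields from a byte buffer and checks
the magic number. The names @code{sketch_rd_le64},
@code{sketch_archive_hdr} and @code{sketch_read_archive_hdr} are
invented for this example and are not part of @code{libctf} or
@file{ctf.h}; the sketch relies only on the offsets, the little-endian
rule and the @code{CTFA_MAGIC} value given above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CTFA_MAGIC 0x8b47f2a4d7623eebULL /* CTFA_MAGIC, per the table */
#define SKETCH_CTFA_HDRLEN 0x28                 /* five uint64_t fields */

/* Host-order copy of the archive header (hypothetical type).  */
struct sketch_archive_hdr
{
  uint64_t magic, model, nfiles, names, ctfs;
};

/* Read a little-endian uint64_t, independent of host endianness.  */
uint64_t
sketch_rd_le64 (const unsigned char *p)
{
  uint64_t v = 0;
  for (int i = 7; i >= 0; i--)
    v = (v << 8) | p[i];
  return v;
}

/* Decode the header at the start of BUF.  Returns 0 on success, -1 if
   the buffer is too short or the magic number does not match.  */
int
sketch_read_archive_hdr (const unsigned char *buf, size_t len,
                         struct sketch_archive_hdr *out)
{
  assert (buf != NULL && out != NULL);
  if (len < SKETCH_CTFA_HDRLEN)
    return -1;
  out->magic = sketch_rd_le64 (buf + 0x00);
  out->model = sketch_rd_le64 (buf + 0x08);
  out->nfiles = sketch_rd_le64 (buf + 0x10);
  out->names = sketch_rd_le64 (buf + 0x18);
  out->ctfs = sketch_rd_le64 (buf + 0x20);
  return out->magic == SKETCH_CTFA_MAGIC ? 0 : -1;
}
```

Reading the fields byte by byte avoids any dependence on the host byte
order, which matters here because archives are always little-endian
while dictionaries are not.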

The array pointed to by @code{ctfa_names} is an array of
@code{ctf_archive_modent_t} entries:

@tindex struct ctf_archive_modent
@tindex ctf_archive_modent_t
@multitable {Offset} {@code{uint64_t name_offset}} {Offset of this name, in bytes from the start}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint64_t name_offset}
@vindex name_offset
@vindex struct ctf_archive_modent, name_offset
@vindex ctf_archive_modent_t, name_offset
@tab Offset of this name, in bytes from the start of the archive.

@item 0x08
@tab @code{uint64_t ctf_offset}
@vindex ctf_offset
@vindex struct ctf_archive_modent, ctf_offset
@vindex ctf_archive_modent_t, ctf_offset
@tab Offset of this CTF dictionary, in bytes from the start of the archive.

@end multitable

The @code{ctfa_names} array is sorted into ASCIIbetical order by name
(i.e. by the result of dereferencing the @code{name_offset}).

The archive file also contains a name table and a table of CTF
dictionaries: these are pointed to by the structures above. The name
table is a simple strtab which is not required to be sorted; the
dictionary array is described above in the entry for @code{ctfa_ctfs}.

The relative order of these various parts is not defined, except that
the header naturally always comes first.

@node CTF dictionaries
@chapter CTF dictionaries
@cindex dictionary, CTF dictionary

CTF dictionaries consist of a header, starting with a preamble, and a
number of sections.

@node CTF Preamble
@section CTF Preamble

The preamble is the only part of the CTF dictionary whose format cannot
vary between versions. It is never compressed. It is correspondingly
simple:

@verbatim
typedef struct ctf_preamble
{
  unsigned short ctp_magic;
  unsigned char ctp_version;
  unsigned char ctp_flags;
} ctf_preamble_t;
@end verbatim

@code{#define}s are provided under the names @code{cth_magic},
@code{cth_version} and @code{cth_flags} to make the fields of the
@code{ctf_preamble_t} appear to be part of the @code{ctf_header_t}, so
consuming programs rarely need to consider the existence of the preamble
as a separate structure.

@tindex struct ctf_preamble
@tindex ctf_preamble_t
@multitable {Offset} {@code{unsigned char ctp_version}} {The magic number for CTF dictionaries}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{unsigned short ctp_magic}
@vindex ctp_magic
@vindex cth_magic
@vindex ctf_preamble_t, ctp_magic
@vindex struct ctf_preamble, ctp_magic
@vindex ctf_header_t, cth_magic
@vindex struct ctf_header, cth_magic
@tab The magic number for CTF dictionaries, @code{CTF_MAGIC}: 0xdff2.
@tindex CTF_MAGIC

@item 0x02
@tab @code{unsigned char ctp_version}
@vindex ctp_version
@vindex cth_version
@vindex ctf_preamble_t, ctp_version
@vindex struct ctf_preamble, ctp_version
@vindex ctf_header_t, cth_version
@vindex struct ctf_header, cth_version
@tab The version number of this CTF dictionary.

@item 0x03
@tab @code{unsigned char ctp_flags}
@vindex ctp_flags
@vindex cth_flags
@vindex ctf_preamble_t, ctp_flags
@vindex struct ctf_preamble, ctp_flags
@vindex ctf_header_t, cth_flags
@vindex struct ctf_header, cth_flags
@tab Flags for this CTF file. @xref{CTF file-wide flags}.
@end multitable

@cindex alignment
Every element of a dictionary must be naturally aligned unless otherwise
specified. (This restriction will be lifted in later versions.)

@cindex endianness
CTF dictionaries are stored in the native endianness of the system that
generates them: the consumer (e.g., @code{libctf}) can detect whether to
endian-flip a CTF dictionary by inspecting the @code{ctp_magic}. (If it
appears as 0xf2df, endian-flipping is needed.)
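
The endianness check described above can be sketched as follows. This
is a minimal example, not the @code{libctf} implementation: the names
@code{sketch_order} and @code{sketch_detect_order} are hypothetical, and
the only facts it uses are the @code{CTF_MAGIC} value 0xdff2 and its
byte-swapped form 0xf2df.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_CTF_MAGIC 0xdff2 /* CTF_MAGIC, per the preamble table */

enum sketch_order { SKETCH_NATIVE, SKETCH_SWAPPED, SKETCH_NOT_CTF };

/* Inspect the first two bytes of a dictionary and report whether it
   is in host byte order, needs endian-flipping, or is not CTF.  */
enum sketch_order
sketch_detect_order (const unsigned char *buf)
{
  uint16_t magic;

  assert (buf != NULL);
  memcpy (&magic, buf, sizeof magic); /* read in host byte order */
  if (magic == SKETCH_CTF_MAGIC)
    return SKETCH_NATIVE;
  if (magic == (uint16_t) ((SKETCH_CTF_MAGIC >> 8) | (SKETCH_CTF_MAGIC << 8)))
    return SKETCH_SWAPPED;
  return SKETCH_NOT_CTF;
}
```

Using @code{memcpy} rather than a pointer cast keeps the read valid
even if the buffer is not aligned for a 16-bit access.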

The version of the CTF dictionary can be determined by inspecting
@code{ctp_version}. The following versions are currently valid, and
@code{libctf} can read all of them:

@tindex CTF_VERSION_3
@cindex CTF versions, versions
@multitable {@code{CTF_VERSION_1_UPGRADED_3}} {Number} {First version, rare. Very similar to Solaris CTF.}
@headitem Version @tab Number @tab Description
@item @code{CTF_VERSION_1}
@tab 1 @tab First version, rare. Very similar to Solaris CTF.

@item @code{CTF_VERSION_1_UPGRADED_3}
@tab 2 @tab First version, upgraded to v3 or higher and written out again.
Name may change. Very rare.

@item @code{CTF_VERSION_2}
@tab 3 @tab Second version, with many range limits lifted.

@item @code{CTF_VERSION_3}
@tab 4 @tab Third and current version, documented here.
@end multitable

This section documents @code{CTF_VERSION_3}.

@vindex ctp_flags
@node CTF file-wide flags
@subsection CTF file-wide flags

The preamble contains bitflags in its @code{ctp_flags} field that
describe various file-wide properties. Some of the flags are valid only
for particular file-format versions, which means the flags can be used
to fix file-format bugs. Consumers that see unknown flags should
accordingly assume that the dictionary is not comprehensible, and
refuse to open it.

The following flags are currently defined. Many are bug workarounds,
valid only in CTFv3, and will not be valid in any future versions: the
same values may be reused for other flags in v4+.

@multitable {@code{CTF_F_NEWFUNCINFO}} {Versions} {Value} {The external strtab is in @code{.dynstr} and the}
@headitem Flag @tab Versions @tab Value @tab Meaning
@tindex CTF_F_COMPRESS
@item @code{CTF_F_COMPRESS} @tab All @tab 0x1 @tab Compressed with zlib
@tindex CTF_F_NEWFUNCINFO
@item @code{CTF_F_NEWFUNCINFO} @tab 3 only @tab 0x2
@tab ``New-format'' func info section.
@tindex CTF_F_IDXSORTED
@item @code{CTF_F_IDXSORTED} @tab 3+ @tab 0x4 @tab The index section is
in sorted order
@tindex CTF_F_DYNSTR
@item @code{CTF_F_DYNSTR} @tab 3 only @tab 0x8 @tab The external strtab is
in @code{.dynstr} and the symtab used is @code{.dynsym}.
@xref{The string section}.
@end multitable

@code{CTF_F_NEWFUNCINFO} and @code{CTF_F_IDXSORTED} relate to the
function info and data object sections. @xref{The symtypetab sections}.

Further flags (and further compression methods) will be added in future.
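
The rule that consumers should refuse dictionaries carrying unknown
flags can be sketched like this for a CTFv3 reader. The macro and
function names below are hypothetical stand-ins for the
@code{CTF_F_*} constants in the table above; only the flag values are
taken from this document.

```c
#include <assert.h>

/* Values copied from the flag table above; names are hypothetical.  */
#define SKETCH_F_COMPRESS    0x1
#define SKETCH_F_NEWFUNCINFO 0x2
#define SKETCH_F_IDXSORTED   0x4
#define SKETCH_F_DYNSTR      0x8

#define SKETCH_F_KNOWN_V3 \
  (SKETCH_F_COMPRESS | SKETCH_F_NEWFUNCINFO \
   | SKETCH_F_IDXSORTED | SKETCH_F_DYNSTR)

/* Return nonzero if every bit set in FLAGS is a flag this sketch knows
   about for CTFv3, zero if the dictionary should be rejected.  */
int
sketch_flags_ok_v3 (unsigned char flags)
{
  return (flags & ~SKETCH_F_KNOWN_V3) == 0;
}

/* Return nonzero if the body of the dictionary is zlib-compressed.  */
int
sketch_is_compressed (unsigned char flags)
{
  assert (sketch_flags_ok_v3 (flags));
  return (flags & SKETCH_F_COMPRESS) != 0;
}
```

Checking for unknown bits before interpreting any known one mirrors the
intent above: a flag may change how the rest of the file must be read.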

@node CTF header
@section CTF header
@cindex CTF header
@cindex Sections, header

The CTF header is the first part of a CTF dictionary, including the
preamble. All parts of it other than the preamble (@pxref{CTF Preamble})
can vary between CTF file versions and are never compressed. It
contains things that apply to the dictionary as a whole, and a table of
the sections into which the rest of the dictionary is divided. The
sections tile the file: each section runs from the offset given until
the start of the next section. Only the last section cannot follow this
rule, so the header has a length for it instead.

All section offsets, here and in the rest of the CTF file, are relative to the
@emph{end} of the header. (This is annoyingly different to how offsets in CTF
archives are handled.)

This is the first structure to include offsets into the string table, which are
not straight references because CTF dictionaries can include references into the
ELF string table to save space, as well as into the string table internal to the
CTF dictionary. @xref{The string section} for more on these. Offset 0 is
always the null string.

@verbatim
typedef struct ctf_header
{
  ctf_preamble_t cth_preamble;
  uint32_t cth_parlabel;
  uint32_t cth_parname;
  uint32_t cth_cuname;
  uint32_t cth_lbloff;
  uint32_t cth_objtoff;
  uint32_t cth_funcoff;
  uint32_t cth_objtidxoff;
  uint32_t cth_funcidxoff;
  uint32_t cth_varoff;
  uint32_t cth_typeoff;
  uint32_t cth_stroff;
  uint32_t cth_strlen;
} ctf_header_t;
@end verbatim

In detail:

@tindex struct ctf_header
@tindex ctf_header_t
@multitable {Offset} {@code{ctf_preamble_t cth_preamble}} {The parent label, if deduplication happened against}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{ctf_preamble_t cth_preamble}
@vindex cth_preamble
@vindex struct ctf_header, cth_preamble
@vindex ctf_header_t, cth_preamble
@tab The preamble (conceptually embedded in the header). @xref{CTF Preamble}.

@item 0x04
@tab @code{uint32_t cth_parlabel}
@vindex cth_parlabel
@vindex struct ctf_header, cth_parlabel
@vindex ctf_header_t, cth_parlabel
@tab The parent label, if deduplication happened against a specific label: a
strtab offset. @xref{The label section}. Currently unused and always 0, but may
be used in future when semantics are attached to the label section.

@item 0x08
@tab @code{uint32_t cth_parname}
@vindex cth_parname
@vindex struct ctf_header, cth_parname
@vindex ctf_header_t, cth_parname
@tab The name of the parent dictionary deduplicated against: a strtab offset.
Interpretation is up to the consumer (usually a CTF archive member name). 0
(the null string) if this is not a child dictionary.

@item 0x0c
@tab @code{uint32_t cth_cuname}
@vindex cth_cuname
@vindex struct ctf_header, cth_cuname
@vindex ctf_header_t, cth_cuname
@tab The name of the compilation unit, for consumers like GDB that want to
know which CU a single-CU dictionary describes: a strtab offset. 0 if this
dictionary describes types from many CUs.

@item 0x10
@tab @code{uint32_t cth_lbloff}
@vindex cth_lbloff
@vindex struct ctf_header, cth_lbloff
@vindex ctf_header_t, cth_lbloff
@tab The offset of the label section, which tiles the type space into
named regions. @xref{The label section}.

@item 0x14
@tab @code{uint32_t cth_objtoff}
@vindex cth_objtoff
@vindex struct ctf_header, cth_objtoff
@vindex ctf_header_t, cth_objtoff
@tab The offset of the data object symtypetab section, which maps ELF data symbols to
types. @xref{The symtypetab sections}.

@item 0x18
@tab @code{uint32_t cth_funcoff}
@vindex cth_funcoff
@vindex struct ctf_header, cth_funcoff
@vindex ctf_header_t, cth_funcoff
@tab The offset of the function info symtypetab section, which maps ELF function
symbols to a return type and arg types. @xref{The symtypetab sections}.

@item 0x1c
@tab @code{uint32_t cth_objtidxoff}
@vindex cth_objtidxoff
@vindex struct ctf_header, cth_objtidxoff
@vindex ctf_header_t, cth_objtidxoff
@tab The offset of the object index section, which maps ELF object symbols to
entries in the data object section. @xref{The symtypetab sections}.

@item 0x20
@tab @code{uint32_t cth_funcidxoff}
@vindex cth_funcidxoff
@vindex struct ctf_header, cth_funcidxoff
@vindex ctf_header_t, cth_funcidxoff
@tab The offset of the function info index section, which maps ELF function
symbols to entries in the function info section. @xref{The symtypetab sections}.

@item 0x24
@tab @code{uint32_t cth_varoff}
@vindex cth_varoff
@vindex struct ctf_header, cth_varoff
@vindex ctf_header_t, cth_varoff
@tab The offset of the variable section, which maps string names to types.
@xref{The variable section}.

@item 0x28
@tab @code{uint32_t cth_typeoff}
@vindex cth_typeoff
@vindex struct ctf_header, cth_typeoff
@vindex ctf_header_t, cth_typeoff
@tab The offset of the type section, the core of CTF, which describes types
using variable-length array elements. @xref{The type section}.

@item 0x2c
@tab @code{uint32_t cth_stroff}
@vindex cth_stroff
@vindex struct ctf_header, cth_stroff
@vindex ctf_header_t, cth_stroff
@tab The offset of the string section. @xref{The string section}.

@item 0x30
@tab @code{uint32_t cth_strlen}
@vindex cth_strlen
@vindex struct ctf_header, cth_strlen
@vindex ctf_header_t, cth_strlen
@tab The length of the string section (not an offset!). The CTF file ends
at this point.

@end multitable

Everything from this point on (until the end of the file at @code{cth_stroff} +
@code{cth_strlen}) is compressed with zlib if @code{CTF_F_COMPRESS} is set in
the preamble's @code{ctp_flags}.
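
The tiling rule can be made concrete with a short sketch that derives
each section's length from the header. Two things here are assumptions
rather than statements of this specification: that the sections appear
in the file in the same order as their offset fields appear in the
header, and the names used (@code{sketch_header_t},
@code{sketch_section_lengths}, @code{sketch_offset_in_dict}), which are
hypothetical and redeclared so that the example stands alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the ctf_header_t shown above under hypothetical names.  */
typedef struct sketch_header
{
  uint16_t magic;               /* preamble: ctp_magic */
  uint8_t version;              /* preamble: ctp_version */
  uint8_t flags;                /* preamble: ctp_flags */
  uint32_t parlabel, parname, cuname;
  uint32_t lbloff, objtoff, funcoff, objtidxoff, funcidxoff;
  uint32_t varoff, typeoff, stroff, str_len;
} sketch_header_t;

enum
{
  SK_LABEL, SK_OBJT, SK_FUNC, SK_OBJTIDX, SK_FUNCIDX,
  SK_VAR, SK_TYPE, SK_STR, SK_NSECT
};

/* Fill LEN with the byte length of each section.  Returns 0 on
   success, -1 if the offsets are not in nondecreasing order
   (assumption: sections are laid out in header order).  */
int
sketch_section_lengths (const sketch_header_t *h, uint32_t len[SK_NSECT])
{
  assert (h != NULL && len != NULL);
  /* Section start offsets, followed by the end of the string section.  */
  const uint64_t off[SK_NSECT + 1] = {
    h->lbloff, h->objtoff, h->funcoff, h->objtidxoff, h->funcidxoff,
    h->varoff, h->typeoff, h->stroff, (uint64_t) h->stroff + h->str_len
  };

  for (int i = 0; i < SK_NSECT; i++)
    {
      if (off[i + 1] < off[i])
        return -1;
      len[i] = (uint32_t) (off[i + 1] - off[i]);
    }
  return 0;
}

/* Section offsets count from the end of the header, so add its size
   to get a position within an uncompressed dictionary.  */
size_t
sketch_offset_in_dict (uint32_t section_off)
{
  return sizeof (sketch_header_t) + section_off;
}
```

For a compressed dictionary these positions apply only after the data
following the header has been decompressed.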

@node The type section
@section The type section
@cindex Type section
@cindex Sections, type

This section is the most important section in CTF, describing all the top-level
types in the program. It consists of an array of type structures, each of which
describes a type of some @dfn{kind}: each kind of type has some amount of
variable-length data associated with it (some kinds have none). The amount of
variable-length data associated with a given type can be determined by
inspecting the type, so the reading code can walk through the types in sequence
at opening time.

Each type structure is one of a set of overlapping structures in a discriminated
union of sorts: the variable-length data for each type immediately follows the
type's type structure. Here's the largest of the overlapping structures, which
is only needed for huge types and so is very rarely seen:

@verbatim
typedef struct ctf_type
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  __extension__
  union
  {
    uint32_t ctt_size;
    uint32_t ctt_type;
  };
  uint32_t ctt_lsizehi;
  uint32_t ctt_lsizelo;
} ctf_type_t;
@end verbatim

Here's the much more common smaller form:

@verbatim
typedef struct ctf_stype
{
  uint32_t ctt_name;
  uint32_t ctt_info;
  __extension__
  union
  {
    uint32_t ctt_size;
    uint32_t ctt_type;
  };
} ctf_stype_t;
@end verbatim

If @code{ctt_size} is the #define @code{CTF_LSIZE_SENT}, 0xffffffff, this type
is described by a @code{ctf_type_t}: otherwise, a @code{ctf_stype_t}.
@tindex CTF_LSIZE_SENT
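
To illustrate the sentinel rule, the following sketch picks the
structure length and the 64-bit size for one type entry. The type and
function names are hypothetical and the structure is redeclared without
the anonymous union so that the example is self-contained; in real code
the @code{CTF_TYPE_LSIZE} macro from @file{ctf.h} should be preferred
for assembling the large size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LSIZE_SENT 0xffffffffu /* CTF_LSIZE_SENT, per the text */

typedef struct sketch_type
{
  uint32_t name;
  uint32_t info;
  uint32_t size_or_type;        /* the ctt_size / ctt_type union */
  uint32_t lsizehi;             /* present only in the large form */
  uint32_t lsizelo;             /* present only in the large form */
} sketch_type_t;

/* Length in bytes of the type structure itself, excluding any
   variable-length data that follows it.  Only the first three words
   are read, so this is safe to call on a small-form entry.  */
size_t
sketch_type_struct_len (const sketch_type_t *t)
{
  assert (t != NULL);
  return (t->size_or_type == SKETCH_LSIZE_SENT
          ? 5 * sizeof (uint32_t) : 3 * sizeof (uint32_t));
}

/* Size recorded for a type of a kind that uses ctt_size.  The large
   fields are read only when the sentinel says they exist.  */
uint64_t
sketch_type_size (const sketch_type_t *t)
{
  assert (t != NULL);
  if (t->size_or_type != SKETCH_LSIZE_SENT)
    return t->size_or_type;
  return ((uint64_t) t->lsizehi << 32) | t->lsizelo;
}
```

A reader walking the type section would advance by
@code{sketch_type_struct_len} plus the kind-dependent variable-length
data to reach the next entry.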

Here's what the fields mean:

@tindex struct ctf_type
@tindex struct ctf_stype
@tindex ctf_type_t
@tindex ctf_stype_t
@multitable {0x1c (@code{ctf_type_t}} {@code{uint32_t ctt_lsizehi}} {The size of this type, if this type is of a kind for}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctt_name}
@vindex ctt_name
@tab Strtab offset of the type name, if any (0 if none).

@item 0x04
@tab @code{uint32_t ctt_info}
@vindex ctt_info
@vindex struct ctf_type, ctt_info
@vindex ctf_type_t, ctt_info
@vindex struct ctf_stype, ctt_info
@vindex ctf_stype_t, ctt_info
@tab The @dfn{info word}, containing information on the kind of this type, its
variable-length data and whether it is visible to name lookup. @xref{The
info word}.

@item 0x08
@tab @code{uint32_t ctt_size}
@vindex ctt_size
@vindex struct ctf_type, ctt_size
@vindex ctf_type_t, ctt_size
@vindex struct ctf_stype, ctt_size
@vindex ctf_stype_t, ctt_size
@tab The size of this type, if this type is of a kind for which a size needs
to be recorded (constant-size types don't need one). If this is
@code{CTF_LSIZE_SENT}, this type is a huge type described by @code{ctf_type_t}.

@item 0x08
@tab @code{uint32_t ctt_type}
@vindex ctt_type
@vindex struct ctf_stype, ctt_type
@vindex ctf_stype_t, ctt_type
@tab The type this type refers to, if this type is of a kind which refers to
other types (like a pointer). All such types are fixed-size, and no types that
are variable-size refer to other types, so @code{ctt_size} and @code{ctt_type}
overlap. All type kinds that use @code{ctt_type} are described by
@code{ctf_stype_t}, not @code{ctf_type_t}. @xref{Type indexes and type IDs}.

@item 0x0c (@code{ctf_type_t} only)
@tab @code{uint32_t ctt_lsizehi}
@vindex ctt_lsizehi
@vindex struct ctf_type, ctt_lsizehi
@vindex ctf_type_t, ctt_lsizehi
@tab The high 32 bits of the size of a very large type. The @code{CTF_TYPE_LSIZE} macro
can be used to get a 64-bit size out of this field and the next one.
@code{CTF_SIZE_TO_LSIZE_HI} splits the @code{ctt_lsizehi} out of it again.
@findex CTF_TYPE_LSIZE
@findex CTF_SIZE_TO_LSIZE_HI

@item 0x10 (@code{ctf_type_t} only)
@tab @code{uint32_t ctt_lsizelo}
@vindex ctt_lsizelo
@vindex struct ctf_type, ctt_lsizelo
@vindex ctf_type_t, ctt_lsizelo
@tab The low 32 bits of the size of a very large type.
@code{CTF_SIZE_TO_LSIZE_LO} splits the @code{ctt_lsizelo} out of a 64-bit size.
@findex CTF_SIZE_TO_LSIZE_LO
@end multitable

Two aspects of this need further explanation: the info word, and what exactly a
type ID is and how you determine it. (Information on the various type-kind-
dependent things, like whether @code{ctt_size} or @code{ctt_type} is used,
is described in the section devoted to each kind.)

@node The info word
@subsection The info word, ctt_info

The info word is a bitfield split into three parts. From MSB to LSB:

@multitable {Bit offset} {@code{isroot}} {Length of variable-length data for this type (some kinds only).}
@headitem Bit offset @tab Name @tab Description
@item 26--31
@tab @code{kind}
@tab Type kind: @pxref{Type kinds}.

@item 25
@tab @code{isroot}
@tab 1 if this type is visible to name lookup

@item 0--24
@tab @code{vlen}
@tab Length of variable-length data for this type (some kinds only).
The variable-length data directly follows the @code{ctf_type_t} or
@code{ctf_stype_t}. This is a kind-dependent array length value,
not a length in bytes. Some kinds have no variable-length data, or
fixed-size variable-length data, and do not use this value.
@end multitable

The most mysterious of these is undoubtedly @code{isroot}. This indicates
whether types with names (nonzero @code{ctt_name}) are visible to name lookup:
if zero, this type is considered a @dfn{non-root type} and you can't look it up
by name at all. Multiple types with the same name in the same C namespace
(struct, union, enum, other) can exist in a single dictionary, but only one of
them may have a nonzero value for @code{isroot}. @code{libctf} validates this
at open time and refuses to open dictionaries that violate this constraint.

Historically, this feature was introduced for the encoding of bitfields
(@pxref{Integer types}): for instance, int bitfields will all be named
@code{int} with different widths or offsets, but only the full-width one at
offset zero is wanted when you look up the type named @code{int}. With the
introduction of slices (@pxref{Slices}) as a more general bitfield encoding
mechanism, this is less important, but we still use non-root types to handle
conflicts if the linker API is used to fuse multiple translation units into one
dictionary and those translation units contain types with the same name and
conflicting definitions. (We do not discuss this further here, because the
linker never does this: only specialized type mergers do, like that used for the
Linux kernel. The libctf documentation will describe this in more detail.)
@c XXX update when libctf docs are written.

The @code{CTF_TYPE_INFO} macro can be used to compose an info word from
a @code{kind}, @code{isroot}, and @code{vlen}; @code{CTF_V2_INFO_KIND},
@code{CTF_V2_INFO_ISROOT} and @code{CTF_V2_INFO_VLEN} pick it apart again.
@findex CTF_TYPE_INFO
@findex CTF_V2_INFO_KIND
@findex CTF_V2_INFO_ISROOT
@findex CTF_V2_INFO_VLEN
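
As a rough sketch of what composing and picking apart an info word
involves, the functions below follow the bit positions in the table
above literally. They are hypothetical helpers written for this
example, not the @file{ctf.h} macros: @file{ctf.h} remains
authoritative for the exact masks, and its macros may limit
@code{vlen} more tightly than the table suggests.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_KIND_SHIFT   26          /* kind: bits 26--31 */
#define SKETCH_ISROOT_SHIFT 25          /* isroot: bit 25 */
#define SKETCH_VLEN_MASK    0x1ffffffu  /* vlen: bits 0--24, per the table */

/* Compose an info word from its three parts.  */
uint32_t
sketch_info (unsigned kind, int isroot, uint32_t vlen)
{
  assert (kind < 64);           /* kind must fit in six bits */
  return ((uint32_t) kind << SKETCH_KIND_SHIFT)
         | ((isroot ? 1u : 0u) << SKETCH_ISROOT_SHIFT)
         | (vlen & SKETCH_VLEN_MASK);
}

/* Extract the type kind.  */
unsigned
sketch_info_kind (uint32_t info)
{
  return info >> SKETCH_KIND_SHIFT;
}

/* Extract the isroot bit: 1 if visible to name lookup.  */
int
sketch_info_isroot (uint32_t info)
{
  return (int) ((info >> SKETCH_ISROOT_SHIFT) & 1u);
}

/* Extract the kind-dependent array length.  */
uint32_t
sketch_info_vlen (uint32_t info)
{
  return info & SKETCH_VLEN_MASK;
}
```

For example, a root type of kind 1 with three variable-length elements
is encoded as 0x06000003 under these positions.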

@node Type indexes and type IDs
@subsection Type indexes and type IDs
@cindex Type indexes
@cindex Type IDs
@cindex Type, IDs of
@cindex Type, indexes of
@cindex ctf_id_t

@cindex Parent range
@cindex Child range
@cindex Type IDs, ranges
Types are referred to within the CTF file via @dfn{type IDs}. A type ID is an
unsigned 32-bit number, from a space divided in half. Types @math{2^{31}-1}
and below are in the @dfn{parent range}: these IDs are used for dictionaries
that have not had any other dictionary @code{ctf_import}ed into them as a parent.
Both completely standalone dictionaries and parent dictionaries with children
hanging off them have types in this range. Types @math{2^{31}} and above are in
the @dfn{child range}: only types in child dictionaries are in this range.

These IDs appear in @code{ctf_type_t.ctt_type} (@pxref{The type section}), but
the types themselves have no visible ID: quite intentionally, because adding an
ID uses space, and every ID is different so they don't compress well. The IDs
are implicit: at open time, the consumer walks through the entire type section
and counts the types in the type section. The type section is an array of
variable-length elements, so each entry could be considered as having an index,
starting from 1. We count these indexes and associate each with its
corresponding @code{ctf_type_t} or @code{ctf_stype_t}.

Lookups of types with IDs in the parent space look in the parent dictionary if
this dictionary has one associated with it; lookups of types with IDs in the
child space error out if the dictionary does not have a parent, and otherwise
convert the ID into an index by shaving off the top bit and look up the index
in the child.

These properties mean that the same dictionary can be used as a parent of child
dictionaries and can also be used directly with no children at all, but a
dictionary created as a child dictionary must always be associated with a parent
--- usually, the same parent --- because its references to its own types have
the high bit turned on and this is only flipped off again if this is a child
dictionary. (This is not a problem, because if you @emph{don't} associate the
child with a parent, any references within it to its parent types will fail, and
there are almost certain to be many such references, or why is it a child at
all?)

This does mean that consumers should keep a close eye on the distinction between
type IDs and type indexes: if you mix them up, everything will appear to work as
long as you're only using parent dictionaries or standalone dictionaries, but as
soon as you start using children, everything will fail horribly.

Type index zero, and type ID zero, are used to indicate that this type cannot be
represented in CTF as currently constituted: they are emitted by the compiler,
but all type chains that terminate in the unknown type are erased at link time
(structure fields that use them just vanish, etc). So you will probably never
see a use of type zero outside the symtypetab sections, where they serve as
sentinels of sorts, to indicate symbols with no associated type.

The macros @code{CTF_V2_TYPE_TO_INDEX} and @code{CTF_V2_INDEX_TO_TYPE} may help
in translation between types and indexes: @code{CTF_V2_TYPE_ISPARENT} and
@code{CTF_V2_TYPE_ISCHILD} can be used to tell whether a given ID is in the
parent or child range.
@findex CTF_V2_TYPE_TO_INDEX
@findex CTF_V2_INDEX_TO_TYPE
@findex CTF_V2_TYPE_ISPARENT
@findex CTF_V2_TYPE_ISCHILD
781
782It is quite possible and indeed common for type IDs to point forward in the
783dictionary, as well as backward.
784
785@node Type kinds
786@subsection Type kinds
787@cindex Type kinds
788@cindex Type, kinds of
789
790Every type in CTF is of some @dfn{kind}. Each kind is some variety of C type:
791all structures are a single kind, as are all unions, all pointers, all arrays,
792all integers regardless of their bitfield width, etc. The kind of a type is
793given in the @code{kind} field of the @code{ctt_info} word (@pxref{The info
794word}).
795
796The space of type kinds is only a quarter full so far, so there is plenty of
797room for expansion. It is likely that in future versions of the file format,
798types with smaller kinds will be more efficiently encoded than types with larger
799kinds, so their numerical value will actually start to matter in future. (So
800these IDs will probably change their numerical values in a later release of this
801format, to move more frequently-used kinds like structures and cv-quals towards
802the top of the space, and move rarely-used kinds like integers downwards. Yes,
803integers are rare: how many kinds of @code{int} are there in a program? They're
804just very frequently @emph{referenced}.)
805
806Here's the set of kinds so far. Each kind has a @code{#define} associated with
807it, also given here.
808
809@multitable {Kind} {@code{CTF_K_VOLATILE}} {Indicates a type that cannot be represented in CTF, or that} {@xref{Pointers typedefs and cvr-quals}}
810@headitem Kind @tab Macro @tab Purpose
811@item 0
812@tab @code{CTF_K_UNKNOWN}
813@tab Indicates a type that cannot be represented in CTF, or that is being skipped.
814It is very similar to type ID 0, except that you can have @emph{multiple}, distinct types
815of kind @code{CTF_K_UNKNOWN}.
816@tindex CTF_K_UNKNOWN
817
818@item 1
819@tab @code{CTF_K_INTEGER}
820@tab An integer type. @xref{Integer types}.
821
822@item 2
823@tab @code{CTF_K_FLOAT}
824@tab A floating-point type. @xref{Floating-point types}.
825
826@item 3
827@tab @code{CTF_K_POINTER}
828@tab A pointer. @xref{Pointers typedefs and cvr-quals}.
829
830@item 4
831@tab @code{CTF_K_ARRAY}
832@tab An array. @xref{Arrays}.
833
834@item 5
835@tab @code{CTF_K_FUNCTION}
836@tab A function pointer. @xref{Function pointers}.
837
838@item 6
839@tab @code{CTF_K_STRUCT}
840@tab A structure. @xref{Structs and unions}.
841
842@item 7
843@tab @code{CTF_K_UNION}
844@tab A union. @xref{Structs and unions}.
845
846@item 8
847@tab @code{CTF_K_ENUM}
848@tab An enumerated type. @xref{Enums}.
849
850@item 9
851@tab @code{CTF_K_FORWARD}
@tab A forward declaration. @xref{Forward declarations}.
853
854@item 10
855@tab @code{CTF_K_TYPEDEF}
856@tab A typedef. @xref{Pointers typedefs and cvr-quals}.
857
858@item 11
859@tab @code{CTF_K_VOLATILE}
860@tab A volatile-qualified type. @xref{Pointers typedefs and cvr-quals}.
861
862@item 12
863@tab @code{CTF_K_CONST}
864@tab A const-qualified type. @xref{Pointers typedefs and cvr-quals}.
865
866@item 13
867@tab @code{CTF_K_RESTRICT}
868@tab A restrict-qualified type. @xref{Pointers typedefs and cvr-quals}.
869
870@item 14
871@tab @code{CTF_K_SLICE}
872@tab A slice, a change of the bit-width or offset of some other type. @xref{Slices}.
873@end multitable
874
875Now we cover all type kinds in turn. Some are more complicated than others.
876
877@node Integer types
878@subsection Integer types
879@cindex Integer types
880@cindex Types, integer
881@tindex int
882@tindex long
883@tindex long long
884@tindex short
885@tindex char
886@tindex bool
887@tindex unsigned int
888@tindex unsigned long
889@tindex unsigned long long
890@tindex unsigned short
891@tindex unsigned char
892@tindex signed int
893@tindex signed long
894@tindex signed long long
895@tindex signed short
896@tindex signed char
897@cindex CTF_K_INTEGER
898
899Integral types are all represented as types of kind @code{CTF_K_INTEGER}. These
900types fill out @code{ctt_size} in the @code{ctf_stype_t} with the size in bytes
901of the integral type in question. They are always represented by
902@code{ctf_stype_t}, never @code{ctf_type_t}. Their variable-length data is one
903@code{uint32_t} in length: @code{vlen} in the info word should be disregarded
904and is always zero.
905
906The variable-length data for integers has multiple items packed into it much
907like the info word does.
908
909@multitable {Bit offset} {Encoding} {The integer encoding and desired display representation.}
910@headitem Bit offset @tab Name @tab Description
911@item 24--31
912@tab Encoding
913@tab The desired display representation of this integer. You can extract this
914field with the @code{CTF_INT_ENCODING} macro. See below.
915@findex CTF_INT_ENCODING
916
917@item 16--23
918@tab Offset
919@tab The offset of this integral type in bits from the start of its enclosing
920structure field, adjusted for endianness: @pxref{Structs and unions}. You can
921extract this field with the @code{CTF_INT_OFFSET} macro.
922@findex CTF_INT_OFFSET
923
924@item 0--15
925@tab Bit-width
926@tab The width of this integral type in bits. You can extract this field with
927the @code{CTF_INT_BITS} macro.
928@findex CTF_INT_BITS
929@end multitable
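
The packing in the table above can be sketched as plain shifts and masks. The
bit positions come straight from the table, but the @code{sketch_} function
names are hypothetical stand-ins for illustration and are not the
@code{CTF_INT_*} macros themselves.

```c
#include <stdint.h>

/* Hypothetical helpers mirroring the table: encoding in bits 24--31,
   offset in bits 16--23, bit-width in bits 0--15.  */

/* Assemble the single uint32_t of variable-length data for an integer.  */
static uint32_t sketch_int_data(uint32_t encoding, uint32_t offset,
                                uint32_t bits)
{
  return ((encoding & 0xff) << 24) | ((offset & 0xff) << 16) | (bits & 0xffff);
}

/* Extract the display encoding.  */
static uint32_t sketch_int_encoding(uint32_t data)
{
  return (data >> 24) & 0xff;
}

/* Extract the offset in bits.  */
static uint32_t sketch_int_offset(uint32_t data)
{
  return (data >> 16) & 0xff;
}

/* Extract the width in bits.  */
static uint32_t sketch_int_bits(uint32_t data)
{
  return data & 0xffff;
}
```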
930
931If you choose, bitfields can be represented using the things above as a sort of
932integral type with the @code{isroot} bit flipped off and the offset and bits
933values set in the vlen word: you can populate it with the @code{CTF_INT_DATA}
934macro. (But it may be more convenient to represent them using slices of a
935full-width integer: @pxref{Slices}.)
936@findex CTF_INT_DATA
937
938Integers that are bitfields usually have a @code{ctt_size} rounded up to the
939nearest power of two in bytes, for natural alignment (e.g. a 17-bit integer
940would have a @code{ctt_size} of 4). However, not all types are naturally
941aligned on all architectures: packed structures may in theory use integral
942bitfields with different @code{ctt_size}, though this is rarely observed.
943
944The @dfn{encoding} for integers is a bit-field comprised of the values below,
945which consumers can use to decide how to display values of this type:
946
947@multitable {Offset} {@code{CTF_INT_VARARGS}} {If set, this is a char type. It is platform-dependent whether unadorned}
948@headitem Offset @tab Name @tab Description
949@item 0x01
950@tab @code{CTF_INT_SIGNED}
@tab If set, this is a signed integer: if clear, unsigned.
952@tindex CTF_INT_SIGNED
953
954@item 0x02
955@tab @code{CTF_INT_CHAR}
956@tab If set, this is a char type. It is platform-dependent whether unadorned
957@code{char} is signed or not: the @code{CTF_CHAR} macro produces an integral
958type suitable for the definition of @code{char} on this platform.
959@tindex CTF_INT_CHAR
960@findex CTF_CHAR
961
962@item 0x04
963@tab @code{CTF_INT_BOOL}
964@tab If set, this is a boolean type. (It is theoretically possible to turn this
965and @code{CTF_INT_CHAR} on at the same time, but it is not clear what this would
966mean.)
967@tindex CTF_INT_BOOL
968
969@item 0x08
970@tab @code{CTF_INT_VARARGS}
971@tab If set, this is a varargs-promoted value in a K&R function definition.
972This is not currently produced or consumed by anything that we know of: it is set
973aside for future use.
974@end multitable
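
A consumer might turn these flags into a display decision along the following
lines. The flag values are taken from the table above; the enum, the
@code{SKETCH_} names and the order in which the flags are tested are
assumptions made for this sketch, not behaviour required by the format.

```c
#include <stdint.h>

/* Flag values from the table above, under hypothetical names.  */
#define SKETCH_INT_SIGNED 0x01
#define SKETCH_INT_CHAR   0x02
#define SKETCH_INT_BOOL   0x04

/* Hypothetical display choices a consumer might make.  */
enum sketch_int_display
{
  SKETCH_DISPLAY_UNSIGNED,
  SKETCH_DISPLAY_SIGNED,
  SKETCH_DISPLAY_CHAR,
  SKETCH_DISPLAY_BOOL
};

/* Pick a display style.  Testing the boolean flag first is an arbitrary
   choice: the meaning of BOOL and CHAR together is not defined.  */
static enum sketch_int_display sketch_int_display(uint32_t encoding)
{
  if (encoding & SKETCH_INT_BOOL)
    return SKETCH_DISPLAY_BOOL;
  if (encoding & SKETCH_INT_CHAR)
    return SKETCH_DISPLAY_CHAR;
  if (encoding & SKETCH_INT_SIGNED)
    return SKETCH_DISPLAY_SIGNED;
  return SKETCH_DISPLAY_UNSIGNED;
}
```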
975
976The GCC ``@code{Complex int}'' and fixed-point extensions are not yet supported:
977references to such types will be emitted as type 0.
978
979@node Floating-point types
980@subsection Floating-point types
981@cindex Floating-point types
982@cindex Types, floating-point
983@tindex float
984@tindex double
985@tindex signed float
986@tindex signed double
987@tindex unsigned float
988@tindex unsigned double
989@tindex Complex, float
990@tindex Complex, double
991@tindex Complex, signed float
992@tindex Complex, signed double
993@tindex Complex, unsigned float
994@tindex Complex, unsigned double
995@cindex CTF_K_FLOAT
996
997Floating-point types are all represented as types of kind @code{CTF_K_FLOAT}.
Like integers, these types fill out @code{ctt_size} in the @code{ctf_stype_t}
999with the size in bytes of the floating-point type in question. They are always
1000represented by @code{ctf_stype_t}, never @code{ctf_type_t}.
1001
1002This part of CTF shows many rough edges in the more obscure corners of
1003floating-point handling, and is likely to change in format v4.
1004
1005The variable-length data for floats has multiple items packed into it just like
1006integers do:
1007
@multitable {Bit offset} {Encoding} {The floating-point encoding and desired display representation.}
1009@headitem Bit offset @tab Name @tab Description
1010@item 24--31
1011@tab Encoding
1012@tab The desired display representation of this float. You can extract this
1013field with the @code{CTF_FP_ENCODING} macro. See below.
1014@findex CTF_FP_ENCODING
1015
1016@item 16--23
1017@tab Offset
1018@tab The offset of this floating-point type in bits from the start of its enclosing
1019structure field, adjusted for endianness: @pxref{Structs and unions}. You can
1020extract this field with the @code{CTF_FP_OFFSET} macro.
1021@findex CTF_FP_OFFSET
1022
1023@item 0--15
1024@tab Bit-width
1025@tab The width of this floating-point type in bits. You can extract this field with
1026the @code{CTF_FP_BITS} macro.
1027@findex CTF_FP_BITS
1028@end multitable
1029
1030The purpose of the floating-point offset and bit-width is somewhat opaque, since
1031there are no such things as floating-point bitfields in C: the bit-width should
1032be filled out with the full width of the type in bits, and the offset should
1033always be zero. It is likely that these fields will go away in the future. As
1034with integers, you can use @code{CTF_FP_DATA} to assemble one of these vlen
1035items from its component parts.
@findex CTF_FP_DATA
1037
The @dfn{encoding} for floats is not a bitfield but a simple value indicating
the display representation. Many of these values relate to Solaris-specific
compiler extensions, are unused, and will be recycled in future: others are
currently unused but will come into use in future.
1042
1043@multitable {Offset} {@code{CTF_FP_LDIMAGRY}} {This is a @code{float} interval type, a Solaris-specific extension.}
1044@headitem Offset @tab Name @tab Description
1045@item 1
1046@tab @code{CTF_FP_SINGLE}
1047@tab This is a single-precision IEEE 754 @code{float}.
1048@tindex CTF_FP_SINGLE
1049@item 2
1050@tab @code{CTF_FP_DOUBLE}
1051@tab This is a double-precision IEEE 754 @code{double}.
1052@tindex CTF_FP_DOUBLE
1053@item 3
1054@tab @code{CTF_FP_CPLX}
1055@tab This is a @code{Complex float}.
1056@tindex CTF_FP_CPLX
1057@item 4
1058@tab @code{CTF_FP_DCPLX}
1059@tab This is a @code{Complex double}.
1060@tindex CTF_FP_DCPLX
1061@item 5
1062@tab @code{CTF_FP_LDCPLX}
1063@tab This is a @code{Complex long double}.
1064@tindex CTF_FP_LDCPLX
1065@item 6
1066@tab @code{CTF_FP_LDOUBLE}
1067@tab This is a @code{long double}.
1068@tindex CTF_FP_LDOUBLE
1069@item 7
1070@tab @code{CTF_FP_INTRVL}
1071@tab This is a @code{float} interval type, a Solaris-specific extension.
1072Unused: will be recycled.
1073@tindex CTF_FP_INTRVL
1074@cindex Unused bits
1075@item 8
1076@tab @code{CTF_FP_DINTRVL}
1077@tab This is a @code{double} interval type, a Solaris-specific extension.
1078Unused: will be recycled.
1079@tindex CTF_FP_DINTRVL
1080@cindex Unused bits
1081@item 9
1082@tab @code{CTF_FP_LDINTRVL}
1083@tab This is a @code{long double} interval type, a Solaris-specific extension.
1084Unused: will be recycled.
1085@tindex CTF_FP_LDINTRVL
1086@cindex Unused bits
1087@item 10
1088@tab @code{CTF_FP_IMAGRY}
@tab This is the imaginary part of a @code{Complex float}. Not currently
1090generated. May change.
1091@tindex CTF_FP_IMAGRY
1092@cindex Unused bits
1093@item 11
1094@tab @code{CTF_FP_DIMAGRY}
@tab This is the imaginary part of a @code{Complex double}. Not currently
1096generated. May change.
1097@tindex CTF_FP_DIMAGRY
1098@cindex Unused bits
1099@item 12
1100@tab @code{CTF_FP_LDIMAGRY}
@tab This is the imaginary part of a @code{Complex long double}. Not currently
1102generated. May change.
1103@tindex CTF_FP_LDIMAGRY
1104@cindex Unused bits
1105@end multitable
1106
1107The use of the complex floating-point encodings is obscure: it is possible that
1108@code{CTF_FP_CPLX} is meant to be used for only the real part of complex types,
and @code{CTF_FP_IMAGRY} et al for the imaginary part, but for now we are
1110emitting @code{CTF_FP_CPLX} to cover the entire type, with no way to get at its
1111constituent parts. There appear to be no uses of these encodings anywhere, so
1112they are quite likely to change incompatibly in future.
1113
1114@node Slices
1115@subsection Slices
1116@cindex Slices
1117@cindex Types, slices of integral
1118@tindex CTF_K_SLICE
1119
1120Slices, with kind @code{CTF_K_SLICE}, are an unusual CTF construct: they do not
1121directly correspond to any C type, but are a way to model other types in a more
1122convenient fashion for CTF generators.
1123
Slices are like pointers and other reference types in that they are always
represented by @code{ctf_stype_t}: but unlike those types, they populate the
@code{ctt_size} field just as integral types do, and they come with an attached
encoding that transforms the encoding of the underlying type. The underlying
type is described in the variable-length data, similarly to structure and union
fields: see below. Requests for the type size should also chase down to the
referenced type.
1131
1132Slices are always nameless: @code{ctt_name} is always zero for them.
1133
1134(The @code{libctf} API behaviour is unusual as well, and justifies the existence
1135of slices: @code{ctf_type_kind} never returns @code{CTF_K_SLICE} but always the
1136underlying type kind, so that consumers never need to know about slices: they
1137can tell if an apparent integer is actually a slice if they need to by calling
1138@code{ctf_type_reference}, which will uniquely return the underlying integral
1139type rather than erroring out with @code{ECTF_NOTREF} if this is actually a
1140slice. So slices act just like an integer with an encoding, but more closely
1141mirror DWARF and other debugging information formats by allowing CTF file
1142creators to represent a bitfield as a slice of an underlying integral type.)
1143@findex Slices, effect on ctf_type_kind
1144@findex Slices, effect on ctf_type_reference
1145@findex libctf, effect of slices
1146
1147The vlen in the info word for a slice should be ignored and is always zero. The
1148variable-length data for a slice is a single @code{ctf_slice_t}:
1149
1150@verbatim
1151typedef struct ctf_slice
1152{
1153 uint32_t cts_type;
1154 unsigned short cts_offset;
1155 unsigned short cts_bits;
1156} ctf_slice_t;
1157@end verbatim
1158
1159@tindex struct ctf_slice
1160@tindex ctf_slice_t
1161@multitable {Offset} {@code{unsigned short cts_offset}} {The type this slice is a slice of. Must be an}
1162@headitem Offset @tab Name @tab Description
1163@item 0x0
1164@tab @code{uint32_t cts_type}
1165@vindex cts_type
1166@vindex struct ctf_slice, cts_type
1167@vindex ctf_slice_t, cts_type
1168@tab The type this slice is a slice of. Must be an integral type (or a
1169floating-point type, but this nonsensical option will go away in v4.)
1170
1171@item 0x4
1172@tab @code{unsigned short cts_offset}
1173@vindex cts_offset
1174@vindex struct ctf_slice, cts_offset
1175@vindex ctf_slice_t, cts_offset
1176@tab The offset of this integral type in bits from the start of its enclosing
1177structure field, adjusted for endianness: @pxref{Structs and unions}. Identical
1178semantics to the @code{CTF_INT_OFFSET} field: @pxref{Integer types}. This field
1179is much too long, because the maximum possible offset of an integral type would
1180easily fit in a char: this field is bigger just for the sake of alignment. This
1181will change in v4.
1182
1183@item 0x6
1184@tab @code{unsigned short cts_bits}
1185@vindex cts_bits
1186@vindex struct ctf_slice, cts_bits
1187@vindex ctf_slice_t, cts_bits
1188@tab The bit-width of this integral type. Identical semantics to the
1189@code{CTF_INT_BITS} field: @pxref{Integer types}. As above, this field is
1190really too large and will shrink in v4.
1191@end multitable
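
As a rough sketch of how a slice transforms an encoding, the following C
combines the variable-length word of an underlying integer with a
@code{ctf_slice_t}. It assumes that the slice keeps the display encoding of the
underlying type and substitutes its own offset and bit-width; the function name
is hypothetical, and the offset is truncated to eight bits only because that is
all the integer word layout has room for.

```c
#include <stdint.h>

typedef struct ctf_slice
{
  uint32_t cts_type;
  unsigned short cts_offset;
  unsigned short cts_bits;
} ctf_slice_t;

/* Hypothetical helper: build the integer-style data word a consumer would
   see when looking through SLICE at an integer whose own data word is
   UNDERLYING_DATA.  Assumes the display encoding (bits 24--31) is inherited
   and the offset and bit-width come from the slice.  */
static uint32_t sketch_slice_int_data(uint32_t underlying_data,
                                      const ctf_slice_t *slice)
{
  uint32_t encoding = (underlying_data >> 24) & 0xff;

  return (encoding << 24)
         | (((uint32_t) slice->cts_offset & 0xff) << 16)
         | (uint32_t) slice->cts_bits;
}
```

For instance, a three-bit bitfield at bit offset 5 of a signed 32-bit integer
could be described by a slice with @code{cts_offset} 5 and @code{cts_bits} 3.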
1192
1193@node Pointers typedefs and cvr-quals
1194@subsection Pointers, typedefs, and cvr-quals
1195@cindex Pointers
1196@cindex Typedefs
1197@cindex cvr-quals
1198@tindex typedef
1199@tindex const
1200@tindex volatile
1201@tindex restrict
1202@tindex CTF_K_POINTER
1203@tindex CTF_K_TYPEDEF
1204@tindex CTF_K_CONST
1205@tindex CTF_K_VOLATILE
1206@tindex CTF_K_RESTRICT
1207
1208Pointers, @code{typedef}s, and @code{const}, @code{volatile} and @code{restrict}
1209qualifiers are represented identically except for their type kind (though they
1210may be treated differently by consuming libraries like @code{libctf}, since
1211pointers affect assignment-compatibility in ways cvr-quals do not, and they may
1212have different alignment requirements, etc).
1213
1214All of these are represented by @code{ctf_stype_t}, have no variable data at
1215all, and populate @code{ctt_type} with the type ID of the type they point
1216to. These types can stack: a @code{CTF_K_RESTRICT} can point to a
1217@code{CTF_K_CONST} which can point to a @code{CTF_K_POINTER} etc.
1218
1219They are all unnamed: @code{ctt_name} is 0.
1220
1221The size of @code{CTF_K_POINTER} is derived from the data model (@pxref{Data
1222models}), i.e. in practice, from the target machine ABI, and is not explicitly
represented. The size of other kinds in this set should be determined by
chasing @code{ctt_type} references as necessary until a type that is not a
typedef, const, volatile or restrict is found, and using the size of that type.
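
The size lookup just described might be sketched like this. The
@code{struct sketch_type} table is a deliberately simplified, hypothetical
model in which a type ID is just an array index and there is only one
dictionary: it is not the on-disk representation. The kind values are the ones
from the kind table.

```c
#include <stddef.h>
#include <stdint.h>

/* Kind values from the kind table, under hypothetical names.  */
enum
{
  SK_K_INTEGER = 1, SK_K_POINTER = 3, SK_K_TYPEDEF = 10,
  SK_K_VOLATILE = 11, SK_K_CONST = 12, SK_K_RESTRICT = 13
};

/* Toy model of a type: for reference kinds, size_or_ref is the referenced
   type ID; for everything else it is the size in bytes.  */
struct sketch_type
{
  uint32_t kind;
  uint32_t size_or_ref;
};

/* A small sample table; index == type ID in this toy model.  */
static const struct sketch_type sketch_sample[] = {
  { 0, 0 },              /* ID 0: unrepresentable type */
  { SK_K_INTEGER, 4 },   /* ID 1: a 4-byte integer */
  { SK_K_TYPEDEF, 1 },   /* ID 2: typedef of ID 1 */
  { SK_K_CONST, 2 },     /* ID 3: const-qualified ID 2 */
  { SK_K_POINTER, 3 },   /* ID 4: pointer to ID 3 */
  { SK_K_VOLATILE, 4 },  /* ID 5: volatile-qualified ID 4 */
};

/* Return the size of type ID, chasing typedefs and cvr-quals, or -1 if the
   chain is broken or loops.  POINTER_SIZE stands in for the data model.  */
static int64_t sketch_type_size(const struct sketch_type *types, size_t ntypes,
                                uint32_t id, uint32_t pointer_size)
{
  for (size_t hops = 0; hops <= ntypes; hops++)
    {
      if (id == 0 || id >= ntypes)
        return -1;

      switch (types[id].kind)
        {
        case SK_K_TYPEDEF:
        case SK_K_VOLATILE:
        case SK_K_CONST:
        case SK_K_RESTRICT:
          id = types[id].size_or_ref;   /* Keep chasing.  */
          break;
        case SK_K_POINTER:
          return pointer_size;          /* Derived from the data model.  */
        default:
          return types[id].size_or_ref; /* A type with its own size.  */
        }
    }
  return -1;                            /* Too many hops: a cycle.  */
}
```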
1226
1227@node Arrays
1228@subsection Arrays
1229@cindex Arrays
1230
1231Arrays are encoded as types of kind @code{CTF_K_ARRAY} in a @code{ctf_stype_t}.
Both @code{ctt_size} and @code{ctt_type} are zero for arrays. The variable-length data is a
1233@code{ctf_array_t}: @code{vlen} in the info word should be disregarded and is
1234always zero.
1235
1236@verbatim
1237typedef struct ctf_array
1238{
1239 uint32_t cta_contents;
1240 uint32_t cta_index;
1241 uint32_t cta_nelems;
1242} ctf_array_t;
1243@end verbatim
1244
1245@tindex struct ctf_array
1246@tindex ctf_array_t
1247@multitable {Offset} {@code{unsigned short cta_contents}} {The type of the array index: a type ID of an}
1248@headitem Offset @tab Name @tab Description
1249@item 0x0
1250@tab @code{uint32_t cta_contents}
1251@vindex cta_contents
1252@vindex struct ctf_array, cta_contents
1253@vindex ctf_array_t, cta_contents
1254@tab The type of the array elements: a type ID.
1255
1256@item 0x4
1257@tab @code{uint32_t cta_index}
1258@vindex cta_index
1259@vindex struct ctf_array, cta_index
1260@vindex ctf_array_t, cta_index
1261@tab The type of the array index: a type ID of an integral type.
1262If this is a variable-length array, the index type ID will be 0
1263(but the actual index type of this array is probably @code{int}).
1264Probably redundant and may be dropped in v4.
1265
1266@item 0x8
1267@tab @code{uint32_t cta_nelems}
1268@vindex cta_nelems
1269@vindex struct ctf_array, cta_nelems
1270@vindex ctf_array_t, cta_nelems
1271@tab The number of array elements. 0 for VLAs, and also for
1272the historical variety of VLA which has explicit zero dimensions (which will
1273have a nonzero @code{cta_index}.)
1274@end multitable
1275
1276The size of an array can be computed by simple multiplication of the size of the
1277@code{cta_contents} type by the @code{cta_nelems}.
1278
1279@node Function pointers
1280@subsection Function pointers
1281@cindex Function pointers
1282@cindex Pointers, to functions
1283
1284Function pointers are explicitly represented in the CTF type section by a type
1285of kind @code{CTF_K_FUNCTION}, always encoded with a @code{ctf_stype_t}. The
1286@code{ctt_type} is the function return type ID. The @code{vlen} in the info
1287word is the number of arguments, each of which is a type ID, a @code{uint32_t}:
1288if the last argument is 0, this is a varargs function and the number of
1289arguments is one less than indicated by the vlen.
1290
1291If the number of arguments is odd, a single @code{uint32_t} of padding is
1292inserted to maintain alignment.
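
The argument-list rules above can be sketched as three small helpers. The
names are hypothetical; the logic follows the text: a trailing zero type ID
marks a varargs function, and an odd @code{vlen} is followed by one
@code{uint32_t} of padding.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes of variable-length data following a function type: one uint32_t per
   vlen entry, plus one uint32_t of padding if vlen is odd.  */
static size_t sketch_func_vbytes(uint32_t vlen)
{
  return ((size_t) vlen + (vlen & 1)) * sizeof (uint32_t);
}

/* A function is varargs if its last argument type ID is 0.  */
static bool sketch_func_is_varargs(const uint32_t *args, uint32_t vlen)
{
  return vlen > 0 && args[vlen - 1] == 0;
}

/* The number of real arguments: one less than vlen for varargs functions.  */
static uint32_t sketch_func_nargs(const uint32_t *args, uint32_t vlen)
{
  return sketch_func_is_varargs(args, vlen) ? vlen - 1 : vlen;
}
```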
1293
1294@node Enums
1295@subsection Enums
1296@cindex Enums
1297@tindex enum
1298@tindex CTF_K_ENUM
1299
1300Enumerated types are represented as types of kind @code{CTF_K_ENUM} in a
1301@code{ctf_stype_t}. The @code{ctt_size} is always the size of an int from the
1302data model (enum bitfields are implemented via slices). The @code{vlen} is a
1303count of enumerations, each of which is represented by a @code{ctf_enum_t} in
1304the vlen:
1305
1306@verbatim
1307typedef struct ctf_enum
1308{
1309 uint32_t cte_name;
1310 int32_t cte_value;
1311} ctf_enum_t;
1312@end verbatim
1313
1314@tindex struct ctf_enum
1315@tindex ctf_enum_t
1316@multitable {Offset} {@code{int32_t cte_value}} {Strtab offset of the enumeration name.}
1317@headitem Offset @tab Name @tab Description
1318@item 0x0
1319@tab @code{uint32_t cte_name}
1320@vindex cte_name
1321@vindex struct ctf_enum, cte_name
1322@vindex ctf_enum_t, cte_name
1323@tab Strtab offset of the enumeration name. Must not be 0.
1324
1325@item 0x4
1326@tab @code{int32_t cte_value}
1327@vindex cte_value
1328@vindex struct ctf_enum, cte_value
1329@vindex ctf_enum_t, cte_value
1330@tab The enumeration value.
1331
1332@end multitable
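
A consumer mapping an enumeration value back to its name could walk the
@code{ctf_enum_t} array roughly as follows. This sketch assumes that
@code{cte_name} is a plain byte offset into a single in-memory string table;
the function name and the @code{strtab} parameters are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct ctf_enum
{
  uint32_t cte_name;
  int32_t cte_value;
} ctf_enum_t;

/* Return the name of the first enumerator with the given value, or NULL if
   there is none.  ENTS holds VLEN entries; STRTAB is STRTAB_LEN bytes long.
   Assumption: cte_name indexes directly into STRTAB.  */
static const char *sketch_enum_name(const ctf_enum_t *ents, uint32_t vlen,
                                    const char *strtab, size_t strtab_len,
                                    int32_t value)
{
  for (uint32_t i = 0; i < vlen; i++)
    if (ents[i].cte_value == value && ents[i].cte_name < strtab_len)
      return strtab + ents[i].cte_name;

  return NULL;
}
```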
1333
1334Enumeration values larger than @math{2^32} are not yet supported and are omitted
1335from the enumeration. (v4 will lift this restriction by encoding the value
1336differently.)
1337
1338Forward declarations of enums are not implemented with this kind: @pxref{Forward
1339declarations}.
1340
1341Enumerated type names, as usual in C, go into their own namespace, and do not
1342conflict with non-enums, structs, or unions with the same name.
1343
1344@node Structs and unions
1345@subsection Structs and unions
1346@cindex Structures
1347@cindex Unions
1348@tindex struct
1349@tindex union
1350@tindex CTF_K_STRUCT
1351@tindex CTF_K_UNION
1352
Structures and unions are represented as types of kind @code{CTF_K_STRUCT} and
1354@code{CTF_K_UNION}: their representation is otherwise identical, and it is
1355perfectly allowed for ``structs'' to contain overlapping fields etc, so we will
1356treat them together for the rest of this section.
1357
1358They fill out @code{ctt_size}, and use @code{ctf_type_t} in preference to
1359@code{ctf_stype_t} if the structure size is greater than @code{CTF_MAX_SIZE}
1360(0xfffffffe).
@tindex CTF_MAX_SIZE
1362
1363The vlen for structures and unions is a count of structure fields, but the type
1364used to represent a structure field (and thus the size of the variable-length
1365array element representing the type) depends on the size of the structure: truly
1366huge structures, greater than @code{CTF_LSTRUCT_THRESH} bytes in length, use a
1367different type. (@code{CTF_LSTRUCT_THRESH} is 536870912, so such structures are
1368vanishingly rare: in v4, this representation will change somewhat for greater
1369compactness. It's inherited from v1, where the limits were much lower.)
1370@tindex CTF_LSTRUCT_THRESH
1371
1372Most structures can get away with using @code{ctf_member_t}:
1373
1374@verbatim
1375typedef struct ctf_member_v2
1376{
1377 uint32_t ctm_name;
1378 uint32_t ctm_offset;
1379 uint32_t ctm_type;
1380} ctf_member_t;
1381@end verbatim
1382
1383Huge structures that are represented by @code{ctf_type_t} rather than
1384@code{ctf_stype_t} have to use @code{ctf_lmember_t}, which splits the offset as
1385@code{ctf_type_t} splits the size:
1386
1387@verbatim
1388typedef struct ctf_lmember_v2
1389{
1390 uint32_t ctlm_name;
1391 uint32_t ctlm_offsethi;
1392 uint32_t ctlm_type;
1393 uint32_t ctlm_offsetlo;
1394} ctf_lmember_t;
1395@end verbatim
1396
Here's what the fields of @code{ctf_member_t} mean:
1398
1399@tindex struct ctf_member_v2
1400@tindex ctf_member_t
1401@multitable {Offset} {@code{uint32_t ctm_offset}} {The offset of this field @emph{in bits}. (Usually, for bitfields, this is}
1402@headitem Offset @tab Name @tab Description
1403@item 0x00
1404@tab @code{uint32_t ctm_name}
1405@vindex ctm_name
1406@vindex struct ctf_member_v2, ctm_name
1407@vindex ctf_member_t, ctm_name
1408@tab Strtab offset of the field name.
1409
1410@item 0x04
1411@tab @code{uint32_t ctm_offset}
1412@vindex ctm_offset
1413@vindex struct ctf_member_v2, ctm_offset
1414@vindex ctf_member_t, ctm_offset
1415@tab The offset of this field @emph{in bits}. (Usually, for bitfields, this is
1416machine-word-aligned and the individual field has an offset in bits, but
1417the format allows for the offset to be encoded in bits here.)
1418
1419@item 0x08
1420@tab @code{uint32_t ctm_type}
1421@vindex ctm_type
1422@vindex struct ctf_member_v2, ctm_type
1423@vindex ctf_member_t, ctm_type
1424@tab The type ID of the type of the field.
1425@end multitable
1426
Here's what the fields of the very similar @code{ctf_lmember_t} mean:
1428
1429@tindex struct ctf_lmember_v2
1430@tindex ctf_lmember_t
1431@multitable {Offset} {@code{uint32_t ctlm_offsethi}} {The offset of this field @emph{in bits}. (Usually, for bitfields, this is}
1432@headitem Offset @tab Name @tab Description
1433@item 0x00
1434@tab @code{uint32_t ctlm_name}
1435@vindex ctlm_name
1436@vindex struct ctf_lmember_v2, ctlm_name
1437@vindex ctf_lmember_t, ctlm_name
1438@tab Strtab offset of the field name.
1439
1440@item 0x04
1441@tab @code{uint32_t ctlm_offsethi}
1442@vindex ctlm_offsethi
1443@vindex struct ctf_lmember_v2, ctlm_offsethi
1444@vindex ctf_lmember_t, ctlm_offsethi
1445@tab The high 32 bits of the offset of this field in bits.
1446
1447@item 0x08
1448@tab @code{uint32_t ctlm_type}
@vindex ctlm_type
@vindex struct ctf_lmember_v2, ctlm_type
@vindex ctf_lmember_t, ctlm_type
1452@tab The type ID of the type of the field.
1453
1454@item 0x0c
1455@tab @code{uint32_t ctlm_offsetlo}
1456@vindex ctlm_offsetlo
1457@vindex struct ctf_lmember_v2, ctlm_offsetlo
1458@vindex ctf_lmember_t, ctlm_offsetlo
1459@tab The low 32 bits of the offset of this field in bits.
1460@end multitable
1461
1462Macros @code{CTF_LMEM_OFFSET}, @code{CTF_OFFSET_TO_LMEMHI} and
1463@code{CTF_OFFSET_TO_LMEMLO} serve to extract and install the values of the
1464@code{ctlm_offset} fields, much as with the split size fields in
1465@code{ctf_type_t}.
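
The splitting and joining of the 64-bit offset can be sketched as below. These
are hypothetical helper functions written for illustration, not the
@code{CTF_LMEM_OFFSET} family of macros.

```c
#include <stdint.h>

/* Join the two halves of a large member's bit offset.  */
static uint64_t sketch_lmem_offset(uint32_t offsethi, uint32_t offsetlo)
{
  return ((uint64_t) offsethi << 32) | offsetlo;
}

/* The high 32 bits of a bit offset, as stored in ctlm_offsethi.  */
static uint32_t sketch_offset_to_hi(uint64_t offset)
{
  return (uint32_t) (offset >> 32);
}

/* The low 32 bits of a bit offset, as stored in ctlm_offsetlo.  */
static uint32_t sketch_offset_to_lo(uint64_t offset)
{
  return (uint32_t) (offset & UINT64_C(0xffffffff));
}
```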
1466
1467Unnamed structure and union fields are simply implemented by collapsing the
1468unnamed field's members into the containing structure or union: this does mean
1469that a structure containing an unnamed union can end up being a ``structure''
1470with multiple members at the same offset. (A future format revision may
1471collapse @code{CTF_K_STRUCT} and @code{CTF_K_UNION} into the same kind and
1472decide among them based on whether their members do in fact overlap.)
1473
1474Structure and union type names, as usual in C, go into their own namespace,
1475just as enum type names do.
1476
1477Forward declarations of structures and unions are not implemented with this
1478kind: @pxref{Forward declarations}.
1479
1480@node Forward declarations
1481@subsection Forward declarations
1482@cindex Forwards
1483@tindex enum
1484@tindex struct
1485@tindex union
1486@tindex CTF_K_FORWARD
1487
When the compiler encounters a forward declaration of a struct, union, or enum,
it emits a type of kind @code{CTF_K_FORWARD}. If it later encounters a
non-forward declaration of the same thing, it marks the forward as
non-root-visible: before link time, therefore, non-root-visible forwards
indicate that a non-forward is coming.
1493
After link time, forwards are fused with their corresponding non-forwards by the
deduplicator where possible. They are kept if there is no non-forward
definition (maybe it's not visible from any TU at all) or if @emph{multiple}
conflicting structures with the same name might match them. All other
forwards are converted to structures, unions, or enums as appropriate, even
across TUs if only one structure could correspond to the forward (after all,
all types across all TUs land in the same dictionary unless they conflict,
so promoting forwards to their concrete type seems most helpful).
1502
1503A forward has a rather strange representation: it is encoded with a
1504@code{ctf_stype_t} but the @code{ctt_type} is populated not with a type (if it's
1505a forward, we don't have an underlying type yet: if we did, we'd have promoted
1506it and this wouldn't be a forward any more) but with the @code{kind} of the
1507forward. This means that we can distinguish forwards to structs, enums and
1508unions reliably and ensure they land in the appropriate namespace even before
1509the actual struct, union or enum is found.
1510
1511@node The symtypetab sections
1512@section The symtypetab sections
1513@cindex Symtypetab section
1514@cindex Sections, symtypetab
1515@cindex Function info section
1516@cindex Sections, function info
1517@cindex Data object section
1518@cindex Sections, data object
1519@cindex Function info index section
1520@cindex Sections, function info index
1521@cindex Data object index section
1522@cindex Sections, data object index
1523@tindex CTF_F_IDXSORTED
1524@tindex CTF_F_DYNSTR
1525@cindex Bug workarounds, CTF_F_DYNSTR
1526
1527These are two very simple sections with identical formats, used by consumers to
1528map from ELF function and data symbols directly to their types. So they are
1529usually populated only in CTF sections that are embedded in ELF objects.
1530
1531Their format is very simple: an array of type IDs. Which symbol each type ID
1532corresponds to depends on whether the optional @emph{index section} associated
1533with this symtypetab section has any content.
1534
1535If the index section is nonempty, it is an array of @code{uint32_t} string table
1536offsets, each giving the name of the symbol whose type is at the same offset in
1537the corresponding non-index section: users can look up symbols in such a table
by name. The index section and its corresponding symtypetab section are usually
sorted ASCIIbetically (indicated by the @code{CTF_F_IDXSORTED} flag in the
1540header): if it's sorted, it can be bsearched for a symbol name rather than
1541having to use a slower linear search.
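
A lookup by name in a sorted, indexed symtypetab might look like the following
sketch. It assumes both sections and the string table are already in memory,
that each index entry is a plain byte offset into that one string table, and
that sorting is by @code{strcmp} order; the function and parameter names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Binary-search a sorted index section for NAME and return the type ID at
   the same position in the symtypetab section, or 0 if NAME is not found.
   INDEX and TYPES both have N entries.  Assumption: every index entry is a
   valid offset into STRTAB.  */
static uint32_t sketch_symtype_lookup(const uint32_t *index,
                                      const uint32_t *types, size_t n,
                                      const char *strtab, const char *name)
{
  size_t lo = 0, hi = n;

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      int cmp = strcmp(name, strtab + index[mid]);

      if (cmp == 0)
        return types[mid];
      if (cmp < 0)
        hi = mid;
      else
        lo = mid + 1;
    }
  return 0;
}
```

If the @code{CTF_F_IDXSORTED} flag is not set, a consumer would have to fall
back to a linear scan over the same two arrays.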
1542
If an index section is empty, the entries in the corresponding data object or
function info section are associated 1:1 with ELF symbols of type
@code{STT_OBJECT} (for data object) or @code{STT_FUNC} (for function info) with
a nonzero value: the linker shuffles the symtypetab sections to correspond with
1547the order of the symbols in the ELF file. Symbols with no name, undefined
1548symbols and symbols named ``@code{_START_}'' and ``@code{_END_}'' are skipped
1549and never appear in either section. Symbols that have no corresponding type are
1550represented by type ID 0. The section may have fewer entries than the symbol
1551table, in which case no later entries have associated types. This format is
1552more compact than an indexed form if most entries have types (since there is no
1553need to record any symbol names), but if the producer and consumer disagree even
1554slightly about which symbols are omitted, the types of all further symbols will
1555be wrong!
1556
The compiler always emits indexed symtypetab tables, because there is no symbol
table yet. The linker will always have to read them all in and always works
through them from start to end, so there is no benefit having the compiler sort
them either. The linker (actually, @code{libctf}'s linking machinery) will
automatically sort unsorted indexed sections, and convert indexed sections that
contain a lot of pads into the more compact, unindexed form.

If child dicts are in use, only symbols that use types actually mentioned in the
child appear in the child's symtypetab: symbols that use only types in the
parent appear in the parent's symtypetab instead. So the child's symtypetab will
almost always be very sparse, and thus will usually use the indexed form even in
fully linked objects. (It is, of course, impossible for symbols to exist that
use types from multiple child dicts at once, since it's impossible to declare a
function in C that uses types that are only visible in two different, disjoint
translation units.)

@node The variable section
@section The variable section
@cindex Variable section
@cindex Sections, variable

The variable section is a simple array mapping names (strtab entries) to type
IDs, intended to provide a replacement for the data object section in dynamic
situations in which there is no static ELF strtab but the consumer instead hands
back names. The section is sorted into ASCIIbetical order by name for rapid
lookup, like the CTF archive name table.

The section is an array of these structures:

@verbatim
typedef struct ctf_varent
{
  uint32_t ctv_name;
  uint32_t ctv_type;
} ctf_varent_t;
@end verbatim

@tindex struct ctf_varent
@tindex ctf_varent_t
@multitable {Offset} {@code{uint32_t ctv_name}} {Strtab offset of the name}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctv_name}
@vindex ctv_name
@vindex struct ctf_varent, ctv_name
@vindex ctf_varent_t, ctv_name
@tab Strtab offset of the name

@item 0x04
@tab @code{uint32_t ctv_type}
@vindex ctv_type
@vindex struct ctf_varent, ctv_type
@vindex ctf_varent_t, ctv_type
@tab Type ID of this type
@end multitable

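Because the section is sorted by name, a consumer can search it with the
standard C @code{bsearch} function. The following is a minimal sketch, not
@code{libctf}'s own code: @code{ex_varkey_t}, @code{ex_varent_cmp} and
@code{ex_var_lookup} are hypothetical names, and every @code{ctv_name} is
assumed to be an offset into the internal strtab.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct ctf_varent
{
  uint32_t ctv_name;
  uint32_t ctv_type;
} ctf_varent_t;

/* Hypothetical search key: the wanted name, plus the strtab that
   ctv_name offsets are resolved against.  */
typedef struct ex_varkey
{
  const char *name;
  const char *strtab;
} ex_varkey_t;

/* bsearch() always passes the key first and the array element second.  */
static int
ex_varent_cmp (const void *key_, const void *ent_)
{
  const ex_varkey_t *key = key_;
  const ctf_varent_t *ent = ent_;

  return strcmp (key->name, key->strtab + ent->ctv_name);
}

/* Return the type ID of the variable NAME, or 0 if it is not present.  */
static uint32_t
ex_var_lookup (const ctf_varent_t *vars, size_t nvars,
               const char *strtab, const char *name)
{
  ex_varkey_t key = { name, strtab };
  const ctf_varent_t *found;

  assert (strtab != NULL && name != NULL);
  if (nvars == 0)
    return 0;
  found = bsearch (&key, vars, nvars, sizeof (ctf_varent_t), ex_varent_cmp);
  return found != NULL ? found->ctv_type : 0;
}
```
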
There is no analogue of the function info section yet: v4 will probably drop
this section in favour of a way to put both indexed (thus, named) and nonindexed
symbols into the symtypetab sections at the same time.

@node The label section
@section The label section
@cindex Label section
@cindex Sections, label

The label section is a currently-unused facility allowing the tiling of the type
space with names taken from the strtab. The section is an array of these
structures:

@verbatim
typedef struct ctf_lblent
{
  uint32_t ctl_label;
  uint32_t ctl_type;
} ctf_lblent_t;
@end verbatim

@tindex struct ctf_lblent
@tindex ctf_lblent_t
@multitable {Offset} {@code{uint32_t ctl_label}} {Strtab offset of the label}
@headitem Offset @tab Name @tab Description
@item 0x00
@tab @code{uint32_t ctl_label}
@vindex ctl_label
@vindex struct ctf_lblent, ctl_label
@vindex ctf_lblent_t, ctl_label
@tab Strtab offset of the label

@item 0x04
@tab @code{uint32_t ctl_type}
@vindex ctl_type
@vindex struct ctf_lblent, ctl_type
@vindex ctf_lblent_t, ctl_type
@tab Type ID of the last type covered by this label
@end multitable

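Since labels currently have no defined semantics, the following is only one
possible reading of ``tiling the type space'': it assumes the entries are
stored in ascending @code{ctl_type} order, so that each label covers the type
IDs after the previous label's last type, up to and including its own. The
function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ctf_lblent
{
  uint32_t ctl_label;
  uint32_t ctl_type;
} ctf_lblent_t;

/* Return the index of the label covering TYPE, or -1 if no label does.
   Assumes (this is not stated by the format) that LABELS is in ascending
   ctl_type order.  */
static long
ex_label_for_type (const ctf_lblent_t *labels, size_t nlabels, uint32_t type)
{
  assert (labels != NULL || nlabels == 0);
  for (size_t i = 0; i < nlabels; i++)
    if (type <= labels[i].ctl_type)
      return (long) i;
  return -1;
}
```
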
Semantics will be attached to labels soon, probably in v4 (the plan is to use
them to allow multiple disjoint namespaces in a single CTF file, removing many
uses of CTF archives, in particular in the @code{.ctf} section in ELF objects).

@node The string section
@section The string section
@cindex String section
@cindex Sections, string

This section is a simple ELF-format strtab, starting with a zero byte (thus
ensuring that the string with offset 0 is the null string, as assumed elsewhere
in this spec). The strtab is usually ASCIIbetically sorted to somewhat improve
compression efficiency.

Where the strtab is unusual is the @emph{references} to it. CTF has two
string tables, the internal strtab and an external strtab associated
with the CTF dictionary at open time: usually, this is the ELF dynamic
strtab (@code{.dynstr}) of a CTF dictionary embedded in an ELF file. We
distinguish between these strtabs by the most significant bit, bit 31,
of the 32-bit strtab references: if it is 0, the offset is in the
internal strtab: if 1, the offset is in the external strtab.

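Decoding a strtab reference can be sketched as follows. The macro and function
names are hypothetical, and the bounds checks are an addition for safety
rather than something the format requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_STRTAB_EXTERNAL_BIT 0x80000000u   /* Bit 31 of the reference.  */

/* Resolve the 32-bit strtab reference REF against the internal or external
   strtab.  Returns NULL if the chosen strtab is missing or the offset is
   out of range.  */
static const char *
ex_strptr (uint32_t ref, const char *internal, size_t internal_len,
           const char *external, size_t external_len)
{
  int is_external = (ref & EX_STRTAB_EXTERNAL_BIT) != 0;
  const char *tab = is_external ? external : internal;
  size_t len = is_external ? external_len : internal_len;
  uint32_t off = ref & ~EX_STRTAB_EXTERNAL_BIT;   /* Low 31 bits.  */

  assert (internal != NULL);
  if (tab == NULL || off >= len)
    return NULL;
  return tab + off;
}
```

Because only 31 bits remain for the offset, the largest offset into either
strtab is 0x7fffffff.
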
@tindex CTF_F_DYNSTR
@cindex Bug workarounds, CTF_F_DYNSTR
There is a bug workaround in this area: in format v3 (the first version
to have working support for external strtabs), the external strtab is
@code{.strtab} unless the @code{CTF_F_DYNSTR} flag is set on the
dictionary (@pxref{CTF file-wide flags}). Format v4 will introduce a
header field that explicitly names the external strtab, making this flag
unnecessary.

@node Data models
@section Data models
@cindex Data models

The data model is a simple integer which indicates the ABI in use on this
platform. Right now, it is very simple, distinguishing only between 32- and
64-bit types: a model of 1 indicates ILP32, 2 indicates LP64. The mapping from
ABI integer to type sizes is hardwired into @code{libctf}: currently, we use
this to hardwire the size of pointers, function pointers, and enumerated types.

This is a very kludgy corner of CTF and will probably be replaced with explicit
header fields to record this sort of thing in future.

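A hardwired mapping of this kind might look like the sketch below. The names
are hypothetical and the table is not copied from @code{libctf}: the pointer
sizes follow from the definitions of ILP32 and LP64, while the enum size
assumes that enumerated types are the size of @code{int} under both models.

```c
#include <assert.h>
#include <stddef.h>

enum { EX_MODEL_ILP32 = 1, EX_MODEL_LP64 = 2 };

/* Fill in the sizes (in bytes) implied by data model MODEL.
   Returns 0 on success, -1 if the model is unknown.  */
static int
ex_model_sizes (int model, size_t *pointer_size, size_t *enum_size)
{
  assert (pointer_size != NULL && enum_size != NULL);
  switch (model)
    {
    case EX_MODEL_ILP32:
      *pointer_size = 4;        /* Data and function pointers alike.  */
      *enum_size = 4;           /* Assumption: enums are int-sized.  */
      return 0;
    case EX_MODEL_LP64:
      *pointer_size = 8;
      *enum_size = 4;           /* Assumption: enums are int-sized.  */
      return 0;
    default:
      return -1;
    }
}
```
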
@node Limits of CTF
@section Limits of CTF
@cindex Limits

The following limits are imposed by various aspects of CTF version 3:

@table @code
@item CTF_MAX_TYPE
Maximum type identifier (maximum number of types accessible with parent and
child containers in use): 0xfffffffe
@item CTF_MAX_PTYPE
Maximum type identifier in a parent dictionary: maximum number of types in any
one dictionary: 0x7fffffff
@item CTF_MAX_NAME
Maximum offset into a string table: 0x7fffffff
@item CTF_MAX_VLEN
Maximum number of members in a struct, union, or enum: maximum number of
function args: 0xffffff
@item CTF_MAX_SIZE
Maximum size in bytes of a type that can be recorded in a @code{ctf_stype_t}
before we fall back to @code{ctf_type_t}: 0xfffffffe bytes
@end table

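These limits translate into simple range checks. The sketch below uses the
values listed above under hypothetical @code{EX_}-prefixed names; treating
every type ID above @code{CTF_MAX_PTYPE} as belonging to a child dictionary is
a reading of the two type-ID limits, not something this section states
outright.

```c
#include <assert.h>
#include <stdint.h>

/* Values from the table above, under hypothetical names.  */
#define EX_CTF_MAX_TYPE  0xfffffffeu
#define EX_CTF_MAX_PTYPE 0x7fffffffu
#define EX_CTF_MAX_VLEN  0xffffffu
#define EX_CTF_MAX_SIZE  0xfffffffeu

/* Nonzero if ID is too large to be a parent type ID.  */
static int
ex_type_is_child (uint32_t id)
{
  assert (id <= EX_CTF_MAX_TYPE);
  return id > EX_CTF_MAX_PTYPE;
}

/* Nonzero if a type of SIZE bytes is too big for a ctf_stype_t.  */
static int
ex_needs_large_type (uint64_t size)
{
  return size > EX_CTF_MAX_SIZE;
}

/* Nonzero if a struct, union, enum or function with N members or
   arguments can be represented.  */
static int
ex_vlen_ok (uint64_t n)
{
  return n <= EX_CTF_MAX_VLEN;
}
```
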
Other maxima without associated macros:
@itemize
@item
Maximum value of an enumerated type: 2^32
@item
Maximum size of an array element: 2^32
@end itemize

These maxima are generally considered to be too low, because C programs can and
do exceed them: they will be lifted in format v4.

@node Index
@unnumbered Index

@printindex cp

@bye