Commit | Line | Data |
---|---|---|
c63539ff ML |
1 | .. |
2 | Copyright 1988-2022 Free Software Foundation, Inc. | |
3 | This is part of the GCC manual. | |
4 | For copying conditions, see the copyright.rst file. | |
5 | ||
6 | .. _type-encoding: | |
7 | ||
8 | Type Encoding | |
9 | ************* | |
10 | ||
11 | This is an advanced section. Type encodings are used extensively by | |
12 | the compiler and by the runtime, but you generally do not need to know | |
13 | about them to use Objective-C. | |
14 | ||
15 | The Objective-C compiler generates type encodings for all the types. | |
16 | These type encodings are used at runtime to find out information about | |
17 | selectors and methods and about objects and classes. | |
18 | ||
19 | The types are encoded in the following way: | |
20 | ||
21 | .. @sp 1 | |
22 | ||
23 | .. list-table:: | |
24 | :widths: 25 75 | |
25 | ||
26 | * - ``_Bool`` | |
27 | - ``B`` | |
28 | * - ``char`` | |
29 | - ``c`` | |
30 | * - ``unsigned char`` | |
31 | - ``C`` | |
32 | * - ``short`` | |
33 | - ``s`` | |
34 | * - ``unsigned short`` | |
35 | - ``S`` | |
36 | * - ``int`` | |
37 | - ``i`` | |
38 | * - ``unsigned int`` | |
39 | - ``I`` | |
40 | * - ``long`` | |
41 | - ``l`` | |
42 | * - ``unsigned long`` | |
43 | - ``L`` | |
44 | * - ``long long`` | |
45 | - ``q`` | |
46 | * - ``unsigned long long`` | |
47 | - ``Q`` | |
48 | * - ``float`` | |
49 | - ``f`` | |
50 | * - ``double`` | |
51 | - ``d`` | |
52 | * - ``long double`` | |
53 | - ``D`` | |
54 | * - ``void`` | |
55 | - ``v`` | |
56 | * - ``id`` | |
57 | - ``@`` | |
58 | * - ``Class`` | |
59 | - ``#`` | |
60 | * - ``SEL`` | |
61 | - ``:`` | |
62 | * - ``char*`` | |
63 | - ``*`` | |
64 | * - ``enum`` | |
65 | - an ``enum`` is encoded exactly as the integer type that the compiler uses for it, which depends on the enumeration values. Often the compiler uses ``unsigned int``, which is then encoded as ``I``. | |
66 | * - unknown type | |
67 | - ``?`` | |
68 | * - Complex types | |
69 | - ``j`` followed by the inner type. For example, ``_Complex double`` is encoded as ``jd``. | |
70 | * - bit-fields | |
71 | - ``b`` followed by the starting position of the bit-field, the type of the bit-field and the size of the bit-field (the bit-field encoding differs from the one used by the NeXT compiler, see below) | |
72 | ||
73 | .. @sp 1 | |
74 | ||
75 | The encoding of bit-fields has changed to allow bit-fields to be | |
76 | properly handled by the runtime functions that compute sizes and | |
77 | alignments of types that contain bit-fields. The previous encoding | |
78 | contained only the size of the bit-field. Using only this information | |
79 | it is not possible to reliably compute the size occupied by the | |
80 | bit-field. This is very important in the presence of the Boehm | |
81 | garbage collector because the objects are allocated using the typed | |
82 | memory facility available in this collector. The typed memory | |
83 | allocation requires information about where the pointers are located | |
84 | inside the object. | |
85 | ||
86 | The position of a bit-field is the offset, counted in bits, of the | |
87 | bit of that field closest to the beginning of the structure. | |
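
As a rough illustration of the bit-field layout described above, the following C sketch splits an encoding such as ``b128i3`` into its position, type and size. The ``bitfield_info`` type and the ``parse_bitfield`` function are hypothetical names chosen for this example; they are not part of the runtime API, and the sketch assumes the bit-field type is encoded as a single character.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper type for this sketch; not part of the runtime API. */
struct bitfield_info {
    unsigned long position;  /* offset in bits from the start of the structure */
    char type;               /* encoding of the declared type, e.g. 'i' */
    unsigned long size;      /* width of the bit-field in bits */
};

/* Parse one bit-field encoding of the form b<position><type><size>.
   Returns a pointer just past the encoding, or NULL if the input is
   not a well-formed bit-field encoding. */
const char *parse_bitfield(const char *enc, struct bitfield_info *out)
{
    char *end;

    assert(enc != NULL && out != NULL);
    if (enc[0] != 'b' || !isdigit((unsigned char)enc[1]))
        return NULL;
    out->position = strtoul(enc + 1, &end, 10);
    if (*end == '\0' || isdigit((unsigned char)*end))
        return NULL;
    out->type = *end++;                    /* single-character type code */
    if (!isdigit((unsigned char)*end))
        return NULL;
    out->size = strtoul(end, &end, 10);
    return end;
}
```

Because the function returns a pointer past the parsed encoding, it can be called repeatedly to walk consecutive bit-fields inside a structure encoding.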
88 | ||
89 | The non-atomic types are encoded as follows: | |
90 | ||
91 | .. @sp 1 | |
92 | ||
93 | .. list-table:: | |
94 | :widths: 15 85 | |
95 | ||
96 | * - pointers | |
97 | - :samp:`^` followed by the pointed-to type. | |
98 | * - arrays | |
99 | - :samp:`[` followed by the number of elements in the array followed by the type of the elements followed by :samp:`]` | |
100 | * - structures | |
101 | - :samp:`{` followed by the name of the structure (or :samp:`?` if the structure is unnamed), the :samp:`=` sign, the types of the members, followed by :samp:`}` | |
102 | * - unions | |
103 | - :samp:`(` followed by the name of the union (or :samp:`?` if the union is unnamed), the :samp:`=` sign, the types of the members, followed by :samp:`)` | |
104 | * - vectors | |
105 | - :samp:`![` followed by the vector_size (the number of bytes composing the vector) followed by a comma, followed by the alignment (in bytes) of the vector, followed by the type of the elements followed by :samp:`]` | |
106 | ||
107 | Here are some types and their encodings, as they are generated by the | |
108 | compiler on an i386 machine: | |
109 | ||
110 | +-------------------------------------------+------------------------------------------------+ | |
111 | |Objective-C type |Compiler encoding | | |
112 | +===========================================+================================================+ | |
113 | |.. code-block:: objective-c |``[10i]`` | | |
114 | | | | | |
115 | | int a[10]; | | | |
116 | +-------------------------------------------+------------------------------------------------+ | |
117 | |.. code-block:: objective-c |``{?=i[3f]b128i3b131i2c}`` | | |
118 | | | | | |
119 | | struct { | | | |
120 | | int i; | | | |
121 | | float f[3]; | | | |
122 | | int a:3; | | | |
123 | | int b:2; | | | |
124 | | char c; | | | |
125 | | } | | | |
126 | +-------------------------------------------+------------------------------------------------+ | |
127 | |.. code-block:: objective-c |``![16,16i]`` (alignment depends on the machine)| | |
128 | | | | | |
129 | | int a __attribute__ ((vector_size (16)));| | | |
130 | +-------------------------------------------+------------------------------------------------+ | |
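
To make the nesting rules concrete, here is a minimal C sketch that steps over one complete type encoding, following the grammar given in the tables above. The function name ``skip_type`` is hypothetical and chosen for this example only. The sketch is deliberately simplified: it treats every unrecognized character as a one-character atomic type and performs little validation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Skip one complete type encoding.  Returns a pointer just past it,
   or NULL if the encoding ends prematurely.  Hypothetical helper. */
const char *skip_type(const char *p)
{
    assert(p != NULL);
    /* Type specifiers (r, n, N, o, O, R, V) come just before the type. */
    while (*p != '\0' && strchr("rnNoORV", *p) != NULL)
        p++;
    switch (*p) {
    case '\0':
        return NULL;
    case '^':                         /* pointer: '^' then pointed-to type */
    case 'j':                         /* complex: 'j' then inner type */
        return skip_type(p + 1);
    case 'b':                         /* bit-field: b<position><type><size> */
        p++;
        if (!isdigit((unsigned char)*p)) return NULL;
        while (isdigit((unsigned char)*p)) p++;
        if (*p == '\0') return NULL;
        p++;                          /* single-character type code */
        if (!isdigit((unsigned char)*p)) return NULL;
        while (isdigit((unsigned char)*p)) p++;
        return p;
    case '!':                         /* vector: ![<size>,<alignment><type>] */
        if (p[1] != '[') return NULL;
        p += 2;
        while (isdigit((unsigned char)*p)) p++;
        if (*p != ',') return NULL;
        p++;
        while (isdigit((unsigned char)*p)) p++;
        p = skip_type(p);
        return (p != NULL && *p == ']') ? p + 1 : NULL;
    case '[':                         /* array: [<count><type>] */
        p++;
        while (isdigit((unsigned char)*p)) p++;
        p = skip_type(p);
        return (p != NULL && *p == ']') ? p + 1 : NULL;
    case '{':                         /* structure: {name=members} */
    case '(': {                       /* union: (name=members) */
        char close = (*p == '{') ? '}' : ')';
        p++;
        while (*p != '\0' && *p != '=' && *p != close)
            p++;                      /* skip the name (or '?') */
        if (*p == '=')
            p++;
        while (p != NULL && *p != '\0' && *p != close)
            p = skip_type(p);         /* skip each member in turn */
        return (p != NULL && *p == close) ? p + 1 : NULL;
    }
    default:
        return p + 1;                 /* atomic type: one character */
    }
}
```

Applied to ``{?=i[3f]b128i3b131i2c}``, the function consumes the whole string, visiting the members ``i``, ``[3f]``, ``b128i3``, ``b131i2`` and ``c`` in order.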
131 | ||
132 | In addition to the types the compiler also encodes the type | |
133 | specifiers. The table below describes the encoding of the current | |
134 | Objective-C type specifiers: | |
135 | ||
136 | .. list-table:: | |
137 | :header-rows: 1 | |
138 | ||
139 | * - Specifier | |
140 | - Encoding | |
141 | ||
142 | * - ``const`` | |
143 | - ``r`` | |
144 | * - ``in`` | |
145 | - ``n`` | |
146 | * - ``inout`` | |
147 | - ``N`` | |
148 | * - ``out`` | |
149 | - ``o`` | |
150 | * - ``bycopy`` | |
151 | - ``O`` | |
152 | * - ``byref`` | |
153 | - ``R`` | |
154 | * - ``oneway`` | |
155 | - ``V`` | |
156 | ||
157 | The type specifiers are encoded just before the type. Unlike types | |
158 | however, the type specifiers are only encoded when they appear in method | |
159 | argument types. | |
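
The specifier table can be expressed as a small lookup. The C sketch below maps each specifier character to its keyword and skips any specifiers that precede a type; ``qualifier_name`` and ``skip_qualifiers`` are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Return the keyword for a type-specifier character, or NULL if the
   character is not a specifier.  Hypothetical helper for this sketch. */
const char *qualifier_name(char c)
{
    switch (c) {
    case 'r': return "const";
    case 'n': return "in";
    case 'N': return "inout";
    case 'o': return "out";
    case 'O': return "bycopy";
    case 'R': return "byref";
    case 'V': return "oneway";
    default:  return NULL;
    }
}

/* Specifiers are encoded just before the type, so skipping them
   yields a pointer to the start of the type itself. */
const char *skip_qualifiers(const char *enc)
{
    assert(enc != NULL);
    while (qualifier_name(*enc) != NULL)
        enc++;
    return enc;
}
```

Note that case matters: ``V`` is ``oneway``, whereas ``v`` is the type ``void``.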
160 | ||
161 | Note how ``const`` interacts with pointers: | |
162 | ||
163 | +---------------------------+-----------------+ | |
164 | |Objective-C type |Compiler encoding| | |
165 | +===========================+=================+ | |
166 | |.. code-block:: objective-c|``ri`` | | |
167 | | | | | |
168 | | const int | | | |
169 | +---------------------------+-----------------+ | |
170 | |.. code-block:: objective-c|``^ri`` | | |
171 | | | | | |
172 | | const int* | | | |
173 | +---------------------------+-----------------+ | |
174 | |.. code-block:: objective-c|``r^i`` | | |
175 | | | | | |
176 | | int *const | | | |
177 | +---------------------------+-----------------+ | |
178 | ||
179 | ``const int*`` is a pointer to a ``const int``, and so is | |
180 | encoded as ``^ri``. ``int* const``, instead, is a ``const`` | |
181 | pointer to an ``int``, and so is encoded as ``r^i``. | |
182 | ||
183 | Finally, there is a complication when encoding ``const char *`` | |
184 | versus ``char * const``. Because ``char *`` is encoded as | |
185 | ``*`` and not as ``^c``, there is no way to express whether | |
186 | ``r`` applies to the pointer or to the pointee. | |
187 | ||
188 | Hence, it is assumed as a convention that ``r*`` means ``const | |
189 | char *`` (since it is what is most often meant), and there is no way to | |
190 | encode ``char *const``. ``char *const`` would simply be encoded | |
191 | as ``*``, and the ``const`` is lost. | |
192 | ||
193 | .. toctree:: | |
194 | :maxdepth: 2 | |
195 | ||
196 | ||
197 | .. _legacy-type-encoding: | |
198 | ||
199 | Legacy Type Encoding | |
200 | ^^^^^^^^^^^^^^^^^^^^ | |
201 | ||
202 | Historically, GCC had a number of bugs in its | |
203 | encoding code. The NeXT runtime expects GCC to emit type encodings in | |
204 | this historical format (compatible with GCC-3.3), so when using the | |
205 | NeXT runtime, GCC deliberately emits a number of incorrect | |
206 | encodings: | |
207 | ||
208 | * the read-only qualifier of the pointee gets emitted before the '^'. | |
209 | The read-only qualifier of the pointer itself gets ignored, unless it | |
210 | is a typedef. Also, the 'r' is only emitted for the outermost type. | |
211 | ||
212 | * 32-bit longs are encoded as 'l' or 'L', but not always. For typedefs, | |
213 | the compiler uses 'i' or 'I' instead if encoding a struct field or a | |
214 | pointer. | |
215 | ||
216 | * ``enum``\ s are always encoded as 'i' (int) even if they are actually | |
217 | unsigned or long. | |
218 | ||
219 | In addition, the NeXT runtime uses a different encoding for | |
220 | bit-fields. It encodes them as ``b`` followed by the size, without | |
221 | a bit offset or the underlying field type. | |
222 | ||
223 | .. _@encode: | |
224 | ||
225 | @encode | |
226 | ^^^^^^^ | |
227 | ||
228 | GNU Objective-C supports the ``@encode`` syntax that allows you to | |
229 | create a type encoding from a C/Objective-C type. For example, | |
230 | ``@encode(int)`` is compiled by the compiler into ``"i"``. | |
231 | ||
232 | ``@encode`` does not support type qualifiers other than | |
233 | ``const``. For example, ``@encode(const char*)`` is valid and | |
234 | is compiled into ``"r*"``, while ``@encode(bycopy char *)`` is | |
235 | invalid and will cause a compilation error. | |
236 | ||
237 | .. _method-signatures: | |
238 | ||
239 | Method Signatures | |
240 | ^^^^^^^^^^^^^^^^^ | |
241 | ||
242 | This section documents the encoding of method types, which is rarely | |
243 | needed to use Objective-C. You should skip it at a first reading; the | |
244 | runtime provides functions that will work on methods and can walk | |
245 | through the list of parameters and interpret them for you. These | |
246 | functions are part of the public 'API' and are the preferred way to | |
247 | interact with method signatures from user code. | |
248 | ||
249 | But if you need to debug a problem with method signatures and need to | |
250 | know how they are implemented (i.e., the 'ABI'), read on. | |
251 | ||
252 | Methods have their 'signature' encoded and made available to the | |
253 | runtime. The 'signature' encodes all the information required to | |
254 | dynamically build invocations of the method at runtime: return type | |
255 | and arguments. | |
256 | ||
257 | The 'signature' is a null-terminated string, composed of the following: | |
258 | ||
259 | * The return type, including type qualifiers. For example, a method | |
260 | returning ``int`` would have ``i`` here. | |
261 | ||
262 | * The total size (in bytes) required to pass all the parameters. This | |
263 | includes the two hidden parameters (the object ``self`` and the | |
264 | method selector ``_cmd``). | |
265 | ||
266 | * Each argument, with the type encoding, followed by the offset (in | |
267 | bytes) of the argument in the list of parameters. | |
268 | ||
269 | For example, a method with no arguments and returning ``int`` would | |
270 | have the signature ``i8@0:4`` if the size of a pointer is 4. The | |
271 | signature is interpreted as follows: the ``i`` is the return type | |
272 | (an ``int``), the ``8`` is the total size of the parameters in | |
273 | bytes (two pointers each of size 4), the ``@0`` is the first | |
274 | parameter (an object at byte offset ``0``) and ``:4`` is the | |
275 | second parameter (a ``SEL`` at byte offset ``4``). | |
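
The layout just described can be sketched as a small C parser that splits a signature such as ``i8@0:4`` into its return type, total parameter size and per-argument offsets. All names here (``signature_info``, ``arg_info``, ``skip_simple_type``, ``parse_signature``, ``MAX_ARGS``) are hypothetical and exist only for this illustration; user code should normally rely on the functions the runtime provides. The type skipper is simplified: it matches brackets by depth instead of parsing members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 16   /* arbitrary limit chosen for this sketch */

struct arg_info {
    const char *type;     /* start of the argument's type encoding */
    size_t type_len;      /* length of that encoding in characters */
    long offset;          /* byte offset in the list of parameters */
};

struct signature_info {
    const char *ret;      /* start of the return type encoding */
    size_t ret_len;
    long frame_size;      /* total size in bytes of all parameters */
    int argc;             /* includes the hidden self and _cmd */
    struct arg_info args[MAX_ARGS];
};

/* Simplified skipper: specifiers, pointer and complex prefixes, then
   either one atomic character or a bracketed group matched by depth. */
const char *skip_simple_type(const char *p)
{
    int depth = 0;

    while (*p != '\0' && strchr("rnNoORV^j", *p) != NULL)
        p++;
    if (*p == '!' && p[1] == '[')
        p++;                               /* vector prefix */
    if (*p == '\0')
        return NULL;
    if (*p != '{' && *p != '(' && *p != '[')
        return p + 1;
    do {
        if (*p == '{' || *p == '(' || *p == '[') depth++;
        else if (*p == '}' || *p == ')' || *p == ']') depth--;
        p++;
    } while (depth > 0 && *p != '\0');
    return depth == 0 ? p : NULL;
}

/* Returns 0 on success, -1 if the signature is malformed. */
int parse_signature(const char *sig, struct signature_info *out)
{
    const char *p, *q;
    char *end;

    assert(sig != NULL && out != NULL);
    p = skip_simple_type(sig);
    if (p == NULL) return -1;
    out->ret = sig;
    out->ret_len = (size_t)(p - sig);
    out->frame_size = strtol(p, &end, 10);
    if (end == p) return -1;               /* missing total size */
    p = end;
    out->argc = 0;
    while (*p != '\0') {
        if (out->argc == MAX_ARGS) return -1;
        q = skip_simple_type(p);
        if (q == NULL) return -1;
        out->args[out->argc].type = p;
        out->args[out->argc].type_len = (size_t)(q - p);
        out->args[out->argc].offset = strtol(q, &end, 10);
        if (end == q) return -1;           /* missing offset */
        p = end;
        out->argc++;
    }
    return 0;
}
```

For ``i8@0:4`` this yields a return type of ``i``, a total size of 8 and two arguments: ``@`` at offset 0 and ``:`` at offset 4, matching the interpretation given above.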
276 | ||
277 | You can easily find more examples by running the 'strings' program | |
278 | on an Objective-C object file compiled by GCC. You'll see a lot of | |
279 | strings that look very much like ``i8@0:4``. They are signatures | |
3ed1b4ce | 280 | of Objective-C methods. |