Copyright (C) 2000 Free Software Foundation, Inc.

This file is intended to contain a few notes about writing C code
within GCC so that it compiles without error on the full range of
compilers GCC needs to be able to compile on.

The problem is that many ISO-standard constructs are not accepted by
either old or buggy compilers, and we keep getting bitten by them.
This knowledge until now has been sparsely spread around, so I
thought I'd collect it in one useful place. Please add and correct
any problems as you come across them.

I'm going to start from a base of the ISO C89 standard, since that is
probably what most people code to naturally. Obviously using
constructs introduced after that is not a good idea.

The first section of this file deals strictly with portability issues,
the second with common coding pitfalls.


                        Portability Issues
                        ==================

Unary +
-------

K+R C compilers and preprocessors have no notion of unary '+'. Thus
the following code snippet contains 2 portability problems.

int x = +2;  /* int x = 2; */
#if +1       /* #if 1 */
#endif


Pointers to void
----------------

K+R C compilers did not have a void pointer, and used char * as the
pointer to anything. The macro PTR is defined as either void * or
char * depending on whether you have a standards compliant compiler or
a K+R one. Thus

    free ((void *) h->value.expansion);

should be written

    free ((PTR) h->value.expansion);


String literals
---------------

K+R C did not allow concatenation of string literals like

    "This is a " "single string literal".

Moreover, some compilers like MSVC++ have fairly low limits on the
maximum length of a string literal; 509 is the lowest we've come
across. You may need to break up a long printf statement into many
smaller ones.

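As a rough sketch of what "many smaller ones" can look like, the
hypothetical print_usage below writes each line with its own call, so
no single literal comes near the limit and no concatenation is needed.
The function name and the message text are invented for illustration,
and the definition is ISO-style for brevity; code inside GCC would use
PARAMS and a K+R definition as described under "Function prototypes".

```c
#include <stdio.h>

/* Hypothetical help printer: one short literal per call instead of
   one long concatenated literal.  Returns the number of lines
   written, or -1 on a write error.  */
int
print_usage (FILE *out)
{
  if (fputs ("Usage: frob [options] file...\n", out) == EOF)
    return -1;
  if (fputs ("  -o FILE   write output to FILE\n", out) == EOF)
    return -1;
  if (fputs ("  -v        print version information\n", out) == EOF)
    return -1;
  return 3;
}
```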

Empty macro arguments
---------------------

ISO C (6.8.3 in the 1990 standard) specifies the following:

If (before argument substitution) any argument consists of no
preprocessing tokens, the behavior is undefined.

This was relaxed by ISO C99, but some older compilers emit an error,
so code like

#define foo(x, y) x y
foo (bar, )

needs to be coded in some other way.

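One possible "other way", offered as a sketch rather than as the
approach GCC itself takes, is to pass a macro that expands to nothing.
The argument then consists of one preprocessing token before
substitution, so the rule quoted above is not triggered. EMPTY and
DECLARE are hypothetical names, and whether a given K+R preprocessor
copes with this is an assumption you would have to check.

```c
/* EMPTY is a hypothetical helper; it expands to no tokens, but the
   argument itself is the single token EMPTY.  */
#define EMPTY
#define DECLARE(qualifier, type, name) qualifier type name

DECLARE (EMPTY, int, plain_counter) = 1;     /* int plain_counter = 1; */
DECLARE (static, int, private_counter) = 2;  /* static int private_counter = 2; */

/* Use both objects so the declarations above are exercised.  */
int
sum_counters (void)
{
  return plain_counter + private_counter;
}
```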

signed keyword
--------------

The signed keyword did not exist in K+R compilers; it was introduced in
ISO C89, so you cannot use it. In both K+R and standard C,
unqualified char and bitfields may be signed or unsigned. There is no
way to portably declare signed chars or signed bitfields.

All other arithmetic types are signed unless you use the 'unsigned'
qualifier. For instance, it is safe to write

    short paramc;

instead of

    signed short paramc;

If you have an algorithm that depends on signed char or signed
bitfields, you must find another way to write it before it can be
integrated into GCC.

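For example, an algorithm that wants a signed 8-bit quantity can keep
the raw bits in an unsigned type and do the sign extension by hand.
This is only a sketch: sign_extend_byte is a hypothetical helper, it
assumes the byte is meant as a two's-complement value, and it is
written ISO-style for brevity.

```c
/* Interpret the low 8 bits of BYTE as a two's-complement value in
   the range -128..127, without using the 'signed' keyword.  */
int
sign_extend_byte (unsigned int byte)
{
  int value = (int) (byte & 0xff);

  return (value & 0x80) ? value - 0x100 : value;
}
```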

Function prototypes
-------------------

You need to provide a function prototype for every function before you
use it, and functions must be defined K+R style. The function
prototype should use the PARAMS macro, which takes a single argument.
Therefore the parameter list must be enclosed in parentheses. For
example,

int myfunc PARAMS ((double, int *));

int
myfunc (var1, var2)
     double var1;
     int *var2;
{
  ...
}

You also need to use PARAMS when referring to function prototypes in
other circumstances, for example see "Calling functions through
pointers to functions" below.

Variable-argument functions are best described by example:-

void cpp_ice PARAMS ((cpp_reader *, const char *msgid, ...));

void
cpp_ice VPARAMS ((cpp_reader *pfile, const char *msgid, ...))
{
#ifndef ANSI_PROTOTYPES
  cpp_reader *pfile;
  const char *msgid;
#endif
  va_list ap;

  VA_START (ap, msgid);

#ifndef ANSI_PROTOTYPES
  pfile = va_arg (ap, cpp_reader *);
  msgid = va_arg (ap, const char *);
#endif

  ...
  va_end (ap);
}

For the curious, here are the definitions of the above macros. See
ansidecl.h for these and more.

#define PARAMS(paramlist)  paramlist  /* ISO C. */
#define VPARAMS(args)      args

#define PARAMS(paramlist)  ()         /* K+R C. */
#define VPARAMS(args)      (va_alist) va_dcl


Calling functions through pointers to functions
-----------------------------------------------

K+R C compilers require brackets around the dereferenced pointer
variable. For example

typedef void (* cl_directive_handler) PARAMS ((cpp_reader *, const char *));
    p->handler (pfile, p->arg);

needs to become

    (p->handler) (pfile, p->arg);


Macros
------

The rules under K+R C and ISO C for achieving stringification and
token pasting are quite different. Therefore some macros have been
defined which will get it right depending upon the compiler.

    CONCAT2(a,b) CONCAT3(a,b,c) and CONCAT4(a,b,c,d)

will paste the tokens passed as arguments. You must not leave any
space around the commas. Also,

    STRINGX(x)

will stringify an argument; to get the same result on K+R and ISO
compilers x should not have spaces around it.

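A small usage sketch follows. The two #define lines are stand-ins
that I am assuming match the ISO side of the real definitions; inside
GCC you would include the appropriate header instead of defining them
yourself. frob_count, frob_name and frob_total are hypothetical names.

```c
/* Assumed ISO C definitions, for illustration only.  */
#define CONCAT2(a,b) a##b
#define STRINGX(s) #s

/* Note: no spaces around the comma or around the arguments.  */
static int CONCAT2(frob,_count) = 7;   /* declares frob_count */

/* Returns the string produced by stringifying the token frob.  */
const char *
frob_name (void)
{
  return STRINGX(frob);
}

/* Returns the value of the pasted identifier.  */
int
frob_total (void)
{
  return frob_count;
}
```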

Enums
-----

In K+R C, you have to cast enum types to use them as integers, and
some compilers in particular give lots of warnings for using an enum
as an array index.

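A minimal sketch of the cast, using a made-up enum and table (none of
these names come from GCC), written ISO-style for brevity:

```c
/* Hypothetical enum and lookup table.  */
enum color { RED, GREEN, BLUE, NUM_COLORS };

static const char *const color_names[NUM_COLORS] =
{
  "red", "green", "blue"
};

/* The explicit cast keeps K+R compilers happy and silences the
   enum-as-array-index warnings some compilers give.  */
const char *
color_name (enum color c)
{
  return color_names[(int) c];
}
```
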
Bitfields
---------

See also "signed keyword" above. In K+R C only unsigned int bitfields
were defined (i.e. unsigned char, unsigned short, unsigned long).
Using plain int/short/long was not allowed.


free and realloc
----------------

Some implementations crash upon attempts to free or realloc the null
pointer. Thus if mem might be null, you need to write

    if (mem)
      free (mem);

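The same guard can be wrapped up once. The helpers below are
hypothetical (they are not presented as GCC's own routines), and they
use void * and ISO-style definitions for brevity where GCC code would
use PTR. For realloc, the sketch falls back to malloc when there is no
existing block.

```c
#include <stdlib.h>

/* Free MEM unless it is null.  */
void
checked_free (void *mem)
{
  if (mem)
    free (mem);
}

/* Resize MEM to SIZE bytes, never handing a null pointer to
   realloc.  */
void *
checked_realloc (void *mem, size_t size)
{
  return mem ? realloc (mem, size) : malloc (size);
}
```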

Reserved Keywords
-----------------

K+R C has "entry" as a reserved keyword, so you should not use it for
your variable names.


Type promotions
---------------

K+R used unsigned-preserving rules for arithmetic expressions, while
ISO uses value-preserving. This means an unsigned char compared to an
int is done as an unsigned comparison in K+R (since unsigned char
promotes to unsigned) while it is signed in ISO (since all of the
values in unsigned char fit in an int, it promotes to int).

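One way to get the same answer under both sets of rules is to make the
conversion explicit before comparing. A sketch, with a hypothetical
function name and an ISO-style definition for brevity:

```c
/* Returns nonzero if the first byte of BUF is greater than LIMIT,
   compared as signed ints.  Without the cast, a K+R compiler would
   promote the unsigned char to unsigned and compare unsigned, so a
   negative LIMIT would look like a huge value.  */
int
first_byte_exceeds (const unsigned char *buf, int limit)
{
  return (int) buf[0] > limit;
}
```
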
** Not having any argument whose type is a short type (char, short,
float of any flavor) and subject to promotion. **


Trigraphs
---------

You weren't going to use them anyway, but trigraphs were not defined
in K+R C, and some otherwise ISO C compliant compilers do not accept
them.


Suffixes on Integer Constants
-----------------------------

**Using a 'u' suffix on integer constants.**


errno
-----

errno might be declared as a macro.

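In practice this means getting errno from <errno.h> and never
declaring it by hand with "extern int errno;". A sketch, assuming an
ISO C library that provides strtol; parse_long is a hypothetical
helper and the definition is ISO-style for brevity:

```c
#include <errno.h>
#include <stdlib.h>

/* No 'extern int errno;' here: <errno.h> supplies errno, which may
   be a macro.  Stores the parsed value in *RESULT and returns 1 on
   success, 0 if TEXT is not a complete in-range decimal number.  */
int
parse_long (const char *text, long *result)
{
  char *end;

  errno = 0;
  *result = strtol (text, &end, 10);
  if (end == text || *end != '\0' || errno == ERANGE)
    return 0;
  return 1;
}
```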

                        Common Coding Pitfalls
                        ======================

Implicit int
------------

In C, the 'int' keyword can often be omitted from type declarations.
For instance, you can write

    unsigned variable;

as shorthand for

    unsigned int variable;

There are several places where this can cause trouble. First, suppose
'variable' is a long; then you might think

    (unsigned) variable

would convert it to unsigned long. It does not. It converts to
unsigned int. This mostly causes problems on 64-bit platforms, where
long and int are not the same size.

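Spelling out the full type avoids the surprise. A sketch with a
hypothetical helper name, written ISO-style for brevity:

```c
/* Convert VARIABLE to unsigned long.  Writing '(unsigned) variable'
   here would convert to unsigned int first, and could discard the
   high bits where long is wider than int.  */
unsigned long
to_unsigned_long (long variable)
{
  return (unsigned long) variable;
}
```
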
Second, if you write a function definition with no return type at
all:

    operate(a, b)
        int a, b;
    {
      ...
    }

that function is expected to return int, *not* void. GCC will warn
about this. K+R C has no problem with 'void' as a return type, so you
need not worry about that.

Implicit function declarations always have return type int. So if you
correct the above definition to

    void
    operate(a, b)
        int a, b;
    ...

but operate() is called above its definition, you will get an error
about a "type mismatch with previous implicit declaration". The cure
is to prototype all functions at the top of the file, or in an
appropriate header.

Char vs unsigned char vs int
----------------------------

In C, unqualified 'char' may be either signed or unsigned; it is the
implementation's choice. When you are processing 7-bit ASCII, it does
not matter. But when your program must handle arbitrary binary data,
or fully 8-bit character sets, you have a problem. The most obvious
issue is if you have a look-up table indexed by characters.

For instance, the character '\341' in ISO Latin 1 is SMALL LETTER A
WITH ACUTE ACCENT. In the proper locale, isalpha('\341') will be
true. But if you read '\341' from a file and store it in a plain
char, isalpha(c) may look up character 225, or it may look up
character -31. And the ctype table has no entry at offset -31, so
your program will crash. (If you're lucky.)

It is wise to use unsigned char everywhere you possibly can. This
avoids all these problems. Unfortunately, the routines in <string.h>
take plain char arguments, so you have to remember to cast them back
and forth - or avoid the use of strxxx() functions, which is probably
a good idea anyway.

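Where the data really is held in plain char, the cast belongs at the
point of the ctype call. A sketch with a hypothetical function name,
written ISO-style for brevity:

```c
#include <ctype.h>

/* Count the alphabetic characters in TEXT.  The cast to unsigned
   char guarantees isalpha never sees a negative value, whatever the
   signedness of plain char.  */
long
count_letters (const char *text)
{
  long count = 0;

  for (; *text != '\0'; text++)
    if (isalpha ((unsigned char) *text))
      count++;
  return count;
}
```
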
Another common mistake is to use either char or unsigned char to
receive the result of getc() or related stdio functions. They may
return EOF, which is outside the range of values representable by
char. If you use char, some legal character value may be confused
with EOF, such as '\377' (SMALL LETTER Y WITH UMLAUT, in Latin-1).
The correct choice is int.

A more subtle version of the same mistake might look like this:

    unsigned char pushback[NPUSHBACK];
    int pbidx;
    #define unget(c) (assert(pbidx < NPUSHBACK), pushback[pbidx++] = (c))
    #define get(c) (pbidx ? pushback[--pbidx] : getchar())
    ...
    unget(EOF);

which will mysteriously turn a pushed-back EOF into a SMALL LETTER Y
WITH UMLAUT.

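A sketch of one fix for the pushback example: store the pushed-back
values as int, so that EOF survives the round trip. The value of
NPUSHBACK is an arbitrary assumption, and get is written here without
the unused argument.

```c
#include <assert.h>
#include <stdio.h>

#define NPUSHBACK 8   /* arbitrary size, for illustration */

/* int rather than unsigned char, so EOF fits.  */
static int pushback[NPUSHBACK];
static int pbidx;

#define unget(c) (assert (pbidx < NPUSHBACK), pushback[pbidx++] = (c))
#define get()    (pbidx ? pushback[--pbidx] : getchar ())
```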

Other common pitfalls
---------------------

o Expecting 'plain' char to be either sign-extending or zero-extending.

o Shifting an item by a negative amount or by greater than or equal to
  the number of bits in a type (expecting shifts by 32 to be sensible
  has caused quite a number of bugs at least in the early days).

o Expecting ints shifted right to be sign extended.

o Modifying the same value twice without an intervening sequence point.

o Host vs. target floating point representation, including emitting NaNs
  and Infinities in a form that the assembler handles.

o qsort being an unstable sort function (unstable in the sense that
  multiple items that sort the same may be sorted in different orders
  by different qsort functions).

o Passing incorrect types to fprintf and friends.

o Adding a declaration for a function defined in another file to
  a .c file instead of to a .h file.
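
To illustrate the qsort item, one common way to get the same order
from every qsort is to make the comparison total, breaking ties on the
original position. This is a sketch under that assumption; struct
item and its fields are hypothetical, and the definitions are
ISO-style for brevity.

```c
#include <stdlib.h>

struct item
{
  int key;     /* primary sort key */
  int serial;  /* original position, used only to break ties */
};

/* Never returns 0 for two distinct elements, so the result does not
   depend on how a particular qsort orders equal keys.  */
static int
compare_items (const void *a, const void *b)
{
  const struct item *x = (const struct item *) a;
  const struct item *y = (const struct item *) b;

  if (x->key != y->key)
    return x->key < y->key ? -1 : 1;
  if (x->serial != y->serial)
    return x->serial < y->serial ? -1 : 1;
  return 0;
}

/* Record each element's position, then sort by key and position.  */
void
sort_items (struct item *items, size_t count)
{
  size_t i;

  for (i = 0; i < count; i++)
    items[i].serial = (int) i;
  qsort (items, count, sizeof items[0], compare_items);
}
```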