Commit | Line | Data |
---|---|---|
8d9254fc | 1 | Copyright (C) 2000-2020 Free Software Foundation, Inc. |
91399395 NB |
2 | |
3 | This file is intended to contain a few notes about writing C code | |
4 | within GCC so that it compiles without error on the full range of | |
5 | compilers GCC needs to be able to compile on. | |
6 | ||
7 | The problem is that many ISO-standard constructs are not accepted by | |
8 | either old or buggy compilers, and we keep getting bitten by them. | |
ba117645 | 9 | This knowledge until now has been sparsely spread around, so I |
91399395 NB |
10 | thought I'd collect it in one useful place. Please add and correct |
11 | any problems as you come across them. | |
12 | ||
b46b8fb4 | 13 | I'm going to start from a base of the ISO C90 standard, since that is |
91399395 NB |
14 | probably what most people code to naturally. Obviously using |
15 | constructs introduced after that is not a good idea. | |
16 | ||
498ec23d SB |
17 | For the complete coding style conventions used in GCC, please read |
18 | http://gcc.gnu.org/codingconventions.html | |
91399395 NB |
19 | |
20 | ||
ced2ad76 MK |
21 | String literals |
22 | --------------- | |
23 | ||
ced2ad76 MK |
24 | Some compilers like MSVC++ have fairly low limits on the maximum |
25 | length of a string literal; 509 characters (the minimum that ISO C90 requires) is the lowest we've come across. You | |
26 | may need to break up a long printf statement into many smaller ones. | |
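One way to stay under the limit is sketched below. It keeps a long message as a table of short literals and writes them one at a time. The names usage_lines and print_usage, and the help text itself, are hypothetical and exist only for illustration. Note that relying on adjacent string literal concatenation would not help, because the limit applies to the literal after concatenation.

```c
#include <stdio.h>

/* Hypothetical help text, split so that no single literal comes
   anywhere near the 509-character limit.  */
static const char *const usage_lines[] = {
  "Usage: frob [options] file...\n",
  "Options:\n",
  "  -o FILE   Write output to FILE\n",
  "  -v        Be verbose\n"
};

/* Write every piece in order and return the number of pieces written.  */
int
print_usage (FILE *stream)
{
  size_t i;

  for (i = 0; i < sizeof usage_lines / sizeof usage_lines[0]; i++)
    fputs (usage_lines[i], stream);
  return (int) i;
}
```

Calling print_usage (stderr) emits the pieces back to back, so the reader sees one continuous message.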
27 | ||
28 | ||
29 | Empty macro arguments | |
30 | --------------------- | |
31 | ||
32 | ISO C (6.8.3 in the 1990 standard) specifies the following: | |
33 | ||
34 | If (before argument substitution) any argument consists of no | |
35 | preprocessing tokens, the behavior is undefined. | |
36 | ||
37 | This was relaxed by ISO C99, but some older compilers emit an error, | |
38 | so code like | |
39 | ||
40 | #define foo(x, y) x y | |
41 | foo (bar, ) | |
42 | ||
43 | needs to be coded in some other way. | |
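One possible workaround, sketched here under the assumption that a placeholder macro is acceptable in the code at hand, is to pass a macro that expands to nothing. The argument then consists of one preprocessing token before substitution, which the 1990 wording allows. EMPTY, DECL, zero_by_default and forty_two are hypothetical names.

```c
/* Expands to nothing, but counts as one preprocessing token when it
   is written as a macro argument.  */
#define EMPTY

/* Hypothetical macro whose last argument is sometimes not needed.  */
#define DECL(type, name, init) type name init

DECL (int, zero_by_default, EMPTY);   /* int zero_by_default ;  */
DECL (int, forty_two, = 42);          /* int forty_two = 42;    */
```

The other obvious choice is to provide a second macro that takes fewer arguments.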
44 | ||
45 | ||
53eebfbf JM |
46 | Avoid unnecessary test before free |
47 | ---------------------------------- | |
ced2ad76 | 48 | |
53eebfbf JM |
49 | Since SunOS 4 stopped being a reasonable portability target |
50 | (which happened around 2007), there has been no need to guard | |
51 | against "free (NULL)". Thus, any guard like the following | |
52 | constitutes a redundant test: | |
53 | ||
54 | if (P) | |
55 | free (P); | |
56 | ||
57 | It is better to avoid the test.[*] | |
58 | Instead, simply free P, regardless of whether it is NULL. | |
59 | ||
60 | [*] However, if your profiling exposes a test like this in a | |
61 | performance-critical loop, say where P is nearly always NULL, and | |
62 | the cost of calling free on a NULL pointer would be prohibitively | |
63 | high, consider using __builtin_expect, e.g., like this: | |
64 | ||
65 | if (__builtin_expect (ptr != NULL, 0)) | |
66 | free (ptr); | |
ced2ad76 | 67 | |
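As a small sketch of the unguarded style, the cleanup routine below frees every member without testing it first. struct buffer and buffer_release are hypothetical names used only for this example.

```c
#include <stdlib.h>

/* Hypothetical object whose members may or may not be allocated.  */
struct buffer
{
  char *data;
  char *name;
};

/* Release both members.  No NULL tests are needed, because
   free (NULL) is a no-op.  */
void
buffer_release (struct buffer *b)
{
  free (b->data);
  free (b->name);
  b->data = NULL;
  b->name = NULL;
}
```

Resetting the pointers afterwards is a separate design choice; it makes a second call to buffer_release harmless.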
ced2ad76 MK |
68 | |
69 | ||
70 | Trigraphs | |
71 | --------- | |
72 | ||
73 | You weren't going to use them anyway, but some otherwise ISO C | |
74 | compliant compilers do not accept trigraphs. | |
75 | ||
76 | ||
77 | Suffixes on Integer Constants | |
78 | ----------------------------- | |
79 | ||
80 | You should never use a 'l' suffix on integer constants ('L' is fine), | |
81 | since it can easily be confused with the number '1'. | |
82 | ||
83 | ||
84 | Common Coding Pitfalls | |
85 | ====================== | |
86 | ||
87 | errno | |
88 | ----- | |
89 | ||
90 | errno might be declared as a macro, so include <errno.h> rather than declaring it yourself. | |
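A minimal sketch of the safe pattern follows: errno comes from <errno.h> and is never declared by hand (no "extern int errno;"). The helper parse_long is a hypothetical name; strtol and ERANGE are standard.

```c
#include <errno.h>
#include <stdlib.h>

/* Parse S as a decimal number into *OUT.  Return 0 on success and -1
   if the value does not fit in a long.  */
int
parse_long (const char *s, long *out)
{
  errno = 0;                    /* Works whether errno is a macro or not.  */
  *out = strtol (s, NULL, 10);
  return errno == ERANGE ? -1 : 0;
}
```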
91 | ||
92 | ||
93 | Implicit int | |
94 | ------------ | |
95 | ||
96 | In C, the 'int' keyword can often be omitted from type declarations. | |
97 | For instance, you can write | |
98 | ||
99 | unsigned variable; | |
100 | ||
101 | as shorthand for | |
102 | ||
103 | unsigned int variable; | |
104 | ||
105 | There are several places where this can cause trouble. First, suppose | |
106 | 'variable' is a long; then you might think | |
107 | ||
108 | (unsigned) variable | |
109 | ||
110 | would convert it to unsigned long. It does not. It converts to | |
111 | unsigned int. This mostly causes problems on 64-bit platforms, where | |
112 | long and int are not the same size. | |
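The difference can be sketched with two small helpers. The function names are hypothetical; on a host where long is wider than int the two casts give different results for negative or large values, and on other hosts they agree.

```c
#include <limits.h>

/* Converts to unsigned int first, which may discard the high bits.  */
unsigned long
cast_via_unsigned (long v)
{
  return (unsigned) v;
}

/* Converts directly to unsigned long, keeping every bit.  */
unsigned long
cast_via_unsigned_long (long v)
{
  return (unsigned long) v;
}

/* Nonzero if unsigned long can hold values that unsigned int cannot.  */
int
long_is_wider_than_int (void)
{
  return ULONG_MAX > UINT_MAX;
}

/* Nonzero if both casts give the same result for V.  */
int
casts_agree_for (long v)
{
  return cast_via_unsigned (v) == cast_via_unsigned_long (v);
}
```

For example, casts_agree_for (-1L) is false exactly when long_is_wider_than_int () is true.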
113 | ||
114 | Second, if you write a function definition with no return type at | |
115 | all: | |
116 | ||
117 | operate (int a, int b) | |
118 | { | |
119 | ... | |
120 | } | |
121 | ||
122 | that function is expected to return int, *not* void. GCC will warn | |
123 | about this. | |
124 | ||
125 | Implicit function declarations always have return type int. So if you | |
126 | correct the above definition to | |
127 | ||
128 | void | |
129 | operate (int a, int b) | |
130 | ... | |
131 | ||
132 | but operate() is called above its definition, you will get an error | |
133 | about a "type mismatch with previous implicit declaration". The cure | |
134 | is to prototype all functions at the top of the file, or in an | |
135 | appropriate header. | |
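A short sketch of that cure, reusing the operate example: the prototype comes first, so the call that precedes the definition sees the real return type. The caller run and the variable last_result are hypothetical additions for illustration.

```c
/* Prototype first, so every call below sees the real return type.  */
static void operate (int a, int b);

/* Hypothetical place for operate to leave its result.  */
static int last_result;

/* Calls operate above its definition; this is fine because of the
   prototype.  */
int
run (void)
{
  operate (2, 3);
  return last_result;
}

static void
operate (int a, int b)
{
  last_result = a + b;
}
```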
136 | ||
137 | Char vs unsigned char vs int | |
138 | ---------------------------- | |
139 | ||
140 | In C, unqualified 'char' may be either signed or unsigned; it is the | |
141 | implementation's choice. When you are processing 7-bit ASCII, it does | |
142 | not matter. But when your program must handle arbitrary binary data, | |
143 | or fully 8-bit character sets, you have a problem. The most obvious | |
144 | issue is if you have a look-up table indexed by characters. | |
145 | ||
146 | For instance, the character '\341' in ISO Latin 1 is SMALL LETTER A | |
147 | WITH ACUTE ACCENT. In the proper locale, isalpha('\341') will be | |
148 | true. But if you read '\341' from a file and store it in a plain | |
149 | char, isalpha(c) may look up character 225, or it may look up | |
150 | character -31. And the ctype table has no entry at offset -31, so | |
151 | your program will crash. (If you're lucky.) | |
152 | ||
153 | It is wise to use unsigned char everywhere you possibly can. This | |
154 | avoids all these problems. Unfortunately, the routines in <string.h> | |
155 | take plain char arguments, so you have to remember to cast them back | |
156 | and forth - or avoid the use of strxxx() functions, which is probably | |
157 | a good idea anyway. | |
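When the data has to stay in plain char, for instance because it comes from a <string.h> interface, the usual remedy is a cast at the point of use. The sketch below shows this for isalpha; count_alpha is a hypothetical helper.

```c
#include <ctype.h>

/* Count the alphabetic characters in S.  The cast ensures that a byte
   such as '\341' reaches isalpha as 225 rather than -31.  */
int
count_alpha (const char *s)
{
  int n = 0;

  for (; *s; s++)
    if (isalpha ((unsigned char) *s))
      n++;
  return n;
}
```

Whether '\341' itself counts as alphabetic still depends on the locale in effect; the cast only guarantees that the lookup is in range.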
158 | ||
159 | Another common mistake is to use either char or unsigned char to | |
160 | receive the result of getc() or related stdio functions. They may | |
161 | return EOF, which is outside the range of values representable by | |
162 | char. If you use char, some legal character value may be confused | |
163 | with EOF, such as '\377' (SMALL LETTER Y WITH UMLAUT, in Latin-1). | |
164 | The correct choice is int. | |
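A minimal sketch of a read loop that follows this advice is given below; count_bytes is a hypothetical name. Because c is an int, a '\377' byte in the input is distinct from EOF and does not end the loop early.

```c
#include <stdio.h>

/* Count the bytes remaining in FP.  */
long
count_bytes (FILE *fp)
{
  int c;                        /* int, so that EOF is representable.  */
  long n = 0;

  while ((c = getc (fp)) != EOF)
    n++;
  return n;
}
```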
165 | ||
166 | A more subtle version of the same mistake might look like this: | |
167 | ||
168 | unsigned char pushback[NPUSHBACK]; | |
169 | int pbidx; | |
170 | #define unget(c) (assert(pbidx < NPUSHBACK), pushback[pbidx++] = (c)) | |
171 | #define get(c) (pbidx ? pushback[--pbidx] : getchar()) | |
172 | ... | |
173 | unget(EOF); | |
174 | ||
175 | which will mysteriously turn a pushed-back EOF into a SMALL LETTER Y | |
176 | WITH UMLAUT. | |
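One way to repair the example, sketched here with functions instead of macros, is to make the pushback buffer an array of int so that EOF survives the round trip. The names unget_char and get_char and the value of NPUSHBACK are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdio.h>

#define NPUSHBACK 4             /* Hypothetical capacity.  */

/* int rather than unsigned char, so that EOF can be stored.  */
static int pushback[NPUSHBACK];
static int pbidx;

/* Push C, which may be EOF, back onto the input.  */
void
unget_char (int c)
{
  assert (pbidx < NPUSHBACK);
  pushback[pbidx++] = c;
}

/* Return the most recently pushed-back value, or read from stdin if
   nothing is pending.  */
int
get_char (void)
{
  return pbidx ? pushback[--pbidx] : getchar ();
}
```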
177 | ||
178 | ||
179 | Other common pitfalls | |
180 | --------------------- | |
181 | ||
498ec23d | 182 | o Expecting 'plain' char to be either sign-extending or zero-extending. |
ced2ad76 MK |
183 | |
184 | o Shifting an item by a negative amount or by greater than or equal to | |
185 | the number of bits in a type (expecting shifts by 32 to be sensible | |
186 | has caused quite a number of bugs at least in the early days). | |
187 | ||
188 | o Expecting ints shifted right to be sign extended. | |
189 | ||
190 | o Modifying the same value twice without an intervening sequence point. | |
191 | ||
192 | o Host vs. target floating point representation, including emitting NaNs | |
193 | and Infinities in a form that the assembler handles. | |
194 | ||
195 | o qsort being an unstable sort function (unstable in the sense that | |
196 | multiple items that sort the same may be sorted in different orders | |
197 | by different qsort functions). | |
198 | ||
199 | o Passing incorrect types to fprintf and friends. | |
200 | ||
201 | o Adding a declaration for a function defined in another file to | |
202 | a .c file instead of to a .h file. |
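Two of the pitfalls above, the oversized shift count and the unstable qsort, lend themselves to a short sketch. The helper names, struct item and its seq member are all hypothetical. The shift helper treats a count at or beyond the width of the type as shifting every bit out, and the comparison function breaks ties on an explicit sequence number so that every qsort implementation produces the same order.

```c
#include <limits.h>
#include <stdlib.h>

/* Shift X left by N bits, yielding 0 once N reaches the width of the
   type instead of invoking undefined behavior.  */
unsigned long
shift_left (unsigned long x, unsigned int n)
{
  if (n >= sizeof (unsigned long) * CHAR_BIT)
    return 0;
  return x << n;
}

/* Hypothetical record: KEY is the sort key, SEQ the original
   position, used only to break ties.  */
struct item
{
  int key;
  int seq;
};

/* Compare by key, then by original position, so that no two distinct
   items ever compare equal.  */
static int
item_cmp (const void *pa, const void *pb)
{
  const struct item *a = (const struct item *) pa;
  const struct item *b = (const struct item *) pb;

  if (a->key != b->key)
    return a->key < b->key ? -1 : 1;
  return (a->seq > b->seq) - (a->seq < b->seq);
}

/* Sort the N items in V into a fully determined order.  */
void
sort_items (struct item *v, size_t n)
{
  qsort (v, n, sizeof *v, item_cmp);
}
```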