..
  Copyright 1988-2022 Free Software Foundation, Inc.
  This is part of the GCC manual.
  For copying conditions, see the copyright.rst file.

.. index:: problems with macros, pitfalls of macros

.. _macro-pitfalls:

Macro Pitfalls
**************

In this section we describe some special rules that apply to macros and
macro expansion, and point out certain cases in which the rules have
counter-intuitive consequences that you must watch out for.

.. toctree::
  :maxdepth: 2

.. _misnesting:

Misnesting
^^^^^^^^^^

When a macro is called with arguments, the arguments are substituted
into the macro body and the result is checked, together with the rest of
the input file, for more macro calls. It is possible to piece together
a macro call coming partially from the macro body and partially from the
arguments. For example,

.. code-block::

  #define twice(x) (2*(x))
  #define call_with_1(x) x(1)
  call_with_1 (twice)
       → twice(1)
       → (2*(1))

Macro definitions do not have to have balanced parentheses. By writing
an unbalanced open parenthesis in a macro body, it is possible to create
a macro call that begins inside the macro body but ends outside of it.
For example,

.. code-block::

  #define strange(file) fprintf (file, "%s %d",
  ...
  strange(stderr) p, 35)
       → fprintf (stderr, "%s %d", p, 35)

The ability to piece together a macro call can be useful, but the use of
unbalanced open parentheses in a macro body is just confusing, and
should be avoided.
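
The pieced-together call can be checked in a compilable form. The sketch below reuses ``twice`` and ``call_with_1`` from the text; the function ``pieced_together`` is a hypothetical wrapper added only so that the result can be inspected.

```c
/* Macros taken from the example in the text. */
#define twice(x) (2*(x))
#define call_with_1(x) x(1)

/* Hypothetical wrapper: the name `twice` comes from the argument and
   the "(1)" comes from the body of call_with_1.  On rescanning, the
   two pieces form the call twice(1), which expands to (2*(1)).  */
static int pieced_together(void)
{
  return call_with_1 (twice);
}
```

Here the balanced form is used; the unbalanced ``strange`` variant is deliberately left out, since the text advises against it.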

.. index:: parentheses in macro bodies

.. _operator-precedence-problems:

Operator Precedence Problems
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You may have noticed that in most of the macro definition examples shown
above, each occurrence of a macro argument name had parentheses around
it. In addition, another pair of parentheses usually surrounds the
entire macro definition. Here is why it is best to write macros that
way.

Suppose you define a macro as follows,

.. code-block:: c++

  #define ceil_div(x, y) (x + y - 1) / y

whose purpose is to divide, rounding up. (One use for this operation is
to compute how many ``int`` objects are needed to hold a certain
number of ``char`` objects.) Then suppose it is used as follows:

.. code-block::

  a = ceil_div (b & c, sizeof (int));
       → a = (b & c + sizeof (int) - 1) / sizeof (int);

This does not do what is intended. The operator-precedence rules of
C make it equivalent to this:

.. code-block:: c++

  a = (b & (c + sizeof (int) - 1)) / sizeof (int);

What we want is this:

.. code-block:: c++

  a = ((b & c) + sizeof (int) - 1) / sizeof (int);

Defining the macro as

.. code-block:: c++

  #define ceil_div(x, y) ((x) + (y) - 1) / (y)

provides the desired result.

Unintended grouping can result in another way. Consider
``sizeof ceil_div(1, 2)``. That has the appearance of a C expression that
would compute the size of the type of ``ceil_div (1, 2)``, but in fact it
means something very different. Here is what it expands to:

.. code-block:: c++

  sizeof ((1) + (2) - 1) / (2)

This would take the size of an integer and divide it by two. The
precedence rules have put the division outside the ``sizeof`` when it
was intended to be inside.

Parentheses around the entire macro definition prevent such problems.
Here, then, is the recommended way to define ``ceil_div``:

.. code-block:: c++

  #define ceil_div(x, y) (((x) + (y) - 1) / (y))
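
The same hazard applies to the second parameter. As a minimal sketch, the hypothetical macro ``ceil_div_unsafe`` below parenthesizes its whole body but not its parameters, and is compared with the recommended ``ceil_div``. The wrapper functions and the argument values are illustrative assumptions, not part of the manual.

```c
/* Hypothetical variant: outer parentheses only, parameters bare. */
#define ceil_div_unsafe(x, y) ((x + y - 1) / y)

/* The recommended definition from the text. */
#define ceil_div(x, y) (((x) + (y) - 1) / (y))

/* Expands to ((10 + 2 + 2 - 1) / 2 + 2): the division binds to the
   first 2 only, and the second 2 is added afterwards.  */
static int unsafe_result(void)
{
  return ceil_div_unsafe (10, 2 + 2);
}

/* Expands to (((10) + (2 + 2) - 1) / (2 + 2)): 13 divided by 4.  */
static int safe_result(void)
{
  return ceil_div (10, 2 + 2);
}
```

With these values the unsafe form yields 8 while the fully parenthesized form yields 3, the intended result of dividing 10 by 4 and rounding up.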

.. index:: semicolons (after macro calls)

.. _swallowing-the-semicolon:

Swallowing the Semicolon
^^^^^^^^^^^^^^^^^^^^^^^^

Often it is desirable to define a macro that expands into a compound
statement. Consider, for example, the following macro, which advances a
pointer (the argument ``p`` says where to find it) across whitespace
characters:

.. code-block:: c++

  #define SKIP_SPACES(p, limit)  \
  { char *lim_ = (limit);        \
    while (p < lim_) {           \
      if (*p++ != ' ') {         \
        p--; break; }}}

Here backslash-newline is used to split the macro definition, which must
be a single logical line, so that it resembles the way such code would
be laid out if not part of a macro definition. The local variable has a
trailing underscore so that an argument named ``lim``, as in the call
below, does not refer to it.

A call to this macro might be ``SKIP_SPACES (p, lim)``. Strictly
speaking, the call expands to a compound statement, which is a complete
statement with no need for a semicolon to end it. However, since it
looks like a function call, it minimizes confusion if you can use it
like a function call, writing a semicolon afterward, as in
``SKIP_SPACES (p, lim);``

This can cause trouble before an ``else`` clause, because the
semicolon is actually a null statement. Suppose you write

.. code-block:: c++

  if (*p != 0)
    SKIP_SPACES (p, lim);
  else ...

The presence of two statements (the compound statement and a null
statement) between the ``if`` condition and the ``else`` makes the
code invalid C.

The definition of the macro ``SKIP_SPACES`` can be altered to solve
this problem, using a ``do ... while`` statement. Here is how:

.. code-block:: c++

  #define SKIP_SPACES(p, limit)     \
  do { char *lim_ = (limit);        \
       while (p < lim_) {           \
         if (*p++ != ' ') {         \
           p--; break; }}}          \
  while (0)

Now ``SKIP_SPACES (p, lim);`` expands into

.. code-block:: c++

  do {...} while (0);

which is one statement. The loop executes exactly once; most compilers
generate no extra code for it.
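
A compilable sketch of the ``do ... while (0)`` form used before an ``else`` follows. It makes a few assumptions that are not in the manual: the pointers are ``const char *`` so that string literals can be scanned, the macro's local is called ``lim_`` to keep it distinct from any caller variable, and ``first_nonspace`` is a hypothetical helper written for this illustration.

```c
#include <stddef.h>
#include <string.h>

/* do ... while (0) variant of the macro; the local is named lim_
   (an assumption of this sketch) to avoid clashing with callers.  */
#define SKIP_SPACES(p, limit)          \
  do { const char *lim_ = (limit);     \
       while (p < lim_) {              \
         if (*p++ != ' ') {            \
           p--; break; }}}             \
  while (0)

/* Hypothetical helper: returns a pointer to the first character of s
   that is not a space, or NULL when s is the empty string.  */
static const char *first_nonspace(const char *s)
{
  const char *p = s;
  const char *end = s + strlen (s);

  if (*p != 0)
    SKIP_SPACES (p, end);   /* one statement, so the else still pairs */
  else
    p = NULL;
  return p;
}
```

With the brace-only definition, the ``if``/``else`` in ``first_nonspace`` would not compile, because the semicolon after the macro call would separate the ``if`` from its ``else``.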

.. index:: side effects (in macro arguments), unsafe macros

.. _duplication-of-side-effects:

Duplication of Side Effects
^^^^^^^^^^^^^^^^^^^^^^^^^^^

Many C programs define a macro ``min``, for 'minimum', like this:

.. code-block:: c++

  #define min(X, Y)  ((X) < (Y) ? (X) : (Y))

When you use this macro with an argument containing a side effect,
as shown here,

.. code-block:: c++

  next = min (x + y, foo (z));

it expands as follows:

.. code-block:: c++

  next = ((x + y) < (foo (z)) ? (x + y) : (foo (z)));

where ``x + y`` has been substituted for ``X`` and ``foo (z)``
for ``Y``.

The function ``foo`` is used only once in the statement as it appears
in the program, but the expression ``foo (z)`` has been substituted
twice into the macro expansion. As a result, ``foo`` might be called
two times when the statement is executed. If it has side effects or if
it takes a long time to compute, the results might not be what you
intended. We say that ``min`` is an :dfn:`unsafe` macro.

The best solution to this problem is to define ``min`` in a way that
computes the value of ``foo (z)`` only once. The C language offers
no standard way to do this, but it can be done with GNU extensions as
follows:

.. code-block:: c++

  #define min(X, Y)                \
  ({ typeof (X) x_ = (X);          \
     typeof (Y) y_ = (Y);          \
     (x_ < y_) ? x_ : y_; })

The :samp:`({ ... })` notation produces a compound statement that
acts as an expression. Its value is the value of its last statement.
This permits us to define local variables and assign each argument to
one. The local variables have underscores after their names to reduce
the risk of conflict with an identifier of wider scope (it is impossible
to avoid this entirely). Now each argument is evaluated exactly once.

If you do not wish to use GNU C extensions, the only solution is to be
careful when *using* the macro ``min``. For example, you can
calculate the value of ``foo (z)``, save it in a variable, and use
that variable in ``min``:

.. code-block:: c++

  #define min(X, Y)  ((X) < (Y) ? (X) : (Y))
  ...
  {
    int tem = foo (z);
    next = min (x + y, tem);
  }

(where we assume that ``foo`` returns type ``int``).
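
The double evaluation can be observed by counting calls. In the sketch below, the counter ``foo_calls``, the body of ``foo``, and the two wrapper functions are hypothetical additions for illustration; only the ``min`` macro and the ``tem`` workaround come from the text. It uses standard C only, so it does not exercise the GNU statement-expression version.

```c
/* The unsafe macro from the text. */
#define min(X, Y)  ((X) < (Y) ? (X) : (Y))

/* Hypothetical instrumentation: counts how often foo runs. */
static int foo_calls;

static int foo(int z)
{
  foo_calls++;
  return z;
}

/* foo (z) appears twice in the expansion: once in the comparison and
   once more if the second operand is selected.  */
static int calls_when_passed_directly(int a, int z)
{
  foo_calls = 0;
  int next = min (a, foo (z));
  (void) next;
  return foo_calls;
}

/* The workaround from the text: evaluate once, then pass the copy. */
static int calls_when_saved_first(int a, int z)
{
  foo_calls = 0;
  int tem = foo (z);
  int next = min (a, tem);
  (void) next;
  return foo_calls;
}
```

Note that the number of calls in the direct case depends on the values: ``foo`` runs twice only when its result is the one selected, which is why the text says it *might* be called two times.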

.. index:: self-reference

.. _self-referential-macros:

Self-Referential Macros
^^^^^^^^^^^^^^^^^^^^^^^

A :dfn:`self-referential` macro is one whose name appears in its
definition. Recall that all macro definitions are rescanned for more
macros to replace. If the self-reference were considered a use of the
macro, it would produce an infinitely large expansion. To prevent this,
the self-reference is not considered a macro call. It is passed into
the preprocessor output unchanged. Consider an example:

.. code-block:: c++

  #define foo (4 + foo)

where ``foo`` is also a variable in your program.

Under the ordinary rules, each reference to ``foo`` would expand
into ``(4 + foo)``; then this would be rescanned and would expand into
``(4 + (4 + foo))``; and so on until the computer ran out of memory.

The self-reference rule cuts this process short after one step, at
``(4 + foo)``. Therefore, this macro definition has the possibly
useful effect of causing the program to add 4 to the value of ``foo``
wherever ``foo`` is referred to.

In most cases, it is a bad idea to take advantage of this feature. A
person reading the program who sees that ``foo`` is a variable will
not expect that it is a macro as well. The reader will come across the
identifier ``foo`` in the program and think its value should be that
of the variable ``foo``, whereas in fact the value is four greater.

One common and useful application of self-reference is to create a macro
which expands to itself. If you write

.. code-block:: c++

  #define EPERM EPERM

then the macro ``EPERM`` expands to ``EPERM``. Effectively, it is
left alone by the preprocessor whenever it's used in running text. You
can tell that it's a macro with :samp:`#ifdef`. You might do this if you
want to define numeric constants with an ``enum``, but have
:samp:`#ifdef` be true for each constant.

If a macro ``x`` expands to use a macro ``y``, and the expansion of
``y`` refers to the macro ``x``, that is an :dfn:`indirect
self-reference` of ``x``. ``x`` is not expanded in this case
either. Thus, if we have

.. code-block:: c++

  #define x (4 + y)
  #define y (2 * x)

then ``x`` and ``y`` expand as follows:

.. code-block::

  x    → (4 + y)
       → (4 + (2 * x))

  y    → (2 * x)
       → (2 * (4 + y))

Each macro is expanded when it appears in the definition of the other
macro, but not when it indirectly appears in its own definition.
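
Both uses of direct self-reference can be tried in a small translation unit. The following is a sketch under stated assumptions: the initial value of ``foo``, the enumeration ``hypothetical_error`` with its constant ``E_DENIED`` (standing in for ``EPERM`` so as not to clash with ``<errno.h>``), and the two accessor functions are all invented for illustration.

```c
/* The variable has to be declared before the macro is defined;
   afterwards every mention of foo is rewritten.  */
static int foo = 10;
#define foo (4 + foo)       /* the inner foo is not expanded again */

/* Hypothetical stand-in for the EPERM idiom: an enum constant that
   is also visible to #ifdef.  */
enum hypothetical_error { E_DENIED = 1 };
#define E_DENIED E_DENIED

/* Expands to: return (4 + foo);  */
static int read_foo(void)
{
  return foo;
}

/* The #ifdef succeeds because E_DENIED is a macro, and the value
   returned is still the enum constant.  */
static int denied_if_defined(void)
{
#ifdef E_DENIED
  return E_DENIED;
#else
  return 0;
#endif
}
```

Reading ``foo`` through ``read_foo`` gives 14 rather than 10, which illustrates why the text calls this use surprising for readers.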

.. index:: expansion of arguments, macro argument expansion, prescan of macro arguments

.. _argument-prescan:

Argument Prescan
^^^^^^^^^^^^^^^^

Macro arguments are completely macro-expanded before they are
substituted into a macro body, unless they are stringized or pasted
with other tokens. After substitution, the entire macro body, including
the substituted arguments, is scanned again for macros to be expanded.
The result is that the arguments are scanned *twice* to expand
macro calls in them.

Most of the time, this has no effect. If the argument contained any
macro calls, they are expanded during the first scan. The result
therefore contains no macro calls, so the second scan does not change
it. If the argument were substituted as given, with no prescan, the
single remaining scan would find the same macro calls and produce the
same results.

You might expect the double scan to change the results when a
self-referential macro is used in an argument of another macro
(see :ref:`self-referential-macros`): the self-referential macro would be
expanded once in the first scan, and a second time in the second scan.
However, this is not what happens. The self-references that do not
expand in the first scan are marked so that they will not expand in the
second scan either.

You might wonder, 'Why mention the prescan, if it makes no difference?
And why not skip it and make the preprocessor faster?' The answer is
that the prescan does make a difference in three special cases:

* Nested calls to a macro.

  We say that :dfn:`nested` calls to a macro occur when a macro's argument
  contains a call to that very macro. For example, if ``f`` is a macro
  that expects one argument, ``f (f (1))`` is a nested pair of calls to
  ``f``. The desired expansion is made by expanding ``f (1)`` and
  substituting that into the definition of ``f``. The prescan causes
  the expected result to happen. Without the prescan, ``f (1)`` itself
  would be substituted as an argument, and the inner use of ``f`` would
  appear during the main scan as an indirect self-reference and would not
  be expanded.

* Macros that call other macros that stringize or concatenate.

  If an argument is stringized or concatenated, the prescan does not
  occur. If you *want* to expand a macro, then stringize or
  concatenate its expansion, you can do that by causing one macro to call
  another macro that does the stringizing or concatenation. For
  instance, if you have

  .. code-block:: c++

    #define AFTERX(x) X_ ## x
    #define XAFTERX(x) AFTERX(x)
    #define TABLESIZE 1024
    #define BUFSIZE TABLESIZE

  then ``AFTERX(BUFSIZE)`` expands to ``X_BUFSIZE``, and
  ``XAFTERX(BUFSIZE)`` expands to ``X_1024``. (Not to
  ``X_TABLESIZE``. Prescan always does a complete expansion.)

* Macros used in arguments, whose expansions contain unshielded commas.

  This can cause a macro expanded on the second scan to be called with the
  wrong number of arguments. Here is an example:

  .. code-block:: c++

    #define foo  a,b
    #define bar(x) lose(x)
    #define lose(x) (1 + (x))

  We would like ``bar(foo)`` to turn into ``(1 + (foo))``, which
  would then turn into ``(1 + (a,b))``. Instead, ``bar(foo)``
  expands into ``lose(a,b)``, and you get an error because ``lose``
  requires a single argument. In this case, the problem is easily solved
  by the same parentheses that ought to be used to prevent misnesting of
  arithmetic operations:

  .. code-block::

    #define foo (a,b)
    or
    #define bar(x) lose((x))

  The extra pair of parentheses prevents the comma in ``foo``'s
  definition from being interpreted as an argument separator.
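
The second case in the list above can be verified by giving the pasted names something to refer to. In this sketch the macros ``AFTERX``, ``XAFTERX``, ``TABLESIZE`` and ``BUFSIZE`` come from the text; the variables ``X_BUFSIZE`` and ``X_1024``, the stringizing helpers ``STR`` and ``XSTR``, and the four accessor functions are hypothetical additions.

```c
#define AFTERX(x) X_ ## x
#define XAFTERX(x) AFTERX(x)
#define TABLESIZE 1024
#define BUFSIZE TABLESIZE

/* Hypothetical helpers: the same two-level trick for stringizing. */
#define STR(x) #x           /* argument is stringized: no prescan   */
#define XSTR(x) STR(x)      /* argument is prescanned, then passed  */

/* Hypothetical variables named after the two possible paste results. */
static int X_BUFSIZE = 1;
static int X_1024 = 2;

/* Pasted without prescan: the token is X_BUFSIZE. */
static int pasted_directly(void) { return AFTERX(BUFSIZE); }

/* BUFSIZE is fully expanded to 1024 first: the token is X_1024. */
static int pasted_after_prescan(void) { return XAFTERX(BUFSIZE); }

/* Yields the string "BUFSIZE". */
static const char *stringized_directly(void) { return STR(BUFSIZE); }

/* Yields the string "1024". */
static const char *stringized_after_prescan(void) { return XSTR(BUFSIZE); }
```

The extra macro level is the whole technique: the outer macro neither pastes nor stringizes its parameter, so the argument is prescanned there before the inner macro sees it.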

.. index:: newlines in macro arguments

.. _newlines-in-arguments:

Newlines in Arguments
^^^^^^^^^^^^^^^^^^^^^

The invocation of a function-like macro can extend over many logical
lines. However, in the present implementation, the entire expansion
comes out on one line. Thus line numbers emitted by the compiler or
debugger refer to the line the invocation started on, which might be
different from the line containing the argument causing the problem.

Here is an example illustrating this:

.. code-block:: c++

  #define ignore_second_arg(a,b,c) a; c

  ignore_second_arg (foo (),
                     ignored (),
                     syntax error);

The syntax error triggered by the tokens ``syntax error`` results in
an error message citing line three (the line of ``ignore_second_arg``)
even though the problematic code comes from line five.

We consider this a bug, and intend to fix it in the near future.