..
  Copyright 1988-2022 Free Software Foundation, Inc.
  This is part of the GCC manual.
  For copying conditions, see the copyright.rst file.

.. _initial-processing:


The preprocessor performs a series of textual transformations on its
input. These happen before all other processing. Conceptually, they
happen in a rigid order, and the entire file is run through each
transformation before the next one begins. In practice, CPP performs
them all at once for performance reasons. These transformations
correspond roughly to the first three 'phases of translation' described
in the C standard.


.. index:: line endings

* The input file is read into memory and broken into lines.

  Different systems use different conventions to indicate the end of a
  line. GCC accepts the ASCII control sequences LF, CR LF, and CR as
  end-of-line markers. These are the canonical sequences used by Unix,
  by DOS and VMS, and by the classic Mac OS (before OS X), respectively.
  You may therefore safely copy source code written on any of those
  systems to a different one and use it without conversion. (GCC may
  lose track of the current line number if a file doesn't consistently
  use one convention, as sometimes happens when it is edited on
  computers with different conventions that share a network file
  system.)

  If the last line of any input file lacks an end-of-line marker, the end
  of the file is considered to implicitly supply one. The C standard says
  that this condition provokes undefined behavior, so GCC will emit a
  warning message.

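
The line-splitting rule described above can be sketched in C. This is a minimal illustration and not GCC's implementation; the function name ``count_lines`` and its interface are assumptions made for this example. It treats LF, CR LF, and CR as end-of-line markers and counts an unterminated final line as though the end of the file supplied the marker.

```c
#include <stddef.h>

/* Count the lines in buf[0..len), accepting LF, CR LF, and CR as
   end-of-line markers.  A final line without a marker still counts,
   mirroring the implicit marker at end of file. */
size_t count_lines(const char *buf, size_t len)
{
    size_t lines = 0;
    size_t i = 0;
    int pending = 0;  /* nonzero while the current line has no marker yet */

    while (i < len) {
        if (buf[i] == '\r') {
            /* CR LF is a single marker; a lone CR is a marker too. */
            if (i + 1 < len && buf[i + 1] == '\n')
                i++;
            lines++;
            pending = 0;
        } else if (buf[i] == '\n') {
            lines++;
            pending = 0;
        } else {
            pending = 1;
        }
        i++;
    }
    return lines + (pending ? 1 : 0);
}
```

For instance, ``count_lines("a\r\nb\rc\nd", 8)`` returns 4 under these rules: each of the three conventions ends one line, and the final ``d`` is implicitly terminated.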

* If trigraphs are enabled, they are replaced by their
  corresponding single characters. By default GCC ignores trigraphs,
  but if you request a strictly conforming mode with the :option:`-std`
  option, or you specify the :option:`-trigraphs` option, then it
  converts them.

  Trigraphs are nine three-character sequences, all starting with
  :samp:`??`, that are defined by ISO C to stand for single characters.
  They permit obsolete systems that lack some of C's punctuation to use
  C. For example, :samp:`??/` stands for :samp:`\\`, so ``'??/n'`` is a
  character constant for a newline.

  Trigraphs are not popular and many compilers implement them
  incorrectly. Portable code should not rely on trigraphs being either
  converted or ignored. With :option:`-Wtrigraphs` GCC will warn you
  when a trigraph would change the meaning of your program if it were
  converted. See :ref:`wtrigraphs`.

  In a string constant, you can prevent a sequence of question marks
  from being confused with a trigraph by inserting a backslash between
  the question marks, or by splitting the string literal at the
  trigraph and relying on string literal concatenation. For example,
  ``"(??\?)"`` is the string :samp:`(???)`, not :samp:`(?]`. Traditional
  C compilers do not recognize these idioms.


  The nine trigraphs and their replacements are::

    Trigraph:       ??(  ??)  ??<  ??>  ??=  ??/  ??'  ??!  ??-
    Replacement:      [    ]    {    }    #    \    ^    |    ~

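
As a rough sketch of the replacement step, and without assuming anything about how GCC implements it, the function below copies a string and converts each of the nine trigraphs listed above. The name ``replace_trigraphs`` and the caller-supplied output buffer are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Copy src to dst, replacing each of the nine trigraphs with its single
   character.  dst must have room for strlen(src) + 1 bytes. */
void replace_trigraphs(const char *src, char *dst)
{
    /* Third characters of the trigraphs and, at the same index, their
       replacements.  No trigraph is spelled out in this file, so it
       compiles the same way with or without trigraph conversion. */
    static const char third[] = "()<>=/'!-";
    static const char repl[]  = "[]{}#\\^|~";

    while (*src) {
        const char *hit = NULL;

        if (src[0] == '?' && src[1] == '?' && src[2] != '\0')
            hit = strchr(third, src[2]);
        if (hit) {
            *dst++ = repl[hit - third];
            src += 3;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

When calling this from C source, it is safest to write the question marks in test strings with a backslash between them, as in ``"?\?=define X"``, so that the compiler itself never sees a trigraph. Given the five characters of :samp:`(???)` as input, the sketch produces :samp:`(?]`, which matches the behavior described for string constants.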

.. index:: continued lines, backslash-newline

* Continued lines are merged into one long line.

  A continued line is a line that ends with a backslash, :samp:`\\`. The
  backslash is removed and the following line is joined with the current
  one. No space is inserted, so you may split a line anywhere, even in
  the middle of a word. (It is generally more readable to split lines
  only at white space.)

  The trailing backslash on a continued line is commonly referred to as a
  :dfn:`backslash-newline`.

  If there is white space between a backslash and the end of a line, that
  is still a continued line. However, as this is usually the result of an
  editing mistake, and many compilers will not accept it as a continued
  line, GCC will warn you about it.

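
The merging of continued lines can be sketched as a simple copy loop. This is an illustration only: the name ``splice_lines`` is an assumption, lines are assumed to end in LF, and no warning is issued for white space after the backslash.

```c
/* Copy src to dst with every backslash-newline removed, so that each
   continued line is joined to the line after it.  Spaces or tabs
   between the backslash and the newline are dropped as well.  dst must
   have room for strlen(src) + 1 bytes. */
void splice_lines(const char *src, char *dst)
{
    while (*src) {
        if (*src == '\\') {
            const char *p = src + 1;

            while (*p == ' ' || *p == '\t')
                p++;
            if (*p == '\n') {
                src = p + 1;  /* skip backslash, white space, newline */
                continue;
            }
        }
        *dst++ = *src++;
    }
    *dst = '\0';
}
```

Because no space is inserted at the join, splitting ``FOO`` as ``FO`` backslash-newline ``O`` yields the single word ``FOO`` again.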

.. index:: comments, line comments, block comments

* All comments are replaced with single spaces.

  There are two kinds of comments. :dfn:`Block comments` begin with
  :samp:`/*` and continue until the next :samp:`*/`. Block comments do not
  nest:

  .. code-block:: c

     /* this is /* one comment */ text outside comment

  :dfn:`Line comments` begin with :samp:`//` and continue to the end of the
  current line. Line comments do not nest either, but it does not matter,
  because they would end in the same place anyway.

  .. code-block:: c

     // this is // one comment


  It is safe to put line comments inside block comments, or vice versa.

  .. code-block:: c

     /* block comment
        // contains line comment
        yet more comment
      */ outside comment

     // line comment /* contains block comment */

  But beware of commenting out one end of a block comment with a line
  comment.

  .. code-block:: c

     // l.c. /* block comment begins
        oops! this isn't a comment anymore */


  Comments are not recognized within string literals.
  ``"/* blah */"`` is the string constant :samp:`/\* blah \*/`, not
  an empty string.

  Line comments are not in the 1989 edition of the C standard, but they
  are recognized by GCC as an extension. In C++ and in the 1999 edition
  of the C standard, they are an official part of the language.

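
The comment rules above can be sketched as a single pass over the text. This is a simplified illustration rather than GCC's implementation; the name ``strip_comments`` is an assumption, and the input is assumed to have had its continued lines merged already.

```c
/* Copy src to dst, replacing every comment with one space.  Text inside
   string and character literals is copied unchanged.  dst must have
   room for strlen(src) + 1 bytes. */
void strip_comments(const char *src, char *dst)
{
    while (*src) {
        if (src[0] == '/' && src[1] == '*') {
            /* Block comment: ends at the first closing delimiter, so
               block comments do not nest. */
            src += 2;
            while (*src && !(src[0] == '*' && src[1] == '/'))
                src++;
            if (*src)
                src += 2;
            *dst++ = ' ';
        } else if (src[0] == '/' && src[1] == '/') {
            /* Line comment: runs to the end of the line; the newline
               itself is kept. */
            while (*src && *src != '\n')
                src++;
            *dst++ = ' ';
        } else if (*src == '"' || *src == '\'') {
            /* Literal: copy through the matching quote. */
            char quote = *src;

            *dst++ = *src++;
            while (*src && *src != quote && *src != '\n') {
                if (*src == '\\' && src[1] != '\0')
                    *dst++ = *src++;  /* keep the backslash of an escape */
                *dst++ = *src++;
            }
            if (*src == quote)
                *dst++ = *src++;
        } else {
            *dst++ = *src++;
        }
    }
    *dst = '\0';
}
```

Applied to ``x/**/y`` the sketch produces ``x y``, and a string constant such as ``"/* blah */"`` passes through unchanged.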

Since these transformations happen before all other processing, you can
split a line mechanically with backslash-newline anywhere. You can
comment out the end of a line. You can continue a line comment onto the
next line with backslash-newline. You can even split :samp:`/*`,
:samp:`*/`, and :samp:`//` onto multiple lines with backslash-newline.

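
One possible illustration of these tricks, offered as a sketch and not as recommended practice: in the C fragment below, backslash-newline splits a comment delimiter, the directive name, the macro name, and the macro body, and two block comments surround the ``#``. The helper ``foo_value`` is hypothetical and exists only so that the resulting macro can be inspected.

```c
/\
*
*/ # /*
*/ defi\
ne FO\
O 10\
20

/* Hypothetical helper, not part of the trick: exposes the macro's value. */
int foo_value(void)
{
    return FOO;
}
```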

A directive split up in this way can still be equivalent to
``#define FOO 1020``. All these tricks are extremely confusing and
should not be used in code intended to be readable.

There is no way to prevent a backslash at the end of a line from being
interpreted as a backslash-newline. This cannot affect any correct
program, however.