..
  Copyright 1988-2022 Free Software Foundation, Inc.
  This is part of the GCC manual.
  For copying conditions, see the copyright.rst file.

.. _initial-processing:

Initial processing
******************

The preprocessor performs a series of textual transformations on its
input. These happen before all other processing. Conceptually, they
happen in a rigid order, and the entire file is run through each
transformation before the next one begins. CPP actually does them
all at once, for performance reasons. These transformations correspond
roughly to the first three 'phases of translation' described in the C
standard.

.. index:: line endings

* The input file is read into memory and broken into lines.

  Different systems use different conventions to indicate the end of a
  line. GCC accepts the ASCII control sequences ``LF``, ``CR LF`` and
  ``CR`` as end-of-line markers. These are the canonical sequences used
  by Unix, DOS and VMS, and the classic Mac OS (before OS X)
  respectively. You may therefore safely copy source code written
  on any of those systems to a different one and use it without
  conversion. (GCC may lose track of the current line number if a file
  doesn't consistently use one convention, as sometimes happens when it
  is edited on computers with different conventions that share a network
  file system.)

  If the last line of any input file lacks an end-of-line marker, the end
  of the file is considered to implicitly supply one. The C standard says
  that this condition provokes undefined behavior, so GCC will emit a
  warning message.

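  The line-splitting rules above can be sketched in a few lines of C.
  This is only an illustration, not how CPP is implemented:
  ``count_lines`` is a hypothetical helper that accepts LF, CR LF and CR
  as end-of-line markers and treats the end of the buffer as an implicit
  marker for an unterminated last line.

  .. code-block:: c

    #include <stddef.h>

    /* Hypothetical helper: count the lines in the LEN bytes at BUF,
       accepting LF, CR LF and CR as end-of-line markers.  A last line
       without a marker still counts, as if the end of the file
       supplied one.  */
    size_t
    count_lines (const char *buf, size_t len)
    {
      size_t lines = 0;
      size_t i = 0;

      while (i < len)
        {
          if (buf[i] == '\r')
            {
              /* CR LF is a single marker, not two.  */
              if (i + 1 < len && buf[i + 1] == '\n')
                i++;
              lines++;
            }
          else if (buf[i] == '\n')
            lines++;
          i++;
        }

      /* Unterminated last line: the end of the buffer ends it.  */
      if (len > 0 && buf[len - 1] != '\n' && buf[len - 1] != '\r')
        lines++;

      return lines;
    }

  For example, ``count_lines ("a\r\nb", 4)`` is 2: the CR LF pair ends
  the first line and the end of the buffer ends the second.
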
.. index:: trigraphs

.. _trigraphs:

* If trigraphs are enabled, they are replaced by their
  corresponding single characters. By default GCC ignores trigraphs,
  but if you request a strictly conforming mode with the :option:`-std`
  option, or you specify the :option:`-trigraphs` option, then it
  converts them.

  Trigraphs are nine three-character sequences, all starting with
  :samp:`??`, that are defined by ISO C to stand for single characters.
  They permit obsolete systems that lack some of C's punctuation to use
  C. For example, :samp:`??/` stands for :samp:`\\`, so ``'??/n'`` is a
  character constant for a newline.

  Trigraphs are not popular and many compilers implement them
  incorrectly. Portable code should not rely on trigraphs being either
  converted or ignored. With :option:`-Wtrigraphs` GCC will warn you
  when a trigraph would change the meaning of your program if it were
  converted. See :ref:`wtrigraphs`.

  In a string constant, you can prevent a sequence of question marks
  from being confused with a trigraph by inserting a backslash between
  the question marks, or by separating the string literal at the
  trigraph and making use of string literal concatenation. ``"(??\?)"``
  is the string :samp:`(???)`, not :samp:`(?]`. Traditional C compilers
  do not recognize these idioms.

  The nine trigraphs and their replacements are

  .. code-block::

    Trigraph:       ??(  ??)  ??<  ??>  ??=  ??/  ??'  ??!  ??-
    Replacement:      [    ]    {    }    #    \    ^    |    ~

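  The table can be turned into a small lookup, sketched below in C.
  ``replace_trigraphs`` and the two arrays are hypothetical names, and
  this is not how GCC implements the conversion. The source of the
  sketch deliberately never writes two question marks in a row, so it
  should compile the same whether or not trigraphs are enabled.

  .. code-block:: c

    #include <stddef.h>
    #include <string.h>

    /* Third character of each trigraph and the character it stands
       for, in the same order as the table above.  */
    static const char trigraph_third[] = "()<>=/'!-";
    static const char trigraph_repl[] = "[]{}#\\^|~";

    /* Hypothetical helper: copy SRC to DST, replacing each trigraph by
       its single-character equivalent.  DST needs room for
       strlen (SRC) + 1 bytes.  Returns the length of the result.  */
    size_t
    replace_trigraphs (const char *src, char *dst)
    {
      size_t out = 0;

      while (*src != '\0')
        {
          const char *hit = NULL;

          /* Two question marks followed by one of the nine characters.  */
          if (src[0] == '?' && src[1] == '?' && src[2] != '\0')
            hit = strchr (trigraph_third, src[2]);

          if (hit != NULL)
            {
              dst[out++] = trigraph_repl[hit - trigraph_third];
              src += 3;
            }
          else
            dst[out++] = *src++;
        }

      dst[out] = '\0';
      return out;
    }

  Applied to the text ``(???)``, this sketch produces ``(?]``, which is
  the confusion the backslash idiom avoids; the text ``(??\?)`` passes
  through unchanged.
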
.. index:: continued lines, backslash-newline

* Continued lines are merged into one long line.

  A continued line is a line which ends with a backslash, :samp:`\\`. The
  backslash is removed and the following line is joined with the current
  one. No space is inserted, so you may split a line anywhere, even in
  the middle of a word. (It is generally more readable to split lines
  only at white space.)

  The trailing backslash on a continued line is commonly referred to as a
  :dfn:`backslash-newline`.

  If there is white space between a backslash and the end of a line, that
  is still a continued line. However, as this is usually the result of an
  editing mistake, and many compilers will not accept it as a continued
  line, GCC will warn you about it.

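  As a rough model of this step, the hypothetical helper
  ``splice_lines`` below deletes every backslash-newline, together with
  any blanks between the backslash and the newline. It assumes that line
  endings have already been normalized to LF, and it does not emit the
  warning that GCC gives for the blank-padded case.

  .. code-block:: c

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical helper: copy SRC to DST with continued lines merged.
       DST needs room for strlen (SRC) + 1 bytes.  Returns the length
       of the result.  */
    size_t
    splice_lines (const char *src, char *dst)
    {
      size_t out = 0;

      while (*src != '\0')
        {
          if (*src == '\\')
            {
              /* Look past any blanks that follow the backslash.  */
              const char *p = src + 1 + strspn (src + 1, " \t");

              if (*p == '\n')
                {
                  /* Backslash-newline: drop it and join the lines.  */
                  src = p + 1;
                  continue;
                }
            }
          dst[out++] = *src++;
        }

      dst[out] = '\0';
      return out;
    }

  Because nothing is inserted at the join, the three physical lines
  ``ab\``, ``cd\`` and ``ef`` come out as the single line ``abcdef``.
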
.. index:: comments, line comments, block comments

* All comments are replaced with single spaces.

  There are two kinds of comments. :dfn:`Block comments` begin with
  :samp:`/*` and continue until the next :samp:`*/`. Block comments do not
  nest:

  .. code-block:: c++

    /* this is /* one comment */ text outside comment

  :dfn:`Line comments` begin with :samp:`//` and continue to the end of the
  current line. Line comments do not nest either, but it does not matter,
  because they would end in the same place anyway.

  .. code-block:: c++

    // this is // one comment
    text outside comment

  It is safe to put line comments inside block comments, or vice versa.

  .. code-block:: c++

    /* block comment
       // contains line comment
       yet more comment
     */ outside comment

    // line comment /* contains block comment */

  But beware of commenting out one end of a block comment with a line
  comment.

  .. code-block::

     // l.c.  /* block comment begins
        oops! this isn't a comment anymore */

  Comments are not recognized within string literals.
  ``"/* blah */"`` is the string constant :samp:`/\* blah \*/`, not
  an empty string.

  Line comments are not in the 1989 edition of the C standard, but they
  are recognized by GCC as an extension. In C++ and in the 1999 edition
  of the C standard, they are an official part of the language.

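  The rules above fit in a short scanner. The sketch below is an
  illustration under simplifying assumptions, not CPP's implementation:
  ``strip_comments`` is a hypothetical helper, it expects continued
  lines to have been merged already, and it knows nothing about C++ raw
  string literals or about header names in ``#include`` lines.

  .. code-block:: c

    #include <stddef.h>
    #include <string.h>

    /* Hypothetical helper: copy SRC to DST, replacing each comment with
       one space.  DST needs room for strlen (SRC) + 1 bytes.  Returns
       the length of the result.  */
    size_t
    strip_comments (const char *src, char *dst)
    {
      size_t out = 0;

      while (*src != '\0')
        {
          if (src[0] == '/' && src[1] == '*')
            {
              /* Block comment: ends at the first terminator, so it does
                 not nest.  If unterminated, it swallows the rest.  */
              const char *end = strstr (src + 2, "*/");
              src = end != NULL ? end + 2 : src + strlen (src);
              dst[out++] = ' ';
            }
          else if (src[0] == '/' && src[1] == '/')
            {
              /* Line comment: ends just before the newline.  */
              src += strcspn (src, "\n");
              dst[out++] = ' ';
            }
          else if (*src == '"' || *src == '\'')
            {
              /* String or character constant: copy it unchanged, keeping
                 each backslash together with the character it escapes.  */
              char quote = *src;
              dst[out++] = *src++;
              while (*src != '\0' && *src != quote && *src != '\n')
                {
                  if (*src == '\\' && src[1] != '\0')
                    dst[out++] = *src++;
                  dst[out++] = *src++;
                }
              if (*src == quote)
                dst[out++] = *src++;
            }
          else
            dst[out++] = *src++;
        }

      dst[out] = '\0';
      return out;
    }

  Given the text ``a // b /* c`` followed by a second line ``d */``, the
  sketch keeps the ``d */`` on the second line, which mirrors the
  warning above about commenting out one end of a block comment.
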
Since these transformations happen before all other processing, you can
split a line mechanically with backslash-newline anywhere. You can
comment out the end of a line. You can continue a line comment onto the
next line with backslash-newline. You can even split :samp:`/*`,
:samp:`*/`, and :samp:`//` onto multiple lines with backslash-newline.
For example:

.. code-block::

  /\
  *
  */ # /*
  */ defi\
  ne FO\
  O 10\
  20

is equivalent to ``#define FOO 1020``. All these tricks are
extremely confusing and should not be used in code intended to be
readable.

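One practical consequence shows up in ordinary code. In the hypothetical
function below, the line comment ends with a backslash, so the
assignment on the next line becomes part of the comment and ``answer``
returns 1, not 2. GCC's ``-Wcomment`` warning, which ``-Wall`` enables,
reports a backslash-newline inside a line comment.

.. code-block:: c

  /* Hypothetical example: the backslash that ends the line comment
     continues the comment onto the following line.  */
  int
  answer (void)
  {
    int value = 1;
    // A careless trailing backslash joins the next line to this one \
    value = 2;
    return value;
  }
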
There is no way to prevent a backslash at the end of a line from being
interpreted as a backslash-newline. This cannot affect any correct
program, however.