Commit | Line | Data |
---|---|---|
3dfe78b3 MS |
1 | |
2 | ||
3 | ||
4 | ||
5 | ||
6 | ||
7 | Network Working Group D. Robinson | |
8 | Request for Comments: 3875 K. Coar | |
9 | Category: Informational The Apache Software Foundation | |
10 | October 2004 | |
11 | ||
12 | ||
13 | The Common Gateway Interface (CGI) Version 1.1 | |
14 | ||
15 | Status of this Memo | |
16 | ||
17 | This memo provides information for the Internet community. It does | |
18 | not specify an Internet standard of any kind. Distribution of this | |
19 | memo is unlimited. | |
20 | ||
21 | Copyright Notice | |
22 | ||
23 | Copyright (C) The Internet Society (2004). | |
24 | ||
25 | IESG Note | |
26 | ||
27 | This document is not a candidate for any level of Internet Standard. | |
28 | The IETF disclaims any knowledge of the fitness of this document for | |
29 | any purpose, and in particular notes that it has not had IETF review | |
30 | for such things as security, congestion control or inappropriate | |
31 | interaction with deployed protocols. The RFC Editor has chosen to | |
32 | publish this document at its discretion. Readers of this document | |
33 | should exercise caution in evaluating its value for implementation | |
34 | and deployment. | |
35 | ||
36 | Abstract | |
37 | ||
38 | The Common Gateway Interface (CGI) is a simple interface for running | |
39 | external programs, software or gateways under an information server | |
40 | in a platform-independent manner. Currently, the supported | |
41 | information servers are HTTP servers. | |
42 | ||
43 | The interface has been in use by the World-Wide Web (WWW) since 1993. | |
44 | This specification defines the 'current practice' parameters of the | |
45 | 'CGI/1.1' interface developed and documented at the U.S. National | |
46 | Centre for Supercomputing Applications. This document also defines | |
47 | the use of the CGI/1.1 interface on UNIX(R) and other, similar | |
48 | systems. | |
49 | ||
50 | ||
51 | ||
52 | ||
53 | ||
54 | ||
55 | ||
56 | ||
57 | ||
58 | Robinson & Coar Informational [Page 1] | |
59 | \f | |
60 | RFC 3875 CGI Version 1.1 October 2004 | |
61 | ||
62 | ||
63 | Table of Contents | |
64 | ||
65 | 1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . 4 | |
66 | 1.1. Purpose . . . . . . . . . . . . . . . . . . . . . . . . 4 | |
67 | 1.2. Requirements . . . . . . . . . . . . . . . . . . . . . . 4 | |
68 | 1.3. Specifications . . . . . . . . . . . . . . . . . . . . . 4 | |
69 | 1.4. Terminology . . . . . . . . . . . . . . . . . . . . . . 5 | |
70 | ||
71 | 2. Notational Conventions and Generic Grammar. . . . . . . . . . 5 | |
72 | 2.1. Augmented BNF . . . . . . . . . . . . . . . . . . . . . 5 | |
73 | 2.2. Basic Rules . . . . . . . . . . . . . . . . . . . . . . 6 | |
74 | 2.3. URL Encoding . . . . . . . . . . . . . . . . . . . . . . 7 | |
75 | ||
76 | 3. Invoking the Script . . . . . . . . . . . . . . . . . . . . . 8 | |
77 | 3.1. Server Responsibilities . . . . . . . . . . . . . . . . 8 | |
78 | 3.2. Script Selection . . . . . . . . . . . . . . . . . . . . 9 | |
79 | 3.3. The Script-URI . . . . . . . . . . . . . . . . . . . . . 9 | |
80 | 3.4. Execution . . . . . . . . . . . . . . . . . . . . . . . 10 | |
81 | ||
82 | 4. The CGI Request . . . . . . . . . . . . . . . . . . . . . . . 10 | |
83 | 4.1. Request Meta-Variables . . . . . . . . . . . . . . . . . 10 | |
84 | 4.1.1. AUTH_TYPE. . . . . . . . . . . . . . . . . . . . 11 | |
85 | 4.1.2. CONTENT_LENGTH . . . . . . . . . . . . . . . . . 12 | |
86 | 4.1.3. CONTENT_TYPE . . . . . . . . . . . . . . . . . . 12 | |
87 | 4.1.4. GATEWAY_INTERFACE. . . . . . . . . . . . . . . . 13 | |
88 | 4.1.5. PATH_INFO. . . . . . . . . . . . . . . . . . . . 13 | |
89 | 4.1.6. PATH_TRANSLATED. . . . . . . . . . . . . . . . . 14 | |
90 | 4.1.7. QUERY_STRING . . . . . . . . . . . . . . . . . . 15 | |
91 | 4.1.8. REMOTE_ADDR. . . . . . . . . . . . . . . . . . . 15 | |
92 | 4.1.9. REMOTE_HOST. . . . . . . . . . . . . . . . . . . 16 | |
93 | 4.1.10. REMOTE_IDENT . . . . . . . . . . . . . . . . . . 16 | |
94 | 4.1.11. REMOTE_USER. . . . . . . . . . . . . . . . . . . 16 | |
95 | 4.1.12. REQUEST_METHOD . . . . . . . . . . . . . . . . . 17 | |
96 | 4.1.13. SCRIPT_NAME. . . . . . . . . . . . . . . . . . . 17 | |
97 | 4.1.14. SERVER_NAME. . . . . . . . . . . . . . . . . . . 17 | |
98 | 4.1.15. SERVER_PORT. . . . . . . . . . . . . . . . . . . 18 | |
99 | 4.1.16. SERVER_PROTOCOL. . . . . . . . . . . . . . . . . 18 | |
100 | 4.1.17. SERVER_SOFTWARE. . . . . . . . . . . . . . . . . 19 | |
101 | 4.1.18. Protocol-Specific Meta-Variables . . . . . . . . 19 | |
102 | 4.2. Request Message-Body . . . . . . . . . . . . . . . . . . 20 | |
103 | 4.3. Request Methods . . . . . . . . . . . . . . . . . . . . 20 | |
104 | 4.3.1. GET. . . . . . . . . . . . . . . . . . . . . . . 20 | |
105 | 4.3.2. POST . . . . . . . . . . . . . . . . . . . . . . 21 | |
106 | 4.3.3. HEAD . . . . . . . . . . . . . . . . . . . . . . 21 | |
107 | 4.3.4. Protocol-Specific Methods. . . . . . . . . . . . 21 | |
108 | 4.4. The Script Command Line. . . . . . . . . . . . . . . . . 21 | |
109 | ||
119 | 5. NPH Scripts . . . . . . . . . . . . . . . . . . . . . . . . . 22 | |
120 | 5.1. Identification . . . . . . . . . . . . . . . . . . . . . 22 | |
121 | 5.2. NPH Response . . . . . . . . . . . . . . . . . . . . . . 22 | |
122 | ||
123 | 6. CGI Response. . . . . . . . . . . . . . . . . . . . . . . . . 23 | |
124 | 6.1. Response Handling. . . . . . . . . . . . . . . . . . . . 23 | |
125 | 6.2. Response Types . . . . . . . . . . . . . . . . . . . . . 23 | |
126 | 6.2.1. Document Response. . . . . . . . . . . . . . . . 23 | |
127 | 6.2.2. Local Redirect Response. . . . . . . . . . . . . 24 | |
128 | 6.2.3. Client Redirect Response . . . . . . . . . . . . 24 | |
129 | 6.2.4. Client Redirect Response with Document . . . . . 24 | |
130 | 6.3. Response Header Fields . . . . . . . . . . . . . . . . . 25 | |
131 | 6.3.1. Content-Type . . . . . . . . . . . . . . . . . . 25 | |
132 | 6.3.2. Location . . . . . . . . . . . . . . . . . . . . 26 | |
133 | 6.3.3. Status . . . . . . . . . . . . . . . . . . . . . 26 | |
134 | 6.3.4. Protocol-Specific Header Fields. . . . . . . . . 27 | |
135 | 6.3.5. Extension Header Fields. . . . . . . . . . . . . 27 | |
136 | 6.4. Response Message-Body. . . . . . . . . . . . . . . . . . 28 | |
137 | ||
138 | 7. System Specifications . . . . . . . . . . . . . . . . . . . . 28 | |
139 | 7.1. AmigaDOS . . . . . . . . . . . . . . . . . . . . . . . . 28 | |
140 | 7.2. UNIX . . . . . . . . . . . . . . . . . . . . . . . . . . 28 | |
141 | 7.3. EBCDIC/POSIX . . . . . . . . . . . . . . . . . . . . . . 29 | |
142 | ||
143 | 8. Implementation. . . . . . . . . . . . . . . . . . . . . . . . 29 | |
144 | 8.1. Recommendations for Servers. . . . . . . . . . . . . . . 29 | |
145 | 8.2. Recommendations for Scripts. . . . . . . . . . . . . . . 30 | |
146 | ||
147 | 9. Security Considerations . . . . . . . . . . . . . . . . . . . 30 | |
148 | 9.1. Safe Methods . . . . . . . . . . . . . . . . . . . . . . 30 | |
149 | 9.2. Header Fields Containing Sensitive Information . . . . . 31 | |
150 | 9.3. Data Privacy . . . . . . . . . . . . . . . . . . . . . . 31 | |
151 | 9.4. Information Security Model . . . . . . . . . . . . . . . 31 | |
152 | 9.5. Script Interference with the Server. . . . . . . . . . . 31 | |
153 | 9.6. Data Length and Buffering Considerations . . . . . . . . 32 | |
154 | 9.7. Stateless Processing . . . . . . . . . . . . . . . . . . 32 | |
155 | 9.8. Relative Paths . . . . . . . . . . . . . . . . . . . . . 33 | |
156 | 9.9. Non-parsed Header Output . . . . . . . . . . . . . . . . 33 | |
157 | ||
158 | 10. Acknowledgements. . . . . . . . . . . . . . . . . . . . . . . 33 | |
159 | ||
160 | 11. References. . . . . . . . . . . . . . . . . . . . . . . . . . 33 | |
161 | 11.1. Normative References. . . . . . . . . . . . . . . . . . 33 | |
162 | 11.2. Informative References. . . . . . . . . . . . . . . . . 34 | |
163 | ||
164 | 12. Authors' Addresses. . . . . . . . . . . . . . . . . . . . . . 35 | |
165 | ||
166 | 13. Full Copyright Statement. . . . . . . . . . . . . . . . . . . 36 | |
167 | ||
175 | 1. Introduction | |
176 | ||
177 | 1.1. Purpose | |
178 | ||
179 | The Common Gateway Interface (CGI) [22] allows an HTTP [1], [4] | |
180 | server and a CGI script to share responsibility for responding to | |
181 | client requests. The client request comprises a Uniform Resource | |
182 | Identifier (URI) [11], a request method and various ancillary | |
183 | information about the request provided by the transport protocol. | |
184 | ||
185 | The CGI defines the abstract parameters, known as meta-variables, | |
186 | which describe a client's request. Together with a concrete | |
187 | programmer interface this specifies a platform-independent interface | |
188 | between the script and the HTTP server. | |
189 | ||
190 | The server is responsible for managing connection, data transfer, | |
191 | transport and network issues related to the client request, whereas | |
192 | the CGI script handles the application issues, such as data access | |
193 | and document processing. | |
194 | ||
195 | 1.2. Requirements | |
196 | ||
197 | The key words 'MUST', 'MUST NOT', 'REQUIRED', 'SHALL', 'SHALL NOT', | |
198 | 'SHOULD', 'SHOULD NOT', 'RECOMMENDED', 'MAY' and 'OPTIONAL' in this | |
199 | document are to be interpreted as described in BCP 14, RFC 2119 [3]. | |
200 | ||
201 | An implementation is not compliant if it fails to satisfy one or more | |
202 | of the 'must' requirements for the protocols it implements. An | |
203 | implementation that satisfies all of the 'must' and all of the | |
204 | 'should' requirements for its features is said to be 'unconditionally | |
205 | compliant'; one that satisfies all of the 'must' requirements but not | |
206 | all of the 'should' requirements for its features is said to be | |
207 | 'conditionally compliant'. | |
208 | ||
209 | 1.3. Specifications | |
210 | ||
211 | Not all of the functions and features of the CGI are defined in the | |
212 | main part of this specification. The following phrases are used to | |
213 | describe the features that are not specified: | |
214 | ||
215 | 'system-defined' | |
216 | The feature may differ between systems, but must be the same for | |
217 | different implementations using the same system. A system will | |
218 | usually identify a class of operating systems. Some systems are | |
219 | defined in section 7 of this document. New systems may be defined | |
220 | by new specifications without revision of this document. | |
221 | ||
231 | 'implementation-defined' | |
232 | The behaviour of the feature may vary from implementation to | |
233 | implementation; a particular implementation must document its | |
234 | behaviour. | |
235 | ||
236 | 1.4. Terminology | |
237 | ||
238 | This specification uses many terms defined in the HTTP/1.1 | |
239 | specification [4]; however, the following terms are used here in a | |
240 | sense which may not accord with their definitions in that document, | |
241 | or with their common meaning. | |
242 | ||
243 | 'meta-variable' | |
244 | A named parameter which carries information from the server to the | |
245 | script. It is not necessarily a variable in the operating | |
246 | system's environment, although that is the most common | |
247 | implementation. | |
248 | ||
249 | 'script' | |
250 | The software that is invoked by the server according to this | |
251 | interface. It need not be a standalone program, but could be a | |
252 | dynamically-loaded or shared library, or even a subroutine in the | |
253 | server. It might be a set of statements interpreted at run-time, | |
254 | as the term 'script' is frequently understood, but that is not a | |
255 | requirement and within the context of this specification the term | |
256 | has the broader definition stated. | |
257 | ||
258 | 'server' | |
259 | The application program that invokes the script in order to | |
260 | service requests from the client. | |
261 | ||
262 | 2. Notational Conventions and Generic Grammar | |
263 | ||
264 | 2.1. Augmented BNF | |
265 | ||
266 | All of the mechanisms specified in this document are described in | |
267 | both prose and an augmented Backus-Naur Form (BNF) similar to that | |
268 | used by RFC 822 [13]. Unless stated otherwise, the elements are | |
269 | case-sensitive. This augmented BNF contains the following | |
270 | constructs: | |
271 | ||
272 | name = definition | |
273 | The name of a rule and its definition are separated by the equals | |
274 | character ('='). Whitespace is only significant in that | |
275 | continuation lines of a definition are indented. | |
276 | ||
287 | "literal" | |
288 | Double quotation marks (") surround literal text, except for a | |
289 | literal quotation mark, which is surrounded by angle-brackets ('<' | |
290 | and '>'). | |
291 | ||
292 | rule1 | rule2 | |
293 | Alternative rules are separated by a vertical bar ('|'). | |
294 | ||
295 | (rule1 rule2 rule3) | |
296 | Elements enclosed in parentheses are treated as a single element. | |
297 | ||
298 | *rule | |
299 | A rule preceded by an asterisk ('*') may have zero or more | |
300 | occurrences. The full form is 'n*m rule' indicating at least n | |
301 | and at most m occurrences of the rule. n and m are optional | |
302 | decimal values with default values of 0 and infinity respectively. | |
303 | ||
304 | [rule] | |
305 | An element enclosed in square brackets ('[' and ']') is optional, | |
306 | and is equivalent to '*1 rule'. | |
307 | ||
308 | N rule | |
309 | A rule preceded by a decimal number represents exactly N | |
310 | occurrences of the rule. It is equivalent to 'N*N rule'. | |
311 | ||
312 | 2.2. Basic Rules | |
313 | ||
314 | This specification uses a BNF-like grammar defined in terms of | |
315 | characters. Unlike many specifications which define the bytes | |
316 | allowed by a protocol, here each literal in the grammar corresponds | |
317 | to the character it represents. How these characters are represented | |
318 | in terms of bits and bytes within a system is either system-defined | |
319 | or specified in the particular context. The single exception is the | |
320 | rule 'OCTET', defined below. | |
321 | ||
322 | The following rules are used throughout this specification to | |
323 | describe basic parsing constructs. | |
324 | ||
325 | alpha = lowalpha | hialpha | |
326 | lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | | |
327 | "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | | |
328 | "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | | |
329 | "y" | "z" | |
330 | hialpha = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | | |
331 | "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" | | |
332 | "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" | | |
333 | "Y" | "Z" | |
334 | ||
343 | digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | | |
344 | "8" | "9" | |
345 | alphanum = alpha | digit | |
346 | OCTET = <any 8-bit byte> | |
347 | CHAR = alpha | digit | separator | "!" | "#" | "$" | | |
348 | "%" | "&" | "'" | "*" | "+" | "-" | "." | "`" | | |
349 | "^" | "_" | "{" | "|" | "}" | "~" | CTL | |
350 | CTL = <any control character> | |
351 | SP = <space character> | |
352 | HT = <horizontal tab character> | |
353 | NL = <newline> | |
354 | LWSP = SP | HT | NL | |
355 | separator = "(" | ")" | "<" | ">" | "@" | "," | ";" | ":" | | |
356 | "\" | <"> | "/" | "[" | "]" | "?" | "=" | "{" | | |
357 | "}" | SP | HT | |
358 | token = 1*<any CHAR except CTLs or separators> | |
359 | quoted-string = <"> *qdtext <"> | |
360 | qdtext = <any CHAR except <"> and CTLs but including LWSP> | |
361 | TEXT = <any printable character> | |
362 | ||
363 | Note that newline (NL) need not be a single control character, but | |
364 | can be a sequence of control characters. A system MAY define TEXT to | |
365 | be a larger set of characters than <any CHAR excluding CTLs but | |
366 | including LWSP>. | |
367 | ||
368 | 2.3. URL Encoding | |
369 | ||
370 | Some variables and constructs used here are described as being | |
371 | 'URL-encoded'. This encoding is described in section 2 of RFC 2396 | |
372 | [2]. In a URL-encoded string an escape sequence consists of a | |
373 | percent character ("%") followed by two hexadecimal digits, where the | |
374 | two hexadecimal digits form an octet. An escape sequence represents | |
375 | the graphic character that has the octet as its code within the | |
376 | US-ASCII [9] coded character set, if it exists. Currently there is | |
377 | no provision within the URI syntax to identify which character set | |
378 | non-ASCII codes represent, so CGI handles this issue on an ad-hoc | |
379 | basis. | |
380 | ||
381 | Note that some unsafe (reserved) characters may have different | |
382 | semantics when encoded. The definition of which characters are | |
383 | unsafe depends on the context; see section 2 of RFC 2396 [2], updated | |
384 | by RFC 2732 [7], for an authoritative treatment. These reserved | |
385 | characters are generally used to provide syntactic structure to the | |
386 | character string, for example as field separators. In all cases, the | |
387 | string is first processed with regard to any reserved characters | |
388 | present, and then the resulting data can be URL-decoded by replacing | |
389 | "%" escape sequences by their character values. | |
390 | ||
399 | To encode a character string, all reserved and forbidden characters | |
400 | are replaced by the corresponding "%" escape sequences. The string | |
401 | can then be used in assembling a URI. The reserved characters will | |
402 | vary from context to context, but will always be drawn from this set: | |
403 | ||
404 | reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | | |
405 | "," | "[" | "]" | |
406 | ||
407 | The last two characters were added by RFC 2732 [7]. In any | |
408 | particular context, a sub-set of these characters will be reserved; | |
409 | the other characters from this set MUST NOT be encoded when a string | |
410 | is URL-encoded in that context. Other basic rules used to describe | |
411 | URI syntax are: | |
412 | ||
413 | hex = digit | "A" | "B" | "C" | "D" | "E" | "F" | "a" | "b" | |
414 | | "c" | "d" | "e" | "f" | |
415 | escaped = "%" hex hex | |
416 | unreserved = alpha | digit | mark | |
417 | mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")" | |
418 | ||
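The decoding order described in section 2.3 (process the reserved characters first, then replace the "%" escape sequences) can be sketched as follows. This is a minimal illustration, not part of the specification: the helper name split_then_decode is hypothetical, and treating "&" and "=" as the reserved field separators is an assumption that only suits a typical query string.

```python
from urllib.parse import unquote

def split_then_decode(query_string):
    """Split on the reserved "&" and "=" first, then URL-decode each part.

    Hypothetical helper for illustration only. Decoding before
    splitting would turn an escaped "%26" into a field separator
    and corrupt the data.
    """
    pairs = []
    for field in query_string.split("&"):
        if not field:
            continue
        name, _, value = field.partition("=")
        # latin-1 maps each escaped octet to exactly one character, so
        # no character set for non-ASCII codes is assumed here.
        pairs.append((unquote(name, encoding="latin-1"),
                      unquote(value, encoding="latin-1")))
    return pairs
```

With this ordering, an escaped "&" or "=" inside a value survives as data instead of being mistaken for a separator.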
419 | 3. Invoking the Script | |
420 | ||
421 | 3.1. Server Responsibilities | |
422 | ||
423 | The server acts as an application gateway. It receives the request | |
424 | from the client, selects a CGI script to handle the request, converts | |
425 | the client request to a CGI request, executes the script and converts | |
426 | the CGI response into a response for the client. When processing the | |
427 | client request, it is responsible for implementing any protocol or | |
428 | transport level authentication and security. The server MAY also | |
429 | function in a 'non-transparent' manner, modifying the request or | |
430 | response in order to provide some additional service, such as media | |
431 | type transformation or protocol reduction. | |
432 | ||
433 | The server MUST perform translations and protocol conversions on the | |
434 | client request data required by this specification. Furthermore, the | |
435 | server retains its responsibility to the client to conform to the | |
436 | relevant network protocol even if the CGI script fails to conform to | |
437 | this specification. | |
438 | ||
439 | If the server is applying authentication to the request, then it MUST | |
440 | NOT execute the script unless the request passes all defined access | |
441 | controls. | |
442 | ||
455 | 3.2. Script Selection | |
456 | ||
457 | The server determines which CGI script is to be executed based on a | |
458 | generic-form URI supplied by the client. This URI includes a | |
459 | hierarchical path with components separated by "/". For any | |
460 | particular request, the server will identify all or a leading part of | |
461 | this path with an individual script, thus placing the script at a | |
462 | particular point in the path hierarchy. The remainder of the path, | |
463 | if any, is a resource or sub-resource identifier to be interpreted by | |
464 | the script. | |
465 | ||
466 | Information about this split of the path is available to the script | |
467 | in the meta-variables, described below. Support for non-hierarchical | |
468 | URI schemes is outside the scope of this specification. | |
469 | ||
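The path split described in section 3.2 might be sketched as below. The function name split_path and the script_paths collection are hypothetical; real servers choose scripts in an implementation-defined way. The sketch also assumes the path has already been URL-decoded and picks the longest matching leading part, which is only one possible policy.

```python
def split_path(url_path, script_paths):
    """Return (script path, remainder) for the longest matching script.

    script_paths is a hypothetical collection of paths that the server
    configuration maps to scripts. Returns None when no script matches.
    """
    for candidate in sorted(script_paths, key=len, reverse=True):
        # Match whole path components only, so "/cgi-bin/report" does
        # not claim "/cgi-bin/reporting".
        if url_path == candidate or url_path.startswith(candidate + "/"):
            return candidate, url_path[len(candidate):]
    return None
```

The first element corresponds to what the server would expose as SCRIPT_NAME and the second to PATH_INFO.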
470 | 3.3. The Script-URI | |
471 | ||
472 | The mapping from client request URI to choice of script is defined by | |
473 | the particular server implementation and its configuration. The | |
474 | server may allow the script to be identified with a set of several | |
475 | different URI path hierarchies, and therefore is permitted to replace | |
476 | the URI by other members of this set during processing and generation | |
477 | of the meta-variables. The server | |
478 | ||
479 | 1. MAY preserve the URI in the particular client request; or | |
480 | ||
481 | 2. MAY select a canonical URI from the set of possible values | |
482 | for each script; or | |
483 | ||
484 | 3. can implement any other selection of URI from the set. | |
485 | ||
486 | From the meta-variables thus generated, a URI, the 'Script-URI', can | |
487 | be constructed. This MUST have the property that if the client had | |
488 | accessed this URI instead, then the script would have been executed | |
489 | with the same values for the SCRIPT_NAME, PATH_INFO and QUERY_STRING | |
490 | meta-variables. The Script-URI has the structure of a generic URI as | |
491 | defined in section 3 of RFC 2396 [2], with the exception that object | |
492 | parameters and fragment identifiers are not permitted. The various | |
493 | components of the Script-URI are defined by some of the | |
494 | meta-variables (see below): | |
495 | ||
496 | script-URI = <scheme> "://" <server-name> ":" <server-port> | |
497 | <script-path> <extra-path> "?" <query-string> | |
498 | ||
499 | where <scheme> is found from SERVER_PROTOCOL, <server-name>, | |
500 | <server-port> and <query-string> are the values of the respective | |
501 | meta-variables. The SCRIPT_NAME and PATH_INFO values, URL-encoded | |
502 | with ";", "=" and "?" reserved, give <script-path> and <extra-path>. | |
503 | ||
511 | See section 4.1.5 for more information about the PATH_INFO | |
512 | meta-variable. | |
513 | ||
514 | The scheme and the protocol are not identical as the scheme | |
515 | identifies the access method in addition to the application protocol. | |
516 | For example, a resource accessed using Transport Layer Security (TLS) | |
517 | [14] would have a request URI with a scheme of https when using the | |
518 | HTTP protocol [19]. CGI/1.1 provides no generic means for the script | |
519 | to reconstruct this, and therefore the Script-URI as defined includes | |
520 | the base protocol used. However, a script MAY make use of | |
521 | scheme-specific meta-variables to better deduce the URI scheme. | |
522 | ||
523 | Note that this definition also allows URIs to be constructed which | |
524 | would invoke the script with any permitted values for the path-info | |
525 | or query-string, by modifying the appropriate components. | |
526 | ||
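Assembling the Script-URI of section 3.3 from the meta-variables could look roughly like this. The function name script_uri is hypothetical, the meta-variables are assumed to arrive as a plain mapping, and the choice of which characters to leave unencoded (the safe set) is an assumption of this sketch beyond the stated rule that ";", "=" and "?" are reserved.

```python
from urllib.parse import quote

def script_uri(env):
    """Assemble a Script-URI from a mapping of meta-variables."""
    # <scheme> is taken from the protocol name, e.g. "HTTP/1.1" -> "http".
    scheme = env["SERVER_PROTOCOL"].split("/", 1)[0].lower()
    # ";", "=" and "?" are encoded; "/" stays as the path separator.
    # The rest of the safe set is an assumption of this sketch.
    safe = "/:@&+$,"
    path = (quote(env.get("SCRIPT_NAME", ""), safe=safe)
            + quote(env.get("PATH_INFO", ""), safe=safe))
    uri = f"{scheme}://{env['SERVER_NAME']}:{env['SERVER_PORT']}{path}"
    # QUERY_STRING is already URL-encoded, so it is appended unchanged.
    # Omitting "?" for an empty query is a choice made here; the two
    # forms are not distinguishable by the script.
    query = env.get("QUERY_STRING", "")
    return uri + "?" + query if query else uri
```

Because the scheme is derived from SERVER_PROTOCOL, a request made over TLS still yields "http" here, which matches the limitation noted in the text.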
527 | 3.4. Execution | |
528 | ||
529 | The script is invoked in a system-defined manner. Unless specified | |
530 | otherwise, the file containing the script will be invoked as an | |
531 | executable program. The server prepares the CGI request as described | |
532 | in section 4; this comprises the request meta-variables (immediately | |
533 | available to the script on execution) and request message data. The | |
534 | request data need not be immediately available to the script; the | |
535 | script can be executed before all this data has been received by the | |
536 | server from the client. The response from the script is returned to | |
537 | the server as described in sections 5 and 6. | |
538 | ||
539 | In the event of an error condition, the server can interrupt or | |
540 | terminate script execution at any time and without warning. That | |
541 | could occur, for example, in the event of a transport failure between | |
542 | the server and the client; so the script SHOULD be prepared to handle | |
543 | abnormal termination. | |
544 | ||
545 | 4. The CGI Request | |
546 | ||
547 | Information about a request comes from two different sources; the | |
548 | request meta-variables and any associated message-body. | |
549 | ||
550 | 4.1. Request Meta-Variables | |
551 | ||
552 | Meta-variables contain data about the request passed from the server | |
553 | to the script, and are accessed by the script in a system-defined | |
554 | manner. Meta-variables are identified by case-insensitive names; | |
555 | there cannot be two different variables whose names differ in case | |
556 | only. Here they are shown using a canonical representation of | |
557 | capitals plus underscore ("_"). A particular system can define a | |
558 | different representation. | |
559 | ||
567 | meta-variable-name = "AUTH_TYPE" | "CONTENT_LENGTH" | | |
568 | "CONTENT_TYPE" | "GATEWAY_INTERFACE" | | |
569 | "PATH_INFO" | "PATH_TRANSLATED" | | |
570 | "QUERY_STRING" | "REMOTE_ADDR" | | |
571 | "REMOTE_HOST" | "REMOTE_IDENT" | | |
572 | "REMOTE_USER" | "REQUEST_METHOD" | | |
573 | "SCRIPT_NAME" | "SERVER_NAME" | | |
574 | "SERVER_PORT" | "SERVER_PROTOCOL" | | |
575 | "SERVER_SOFTWARE" | scheme | | |
576 | protocol-var-name | extension-var-name | |
577 | protocol-var-name = ( protocol | scheme ) "_" var-name | |
578 | scheme = alpha *( alpha | digit | "+" | "-" | "." ) | |
579 | var-name = token | |
580 | extension-var-name = token | |
581 | ||
582 | Meta-variables with the same name as a scheme, and names beginning | |
583 | with the name of a protocol or scheme (e.g., HTTP_ACCEPT) are also | |
584 | defined. The number and meaning of these variables may change | |
585 | independently of this specification. (See also section 4.1.18.) | |
586 | ||
587 | The server MAY set additional implementation-defined extension meta- | |
588 | variables, whose names SHOULD be prefixed with "X_". | |
589 | ||
590 | This specification does not distinguish between zero-length (NULL) | |
591 | values and missing values. For example, a script cannot distinguish | |
592 | between the two requests http://host/script and http://host/script? | |
593 | as in both cases the QUERY_STRING meta-variable would be NULL. | |
594 | ||
595 | meta-variable-value = "" | 1*<TEXT, CHAR or tokens of value> | |
596 | ||
597 | An optional meta-variable may be omitted (left unset) if its value is | |
598 | NULL. Meta-variable values MUST be considered case-sensitive except | |
599 | as noted otherwise. The representation of the characters in the | |
600 | meta-variables is system-defined; the server MUST convert values to | |
601 | that representation. | |
602 | ||
603 | 4.1.1. AUTH_TYPE | |
604 | ||
605 | The AUTH_TYPE variable identifies any mechanism used by the server to | |
606 | authenticate the user. It contains a case-insensitive value defined | |
607 | by the client protocol or server implementation. | |
608 | ||
609 | For HTTP, if the client request required authentication for external | |
610 | access, then the server MUST set the value of this variable from the | |
611 | 'auth-scheme' token in the request Authorization header field. | |
612 | ||
623 | AUTH_TYPE = "" | auth-scheme | |
624 | auth-scheme = "Basic" | "Digest" | extension-auth | |
625 | extension-auth = token | |
626 | ||
627 | HTTP access authentication schemes are described in RFC 2617 [5]. | |
628 | ||
629 | 4.1.2. CONTENT_LENGTH | |
630 | ||
631 | The CONTENT_LENGTH variable contains the size of the message-body | |
632 | attached to the request, if any, in decimal number of octets. If no | |
633 | data is attached, then NULL (or unset). | |
634 | ||
635 | CONTENT_LENGTH = "" | 1*digit | |
636 | ||
637 | The server MUST set this meta-variable if and only if the request is | |
638 | accompanied by a message-body entity. The CONTENT_LENGTH value must | |
639 | reflect the length of the message-body after the server has removed | |
640 | any transfer-codings or content-codings. | |
641 | ||
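A script might use CONTENT_LENGTH along the following lines. This is a sketch under stated assumptions: the function name read_request_body is hypothetical, the meta-variables are assumed to be available as a mapping, and the message-body is assumed to be readable from a binary stream. The in-memory stream at the end exists only for demonstration.

```python
import io

def read_request_body(env, stream):
    """Read at most CONTENT_LENGTH octets of message-body from stream."""
    # Unset and "" both mean that no message-body is attached.
    raw = env.get("CONTENT_LENGTH", "")
    if raw == "":
        return b""
    if not (raw.isascii() and raw.isdigit()):
        raise ValueError("CONTENT_LENGTH must match 1*digit")
    # Never read past the declared length; more data may follow.
    return stream.read(int(raw))

# Demonstration with an in-memory stream standing in for the request data.
demo = read_request_body({"CONTENT_LENGTH": "5"}, io.BytesIO(b"hello world"))
```

On a system that passes meta-variables in the process environment, the call would plausibly be read_request_body(os.environ, sys.stdin.buffer).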
642 | 4.1.3. CONTENT_TYPE | |
643 | ||
644 | If the request includes a message-body, the CONTENT_TYPE variable is | |
645 | set to the Internet Media Type [6] of the message-body. | |
646 | ||
647 | CONTENT_TYPE = "" | media-type | |
648 | media-type = type "/" subtype *( ";" parameter ) | |
649 | type = token | |
650 | subtype = token | |
651 | parameter = attribute "=" value | |
652 | attribute = token | |
653 | value = token | quoted-string | |
654 | ||
655 | The type, subtype and parameter attribute names are not | |
656 | case-sensitive. Parameter values may be case sensitive. Media types | |
657 | and their use in HTTP are described in section 3.7 of the HTTP/1.1 | |
658 | specification [4]. | |
659 | ||
660 | There is no default value for this variable. If and only if it is | |
661 | unset, then the script MAY attempt to determine the media type from | |
662 | the data received. If the type remains unknown, then the script MAY | |
663 | choose to assume a type of application/octet-stream or it may reject | |
664 | the request with an error (as described in section 6.3.3). | |
665 | ||
666 | Each media-type defines a set of optional and mandatory parameters. | |
667 | This may include a charset parameter with a case-insensitive value | |
668 | defining the coded character set for the message-body. If the | |
679 | charset parameter is omitted, then the default value should be | |
680 | derived according to whichever of the following rules is the first to | |
681 | apply: | |
682 | ||
683 | 1. There MAY be a system-defined default charset for some | |
684 | media-types. | |
685 | ||
686 | 2. The default for media-types of type "text" is ISO-8859-1 [4]. | |
687 | ||
688 | 3. Any default defined in the media-type specification. | |
689 | ||
690 | 4. The default is US-ASCII. | |
691 | ||
692 | The server MUST set this meta-variable if an HTTP Content-Type field | |
693 | is present in the client request header. If the server receives a | |
694 | request with an attached entity but no Content-Type header field, it | |
695 | MAY attempt to determine the correct content type, otherwise it | |
696 | should omit this meta-variable. | |
697 | ||
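The charset rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names and the two lookup tables (`system_defaults`, `type_defaults`) are hypothetical, and the parser assumes that no ";" appears inside a quoted-string parameter value.

```python
def parse_content_type(value):
    """Split a CONTENT_TYPE value into (type, subtype, params); None if empty."""
    if not value:
        return None
    media, *raw_params = value.split(";")
    mtype, _, subtype = media.strip().partition("/")
    params = {}
    for item in raw_params:
        name, _, val = item.strip().partition("=")
        # Attribute names are not case-sensitive, so normalise them.
        params[name.strip().lower()] = val.strip().strip('"')
    return mtype.lower(), subtype.lower(), params


def effective_charset(value, system_defaults=None, type_defaults=None):
    """Pick the charset for a message-body, applying the first rule that fits."""
    parsed = parse_content_type(value)
    if parsed is None:
        return None
    mtype, subtype, params = parsed
    if "charset" in params:
        return params["charset"].lower()      # charset values are case-insensitive
    full = f"{mtype}/{subtype}"
    if system_defaults and full in system_defaults:   # rule 1 (hypothetical table)
        return system_defaults[full]
    if mtype == "text":                               # rule 2
        return "iso-8859-1"
    if type_defaults and full in type_defaults:       # rule 3 (hypothetical table)
        return type_defaults[full]
    return "us-ascii"                                 # rule 4
```

A script might call `effective_charset(os.environ.get("CONTENT_TYPE", ""))` before decoding the request body; a `None` result means the variable was unset or empty.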
698 | 4.1.4. GATEWAY_INTERFACE | |
699 | ||
700 | The GATEWAY_INTERFACE variable MUST be set to the dialect of CGI | |
701 | being used by the server to communicate with the script. Syntax: | |
702 | ||
703 | GATEWAY_INTERFACE = "CGI" "/" 1*digit "." 1*digit | |
704 | ||
705 | Note that the major and minor numbers are treated as separate | |
706 | integers and hence each may be incremented higher than a single | |
707 | digit. Thus CGI/2.4 is a lower version than CGI/2.13 which in turn | |
708 | is lower than CGI/12.3. Leading zeros MUST be ignored by the script | |
709 | and MUST NOT be generated by the server. | |
710 | ||
711 | This document defines the 1.1 version of the CGI interface. | |
712 | ||
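Because the major and minor numbers are separate integers, comparing versions as plain strings gives the wrong order. A minimal Python sketch of a numeric comparison follows; the function name is an assumption made for illustration.

```python
import re

# "CGI" "/" 1*digit "." 1*digit
_CGI_VERSION = re.compile(r"CGI/([0-9]+)\.([0-9]+)")


def parse_gateway_interface(value):
    """Return (major, minor) as integers; leading zeros are ignored by int()."""
    match = _CGI_VERSION.fullmatch(value)
    if match is None:
        raise ValueError(f"not a GATEWAY_INTERFACE value: {value!r}")
    return int(match.group(1)), int(match.group(2))
```

Tuples compare element by element, so `parse_gateway_interface("CGI/2.4") < parse_gateway_interface("CGI/2.13")` holds, as the text requires.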
713 | 4.1.5. PATH_INFO | |
714 | ||
715 | The PATH_INFO variable specifies a path to be interpreted by the CGI | |
716 | script. It identifies the resource or sub-resource to be returned by | |
717 | the CGI script, and is derived from the portion of the URI path | |
718 | hierarchy following the part that identifies the script itself. | |
719 | Unlike a URI path, the PATH_INFO is not URL-encoded, and cannot | |
720 | contain path-segment parameters. A PATH_INFO of "/" represents a | |
721 | single void path segment. | |
722 | ||
723 | PATH_INFO = "" | ( "/" path ) | |
724 | path = lsegment *( "/" lsegment ) | |
725 | lsegment = *lchar | |
726 | lchar = <any TEXT or CTL except "/"> | |
727 | ||
735 | The value is considered case-sensitive and the server MUST preserve | |
736 | the case of the path as presented in the request URI. The server MAY | |
737 | impose restrictions and limitations on what values it permits for | |
738 | PATH_INFO, and MAY reject the request with an error if it encounters | |
739 | any values considered objectionable. That MAY include any requests | |
740 | that would result in an encoded "/" being decoded into PATH_INFO, as | |
741 | this might represent a loss of information to the script. Similarly, | |
742 | treatment of non US-ASCII characters in the path is system-defined. | |
743 | ||
744 | URL-encoded, the PATH_INFO string forms the extra-path component of | |
745 | the Script-URI (see section 3.3) which follows the SCRIPT_NAME part | |
746 | of that path. | |
747 | ||
748 | 4.1.6. PATH_TRANSLATED | |
749 | ||
750 | The PATH_TRANSLATED variable is derived by taking the PATH_INFO | |
751 | value, parsing it as a local URI in its own right, and performing any | |
752 | virtual-to-physical translation appropriate to map it onto the | |
753 | server's document repository structure. The set of characters | |
754 | permitted in the result is system-defined. | |
755 | ||
756 | PATH_TRANSLATED = *<any character> | |
757 | ||
758 | This is the file location that would be accessed by a request for | |
759 | ||
760 | <scheme> "://" <server-name> ":" <server-port> <extra-path> | |
761 | ||
762 | where <scheme> is the scheme for the original client request and | |
763 | <extra-path> is a URL-encoded version of PATH_INFO, with ";", "=" and | |
764 | "?" reserved. For example, a request such as the following: | |
765 | ||
766 | http://somehost.com/cgi-bin/somescript/this%2eis%2ethe%2epath%3binfo | |
767 | ||
768 | would result in a PATH_INFO value of | |
769 | ||
770 | /this.is.the.path;info | |
771 | ||
772 | An internal URI is constructed from the scheme, server location and | |
773 | the URL-encoded PATH_INFO: | |
774 | ||
775 | http://somehost.com/this.is.the.path%3binfo | |
776 | ||
777 | This would then be translated to a location in the server's document | |
778 | repository, perhaps a filesystem path something like this: | |
779 | ||
780 | /usr/local/www/htdocs/this.is.the.path;info | |
781 | ||
782 | The value of PATH_TRANSLATED is the result of the translation. | |
783 | ||
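The derivation in this example can be sketched in Python. The translation step is implementation-defined, so the document root and the simple concatenation below are assumptions for illustration only, not a description of any particular server. Note that `urllib.parse.quote` emits upper-case hex digits ("%3B"), which is equivalent to the "%3b" shown above.

```python
from urllib.parse import quote, unquote


def extra_path(path_info):
    """URL-encode PATH_INFO, keeping "/" but escaping ";", "=" and "?"."""
    return quote(path_info, safe="/")


def path_translated(path_info, document_root="/usr/local/www/htdocs"):
    """Map PATH_INFO onto a hypothetical filesystem-backed repository."""
    if not path_info:
        return None                      # PATH_INFO is NULL, so leave unset
    internal = extra_path(path_info)     # path of the internal URI
    return document_root + unquote(internal)
```

A real server would apply its own alias and virtual-host mapping in place of the concatenation.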
791 | The value is derived in this way irrespective of whether it maps to a | |
792 | valid repository location. The server MUST preserve the case of the | |
793 | extra-path segment unless the underlying repository supports case- | |
794 | insensitive names. If the repository is only case-aware, case- | |
795 | preserving, or case-blind with regard to document names, the server | |
796 | is not required to preserve the case of the original segment through | |
797 | the translation. | |
798 | ||
799 | The translation algorithm the server uses to derive PATH_TRANSLATED | |
800 | is implementation-defined; CGI scripts which use this variable may | |
801 | suffer limited portability. | |
802 | ||
803 | The server SHOULD set this meta-variable if the request URI includes | |
804 | a path-info component. If PATH_INFO is NULL, then the | |
805 | PATH_TRANSLATED variable MUST be set to NULL (or unset). | |
806 | ||
807 | 4.1.7. QUERY_STRING | |
808 | ||
809 | The QUERY_STRING variable contains a URL-encoded search or parameter | |
810 | string; it provides information to the CGI script to affect or refine | |
811 | the document to be returned by the script. | |
812 | ||
813 | The URL syntax for a search string is described in section 3 of RFC | |
814 | 2396 [2]. The QUERY_STRING value is case-sensitive. | |
815 | ||
816 | QUERY_STRING = query-string | |
817 | query-string = *uric | |
818 | uric = reserved | unreserved | escaped | |
819 | ||
820 | When parsing and decoding the query string, the details of the | |
821 | parsing, reserved characters and support for non US-ASCII characters | |
822 | depend on the context. For example, form submission from an HTML | |
823 | document [18] uses application/x-www-form-urlencoded encoding, in | |
824 | which the characters "+", "&" and "=" are reserved, and the ISO | |
825 | 8859-1 encoding may be used for non US-ASCII characters. | |
826 | ||
827 | The QUERY_STRING value provides the query-string part of the | |
828 | Script-URI. (See section 3.3). | |
829 | ||
830 | The server MUST set this variable; if the Script-URI does not include | |
831 | a query component, the QUERY_STRING MUST be defined as an empty | |
832 | string (""). | |
833 | ||
834 | 4.1.8. REMOTE_ADDR | |
835 | ||
836 | The REMOTE_ADDR variable MUST be set to the network address of the | |
837 | client sending the request to the server. | |
838 | ||
847 | REMOTE_ADDR = hostnumber | |
848 | hostnumber = ipv4-address | ipv6-address | |
849 | ipv4-address = 1*3digit "." 1*3digit "." 1*3digit "." 1*3digit | |
850 | ipv6-address = hexpart [ ":" ipv4-address ] | |
851 | hexpart = hexseq | ( [ hexseq ] "::" [ hexseq ] ) | |
852 | hexseq = 1*4hex *( ":" 1*4hex ) | |
853 | ||
854 | The format of an IPv6 address is described in RFC 3513 [15]. | |
855 | ||
856 | 4.1.9. REMOTE_HOST | |
857 | ||
858 | The REMOTE_HOST variable contains the fully qualified domain name of | |
859 | the client sending the request to the server, if available, otherwise | |
860 | NULL. Fully qualified domain names take the form as described in | |
861 | section 3.5 of RFC 1034 [17] and section 2.1 of RFC 1123 [12]. | |
862 | Domain names are not case sensitive. | |
863 | ||
864 | REMOTE_HOST = "" | hostname | hostnumber | |
865 | hostname = *( domainlabel "." ) toplabel [ "." ] | |
866 | domainlabel = alphanum [ *alphahypdigit alphanum ] | |
867 | toplabel = alpha [ *alphahypdigit alphanum ] | |
868 | alphahypdigit = alphanum | "-" | |
869 | ||
870 | The server SHOULD set this variable. If the hostname is not | |
871 | available for performance reasons or otherwise, the server MAY | |
872 | substitute the REMOTE_ADDR value. | |
873 | ||
874 | 4.1.10. REMOTE_IDENT | |
875 | ||
876 | The REMOTE_IDENT variable MAY be used to provide identity information | |
877 | reported about the connection by an RFC 1413 [20] request to the | |
878 | remote agent, if available. The server may choose not to support | |
879 | this feature, or not to request the data for efficiency reasons, or | |
880 | not to return available identity data. | |
881 | ||
882 | REMOTE_IDENT = *TEXT | |
883 | ||
884 | The data returned may be used for authentication purposes, but the | |
885 | level of trust reposed in it should be minimal. | |
886 | ||
887 | 4.1.11. REMOTE_USER | |
888 | ||
889 | The REMOTE_USER variable provides a user identification string | |
890 | supplied by the client as part of user authentication. | |
891 | ||
892 | REMOTE_USER = *TEXT | |
893 | ||
903 | If the client request required HTTP Authentication [5] (e.g., the | |
904 | AUTH_TYPE meta-variable is set to "Basic" or "Digest"), then the | |
905 | value of the REMOTE_USER meta-variable MUST be set to the user-ID | |
906 | supplied. | |
907 | ||
908 | 4.1.12. REQUEST_METHOD | |
909 | ||
910 | The REQUEST_METHOD meta-variable MUST be set to the method which | |
911 | should be used by the script to process the request, as described in | |
912 | section 4.3. | |
913 | ||
914 | REQUEST_METHOD = method | |
915 | method = "GET" | "POST" | "HEAD" | extension-method | |
916 | extension-method = "PUT" | "DELETE" | token | |
917 | ||
918 | The method is case sensitive. The HTTP methods are described in | |
919 | section 5.1.1 of the HTTP/1.0 specification [1] and section 5.1.1 of | |
920 | the HTTP/1.1 specification [4]. | |
921 | ||
922 | 4.1.13. SCRIPT_NAME | |
923 | ||
924 | The SCRIPT_NAME variable MUST be set to a URI path (not URL-encoded) | |
925 | which could identify the CGI script (rather than the script's | |
926 | output). The syntax is the same as for PATH_INFO (section 4.1.5). | |
927 | ||
928 | SCRIPT_NAME = "" | ( "/" path ) | |
929 | ||
930 | The leading "/" is not part of the path. It is optional if the path | |
931 | is NULL; however, the variable MUST still be set in that case. | |
932 | ||
933 | The SCRIPT_NAME string forms some leading part of the path component | |
934 | of the Script-URI derived in some implementation-defined manner. No | |
935 | PATH_INFO segment (see section 4.1.5) is included in the SCRIPT_NAME | |
936 | value. | |
937 | ||
938 | 4.1.14. SERVER_NAME | |
939 | ||
940 | The SERVER_NAME variable MUST be set to the name of the server host | |
941 | to which the client request is directed. It is a case-insensitive | |
942 | hostname or network address. It forms the host part of the | |
943 | Script-URI. | |
944 | ||
945 | SERVER_NAME = server-name | |
946 | server-name = hostname | ipv4-address | ( "[" ipv6-address "]" ) | |
947 | ||
959 | A deployed server can have more than one possible value for this | |
960 | variable, where several HTTP virtual hosts share the same IP address. | |
961 | In that case, the server would use the contents of the request's Host | |
962 | header field to select the correct virtual host. | |
963 | ||
964 | 4.1.15. SERVER_PORT | |
965 | ||
966 | The SERVER_PORT variable MUST be set to the TCP/IP port number on | |
967 | which this request is received from the client. This value is used | |
968 | in the port part of the Script-URI. | |
969 | ||
970 | SERVER_PORT = server-port | |
971 | server-port = 1*digit | |
972 | ||
973 | Note that this variable MUST be set, even if the port is the default | |
974 | port for the scheme and could otherwise be omitted from a URI. | |
975 | ||
976 | 4.1.16. SERVER_PROTOCOL | |
977 | ||
978 | The SERVER_PROTOCOL variable MUST be set to the name and version of | |
979 | the application protocol used for this CGI request. This MAY differ | |
980 | from the protocol version used by the server in its communication | |
981 | with the client. | |
982 | ||
983 | SERVER_PROTOCOL = HTTP-Version | "INCLUDED" | extension-version | |
984 | HTTP-Version = "HTTP" "/" 1*digit "." 1*digit | |
985 | extension-version = protocol [ "/" 1*digit "." 1*digit ] | |
986 | protocol = token | |
987 | ||
988 | Here, 'protocol' defines the syntax of some of the information | |
989 | passing between the server and the script (the 'protocol-specific' | |
990 | features). It is not case sensitive and is usually presented in | |
991 | upper case. The protocol is not the same as the scheme part of the | |
992 | script URI, which defines the overall access mechanism used by the | |
993 | client to communicate with the server. For example, a request that | |
994 | reaches the script with a protocol of "HTTP" may have used an "https" | |
995 | scheme. | |
996 | ||
997 | A well-known value for SERVER_PROTOCOL which the server MAY use is | |
998 | "INCLUDED", which signals that the current document is being included | |
999 | as part of a composite document, rather than being the direct target | |
1000 | of the client request. The script should treat this as an HTTP/1.0 | |
1001 | request. | |
1002 | ||
1015 | 4.1.17. SERVER_SOFTWARE | |
1016 | ||
1017 | The SERVER_SOFTWARE meta-variable MUST be set to the name and version | |
1018 | of the information server software making the CGI request (and | |
1019 | running the gateway). It SHOULD be the same as the server | |
1020 | description reported to the client, if any. | |
1021 | ||
1022 | SERVER_SOFTWARE = 1*( product | comment ) | |
1023 | product = token [ "/" product-version ] | |
1024 | product-version = token | |
1025 | comment = "(" *( ctext | comment ) ")" | |
1026 | ctext = <any TEXT excluding "(" and ")"> | |
1027 | ||
1028 | 4.1.18. Protocol-Specific Meta-Variables | |
1029 | ||
1030 | The server SHOULD set meta-variables specific to the protocol and | |
1031 | scheme for the request. Interpretation of protocol-specific | |
1032 | variables depends on the protocol version in SERVER_PROTOCOL. The | |
1033 | server MAY set a meta-variable with the name of the scheme to a | |
1034 | non-NULL value if the scheme is not the same as the protocol. The | |
1035 | presence of such a variable indicates to a script which scheme is | |
1036 | used by the request. | |
1037 | ||
1038 | Meta-variables with names beginning with "HTTP_" contain values read | |
1039 | from the client request header fields, if the protocol used is HTTP. | |
1040 | The HTTP header field name is converted to upper case, has all | |
1041 | occurrences of "-" replaced with "_" and has "HTTP_" prepended to | |
1042 | give the meta-variable name. The header data can be presented as | |
1043 | sent by the client, or can be rewritten in ways which do not change | |
1044 | its semantics. If multiple header fields with the same field-name | |
1045 | are received then the server MUST rewrite them as a single value | |
1046 | having the same semantics. Similarly, a header field that spans | |
1047 | multiple lines MUST be merged onto a single line. The server MUST, | |
1048 | if necessary, change the representation of the data (for example, the | |
1049 | character set) to be appropriate for a CGI meta-variable. | |
1050 | ||
1051 | The server is not required to create meta-variables for all the | |
1052 | header fields that it receives. In particular, it SHOULD remove any | |
1053 | header fields carrying authentication information, such as | |
1054 | 'Authorization'; or that are available to the script in other | |
1055 | variables, such as 'Content-Length' and 'Content-Type'. The server | |
1056 | MAY remove header fields that relate solely to client-side | |
1057 | communication issues, such as 'Connection'. | |
1058 | ||
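A server-side sketch of the "HTTP_" naming rule, written in Python, might look like this. The function names are hypothetical. Joining repeated fields with ", " follows the usual HTTP list convention and is an assumption here; the text only requires a single value with the same semantics.

```python
import re

# Fields the server SHOULD NOT pass on as HTTP_* variables.
_OMITTED = {"authorization", "content-length", "content-type"}


def header_to_meta_variable(field_name):
    """Upper-case the field name, replace "-" with "_", prepend "HTTP_"."""
    return "HTTP_" + field_name.upper().replace("-", "_")


def http_meta_variables(headers):
    """Build HTTP_* meta-variables from an iterable of (name, value) pairs."""
    merged = {}
    for name, value in headers:
        if name.lower() in _OMITTED:
            continue
        key = header_to_meta_variable(name)
        # Merge a field that spans multiple lines onto a single line.
        value = re.sub(r"\r?\n[ \t]+", " ", value)
        if key in merged:
            merged[key] += ", " + value   # assumed list-style combination
        else:
            merged[key] = value
    return merged
```

For example, two `Accept` fields would be presented to the script as one `HTTP_ACCEPT` value.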
1071 | 4.2. Request Message-Body | |
1072 | ||
1073 | Request data is accessed by the script in a system-defined method; | |
1074 | unless defined otherwise, this will be by reading the 'standard | |
1075 | input' file descriptor or file handle. | |
1076 | ||
1077 | Request-Data = [ request-body ] [ extension-data ] | |
1078 | request-body = <CONTENT_LENGTH>OCTET | |
1079 | extension-data = *OCTET | |
1080 | ||
1081 | A request-body is supplied with the request if the CONTENT_LENGTH is | |
1082 | not NULL. The server MUST make at least that many bytes available | |
1083 | for the script to read. The server MAY signal an end-of-file | |
1084 | condition after CONTENT_LENGTH bytes have been read or it MAY supply | |
1085 | extension data. Therefore, the script MUST NOT attempt to read more | |
1086 | than CONTENT_LENGTH bytes, even if more data is available. However, | |
1087 | it is not obliged to read any of the data. | |
1088 | ||
1089 | For non-parsed header (NPH) scripts (section 5), the server SHOULD | |
1090 | attempt to ensure that the data supplied to the script is precisely | |
1091 | as supplied by the client and is unaltered by the server. | |
1092 | ||
1093 | As transfer-codings are not supported on the request-body, the server | |
1094 | MUST remove any such codings from the message-body, and recalculate | |
1095 | the CONTENT_LENGTH. If this is not possible (for example, because of | |
1096 | large buffering requirements), the server SHOULD reject the client | |
1097 | request. It MAY also remove content-codings from the message-body. | |
1098 | ||
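On the script side, the reading rule above can be sketched in Python as follows. The function name and its parameters are assumptions for illustration; by default it reads from the process environment and standard input.

```python
import io
import os
import sys


def read_request_body(environ=None, stream=None):
    """Read at most CONTENT_LENGTH bytes of request-body and nothing more."""
    environ = os.environ if environ is None else environ
    stream = sys.stdin.buffer if stream is None else stream
    raw_length = environ.get("CONTENT_LENGTH", "")
    if not raw_length:
        return b""                       # no request-body was supplied
    remaining = int(raw_length)
    chunks = []
    while remaining > 0:
        chunk = stream.read(remaining)   # never ask for more than is left
        if not chunk:
            break                        # early end-of-file
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# Demonstration with an in-memory stream standing in for standard input;
# the trailing bytes play the role of extension data and are left unread.
demo = read_request_body({"CONTENT_LENGTH": "5"}, io.BytesIO(b"helloEXTRA"))
```

Reading in a loop guards against short reads from a pipe, and capping each read at the remaining count keeps the script from consuming any extension data.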
1099 | 4.3. Request Methods | |
1100 | ||
1101 | The Request Method, as supplied in the REQUEST_METHOD meta-variable, | |
1102 | identifies the processing method to be applied by the script in | |
1103 | producing a response. The script author can choose to implement the | |
1104 | methods most appropriate for the particular application. If the | |
1105 | script receives a request with a method it does not support it SHOULD | |
1106 | reject it with an error (see section 6.3.3). | |
1107 | ||
1108 | 4.3.1. GET | |
1109 | ||
1110 | The GET method indicates that the script should produce a document | |
1111 | based on the meta-variable values. By convention, the GET method is | |
1112 | 'safe' and 'idempotent' and SHOULD NOT have the significance of | |
1113 | taking an action other than producing a document. | |
1114 | ||
1115 | The meaning of the GET method may be modified and refined by | |
1116 | protocol-specific meta-variables. | |
1117 | ||
1127 | 4.3.2. POST | |
1128 | ||
1129 | The POST method is used to request the script perform processing and | |
1130 | produce a document based on the data in the request message-body, in | |
1131 | addition to meta-variable values. A common use is form submission in | |
1132 | HTML [18], intended to initiate processing by the script that has a | |
1133 | permanent effect, such as a change in a database. | |
1134 | ||
1135 | The script MUST check the value of the CONTENT_LENGTH variable before | |
1136 | reading the attached message-body, and SHOULD check the CONTENT_TYPE | |
1137 | value before processing it. | |
1138 | ||
1139 | 4.3.3. HEAD | |
1140 | ||
1141 | The HEAD method requests the script to do sufficient processing to | |
1142 | return the response header fields, without providing a response | |
1143 | message-body. The script MUST NOT provide a response message-body | |
1144 | for a HEAD request. If it does, then the server MUST discard the | |
1145 | message-body when reading the response from the script. | |
1146 | ||
1147 | 4.3.4. Protocol-Specific Methods | |
1148 | ||
1149 | The script MAY implement any protocol-specific method, such as | |
1150 | HTTP/1.1 PUT and DELETE; it SHOULD check the value of SERVER_PROTOCOL | |
1151 | when doing so. | |
1152 | ||
1153 | The server MAY decide that some methods are not appropriate or | |
1154 | permitted for a script, and may handle the methods itself or return | |
1155 | an error to the client. | |
1156 | ||
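As a rough illustration of method handling, the Python sketch below supports only GET and HEAD and rejects everything else. The function name, the `render` callback and the choice of a 501 status for unsupported methods are assumptions, not requirements taken from this section.

```python
def respond(environ, render):
    """Return the CGI response text for a script that supports GET and HEAD."""
    method = environ.get("REQUEST_METHOD", "")
    # The method is case sensitive, so "get" is not the same as "GET".
    if method not in ("GET", "HEAD"):
        return ("Status: 501 Not Implemented\n"
                "Content-Type: text/plain\n"
                "\n"
                "Unsupported request method\n")
    header = "Content-Type: text/plain; charset=utf-8\n\n"
    if method == "HEAD":
        return header                    # header fields only, no message-body
    return header + render(environ)
```

A script would write the returned text to standard output, for example `sys.stdout.write(respond(os.environ, lambda env: "hello\n"))`.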
1157 | 4.4. The Script Command Line | |
1158 | ||
1159 | Some systems support a method for supplying an array of strings to | |
1160 | the CGI script. This is only used in the case of an 'indexed' HTTP | |
1161 | query, which is identified by a 'GET' or 'HEAD' request with a URI | |
1162 | query string that does not contain any unencoded "=" characters. For | |
1163 | such a request, the server SHOULD treat the query-string as a | |
1164 | search-string and parse it into words, using the rules | |
1165 | ||
1166 | search-string = search-word *( "+" search-word ) | |
1167 | search-word = 1*schar | |
1168 | schar = unreserved | escaped | xreserved | |
1169 | xreserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "," | | |
1170 | "$" | |
1171 | ||
1172 | After parsing, each search-word is URL-decoded, optionally encoded in | |
1173 | a system-defined manner and then added to the command line argument | |
1174 | list. | |
1175 | ||
1183 | If the server cannot create any part of the argument list, then the | |
1184 | server MUST NOT generate any command line information. For example, | |
1185 | the number of arguments may be greater than operating system or | |
1186 | server limits, or one of the words may not be representable as an | |
1187 | argument. | |
1188 | ||
1189 | The script SHOULD check to see if the QUERY_STRING value contains an | |
1190 | unencoded "=" character, and SHOULD NOT use the command line | |
1191 | arguments if it does. | |
1192 | ||
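The indexed-query rule can be sketched in Python as below. The function name is hypothetical, and the sketch does not validate each character against the schar rule; it only checks the method, the absence of an unencoded "=", and that every search-word is non-empty.

```python
from urllib.parse import unquote


def command_line_words(method, query_string):
    """Return the argument list for an 'indexed' query, or [] if none applies."""
    if method not in ("GET", "HEAD"):
        return []
    if not query_string or "=" in query_string:
        return []                        # an unencoded "=" means no arguments
    words = query_string.split("+")
    if any(word == "" for word in words):
        return []                        # search-word requires at least one schar
    # unquote() decodes %XX escapes and leaves "+" alone, unlike unquote_plus().
    return [unquote(word) for word in words]
```

An encoded "%3D" does not count as an unencoded "=", so `a%3Db` still yields a single argument `a=b`.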
1193 | 5. NPH Scripts | |
1194 | ||
1195 | 5.1. Identification | |
1196 | ||
1197 | The server MAY support NPH (Non-Parsed Header) scripts; these are | |
1198 | scripts to which the server passes all responsibility for response | |
1199 | processing. | |
1200 | ||
1201 | This specification provides no mechanism for an NPH script to be | |
1202 | identified on the basis of its output data alone. By convention, | |
1203 | therefore, any particular script can only ever provide output of one | |
1204 | type (NPH or CGI) and hence the script itself is described as an 'NPH | |
1205 | script'. A server with NPH support MUST provide an implementation- | |
1206 | defined mechanism for identifying NPH scripts, perhaps based on the | |
1207 | name or location of the script. | |
1208 | ||
1209 | 5.2. NPH Response | |
1210 | ||
1211 | There MUST be a system-defined method for the script to send data | |
1212 | back to the server or client; a script MUST always return some data. | |
1213 | Unless defined otherwise, this will be the same as for conventional | |
1214 | CGI scripts. | |
1215 | ||
1216 | Currently, NPH scripts are only defined for HTTP client requests. An | |
1217 | (HTTP) NPH script MUST return a complete HTTP response message, | |
1218 | currently described in section 6 of the HTTP specifications [1], [4]. | |
1219 | The script MUST use the SERVER_PROTOCOL variable to determine the | |
1220 | appropriate format for a response. It MUST also take account of any | |
1221 | generic or protocol-specific meta-variables in the request as might | |
1222 | be mandated by the particular protocol specification. | |
1223 | ||
1224 | The server MUST ensure that the script output is sent to the client | |
1225 | unmodified. Note that this requires the script to use the correct | |
1226 | character set (US-ASCII [9] and ISO 8859-1 [10] for HTTP) in the | |
1227 | header fields. The server SHOULD attempt to ensure that the script | |
1228 | output is sent directly to the client, with minimal internal and no | |
1229 | transport-visible buffering. | |
1230 | ||
1239 | Unless the implementation defines otherwise, the script MUST NOT | |
1240 | indicate in its response that the client can send further requests | |
1241 | over the same connection. | |
1242 | ||
1243 | 6. CGI Response | |
1244 | ||
1245 | 6.1. Response Handling | |
1246 | ||
1247 | A script MUST always provide a non-empty response, and so there is a | |
1248 | system-defined method for it to send this data back to the server. | |
1249 | Unless defined otherwise, this will be via the 'standard output' file | |
1250 | descriptor. | |
1251 | ||
1252 | The script MUST check the REQUEST_METHOD variable when processing the | |
1253 | request and preparing its response. | |
1254 | ||
1255 | The server MAY implement a timeout period within which data must be | |
1256 | received from the script. If a server implementation defines such a | |
1257 | timeout and receives no data from a script within the timeout period, | |
1258 | the server MAY terminate the script process. | |
1259 | ||
1260 | 6.2. Response Types | |
1261 | ||
1262 | The response comprises a message-header and a message-body, separated | |
1263 | by a blank line. The message-header contains one or more header | |
1264 | fields. The body may be NULL. | |
1265 | ||
1266 | generic-response = 1*header-field NL [ response-body ] | |
1267 | ||
1268 | The script MUST return one of either a document response, a local | |
1269 | redirect response or a client redirect (with optional document) | |
1270 | response. In the response definitions below, the order of header | |
1271 | fields in a response is not significant (despite appearing so in the | |
1272 | BNF). The header fields are defined in section 6.3. | |
1273 | ||
1274 | CGI-Response = document-response | local-redir-response | | |
1275 | client-redir-response | client-redirdoc-response | |
1276 | ||
1277 | 6.2.1. Document Response | |
1278 | ||
1279 | The CGI script can return a document to the user in a document | |
1280 | response, with an optional error code indicating the success status | |
1281 | of the response. | |
1282 | ||
1283 | document-response = Content-Type [ Status ] *other-field NL | |
1284 | response-body | |
1285 | ||
1295 | The script MUST return a Content-Type header field. A Status header | |
1296 | field is optional, and status 200 'OK' is assumed if it is omitted. | |
1297 | The server MUST make any appropriate modifications to the script's | |
1298 | output to ensure that the response to the client complies with the | |
1299 | response protocol version. | |
1300 | ||
1301 | 6.2.2. Local Redirect Response | |
1302 | ||
1303 | The CGI script can return a URI path and query-string | |
1304 | ('local-pathquery') for a local resource in a Location header field. | |
1305 | This indicates to the server that it should reprocess the request | |
1306 | using the path specified. | |
1307 | ||
1308 | local-redir-response = local-Location NL | |
1309 | ||
1310 | The script MUST NOT return any other header fields or a message-body, | |
1311 | and the server MUST generate the response that it would have produced | |
1312 | in response to a request containing the URL | |
1313 | ||
1314 | scheme "://" server-name ":" server-port local-pathquery | |
1315 | ||
1316 | 6.2.3. Client Redirect Response | |
1317 | ||
1318 | The CGI script can return an absolute URI path in a Location header | |
1319 | field, to indicate to the client that it should reprocess the request | |
1320 | using the URI specified. | |
1321 | ||
1322 | client-redir-response = client-Location *extension-field NL | |
1323 | ||
1324 | The script MUST NOT provide any other header fields, except for | |
1325 | server-defined CGI extension fields. For an HTTP client request, the | |
1326 | server MUST generate a 302 'Found' HTTP response message. | |
1327 | ||
1328 | 6.2.4. Client Redirect Response with Document | |
1329 | ||
1330 | The CGI script can return an absolute URI path in a Location header | |
1331 | field together with an attached document, to indicate to the client | |
1332 | that it should reprocess the request using the URI specified. | |
1333 | ||
1334 | client-redirdoc-response = client-Location Status Content-Type | |
1335 | *other-field NL response-body | |
1336 | ||
1337 | The Status header field MUST be supplied and MUST contain a status | |
1338 | value of 302 'Found', or it MAY contain an extension-code, that is, | |
1339 | another valid status code that means client redirection. The server | |
1340 | MUST make any appropriate modifications to the script's output to | |
1341 | ensure that the response to the client complies with the response | |
1342 | protocol version. | |
1343 | ||
1351 | 6.3. Response Header Fields | |
1352 | ||
1353 | The response header fields are either CGI or extension header fields | |
1354 | to be interpreted by the server, or protocol-specific header fields | |
1355 | to be included in the response returned to the client. At least one | |
1356 | CGI field MUST be supplied; each CGI field MUST NOT appear more than | |
1357 | once in the response. The response header fields have the syntax: | |
1358 | ||
1359 | header-field = CGI-field | other-field | |
1360 | CGI-field = Content-Type | Location | Status | |
1361 | other-field = protocol-field | extension-field | |
1362 | protocol-field = generic-field | |
1363 | extension-field = generic-field | |
1364 | generic-field = field-name ":" [ field-value ] NL | |
1365 | field-name = token | |
1366 | field-value = *( field-content | LWSP ) | |
1367 | field-content = *( token | separator | quoted-string ) | |
1368 | ||
1369 | The field-name is not case sensitive. A NULL field value is | |
1370 | equivalent to a field not being sent. Note that each header field in | |
1371 | a CGI-Response MUST be specified on a single line; CGI/1.1 does not | |
1372 | support continuation lines. Whitespace is permitted between the ":" | |
1373 | and the field-value (but not between the field-name and the ":"), and | |
1374 | also between tokens in the field-value. | |
1375 | ||
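A server might check these constraints with something like the following Python sketch. The function name and error handling are assumptions; the input is taken to be already-decoded text, and CR LF line ends are normalised to LF first.

```python
CGI_FIELDS = {"content-type", "location", "status"}


def parse_cgi_response(raw):
    """Split script output into ({lower-case field-name: [values]}, body)."""
    raw = raw.replace("\r\n", "\n")
    head, _, body = raw.partition("\n\n")     # a blank line ends the header
    fields = {}
    for line in head.split("\n"):
        name, colon, value = line.partition(":")
        # No whitespace before ":" and no continuation lines are allowed.
        if not colon or not name or name != name.strip():
            raise ValueError(f"malformed header field: {line!r}")
        value = value.strip()
        if not value:
            continue                          # NULL value: treated as not sent
        key = name.lower()                    # field-name is not case sensitive
        if key in CGI_FIELDS and key in fields:
            raise ValueError(f"CGI field repeated: {name}")
        fields.setdefault(key, []).append(value)
    if CGI_FIELDS.isdisjoint(fields):
        raise ValueError("at least one CGI field must be supplied")
    return fields, body
```

Values are kept in lists so that repeated protocol or extension fields survive, while a repeated CGI field is reported as an error.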
1376 | 6.3.1. Content-Type | |
1377 | ||
1378 | The Content-Type response field sets the Internet Media Type [6] of | |
1379 | the entity body. | |
1380 | ||
1381 | Content-Type = "Content-Type:" media-type NL | |
1382 | ||
1383 | If an entity body is returned, the script MUST supply a Content-Type | |
1384 | field in the response. If it fails to do so, the server SHOULD NOT | |
1385 | attempt to determine the correct content type. The value SHOULD be | |
1386 | sent unmodified to the client, except for any charset parameter | |
1387 | changes. | |
1388 | ||
1389 | Unless it is otherwise system-defined, the default charset assumed by | |
1390 | the client for text media-types is ISO-8859-1 if the protocol is HTTP | |
1391 | and US-ASCII otherwise. Hence the script SHOULD include a charset | |
1392 | parameter. See section 3.4.1 of the HTTP/1.1 specification [4] for a | |
1393 | discussion of this issue. | |
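
A minimal sketch of a script-side helper that always includes a charset
parameter, as recommended above. The name document_response and its
default arguments (text/html, utf-8) are assumptions for illustration
only; the blank line after the header follows the document response
format described in section 6.2.1.

```python
def document_response(body_text, media_type="text/html", charset="utf-8"):
    """Return a CGI document response as bytes.

    The body is encoded with the same charset that is announced in the
    Content-Type field, so the two cannot disagree.
    """
    body = body_text.encode(charset)
    header = "Content-Type: {}; charset={}\n\n".format(media_type, charset)
    return header.encode("ascii") + body
```

A script would write the returned bytes to its standard output in
binary mode.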
1394 | ||
1407 | 6.3.2. Location | |
1408 | ||
1409 | The Location header field is used to specify to the server that the | |
1410 | script is returning a reference to a document rather than an actual | |
1411 | document (see sections 6.2.3 and 6.2.4). It is either an absolute | |
1412 | URI (optionally with a fragment identifier), indicating that the | |
1413 | client is to fetch the referenced document, or a local URI path | |
1414 | (optionally with a query string), indicating that the server is to | |
1415 | fetch the referenced document and return it to the client as the | |
1416 | response. | |
1417 | ||
1418 | Location = local-Location | client-Location | |
1419 | client-Location = "Location:" fragment-URI NL | |
1420 | local-Location = "Location:" local-pathquery NL | |
1421 | fragment-URI = absoluteURI [ "#" fragment ] | |
1422 | fragment = *uric | |
1423 | local-pathquery = abs-path [ "?" query-string ] | |
1424 | abs-path = "/" path-segments | |
1425 | path-segments = segment *( "/" segment ) | |
1426 | segment = *pchar | |
1427 | pchar = unreserved | escaped | extra | |
1428 | extra = ":" | "@" | "&" | "=" | "+" | "$" | "," | |
1429 | ||
1430 | The syntax of an absoluteURI is incorporated into this document from | |
1431 | that specified in RFC 2396 [2] and RFC 2732 [7]. A valid absoluteURI | |
1432 | always starts with the name of the scheme followed by ":"; scheme names | |
1433 | start with a letter and continue with alphanumerics, "+", "-" or ".". | |
1434 | The local URI path and query must be an absolute path, and not a | |
1435 | relative path or NULL, and hence must start with a "/". | |
1436 | ||
1437 | Note that any message-body attached to the request (such as for a | |
1438 | POST request) may not be available to the resource that is the target | |
1439 | of the redirect. | |
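
The distinction between the two Location forms can be sketched as
follows. This is a coarse check under stated assumptions, not a full
URI validator: the function name classify_location and its return
values ("local", "client", None) are hypothetical, and only the leading
"/" and the scheme prefix are examined.

```python
import re

# A scheme starts with a letter and continues with alphanumerics,
# "+", "-" or ".", and is followed by ":".
_SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def classify_location(value):
    """Return "local", "client" or None for a Location field value."""
    if value.startswith("/"):
        # local-pathquery has no fragment part, so "#" is rejected.
        return None if "#" in value else "local"
    if _SCHEME.match(value):
        return "client"              # absolute URI, fragment allowed
    return None                      # relative path or NULL: not valid
```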
1440 | ||
1441 | 6.3.3. Status | |
1442 | ||
1443 | The Status header field contains a 3-digit integer result code that | |
1444 | indicates the level of success of the script's attempt to handle the | |
1445 | request. | |
1446 | ||
1447 | Status = "Status:" status-code SP reason-phrase NL | |
1448 | status-code = "200" | "302" | "400" | "501" | extension-code | |
1449 | extension-code = 3digit | |
1450 | reason-phrase = *TEXT | |
1451 | ||
1452 | Status code 200 'OK' indicates success, and is the default value | |
1453 | assumed for a document response. Status code 302 'Found' is used | |
1454 | with a Location header field and response message-body. Status code | |
1463 | 400 'Bad Request' may be used for an unknown request format, such as | |
1464 | a missing CONTENT_TYPE. Status code 501 'Not Implemented' may be | |
1465 | returned by a script if it receives an unsupported REQUEST_METHOD. | |
1466 | ||
1467 | Other valid status codes are listed in section 6.1.1 of the HTTP | |
1468 | specifications [1], [4], and also the IANA HTTP Status Code Registry | |
1469 | [8] and MAY be used in addition to or instead of the ones listed | |
1470 | above. The script SHOULD check the value of SERVER_PROTOCOL before | |
1471 | using HTTP/1.1 status codes. The script MAY reject with error 405 | |
1472 | 'Method Not Allowed' HTTP/1.1 requests made using a method it does | |
1473 | not support. | |
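
One way a script might choose between 405 and 501 is sketched below.
The function name unsupported_method_status is an assumption, and the
environ argument stands for a mapping of meta-variables such as
os.environ.

```python
def unsupported_method_status(environ):
    """Return a Status header line for an unsupported REQUEST_METHOD.

    405 is an HTTP/1.1 status code, so it is used only when
    SERVER_PROTOCOL says the request is HTTP/1.1; otherwise the
    script falls back to 501.
    """
    if environ.get("SERVER_PROTOCOL", "") == "HTTP/1.1":
        return "Status: 405 Method Not Allowed\n"
    return "Status: 501 Not Implemented\n"
```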
1474 | ||
1475 | Note that returning an error status code does not have to mean an | |
1476 | error condition with the script itself. For example, a script that | |
1477 | is invoked as an error handler by the server should return the code | |
1478 | appropriate to the server's error condition. | |
1479 | ||
1480 | The reason-phrase is a textual description of the error to be | |
1481 | returned to the client for human consumption. | |
1482 | ||
1483 | 6.3.4. Protocol-Specific Header Fields | |
1484 | ||
1485 | The script MAY return any other header fields that relate to the | |
1486 | response message defined by the specification for the SERVER_PROTOCOL | |
1487 | (HTTP/1.0 [1] or HTTP/1.1 [4]). The server MUST translate the header | |
1488 | data from the CGI header syntax to the HTTP header syntax if these | |
1489 | differ. For example, the character sequence for newline (such as | |
1490 | UNIX's US-ASCII LF) used by CGI scripts may not be the same as that | |
1491 | used by HTTP (US-ASCII CR followed by LF). | |
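
The newline translation mentioned above might look like the following
on the server side. This is a sketch only: the name
to_http_header_lines is hypothetical, and it assumes the header block
has already been separated from the message-body.

```python
def to_http_header_lines(cgi_header_block):
    """Rewrite CGI header text so that every line ends in CR LF.

    Accepts LF or CR LF as the CGI newline and drops empty lines,
    leaving the caller to add the blank line that ends the header.
    """
    lines = cgi_header_block.replace("\r\n", "\n").split("\n")
    return "".join(line + "\r\n" for line in lines if line)
```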
1492 | ||
1493 | The script MUST NOT return any header fields that relate to | |
1494 | client-side communication issues and could affect the server's | |
1495 | ability to send the response to the client. The server MAY remove | |
1496 | any such header fields returned by the script. It SHOULD resolve any | |
1497 | conflicts between header fields returned by the script and header | |
1498 | fields that it would otherwise send itself. | |
1499 | ||
1500 | 6.3.5. Extension Header Fields | |
1501 | ||
1502 | There may be additional implementation-defined CGI header fields, | |
1503 | whose field names SHOULD begin with "X-CGI-". The server MAY ignore | |
1504 | (and delete) any unrecognised header fields with names beginning | |
1505 | "X-CGI-" that are received from the script. | |
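
A server that chooses to delete unrecognised extension fields could do
so along these lines. The function name and the recognised parameter
are assumptions for this sketch; header fields are represented as
(name, value) pairs.

```python
def drop_unrecognised_extension_fields(fields, recognised=()):
    """Remove "X-CGI-" fields that the server does not recognise.

    Field names are compared without regard to case. Fields whose
    names do not begin with "X-CGI-" are passed through untouched.
    """
    known = {name.lower() for name in recognised}
    kept = []
    for name, value in fields:
        lowered = name.lower()
        if lowered.startswith("x-cgi-") and lowered not in known:
            continue
        kept.append((name, value))
    return kept
```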
1506 | ||
1519 | 6.4. Response Message-Body | |
1520 | ||
1521 | The response message-body is an attached document to be returned to | |
1522 | the client by the server. The server MUST read all the data provided | |
1523 | by the script, until the script signals the end of the message-body | |
1524 | by way of an end-of-file condition. The message-body SHOULD be sent | |
1525 | unmodified to the client, except for HEAD requests or any required | |
1526 | transfer-codings, content-codings or charset conversions. | |
1527 | ||
1528 | response-body = *OCTET | |
1529 | ||
1530 | 7. System Specifications | |
1531 | ||
1532 | 7.1. AmigaDOS | |
1533 | ||
1534 | Meta-Variables | |
1535 | Meta-variables are passed to the script in identically named | |
1536 | environment variables. These are accessed by the DOS library | |
1537 | routine GetVar(). The flags argument SHOULD be 0. Case is | |
1538 | ignored, but upper case is recommended for compatibility with | |
1539 | case-sensitive systems. | |
1540 | ||
1541 | The current working directory | |
1542 | The current working directory for the script is set to the | |
1543 | directory containing the script. | |
1544 | ||
1545 | Character set | |
1546 | The US-ASCII character set [9] is used for the definition of | |
1547 | meta-variables, header fields and values; the newline (NL) | |
1548 | sequence is LF; servers SHOULD also accept CR LF as a newline. | |
1549 | ||
1550 | 7.2. UNIX | |
1551 | ||
1552 | For UNIX compatible operating systems, the following are defined: | |
1553 | ||
1554 | Meta-Variables | |
1555 | Meta-variables are passed to the script in identically named | |
1556 | environment variables. These are accessed by the C library | |
1557 | routine getenv() or variable environ. | |
1558 | ||
1559 | The command line | |
1560 | This is accessed using the argc and argv arguments to main(). The | |
1561 | words have any characters which are 'active' in the Bourne shell | |
1562 | escaped with a backslash. | |
1563 | ||
1564 | The current working directory | |
1565 | The current working directory for the script SHOULD be set to the | |
1566 | directory containing the script. | |
1567 | ||
1575 | Character set | |
1576 | The US-ASCII character set [9], excluding NUL, is used for the | |
1577 | definition of meta-variables, header fields and CHAR values; TEXT | |
1578 | values use ISO-8859-1. The PATH_TRANSLATED value can contain any | |
1579 | 8-bit byte except NUL. The newline (NL) sequence is LF; servers | |
1580 | should also accept CR LF as a newline. | |
1581 | ||
1582 | 7.3. EBCDIC/POSIX | |
1583 | ||
1584 | For POSIX compatible operating systems using the EBCDIC character | |
1585 | set, the following are defined: | |
1586 | ||
1587 | Meta-Variables | |
1588 | Meta-variables are passed to the script in identically named | |
1589 | environment variables. These are accessed by the C library | |
1590 | routine getenv(). | |
1591 | ||
1592 | The command line | |
1593 | This is accessed using the argc and argv arguments to main(). The | |
1594 | words have any characters which are 'active' in the Bourne shell | |
1595 | escaped with a backslash. | |
1596 | ||
1597 | The current working directory | |
1598 | The current working directory for the script SHOULD be set to the | |
1599 | directory containing the script. | |
1600 | ||
1601 | Character set | |
1602 | The IBM1047 character set [21], excluding NUL, is used for the | |
1603 | definition of meta-variables, header fields, values, TEXT strings | |
1604 | and the PATH_TRANSLATED value. The newline (NL) sequence is LF; | |
1605 | servers should also accept CR LF as a newline. | |
1606 | ||
1607 | media-type charset default | |
1608 | The default charset value for text (and other implementation- | |
1609 | defined) media types is IBM1047. | |
1610 | ||
1611 | 8. Implementation | |
1612 | ||
1613 | 8.1. Recommendations for Servers | |
1614 | ||
1615 | Although the server and the CGI script need not be consistent in | |
1616 | their handling of URL paths (client URLs and the PATH_INFO data, | |
1617 | respectively), server authors may wish to impose consistency. So the | |
1618 | server implementation should specify its behaviour for the following | |
1619 | cases: | |
1620 | ||
1621 | 1. define any restrictions on allowed path segments, in particular | |
1622 | whether non-terminal NULL segments are permitted; | |
1623 | ||
1631 | 2. define the behaviour for "." or ".." path segments; i.e., | |
1632 | whether they are prohibited, treated as ordinary path segments | |
1633 | or interpreted in accordance with the relative URL | |
1634 | specification [2]; | |
1635 | ||
1636 | 3. define any limits of the implementation, including limits on | |
1637 | path or search string lengths, and limits on the volume of | |
1638 | header fields the server will parse. | |
1639 | ||
1640 | 8.2. Recommendations for Scripts | |
1641 | ||
1642 | If the script does not intend to process the PATH_INFO data, then it | |
1643 | should reject the request with 404 'Not Found' if PATH_INFO is not | |
1644 | NULL. | |
1645 | ||
1646 | If the output of a form is being processed, check that CONTENT_TYPE | |
1647 | is "application/x-www-form-urlencoded" [18] or "multipart/form-data" | |
1648 | [16]. If CONTENT_TYPE is blank, the script can reject the request | |
1649 | with a 415 'Unsupported Media Type' error, where supported by the | |
1650 | protocol. | |
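
The CONTENT_TYPE check for form submissions can be sketched as below.
The function name form_content_type_status is hypothetical. Falling
back to 400 'Bad Request' when the protocol is not HTTP/1.1 is an
assumption based on the description of that code in section 6.3.3.

```python
_FORM_TYPES = ("application/x-www-form-urlencoded", "multipart/form-data")


def form_content_type_status(environ):
    """Return None if CONTENT_TYPE is a form type, else a Status line."""
    content_type = environ.get("CONTENT_TYPE", "")
    # Ignore parameters such as "; boundary=..." or "; charset=...".
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in _FORM_TYPES:
        return None
    if environ.get("SERVER_PROTOCOL", "") == "HTTP/1.1":
        return "Status: 415 Unsupported Media Type\n"
    return "Status: 400 Bad Request\n"
```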
1651 | ||
1652 | When parsing PATH_INFO, PATH_TRANSLATED or SCRIPT_NAME the script | |
1653 | should be careful of void path segments ("//") and special path | |
1654 | segments ("." and ".."). They should either be removed from the path | |
1655 | before use in OS system calls, or the request should be rejected with | |
1656 | 404 'Not Found'. | |
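
A script that prefers to reject such paths rather than rewrite them
might test for them as follows. The function name has_unsafe_segments
is an assumption; the input is expected to be a PATH_INFO-style value
that is either empty or begins with "/".

```python
def has_unsafe_segments(path):
    """Report void ("//") or special ("." and "..") path segments.

    A single trailing "/" is allowed. A non-empty path that does not
    start with "/" is treated as unsafe.
    """
    if path == "":
        return False
    if not path.startswith("/"):
        return True
    segments = path.split("/")[1:]      # drop the part before the first "/"
    if segments and segments[-1] == "":
        segments = segments[:-1]        # tolerate one trailing "/"
    return any(seg in ("", ".", "..") for seg in segments)
```

When this returns True the script can respond with 404 'Not Found'
before the path reaches any operating system call.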
1657 | ||
1658 | When returning header fields, the script should try to send the CGI | |
1659 | header fields as soon as possible, and should send them before any | |
1660 | HTTP header fields. This may help reduce the server's memory | |
1661 | requirements. | |
1662 | ||
1663 | Script authors should be aware that the REMOTE_ADDR and REMOTE_HOST | |
1664 | meta-variables (see sections 4.1.8 and 4.1.9) may not identify the | |
1665 | ultimate source of the request. They identify the client for the | |
1666 | immediate request to the server; that client may be a proxy, gateway, | |
1667 | or other intermediary acting on behalf of the actual source client. | |
1668 | ||
1669 | 9. Security Considerations | |
1670 | ||
1671 | 9.1. Safe Methods | |
1672 | ||
1673 | As discussed in the security considerations of the HTTP | |
1674 | specifications [1], [4], the convention has been established that the | |
1675 | GET and HEAD methods should be 'safe' and 'idempotent' (repeated | |
1676 | requests have the same effect as a single request). See section 9.1 | |
1677 | of RFC 2616 [4] for a full discussion. | |
1678 | ||
1687 | 9.2. Header Fields Containing Sensitive Information | |
1688 | ||
1689 | Some HTTP header fields may carry sensitive information which the | |
1690 | server should not pass on to the script unless explicitly configured | |
1691 | to do so. For example, if the server protects the script by using | |
1692 | the Basic authentication scheme, then the client will send an | |
1693 | Authorization header field containing a username and password. The | |
1694 | server validates this information and so it should not pass on the | |
1695 | password via the HTTP_AUTHORIZATION meta-variable without careful | |
1696 | consideration. This also applies to the Proxy-Authorization header | |
1697 | field and the corresponding HTTP_PROXY_AUTHORIZATION meta-variable. | |
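
A server could withhold these fields by default when it builds the
protocol-specific meta-variables, as in this sketch. The function name
http_meta_variables and the pass_sensitive flag are hypothetical; the
name conversion (upper case, "-" replaced by "_", "HTTP_" prepended) is
the one described for protocol-specific meta-variables in section
4.1.18.

```python
_SENSITIVE = frozenset({"authorization", "proxy-authorization"})


def http_meta_variables(headers, pass_sensitive=False):
    """Map request header (name, value) pairs to HTTP_* meta-variables.

    Authorization and Proxy-Authorization are dropped unless the
    caller has been explicitly configured to pass them on.
    """
    result = {}
    for name, value in headers:
        if name.lower() in _SENSITIVE and not pass_sensitive:
            continue
        result["HTTP_" + name.upper().replace("-", "_")] = value
    return result
```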
1698 | ||
1699 | 9.3. Data Privacy | |
1700 | ||
1701 | Confidential data in a request should be placed in a message-body as | |
1702 | part of a POST request, and not placed in the URI or message headers. | |
1703 | On some systems, the environment used to pass meta-variables to a | |
1704 | script may be visible to other scripts or users. In addition, many | |
1705 | existing servers, proxies and clients will permanently record the URI | |
1706 | where it might be visible to third parties. | |
1707 | ||
1708 | 9.4. Information Security Model | |
1709 | ||
1710 | For a client connection using TLS, the security model applies between | |
1711 | the client and the server, and not between the client and the script. | |
1712 | It is the server's responsibility to handle the TLS session, and thus | |
1713 | it is the server which is authenticated to the client, not the CGI | |
1714 | script. | |
1715 | ||
1716 | This specification provides no mechanism for the script to | |
1717 | authenticate the server which invoked it. There is no enforced | |
1718 | integrity on the CGI request and response messages. | |
1719 | ||
1720 | 9.5. Script Interference with the Server | |
1721 | ||
1722 | The most common implementation of CGI invokes the script as a child | |
1723 | process using the same user and group as the server process. It | |
1724 | should therefore be ensured that the script cannot interfere with the | |
1725 | server process, its configuration, documents or log files. | |
1726 | ||
1727 | If the script is executed by calling a function linked in to the | |
1728 | server software (either at compile-time or run-time) then precautions | |
1729 | should be taken to protect the core memory of the server, or to | |
1730 | ensure that untrusted code cannot be executed. | |
1731 | ||
1743 | 9.6. Data Length and Buffering Considerations | |
1744 | ||
1745 | This specification places no limits on the length of the message-body | |
1746 | presented to the script. The script should not assume that | |
1747 | statically allocated buffers of any size are sufficient to contain | |
1748 | the entire submission at one time. Use of a fixed length buffer | |
1749 | without careful overflow checking may result in an attacker | |
1750 | exploiting 'stack-smashing' or 'stack-overflow' vulnerabilities of | |
1751 | the operating system. The script may spool large submissions to disk | |
1752 | or other buffering media, but a rapid succession of large submissions | |
1753 | may result in denial of service conditions. If the CONTENT_LENGTH of | |
1754 | a message-body is larger than resource considerations allow, scripts | |
1755 | should respond with an error status appropriate for the protocol | |
1756 | version; potentially applicable status codes include 503 'Service | |
1757 | Unavailable' (HTTP/1.0 and HTTP/1.1), 413 'Request Entity Too Large' | |
1758 | (HTTP/1.1), and 414 'Request-URI Too Large' (HTTP/1.1). | |
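
The following sketch shows one way for a script to bound the amount of
request data it accepts. The function name read_request_body, the
max_bytes default and the 64 KiB chunk size are assumptions for this
example; the stream argument stands for the script's standard input
opened in binary mode.

```python
import io


def read_request_body(environ, stream, max_bytes=1_000_000):
    """Return (status_line, body); status_line is None on success."""
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        return "Status: 400 Bad Request\n", b""
    if length < 0:
        return "Status: 400 Bad Request\n", b""
    if length > max_bytes:
        # 413 is an HTTP/1.1 code; 503 is available in HTTP/1.0 as well.
        if environ.get("SERVER_PROTOCOL") == "HTTP/1.1":
            return "Status: 413 Request Entity Too Large\n", b""
        return "Status: 503 Service Unavailable\n", b""
    buffer = io.BytesIO()               # grows as needed, no fixed size
    remaining = length
    while remaining > 0:
        chunk = stream.read(min(remaining, 65536))
        if not chunk:
            break                       # sender stopped early
        buffer.write(chunk)
        remaining -= len(chunk)
    return None, buffer.getvalue()
```

Checking CONTENT_LENGTH before reading means an oversized submission is
refused without being copied into memory or spooled to disk.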
1759 | ||
1760 | Similar considerations apply to the server's handling of the CGI | |
1761 | response from the script. There is no limit on the length of the | |
1762 | header or message-body returned by the script; the server should not | |
1763 | assume that statically allocated buffers of any size are sufficient | |
1764 | to contain the entire response. | |
1765 | ||
1766 | 9.7. Stateless Processing | |
1767 | ||
1768 | The stateless nature of the Web makes each script execution and | |
1769 | resource retrieval independent of all others even when multiple | |
1770 | requests constitute a single conceptual Web transaction. Because of | |
1771 | this, a script should not make any assumptions about the context of | |
1772 | the user-agent submitting a request. In particular, scripts should | |
1773 | examine data obtained from the client and verify that they are valid, | |
1774 | both in form and content, before allowing them to be used for | |
1775 | sensitive purposes such as input to other applications, commands, or | |
1776 | operating system services. These uses include (but are not limited | |
1777 | to) system call arguments, database writes, dynamically evaluated | |
1778 | source code, and input to billing or other secure processes. It is | |
1779 | important that applications be protected from invalid input | |
1780 | regardless of whether the invalidity is the result of user error, | |
1781 | logic error, or malicious action. | |
1782 | ||
1783 | Authors of scripts involved in multi-request transactions should be | |
1784 | particularly cautious about validating the state information; | |
1785 | undesirable effects may result from the substitution of dangerous | |
1786 | values for portions of the submission which might otherwise be | |
1787 | presumed safe. Subversion of this type occurs when alterations are | |
1788 | made to data from a prior stage of the transaction that were not | |
1789 | meant to be controlled by the client (e.g., hidden HTML form | |
1790 | elements, cookies, embedded URLs, etc.). | |
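
This specification does not prescribe how state information is to be
validated. One common approach, shown here purely as an illustration,
is to attach a keyed hash (HMAC) to state that is round-tripped through
the client, so that alterations can be detected. The function names
sign_state and verify_state, the "." delimiter and the use of SHA-256
are all assumptions of this sketch.

```python
import hashlib
import hmac


def sign_state(state, key):
    """Return state with a hex HMAC-SHA256 tag appended after a "."."""
    tag = hmac.new(key, state.encode("utf-8"), hashlib.sha256).hexdigest()
    return state + "." + tag


def verify_state(signed, key):
    """Return the original state if the tag matches, else None."""
    state, sep, tag = signed.rpartition(".")
    if not sep:
        return None                     # no tag present at all
    expected = hmac.new(key, state.encode("utf-8"), hashlib.sha256).hexdigest()
    # Compare as bytes in constant time; client input may be non-ASCII.
    if hmac.compare_digest(tag.encode("utf-8"), expected.encode("ascii")):
        return state
    return None
```

The key is a secret byte string held by the script and never sent to
the client. A matching tag shows only that the value was not altered;
the script still has to check that the state is valid for the current
request.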
1791 | ||
1799 | 9.8. Relative Paths | |
1800 | ||
1801 | The server should be careful of ".." path segments in the request | |
1802 | URI. These should be removed or resolved in the request URI before | |
1803 | it is split into the script-path and extra-path. Alternatively, when | |
1804 | the extra-path is used to find the PATH_TRANSLATED, care should be | |
1805 | taken to prevent the path resolution from providing translated paths | |
1806 | outside an expected path hierarchy. | |
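
A lexical containment check for the second approach might look like
this. It is a sketch under stated assumptions: the function name
translate_extra_path and the document_root parameter are hypothetical,
paths use "/" separators, and symbolic links are not examined, so a
real server would need further checks.

```python
import posixpath


def translate_extra_path(document_root, extra_path):
    """Map extra_path beneath document_root.

    Returns the normalised path, or None if resolving ".." segments
    would place it outside document_root.
    """
    root = posixpath.normpath(document_root)
    joined = posixpath.join(root, extra_path.lstrip("/"))
    candidate = posixpath.normpath(joined)
    if candidate == root or candidate.startswith(root.rstrip("/") + "/"):
        return candidate
    return None
```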
1807 | ||
1808 | 9.9. Non-parsed Header Output | |
1809 | ||
1810 | If a script returns non-parsed header output, to be interpreted by | |
1811 | the client in its native protocol, then the script must address all | |
1812 | security considerations relating to that protocol. | |
1813 | ||
1814 | 10. Acknowledgements | |
1815 | ||
1816 | This work is based on the original CGI interface that arose out of | |
1817 | discussions on the 'www-talk' mailing list. In particular, Rob | |
1818 | McCool, John Franks, Ari Luotonen, George Phillips and Tony Sanders | |
1819 | deserve special recognition for their efforts in defining and | |
1820 | implementing the early versions of this interface. | |
1821 | ||
1822 | This document has also greatly benefited from the comments and | |
1823 | suggestions made by Chris Adie, Dave Kristol and Mike Meyer; also David | |
1824 | Morris, Jeremy Madea, Patrick McManus, Adam Donahue, Ross Patterson | |
1825 | and Harald Alvestrand. | |
1826 | ||
1827 | 11. References | |
1828 | ||
1829 | 11.1. Normative References | |
1830 | ||
1831 | [1] Berners-Lee, T., Fielding, R. and H. Frystyk, "Hypertext | |
1832 | Transfer Protocol -- HTTP/1.0", RFC 1945, May 1996. | |
1833 | ||
1834 | [2] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource | |
1835 | Identifiers (URI): Generic Syntax", RFC 2396, August 1998. | |
1836 | ||
1837 | [3] Bradner, S., "Key words for use in RFCs to Indicate Requirement | |
1838 | Levels", BCP 14, RFC 2119, March 1997. | |
1839 | ||
1840 | [4] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., | |
1841 | Leach, P., and T. Berners-Lee, "Hypertext Transfer Protocol -- | |
1842 | HTTP/1.1", RFC 2616, June 1999. | |
1843 | ||
1844 | [5] Franks, J., Hallam-Baker, P., Hostetler, J., Lawrence, S., | |
1845 | Leach, P., Luotonen, A., and L. Stewart, "HTTP Authentication: | |
1846 | Basic and Digest Access Authentication", RFC 2617, June 1999. | |
1847 | ||
1855 | [6] Freed, N. and N. Borenstein, "Multipurpose Internet Mail | |
1856 | Extensions (MIME) Part Two: Media Types", RFC 2046, November | |
1857 | 1996. | |
1858 | ||
1859 | [7] Hinden, R., Carpenter, B., and L. Masinter, "Format for Literal | |
1860 | IPv6 Addresses in URL's", RFC 2732, December 1999. | |
1861 | ||
1862 | [8] "HTTP Status Code Registry", | |
1863 | http://www.iana.org/assignments/http-status-codes, IANA. | |
1864 | ||
1865 | [9] "Information Systems -- Coded Character Sets -- 7-bit American | |
1866 | Standard Code for Information Interchange (7-Bit ASCII)", ANSI | |
1867 | INCITS.4-1986 (R2002). | |
1868 | ||
1869 | [10] "Information technology -- 8-bit single-byte coded graphic | |
1870 | character sets -- Part 1: Latin alphabet No. 1", ISO/IEC | |
1871 | 8859-1:1998. | |
1872 | ||
1873 | 11.2. Informative References | |
1874 | ||
1875 | [11] Berners-Lee, T., "Universal Resource Identifiers in WWW: A | |
1876 | Unifying Syntax for the Expression of Names and Addresses of | |
1877 | Objects on the Network as used in the World-Wide Web", RFC 1630, | |
1878 | June 1994. | |
1879 | ||
1880 | [12] Braden, R., Ed., "Requirements for Internet Hosts -- Application | |
1881 | and Support", STD 3, RFC 1123, October 1989. | |
1882 | ||
1883 | [13] Crocker, D., "Standard for the Format of ARPA Internet Text | |
1884 | Messages", STD 11, RFC 822, August 1982. | |
1885 | ||
1886 | [14] Dierks, T. and C. Allen, "The TLS Protocol Version 1.0", RFC | |
1887 | 2246, January 1999. | |
1888 | ||
1889 | [15] Hinden, R. and S. Deering, "Internet Protocol Version 6 (IPv6) | |
1890 | Addressing Architecture", RFC 3513, April 2003. | |
1891 | ||
1892 | [16] Masinter, L., "Returning Values from Forms: | |
1893 | multipart/form-data", RFC 2388, August 1998. | |
1894 | ||
1895 | [17] Mockapetris, P., "Domain Names - Concepts and Facilities", STD | |
1896 | 13, RFC 1034, November 1987. | |
1897 | ||
1898 | [18] Raggett, D., Le Hors, A., and I. Jacobs, Eds., "HTML 4.01 | |
1899 | Specification", W3C Recommendation December 1999, | |
1900 | http://www.w3.org/TR/html401/. | |
1901 | ||
1902 | [19] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000. | |
1903 | ||
1911 | [20] St. Johns, M., "Identification Protocol", RFC 1413, February | |
1912 | 1993. | |
1913 | ||
1914 | [21] IBM National Language Support Reference Manual Volume 2, | |
1915 | SE09-8002-01, March 1990. | |
1916 | ||
1917 | [22] "The Common Gateway Interface", | |
1918 | http://hoohoo.ncsa.uiuc.edu/cgi/, NCSA, University of Illinois. | |
1919 | ||
1920 | 12. Authors' Addresses | |
1921 | ||
1922 | David Robinson | |
1923 | The Apache Software Foundation | |
1924 | ||
1925 | EMail: drtr@apache.org | |
1926 | ||
1927 | ||
1928 | Ken A. L. Coar | |
1929 | The Apache Software Foundation | |
1930 | ||
1931 | EMail: coar@apache.org | |
1932 | ||
1967 | 13. Full Copyright Statement | |
1968 | ||
1969 | Copyright (C) The Internet Society (2004). This document is subject | |
1970 | to the rights, licenses and restrictions contained in BCP 78 and at | |
1971 | www.rfc-editor.org, and except as set forth therein, the authors | |
1972 | retain all their rights. | |
1973 | ||
1974 | This document and the information contained herein are provided on an | |
1975 | "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS | |
1976 | OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET | |
1977 | ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, | |
1978 | INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE | |
1979 | INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED | |
1980 | WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | |
1981 | ||
1982 | Intellectual Property | |
1983 | ||
1984 | The IETF takes no position regarding the validity or scope of any | |
1985 | Intellectual Property Rights or other rights that might be claimed to | |
1986 | pertain to the implementation or use of the technology described in | |
1987 | this document or the extent to which any license under such rights | |
1988 | might or might not be available; nor does it represent that it has | |
1989 | made any independent effort to identify any such rights. Information | |
1990 | on the ISOC's procedures with respect to rights in ISOC Documents can | |
1991 | be found in BCP 78 and BCP 79. | |
1992 | ||
1993 | Copies of IPR disclosures made to the IETF Secretariat and any | |
1994 | assurances of licenses to be made available, or the result of an | |
1995 | attempt made to obtain a general license or permission for the use of | |
1996 | such proprietary rights by implementers or users of this | |
1997 | specification can be obtained from the IETF on-line IPR repository at | |
1998 | http://www.ietf.org/ipr. | |
1999 | ||
2000 | The IETF invites any interested party to bring to its attention any | |
2001 | copyrights, patents or patent applications, or other proprietary | |
2002 | rights that may cover technology that may be required to implement | |
2003 | this standard. Please address the information to the IETF at | |
2004 | ietf-ipr@ietf.org. | |
2005 | ||
2006 | Acknowledgement | |
2007 | ||
2008 | Funding for the RFC Editor function is currently provided by the | |
2009 | Internet Society. | |