From: hno <> Date: Mon, 10 Jan 2005 23:45:42 +0000 (+0000) Subject: Imported RFC and I-D documents relevant to HTTP proxies X-Git-Tag: SQUID_3_0_PRE4~906 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=fc98416e8e157118dbccf371405050dbe1859284;p=thirdparty%2Fsquid.git Imported RFC and I-D documents relevant to HTTP proxies --- diff --git a/doc/rfc/1-index.txt b/doc/rfc/1-index.txt new file mode 100644 index 0000000000..98afb1c9bb --- /dev/null +++ b/doc/rfc/1-index.txt @@ -0,0 +1,39 @@ +draft-cooper-webi-wpad-00.txt + WPAD protocol documenting how MSIE and several other browsers + automatically find their proxy settings from DHCP and/or DNS + +draft-ietf-wrec-web-pro-00.txt + WCCP 1.0 + +draft-wilson-wrec-wccp-v2-01.txt + WCCP 2.0 + +draft-vinod-carp-v1-03.txt + Microsoft CARP peering algorithm + +rfc1738.txt + Uniform Resource Locators (URL) + +rfc1945.txt + Hypertext Transfer Protocol -- HTTP/1.0 + +rfc2817.txt + Upgrading to TLS Within HTTP/1.1 + Not currently in use, but scheduled to replace https:// + +rfc2818.txt + HTTP Over TLS + Documents the https:// scheme + +rfc2964.txt + Use of HTTP State Management + Cookies + +rfc2965.txt + HTTP State Management Mechanism + Cookies + +rfc3310.txt + Updated Digest specification + Most likely not in use for HTTP. Title says HTTP but all examples + is SIP. diff --git a/doc/rfc/draft-cooper-webi-wpad-00.txt b/doc/rfc/draft-cooper-webi-wpad-00.txt new file mode 100644 index 0000000000..ff194cccb8 --- /dev/null +++ b/doc/rfc/draft-cooper-webi-wpad-00.txt @@ -0,0 +1,1176 @@ + + +Network Working Group I. Cooper +Internet-Draft Equinix +Expires: May 16, 2001 P. Gauthier + Inktomi Corporation + J. Cohen + (Microsoft Corporation) + M. Dunsmuir + (RealNetworks, Inc.) + C. Perkins + Sun Microsystems, Inc. 
+ November 15, 2000 + + + Web Proxy Auto-Discovery Protocol + draft-cooper-webi-wpad-00.txt + +Status of this Memo + + This document is an Internet-Draft and is in full conformance with + all provisions of Section 10 of RFC2026. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF), its areas, and its working groups. Note that + other groups may also distribute working documents as + Internet-Drafts. + + Internet-Drafts are draft documents valid for a maximum of six + months and may be updated, replaced, or obsoleted by other documents + at any time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress." + + The list of current Internet-Drafts can be accessed at + http://www.ietf.org/ietf/1id-abstracts.txt. + + The list of Internet-Draft Shadow Directories can be accessed at + http://www.ietf.org/shadow.html. + + This Internet-Draft will expire on May 16, 2001. + +Copyright Notice + + Copyright (C) The Internet Society (2000). All Rights Reserved. + +Abstract + + A mechanism is needed to permit web clients to locate nearby + (caching) web proxy. Current best practice is for end users to hand + configure their web client (i.e., browser) with the URL of an "auto + configuration file". In large environments this presents a + formidable support problem. It would be much more manageable for + + +Cooper, et. al. Expires May 16, 2001 [Page 1] + +Internet-Draft WPAD November 2000 + + + the web client software to automatically learn the configuration + information for its web proxy settings. This is typically referred + to as a resource discovery problem. + + Web client implementers are faced with a dizzying array of resource + discovery protocols at varying levels of implementation and + deployment. This complexity is hampering deployment of a "web proxy + auto-discovery" facility. This document proposes a pragmatic + approach to web proxy auto-discovery. 
It draws on a number of + proposed standards in the light of practical deployment concerns. It + proposes an escalating strategy of resource discovery attempts in + order to find a nearby web proxy server. It attempts to provide rich + mechanisms for supporting a complex environment, which may contain + multiple web proxy servers. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Cooper, et. al. Expires May 16, 2001 [Page 2] + +Internet-Draft WPAD November 2000 + + +Table of Contents + + 1. Prior Work . . . . . . . . . . . . . . . . . . . . . . . . . 4 + 2. Conventions used in this document . . . . . . . . . . . . . 4 + 3. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 4 + 4. Defining Web Proxy Auto-Discovery . . . . . . . . . . . . . 5 + 5. The Discovery Process . . . . . . . . . . . . . . . . . . . 6 + 5.1 WPAD Overview . . . . . . . . . . . . . . . . . . . . . . . 6 + 5.2 When to Execute WPAD . . . . . . . . . . . . . . . . . . . . 8 + 5.2.1 Upon Startup of the Web Client . . . . . . . . . . . . . . . 8 + 5.2.2 Network Stack Events . . . . . . . . . . . . . . . . . . . . 8 + 5.2.3 Expiration of the CFILE . . . . . . . . . . . . . . . . . . 8 + 5.3 WPAD Protocol Specification . . . . . . . . . . . . . . . . 9 + 5.4 Discovery Mechanisms . . . . . . . . . . . . . . . . . . . . 11 + 5.4.1 DHCP . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 + 5.4.2 Service Location Protocol /SLP . . . . . . . . . . . . . . . 12 + 5.4.3 DNS A/CNAME "Well Known Aliases" . . . . . . . . . . . . . 12 + 5.4.4 DNS SRV Records . . . . . . . . . . . . . . . . . . . . . . 12 + 5.4.5 DNS TXT service: Entries . . . . . . . . . . . . . . . . . . 13 + 5.4.6 Fallback . . . . . . . . . . . . . . . . . . . . . . . . . . 13 + 5.4.7 Timeouts . . . . . . . . . . . . . . . . . . . . . . . . . . 13 + 5.5 Composing a Candidate CURL . . . . . . . . . . . . . . . . . 13 + 5.6 Retrieving the CFILE at the CURL . . . . . . . . . . . . . . 
14 + 5.7 Resuming Discovery . . . . . . . . . . . . . . . . . . . . . 14 + 6. Client Implementation Considerations . . . . . . . . . . . . 14 + 7. Proxy Considerations . . . . . . . . . . . . . . . . . . . . 15 + 8. Administrator Considerations . . . . . . . . . . . . . . . . 15 + 9. Conditional Compliance . . . . . . . . . . . . . . . . . . . 16 + 9.1 Class 0 - Minimally compliant . . . . . . . . . . . . . . . 16 + 9.2 Class 1 - Compliant . . . . . . . . . . . . . . . . . . . . 17 + 9.3 Class 2 - Maximally compliant . . . . . . . . . . . . . . . 17 + 10. Security Considerations . . . . . . . . . . . . . . . . . . 17 + 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 17 + References . . . . . . . . . . . . . . . . . . . . . . . . . 18 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . 19 + Full Copyright Statement . . . . . . . . . . . . . . . . . . 21 + + + + + + + + + + + + + + + +Cooper, et. al. Expires May 16, 2001 [Page 3] + +Internet-Draft WPAD November 2000 + + +1. Prior Work + + This memo is built on the prior work of Paul Gauthier, Josh Cohen, + Martin Dunsmuir and Charles Perkins. Their efforts in producing + previous versions of this work are acknowledged with thanks. + +2. Conventions used in this document + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in BCP4[7]. + +3. Introduction + + The problem of locating nearby web proxies cannot wait for the + implementation and large scale deployment of various upcoming + resource discovery protocols. The widespread success of the HTTP + protocol and the recent popularity of streaming media has placed + unanticipated strains on the networks of corporations, ISPs and + backbone providers. 
There currently is no effective method for these + organizations to realize the obvious benefits of web caching without + tedious and error-prone configuration by each and every end user. + + The de-facto mechanism for specifying a web proxy configuration in + web clients is the download of a script or configuration file named + by a URL. Users are currently expected to hand configure this URL + into their browser or other web client. This mechanism suffers from + a number of drawbacks: + o Difficulty in supporting a large body of end-users. Many users + misconfigure their proxy settings and are unable to diagnose the + cause of their problems. + o Lack of support for mobile clients who require a different proxy + as their point of access changes. + o Lack of support for complex proxy environments where there may + exist a number of proxies with different affinities for different + clients (based on network proximity, for example). Currently, + clients would have to "know" which proxy server was optimal for + their use. + + Currently available methods for resource discovery need to be + exploited in the context of a well defined framework. Simple, + functional and efficient mechanisms stand a good chance of solving + this pressing and basic need. As new resource discovery mechanisms + mature they can be folded into this framework with little difficulty. + + This document is a specification for implementers of web client + software. It defines a protocol for automatically configuring those + clients to use a local proxy. It also defines how an administrator + should configure various resource discovery services in their + + +Cooper, et. al. Expires May 16, 2001 [Page 4] + +Internet-Draft WPAD November 2000 + + + network to support WPAD compatible web clients. + + While it does contain suggestions for web proxy software + implementers, it does not make any specific demands of those parties. + +4. 
Defining Web Proxy Auto-Discovery + + As mentioned above, web client software currently needs to be + configured with the URL of a proxy auto-configuration file or + script. The contents of this script are vendor specific and not + currently standardized. This document does not attempt to discuss + the contents of these files (see[8] for an example file format). + + Thus, the Web Proxy Auto-Discovery (WPAD) problem reduces to + providing the web client a mechanism for discovering the URL of the + Configuration File. Once this Configuration URL (CURL) is known, the + client software already contains mechanisms for retrieving and + interpreting the Configuration File (CFILE) to enable access to the + specified proxy or proxies. + + It is worth carefully noting that the goal of the WPAD process is to + discover the correct CURL at which to retrieve the CFILE. The client + is *not* trying to directly discover the name of the proxy. That + would circumvent the additional capabilities provided by proxy + Configuration Files (such as load balancing, request routing to an + array of servers, automated fail-over to backup proxy [10][8]). + + It is worth noting that different clients requesting the CURL may + receive completely different CFILEs in response. The web server may + send back different CFILES based on a number of criteria such as the + "User-Agent" header, "Accept" headers, client IP address/subnet, + etc. The same client could conceivably receive a different CFILE on + successive retrievals (as a method of round-robin load balancing, + for example). + + This document will discuss a range of mechanisms for discovering the + Configuration URL. The client will attempt them in a predefined + order, until one succeeds. Existing widely deployed facilities may + not provide enough expressiveness to specify a complete URL. 
As
+   such, we will define default values for portions of the CURL which
+   may not be expressible by some discovery mechanisms:
+
+      http://<HOST>:<PORT><PATH>
+
+   HOST
+      There is no default for this portion. Any succeeding discovery
+      mechanism will provide a value for the <HOST> portion of the
+      CURL. The client MUST NOT provide a default.
+   PORT
+
+
+Cooper, et. al.           Expires May 16, 2001                 [Page 5]
+
+Internet-Draft                    WPAD                     November 2000
+
+
+      The client MUST assume port 80 if the successful discovery
+      mechanism does not provide a port component.
+   PATH
+      The client MUST assume a path of "/wpad.dat" if the successful
+      discovery mechanism does not provide a path component.
+
+5. The Discovery Process
+
+5.1 WPAD Overview
+
+   This sub-section will present a descriptive overview of the WPAD
+   protocol. It is intended to introduce the concepts and flow of the
+   protocol. The remaining sub-sections (Section 5.2-Section 5.7) will
+   provide the rigorous specification of the protocol details. WPAD
+   uses a collection of pre-existing Internet resource discovery
+   mechanisms to perform web proxy auto-discovery. Readers may wish to
+   refer to [1] for a similar approach to resource discovery, since it
+   was a basis for this strategy. The WPAD protocol specifies the
+   following:
+   o  how to use each mechanism for the specific purpose of web proxy
+      auto-discovery
+   o  the order in which the mechanisms should be performed
+
+   The resource discovery mechanisms utilized by WPAD are as follows.
+   o  Dynamic Host Configuration Protocol (DHCP [3][4])
+   o  Service Location Protocol (SLP [5])
+   o  "Well Known Aliases" using DNS A records [6][9]
+   o  DNS SRV records [2][9]
+   o  "service: URLs" in DNS TXT records [11]
+
+   Of all these mechanisms only the DHCP and "Well Known Aliases" are
+   required in WPAD clients.
This decision is based on three reasons: + these facilities are currently widely deployed in existing vendor + hardware and software; they represent functionality that should + cover most real world environments; they are relatively simple to + implement. + + DNS servers supporting A records are clearly the most widely + deployed of the services outlined above. It is reasonable to expect + API support inside most web client development environments (POSIX + C, Java, etc). The hierarchical nature of DNS makes it possible to + support hierarchies of proxy servers, + + DNS is not suitable in every environment, unfortunately. + Administrators often choose a DNS domain name hierarchy that does + not correlate to network topologies, but rather with some + organizational model (for example, foo.development.bar.com and + foo.marketing.bar.com). DHCP servers, on the other hand, are + frequently deployed with concern for network topologies. DHCP + + +Cooper, et. al. Expires May 16, 2001 [Page 6] + +Internet-Draft WPAD November 2000 + + + servers provide support for making configuration decisions based on + subnets, which are directly related to network topology. + + Full client support for DHCP is not as ubiquitous as for DNS. That + is, not all clients are equipped to take advantage of DHCP for their + essential network configuration (assignment of IP address, network + mask, etc). APIs for DHCP are not as widely available. Luckily, + using DHCP for WPAD does not require either of these facilities. It + is relatively easy for web client developers to speak just the + minimal DHCP protocol to perform resource discovery. It entails + building a simple UDP packet, sending it to the subnet broadcast + address, and parsing the reply UDP packet(s) which are received to + extract the WPAD option field. A reference implementation of this + code in C is available [12]. 
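As a rough illustration of the "simple UDP packet" described above, the following Python sketch builds a DHCPINFORM message that asks for option 252. It is not the reference implementation cited as [12]. The function name build_dhcpinform, the example addresses, and the choice to request only option 252 are assumptions made for illustration; the fixed header layout and magic cookie follow RFC 2131.

```python
import os
import socket
import struct
from typing import Optional

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the options area
OPT_MESSAGE_TYPE = 53
OPT_PARAM_REQUEST_LIST = 55
OPT_WPAD = 252                           # option code proposed by this draft
OPT_END = 255
DHCPINFORM = 8


def build_dhcpinform(client_ip: str, mac: bytes,
                     xid: Optional[int] = None) -> bytes:
    """Build a minimal DHCPINFORM that requests only option 252.

    Hypothetical helper: a real client may need further options.
    """
    if xid is None:
        xid = struct.unpack("!I", os.urandom(4))[0]
    header = struct.pack(
        "!BBBBIHH4s4s4s4s16s64s128s",
        1,                            # op: BOOTREQUEST
        1,                            # htype: Ethernet
        6,                            # hlen: length of a MAC address
        0,                            # hops
        xid,                          # transaction id
        0,                            # secs
        0,                            # flags
        socket.inet_aton(client_ip),  # ciaddr: client's configured address
        b"", b"", b"",                # yiaddr, siaddr, giaddr (zero filled)
        mac,                          # chaddr (zero padded to 16 bytes)
        b"", b"",                     # sname, file (zero filled)
    )
    options = (
        DHCP_MAGIC_COOKIE
        + bytes([OPT_MESSAGE_TYPE, 1, DHCPINFORM])
        + bytes([OPT_PARAM_REQUEST_LIST, 1, OPT_WPAD])
        + bytes([OPT_END])
    )
    return header + options
```

A client would send the resulting bytes over UDP from port 68 to port 67 of the subnet broadcast address, then scan each reply for option 252.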
+ + The WPAD client attempts a series of resource discovery requests, + using the discovery mechanisms mentioned above, in a specific order. + Clients only attempt mechanisms that they support (obviously). Each + time the discovery attempt succeeds; the client uses the information + obtained to construct a CURL. If a CFILE is successfully retrieved + at that CURL, the process completes. If not, the client resumes + where it left off in the predefined series of resource discovery + requests. If no untried mechanisms remain and a CFILE has not been + successfully retrieved, the WPAD protocol fails and the client is + configured to use no proxy. + + First the client tries DHCP, followed by SLP. If no CFILE has been + retrieved the client moves on to the DNS based mechanisms. The + client will cycle through the DNS SRV, "Well Known Aliases" and DNS + TXT record methods multiple times. Each time through the QNAME being + used in the DNS query is made less and less specific. In this manner + the client can locate the most specific configuration information + possible, but can fall back on less specific information. Every DNS + lookup has the QNAME prefixed with "wpad" to indicate the resource + type being requested. + + As an example, consider a client with hostname + johns-desktop.development.foo.com. Assume the web client software + supports all of the mechanisms listed above. This is the sequence of + discovery attempts the client would perform until one succeeded in + locating a valid CFILE: + o DHCP + o SLP + o DNS A lookup on QNAME=wpad.development.foo.com. + o DNS SRV lookup on QNAME=wpad.development.foo.com. + o DNS TXT lookup on QNAME=wpad.development.foo.com. + o DNS A lookup on QNAME=wpad.foo.com. + o DNS SRV lookup on QNAME=wpad.foo.com. + o DNS TXT lookup on QNAME=wpad.foo.com. + + +Cooper, et. al. 
Expires May 16, 2001                 [Page 7]
+
+Internet-Draft                    WPAD                     November 2000
+
+
+5.2 When to Execute WPAD
+
+   Web clients need to perform the WPAD protocol periodically to
+   maintain correct proxy settings. This should occur on a regular
+   basis corresponding to initialization of the client software or the
+   networking stack below the client. Further, WPAD will need to occur
+   in response to expiration of existing configuration data. The
+   following sections describe the details of these scenarios.
+
+   The web proxy auto-discovery process MUST occur at least as
+   frequently as one of the following two options. A web client can use
+   either option depending on which makes sense in its environment.
+   Clients MUST use at least one of the following options. They MAY
+   also choose to implement both options.
+   o  Upon startup of the web client
+   o  Whenever there is an indication from the networking stack that
+      the IP address of the client host either has, or could have,
+      changed
+
+   In addition, the client MUST attempt a discovery cycle upon
+   expiration of a previously downloaded CFILE in accordance with
+   HTTP/1.1[15].
+
+5.2.1 Upon Startup of the Web Client
+
+   For many types of web client (like web browsers) there can be many
+   instances of the client operating for a given user at one time. This
+   is often to allow display of multiple web pages in different
+   windows, for example. There is no need to re-perform WPAD every time
+   a new instance of the web client is opened. WPAD MUST be performed
+   when the number of web client instances transitions from 0 to 1. It
+   SHOULD NOT be performed as additional instances are created.
+
+5.2.2 Network Stack Events
+
+   Another option for clients is to tie the execution of WPAD to
+   changes in the networking environment. If the client can learn about
+   the change of the local host's IP address, or the possible change of
+   the IP address, it MUST re-perform the WPAD protocol.
Many + operating systems provide indications of "network up" events, for + example. Those types of events and system-boot events might be the + triggers for WPAD in many environments. + +5.2.3 Expiration of the CFILE + + The HTTP retrieval of the CURL may return HTTP headers specifying a + valid lifetime for the CFILE returned. The client MUST obey these + timeouts and rerun the WPAD process when it expires. A client MAY + rerun the WPAD process if it detects a failure of the currently + configured proxy (which is not otherwise recoverable via the + + +Cooper, et. al. Expires May 16, 2001 [Page 8] + +Internet-Draft WPAD November 2000 + + + inherent mechanisms provided by the currently active Configuration + File). + + Whenever the client decides to invalidate the current CURL or CFILE, + it MUST rerun the entire WPAD protocol to ensure it discovers the + currently correct CURL. Specifically, if the valid lifetime of the + CFILE ends (as specified by the HTTP headers provided when it was + retrieved), the complete WPAD protocol MUST be rerun. The client + MUST NOT simply re-use the existing CURL to obtain a fresh copy of + the CFILE. + + A number of network round trips, broadcast and/or multicast + communications may be required during the WPAD protocol. The WPAD + protocol SHOULD NOT be invoked at a more frequent rate than + specified above (such as per-URL retrieval). + +5.3 WPAD Protocol Specification + + The following pseudo-code defines the WPAD protocol. If a + particular discovery mechanism is not supported, treat it as a + failed discovery attempt in the pseudo-code. + + Two subroutines need explanation. The subroutine + strip_leading_component(dns_string) strips off the leading + characters, up to and including the first dot (`.') in the string + which is passed as a parameter, and is expected to contain DNS name. 
+   The Boolean subroutine is_not_canonical(dns_string) returns FALSE if
+   dns_string is one of the canonical domain suffixes defined in RFC
+   1591[13] (for example, "com").
+
+   The slp_list and dns_list elements below are assumed to be linked
+   lists containing a data field and a pointer to the next element.
+   The data field contains the elements used to override the default
+   values in creating a CURL, as detailed in Section 5.5.
+
+   load_CFILE() {
+     /* MUST use DHCP */
+     curl = dhcp_query(/* WPAD option (Section 5.4.1) */);
+     if (curl != null) {              /* DHCP succeeded */
+       if (isvalid(read_CFILE(curl)))
+         return SUCCESS;              /* valid CFILE */
+     }
+
+     /* SHOULD use SLP */
+     slp_list = slp_query(/* WPAD attributes (Section 5.4.2) */);
+     while (slp_list != null) {       /* test each curl */
+       if (isvalid(read_CFILE(slp_list.curl_data)))
+         return SUCCESS;              /* valid CFILE */
+       else
+
+
+Cooper, et. al.           Expires May 16, 2001                 [Page 9]
+
+Internet-Draft                    WPAD                     November 2000
+
+
+         slp_list = slp_list.next;
+     }
+
+     /* all the DNS mechanisms */
+     TGTDOM = gethostbyname(me);
+     TGTDOM = strip_leading_component(TGTDOM);
+
+     while (is_not_canonical(TGTDOM)) {
+
+       /* SHOULD try DNS SRV records */
+       dns_list = dns_query(/* QNAME=wpad.TGTDOM.,
+                               QTYPE=SRV (Section 5.4.4) */);
+       while (dns_list != null) {     /* check each SRV record */
+         if (isvalid(read_CFILE(dns_list.curl_data)))
+           return SUCCESS;            /* valid CFILE */
+         else
+           dns_list = dns_list.next;
+       }
+
+       /* SHOULD try DNS TXT records */
+       dns_list = dns_query(/* QNAME=wpad.TGTDOM.,
+                               QTYPE=TXT (Section 5.4.5) */);
+       while (dns_list != null) {     /* check each TXT record */
+         if (isvalid(read_CFILE(dns_list.curl_data)))
+           return SUCCESS;            /* valid CFILE */
+         else
+           dns_list = dns_list.next;
+       }
+
+       /* MUST try DNS A records */
+       dns_list = dns_query(/* QNAME=wpad.TGTDOM.,
+                               QTYPE=A (Section 5.4.3) */);
+
+       while (dns_list != null) {     /* check each A record */
+         if (isvalid(read_CFILE(dns_list.curl_data)))
+           return SUCCESS;            /* valid CFILE */
+         else
+           dns_list = dns_list.next;
+       }
+
+       /* Still no match, remove leading component and iterate */
+       TGTDOM = strip_leading_component(TGTDOM);
+
+     } /* no A, TXT or SRV records for wpad.* */
+
+     return FAILED; /* could not locate valid CFILE */
+   }
+
+
+
+
+Cooper, et. al.           Expires May 16, 2001                [Page 10]
+
+Internet-Draft                    WPAD                     November 2000
+
+
+5.4 Discovery Mechanisms
+
+   Each of the resource discovery methods will be marked as to whether
+   the client MUST, SHOULD, MAY, or MUST NOT implement them to be
+   compliant. Client implementers are encouraged to implement as many
+   mechanisms as possible, to promote maximum interoperability.
+
+                    SUMMARY OF DISCOVERY MECHANISMS
+
+            +-------------------------+--------+----------+
+            | Discovery               |        | Document |
+            | Mechanism               | Status | Section  |
+            +-------------------------+--------+----------+
+            | DHCP                    | MUST   | 5.4.1    |
+            | SLP                     | SHOULD | 5.4.2    |
+            | "Well Known Alias"      | MUST   | 5.4.3    |
+            | DNS SRV Records         | SHOULD | 5.4.4    |
+            | DNS TXT "service: URLs" | SHOULD | 5.4.5    |
+            +-------------------------+--------+----------+
+
+5.4.1 DHCP
+
+   Client implementations MUST support DHCP. DHCP has widespread
+   support in numerous vendor hardware and software implementations,
+   and is widely deployed. It is also perfectly suited to this task,
+   and is used to discover other network resources (such as time
+   servers, printers, etc). The DHCP protocol is detailed in RFC
+   2131[3]. We propose a new DHCP option with code 252 for use in web
+   proxy auto-discovery. See RFC 2132[4] for a list of existing DHCP
+   options. See "Conditional Compliance" (Section 9) for more
+   information on DHCP requirements.
+
+   The client should obtain the value of the DHCP option code 252 as
+   returned by the DHCP server. If the client has already conducted
+   the DHCP protocol during its initialization, the DHCP server may
+   already have supplied that value. If the value is not available
+   through a client OS API, the client SHOULD use a DHCPINFORM message
+   to query the DHCP server to obtain the value.
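One way a client might pull the option 252 value out of a raw DHCP reply is sketched below in Python. The function name extract_wpad_url is hypothetical, and the sketch assumes a reply laid out as in RFC 2131 (a 236-byte fixed header, a 4-byte magic cookie, then code/length/value options). Stripping trailing NUL bytes from the value is a defensive assumption of this sketch, not something the draft requires.

```python
from typing import Optional

DHCP_MAGIC_COOKIE = b"\x63\x82\x53\x63"  # marks the start of the options area
OPT_PAD = 0
OPT_WPAD = 252                           # option code proposed by this draft
OPT_END = 255


def extract_wpad_url(reply: bytes) -> Optional[str]:
    """Return the option 252 string from a DHCP reply, or None if absent."""
    if len(reply) < 240 or reply[236:240] != DHCP_MAGIC_COOKIE:
        return None                      # not a DHCP message with options
    i = 240
    while i < len(reply):
        code = reply[i]
        if code == OPT_END:
            break
        if code == OPT_PAD:              # pad option has no length byte
            i += 1
            continue
        if i + 1 >= len(reply):
            break                        # truncated: missing length byte
        length = reply[i + 1]
        value = reply[i + 2:i + 2 + length]
        if len(value) < length:
            break                        # truncated: value runs past the end
        if code == OPT_WPAD:
            # Assumption: tolerate NUL padding some servers may append.
            return value.rstrip(b"\x00").decode("ascii", errors="replace")
        i += 2 + length
    return None
```

If the function returns a string, that string is the candidate CURL; if it returns None, the DHCP step is treated as a failed discovery attempt.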
+
+   The DHCP option code for WPAD is 252 by agreement of the DHC working
+   group chair. This option is of type STRING. This string contains a
+   URL which points to an appropriate config file. The STRING is of
+   arbitrary size.
+
+   An example STRING value would be:
+      "http://server.domain/proxyconfig.pac"
+
+
+
+
+
+Cooper, et. al.           Expires May 16, 2001                [Page 11]
+
+Internet-Draft                    WPAD                     November 2000
+
+
+5.4.2 Service Location Protocol /SLP
+
+   The Service Location Protocol[14] is a Proposed Standard for
+   discovering services in the Internet. SLP has several reference
+   implementations available; for details, check [16].
+
+   A service type for use with WPAD has been defined and is available
+   as an Internet Draft.
+
+   Client implementations SHOULD implement SLP. SLP Service Replies
+   will provide one or more complete CURLs. Each candidate CURL so
+   created should be pursued as specified in Section 5.5 and beyond.
+
+5.4.3 DNS A/CNAME "Well Known Aliases"
+
+   Client implementations MUST support this mechanism. This should be
+   straightforward since only basic DNS lookup of A records is
+   required. See RFC 2219[6] for a description of using "well known"
+   DNS aliases for resource discovery. We propose the "well known
+   alias" of "wpad" for web proxy auto-discovery.
+
+   The client performs the following DNS lookup:
+      QNAME=wpad.TGTDOM., QCLASS=IN, QTYPE=A
+
+   Each A RR, which is returned, contains an IP address which is used
+   to replace the <HOST> default in the CURL.
+
+   Each candidate CURL so created should be pursued as specified in
+   Section 5.5 and beyond.
+
+5.4.4 DNS SRV Records
+
+   Client implementations SHOULD support the DNS SRV mechanism. Details
+   of the protocol can be found in RFC 2052[2]. If the implementation
+   language/environment provides the ability to perform DNS lookups on
+   QTYPEs other than A, client implementers are strongly encouraged to
+   provide this support. It is acknowledged that not all resolver APIs
+   provide this functionality.
+
+   The client issues the following DNS lookup:
+      QNAME=wpad.tcp.TGTDOM., QCLASS=IN, QTYPE=SRV
+
+   If it receives SRV RRs in response, the client should use each valid
+   RR in the order specified in RFC 2052[2]. Each valid record will
+   specify both a <HOST> and <PORT> to override the CURL defaults.
+
+   Each candidate CURL so created should be pursued as specified in
+   Section 5.5 and beyond.
+
+
+
+Cooper, et. al.           Expires May 16, 2001                [Page 12]
+
+Internet-Draft                    WPAD                     November 2000
+
+
+5.4.5 DNS TXT service: Entries
+
+   Client implementations SHOULD support this mechanism. If the
+   implementation language/environment provides the ability to perform
+   DNS lookups on QTYPEs other than A, the vendor is strongly
+   encouraged to provide this support. It is acknowledged that not all
+   resolver APIs provide this functionality.
+
+   The client should attempt to retrieve TXT RRs from the DNS to obtain
+   "service: URLs" contained therein. The "service: URL" will be of the
+   following format, specifying a complete candidate CURL for each
+   record located:
+
+      service: wpad:http://<HOST>:<PORT><PATH>
+
+   The client should first issue the following DNS query:
+      QNAME=wpad.TGTDOM., QCLASS=IN, QTYPE=TXT
+
+   It should process each TXT RR it receives (if any) using each
+   service:URL found (if any) to generate a candidate CURL. These CURLs
+   should be pursued as described in Section 5.5 and beyond. Readers
+   familiar with [1] should note that WPAD clients MUST NOT perform the
+   QNAME=TGTDOM., QCLASS=IN, QTYPE=TXT lookup which would be suggested
+   by that document.
+
+5.4.6 Fallback
+
+   Clients MUST NOT implement the "Fallback" mechanism described in
+   [1]. It is unlikely that a client will find a web server prepared to
+   handle the CURL request at a random suffix of its FQDN. This will
+   only increase the number of DNS probes and introduce an excess of
+   spurious "GET" requests on those hapless web servers.
+
+   Instead, the "Well Known Aliases" method of Section 5.4.3 provides
+   equivalent functionality.
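The suffix walk that the DNS mechanisms use in place of "Fallback" can be sketched in Python as follows. The helper names mirror the pseudo-code in Section 5.3; wpad_qnames is a hypothetical wrapper added for illustration. As a simplifying assumption, the sketch treats any single-label remainder as a canonical suffix instead of consulting the RFC 1591 list, so multi-label public suffixes are not recognized.

```python
from typing import List


def strip_leading_component(dns_string: str) -> str:
    """Drop everything up to and including the first dot."""
    _, _, rest = dns_string.rstrip(".").partition(".")
    return rest


def is_not_canonical(dns_string: str) -> bool:
    """Assumption: a name with no dot left is a canonical suffix."""
    return "." in dns_string


def wpad_qnames(fqdn: str) -> List[str]:
    """List the QNAMEs a client would try, most specific first."""
    qnames = []
    tgtdom = strip_leading_component(fqdn)
    while tgtdom and is_not_canonical(tgtdom):
        # Trailing dot marks the QNAME as fully qualified.
        qnames.append("wpad." + tgtdom + ".")
        tgtdom = strip_leading_component(tgtdom)
    return qnames


if __name__ == "__main__":
    for qname in wpad_qnames("johns-desktop.development.foo.com"):
        print(qname)
```

For the host used in the Section 5.1 example this yields wpad.development.foo.com. followed by wpad.foo.com., each of which is then queried with the supported record types.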
+
+5.4.7 Timeouts
+
+   Implementers are encouraged to limit the time elapsed in each
+   discovery phase. When possible, limiting each phase to 10 seconds
+   is considered reasonable. Implementers may choose a different value
+   which is more appropriate to their network properties. For example,
+   a device implementation that operates over a wireless network may
+   use a much larger timeout to account for low bandwidth or high
+   latency.
+
+5.5 Composing a Candidate CURL
+
+   Any successful discovery mechanism response will provide a <HOST>
+
+
+Cooper, et. al.           Expires May 16, 2001                [Page 13]
+
+Internet-Draft                    WPAD                     November 2000
+
+
+   (perhaps in the form of an IP address). Some mechanisms will also
+   provide a <PORT> and/or a <PATH>. The client should override the
+   default CURL fields with all of those supplied by the discovery
+   mechanism.
+
+5.6 Retrieving the CFILE at the CURL
+
+   The client then requests the CURL via HTTP. When making the request
+   it MUST transmit HTTP "Accept" headers indicating what CFILE formats
+   it is capable of accepting. For example, Netscape Navigator browsers
+   with versions 2.0 and beyond might include the following line in the
+   HTTP Request:
+
+      Accept: application/x-ns-proxy-autoconfig
+
+   The client MUST follow HTTP redirect directives (response codes 3xx)
+   returned by the server. The client SHOULD send a valid "User-Agent"
+   header.
+
+5.7 Resuming Discovery
+
+   If the HTTP request fails for any reason (fails to connect, server
+   error response, etc.) the client MUST resume the search for a
+   successful CURL where it left off. It should continue attempting
+   other sub-steps in a specific discovery mechanism, and then move on
+   to the next mechanism or TGTDOM iteration, etc.
+
+6. Client Implementation Considerations
+
+   The large number of discovery mechanisms specified in this document
+   may raise concerns about network traffic and performance.
The DHCP + portion of the process will result in a single broadcast by the + client, and perhaps a few replies by listening DHCP servers. + + The remaining mechanisms are all DNS based. All DNS queries should + have the QNAME terminated with a trailing '.' to indicate a FQDN and + expedite the lookup. As such each TGTDOM iteration will cause 3 DNS + lookups, each a unicast UDP packet and a reply. Most clients will + have fewer than 2 TGTDOM iterations, limiting the total number of + DNS request/replies to 6. + + In total, 7 UDP request/reply packets on client startup is quite a + low overhead. The first web page downloaded by the client will + likely dwarf that packet count. Each of the DNS lookups should stand + a high chance of hitting the cache in the client's DNS server, since + other clients will have likely looked them up recently, providing a + low total elapsed time. + + This is of course the worst case, where no CURLS are obtained, and + + +Cooper, et. al. Expires May 16, 2001 [Page 14] + +Internet-Draft WPAD November 2000 + + + assuming a long client FQDN. Often, a successful CURL will be found + early in the protocol, reducing the total packet count. Client + implementations are encouraged to overlap this protocol work with + other startup activities. Also, client implementers with concerns + about performance can choose to implement only the discovery + mechanisms listed as MUST in Section 5.4. + + A longer delay could occur if a CURL is obtained, but the hosting + web server is down. The client could spend considerable time waiting + for the TCP "connect ()" call to fail. Luckily this is an extremely + rare case where the web server hosting the CFILE has failed. See + Section 6, where proxy implementers are encouraged to provide + support for hosting CURLs on the proxy itself (acting as web + server). Since proxies are often deployed with considerable + attention to fault tolerance, this corner case can be further + minimized. + +7. 
Proxy Considerations + + As mentioned in the previous section, it is suggested that proxies + be capable of acting as a web server, so that they can host the CURL + directly. + + The implementers of proxies are most likely to understand the + deployment situations of (caching) proxies, the formats of proxy + configuration files, etc. They can also build in the ability to select + a CFILE based on all the various inputs at the time of the CURL + request ("User-Agent", "Accept", client IP address/subnet/hostname, + topological distribution of nearby proxy servers, etc.). + +8. Administrator Considerations + + Administrators should configure at least one of the DHCP or DNS A RR + methods in their environment (since those are the only two all + compatible clients MUST implement). Beyond that, configuring to + support mechanisms earlier in the search order will improve client + startup time. + + One of the major motivations for this protocol structure was to + support client location of "nearby" proxies. In many environments + there may be a number of proxies (workgroup, corporate gateway, ISP, + backbone). There are a number of possible points at which "nearness" + decisions can be made in this framework: + o DHCP servers for different subnets can return different answers. + They can also base decisions on the client ciaddr field or the + client identifier option. + o DNS servers can be configured to return different SRV/A/TXT RRs + for different domain suffixes (for example, QNAMEs + wpad.marketing.bigcorp.com and wpad.development.bigcorp.com). + o The web server handling the CURL request can make decisions based + on the "User-Agent", "Accept", client IP address/subnet/hostname, + and the topological distribution of nearby proxies, etc. This + can occur inside a CGI executable created to handle the CURL.
As + mentioned above, it could be a proxy server itself handling the + CURL request and making those decisions. + o The CFILE may be expressive enough to select from a set of + alternatives at "runtime" on the client. CARP[10] is based on + this premise for an array of caches. It is not inconceivable that + the CFILE could compute some network distance or fitness metrics + to a set of candidate proxies and then select the "closest" or + "most responsive" device. + + Note that it is valid to configure a DHCP daemon to respond only to + INFORM option queries in static IP environments. + + Not all of the above mechanisms can be supported in all currently + deployed vendor hardware and software. The hope is that enough + flexibility is provided in this framework that administrators can + select which mechanisms will work in their environments. + +9. Conditional Compliance + + In light of the fact that many of the discovery technologies + described in this document are not well deployed or not available on + all platforms, this specification permits conditional compliance. + Conditional compliance is designated by three class identifications. + + Additionally, due to the possible security implications of a DHCP + broadcast request, it is onerous to REQUIRE an implementer to put + himself or his implementation at undue risk. It is quite common to + have rogue DHCP servers on a network which may fool a DHCP broadcast + implementation into using a malicious configuration file. On + platforms which do not support DHCP natively and cannot get the WPAD + option along with its IP address, and which cannot support the DHCP + INFORM unicast request, presumably to a known and trusted DHCP + server, the likelihood of an undetected spoofing attack is + increased. Having an individual program, such as a browser, trying + to detect a DHCP server on a network is unreasonable, in the + authors' opinion.
On platforms which use DHCP for their system IP + address and have previously trusted a DHCP server, a unicast DHCP + INFORM to that same trusted server does not introduce any additional + trust to that server. + +9.1 Class 0 - Minimally compliant + + A WPAD implementation which implements only the following discovery + mechanisms and interval schemes is considered class 0 compliant: + DNS A record queries + Browser or System session refresh intervals + + Class 0 compliance is only applicable to systems or implementations + which do not natively support DHCP and/or cannot securely determine + a trusted local DHCP server. + +9.2 Class 1 - Compliant + + A WPAD implementation which implements only the following discovery + mechanisms and interval schemes is considered class 1 compliant: + DNS A record queries + DHCP INFORM Queries + Network stack change refresh intervals + CFILE expiration refresh intervals + +9.3 Class 2 - Maximally compliant + + A WPAD implementation which implements the following discovery + mechanisms and interval schemes is considered class 2 compliant: + DNS A record queries + DHCP INFORM Queries + DNS TXT service: queries + DNS SRV RR queries + SVRLOC Queries + Network stack change refresh intervals + CFILE expiration refresh intervals + + To be considered compliant with a given class, an implementation + MUST support the features listed above corresponding to that class. + +10. Security Considerations + + This document does not address security of the protocols involved. + The WPAD protocol is vulnerable to existing identified weaknesses in + DHCP and DNS. The groups driving those standards, as well as the SLP + protocol standards, are addressing security. + + When using DHCP discovery, clients are encouraged to use unicast + DHCP INFORM queries instead of broadcast queries, which are more + easily spoofed in insecure networks.
+ + Minimally, it can be said that the WPAD protocol does not create new + security weaknesses. + +11. Acknowledgements + + The authors' work on this specification would be incomplete without + the assistance of many people. Specifically, the authors would like + to express their gratitude to the following people: + + Chuck Neerdaels, Inktomi, for providing assistance in the design of + the WPAD protocol as well as for providing reference implementations. + + Arthur Bierer, Darren Mitchell, Sean Edmison, Mario Rodriguez, Danpo + Zhang, and Yaron Goland, Microsoft, for providing implementation + insights as well as testing and deployment. + + Ari Luotonen, Netscape, for his role in designing the first web + proxy. + + In addition, the authors are grateful for the feedback provided by + the following people: + o Jeremy Worley (RealNetworks) + o Eric Twitchell (United Parcel Service) + +References + + [1] Moats, R., Hamilton, M. and P. Leach, "Finding Stuff (How to + discover services) (Internet Draft)", October 1997. + + [2] Gulbrandsen, A. and P. Vixie, "A DNS RR for specifying the + location of services (DNS SRV)", RFC 2052, October 1996. + + [3] Droms, R., "Dynamic Host Configuration Protocol", RFC 2131, + March 1997. + + [4] Alexander, S. and R. Droms, "DHCP Options and BOOTP Vendor + Extensions", RFC 2132, March 1997. + + [5] Veizades, J., Guttman, E., Perkins, C. and M. Day, "Service + Location Protocol (Internet Draft)", October 1997. + + [6] Hamilton, M. and R. Wright, "Use of DNS Aliases for Network + Services", RFC 2219, October 1997. + + [7] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", RFC 2119, March 1997. + + [8] Luotonen, A., "Navigator Proxy Auto-Config File Format", March + 1996. + + [9] Mockapetris, P., "Domain Names - Concepts and Facilities", RFC + 1034, November 1987. + + [10] Valloppillil, V. and K.W. Ross, "Cache Array Routing + Protocol", draft-vinod-carp-v1-03.txt (work in progress), + February 1998. + + [11] Perkins, C., Guttman, E. and J. Kempf, "Service Templates and + service: Schemes (Internet Draft)", December 1997. + + [12] "A Sample DHCP Implementation for WPAD", February 1998. + + [13] Postel, J., "Domain Name System Structure and Delegation", RFC + 1591, March 1994. + + [14] Guttman, E., Perkins, C., Veizades, J. and M. Day, "Service + Location Protocol, Version 2", RFC 2608, June 1999. + + [15] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, + L., Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol + -- HTTP/1.1", RFC 2616, June 1999. + + [16] + + +Authors' Addresses + + Ian Cooper + Equinix, Inc. + + EMail: icooper@equinix.com + + + Paul Gauthier + Inktomi Corporation + + EMail: gauthier@inktomi.com + + + Josh Cohen + (Microsoft Corporation) + + + Martin Dunsmuir + (RealNetworks, Inc.) + + + Charles Perkins + Sun Microsystems, Inc. + + EMail: charles.perkins@Sun.COM + + +Full Copyright Statement + + Copyright (C) The Internet Society (2000). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph + are included on all such copies and derivative works.
However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Cooper, et. al. Expires May 16, 2001 [Page 21] + diff --git a/doc/rfc/draft-ietf-wrec-web-pro-00.txt b/doc/rfc/draft-ietf-wrec-web-pro-00.txt new file mode 100644 index 0000000000..d47edfdbb2 --- /dev/null +++ b/doc/rfc/draft-ietf-wrec-web-pro-00.txt @@ -0,0 +1,589 @@ + + + + + + +INTERNET-DRAFT M Cieslak + D Forster + Cisco Systems + 1 June 1999 + Expires December 1999 + + Web Cache Coordination Protocol V1.0 + +Status of this Memo + + This document is an Internet-Draft and is in full conformance with + all provisions of Section 10 of RFC2026. + + Internet-Drafts are working documents of the Internet Engineering + Task Force (IETF), its areas, and its working groups. Note that other + groups may also distribute working documents as Internet-Drafts. 
+ + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as "work in progress". + + The list of current Internet-Drafts can be accessed at + http://www.ietf.org/ietf/1id-abstracts.txt + + The list of Internet-Draft Shadow Directories can be accessed at + http://www.ietf.org/shadow.html + + Cisco has a patent pending that may relate to WCCP V1.0. If any + patents issue to Cisco or its subsidiaries with claims that are + necessary for practicing WCCP V1.0, then any party will be able to + obtain a license from Cisco to use any such patent claims under + openly specified, reasonable, non-discriminatory terms to implement + WCCP V1.0. No license is required for nonprofit institutions. + +Abstract + + This draft documents the Web Cache Coordination Protocol (WCCP) V1.0. + This protocol is used (a) to associate a single router with one or + more web-caches for the purposes of transparent redirection of HTTP + traffic, and (b) to allow one of the web-caches to dictate how the + router distributes transparently-redirected traffic across the + associated web-caches. + + This draft describes the interactions between a router and one or + more web-caches. It does not describe the interactions between a + group of associated web-caches or those between a web-cache and a + web-server. + + + + [Page 1] + +Definitions + + Transparent Redirection. + + Transparent redirection is a technique used to deploy web-caching + without the need for reconfiguration of web-clients. It involves + the interception and redirection of HTTP traffic to one or more + web-caches by a router or switch, transparently to the web-client. + + Web-Cache Farm. + + One or more web-caches associated with a router. + + Designated Web-Cache.
+ + The web-cache in a web-cache farm responsible for dictating to the + router how redirected traffic should be distributed across the + members of the farm. + + Redirection Hash Table. + + A 256-bucket hash table maintained by the router. This table maps + the IP destination address of a packet for redirection to the IP + address of a web-cache in the farm. + +Description of Protocol + + WCCP has two main functions. The first is to allow a router enabled + for transparent redirection to discover, verify, and advertise + connectivity to one or more web-caches. + + The second function is to allow one of the web-caches, the designated + web-cache, to dictate how the router distributes redirected traffic + across the web-cache farm. + + It is recommended that the web-cache with the lowest IP address be + elected as designated web-cache for a farm. + +Discovery + + WCCP V1.0 allows a single router to be associated with one or more + web-caches. A group of web-caches associated with a router is + referred to as a web-cache farm. A web-cache may be directly attached + to the router or some hops distant. + +Joining a web-cache farm + + A web-cache joins a web-cache farm by periodically unicasting a + WCCP_HERE_I_AM packet to the router associated with the farm at + intervals of HERE_I_AM_T (10) seconds. The source IP address of the + WCCP_HERE_I_AM uniquely identifies the web-cache. The router unicasts + a WCCP_I_SEE_YOU packet back to the web-cache in response to each + WCCP_HERE_I_AM it receives. + +Verifying connectivity + + The Received_ID fields in the WCCP_HERE_I_AM and WCCP_I_SEE_YOU + packets are used to verify two-way connectivity between the router + and web-cache. The router increments the value of the Received_ID + field each time it sends a WCCP_I_SEE_YOU to a web-cache and expects + to receive the same value back in the Received_ID field of the next + WCCP_HERE_I_AM from that cache.
WCCP_HERE_I_AM packets containing an + invalid Received_ID are ignored. + + The Received_ID in the initial WCCP_HERE_I_AM sent from a web-cache + is ignored. The router will only consider a web-cache to be reachable + when it has received a subsequent WCCP_HERE_I_AM with a correct + Received_ID. Note that a usable web-cache is merely reachable; the + router will not redirect traffic to a newly-acquired usable + web-cache until instructed to do so in a WCCP_ASSIGN_BUCKETS packet from + the designated web-cache. + +Advertising connectivity + + The router includes a list of the web-caches it considers to be + usable in each WCCP_I_SEE_YOU packet it transmits. Each entry in the + list includes the IP address of the web-cache and indicates which + buckets in the Redirection Hash Table are currently assigned to that + web-cache. This information is provided for the benefit of the + designated web-cache. + + A Change ID (the Change Number field) in the WCCP_I_SEE_YOU packet is + incremented whenever the web-cache list changes or the bucket + allocation for an entry in the list is modified. + +Timing-out a web-cache + + If the router does not receive a valid WCCP_HERE_I_AM for 3 * + HERE_I_AM_T seconds (30 seconds, given a HERE_I_AM_T of 10) it will + no longer consider a web-cache to be usable. In this case the + web-cache is no longer advertised in the + WCCP_I_SEE_YOU packet and all buckets previously assigned to the + web-cache in the router's Redirection Hash Table are marked as + unassigned. + +Assignment + + The router associated with a web-cache farm distributes redirected + traffic by destination IP address across the members of the farm as + directed by the designated web-cache via the WCCP_ASSIGN_BUCKETS + packet. + + How the designated web-cache arrives at the traffic distribution + described by the WCCP_ASSIGN_BUCKETS packet is outside the scope of + this draft.
+ + Since the router has no knowledge of the designated web-cache + election process it will accept a WCCP_ASSIGN_BUCKETS packet from any + member of the web-cache farm. + + The value of the Received_ID in the WCCP_ASSIGN_BUCKETS packet must + match that in the last WCCP_I_SEE_YOU sent to the designated + web-cache. If the Received_ID is not valid the router will ignore the + WCCP_ASSIGN_BUCKETS packet. + + On receipt of a valid WCCP_ASSIGN_BUCKETS packet the router will set + its Redirection Hash Table from information contained in the packet. + This information comprises a list of web-caches followed by a + 256-bucket hash table. The position of a web-cache in the list is its + index number, the index number of the first entry being zero. Each + bucket in the hash table may contain the value 0xFF, indicating no + web-cache has been assigned to that bucket, or the index number of a + web-cache. + + The router does not generate a packet in response to a + WCCP_ASSIGN_BUCKETS packet. However, the change in the Redirection + Hash Table will be reflected in subsequent WCCP_I_SEE_YOU packets + generated by the router. + +Packet Redirection + +Detection + + The router detects HTTP packets (TCP packets with a destination port + number of 80) and redirects them to a web-cache in the web-cache + farm. + + The destination IP address of a candidate packet is hashed to yield + an index into the 256-bucket Redirection Hash Table. The indexed + bucket indicates to which web-cache the packet should be redirected. + If the bucket in the Redirection Hash Table is unassigned the packet + cannot be redirected and should be forwarded normally. + +Encapsulation + + Each redirected packet is encapsulated in a GRE packet[1]. The + encapsulation uses the base four-octet GRE header with the two Flags + and version octets set to zero and a Protocol Type of 0x883E. + + An encapsulated packet may be fragmented if it exceeds the output + interface's MTU.
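The Assignment, Detection, and Encapsulation steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the router's actual behavior: the draft does not define how the destination address is hashed, so bucket_for() below uses an XOR fold of the four address octets purely as a stand-in, and the function and variable names are hypothetical. Only the 256-bucket table, the 0xFF "unassigned" value, the zero-based web-cache index, and the GRE Protocol Type 0x883E come from the text.

```python
import ipaddress
import struct

UNASSIGNED = 0xFF            # bucket value meaning "no web-cache assigned"
GRE_PROTOCOL_TYPE = 0x883E   # Protocol Type given in the Encapsulation text


def bucket_for(dst_ip: str) -> int:
    """Map a destination IPv4 address to a bucket index in 0..255.

    ASSUMPTION: the draft does not specify the hash. XOR-folding the four
    address octets is a placeholder chosen only for this sketch.
    """
    index = 0
    for octet in ipaddress.IPv4Address(dst_ip).packed:
        index ^= octet
    return index


def select_cache(dst_ip, caches, buckets):
    """Return the web-cache IP for dst_ip, or None to forward normally.

    caches  -- list of web-cache IPs; list position is the index number
    buckets -- 256 entries, each a web-cache index or UNASSIGNED
    """
    entry = buckets[bucket_for(dst_ip)]
    if entry == UNASSIGNED:
        return None          # unassigned bucket: do not redirect
    return caches[entry]


def gre_header() -> bytes:
    """Base four-octet GRE header: flags/version octets zero, then type."""
    return struct.pack("!HH", 0, GRE_PROTOCOL_TYPE)


if __name__ == "__main__":
    caches = ["10.0.0.1", "10.0.0.2"]
    buckets = [0] * 128 + [1] * 64 + [UNASSIGNED] * 64
    for dst in ("10.1.1.5", "128.0.0.1", "192.0.2.10"):
        print(dst, "->", select_cache(dst, caches, buckets))
    print(gre_header().hex())
```

Keeping the table as a flat list of 256 small integers mirrors the layout carried in the WCCP_ASSIGN_BUCKETS packet, so a received assignment could be installed without translation.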
+ +Returned packets + + The router must ensure that HTTP traffic passing through it from + members of the web-cache farm en route to a web-server is not + redirected. + + The router will not redirect any packet with a source address + belonging to a member of the web-cache farm. + +Format of Protocol Packets + + This section defines the format of the WCCP packets. + + Each WCCP protocol packet is carried in a UDP packet with a + destination port of 2048. + +Here I Am + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Protocol Version | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hash Revision | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hash Information (0) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hash Information (7) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| Reserved | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Received ID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP_HERE_I_AM (7) + + Protocol Version + + 4 + + Hash Revision + + 0 + + Hash Information + + A 256-element bit-vector. A set bit indicates that the + corresponding bucket in the Redirection Hash Table is + assigned to this web-cache. Normally the value of the Hash + Information present in the last WCCP_I_SEE_YOU message received by + this cache. In the initial WCCP_HERE_I_AM sent to the router it + may be zero or the value assigned to the cache in a previous + membership of this web-cache farm.
This information may be used by + the designated web-cache to re-assign buckets to the cache. + + U + + Normally the value of the U flag present in the last + WCCP_I_SEE_YOU message received by this cache. Set in first + WCCP_HERE_I_AM to indicate that Hash Information is historical. + + Received ID + + The value of the Received ID present in the last WCCP_I_SEE_YOU + received by this web-cache. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + [Page 7] + +I See You + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Protocol Version | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Change Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Received ID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number of WCs | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Web-Cache List Entry(0) | + | . | + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Web-Cache List Entry(n) | + | . | + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP_I_SEE_YOU (8) + + Protocol Version + + 4 + + Change Number + + Incremented if a Web-Cache List Entry has been added, removed or + its hash information has been modified since the last + WCCP_I_SEE_YOU sent by the router. + + Received ID + + Incremented each time the router generates a WCCP_I_SEE_YOU. Will + never be zero. + + Number of WCs + + Number of Web-Cache List Entry elements in the packet. + + + + [Page 8] + +Web-Cache List Entry + + The Web-Cache List Entry describes a Web-Cache by IP Address and + lists the redirection hash table entries assigned to it. 
+ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | IP Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hash Revision | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hash Information (0) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hash Information (7) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |U| Reserved | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + IP Address + + Web-cache IP Address + + Hash Revision + + 0 + + Hash Information + + A 256-element bit-vector. A set bit indicates that the + corresponding bucket in the Redirection Hash Table is + assigned to this web-cache. + + U + + If set indicates web-cache is not assigned in the Redirection Hash + Table and that the web-cache hash information is historical. This + information may be used by the designated web-cache to reassign + buckets to a web-cache which has rejoined the farm. + + + + + + + + + [Page 9] + +Assign Bucket + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Received ID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number of Web Caches | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Web-Cache 0 IP Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + | . 
| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Web-Cache n IP Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Bucket 0 | Bucket 1 | Bucket 2 | Bucket 3 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Bucket 252 | Bucket 253 | Bucket 254 | Bucket 255 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP_ASSIGN_BUCKET (9) + + Received ID + + Value of Received ID in last WCCP_I_SEE_YOU received from router. + + Number of Web Caches + + Number of Web Caches to which redirect traffic can be sent. + + Web-Cache IP Address, 0-n + + IP Addresses of Web-Caches to which redirect traffic can be sent. + The position of a Web-Cache's IP Address in this list is the Web- + Cache's index number. The first entry in the list has an index + number of zero. + + Bucket 0-255 + + + + [Page 10] + +These 256 buckets represent the redirection hash table. The value + of each bucket may be 0xFF (Unassigned) or a Web-Cache index + number (0-31). + +References + + [1] Hanks, Li, Farinacci & Traina, "Generic Routing Encapsulation + (GRE)", RFC 1701, October 1994 + +Authors' Addresses + + Martin Cieslak + Cisco Systems + 170 Tasman Drive + San Jose, CA 95143 + + David Forster + Cisco Systems + 170 Tasman Drive + San Jose, CA 95143 + + Phone: +44-181-7568967 + Email: dforster@cisco.com + + + Expires December 1999 + + + + + + + + + + + + + + + + + + + + + + + + + + [Page 11] + diff --git a/doc/rfc/draft-vinod-carp-v1-03.txt b/doc/rfc/draft-vinod-carp-v1-03.txt new file mode 100644 index 0000000000..e1e6df027e --- /dev/null +++ b/doc/rfc/draft-vinod-carp-v1-03.txt @@ -0,0 +1,417 @@ +INTERNET-DRAFT Vinod Valloppillil + Microsoft Corporation + Keith W. 
Ross + University of Pennsylvania + 26 Feb 1998 + Expires August 1998 + + + Cache Array Routing Protocol v1.0 + +Status of this Memo + + This document is an Internet-Draft. Internet-Drafts are working + documents of the Internet Engineering Task Force (IETF), its areas, + and its working groups. Note that other groups may also distribute + working documents as Internet-Drafts. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference + material or to cite them other than as ``work in progress.'' + + To learn the current status of any Internet-Draft, please check the + ``1id-abstracts.txt'' listing contained in the Internet-Drafts Shadow + Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), + munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or + ftp.isi.edu (US West Coast). + +Abstract + + This draft documents the Cache Array Routing Protocol (CARP) v1.0 + for dividing URL-space among an array of loosely coupled proxy + servers. + + An HTTP client agent (either a proxy server or a client browser) + which implements CARP v1.0 can allocate and intelligently route + requests for the correct URLs to any member of the Proxy + Array. Due to the resulting sorting of requests through these + proxies, duplication of cache contents is eliminated and global + cache hit rates are improved. + +Valloppillil [Page 1] + +INTERNET-DRAFT Cache Array Routing Protocol v1.0 26 Feb 1998 + +Table of Contents + + 1. Overview........................................ 2 + 2. Proxy Array Membership Table.................... 3 + 2.1 Global Information......................... 3 + 2.2 Member Information......................... 4 + 3. Routing Function................................ 5 + 3.1 Hash Function.............................. 5 + 3.2 Hash Combination........................... 
6 + 3.3 Load Factor................................ 7 + 3.4 Route Selection............................ 7 + 3.5 Member Failure Routing..................... 7 + 4. Client-Side Implementation...................... 7 + 5. Versioning...................................... 7 + 6. Security Considerations......................... 8 + 7. Open Issues..................................... 8 + 8. Acknowledgements................................ 8 + 9. References...................................... 8 + 10. Author's Information............................ 9 + +1. Overview + + The Cache Array Routing Protocol describes a distributed caching + protocol based on + + 1) a known membership list of loosely coupled proxies and + 2) a hash function for dividing URL space among those proxies + + The Proxy Array Membership Table is defined as a plain ASCII text + file retrieved from an Array Configuration URL. This document does + NOT describe how this table is constructed, merely the format of + the fields used by agents implementing CARP. + + The hash function plus routing algorithm defined in this document + take member proxies described in the Proxy Array Membership Table + and make an on-the-fly determination as to which Proxy Array member + should be the proper receptacle for a cached version of a resource + keyed by URL. + + Downstream agents may then access the cached resource by forwarding + the proxied HTTP request [5] for that resource to the appropriate + member of the Proxy Array. + +2. Proxy Array Membership Table + + The Proxy Array Membership Table is a plain-text ASCII file which + can be published from a URL.
+ + The format of the table is: + + Proxy Array Information/ + ArrayEnabled: <0 | 1> + ConfigID: + ArrayName: + ListTTL: + + + + +2.1 Global Information + + These are fields that describe the array itself and are not specific + to any one member of an array. + + Global information is terminated in the Proxy Array Membership Table + by a CR/LF/CR/LF. + +2.1.1 Version number + + The version number for implementations of this specification is + 1.0. + +2.1.2 ArrayEnabled + + This field allows proxies to advertise their implementation of CARP + v1 even if they are not members of a Proxy Array. + +2.1.3 ConfigID + + ConfigID is an opaque number no larger than 32 bits, similar to an + ETag in HTTP 1.1. It is used to track the current state of an + Array table and may be used to match multiple yet independently + published copies of the Proxy Array Membership Table. + +2.1.4 ArrayName + + ArrayName is an opaque string which is used to provide a convenient + administrative name for a given array. + +2.1.5 ListTTL + + ListTTL is the number of seconds for which an HTTP client entity + should consider the current table image valid. After ListTTL + has expired, that client should retrieve a new copy of + the Proxy Array Membership Table. + +2.2 Member Information + + The following fields are published per member in an array and are + separated by single spaces. The end of an array member's record is + terminated by a CR/LF. + +2.2.1 Name + + The name of the proxy server. Typically this is the fully qualified + DNS name. Downstream HTTP agents should use resolution of this name + to determine how to connect to this proxy. + +2.2.2 IP Addr + + The IP address that other proxy servers within this array should use + to connect to this proxy server. This is necessary for proxy + servers which may be hosted on multi-homed servers where requests + are only accepted by one of the interfaces.
+ + If this field is not published in the table, name resolution may be + used to find a proxy IP address. + +2.2.3 Listening Port + + The TCP port number this proxy is expecting requests on. + +2.2.4 Table URL + + A URL which may be maintained by this proxy server on which a copy + of the array membership table can be found. + + This entry does not have to be unique per proxy server. In such + cases, the URL should be identical to the URL from which the + downstream client requested this table. + +2.2.5 Agent String + + An opaque string identifying the vendor / version of the + proxy Server in the Array Membership Table. + +Valloppillil [Page 5] + +INTERNET-DRAFT Cache Array Routing Protocol v1.0 26 Feb 1998 + +2.2.7 Statetime + + How long a Proxy Server has been in its current state and has been + a member of this table. This is useful for dynamic generation of + the Array Membership Table where the host generating the table has + knowledge of the proxy's operational status. + + This field is expressed in seconds and is an unsigned 32-bit value. + +2.2.8 Status + + Status provides a simple text string indicating whether a member + proxy is currently able to handle requests (UP) or refused a + connection when last contacted (DOWN). + +2.2.9 Load Factor + + Load Factor is a relative amount of the total load for an array that + should be handled by any given member of the array. + + Load Factor is specified as an integer and relative weight is + computed against other integer values in the table. + +2.2.10 Cache Size + + Cache size is an informational field that indicates the size of the + cache held by a particular member of an array. + + Cache size is specified in Megabytes (MB) and represents the + maximum potential size of a disk cache for this server. + +3. Routing Function + + Once an agent has a Proxy Array Membership Table, 
it uses a + mathematical hash function to determine which of the members of + the array should be the receptacle of a particular URL request. + + This routing function involves constructing n "scores" using a hash + of the request URL plus a hash of each of the k proxies in the Proxy + Array Membership Table. + + Both the URL and the proxy names are hashed in order to minimize the + disruption of target routes if a member of the target array can't + be contacted. + + Hashes of the URL and proxy name are constructed using the algorithm + described in 3.1 and combined using the algorithm described in 3.2. + +Valloppillil [Page 6] + +INTERNET-DRAFT Cache Array Routing Protocol v1.0 26 Feb 1998 + +3.1. Hash Function + + The hash function outputs a 32-bit unsigned integer based on a + zero-terminated ASCII input string. The machine name and domain + names of the URL, the protocol, and the machine names of each member + proxy should be evaluated in lower case since that portion of the + URL is case insensitive. + + Because irreversibility and strong cryptographic features are + unnecessary for this application, a very simple and fast hash + function based on the bitwise left rotate operator is used. + + For (each char in URL): + URL_Hash += _rotl(URL_Hash, 19) + char ; + + Member proxy hashes are computed in a similar manner: + + For (each char in MemberProxyName): + MemberProxy_Hash += _rotl(MemberProxy_Hash, 19) + char ; + + Because member names are often similar to each other, their hash + values are further spread across hash space via the following + additional operations: + + MemberProxy_Hash += MemberProxy_Hash * 0x62531965 ; + MemberProxy_Hash = _rotl (MemberProxy_Hash, 21) ; + +3.2. Hash Combination + + Hashes are combined by first exclusive or-ing (XOR) the URL hash by + the machine name and then multiplying by a constant and performing + a bitwise rotation. + + All final and intermediate values are 32 bit unsigned integers. 
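The 32-bit arithmetic behind the hash function in 3.1 and the combination step in 3.2 can be sketched in Python, where integers are unbounded and each result has to be masked explicitly. This is a minimal sketch, not a reference implementation: the function names are hypothetical, the starting value of zero is an assumption (the draft does not state one), the `+=` in the pseudocode is read literally, and the URL is assumed to have been lower-cased where required before hashing.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 unsigned bits


def rotl32(value, count):
    """Rotate a 32-bit value left by count bits (0 < count < 32)."""
    value &= MASK32
    return ((value << count) | (value >> (32 - count))) & MASK32


def url_hash(url):
    """Hash an already normalised ASCII URL; initial value 0 is an assumption."""
    h = 0
    for byte in url.encode("ascii"):
        # Literal reading of: URL_Hash += _rotl(URL_Hash, 19) + char
        h = (h + rotl32(h, 19) + byte) & MASK32
    return h


def member_proxy_hash(name):
    """Hash a member proxy name, then spread it as described in 3.1."""
    h = 0
    for byte in name.lower().encode("ascii"):
        h = (h + rotl32(h, 19) + byte) & MASK32
    h = (h + h * 0x62531965) & MASK32
    return rotl32(h, 21)


def combined_hash(url_h, member_h):
    """XOR, multiply-add and rotate, following the combination step in 3.2."""
    c = (url_h ^ member_h) & MASK32
    c = (c + c * 0x62531965) & MASK32
    return rotl32(c, 21)
```

The draft's own pseudocode for the combination step follows.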
+ + Combined_Hash = (URL_Hash ^ MemberProxy_Hash) ; + Combined_Hash += Combined_Hash * 0x62531965 ; + Combined_Hash = _rotl(Combined_Hash, 21) ; + +Valloppillil [Page 7] + +INTERNET-DRAFT Cache Array Routing Protocol v1.0 26 Feb 1998 + +3.3. Load Factor + + Support for array members with differing HTTP processing & caching + capacity is achieved by multiplying each of the combined hash values + by a Load Factor Multiplier. + + The Load Factor Multiplier for an individual member is calculated by + taking each member's relative Load Factor and applying the + following formula: + + The Load Factor Multiplier must be calculated from the smallest + P_k to the largest P_k. The sum of all P_k's must be 1. + + For each proxy server 1,...,K, the Load Factor Multiplier, X_k, is + calculated iteratively as follows: + + All X_k values are 32 bit floating point numbers. + + X_1 = pow ((K*P_1), (1/K)) + + X_k = ([K-k+1] * [P_k - P_{k-1}])/(X_1 * X_2 * ... * X_{k-1}) + X_k += pow (X_{k-1}, {K-k+1}) + X_k = pow (X_k, {1/(K-k+1)}) + + where: + + X_k = Load Factor Multiplier for proxy k + K = number of proxies in an array + P_k = relative percent of the load that proxy k should handle + + This is then combined with the previously computed hashes as + + Resultant_value = Combined_Hash * X_k + + +3.4. Route Selection + + The "score" for a particular combination of URL plus proxy is its + resultant value. Once the agent determines the scores of the + K proxies, it routes the URL query to the proxy with the highest + score. + +3.5. Member Failure Routing + + If a proxy cannot contact the designated member of a proxy array + in order to forward an HTTP request, that proxy should route + the request to the second highest scoring proxy in the target array. + +4. Client-side implementation + + CARP can be implemented on client-side HTTP browsers via the + use of the Proxy AutoConfig file described in [1] and [2]. + +5. 
Versioning + + If a downstream proxy receives an Array Membership Table with a + greater version # than that proxy is able to parse, it should + fall back to simple proxy request routing to any administrator + defined upstream proxy server. + +Valloppillil [Page 8] + +INTERNET-DRAFT Cache Array Routing Protocol v1.0 26 Feb 1998 + +6. Security Considerations + + This draft does not discuss relevant security considerations. + +7. Open Issues + +8. Acknowledgements + + The author would like to thank Brian Smith, Kip Compton, and + Kerry Schwartz for their assistance in preparing this document. + + Most of the architecture & design of CARP stem from work conducted + by Brian Smith at Microsoft Corp. + +9. References + + [1] Luotonen, Ari., "Navigator Proxy Auto-Config File Format", + Netscape Corporation, http://home.netscape.com/eng/mozilla/2.0/ + relnotes/demo/proxy-live.html, March 1996. + + [3] Wessels, Duane., "Internet Cache Protocol Version 2", http://ds. + internic.net/internet-drafts/draft-wessels-icp-v2-00.txt, March 21, + 1997. + + [4] Sharp Corporation., "Super Proxy Script", + http://naragw.sharp.co.jp/sps/, August 9, 1996. + + [5] Fielding, R., et. al, "Hypertext Transfer Protocol -- HTTP/1.1", + RFC 2068, UC Irvine, January 1997. + + [6] Valloppillil & Cohen, "Hierarchical HTTP Routing Protocol", + http://ircache.nlanr.net/Cache/ICP/draft-vinod-icp-traffic-dist-00.txt, + April 21, 1997. + + [7] Thaler, David & Ravishankar, Chinya. "Using Name-Based + Mappings to Increase Hit Rates," ACM/IEEE Transactions on Networking. + to appear. + +Valloppillil [Page 9] + +INTERNET-DRAFT Cache Array Routing Protocol v1.0 21 Aug 1997 + +10. Author Information + + Vinod Valloppillil + Microsoft Corporation + One Microsoft Way + Redmond, WA 98052 + + Phone: 1.206.703.3460 + Email: VinodV@Microsoft.Com + + Keith W. 
Ross + University of Pennsylvania + Department of Systems Engineering + Philadelphia, PA 19104 + + Phone: 1.215.898.6069 + Email: Ross@UPenn.Edu + +Expires August 1998 diff --git a/doc/rfc/draft-wilson-wrec-wccp-v2-01.txt b/doc/rfc/draft-wilson-wrec-wccp-v2-01.txt new file mode 100644 index 0000000000..c7e16562e7 --- /dev/null +++ b/doc/rfc/draft-wilson-wrec-wccp-v2-01.txt @@ -0,0 +1,2332 @@ +INTERNET-DRAFT M Cieslak + D Forster + G Tiwana + R Wilson + Cisco Systems + 03 Apr 2001 + Expires Oct 2001 + + Web Cache Communication Protocol V2.0 + +Status of this Memo + + This document is an Internet-Draft and is in full conformance with all + provisions of Section 10 of RFC2026. + + Internet-Drafts are working documents of the Internet Engineering Task + Force (IETF), its areas, and its working groups. Note that other + groups may also distribute working documents as Internet-Drafts. + + Internet-Drafts are draft documents valid for a maximum of six months + and may be updated, replaced, or obsoleted by other documents at any + time. It is inappropriate to use Internet-Drafts as reference material + or to cite them other than as "work in progress". + + The list of current Internet-Drafts can be accessed at + http://www.ietf.org/ietf/lid-abstracts.txt. + + The list of Internet-Draft Shadow Directories can be accessed at + http://www.ietf.org/shadow.html. + +1. Abstract + + This document describes version 2.0 of the Web Cache Communication + Protocol (WCCP). The WCCP V2.0 protocol specifies interactions between + one or more routers and one or more web-caches. The purpose of the + interaction is to establish and maintain the transparent redirection + of selected types of traffic flowing through a group of routers. The + selected traffic is redirected to a group of web-caches with the aim + of optimising resource usage and lowering response times. 
+ + The protocol does not specify any interaction between the web-caches + within a group or between a web-cache and a web-server. + +2. Definitions + + Assignment Method + + The method by which redirected packets are distributed between + + + + [Page 1] + +web-caches. + + Designated Web-Cache. + + The web-cache in a web-cache farm responsible for dictating to the + router or routers how redirected traffic should be distributed between + the members of the farm. + + Forwarding Method + + The method by which redirected packets are transported from router to + web-cache. + + Packet Return Method + + The method by which packets redirected to a web-cache are returned to + a router for normal forwarding. + + Redirection Hash Table. + + A 256-bucket hash table maintained by the router or routers. This + table maps the hash index derived from a packet to be redirected to + the IP address of a destination web-cache. + + Service Group + + A group of one or more routers plus one or more web-caches working + together in the redirection of traffic whose characteristics are part + of the Service Group definition. + + Transparent Redirection. + + Transparent redirection is a technique used to deploy caching without + the need for reconfiguration of clients or servers. It involves the + interception and redirection of traffic to one or more web-caches by a + router or switch transparently to the end points of the traffic flow. + + Usable Web-Cache. + + From the viewpoint of a router a web-cache is considered a usable + member of a Service Group when it has sent that web-cache a + WCCP2_I_SEE_YOU message and has received in response a WCCP2_HERE_I_AM + message with a valid "Receive ID". + + Web-Cache Farm. + + One or more web-caches associated with a router or routers. + + + + + [Page 2] + +3. 
Introduction + +3.1 Protocol Overview + + WCCP V2.0 defines mechanisms to allow one or more routers enabled for + transparent redirection to discover, verify, and advertise + connectivity to one or more web-caches. + + Having established connectivity the routers and web-caches form + Service Groups to handle the redirection of traffic whose + characteristics are part of the Service Group definition. + + The protocol provides the means to negotiate the specific method + used for load distribution among web-caches and also the method used + to transport traffic between router and cache. + + A single web-cache within a Service Group is elected as the designated + web-cache. It is the responsibility of the designated web-cache to + provide routers with the data which determines how redirected traffic + is distributed between the web-caches in the Service Group. + +3.2 WCCP V2.0 enhancements + + WCCP V2.0 supports the following enhancements to the WCCP V1.0 + protocol. + + * Multi-Router Support. + WCCP V2.0 allows a farm of web-caches to be attached to more than one + router. + + * Multicast Support. + WCCP V2.0 supports multicasting of protocol messages between + web-caches and routers. + + * Improved Security. + WCCP V2.0 provides optional authentication of protocol packets + received by web-caches and routers. + + * Support for redirection of non-HTTP traffic. + WCCP V2.0 supports the redirection of traffic other than HTTP traffic + through the concept of Service Groups. + + * Packet return. + WCCP V2.0 allows a web-cache to decline to service a redirected packet + and to return it to a router to be forwarded. The method by which + packets are returned to a router is negotiable. + + + + + + [Page 3] + +* Alternative Hashing. + WCCP V2.0 allows the designated web-cache to mark individual buckets + in the Redirection Hash Table for a secondary hash. This allows the + traffic being hashed to a particular bucket to be distributed across + the members of a Service Group. 
+ + * Multiple Forwarding Methods + WCCP V2.0 allows individual web-caches to negotiate the method by + which packets are forwarded to a web-cache from a router. Packets + may now be forwarded unencapsulated using a Layer 2 destination + address rewrite. + + * Multiple Assignment Methods + WCCP V2.0 allows the designated web-cache to negotiate the method by which + packets are distributed between the web-caches in a service group. + Packets may now be assigned using a hashing scheme or a masking scheme. + + * Command and Status Information + WCCP V2.0 includes a mechanism to allow a web-cache to pass a command + to the routers in a Service Group. The same mechanism can be employed + by the routers to pass status information to the web-caches in a + Service Group. + +4. Protocol Description + +4.1 Joining a Service Group + + A web-cache joins and maintains its membership of a Service Group by + transmitting a WCCP2_HERE_I_AM message to each router in the Group at + HERE_I_AM_T (10) second intervals. This may be by unicast to each + router or multicast to the configured Service Group multicast + address. The Web Cache Info component in the WCCP2_HERE_I_AM message + identifies the web-cache by IP address. The Service Info component of + the WCCP2_HERE_I_AM message identifies and describes the Service Group in + which the web-cache wishes to participate. + + A router responds to a WCCP2_HERE_I_AM message with a WCCP2_I_SEE_YOU + message. If the WCCP2_HERE_I_AM message was unicast then the router will + respond immediately with a unicast WCCP2_I_SEE_YOU message. If the + WCCP2_HERE_I_AM message was multicast the router will respond via the + scheduled multicast WCCP2_I_SEE_YOU message for the Service Group. + + A router responds to multicast web-cache members of a Service Group + using a multicast WCCP2_I_SEE_YOU message transmitted at 9 second + intervals with a 10% jitter. 
+ + The Router Identity component in a WCCP2_I_SEE_YOU message includes a list + of the web-caches to which the packet is addressed. A web-cache not + + + + [Page 4] + +in the list should discard the WCCP2_I_SEE_YOU message. + +4.2 Describing a Service Group + + The Service Info component of a WCCP2_HERE_I_AM message describes the + Service Group in which a web-cache wishes to participate. A Service + Group is identified by Service Type and Service ID. There are two + types of Service Group: + + * Well Known Services + * Dynamic Services. + + Well Known Services are known by both routers and web-caches and do + not require a description other than a Service ID. + + In contrast Dynamic Services must be described to a router. A router + may be configured to participate in a particular Dynamic Service + Group, identified by Service ID, without any knowledge of the + characteristics of the traffic associated with the Service Group. The + traffic description is communicated to the router in the + WCCP2_HERE_I_AM message of the first web-cache to join the Service + Group. A web-cache describes a Dynamic Service using the Protocol, + Service Flags and Port fields of the Service Info component. Once a + Dynamic Service has been defined a router will discard any subsequent + WCCP2_HERE_I_AM message which contains a conflicting description. A + router will also discard a WCCP2_HERE_I_AM message which describes a + Service Group for which the router has not been configured. + +4.3 Establishing Two-Way Connectivity + + WCCP V2.0 uses a "Receive ID" to verify two-way connectivity between a + router and a web-cache. The Router Identity Info component of a + WCCP2_I_SEE_YOU message contains a "Receive ID" field. This field is + maintained separately for each Service Group and its value is + incremented each time the router sends a WCCP2_I_SEE_YOU message to + the Service Group. 
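The router-side bookkeeping for the "Receive ID" can be sketched as follows. The acceptance check in the sketch corresponds to the rule stated in the paragraphs after it. The class and method names are hypothetical, the starting value of the counter is an assumption, and no field width or wrap-around is modelled.

```python
class ReceiveIdTracker:
    """Receive ID state a router keeps for one Service Group (illustrative)."""

    def __init__(self):
        self.receive_id = 0   # starting value is an assumption
        self.last_sent = {}   # web-cache IP -> Receive ID last sent to it

    def next_i_see_you(self, cache_ips):
        """Increment the counter for one WCCP2_I_SEE_YOU and record its recipients."""
        self.receive_id += 1
        for ip in cache_ips:
            self.last_sent[ip] = self.receive_id
        return self.receive_id

    def accept_here_i_am(self, cache_ip, reflected_id):
        """True only if the reflected value matches the last one sent to that cache."""
        return self.last_sent.get(cache_ip) == reflected_id
```

A WCCP2_HERE_I_AM message for which `accept_here_i_am` returns False would be discarded.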
+ + The "Receive ID" sent by a router is reflected back by a web-cache in + the Web-Cache View Info component of a WCCP2_HERE_I_AM message. A + router checks the value of the "Receive ID" in each WCCP2_HERE_I_AM + message received from a Service Group member. If the value does not + match the "Receive ID" in the last WCCP2_I_SEE_YOU message sent to + that member the message is discarded. + + A router considers a web-cache to be a usable member of a Service + Group only after it has sent that web-cache a WCCP2_I_SEE_YOU message + and received a WCCP2_HERE_I_AM message with a valid "Receive ID" in + response. + + + + [Page 5] + +4.4 Negotiating the Forwarding Method + + A web-cache and router may negotiate the method by which packets are + forwarded to the web-cache by the router. + + This negotiation is per web-cache, per Service Group. Thus web-caches + participating in the same Service Group may negotiate different + forwarding methods with the Service Group routers. + + A router will advertise the supported forwarding methods for a Service + Group using the optional Capabilities Info component of the + WCCP2_I_SEE_YOU message. The absence of such an advertisement implies + the router supports the default GRE encapsulation method only. + + A web-cache will inspect the forwarding method advertisement in the + first WCCP2_I_SEE_YOU message received from a router for a particular + Service Group. If the router does not advertise a method supported by + the web-cache then the web-cache will abort its attempt to join the + Service Group. Otherwise the web-cache will pick one method from those + advertised by the router and specify that in the optional Capabilities + Info component of its next WCCP2_HERE_I_AM message. Absence of a + forwarding method advertisement in a WCCP2_HERE_I_AM message implies + the cache is requesting the default GRE encapsulation method. 
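The web-cache side of this negotiation can be sketched as a small selection function. The method labels and the function name are hypothetical, and choosing by the cache's own preference order is an assumption: the text only requires that the cache pick one of the advertised methods.

```python
GRE = "gre"  # default GRE encapsulation (label is illustrative)
L2 = "l2"    # unencapsulated Layer 2 rewrite (label is illustrative)


def choose_forwarding_method(advertised, supported):
    """Pick a forwarding method, or return None to abort joining the group.

    advertised: methods listed in the router's Capabilities Info component,
                or None when the router sent no such advertisement.
    supported:  methods this web-cache supports, most preferred first.
    """
    # No advertisement implies the router supports GRE encapsulation only.
    offered = [GRE] if advertised is None else list(advertised)
    for method in supported:
        if method in offered:
            return method
    return None
```

When the chosen method is GRE, the cache may omit the advertisement from its WCCP2_HERE_I_AM message, since its absence requests the default.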
+ + A router will inspect the forwarding method selected by a web-cache in + the WCCP2_HERE_I_AM message received in response to a WCCP2_I_SEE_YOU + message. If the selected method is not supported by the router the + router will ignore the WCCP2_HERE_I_AM message. If the forwarding + method is supported the router will accept the web-cache as usable and + add it to the Service Group. + +4.5 Negotiating the Assignment Method + + A web-cache and router may negotiate the method by which packets are + distributed between the web-caches in a Service Group. + + The negotiation is per Service. Thus web-caches participating in + several Service Groups may negotiate a different assignment method for + each Service Group. + + A router will advertise the supported assignment methods for a + Service Group using the optional Capabilities Info component of the + WCCP2_I_SEE_YOU message. The absence of such an advertisement implies + the router supports the default Hash assignment method only. + + A web-cache will inspect the assignment method advertisement in the + first WCCP2_I_SEE_YOU message received from a router for the Service + Group. If the router does not advertise a method supported by the + + + + [Page 6] + +web-cache then the web-cache will abort its attempt to join the + Service Group. Otherwise the web-cache will pick one method from those + advertised by the router and specify that in the optional Capabilities + Info component of its next WCCP2_HERE_I_AM message. Absence of an + assignment method advertisement in a WCCP2_HERE_I_AM message implies + the cache is requesting the default Hash assignment method. + + A router will inspect the assignment method selected by a web-cache in + the WCCP2_HERE_I_AM message received in response to a WCCP2_I_SEE_YOU + message. If the selected method is not supported by the router the + router will ignore the WCCP2_HERE_I_AM message. 
If the assignment + method is supported the router will accept the web-cache as usable and + add it to the Service Group. + +4.5 Negotiating the Packet Return Method + + A web-cache and router may negotiate the method by which packets are + returned from a web-cache to a router for normal forwarding. + + The negotiation is per Service. Thus web-caches participating in + several Service Groups may negotiate a different packet return method + for each Service Group. + + A router will advertise the supported packet return methods for a + Service Group using the optional Capabilities Info component of the + WCCP2_I_SEE_YOU message. The absence of such an advertisement implies + the router supports the default GRE packet return method only. + + A web-cache will inspect the packet return method advertisement in the + first WCCP2_I_SEE_YOU message received from a router for the Service + Group. If the router does not advertise a method supported by the + web-cache then the web-cache will abort its attempt to join the + Service Group. Otherwise the web-cache will pick one method from those + advertised by the router and specify that method in the optional + Capabilities Info component of its next WCCP2_HERE_I_AM + message. Absence of a packet return method advertisement in a + WCCP2_HERE_I_AM message implies the cache is requesting the default + GRE packet return method. + + A router will inspect the packet return method selected by a web-cache + in the WCCP2_HERE_I_AM message received in response to a + WCCP2_I_SEE_YOU message. If the selected method is not supported by + the router the router will ignore the WCCP2_HERE_I_AM message. If the + packet return method is supported the router will accept the web-cache + as usable and add it to the Service Group. + + + + + + + [Page 7] + +4.6 Advertising Views of the Service Group + + Each router advertises its view of a Service Group via the Router View + Info component in the WCCP2_I_SEE_YOU message it sends to web-caches. 
+ This component includes a list of the useable web-caches in the + Service Group as seen by the router and a list of the routers in the + Service Group as reported in WCCP2_HERE_I_AM messages from + web-caches. A change number in the component is incremented if the + Service Group membership has changed since the last WCCP2_I_SEE_YOU + message sent by the router. + + Each web-cache advertises its view of the Service Group via the Web + Cache View Info component in the WCCP2_HERE_I_AM message it sends to + routers in the Service Group. This component includes the list of + routers that have sent the web-cache a WCCP2_I_SEE_YOU message and a + list of web-caches learnt from the WCCP2_I_SEE_YOU messages. The Web + Cache View Info component also includes a change number which is + incremented each time Service Group membership information changes. + +4.7 Security + + WCCP V2.0 provides a security component in each protocol message to + allow simple authentication. Two options are supported: + + * No Security (default) + * MD5 password security + + MD5 password security requires that each router and web-cache wishing + to join a Service Group be configured with the Service Group + password. Each WCCP protocol packet sent by a router or web-cache for + that Service Group will contain in its security component the MD5 + checksum of the WCCP protocol message (including the WCCP message + header) and a Service Group password. Each web-cache or router in the + Service Group will authenticate the security component in a received + WCCP message immediately after validating the WCCP message header. + Packets failing authentication will be discarded. + +4.8 Distribution of Traffic Assignments + + WCCP V2.0 allows the traffic assignment method to be negotiated. 
There + are two types of information to be communicated depending on the + assignment method: + + * Hash Tables + * Mask/Value Sets + + + + + + + [Page 8] + +4.8.1 Hash Tables + + When using hash assignment each router uses a 256-bucket Redirection + Hash Table to distribute traffic for a Service Group across the member + web-caches. It is the responsibility of the Service Group's designated + web-cache to assign each router's Redirection Hash Table. + + The designated web-cache uses a WCCP2_REDIRECT_ASSIGNMENT message to + assign the routers' Redirection Hash Tables. This message is + generated following a change in Service Group membership and is sent + to the same set of addresses to which the web-cache sends WCCP2_HERE_I_AM + messages. The designated web-cache will wait 1.5 HERE_I_AM_T + seconds following a change before generating the message in order to + allow the Service Group membership to stabilise. + + The Redirection Hash Tables can be conveyed in either an Assignment + Info Component or an Alternate Assignment Component within a + WCCP2_REDIRECT_ASSIGNMENT. Both components contain an Assignment + Key. This will be reflected back to the designated web-cache in + subsequent WCCP2_I_SEE_YOU messages from the routers in the Service + Group. A WCCP2_REDIRECT_ASSIGNMENT may be repeated after HERE_I_AM_T + seconds if inspection of WCCP2_I_SEE_YOU messages indicates a router + has not received an assignment. + + A router will flush its Redirection Hash Table if a + WCCP2_REDIRECT_ASSIGNMENT is not received within 5 HERE_I_AM_T seconds + of a Service Group membership change. A router will flush its + Redirection Hash Table if it receives a WCCP2_REDIRECT_ASSIGNMENT + message in which it is not listed. + + The designated web-cache lists the web-caches to which traffic should + be distributed in either an Assignment Info Component or an Alternate + Assignment Component within a WCCP2_REDIRECT_ASSIGNMENT message. 
Only + those web-caches seen by every router in the Service Group are + included. + +4.8.2 Mask/Value Sets + + When using mask assignment each router uses masks and a table of + values to distribute traffic for a Service Group across the member + web-caches. It is the responsibility of the Service Group's designated + web-cache to assign each router's mask/value sets. + + The designated web-cache uses the Alternate Assignment Component in a + WCCP2_REDIRECT_ASSIGNMENT message to assign the routers' mask/value + set. This message is generated following a change in Service Group + membership and is sent to the same set of addresses to which the + web-cache sends WCCP2_HERE_I_AM messages. The designated web-cache + + + + [Page 9] + +will wait 1.5 HERE_I_AM_T seconds following a change before generating + the message in order to allow the Service Group membership to + stabilise. + + The Alternate Assignment Info component of the + WCCP2_REDIRECT_ASSIGNMENT contains an Assignment Key. This will be + reflected back to the designated web-cache in subsequent + WCCP2_I_SEE_YOU messages from the routers in the Service Group. A + WCCP2_REDIRECT_ASSIGNMENT message may be repeated after HERE_I_AM_T + seconds if inspection of WCCP2_I_SEE_YOU messages indicates a router + has not received an assignment. + + A router will flush its mask/value set if a WCCP2_REDIRECT_ASSIGNMENT + is not received within 5 HERE_I_AM_T seconds of a Service Group + membership change. A router will flush its mask/value set if it + receives a WCCP2_REDIRECT_ASSIGNMENT in which it is not listed. + + The designated web-cache lists the web-caches to which traffic should + be distributed in the Alternate Assignment Info component of the + WCCP2_REDIRECT_ASSIGNMENT message. Only those web-caches seen by every + router in the Service Group are included. 
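The rule that only web-caches seen by every router are included in an assignment amounts to a set intersection over the routers' views. The sketch below assumes the designated web-cache has collected, per router, the set of usable web-caches reported in that router's WCCP2_I_SEE_YOU messages. The names and the dictionary layout are hypothetical.

```python
HERE_I_AM_T = 10                      # seconds, as given in 4.1
ASSIGNMENT_DELAY = 1.5 * HERE_I_AM_T  # wait after a membership change


def assignable_caches(router_views):
    """Return the web-caches that appear in every router's view.

    router_views: dict mapping a router address to the set of web-cache
                  addresses that router reports as usable.
    """
    views = [set(view) for view in router_views.values()]
    if not views:
        return []
    # Sorting is only for a deterministic result; the text requires no order.
    return sorted(set.intersection(*views))
```

For example, with two routers that report {A, B} and {B, C}, only B would be listed in the WCCP2_REDIRECT_ASSIGNMENT message.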
+ +4.9 Electing the Designated Web-cache + + Election of the designated web-cache will take place once a Service + Group membership has stabilised following a change. The designated + web-cache must be receiving a WCCP2_I_SEE_YOU message from every + router in the Service Group. + + Election of the designated web-cache is not part of the WCCP + protocol. However it is recommended that the web-cache with the lowest + IP address is selected as designated web-cache for a Service Group. + +4.10 Traffic Interception + + A router will check packets passing through it against its set of + Service Group descriptions. The Service Group descriptions are + checked in priority order. A packet which matches a Service Group + description is a candidate for redirection to a web-cache in the + Service Group. + + A router will not redirect a packet with a source IP address matching + any web-cache in the Service Group. + + + + + + + + + [Page 10] + +4.11 Traffic Redirection + +4.11.1 Redirection with Hash Assignment + + Redirection with hash assignment is a two-stage process. In the first + stage a primary key is formed from the packet (as defined by the + Service Group description) and hashed to yield an index into the + Redirection Hash Table. + + If the Redirection Hash Table entry contains an unflagged web-cache + index then the packet is redirected to that web-cache. If the bucket + is unassigned the packet is forwarded normally. If the bucket is + flagged as requiring a secondary hash then a secondary key is formed + (as defined by the Service Group description) and hashed to yield an + index into the Redirection Hash Table. If the secondary entry contains + a web-cache index then the packet is directed to that web-cache. If the + entry is unassigned the packet is forwarded normally. 
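The two-stage lookup in 4.11.1 can be sketched as follows. The `Bucket` type is a hypothetical in-memory representation, not the encoding used on the wire, and the hash that turns a key into a bucket index is not defined in this part of the text, so the sketch takes both indices as inputs. Ignoring the flag on the secondary entry is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Bucket:
    cache_index: Optional[int] = None  # None means the bucket is unassigned
    secondary: bool = False            # flagged as requiring a secondary hash


def redirect_target(table: Sequence[Bucket], primary_index: int,
                    secondary_index: int) -> Optional[int]:
    """Return the web-cache index to redirect to, or None to forward normally."""
    if len(table) != 256:
        raise ValueError("the Redirection Hash Table has 256 buckets")
    entry = table[primary_index]
    if entry.secondary:
        # Second stage: look up the bucket selected by the secondary key.
        # The flag on that entry is not examined again (assumption).
        entry = table[secondary_index]
    return entry.cache_index
```

A result of None corresponds to the packet being forwarded normally.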
+ +4.11.2 Redirection with Mask Assignment + + The first step in redirection using the mask assignment method is to + perform a bitwise AND operation between the mask from the first + mask/value set in the Service Group definition and the contents of the + packet. The output of this operation is the set of fields in the packet + which will be used for value matching. The selected fields from the + packet are then compared against each entry in the list of values for + that mask/value set. If a match is found the packet is redirected to + the web-cache associated with the value entry. If no match is found + the process is repeated for each mask/value set defined for the + Service Group. If, after trying all of the mask/value sets defined + for the Service Group, no match is found, the packet is forwarded + normally. + + Mask/value sets are processed in the order in which they are + presented in the Alternate Assignment component. Value elements are + compared in the order in which they appear in the mask/value set of which + they are part. + +4.12 Traffic Forwarding + + WCCP allows the negotiation of the forwarding method between router + and web-cache (See Negotiating the Forwarding Method). The currently + defined forwarding methods are: + + * GRE Encapsulated + * Unencapsulated with L2 rewrite + + + + + + [Page 11] + +4.12.1 Forwarding with GRE Encapsulation + + Redirected packets are encapsulated in a new IP packet with a GRE [1] + header followed by a four-octet Redirect header. + + The GRE encapsulation uses the simple four-octet GRE header with the + two Flags and Version octets set to zero and a Protocol Type of + 0x883E. 
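The four-octet GRE header described above can be built with Python's struct module. This is a sketch of the header layout only; the constant and function names are hypothetical.

```python
import struct

WCCP2_GRE_PROTOCOL_TYPE = 0x883E


def gre_header():
    """Return the four-octet GRE header: zero Flags/Version, then Protocol Type."""
    # "!HH" packs two unsigned 16-bit fields in network byte order.
    return struct.pack("!HH", 0, WCCP2_GRE_PROTOCOL_TYPE)


def is_wccp_gre(data):
    """Check that data starts with the simple GRE header used for redirection."""
    if len(data) < 4:
        return False
    flags_version, protocol_type = struct.unpack("!HH", data[:4])
    return flags_version == 0 and protocol_type == WCCP2_GRE_PROTOCOL_TYPE
```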
+ + The Redirect header is as follows: + + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + |D|A| Reserved | Service ID | Alt Bucket | Pri Bucket | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + D Dynamic Service + 0: Well known service + 1: Dynamic service + + A Alternative bucket used + 0: Primary bucket used + 1: Alternative bucket used + + Service ID + + Service Group identifier + + Alt Bucket + + Alternative bucket index used to redirect the packet. Only valid + for hash assignment. + + Pri Bucket + + Primary bucket index used to redirect the packet. Only valid for hash + assignment. + +4.12.2 Forwarding with L2 Rewrite + + Redirected packets are not encapsulated. The router replaces the + packet's destination MAC address with the MAC address of the target + web-cache. + + This forwarding method requires that the target web-cache + be directly-connected to the router at Layer 2. A router will not + allow a web-cache which is not directly attached to negotiate this + forwarding method. + + + + + [Page 12] + +4.13 Packet Return + + WCCP V2.0 allows a web-cache to decline a redirected packet and return + it to a router for normal forwarding as specified by the packet's + destination IP address. The method by which packets are returned from + web-cache to router is a matter for negotiation (see Negotiating the + Packet Return Method). + + When a router receives a returned packet it must not attempt to + redirect that packet back to a web-cache. Two methods are available to + prevent any further redirection: + + * Interface Configuration + * Encapsulation + + The interface configuration method requires that a router is + configured to inhibit redirection of packets arriving over interfaces + connected to web-caches. Redirection may be disabled for all packets + arriving on an interface or for packets where the source MAC + address is that of a web-cache. 
This mechanism is efficient but is + topology dependent and thus may not always be suitable. In this case + the packet return method in use is L2. + + The encapsulation method requires a web-cache to send returned packets + to a router with encapsulation. Returned packets are encapsulated in a + GRE packet [1] with a Protocol Type of 0x883E and contain the original + Redirect Header or a null Redirect Header if none was present in the + original redirected packet. The receiving router removes the GRE + encapsulation from the packets and forwards them without attempting to + redirect. The packet return method used in this case is GRE. + +4.14 Querying Cache Time-Out + + If a router does not receive a WCCP2_HERE_I_AM message from a Service + Group member for 2.5 * HERE_I_AM_T seconds it will query the member by + unicasting a WCCP2_REMOVAL_QUERY message to it. The target Service + Group member should respond by sending a series of 3 identical + WCCP2_HERE_I_AM messages, each separated by HERE_I_AM_T/10 seconds. + + If a router does not receive a WCCP2_HERE_I_AM message from a Service + Group member for 3 * HERE_I_AM_T seconds it will consider the member + to be unusable and remove it from the Service Group. The web-cache + will no longer appear in the Router View Info component of the + WCCP2_I_SEE_YOU message. + + The web-cache will be purged from the assignment data for the Service + Group. + + + + + [Page 13] + +4.15 Command and Status Information + + WCCP V2.0 includes a mechanism to allow web-caches to send commands to + routers within a Service Group. The same mechanism can be used by the + routers to provide status information to web-caches. + + The mechanism is implemented by the Command Extension component. This + component is included in the WCCP2_HERE_I_AM message from a web-cache + passing commands to routers in a Service Group. 
+ + If a router needs to send status information to a web-cache it will + include a command in the Command Extension component within its own + WCCP2_I_SEE_YOU message. That command will indicate the type of status + information being carried. + +5. Protocol Messages + + Each WCCP protocol message is carried in a UDP packet with a + destination port of 2048. There are four WCCP V2.0 messages: + + * Here I Am + * I See You + * Redirect Assign + * Removal Query + +5.1 'Here I Am' Message + + +--------------------------------------+ + | WCCP Message Header | + +--------------------------------------+ + | Security Info Component | + +--------------------------------------+ + | Service Info Component | + +--------------------------------------+ + | Web-Cache Identity Info Component | + +--------------------------------------+ + | Web-Cache View Info Component | + +--------------------------------------+ + | Capability Info Component (optional) | + +--------------------------------------+ + |Command Extension Component (optional)| + +--------------------------------------+ + + + + + + + + + + [Page 14] + +5.2 'I See You' Message + + +--------------------------------------+ + | WCCP Message Header | + +--------------------------------------+ + | Security Info Component | + +--------------------------------------+ + | Service Info Component | + +--------------------------------------+ + | Router Identity Info Component | + +--------------------------------------+ + | Router View Info Component | + +--------------------------------------+ + | Assignment Info Component | + | OR | + | Assignment Map Component | + +--------------------------------------+ + | Capability Info Component (optional) | + +--------------------------------------+ + |Command Extension Component (optional)| + +--------------------------------------+ + +5.3 'Redirect Assign' Message + + +--------------------------------------+ + | WCCP Message Header | + 
+--------------------------------------+ + | 
Security Info Component | + +--------------------------------------+ + | Service Info Component | + +--------------------------------------+ + | Assignment Info Component | + | OR | + | Alternate Assignment Component | + +--------------------------------------+ + +5.4 'Removal Query' Message + + +--------------------------------------+ + | WCCP Message Header | + +--------------------------------------+ + | Security Info Component | + +--------------------------------------+ + | Service Info Component | + +--------------------------------------+ + | Router Query Info Component | + +--------------------------------------+ + + + + + [Page 15] + +5.5 WCCP Message Header + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Version | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_HERE_I_AM (10) + WCCP2_I_SEE_YOU (11) + WCCP2_REDIRECT_ASSIGN (12) + WCCP2_REMOVAL_QUERY (13) + + Version + + 0x200 + + Length + + Length of the WCCP message not including the WCCP Message Header. + + +5.6 Message Components + + Each WCCP message comprises a WCCP Message Header followed by a number of + message components. The defined components are: + + * Security Info + * Service Info + * Router Identity Info + * Web-Cache Identity Info + * Router View Info + * Web-Cache View Info + * Assignment Info + * Router Query Info + * Capabilities Info + * Alternate Assignment + * Assignment Map + * Command Extension + + Components are padded to align on a four-octet boundary. Each + component has a 4-octet header specifying the component type and + length. Note that the length value does not include the 4-octet + component header. 
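The component framing just described can be sketched in C as a small iterator over the components of a message. This is only an illustration under stated assumptions: the names wccp2_component and wccp2_next_component are hypothetical, the Type and Length fields are assumed to be 16 bits each in network byte order, and the Length value is assumed not to include any trailing pad octets, so the sketch rounds it up to the next four-octet boundary before advancing.

```c
#include <stddef.h>
#include <stdint.h>

#define WCCP2_COMPONENT_HEADER_LEN 4 /* 4-octet component header */

/* Hypothetical decoded view of one component. */
struct wccp2_component {
    uint16_t type;       /* component type */
    uint16_t length;     /* length of the body, header excluded */
    const uint8_t *body; /* first octet after the component header */
};

/*
 * Decode the component that starts at offset off within msg.
 * Returns the offset of the next component, or 0 if the header or
 * body would run past the end of the message.
 */
size_t wccp2_next_component(const uint8_t *msg, size_t msglen, size_t off,
                            struct wccp2_component *out)
{
    size_t padded;

    if (msg == NULL || out == NULL || off > msglen ||
        msglen - off < WCCP2_COMPONENT_HEADER_LEN)
        return 0;

    /* Assumption: both fields are big-endian 16-bit values. */
    out->type   = (uint16_t)((msg[off] << 8) | msg[off + 1]);
    out->length = (uint16_t)((msg[off + 2] << 8) | msg[off + 3]);
    out->body   = msg + off + WCCP2_COMPONENT_HEADER_LEN;

    /* Assumption: padding is not counted in Length, so round up to 4. */
    padded = ((size_t)out->length + 3u) & ~(size_t)3u;
    if (msglen - off - WCCP2_COMPONENT_HEADER_LEN < padded)
        return 0;

    return off + WCCP2_COMPONENT_HEADER_LEN + padded;
}
```

Starting at the offset that follows the WCCP Message Header, a receiver could call this repeatedly until it returns 0 or reaches the end of the message.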
+ + + + [Page 16] + +5.6.1 Security Info Component + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Security Option | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Security Implementation | + | . | + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_SECURITY_INFO (0) + + Length + + Length of the remainder of the component. + + Security Option + + WCCP2_NO_SECURITY (0) + WCCP2_MD5_SECURITY (1) + + Security Implementation + + If Security Option has the value WCCP2_NO_SECURITY then this field is + not present. If Security Option has the value WCCP2_MD5_SECURITY this + is a 16-octet field containing the MD5 checksum of the WCCP message and + the Service Group password. The maximum password length is 8 octets. + + Prior to calculating the MD5 checksum the password should be padded + out to 8 octets with trailing zeros and the Security Implementation + field of the Security Option set to zero. The MD5 checksum is calculated + using the 8 octet padded password and the WCCP message (including the + WCCP Message Header). + + + + + + + + + + + + [Page 17] + +5.6.2 Service Info Component + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Service Type | Service ID | Priority | Protocol | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Service Flags | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Port 0 | Port 1 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + | . 
| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Port 6 | Port 7 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_SERVICE_INFO (1) + + Length + + Length of the remainder of the component. + + Service Type + + WCCP2_SERVICE_STANDARD (0). + Service is a well known service and is described by the Service ID. + All fields other than Service ID must be zero. + + WCCP2_SERVICE_DYNAMIC (1). + Service is defined by the Protocol, Service Flags and Port fields. + + Service ID + + Service number. A number in the range 0-255. For well known services + numbers in the range 0-50 are reserved. The numbers currently defined + for well known services are: + + 0x00 HTTP + + + + + + + + [Page 18] + +Priority + + Service priority. The lowest priority is 0, the highest is + 255. Packets for redirection are matched against Services in priority + order, highest first. Well known services have a priority of 240. + + Protocol + + IP protocol identifier + + Service Flags + + 0x0001 Source IP Hash + 0x0002 Destination IP Hash + 0x0004 Source Port Hash + 0x0008 Destination Port Hash + 0x0010 Ports Defined. + 0x0020 Ports Source. + 0x0100 Source IP Alternative Hash + 0x0200 Destination IP Alternative Hash + 0x0400 Source Port Alternative Hash + 0x0800 Destination Port Alternative Hash + + The primary hash flags (Source IP Hash, Destination IP Hash, Source + Port Hash, Destination Port Hash) determine the key which will be + hashed to yield the Redirection Hash Table primary bucket index. If + only the Destination IP Hash flag is set then the packet destination + IP address is used as the key. Otherwise if any of the primary hash + flags are set then the key is constructed by XORing the appropriate + fields from the packet with the key (which has an initial value of + zero). 
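The key construction described above might be sketched in C as follows. The struct wccp2_flow and the function wccp2_primary_key are hypothetical names introduced for illustration. The sketch assumes all fields are held in host byte order and that a 16-bit port is XORed into the low-order bits of the 32-bit key; the text does not specify either point.

```c
#include <stdint.h>

/* Primary hash flags from the Service Flags field. */
#define WCCP2_SRC_IP_HASH   0x0001
#define WCCP2_DST_IP_HASH   0x0002
#define WCCP2_SRC_PORT_HASH 0x0004
#define WCCP2_DST_PORT_HASH 0x0008

/* Hypothetical view of the packet fields that may feed the key. */
struct wccp2_flow {
    uint32_t src_ip;
    uint32_t dst_ip;
    uint16_t src_port;
    uint16_t dst_port;
};

/*
 * Build the primary hash key by XORing each selected field into a key
 * that starts at zero. With only the Destination IP Hash flag set the
 * result is simply the destination IP address.
 */
uint32_t wccp2_primary_key(uint32_t service_flags, const struct wccp2_flow *f)
{
    uint32_t key = 0;

    if (service_flags & WCCP2_SRC_IP_HASH)
        key ^= f->src_ip;
    if (service_flags & WCCP2_DST_IP_HASH)
        key ^= f->dst_ip;
    if (service_flags & WCCP2_SRC_PORT_HASH)
        key ^= f->src_port; /* assumption: port occupies the low 16 bits */
    if (service_flags & WCCP2_DST_PORT_HASH)
        key ^= f->dst_port; /* assumption: port occupies the low 16 bits */

    return key;
}
```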
+ + The key is hashed using the following algorithm: + + ulong hash = key; + hash ^= hash >> 16; + hash ^= hash >> 8; + return(hash & 0xFF); + + If alternative hashing has been enabled for the primary bucket (see + Assignment Info Component) the alternate hash flags (Source IP + Alternative Hash, Destination IP Alternative Hash, Source Port + Alternative Hash, Destination Port Alternative Hash) determine the + key which will be hashed to yield a secondary bucket index. The key + is constructed by XORing the appropriate fields from the packet with + a key (which has an initial value of zero). + + + + + + [Page 19] + +Port 0-7 + + Zero terminated list of UDP or TCP port identifiers. Packets will be + matched against this set of ports if the Ports Defined flag is set. If + the Ports Source flag is set the port information refers to a source + port, if clear the port information refers to a destination port. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + [Page 20] + +5.6.3 Router Identity Info Component + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Router ID Element | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Sent To Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number Received From | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Received From Address 0 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Received From Address n | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_ROUTER_ID_INFO (2) + + Length + + Length of the remainder of the component. 
+ + Router ID Element + + Element containing the router's identifying IP address and Receive + ID. The IP address must be a valid, reachable address for the router. + + Sent To Address + + IP address to which the target web-cache sent the WCCP2_HERE_I_AM + message. When this component is present in a unicast WCCP2_I_SEE_YOU + message it will contain the IP address that the target web-cache + used. When present in a multicast WCCP2_I_SEE_YOU message it will + contain the Service Group multicast address. + + Number Received From + + The number of web-caches to which this message is directed. When using + multicast addressing it may be less than the number of caches which + + + + [Page 21] + +actually see the message. + + Received From Address 0-n + + List of the IP addresses of web-caches to which this message is + directed. When using multicast addressing it may be a subset of the + caches which actually see the message. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + [Page 22] + +5.6.4 Web-Cache Identity Info Component + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Web-Cache Identity Element | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_WC_ID_INFO (3) + + Length + + Length of the remainder of the component. + + Web-Cache Identity Element + + Element containing the web-cache IP address and Redirection Hash Table + mapping. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + [Page 23] + +5.6.5 Router View Info Component + + This represents a router's view of the Service Group. 
+ + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Member Change Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Assignment Key | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number of Routers | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Router 0 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Router n | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number of Web-Caches | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Web-Cache Identity Element 0 | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Web-Cache Identity Element n | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_RTR_VIEW_INFO (4) + + Length + + Length of the remainder of the component. + + Member Change Number + + Incremented each time there is a change in Service Group membership. + + + + [Page 24] + +Assignment Key + + Assignment Key element received in the last WCCP2_REDIRECT_ASSIGNMENT + message. Used by the designated web-cache to verify that an assignment + has been executed. + + Number of Routers + + Number of routers in the Service Group + + Router 0-n + + IP addresses of routers in the Service Group. This list is constructed + from routers reported by web-caches via WCCP2_HERE_I_AM messages. Note + that a router does not include itself in the list unless it has also + been reported via a WCCP2_HERE_I_AM message. 
+ + Number of Web-Caches + + Number of useable web-caches in the Service Group + + Web-Cache Identity Element 0-n + + Identity elements of useable web-caches in Service Group. This list + contains web-caches that have sent the router a WCCP2_HERE_I_AM + message with a valid "Received ID". + + + + + + + + + + + + + + + + + + + + + + + + + + [Page 25] + +5.6.6 Web Cache View Info Component + + This represents a web-cache's view of the Service Group. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Change Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number of Routers | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Router ID Element 0 | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Router ID Element n | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number of Web-Caches | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Web Cache address 0 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Web Cache address n | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_WC_VIEW_INFO (5) + + Length + + Length of the remainder of the component. + + Change Number + + Incremented each time there is a change in the view. + + Number of Routers + + Number of routers in the Service Group + + + + + [Page 26] + +Router ID Element 0-n + + List of elements containing the identifying IP address for each router + in the Service Group and the last "Received ID" from each. 
+ + Number of Web-Caches + + Number of web-caches in the Service Group + + Web Cache address 0-n + + List of web-cache IP addresses learnt from WCCP2_I_SEE_YOU messages. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + [Page 27] + +5.6.7 Assignment Info Component + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Assignment Key | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number of Routers | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Router Assignment Element 0 | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Router Assignment Element n | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number of Web-Caches | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Web-Cache 0 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Web-Cache n | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Bucket 0 | Bucket 1 | Bucket 2 | Bucket 3 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Bucket 252 | Bucket 253 | Bucket 254 | Bucket 255 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_REDIRECT_ASSIGNMENT (6) + + Length + + Length of the remainder of the component. + + + + + + + [Page 28] + +Assignment Key + + The designated web-cache expects this element to be returned by a router + in subsequent WCCP2_I_SEE_YOU messages. 
+ + Number of Routers + + Number of routers reachable by the designated web-cache. + + Router Assignment Element 0-n + + Elements containing the router IP address, "Receive ID" and "Change + Number" for each router. + + Number of Web-Caches + + Number of useable web-caches in the Service Group seen by all routers. + + Web Cache 0-n + + List of the IP addresses of useable web-caches in Service Group. The + position of a web-cache identifier in this list is the web-cache + index. The first entry in the list has an index of zero. + + Bucket 0-255 + + Contents of the Redirection Hash Table. The content of each bucket is a + web-cache index value in the range 0-31. If set the A flag indicates + that alternative hashing should be used for this web-cache. The value + 0xFF indicates no web-cache has been assigned to the bucket. + + 0 1 2 3 4 5 6 7 + +-+-+-+-+-+-+-+-+ + | Index |A| + +-+-+-+-+-+-+-+-+ + + + + + + + + + + + + + + + + + [Page 29] + +5.6.8 Router Query Info Component + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Router ID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Receive ID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Sent To IP Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Target IP Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_QUERY_INFO (7) + + Length + + Length of the remainder of the component. + + Router ID + + Router IP address. The same address advertised in a WCCP2_I_SEE_YOU + message. + + Receive ID + + Receive ID expected by the router. + + Sent To IP Address + + IP address to which the web-cache sent its last WCCP2_HERE_I_AM + message. 
This will not be the Router ID if the web-cache is + multicasting its WCCP2_HERE_I_AM messages. + + Target IP Address + + IP address of web-cache being queried. + + + + + + + + + + [Page 30] + +5.6.9 Capabilities Info Component + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Capability Element 0 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Capability Element n | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_CAPABILITY_INFO (8) + + Length + + Length of the remainder of the component. + + Capability Element + + Element in Type-Length-Value format (TLV) describing a router or + web-cache capability. + + + + + + + + + + + + + + + + + + + + + + + + + [Page 31] + +5.6.10 Alternate Assignment Component + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Assignment Type | Assignment Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Assignment Body | + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_ALT_ASSIGNMENT (13) + + Length + + Length of the remainder of the component. + + Assignment Type + + Currently defined values: + + WCCP2_HASH_ASSIGNMENT (0x00) + WCCP2_MASK_ASSIGNMENT (0x01) + + Assignment Length + + Length of Assignment Body + + Assignment Body + + The format of Assignment Body depends upon the value of Assignment Type. 
+ + Assignment Type = WCCP2_HASH_ASSIGNMENT + + In this case the body of the message is identical to the Assignment + Info Component with the Type and Length fields omitted. + + + + + + + + + + + [Page 32] + +Assignment Type = WCCP2_MASK_ASSIGNMENT + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Assignment Key | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number of Routers | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Router Assignment Element 0 | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Router Assignment Element n | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number of Mask/Value Set Elements (m) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Mask/Value Set Element 0 | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Mask/Value Set Element m | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Assignment Key + + The designated web-cache expects this element to be returned by a + router in subsequent WCCP2_I_SEE_YOU messages. + + Number of Routers + + Number of routers reachable by the designated web-cache. + + Router Assignment Element 0-n + + Element containing the router IP address, Receive ID and Change + Number for each router. 
+ + Number of Mask/Value Set Elements (m) + + Number of Mask/Value Set elements in this message + + + + + [Page 33] + +Mask/Value Set Element 0-m + + A list of the Mask/Value Element Sets for the Service Group + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + [Page 34] + +5.6.11 Assignment Map Component + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number of Mask/Value Set Elements (n) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Mask/Value Set Element 0 | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Mask/Value Set Element n | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_ASSIGN_MAP (14) + + Length + + Length of the remainder of the component. + + Number of Mask/Value Set Elements (n) + + Number of Mask/Value Set elements in the message + + Mask/Value Set Element 0-n + + A list of the Mask/Value Element Sets for the Service Group + + + + + + + + + + + + + + + + + + + + [Page 35] + +5.6.12 Command Extension Component + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Command Type | Command Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Command Data | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + WCCP2_COMMAND_EXTENSION (15) + + Length + + Length of the remainder of the component. 
+ + Command Type + + The command specifier. + + Command Length + + The length of the Command Data field of this command + + The defined Command Types are: + + Command Type: WCCP2_COMMAND_TYPE_SHUTDOWN (01) + Command Length: 4 + Command Data: Web-cache IP address + Description: This command is used by a web-cache to indicate to + the routers in a Service Group that it is shutting + down and should no longer receive any redirected traffic. + + + Command Type: WCCP2_COMMAND_TYPE_SHUTDOWN_RESPONSE (02) + Command Length: 4 + Command Data: Web-cache IP address. + Description: This command is used by a router to acknowledge + receipt of a SHUTDOWN command received from the web-cache + identified by the IP address in the Command Data field. + + + + + [Page 36] + +5.7 Information Elements + +5.7.1 Router ID Element + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Router ID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Receive ID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Router ID + + Router's identifying IP address. This must be a valid IP address by + which the router is reachable. + + Receive ID + + Defined per Service Group. Incremented each time the router sends a WCCP + protocol message including a Router Identity Info component. Will never be + zero. + +5.7.2 Web-Cache Identity Element + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | WC Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Hash Revision |U| Reserved | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Bucket Block 0 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + | . 
| + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Bucket Block 7 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Assignment Weight | Status | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + WC Address + + Web-Cache IP address + + + + + + [Page 37] + +Hash Revision + + 0x00 + + U + + If set indicates that the web cache does not have an assignment in the + Redirection Hash Table and that Bucket Block data is historical. + Historical data may be used by the designated web-cache to re-assign + the same bucket set to a web-cache that left and subsequently + rejoined a Service Group. + + Bucket Block 0-7 + + 256-bit vector. A set bit indicates the corresponding Redirection + Hash Table bucket is assigned to this web-cache. + + Assignment Weight + + Hash weight. May be used to indicate to the designated web-cache how new + assignments should be made. + + Status + + Hash status. May be used to indicate to the designated web-cache how new + assignments should be made. + +5.7.3 Assignment Key Element + + This element identifies a particular assignment. + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Key IP Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Key Change Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Key IP Address + + Designated web-cache IP address + + Key Change Number + + Incremented if a change has occurred. 
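One way the designated web-cache might use the Assignment Key Element is to compare the key it last sent with the key a router echoes back in the Router View Info component of a WCCP2_I_SEE_YOU message. The C sketch below illustrates that comparison only. The struct and function names are hypothetical, and both fields are assumed to have already been converted to host byte order.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory form of the Assignment Key Element. */
struct wccp2_assignment_key {
    uint32_t key_ip_address;    /* designated web-cache IP address */
    uint32_t key_change_number; /* incremented if a change has occurred */
};

/*
 * Return true if the key echoed by a router matches the key the
 * designated web-cache sent with its most recent assignment, which
 * suggests that the router has executed that assignment.
 */
bool wccp2_assignment_acknowledged(const struct wccp2_assignment_key *sent,
                                   const struct wccp2_assignment_key *echoed)
{
    return sent->key_ip_address == echoed->key_ip_address &&
           sent->key_change_number == echoed->key_change_number;
}
```

A mismatch in either field would indicate that the router is still reporting an older assignment or one issued by a different web-cache.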
+ + + + + [Page 38] + +5.7.4 Router Assignment Element + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Router ID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Receive ID | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Change Number | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Router ID + + Router's identifying IP address. It must be a valid address by which + the router is reachable. + + Receive ID + + Last Receive ID received from the router identified by Router + ID. A router will ignore an assignment if Receive ID is invalid. + + Change Number + + Last Member Change Number received from the router identified by + Router ID. A router will ignore an assignment if Change Number is + invalid. + +5.7.5 Capability Element + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Type | Length | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Value | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Type + + Currently defined types are: + + WCCP2_FORWARDING_METHOD 0x01 + WCCP2_ASSIGNMENT_METHOD 0x02 + WCCP2_PACKET_RETURN_METHOD 0x03 + + + + + + + [Page 39] + +Length + + Length of Capability element Value + + Value + + The length and format of the value field depend on the capability type. + + Type = WCCP2_FORWARDING_METHOD + + A 32-bit bitmask indicating supported/selected forwarding methods. + Currently defined values are: + + WCCP2_FORWARDING_METHOD_GRE 0x00000001 + WCCP2_FORWARDING_METHOD_L2 0x00000002 + + Type = WCCP2_ASSIGNMENT_METHOD + + A 32-bit bitmask indicating supported/selected assignment methods. 
+ Currently defined values are: + + WCCP2_ASSIGNMENT_METHOD_HASH 0x00000001 + WCCP2_ASSIGNMENT_METHOD_MASK 0x00000002 + + Type = WCCP2_PACKET_RETURN_METHOD + + A 32-bit bitmask indicating supported/selected packet return methods. + Currently defined values are: + + WCCP2_PACKET_RETURN_METHOD_GRE 0x00000001 + WCCP2_PACKET_RETURN_METHOD_L2 0x00000002 + +5.7.6 Mask/Value Set Element + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Mask Element | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Number of Value Elements (n) | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Value Element 0 | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | . | + | . | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Value Element n | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + + + [Page 40] + +Mask Element + + Mask element for this set. + + Number of Value Elements (n) + + The number of value elements in this set. + + Value Element 0-n + + The list of value elements for this set. + +5.7.7 Mask Element + + Note that in all of the mask fields of this element a zero means + "Don't care". + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Source Address Mask | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Destination Address Mask | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Source Port Mask | Destination Port Mask | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Source Address Mask + + The 32 bit mask to be applied to the source IP address of the packet. + + Destination Address Mask + + The 32 bit mask to be applied to the destination IP address of the packet. 
+ + Source Port Mask + + The 16 bit mask to be applied to the TCP/UDP source port field of the packet. + + Destination Port Mask + + The 16 bit mask to be applied to the TCP/UDP destination port field of the packet. + + + + + + + + + + [Page 41] + +5.7.8 Value Element + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Source Address Value | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Destination Address Value | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Source Port Value | Destination Port Value | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Web Cache IP Address | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + + Source Address Value + + The value to match against the source IP address of the packet after + masking. + + Destination Address Value + + The value to match against the destination IP address of the packet after + masking. + + Source Port Value + + The value to match against the TCP/UDP source port number of the + packet after masking. + + Destination Port Value + + The value to match against the TCP/UDP destination port number of the + packet after masking. + + Web-cache IP address + + The IP address of the web-cache to which packets matching this value + element should be sent. + + + + + + + + + + + + + + [Page 42] + +6. Security Considerations + + WCCP V2 provides a mechanism for message authentication. It is + described in section 4.7 of this document. The authentication + mechanism relies on a password known to all routers and web-caches in + a Service Group. The password is part of the Service Group + configuration and is used to compute message checksums which can be + verified by other members of the group. 
Should the password become + known to a host attempting to disrupt the operation of a Service Group + it would be possible for that host to spoof WCCP messages and appear + as either a router or web-cache in the Service Group. + + To pose as a router in a Service Group a host would advertise its + presence to the members of the group in I_SEE_YOU messages. If + accepted as part of the Service Group the host would receive the + configuration for the group in a HERE_I_AM message from the designated + web-cache. This situation would not pose any threat to the operation + of the Service Group because the host would not be performing any + packet redirection and all packets would flow normally. + + To pose as a web-cache within a Service Group a host would advertise + its presence in HERE_I_AM messages. Acceptance of the host as part of + the Service Group would be decided by the designated cache and may be + subject to additional security checks not specified by WCCP. Should + the host become part of the Service Group it would be assigned a + proportion of the traffic redirected by the routers in the Service + Group. Assuming that the host drops any redirected packets the net + effect to clients would be that some attempts to retrieve content via + the Service Group routers would fail. + + +7. References + + [1] Hanks, Li, Farinacci & Traina, "Generic Routing Encapsulation + (GRE)", RFC 1701, October 1994 + + +8. 
Authors' Addresses + + Martin Cieslak + Cisco Systems + 170 Tasman Drive + San Jose, CA 95143 + + David Forster + Cisco Systems + 170 Tasman Drive + San Jose, CA 95143 + + + + [Page 43] + +Gurumukh Tiwana + Cisco Systems + 170 Tasman Drive + San Jose, CA 95143 + + Rob Wilson + Cisco Systems + 170 Tasman Drive + San Jose, CA 95143 + + email: robewils@cisco.com + + Expires January 2001 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + [Page 44] + diff --git a/doc/rfc/rfc1738.txt b/doc/rfc/rfc1738.txt new file mode 100644 index 0000000000..3728866e17 --- /dev/null +++ b/doc/rfc/rfc1738.txt @@ -0,0 +1,1403 @@ + + + + + + +Network Working Group T. Berners-Lee +Request for Comments: 1738 CERN +Category: Standards Track L. Masinter + Xerox Corporation + M. McCahill + University of Minnesota + Editors + December 1994 + + + Uniform Resource Locators (URL) + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Abstract + + This document specifies a Uniform Resource Locator (URL), the syntax + and semantics of formalized information for location and access of + resources via the Internet. + +1. Introduction + + This document describes the syntax and semantics for a compact string + representation for a resource available via the Internet. These + strings are called "Uniform Resource Locators" (URLs). + + The specification is derived from concepts introduced by the World- + Wide Web global information initiative, whose use of such objects + dates from 1990 and is described in "Universal Resource Identifiers + in WWW", RFC 1630. 
The specification of URLs is designed to meet the + requirements laid out in "Functional Requirements for Internet + Resource Locators" [12]. + + This document was written by the URI working group of the Internet + Engineering Task Force. Comments may be addressed to the editors, or + to the URI-WG . Discussions of the group are archived + at + + + + + + + + +Berners-Lee, Masinter & McCahill [Page 1] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + +2. General URL Syntax + + Just as there are many different methods of access to resources, + there are several schemes for describing the location of such + resources. + + The generic syntax for URLs provides a framework for new schemes to + be established using protocols other than those defined in this + document. + + URLs are used to `locate' resources, by providing an abstract + identification of the resource location. Having located a resource, + a system may perform a variety of operations on the resource, as + might be characterized by such words as `access', `update', + `replace', `find attributes'. In general, only the `access' method + needs to be specified for any URL scheme. + +2.1. The main parts of URLs + + A full BNF description of the URL syntax is given in Section 5. + + In general, URLs are written as follows: + + : + + A URL contains the name of the scheme being used () followed + by a colon and then a string (the ) whose + interpretation depends on the scheme. + + Scheme names consist of a sequence of characters. The lower case + letters "a"--"z", digits, and the characters plus ("+"), period + ("."), and hyphen ("-") are allowed. For resiliency, programs + interpreting URLs should treat upper case letters as equivalent to + lower case in scheme names (e.g., allow "HTTP" as well as "http"). + +2.2. URL Character Encoding Issues + + URLs are sequences of characters, i.e., letters, digits, and special + characters. 
A URL may be represented in a variety of ways: e.g., ink + on paper, or a sequence of octets in a coded character set. The + interpretation of a URL depends only on the identity of the + characters used. + + In most URL schemes, the sequences of characters in different parts + of a URL are used to represent sequences of octets used in Internet + protocols. For example, in the ftp scheme, the host name, directory + name and file names are such sequences of octets, represented by + parts of the URL. Within those parts, an octet may be represented by + + + +Berners-Lee, Masinter & McCahill [Page 2] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + + the character which has that octet as its code within the US-ASCII + [20] coded character set. + + In addition, octets may be encoded by a character triplet consisting + of the character "%" followed by the two hexadecimal digits (from + "0123456789ABCDEF") which form the hexadecimal value of the octet. + (The characters "abcdef" may also be used in hexadecimal encodings.) + + Octets must be encoded if they have no corresponding graphic + character within the US-ASCII coded character set, if the use of the + corresponding character is unsafe, or if the corresponding character + is reserved for some other interpretation within the particular URL + scheme. + + No corresponding graphic US-ASCII: + + URLs are written only with the graphic printable characters of the + US-ASCII coded character set. The octets 80-FF hexadecimal are not + used in US-ASCII, and the octets 00-1F and 7F hexadecimal represent + control characters; these must be encoded. + + Unsafe: + + Characters can be unsafe for a number of reasons. The space + character is unsafe because significant spaces may disappear and + insignificant spaces may be introduced when URLs are transcribed or + typeset or subjected to the treatment of word-processing programs.
+ The characters "<" and ">" are unsafe because they are used as the + delimiters around URLs in free text; the quote mark (""") is used to + delimit URLs in some systems. The character "#" is unsafe and should + always be encoded because it is used in World Wide Web and in other + systems to delimit a URL from a fragment/anchor identifier that might + follow it. The character "%" is unsafe because it is used for + encodings of other characters. Other characters are unsafe because + gateways and other transport agents are known to sometimes modify + such characters. These characters are "{", "}", "|", "\", "^", "~", + "[", "]", and "`". + + All unsafe characters must always be encoded within a URL. For + example, the character "#" must be encoded within URLs even in + systems that do not normally deal with fragment or anchor + identifiers, so that if the URL is copied into another system that + does use them, it will not be necessary to change the URL encoding. + + + + + + + + +Berners-Lee, Masinter & McCahill [Page 3] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + + Reserved: + + Many URL schemes reserve certain characters for a special meaning: + their appearance in the scheme-specific part of the URL has a + designated semantics. If the character corresponding to an octet is + reserved in a scheme, the octet must be encoded. The characters ";", + "/", "?", ":", "@", "=" and "&" are the characters which may be + reserved for special meaning within a scheme. No other characters may + be reserved within a scheme. + + Usually a URL has the same interpretation when an octet is + represented by a character and when it is encoded. However, this is not + true for reserved characters: encoding a character reserved for a + particular scheme may change the semantics of a URL. + + Thus, only alphanumerics, the special characters "$-_.+!*'(),", and + reserved characters used for their reserved purposes may be used + unencoded within a URL.
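The encoding rules of Section 2.2 can be sketched in a few lines of Python. This is a minimal illustration, not part of the specification: the function names encode_octets and decode_octets and the parameter reserved_in_use are assumptions introduced here. The caller names the reserved characters that are being used for their reserved purpose, and every other octet outside the always-permitted set is written as a "%" triplet.

```python
import string

# Characters Section 2.2 allows unencoded: alphanumerics plus "$-_.+!*'(),".
SAFE_ALWAYS = set(string.ascii_letters + string.digits + "$-_.+!*'(),")
# Characters a scheme may reserve for a special meaning.
RESERVED = set(";/?:@=&")


def encode_octets(data: bytes, reserved_in_use: str = "") -> str:
    """Percent-encode octets (hypothetical helper, not defined by the RFC).

    `reserved_in_use` lists reserved characters that appear in `data`
    for their reserved purpose and therefore stay literal.
    """
    for ch in reserved_in_use:
        if ch not in RESERVED:
            raise ValueError("not a reserved character: %r" % ch)
    keep = SAFE_ALWAYS | set(reserved_in_use)
    out = []
    for octet in data:
        ch = chr(octet)
        # Octets 80-FF, control characters and unsafe characters all
        # fall through to the "%" triplet branch.
        out.append(ch if ch in keep else "%{:02X}".format(octet))
    return "".join(out)


def decode_octets(text: str) -> bytes:
    """Reverse the encoding; upper and lower case hex digits are accepted."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "%":
            digits = text[i + 1:i + 3]
            if len(digits) != 2 or any(c not in string.hexdigits for c in digits):
                raise ValueError("malformed escape at offset %d" % i)
            out.append(int(digits, 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)
```

Passing reserved_in_use="/" keeps a path such as "/etc/motd" readable, while the default encodes each "/" as "%2F"; the two results are different URLs, which is the point the text makes about reserved characters.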
+ + On the other hand, characters that are not required to be encoded + (including alphanumerics) may be encoded within the scheme-specific + part of a URL, as long as they are not being used for a reserved + purpose. + +2.3 Hierarchical schemes and relative links + + In some cases, URLs are used to locate resources that contain + pointers to other resources. In some cases, those pointers are + represented as relative links where the expression of the location of + the second resource is in terms of "in the same place as this one + except with the following relative path". Relative links are not + described in this document. However, the use of relative links + depends on the original URL containing a hierarchical structure + against which the relative link is based. + + Some URL schemes (such as the ftp, http, and file schemes) contain + names that can be considered hierarchical; the components of the + hierarchy are separated by "/". + + + + + + + + + + + + + +Berners-Lee, Masinter & McCahill [Page 4] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + +3. Specific Schemes + + The mapping for some existing standard and experimental protocols is + outlined in the BNF syntax definition. Notes on particular protocols + follow. The schemes covered are: + + ftp File Transfer protocol + http Hypertext Transfer Protocol + gopher The Gopher protocol + mailto Electronic mail address + news USENET news + nntp USENET news using NNTP access + telnet Reference to interactive sessions + wais Wide Area Information Servers + file Host-specific file names + prospero Prospero Directory Service + + Other schemes may be specified by future specifications. Section 4 of + this document describes how new schemes may be registered, and lists + some scheme names that are under development. + +3.1. 
Common Internet Scheme Syntax + + While the syntax for the rest of the URL may vary depending on the + particular scheme selected, URL schemes that involve the direct use + of an IP-based protocol to a specified host on the Internet use a + common syntax for the scheme-specific data: + + //<user>:<password>@<host>:<port>/<url-path> + + Some or all of the parts "<user>:<password>@", ":<password>", + ":<port>", and "/<url-path>" may be excluded. The scheme specific + data start with a double slash "//" to indicate that it complies with + the common Internet scheme syntax. The different components obey the + following rules: + + user + An optional user name. Some schemes (e.g., ftp) allow the + specification of a user name. + + password + An optional password. If present, it follows the user + name separated from it by a colon. + + The user name (and password), if present, are followed by a + commercial at-sign "@". Within the user and password field, any ":", + "@", or "/" must be encoded. + + + + +Berners-Lee, Masinter & McCahill [Page 5] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + + Note that an empty user name or password is different than no user + name or password; there is no way to specify a password without + specifying a user name. E.g., <URL:ftp://@host.com/> has an empty + user name and no password, <URL:ftp://host.com/> has no user name, + while <URL:ftp://foo:@host.com/> has a user name of "foo" and an + empty password. + + host + The fully qualified domain name of a network host, or its IP + address as a set of four decimal digit groups separated by + ".". Fully qualified domain names take the form as described + in Section 3.5 of RFC 1034 [13] and Section 2.1 of RFC 1123 + [5]: a sequence of domain labels separated by ".", each domain + label starting and ending with an alphanumerical character and + possibly also containing "-" characters. The rightmost domain + label will never start with a digit, though, which + syntactically distinguishes all domain names from the IP + addresses. + + port + The port number to connect to.
Most schemes designate + protocols that have a default port number. Another port number + may optionally be supplied, in decimal, separated from the + host by a colon. If the port is omitted, the colon is as well. + + url-path + The rest of the locator consists of data specific to the + scheme, and is known as the "url-path". It supplies the + details of how the specified resource can be accessed. Note + that the "/" between the host (or port) and the url-path is + NOT part of the url-path. + + The url-path syntax depends on the scheme being used, as does the + manner in which it is interpreted. + +3.2. FTP + + The FTP URL scheme is used to designate files and directories on + Internet hosts accessible using the FTP protocol (RFC959). + + An FTP URL follows the syntax described in Section 3.1. If :<port> is + omitted, the port defaults to 21. + + + + + + + + + +Berners-Lee, Masinter & McCahill [Page 6] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + +3.2.1. FTP Name and Password + + A user name and password may be supplied; they are used in the ftp + "USER" and "PASS" commands after first making the connection to the + FTP server. If no user name or password is supplied and one is + requested by the FTP server, the conventions for "anonymous" FTP are + to be used, as follows: + + The user name "anonymous" is supplied. + + The password is supplied as the Internet e-mail address + of the end user accessing the resource. + + If the URL supplies a user name but no password, and the remote + server requests a password, the program interpreting the FTP URL + should request one from the user. + +3.2.2. FTP url-path + + The url-path of an FTP URL has the following syntax: + + <cwd1>/<cwd2>/.../<cwdN>/<name>;type=<typecode> + + Where <cwd1> through <cwdN> and <name> are (possibly encoded) strings + and <typecode> is one of the characters "a", "i", or "d". The part + ";type=<typecode>" may be omitted. The <cwdx> and <name> parts may be + empty.
The whole url-path may be omitted, including the "/" + delimiting it from the prefix containing user, password, host, and + port. + + The url-path is interpreted as a series of FTP commands as follows: + + Each of the elements is to be supplied, sequentially, as the + argument to a CWD (change working directory) command. + + If the typecode is "d", perform a NLST (name list) command with + as the argument, and interpret the results as a file + directory listing. + + Otherwise, perform a TYPE command with as the argument, + and then access the file whose name is (for example, using + the RETR command.) + + Within a name or CWD component, the characters "/" and ";" are + reserved and must be encoded. The components are decoded prior to + their use in the FTP protocol. In particular, if the appropriate FTP + sequence to access a particular file requires supplying a string + containing a "/" as an argument to a CWD or RETR command, it is + + + +Berners-Lee, Masinter & McCahill [Page 7] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + + necessary to encode each "/". + + For example, the URL is + interpreted by FTP-ing to "host.dom", logging in as "myname" + (prompting for a password if it is asked for), and then executing + "CWD /etc" and then "RETR motd". This has a different meaning from + which would "CWD etc" and then + "RETR motd"; the initial "CWD" might be executed relative to the + default directory for "myname". On the other hand, + , would "CWD " with a null + argument, then "CWD etc", and then "RETR motd". + + FTP URLs may also be used for other operations; for example, it is + possible to update a file on a remote file server, or infer + information about it from the directory listings. The mechanism for + doing so is not spelled out here. + +3.2.3. FTP Typecode is Optional + + The entire ;type= part of a FTP URL is optional. If it is + omitted, the client program interpreting the URL must guess the + appropriate mode to use. 
In general, the data content type of a file + can only be guessed from the name, e.g., from the suffix of the name; + the appropriate type code to be used for transfer of the file can + then be deduced from the data content of the file. + +3.2.4 Hierarchy + + For some file systems, the "/" used to denote the hierarchical + structure of the URL corresponds to the delimiter used to construct a + file name hierarchy, and thus, the filename will look similar to the + URL path. This does NOT mean that the URL is a Unix filename. + +3.2.5. Optimization + + Clients accessing resources via FTP may employ additional heuristics + to optimize the interaction. For some FTP servers, for example, it + may be reasonable to keep the control connection open while accessing + multiple URLs from the same server. However, there is no common + hierarchical model to the FTP protocol, so if a directory change + command has been given, it is impossible in general to deduce what + sequence should be given to navigate to another directory for a + second retrieval, if the paths are different. The only reliable + algorithm is to disconnect and reestablish the control connection. + + + + + + + +Berners-Lee, Masinter & McCahill [Page 8] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + +3.3. HTTP + + The HTTP URL scheme is used to designate Internet resources + accessible using HTTP (HyperText Transfer Protocol). + + The HTTP protocol is specified elsewhere. This specification only + describes the syntax of HTTP URLs. + + An HTTP URL takes the form: + + http://:/? + + where and are as described in Section 3.1. If : + is omitted, the port defaults to 80. No user name or password is + allowed. is an HTTP selector, and is a query + string. The is optional, as is the and its + preceding "?". If neither nor is present, the "/" + may also be omitted. + + Within the and components, "/", ";", "?" are + reserved. 
The "/" character may be used within HTTP to designate a + hierarchical structure. + +3.4. GOPHER + + The Gopher URL scheme is used to designate Internet resources + accessible using the Gopher protocol. + + The base Gopher protocol is described in RFC 1436 and supports items + and collections of items (directories). The Gopher+ protocol is a set + of upward compatible extensions to the base Gopher protocol and is + described in [2]. Gopher+ supports associating arbitrary sets of + attributes and alternate data representations with Gopher items. + Gopher URLs accommodate both Gopher and Gopher+ items and item + attributes. + +3.4.1. Gopher URL syntax + + A Gopher URL takes the form: + + gopher://:/ + + where is one of + + + %09 + %09%09 + + + + +Berners-Lee, Masinter & McCahill [Page 9] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + + If : is omitted, the port defaults to 70. is a + single-character field to denote the Gopher type of the resource to + which the URL refers. The entire may also be empty, in + which case the delimiting "/" is also optional and the + defaults to "1". + + is the Gopher selector string. In the Gopher protocol, + Gopher selector strings are a sequence of octets which may contain + any octets except 09 hexadecimal (US-ASCII HT or tab) 0A hexadecimal + (US-ASCII character LF), and 0D (US-ASCII character CR). + + Gopher clients specify which item to retrieve by sending the Gopher + selector string to a Gopher server. + + Within the , no characters are reserved. + + Note that some Gopher strings begin with a copy of the + character, in which case that character will occur twice + consecutively. The Gopher selector string may be an empty string; + this is how Gopher clients refer to the top-level directory on a + Gopher server. + +3.4.2 Specifying URLs for Gopher Search Engines + + If the URL refers to a search to be submitted to a Gopher search + engine, the selector is followed by an encoded tab (%09) and the + search string. 
To submit a search to a Gopher search engine, the + Gopher client sends the string (after decoding), a tab, + and the search string to the Gopher server. + +3.4.3 URL syntax for Gopher+ items + + URLs for Gopher+ items have a second encoded tab (%09) and a Gopher+ + string. Note that in this case, the %09 string must be + supplied, although the element may be the empty string. + + The is used to represent information required for + retrieval of the Gopher+ item. Gopher+ items may have alternate + views, arbitrary sets of attributes, and may have electronic forms + associated with them. + + To retrieve the data associated with a Gopher+ URL, a client will + connect to the server and send the Gopher selector, followed by a tab + and the search string (which may be empty), followed by a tab and the + Gopher+ commands. + + + + + + +Berners-Lee, Masinter & McCahill [Page 10] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + +3.4.4 Default Gopher+ data representation + + When a Gopher server returns a directory listing to a client, the + Gopher+ items are tagged with either a "+" (denoting Gopher+ items) + or a "?" (denoting Gopher+ items which have a +ASK form associated + with them). A Gopher URL with a Gopher+ string consisting of only a + "+" refers to the default view (data representation) of the item + while a Gopher+ string containing only a "?" refer to an item with a + Gopher electronic form associated with it. + +3.4.5 Gopher+ items with electronic forms + + Gopher+ items which have a +ASK associated with them (i.e. Gopher+ + items tagged with a "?") require the client to fetch the item's +ASK + attribute to get the form definition, and then ask the user to fill + out the form and return the user's responses along with the selector + string to retrieve the item. Gopher+ clients know how to do this but + depend on the "?" tag in the Gopher+ item description to know when to + handle this case. The "?" 
is used in the Gopher+ string to be + consistent with Gopher+ protocol's use of this symbol. + +3.4.6 Gopher+ item attribute collections + + To refer to the Gopher+ attributes of an item, the Gopher URL's + Gopher+ string consists of "!" or "$". "!" refers to the all of a + Gopher+ item's attributes. "$" refers to all the item attributes for + all items in a Gopher directory. + +3.4.7 Referring to specific Gopher+ attributes + + To refer to specific attributes, the URL's gopher+_string is + "!" or "$". For example, to refer to + the attribute containing the abstract of an item, the gopher+_string + would be "!+ABSTRACT". + + To refer to several attributes, the gopher+_string consists of the + attribute names separated by coded spaces. For example, + "!+ABSTRACT%20+SMELL" refers to the +ABSTRACT and +SMELL attributes + of an item. + +3.4.8 URL syntax for Gopher+ alternate views + + Gopher+ allows for optional alternate data representations (alternate + views) of items. To retrieve a Gopher+ alternate view, a Gopher+ + client sends the appropriate view and language identifier (found in + the item's +VIEW attribute). To refer to a specific Gopher+ alternate + view, the URL's Gopher+ string would be in the form: + + + + +Berners-Lee, Masinter & McCahill [Page 11] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + + +%20 + + For example, a Gopher+ string of "+application/postscript%20Es_ES" + refers to the Spanish language postscript alternate view of a Gopher+ + item. + +3.4.9 URL syntax for Gopher+ electronic forms + + The gopher+_string for a URL that refers to an item referenced by a + Gopher+ electronic form (an ASK block) filled out with specific + values is a coded version of what the client sends to the server. + The gopher+_string is of the form: + ++%091%0D%0A+-1%0D%0A%0D%0A%0D%0A.%0D%0A + + To retrieve this item, the Gopher client sends: + + +1 + +-1 + + + . + + to the Gopher server. + +3.5. 
MAILTO + + The mailto URL scheme is used to designate the Internet mailing + address of an individual or service. No additional information other + than an Internet mailing address is present or implied. + + A mailto URL takes the form: + + mailto:<rfc822-addr-spec> + + where <rfc822-addr-spec> is (the encoding of an) addr-spec, as + specified in RFC 822 [6]. Within mailto URLs, there are no reserved + characters. + + Note that the percent sign ("%") is commonly used within RFC 822 + addresses and must be encoded. + + Unlike many URLs, the mailto scheme does not represent a data object + to be accessed directly; there is no sense in which it designates an + object. It has a different use than the message/external-body type in + MIME. + + + + + +Berners-Lee, Masinter & McCahill [Page 12] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + +3.6. NEWS + + The news URL scheme is used to refer to either news groups or + individual articles of USENET news, as specified in RFC 1036. + + A news URL takes one of two forms: + + news:<newsgroup-name> + news:<message-id> + + A <newsgroup-name> is a period-delimited hierarchical name, such as + "comp.infosystems.www.misc". A <message-id> corresponds to the + Message-ID of section 2.1.5 of RFC 1036, without the enclosing "<" + and ">"; it takes the form <unique>@<full_domain_name>. A message + identifier may be distinguished from a news group name by the + presence of the commercial at "@" character. No additional characters + are reserved within the components of a news URL. + + If <newsgroup-name> is "*" (as in <URL:news:*>), it is used to refer + to "all available news groups". + + The news URLs are unusual in that by themselves, they do not contain + sufficient information to locate a single resource, but, rather, are + location-independent. + +3.7. NNTP + + The nntp URL scheme is an alternative method of referencing news + articles, useful for specifying news articles from NNTP servers (RFC + 977). + + An nntp URL takes the form: + + nntp://<host>:<port>/<newsgroup-name>/<article-number> + + where <host> and <port> are as described in Section 3.1. If :<port> + is omitted, the port defaults to 119.
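As a rough illustration of the nntp form and its default port, the Python sketch below splits such a URL into its parts. It assumes the path holds exactly two segments, a newsgroup name followed by a numeric article id, as in the published text of RFC 1738. The function name split_nntp_url and the host names in the usage note are hypothetical.

```python
NNTP_DEFAULT_PORT = 119  # default stated in Section 3.7


def split_nntp_url(url: str):
    """Return (host, port, newsgroup, article_number) for an nntp URL.

    Hypothetical helper for illustration; it performs no decoding of
    "%" triplets and no validation of the host name itself.
    """
    prefix = "nntp://"
    # Scheme names are compared without regard to case (Section 2.1).
    if url[:len(prefix)].lower() != prefix:
        raise ValueError("not an nntp URL")
    hostport, _, rest = url[len(prefix):].partition("/")
    host, colon, port_text = hostport.partition(":")
    if not host:
        raise ValueError("missing host")
    if colon:
        # If the port is omitted, the colon must be omitted as well.
        if not (port_text.isascii() and port_text.isdigit()):
            raise ValueError("port must be decimal digits")
        port = int(port_text)
    else:
        port = NNTP_DEFAULT_PORT
    newsgroup, _, article = rest.partition("/")
    if not newsgroup or not (article.isascii() and article.isdigit()):
        raise ValueError("expected <newsgroup-name>/<article-number>")
    return host, port, newsgroup, int(article)
```

For example, split_nntp_url("nntp://news.example.com/comp.infosystems.www.misc/1234") yields port 119 because no port was given, while "nntp://news.example.com:8119/alt.test/7" yields port 8119.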
+ + The is the name of the group, while the is the numeric id of the article within that newsgroup. + + Note that while nntp: URLs specify a unique location for the article + resource, most NNTP servers currently on the Internet today are + configured only to allow access from local clients, and thus nntp + URLs do not designate globally accessible resources. Thus, the news: + form of URL is preferred as a way of identifying news articles. + + + + + +Berners-Lee, Masinter & McCahill [Page 13] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + +3.8. TELNET + + The Telnet URL scheme is used to designate interactive services that + may be accessed by the Telnet protocol. + + A telnet URL takes the form: + + telnet://:@:/ + + as specified in Section 3.1. The final "/" character may be omitted. + If : is omitted, the port defaults to 23. The : can + be omitted, as well as the whole : part. + + This URL does not designate a data object, but rather an interactive + service. Remote interactive services vary widely in the means by + which they allow remote logins; in practice, the and + supplied are advisory only: clients accessing a telnet URL + merely advise the user of the suggested username and password. + +3.9. WAIS + + The WAIS URL scheme is used to designate WAIS databases, searches, or + individual documents available from a WAIS database. WAIS is + described in [7]. The WAIS protocol is described in RFC 1625 [17]; + Although the WAIS protocol is based on Z39.50-1988, the WAIS URL + scheme is not intended for use with arbitrary Z39.50 services. + + A WAIS URL takes one of the following forms: + + wais://:/ + wais://:/? + wais://:/// + + where and are as described in Section 3.1. If : + is omitted, the port defaults to 210. The first form designates a + WAIS database that is available for searching. The second form + designates a particular search. is the name of the WAIS + database being queried. 
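The three WAIS forms listed above can be told apart by shape alone, which the following Python sketch illustrates. It is an assumption-laden example rather than a definition: the function name classify_wais_url, the tuple layout, and the host names are invented here, and the component names database, search, wtype and wpath follow the published text of RFC 1738.

```python
WAIS_DEFAULT_PORT = 210  # default stated in Section 3.9


def classify_wais_url(url: str):
    """Classify a WAIS URL as a database, search, or document reference.

    Hypothetical helper; components are returned still encoded.
    """
    prefix = "wais://"
    if url[:len(prefix)].lower() != prefix:
        raise ValueError("not a wais URL")
    hostport, _, rest = url[len(prefix):].partition("/")
    host, colon, port_text = hostport.partition(":")
    if not host:
        raise ValueError("missing host")
    if colon and not (port_text.isascii() and port_text.isdigit()):
        raise ValueError("port must be decimal digits")
    port = int(port_text) if colon else WAIS_DEFAULT_PORT

    if "?" in rest:
        # Second form: a particular search against a database.
        database, _, search = rest.partition("?")
        if not database:
            raise ValueError("missing database")
        return ("search", host, port, database, search)

    parts = rest.split("/", 2)
    if len(parts) == 1 and parts[0]:
        # First form: a database available for searching.
        return ("database", host, port, parts[0])
    if len(parts) == 3 and all(parts):
        # Third form: a document, given as database, type and path.
        return ("document", host, port, parts[0], parts[1], parts[2])
    raise ValueError("unrecognized WAIS URL form")
```

The document path is returned as an opaque string on purpose, since only the server that issued a WAIS document-id may decompose it.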
+ + The third form designates a particular document within a WAIS + database to be retrieved. In this form is the WAIS + designation of the type of the object. Many WAIS implementations + require that a client know the "type" of an object prior to + retrieval, the type being returned along with the internal object + identifier in the search response. The is included in the + URL in order to allow the client interpreting the URL adequate + information to actually retrieve the document. + + + + +Berners-Lee, Masinter & McCahill [Page 14] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + + The of a WAIS URL consists of the WAIS document-id, encoded + as necessary using the method described in Section 2.2. The WAIS + document-id should be treated opaquely; it may only be decomposed by + the server that issued it. + +3.10 FILES + + The file URL scheme is used to designate files accessible on a + particular host computer. This scheme, unlike most other URL schemes, + does not designate a resource that is universally accessible over the + Internet. + + A file URL takes the form: + + file:/// + + where is the fully qualified domain name of the system on + which the is accessible, and is a hierarchical + directory path of the form //.../. + + For example, a VMS file + + DISK$USER:[MY.NOTES]NOTE123456.TXT + + might become + + + + As a special case, can be the string "localhost" or the empty + string; this is interpreted as `the machine from which the URL is + being interpreted'. + + The file URL scheme is unusual in that it does not specify an + Internet protocol or access method for such files; as such, its + utility in network protocols between hosts is limited. + +3.11 PROSPERO + + The Prospero URL scheme is used to designate resources that are + accessed via the Prospero Directory Service. The Prospero protocol is + described elsewhere [14]. + + A prospero URLs takes the form: + + prospero://:/;= + + where and are as described in Section 3.1. 
 If :<port> + is omitted, the port defaults to 1525. No username or password is + + + +Berners-Lee, Masinter & McCahill [Page 15] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + + allowed. + + The <hsoname> is the host-specific object name in the Prospero + protocol, suitably encoded. This name is opaque and interpreted by + the Prospero server. The semicolon ";" is reserved and may not + appear without quoting in the <hsoname>. + + Prospero URLs are interpreted by contacting a Prospero directory + server on the specified host and port to determine appropriate access + methods for a resource, which might themselves be represented as + different URLs. External Prospero links are represented as URLs of + the underlying access method and are not represented as Prospero + URLs. + + Note that a slash "/" may appear in the <hsoname> without quoting and + no significance may be assumed by the application. Though slashes + may indicate hierarchical structure on the server, such structure is + not guaranteed. Note that many <hsoname>s begin with a slash, in + which case the host or port will be followed by a double slash: the + slash from the URL syntax, followed by the initial slash from the + <hsoname>. (E.g., <URL:prospero://host.dom//pros/name> designates a + <hsoname> of "/pros/name".) + + In addition, after the <hsoname>, optional fields and values + associated with a Prospero link may be specified as part of the URL. + When present, each field/value pair is separated from each other and + from the rest of the URL by a ";" (semicolon). The name of the field + and its value are separated by a "=" (equal sign). If present, these + fields serve to identify the target of the URL. For example, the + OBJECT-VERSION field can be specified to identify a specific version + of an object. + +4. REGISTRATION OF NEW SCHEMES + + A new scheme may be introduced by defining a mapping onto a + conforming URL syntax, using a new prefix. URLs for experimental + schemes may be used by mutual agreement between parties.
Scheme names + starting with the characters "x-" are reserved for experimental + purposes. + + The Internet Assigned Numbers Authority (IANA) will maintain a + registry of URL schemes. Any submission of a new URL scheme must + include a definition of an algorithm for accessing of resources + within that scheme and the syntax for representing such a scheme. + + URL schemes must have demonstrable utility and operability. One way + to provide such a demonstration is via a gateway which provides + objects in the new scheme for clients using an existing protocol. If + + + +Berners-Lee, Masinter & McCahill [Page 16] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + + the new scheme does not locate resources that are data objects, the + properties of names in the new space must be clearly defined. + + New schemes should try to follow the same syntactic conventions of + existing schemes, where appropriate. It is likewise recommended + that, where a protocol allows for retrieval by URL, that the client + software have provision for being configured to use specific gateway + locators for indirect access through new naming schemes. + + The following scheme have been proposed at various times, but this + document does not define their syntax or use at this time. It is + suggested that IANA reserve their scheme names for future definition: + + afs Andrew File System global file names. + mid Message identifiers for electronic mail. + cid Content identifiers for MIME body parts. + nfs Network File System (NFS) file names. + tn3270 Interactive 3270 emulation sessions. + mailserver Access to data available from mail servers. + z39.50 Access to ANSI Z39.50 services. + +5. BNF for specific URL schemes + + This is a BNF-like description of the Uniform Resource Locator + syntax, using the conventions of RFC822, except that "|" is used to + designate alternatives, and brackets [] are used around optional or + repeated elements. 
 Briefly, literals are quoted with "", optional + elements are enclosed in [brackets], and elements may be preceded + with <n>* to designate n or more repetitions of the following + element; n defaults to 0. + +; The generic form of a URL is: + +genericurl = scheme ":" schemepart + +; Specific predefined schemes are defined here; new schemes +; may be registered with IANA + +url = httpurl | ftpurl | newsurl | + nntpurl | telneturl | gopherurl | + waisurl | mailtourl | fileurl | + prosperourl | otherurl + +; new schemes follow the general syntax +otherurl = genericurl + +; the scheme is in lower case; interpreters should use case-ignore +scheme = 1*[ lowalpha | digit | "+" | "-" | "." ] + + + +Berners-Lee, Masinter & McCahill [Page 17] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + +schemepart = *xchar | ip-schemepart + + +; URL schemeparts for ip based protocols: + +ip-schemepart = "//" login [ "/" urlpath ] + +login = [ user [ ":" password ] "@" ] hostport +hostport = host [ ":" port ] +host = hostname | hostnumber +hostname = *[ domainlabel "." ] toplabel +domainlabel = alphadigit | alphadigit *[ alphadigit | "-" ] alphadigit +toplabel = alpha | alpha *[ alphadigit | "-" ] alphadigit +alphadigit = alpha | digit +hostnumber = digits "." digits "." digits "." digits +port = digits +user = *[ uchar | ";" | "?" | "&" | "=" ] +password = *[ uchar | ";" | "?" | "&" | "=" ] +urlpath = *xchar ; depends on protocol see section 3.1 + +; The predefined schemes: + +; FTP (see also RFC959) + +ftpurl = "ftp://" login [ "/" fpath [ ";type=" ftptype ]] +fpath = fsegment *[ "/" fsegment ] +fsegment = *[ uchar | "?" | ":" | "@" | "&" | "=" ] +ftptype = "A" | "I" | "D" | "a" | "i" | "d" + +; FILE + +fileurl = "file://" [ host | "localhost" ] "/" fpath + +; HTTP + +httpurl = "http://" hostport [ "/" hpath [ "?"
search ]] +hpath = hsegment *[ "/" hsegment ] +hsegment = *[ uchar | ";" | ":" | "@" | "&" | "=" ] +search = *[ uchar | ";" | ":" | "@" | "&" | "=" ] + +; GOPHER (see also RFC1436) + +gopherurl = "gopher://" hostport [ / [ gtype [ selector + [ "%09" search [ "%09" gopher+_string ] ] ] ] ] +gtype = xchar +selector = *xchar +gopher+_string = *xchar + + + + +Berners-Lee, Masinter & McCahill [Page 18] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + +; MAILTO (see also RFC822) + +mailtourl = "mailto:" encoded822addr +encoded822addr = 1*xchar ; further defined in RFC822 + +; NEWS (see also RFC1036) + +newsurl = "news:" grouppart +grouppart = "*" | group | article +group = alpha *[ alpha | digit | "-" | "." | "+" | "_" ] +article = 1*[ uchar | ";" | "/" | "?" | ":" | "&" | "=" ] "@" host + +; NNTP (see also RFC977) + +nntpurl = "nntp://" hostport "/" group [ "/" digits ] + +; TELNET + +telneturl = "telnet://" login [ "/" ] + +; WAIS (see also RFC1625) + +waisurl = waisdatabase | waisindex | waisdoc +waisdatabase = "wais://" hostport "/" database +waisindex = "wais://" hostport "/" database "?" search +waisdoc = "wais://" hostport "/" database "/" wtype "/" wpath +database = *uchar +wtype = *uchar +wpath = *uchar + +; PROSPERO + +prosperourl = "prospero://" hostport "/" ppath *[ fieldspec ] +ppath = psegment *[ "/" psegment ] +psegment = *[ uchar | "?" | ":" | "@" | "&" | "=" ] +fieldspec = ";" fieldname "=" fieldvalue +fieldname = *[ uchar | "?" | ":" | "@" | "&" ] +fieldvalue = *[ uchar | "?" 
| ":" | "@" | "&" ] + +; Miscellaneous definitions + +lowalpha = "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" | + "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" | + "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" | + "y" | "z" +hialpha = "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" | "I" | + "J" | "K" | "L" | "M" | "N" | "O" | "P" | "Q" | "R" | + "S" | "T" | "U" | "V" | "W" | "X" | "Y" | "Z" + + + +Berners-Lee, Masinter & McCahill [Page 19] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + +alpha = lowalpha | hialpha +digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | + "8" | "9" +safe = "$" | "-" | "_" | "." | "+" +extra = "!" | "*" | "'" | "(" | ")" | "," +national = "{" | "}" | "|" | "\" | "^" | "~" | "[" | "]" | "`" +punctuation = "<" | ">" | "#" | "%" | <"> + + +reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" +hex = digit | "A" | "B" | "C" | "D" | "E" | "F" | + "a" | "b" | "c" | "d" | "e" | "f" +escape = "%" hex hex + +unreserved = alpha | digit | safe | extra +uchar = unreserved | escape +xchar = unreserved | reserved | escape +digits = 1*digit + +6. Security Considerations + + The URL scheme does not in itself pose a security threat. Users + should beware that there is no general guarantee that a URL which at + one time points to a given object continues to do so, and does not + even at some later time point to a different object due to the + movement of objects on servers. + + A URL-related security threat is that it is sometimes possible to + construct a URL such that an attempt to perform a harmless idempotent + operation such as the retrieval of the object will in fact cause a + possibly damaging remote operation to occur. The unsafe URL is + typically constructed by specifying a port number other than that + reserved for the network protocol in question. The client + unwittingly contacts a server which is in fact running a different + protocol. 
The content of the URL contains instructions which when + interpreted according to this other protocol cause an unexpected + operation. An example has been the use of gopher URLs to cause a rude + message to be sent via a SMTP server. Caution should be used when + using any URL which specifies a port number other than the default + for the protocol, especially when it is a number within the reserved + space. + + Care should be taken when URLs contain embedded encoded delimiters + for a given protocol (for example, CR and LF characters for telnet + protocols) that these are not unencoded before transmission. This + would violate the protocol but could be used to simulate an extra + operation or parameter, again causing an unexpected and possible + harmful remote operation to be performed. + + + +Berners-Lee, Masinter & McCahill [Page 20] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + + The use of URLs containing passwords that should be secret is clearly + unwise. + +7. Acknowledgements + + This paper builds on the basic WWW design (RFC 1630) and much + discussion of these issues by many people on the network. The + discussion was particularly stimulated by articles by Clifford Lynch, + Brewster Kahle [10] and Wengyik Yeong [18]. Contributions from John + Curran, Clifford Neuman, Ed Vielmetti and later the IETF URL BOF and + URI working group were incorporated. + + Most recently, careful readings and comments by Dan Connolly, Ned + Freed, Roy Fielding, Guido van Rossum, Michael Dolan, Bert Bos, John + Kunze, Olle Jarnefors, Peter Svanberg and many others have helped + refine this RFC. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Berners-Lee, Masinter & McCahill [Page 21] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + +APPENDIX: Recommendations for URLs in Context + + URIs, including URLs, are intended to be transmitted through + protocols which provide a context for their interpretation. 
+ + In some cases, it will be necessary to distinguish URLs from other + possible data structures in a syntactic structure. In this case, it is + recommended that URLs be preceded with a prefix consisting of the + characters "URL:". For example, this prefix may be used to + distinguish URLs from other kinds of URIs. + + In addition, there are many occasions when URLs are included in other + kinds of text; examples include electronic mail, USENET news + messages, or printed on paper. In such cases, it is convenient to + have a separate syntactic wrapper that delimits the URL and separates + it from the rest of the text, and in particular from punctuation + marks that might be mistaken for part of the URL. For this purpose, + it is recommended that angle brackets ("<" and ">"), along with the + prefix "URL:", be used to delimit the boundaries of the URL. This + wrapper does not form part of the URL and should not be used in + contexts in which delimiters are already specified. + + In the case where a fragment/anchor identifier is associated with a + URL (following a "#"), the identifier would be placed within the + brackets as well. + + In some cases, extra whitespace (spaces, linebreaks, tabs, etc.) may + need to be added to break long URLs across lines. The whitespace + should be ignored when extracting the URL. + + No whitespace should be introduced after a hyphen ("-") character. + Because some typesetters and printers may (erroneously) introduce a + hyphen at the end of line when breaking a line, the interpreter of a + URL containing a line break immediately after a hyphen should ignore + all unencoded whitespace around the line break, and should be aware + that the hyphen may or may not actually be part of the URL. + + Examples: + + Yes, Jim, I found it under <URL:ftp://info.cern.ch/pub/www/doc;type=d> but you can probably pick it up from <URL:ftp://ds.internic.net/rfc>. Note the warning in <URL:http://ds.internic.net/instructions/overview.html#WARNING>.
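
The wrapper convention recommended above lends itself to mechanical extraction. The following Python sketch is an illustration only, not part of RFC 1738; the function name extract_wrapped_urls and the example.org addresses are hypothetical. It collects every URL delimited by angle brackets with the "URL:" prefix and discards whitespace inside the brackets. It does not try to decide whether a hyphen at a line break belongs to the URL, since the text above notes that this cannot be known in general.

```python
import re

# Matches "<URL:" ... ">" and captures everything between the prefix and ">".
# The negated class also matches newlines, so wrapped URLs that were broken
# across lines are still found.
_WRAPPED_URL = re.compile(r"<URL:([^<>]*)>")


def extract_wrapped_urls(text):
    """Return the URLs found in <URL:...> wrappers, with whitespace removed.

    Hypothetical helper for illustration; a fragment identifier placed
    inside the brackets is returned as part of the URL.
    """
    return [re.sub(r"\s+", "", body) for body in _WRAPPED_URL.findall(text)]
```

Applied to a message containing "<URL:ftp://example.org/pub>" and a second wrapped URL split over two lines, the function returns both URLs as single unbroken strings.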
+ + + + + + + + +Berners-Lee, Masinter & McCahill [Page 22] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + +References + + [1] Anklesaria, F., McCahill, M., Lindner, P., Johnson, D., + Torrey, D., and B. Alberti, "The Internet Gopher Protocol + (a distributed document search and retrieval protocol)", + RFC 1436, University of Minnesota, March 1993. + + + [2] Anklesaria, F., Lindner, P., McCahill, M., Torrey, D., + Johnson, D., and B. Alberti, "Gopher+: Upward compatible + enhancements to the Internet Gopher protocol", + University of Minnesota, July 1993. + + + [3] Berners-Lee, T., "Universal Resource Identifiers in WWW: A + Unifying Syntax for the Expression of Names and Addresses of + Objects on the Network as used in the World-Wide Web", RFC + 1630, CERN, June 1994. + + + [4] Berners-Lee, T., "Hypertext Transfer Protocol (HTTP)", + CERN, November 1993. + + + [5] Braden, R., Editor, "Requirements for Internet Hosts -- + Application and Support", STD 3, RFC 1123, IETF, October 1989. + + + [6] Crocker, D. "Standard for the Format of ARPA Internet Text + Messages", STD 11, RFC 822, UDEL, April 1982. + + + [7] Davis, F., Kahle, B., Morris, H., Salem, J., Shen, T., Wang, R., + Sui, J., and M. Grinbaum, "WAIS Interface Protocol Prototype + Functional Specification", (v1.5), Thinking Machines + Corporation, April 1990. + + + [8] Horton, M. and R. Adams, "Standard For Interchange of USENET + Messages", RFC 1036, AT&T Bell Laboratories, Center for Seismic + Studies, December 1987. + + + [9] Huitema, C., "Naming: Strategies and Techniques", Computer + Networks and ISDN Systems 23 (1991) 107-110. + + + + + +Berners-Lee, Masinter & McCahill [Page 23] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + + [10] Kahle, B., "Document Identifiers, or International Standard + Book Numbers for the Electronic Age", 1991. + + + [11] Kantor, B. and P. 
Lapsley, "Network News Transfer Protocol: + A Proposed Standard for the Stream-Based Transmission of News", + RFC 977, UC San Diego & UC Berkeley, February 1986. + + + [12] Kunze, J., "Functional Requirements for Internet Resource + Locators", Work in Progress, December 1994. + + + [13] Mockapetris, P., "Domain Names - Concepts and Facilities", + STD 13, RFC 1034, USC/Information Sciences Institute, + November 1987. + + + [14] Neuman, B., and S. Augart, "The Prospero Protocol", + USC/Information Sciences Institute, June 1993. + + + [15] Postel, J. and J. Reynolds, "File Transfer Protocol (FTP)", + STD 9, RFC 959, USC/Information Sciences Institute, + October 1985. + + + [16] Sollins, K. and L. Masinter, "Functional Requirements for + Uniform Resource Names", RFC 1737, MIT/LCS, Xerox Corporation, + December 1994. + + + [17] St. Pierre, M, Fullton, J., Gamiel, K., Goldman, J., Kahle, B., + Kunze, J., Morris, H., and F. Schiettecatte, "WAIS over + Z39.50-1988", RFC 1625, WAIS, Inc., CNIDR, Thinking Machines + Corp., UC Berkeley, FS Consulting, June 1994. + + + [18] Yeong, W. "Towards Networked Information Retrieval", Technical + report 91-06-25-01, Performance Systems International, Inc. + , June 1991. + + [19] Yeong, W., "Representing Public Archives in the Directory", + Work in Progress, November 1991. + + + + + +Berners-Lee, Masinter & McCahill [Page 24] + +RFC 1738 Uniform Resource Locators (URL) December 1994 + + + [20] "Coded Character Set -- 7-bit American Standard Code for + Information Interchange", ANSI X3.4-1986. 
+ +Editors' Addresses + +Tim Berners-Lee +World-Wide Web project +CERN, +1211 Geneva 23, +Switzerland + +Phone: +41 (22)767 3755 +Fax: +41 (22)767 7155 +EMail: timbl@info.cern.ch + + +Larry Masinter +Xerox PARC +3333 Coyote Hill Road +Palo Alto, CA 94034 + +Phone: (415) 812-4365 +Fax: (415) 812-4333 +EMail: masinter@parc.xerox.com + + +Mark McCahill +Computer and Information Services, +University of Minnesota +Room 152 Shepherd Labs +100 Union Street SE +Minneapolis, MN 55455 + +Phone: (612) 625 1300 +EMail: mpm@boombox.micro.umn.edu + + + + + + + + + + + + + + + + +Berners-Lee, Masinter & McCahill [Page 25] + diff --git a/doc/rfc/rfc1945.txt b/doc/rfc/rfc1945.txt new file mode 100644 index 0000000000..37f3f23c6a --- /dev/null +++ b/doc/rfc/rfc1945.txt @@ -0,0 +1,3363 @@ + + + + + + +Network Working Group T. Berners-Lee +Request for Comments: 1945 MIT/LCS +Category: Informational R. Fielding + UC Irvine + H. Frystyk + MIT/LCS + May 1996 + + + Hypertext Transfer Protocol -- HTTP/1.0 + +Status of This Memo + + This memo provides information for the Internet community. This memo + does not specify an Internet standard of any kind. Distribution of + this memo is unlimited. + +IESG Note: + + The IESG has concerns about this protocol, and expects this document + to be replaced relatively soon by a standards track document. + +Abstract + + The Hypertext Transfer Protocol (HTTP) is an application-level + protocol with the lightness and speed necessary for distributed, + collaborative, hypermedia information systems. It is a generic, + stateless, object-oriented protocol which can be used for many tasks, + such as name servers and distributed object management systems, + through extension of its request methods (commands). A feature of + HTTP is the typing of data representation, allowing systems to be + built independently of the data being transferred. + + HTTP has been in use by the World-Wide Web global information + initiative since 1990. 
This specification reflects common usage of + the protocol referred to as "HTTP/1.0". + +Table of Contents + + 1. Introduction .............................................. 4 + 1.1 Purpose .............................................. 4 + 1.2 Terminology .......................................... 4 + 1.3 Overall Operation .................................... 6 + 1.4 HTTP and MIME ........................................ 8 + 2. Notational Conventions and Generic Grammar ................ 8 + 2.1 Augmented BNF ........................................ 8 + 2.2 Basic Rules .......................................... 10 + 3. Protocol Parameters ....................................... 12 + + + +Berners-Lee, et al Informational [Page 1] + +RFC 1945 HTTP/1.0 May 1996 + + + 3.1 HTTP Version ......................................... 12 + 3.2 Uniform Resource Identifiers ......................... 14 + 3.2.1 General Syntax ................................ 14 + 3.2.2 http URL ...................................... 15 + 3.3 Date/Time Formats .................................... 15 + 3.4 Character Sets ....................................... 17 + 3.5 Content Codings ...................................... 18 + 3.6 Media Types .......................................... 19 + 3.6.1 Canonicalization and Text Defaults ............ 19 + 3.6.2 Multipart Types ............................... 20 + 3.7 Product Tokens ....................................... 20 + 4. HTTP Message .............................................. 21 + 4.1 Message Types ........................................ 21 + 4.2 Message Headers ...................................... 22 + 4.3 General Header Fields ................................ 23 + 5. Request ................................................... 23 + 5.1 Request-Line ......................................... 23 + 5.1.1 Method ........................................ 24 + 5.1.2 Request-URI ................................... 
24 + 5.2 Request Header Fields ................................ 25 + 6. Response .................................................. 25 + 6.1 Status-Line .......................................... 26 + 6.1.1 Status Code and Reason Phrase ................. 26 + 6.2 Response Header Fields ............................... 28 + 7. Entity .................................................... 28 + 7.1 Entity Header Fields ................................. 29 + 7.2 Entity Body .......................................... 29 + 7.2.1 Type .......................................... 29 + 7.2.2 Length ........................................ 30 + 8. Method Definitions ........................................ 30 + 8.1 GET .................................................. 31 + 8.2 HEAD ................................................. 31 + 8.3 POST ................................................. 31 + 9. Status Code Definitions ................................... 32 + 9.1 Informational 1xx .................................... 32 + 9.2 Successful 2xx ....................................... 32 + 9.3 Redirection 3xx ...................................... 34 + 9.4 Client Error 4xx ..................................... 35 + 9.5 Server Error 5xx ..................................... 37 + 10. Header Field Definitions .................................. 37 + 10.1 Allow ............................................... 38 + 10.2 Authorization ....................................... 38 + 10.3 Content-Encoding .................................... 39 + 10.4 Content-Length ...................................... 39 + 10.5 Content-Type ........................................ 40 + 10.6 Date ................................................ 40 + 10.7 Expires ............................................. 41 + 10.8 From ................................................ 
42 + + + +Berners-Lee, et al Informational [Page 2] + +RFC 1945 HTTP/1.0 May 1996 + + + 10.9 If-Modified-Since ................................... 42 + 10.10 Last-Modified ....................................... 43 + 10.11 Location ............................................ 44 + 10.12 Pragma .............................................. 44 + 10.13 Referer ............................................. 44 + 10.14 Server .............................................. 45 + 10.15 User-Agent .......................................... 46 + 10.16 WWW-Authenticate .................................... 46 + 11. Access Authentication ..................................... 47 + 11.1 Basic Authentication Scheme ......................... 48 + 12. Security Considerations ................................... 49 + 12.1 Authentication of Clients ........................... 49 + 12.2 Safe Methods ........................................ 49 + 12.3 Abuse of Server Log Information ..................... 50 + 12.4 Transfer of Sensitive Information ................... 50 + 12.5 Attacks Based On File and Path Names ................ 51 + 13. Acknowledgments ........................................... 51 + 14. References ................................................ 52 + 15. Authors' Addresses ........................................ 54 + Appendix A. Internet Media Type message/http ................ 55 + Appendix B. Tolerant Applications ........................... 55 + Appendix C. Relationship to MIME ............................ 56 + C.1 Conversion to Canonical Form ......................... 56 + C.2 Conversion of Date Formats ........................... 57 + C.3 Introduction of Content-Encoding ..................... 57 + C.4 No Content-Transfer-Encoding ......................... 57 + C.5 HTTP Header Fields in Multipart Body-Parts ........... 57 + Appendix D. Additional Features ............................. 57 + D.1 Additional Request Methods ........................... 
58 + D.1.1 PUT ........................................... 58 + D.1.2 DELETE ........................................ 58 + D.1.3 LINK .......................................... 58 + D.1.4 UNLINK ........................................ 58 + D.2 Additional Header Field Definitions .................. 58 + D.2.1 Accept ........................................ 58 + D.2.2 Accept-Charset ................................ 59 + D.2.3 Accept-Encoding ............................... 59 + D.2.4 Accept-Language ............................... 59 + D.2.5 Content-Language .............................. 59 + D.2.6 Link .......................................... 59 + D.2.7 MIME-Version .................................. 59 + D.2.8 Retry-After ................................... 60 + D.2.9 Title ......................................... 60 + D.2.10 URI ........................................... 60 + + + + + + + +Berners-Lee, et al Informational [Page 3] + +RFC 1945 HTTP/1.0 May 1996 + + +1. Introduction + +1.1 Purpose + + The Hypertext Transfer Protocol (HTTP) is an application-level + protocol with the lightness and speed necessary for distributed, + collaborative, hypermedia information systems. HTTP has been in use + by the World-Wide Web global information initiative since 1990. This + specification reflects common usage of the protocol referred too as + "HTTP/1.0". This specification describes the features that seem to be + consistently implemented in most HTTP/1.0 clients and servers. The + specification is split into two sections. Those features of HTTP for + which implementations are usually consistent are described in the + main body of this document. Those features which have few or + inconsistent implementations are listed in Appendix D. + + Practical information systems require more functionality than simple + retrieval, including search, front-end update, and annotation. HTTP + allows an open-ended set of methods to be used to indicate the + purpose of a request. 
It builds on the discipline of reference + provided by the Uniform Resource Identifier (URI) [2], as a location + (URL) [4] or name (URN) [16], for indicating the resource on which a + method is to be applied. Messages are passed in a format similar to + that used by Internet Mail [7] and the Multipurpose Internet Mail + Extensions (MIME) [5]. + + HTTP is also used as a generic protocol for communication between + user agents and proxies/gateways to other Internet protocols, such as + SMTP [12], NNTP [11], FTP [14], Gopher [1], and WAIS [8], allowing + basic hypermedia access to resources available from diverse + applications and simplifying the implementation of user agents. + +1.2 Terminology + + This specification uses a number of terms to refer to the roles + played by participants in, and objects of, the HTTP communication. + + connection + + A transport layer virtual circuit established between two + application programs for the purpose of communication. + + message + + The basic unit of HTTP communication, consisting of a structured + sequence of octets matching the syntax defined in Section 4 and + transmitted via the connection. + + + + +Berners-Lee, et al Informational [Page 4] + +RFC 1945 HTTP/1.0 May 1996 + + + request + + An HTTP request message (as defined in Section 5). + + response + + An HTTP response message (as defined in Section 6). + + resource + + A network data object or service which can be identified by a + URI (Section 3.2). + + entity + + A particular representation or rendition of a data resource, or + reply from a service resource, that may be enclosed within a + request or response message. An entity consists of + metainformation in the form of entity headers and content in the + form of an entity body. + + client + + An application program that establishes connections for the + purpose of sending requests. + + user agent + + The client which initiates a request. 
These are often browsers, + editors, spiders (web-traversing robots), or other end user + tools. + + server + + An application program that accepts connections in order to + service requests by sending back responses. + + origin server + + The server on which a given resource resides or is to be created. + + proxy + + An intermediary program which acts as both a server and a client + for the purpose of making requests on behalf of other clients. + Requests are serviced internally or by passing them, with + possible translation, on to other servers. A proxy must + interpret and, if necessary, rewrite a request message before + + + +Berners-Lee, et al Informational [Page 5] + +RFC 1945 HTTP/1.0 May 1996 + + + forwarding it. Proxies are often used as client-side portals + through network firewalls and as helper applications for + handling requests via protocols not implemented by the user + agent. + + gateway + + A server which acts as an intermediary for some other server. + Unlike a proxy, a gateway receives requests as if it were the + origin server for the requested resource; the requesting client + may not be aware that it is communicating with a gateway. + Gateways are often used as server-side portals through network + firewalls and as protocol translators for access to resources + stored on non-HTTP systems. + + tunnel + + A tunnel is an intermediary program which is acting as a blind + relay between two connections. Once active, a tunnel is not + considered a party to the HTTP communication, though the tunnel + may have been initiated by an HTTP request. The tunnel ceases to + exist when both ends of the relayed connections are closed. + Tunnels are used when a portal is necessary and the intermediary + cannot, or should not, interpret the relayed communication. + + cache + + A program's local store of response messages and the subsystem + that controls its message storage, retrieval, and deletion. 
A + cache stores cachable responses in order to reduce the response + time and network bandwidth consumption on future, equivalent + requests. Any client or server may include a cache, though a + cache cannot be used by a server while it is acting as a tunnel. + + Any given program may be capable of being both a client and a server; + our use of these terms refers only to the role being performed by the + program for a particular connection, rather than to the program's + capabilities in general. Likewise, any server may act as an origin + server, proxy, gateway, or tunnel, switching behavior based on the + nature of each request. + +1.3 Overall Operation + + The HTTP protocol is based on a request/response paradigm. A client + establishes a connection with a server and sends a request to the + server in the form of a request method, URI, and protocol version, + followed by a MIME-like message containing request modifiers, client + information, and possible body content. The server responds with a + + + +Berners-Lee, et al Informational [Page 6] + +RFC 1945 HTTP/1.0 May 1996 + + + status line, including the message's protocol version and a success + or error code, followed by a MIME-like message containing server + information, entity metainformation, and possible body content. + + Most HTTP communication is initiated by a user agent and consists of + a request to be applied to a resource on some origin server. In the + simplest case, this may be accomplished via a single connection (v) + between the user agent (UA) and the origin server (O). + + request chain ------------------------> + UA -------------------v------------------- O + <----------------------- response chain + + A more complicated situation occurs when one or more intermediaries + are present in the request/response chain. There are three common + forms of intermediary: proxy, gateway, and tunnel. 
A proxy is a + forwarding agent, receiving requests for a URI in its absolute form, + rewriting all or parts of the message, and forwarding the reformatted + request toward the server identified by the URI. A gateway is a + receiving agent, acting as a layer above some other server(s) and, if + necessary, translating the requests to the underlying server's + protocol. A tunnel acts as a relay point between two connections + without changing the messages; tunnels are used when the + communication needs to pass through an intermediary (such as a + firewall) even when the intermediary cannot understand the contents + of the messages. + + request chain --------------------------------------> + UA -----v----- A -----v----- B -----v----- C -----v----- O + <------------------------------------- response chain + + The figure above shows three intermediaries (A, B, and C) between the + user agent and origin server. A request or response message that + travels the whole chain must pass through four separate connections. + This distinction is important because some HTTP communication options + may apply only to the connection with the nearest, non-tunnel + neighbor, only to the end-points of the chain, or to all connections + along the chain. Although the diagram is linear, each participant may + be engaged in multiple, simultaneous communications. For example, B + may be receiving requests from many clients other than A, and/or + forwarding requests to servers other than C, at the same time that it + is handling A's request. + + Any party to the communication which is not acting as a tunnel may + employ an internal cache for handling requests. The effect of a cache + is that the request/response chain is shortened if one of the + participants along the chain has a cached response applicable to that + request. 
The following illustrates the resulting chain if B has a + + + +Berners-Lee, et al Informational [Page 7] + +RFC 1945 HTTP/1.0 May 1996 + + + cached copy of an earlier response from O (via C) for a request which + has not been cached by UA or A. + + request chain ----------> + UA -----v----- A -----v----- B - - - - - - C - - - - - - O + <--------- response chain + + Not all responses are cachable, and some requests may contain + modifiers which place special requirements on cache behavior. Some + HTTP/1.0 applications use heuristics to describe what is or is not a + "cachable" response, but these rules are not standardized. + + On the Internet, HTTP communication generally takes place over TCP/IP + connections. The default port is TCP 80 [15], but other ports can be + used. This does not preclude HTTP from being implemented on top of + any other protocol on the Internet, or on other networks. HTTP only + presumes a reliable transport; any protocol that provides such + guarantees can be used, and the mapping of the HTTP/1.0 request and + response structures onto the transport data units of the protocol in + question is outside the scope of this specification. + + Except for experimental applications, current practice requires that + the connection be established by the client prior to each request and + closed by the server after sending the response. Both clients and + servers should be aware that either party may close the connection + prematurely, due to user action, automated time-out, or program + failure, and should handle such closing in a predictable fashion. In + any case, the closing of the connection by either or both parties + always terminates the current request, regardless of its status. + +1.4 HTTP and MIME + + HTTP/1.0 uses many of the constructs defined for MIME, as defined in + RFC 1521 [5]. 
Appendix C describes the ways in which the context of + HTTP allows for different use of Internet Media Types than is + typically found in Internet mail, and gives the rationale for those + differences. + +2. Notational Conventions and Generic Grammar + +2.1 Augmented BNF + + All of the mechanisms specified in this document are described in + both prose and an augmented Backus-Naur Form (BNF) similar to that + used by RFC 822 [7]. Implementors will need to be familiar with the + notation in order to understand this specification. The augmented BNF + includes the following constructs: + + + + +Berners-Lee, et al Informational [Page 8] + +RFC 1945 HTTP/1.0 May 1996 + + + name = definition + + The name of a rule is simply the name itself (without any + enclosing "<" and ">") and is separated from its definition by + the equal character "=". Whitespace is only significant in that + indentation of continuation lines is used to indicate a rule + definition that spans more than one line. Certain basic rules + are in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc. + Angle brackets are used within definitions whenever their + presence will facilitate discerning the use of rule names. + + "literal" + + Quotation marks surround literal text. Unless stated otherwise, + the text is case-insensitive. + + rule1 | rule2 + + Elements separated by a bar ("|") are alternatives, + e.g., "yes | no" will accept yes or no. + + (rule1 rule2) + + Elements enclosed in parentheses are treated as a single + element. Thus, "(elem (foo | bar) elem)" allows the token + sequences "elem foo elem" and "elem bar elem". + + *rule + + The character "*" preceding an element indicates repetition. The + full form is "<n>*<m>element" indicating at least <n> and at + most <m> occurrences of element. Default values are 0 and + infinity so that "*(element)" allows any number, including zero; + "1*element" requires at least one; and "1*2element" allows one + or two.
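The repetition prefix described above (an optional minimum, the "*" character, an optional maximum) can be turned into a pair of bounds mechanically. The following is a minimal Python sketch, not part of the specification; the function names repetition_bounds and count_allowed are hypothetical and exist only for illustration.

```python
import re


def repetition_bounds(prefix):
    """Return (minimum, maximum) for a prefix such as "*", "1*" or "1*2".

    A maximum of None stands for "infinity", the default upper bound.
    """
    match = re.fullmatch(r"([0-9]*)\*([0-9]*)", prefix)
    if match is None:
        raise ValueError(f"not a repetition prefix: {prefix!r}")
    # Missing numbers fall back to the defaults: 0 and infinity.
    minimum = int(match.group(1)) if match.group(1) else 0
    maximum = int(match.group(2)) if match.group(2) else None
    return minimum, maximum


def count_allowed(prefix, count):
    """Tell whether `count` occurrences satisfy the repetition prefix."""
    minimum, maximum = repetition_bounds(prefix)
    return count >= minimum and (maximum is None or count <= maximum)
```

For instance, count_allowed("1*2", 3) is false because the prefix "1*2" allows only one or two occurrences.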
+ + [rule] + + Square brackets enclose optional elements; "[foo bar]" is + equivalent to "*1(foo bar)". + + N rule + + Specific repetition: "<n>(element)" is equivalent to + "<n>*<n>(element)"; that is, exactly <n> occurrences of + (element). Thus 2DIGIT is a 2-digit number, and 3ALPHA is a + string of three alphabetic characters. + + + + +Berners-Lee, et al Informational [Page 9] + +RFC 1945 HTTP/1.0 May 1996 + + + #rule + + A construct "#" is defined, similar to "*", for defining lists + of elements. The full form is "<n>#<m>element" indicating at + least <n> and at most <m> elements, each separated by one or + more commas (",") and optional linear whitespace (LWS). This + makes the usual form of lists very easy; a rule such as + "( *LWS element *( *LWS "," *LWS element ))" can be shown as + "1#element". Wherever this construct is used, null elements are + allowed, but do not contribute to the count of elements present. + That is, "(element), , (element)" is permitted, but counts as + only two elements. Therefore, where at least one element is + required, at least one non-null element must be present. Default + values are 0 and infinity so that "#(element)" allows any + number, including zero; "1#element" requires at least one; and + "1#2element" allows one or two. + + ; comment + + A semi-colon, set off some distance to the right of rule text, + starts a comment that continues to the end of line. This is a + simple way of including useful notes in parallel with the + specifications. + + implied *LWS + + The grammar described by this specification is word-based. + Except where noted otherwise, linear whitespace (LWS) can be + included between any two adjacent words (token or + quoted-string), and between adjacent tokens and delimiters + (tspecials), without changing the interpretation of a field. At + least one delimiter (tspecials) must exist between any two + tokens, since they would otherwise be interpreted as a single + token.
However, applications should attempt to follow "common + form" when generating HTTP constructs, since there exist some + implementations that fail to accept anything beyond the common + forms. + +2.2 Basic Rules + + The following rules are used throughout this specification to + describe basic parsing constructs. The US-ASCII coded character set + is defined by [17]. + + OCTET = <any 8-bit sequence of data> + CHAR = <any US-ASCII character (octets 0 - 127)> + UPALPHA = <any US-ASCII uppercase letter "A".."Z"> + LOALPHA = <any US-ASCII lowercase letter "a".."z"> + + + +Berners-Lee, et al Informational [Page 10] + +RFC 1945 HTTP/1.0 May 1996 + + + ALPHA = UPALPHA | LOALPHA + DIGIT = <any US-ASCII digit "0".."9"> + CTL = <any US-ASCII control character + (octets 0 - 31) and DEL (127)> + CR = <US-ASCII CR, carriage return (13)> + LF = <US-ASCII LF, linefeed (10)> + SP = <US-ASCII SP, space (32)> + HT = <US-ASCII HT, horizontal-tab (9)> + <"> = <US-ASCII double-quote mark (34)> + + HTTP/1.0 defines the octet sequence CR LF as the end-of-line marker + for all protocol elements except the Entity-Body (see Appendix B for + tolerant applications). The end-of-line marker within an Entity-Body + is defined by its associated media type, as described in Section 3.6. + + CRLF = CR LF + + HTTP/1.0 headers may be folded onto multiple lines if each + continuation line begins with a space or horizontal tab. All linear + whitespace, including folding, has the same semantics as SP. + + LWS = [CRLF] 1*( SP | HT ) + + However, folding of header lines is not expected by some + applications, and should not be generated by HTTP/1.0 applications. + + The TEXT rule is only used for descriptive field contents and values + that are not intended to be interpreted by the message parser. Words + of *TEXT may contain octets from character sets other than US-ASCII. + + TEXT = <any OCTET except CTLs, + but including LWS> + + Recipients of header field TEXT containing octets outside the US- + ASCII character set may assume that they represent ISO-8859-1 + characters. + + Hexadecimal numeric characters are used in several protocol elements. + + HEX = "A" | "B" | "C" | "D" | "E" | "F" + | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT + + Many HTTP/1.0 header field values consist of words separated by LWS + or special characters. These special characters must be in a quoted + string to be used within a parameter value.
+ + word = token | quoted-string + + + + +Berners-Lee, et al Informational [Page 11] + +RFC 1945 HTTP/1.0 May 1996 + + + token = 1*<any CHAR except CTLs or tspecials> + + tspecials = "(" | ")" | "<" | ">" | "@" + | "," | ";" | ":" | "\" | <"> + | "/" | "[" | "]" | "?" | "=" + | "{" | "}" | SP | HT + + Comments may be included in some HTTP header fields by surrounding + the comment text with parentheses. Comments are only allowed in + fields containing "comment" as part of their field value definition. + In all other fields, parentheses are considered part of the field + value. + + comment = "(" *( ctext | comment ) ")" + ctext = <any TEXT excluding "(" and ")"> + + A string of text is parsed as a single word if it is quoted using + double-quote marks. + + quoted-string = ( <"> *(qdtext) <"> ) + + qdtext = <any CHAR except <"> and CTLs, + but including LWS> + + Single-character quoting using the backslash ("\") character is not + permitted in HTTP/1.0. + +3. Protocol Parameters + +3.1 HTTP Version + + HTTP uses a "<major>.<minor>" numbering scheme to indicate versions + of the protocol. The protocol versioning policy is intended to allow + the sender to indicate the format of a message and its capacity for + understanding further HTTP communication, rather than the features + obtained via that communication. No change is made to the version + number for the addition of message components which do not affect + communication behavior or which only add to extensible field values. + The <minor> number is incremented when the changes made to the + protocol add features which do not change the general message parsing + algorithm, but which may add to the message semantics and imply + additional capabilities of the sender. The <major> number is + incremented when the format of a message within the protocol is + changed. + + The version of an HTTP message is indicated by an HTTP-Version field + in the first line of the message.
If the protocol version is not + specified, the recipient must assume that the message is in the + + + +Berners-Lee, et al Informational [Page 12] + +RFC 1945 HTTP/1.0 May 1996 + + + simple HTTP/0.9 format. + + HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT + + Note that the major and minor numbers should be treated as separate + integers and that each may be incremented higher than a single digit. + Thus, HTTP/2.4 is a lower version than HTTP/2.13, which in turn is + lower than HTTP/12.3. Leading zeros should be ignored by recipients + and never generated by senders. + + This document defines both the 0.9 and 1.0 versions of the HTTP + protocol. Applications sending Full-Request or Full-Response + messages, as defined by this specification, must include an HTTP- + Version of "HTTP/1.0". + + HTTP/1.0 servers must: + + o recognize the format of the Request-Line for HTTP/0.9 and + HTTP/1.0 requests; + + o understand any valid request in the format of HTTP/0.9 or + HTTP/1.0; + + o respond appropriately with a message in the same protocol + version used by the client. + + HTTP/1.0 clients must: + + o recognize the format of the Status-Line for HTTP/1.0 responses; + + o understand any valid response in the format of HTTP/0.9 or + HTTP/1.0. + + Proxy and gateway applications must be careful in forwarding requests + that are received in a format different than that of the + application's native HTTP version. Since the protocol version + indicates the protocol capability of the sender, a proxy/gateway must + never send a message with a version indicator which is greater than + its native version; if a higher version request is received, the + proxy/gateway must either downgrade the request version or respond + with an error. Requests with a version lower than that of the + application's native format may be upgraded before being forwarded; + the proxy/gateway's response to that request must follow the server + requirements listed above. 
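The rule that the major and minor numbers are separate integers is easy to get wrong with a string or decimal comparison. The following is a minimal Python sketch, not part of the specification: parse_http_version and forwarded_version are hypothetical names, the native version of (1, 0) is an assumed default, and the case-insensitive match on "HTTP" follows the reading of Section 2.1 that literals are case-insensitive unless stated otherwise.

```python
import re

# HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
_HTTP_VERSION = re.compile(r"HTTP/([0-9]+)\.([0-9]+)", re.IGNORECASE)


def parse_http_version(text):
    """Return (major, minor) as integers; leading zeros are ignored."""
    match = _HTTP_VERSION.fullmatch(text)
    if match is None:
        raise ValueError(f"not an HTTP-Version: {text!r}")
    return int(match.group(1)), int(match.group(2))


def forwarded_version(received, native=(1, 0)):
    """Pick a version a proxy/gateway could send when forwarding.

    This sketch takes the "downgrade" option: the result is never
    greater than the proxy's native version. Responding with an error
    instead is equally permitted by the text above.
    """
    return min(received, native)
```

Because tuples compare element by element, parse_http_version("HTTP/2.4") sorts below parse_http_version("HTTP/2.13"), which is the ordering the text requires.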
+ + + + + + + +Berners-Lee, et al Informational [Page 13] + +RFC 1945 HTTP/1.0 May 1996 + + +3.2 Uniform Resource Identifiers + + URIs have been known by many names: WWW addresses, Universal Document + Identifiers, Universal Resource Identifiers [2], and finally the + combination of Uniform Resource Locators (URL) [4] and Names (URN) + [16]. As far as HTTP is concerned, Uniform Resource Identifiers are + simply formatted strings which identify--via name, location, or any + other characteristic--a network resource. + +3.2.1 General Syntax + + URIs in HTTP can be represented in absolute form or relative to some + known base URI [9], depending upon the context of their use. The two + forms are differentiated by the fact that absolute URIs always begin + with a scheme name followed by a colon. + + URI = ( absoluteURI | relativeURI ) [ "#" fragment ] + + absoluteURI = scheme ":" *( uchar | reserved ) + + relativeURI = net_path | abs_path | rel_path + + net_path = "//" net_loc [ abs_path ] + abs_path = "/" rel_path + rel_path = [ path ] [ ";" params ] [ "?" query ] + + path = fsegment *( "/" segment ) + fsegment = 1*pchar + segment = *pchar + + params = param *( ";" param ) + param = *( pchar | "/" ) + + scheme = 1*( ALPHA | DIGIT | "+" | "-" | "." ) + net_loc = *( pchar | ";" | "?" ) + query = *( uchar | reserved ) + fragment = *( uchar | reserved ) + + pchar = uchar | ":" | "@" | "&" | "=" | "+" + uchar = unreserved | escape + unreserved = ALPHA | DIGIT | safe | extra | national + + escape = "%" HEX HEX + reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" + extra = "!" | "*" | "'" | "(" | ")" | "," + safe = "$" | "-" | "_" | "." + unsafe = CTL | SP | <"> | "#" | "%" | "<" | ">" + national = <any OCTET excluding ALPHA, DIGIT, + reserved, extra, safe, and unsafe> + + For definitive information on URL syntax and semantics, see RFC 1738 + [4] and RFC 1808 [9].
The BNF above includes national characters not + allowed in valid URLs as specified by RFC 1738, since HTTP servers + are not restricted in the set of unreserved characters allowed to + represent the rel_path part of addresses, and HTTP proxies may + receive requests for URIs not defined by RFC 1738. + +3.2.2 http URL + + The "http" scheme is used to locate network resources via the HTTP + protocol. This section defines the scheme-specific syntax and + semantics for http URLs. + + http_URL = "http:" "//" host [ ":" port ] [ abs_path ] + + host = <A legal Internet host domain name + or IP address (in dotted-decimal form), + as defined by Section 2.1 of RFC 1123> + + port = *DIGIT + + If the port is empty or not given, port 80 is assumed. The semantics + are that the identified resource is located at the server listening + for TCP connections on that port of that host, and the Request-URI + for the resource is abs_path. If the abs_path is not present in the + URL, it must be given as "/" when used as a Request-URI (Section + 5.1.2). + + Note: Although the HTTP protocol is independent of the transport + layer protocol, the http URL only identifies resources by their + TCP location, and thus non-TCP resources must be identified by + some other URI scheme. + + The canonical form for "http" URLs is obtained by converting any + UPALPHA characters in host to their LOALPHA equivalent (hostnames are + case-insensitive), eliding the [ ":" port ] if the port is 80, and + replacing an empty abs_path with "/". + +3.3 Date/Time Formats + + HTTP/1.0 applications have historically allowed three different + formats for the representation of date/time stamps: + + Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123 + Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036 + Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format + + + +Berners-Lee, et al Informational [Page 15] + +RFC 1945 HTTP/1.0 May 1996 + + + The first format is preferred as an Internet standard and represents + a fixed-length subset of that defined by RFC 1123 [6] (an update to + RFC 822 [7]).
The second format is in common use, but is based on the + obsolete RFC 850 [10] date format and lacks a four-digit year. + HTTP/1.0 clients and servers that parse the date value should accept + all three formats, though they must never generate the third + (asctime) format. + + Note: Recipients of date values are encouraged to be robust in + accepting date values that may have been generated by non-HTTP + applications, as is sometimes the case when retrieving or posting + messages via proxies/gateways to SMTP or NNTP. + + All HTTP/1.0 date/time stamps must be represented in Universal Time + (UT), also known as Greenwich Mean Time (GMT), without exception. + This is indicated in the first two formats by the inclusion of "GMT" + as the three-letter abbreviation for time zone, and should be assumed + when reading the asctime format. + + HTTP-date = rfc1123-date | rfc850-date | asctime-date + + rfc1123-date = wkday "," SP date1 SP time SP "GMT" + rfc850-date = weekday "," SP date2 SP time SP "GMT" + asctime-date = wkday SP date3 SP time SP 4DIGIT + + date1 = 2DIGIT SP month SP 4DIGIT + ; day month year (e.g., 02 Jun 1982) + date2 = 2DIGIT "-" month "-" 2DIGIT + ; day-month-year (e.g., 02-Jun-82) + date3 = month SP ( 2DIGIT | ( SP 1DIGIT )) + ; month day (e.g., Jun 2) + + time = 2DIGIT ":" 2DIGIT ":" 2DIGIT + ; 00:00:00 - 23:59:59 + + wkday = "Mon" | "Tue" | "Wed" + | "Thu" | "Fri" | "Sat" | "Sun" + + weekday = "Monday" | "Tuesday" | "Wednesday" + | "Thursday" | "Friday" | "Saturday" | "Sunday" + + month = "Jan" | "Feb" | "Mar" | "Apr" + | "May" | "Jun" | "Jul" | "Aug" + | "Sep" | "Oct" | "Nov" | "Dec" + + Note: HTTP requirements for the date/time stamp format apply + only to their usage within the protocol stream. Clients and + servers are not required to use these formats for user + + + +Berners-Lee, et al Informational [Page 16] + +RFC 1945 HTTP/1.0 May 1996 + + + presentation, request logging, etc. 
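A recipient that accepts all three date layouts while generating only the preferred one might look like the following minimal Python sketch. It is not part of the specification: parse_http_date and format_http_date are hypothetical names, and the sketch assumes the default C/English locale so that the %a, %A and %b patterns match the English day and month names of the grammar.

```python
from datetime import datetime, timezone

# strptime patterns for the three layouts of Section 3.3.
_FORMATS = (
    "%a, %d %b %Y %H:%M:%S GMT",  # rfc1123-date (preferred)
    "%A, %d-%b-%y %H:%M:%S GMT",  # rfc850-date (two-digit year)
    "%a %b %d %H:%M:%S %Y",       # asctime-date (no zone, assumed GMT)
)


def parse_http_date(value):
    """Parse any of the three HTTP-date layouts into an aware UTC datetime."""
    for fmt in _FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue  # try the next layout
        return parsed.replace(tzinfo=timezone.utc)
    raise ValueError(f"not a recognized HTTP-date: {value!r}")


def format_http_date(moment):
    """Format an aware datetime in the preferred rfc1123-date layout."""
    return moment.astimezone(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")
```

The sketch does not check that the weekday name agrees with the date, and the two-digit year of the rfc850-date layout is expanded according to strptime's own pivot rule, which is an implementation choice rather than something the text above defines.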
+ +3.4 Character Sets + + HTTP uses the same definition of the term "character set" as that + described for MIME: + + The term "character set" is used in this document to refer to a + method used with one or more tables to convert a sequence of + octets into a sequence of characters. Note that unconditional + conversion in the other direction is not required, in that not all + characters may be available in a given character set and a + character set may provide more than one sequence of octets to + represent a particular character. This definition is intended to + allow various kinds of character encodings, from simple single- + table mappings such as US-ASCII to complex table switching methods + such as those that use ISO 2022's techniques. However, the + definition associated with a MIME character set name must fully + specify the mapping to be performed from octets to characters. In + particular, use of external profiling information to determine the + exact mapping is not permitted. + + Note: This use of the term "character set" is more commonly + referred to as a "character encoding." However, since HTTP and + MIME share the same registry, it is important that the terminology + also be shared. + + HTTP character sets are identified by case-insensitive tokens. The + complete set of tokens are defined by the IANA Character Set registry + [15]. However, because that registry does not define a single, + consistent token for each character set, we define here the preferred + names for those character sets most likely to be used with HTTP + entities. These character sets include those registered by RFC 1521 + [5] -- the US-ASCII [17] and ISO-8859 [18] character sets -- and + other names specifically recommended for use within MIME charset + parameters. 
+ + charset = "US-ASCII" + | "ISO-8859-1" | "ISO-8859-2" | "ISO-8859-3" + | "ISO-8859-4" | "ISO-8859-5" | "ISO-8859-6" + | "ISO-8859-7" | "ISO-8859-8" | "ISO-8859-9" + | "ISO-2022-JP" | "ISO-2022-JP-2" | "ISO-2022-KR" + | "UNICODE-1-1" | "UNICODE-1-1-UTF-7" | "UNICODE-1-1-UTF-8" + | token + + Although HTTP allows an arbitrary token to be used as a charset + value, any token that has a predefined value within the IANA + Character Set registry [15] must represent the character set defined + + + +Berners-Lee, et al Informational [Page 17] + +RFC 1945 HTTP/1.0 May 1996 + + + by that registry. Applications should limit their use of character + sets to those defined by the IANA registry. + + The character set of an entity body should be labelled as the lowest + common denominator of the character codes used within that body, with + the exception that no label is preferred over the labels US-ASCII or + ISO-8859-1. + +3.5 Content Codings + + Content coding values are used to indicate an encoding transformation + that has been applied to a resource. Content codings are primarily + used to allow a document to be compressed or encrypted without losing + the identity of its underlying media type. Typically, the resource is + stored in this encoding and only decoded before rendering or + analogous usage. + + content-coding = "x-gzip" | "x-compress" | token + + Note: For future compatibility, HTTP/1.0 applications should + consider "gzip" and "compress" to be equivalent to "x-gzip" + and "x-compress", respectively. + + All content-coding values are case-insensitive. HTTP/1.0 uses + content-coding values in the Content-Encoding (Section 10.3) header + field. Although the value describes the content-coding, what is more + important is that it indicates what decoding mechanism will be + required to remove the encoding. Note that a single program may be + capable of decoding multiple content-coding formats. 
Two values are + defined by this specification: + + x-gzip + An encoding format produced by the file compression program + "gzip" (GNU zip) developed by Jean-loup Gailly. This format is + typically a Lempel-Ziv coding (LZ77) with a 32 bit CRC. + + x-compress + The encoding format produced by the file compression program + "compress". This format is an adaptive Lempel-Ziv-Welch coding + (LZW). + + Note: Use of program names for the identification of + encoding formats is not desirable and should be discouraged + for future encodings. Their use here is representative of + historical practice, not good design. + + + + + + +Berners-Lee, et al Informational [Page 18] + +RFC 1945 HTTP/1.0 May 1996 + + +3.6 Media Types + + HTTP uses Internet Media Types [13] in the Content-Type header field + (Section 10.5) in order to provide open and extensible data typing. + + media-type = type "/" subtype *( ";" parameter ) + type = token + subtype = token + + Parameters may follow the type/subtype in the form of attribute/value + pairs. + + parameter = attribute "=" value + attribute = token + value = token | quoted-string + + The type, subtype, and parameter attribute names are case- + insensitive. Parameter values may or may not be case-sensitive, + depending on the semantics of the parameter name. LWS must not be + generated between the type and subtype, nor between an attribute and + its value. Upon receipt of a media type with an unrecognized + parameter, a user agent should treat the media type as if the + unrecognized parameter and its value were not present. + + Some older HTTP applications do not recognize media type parameters. + HTTP/1.0 applications should only use media type parameters when they + are necessary to define the content of a message. + + Media-type values are registered with the Internet Assigned Number + Authority (IANA [15]). The media type registration process is + outlined in RFC 1590 [13]. Use of non-registered media types is + discouraged. 
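The media-type grammar above can be read with a small hand-written parser. The following Python sketch is illustrative only and not part of the specification: parse_media_type is a hypothetical name, whitespace around ";" is tolerated on input as an assumption, and quoted-string handling is simplified to "anything up to the next double quote".

```python
import re

# token = 1*<any CHAR except CTLs or tspecials>
_TOKEN = r"[!#$%&'*+\-.0-9A-Z^_`a-z|~]+"
# parameter = attribute "=" value ; value = token | quoted-string
_PARAM = re.compile(rf';\s*({_TOKEN})=(?:({_TOKEN})|"([^"]*)")\s*')


def parse_media_type(value):
    """Return (type, subtype, parameters) for a Content-Type value.

    Type, subtype and attribute names are lowercased because they are
    case-insensitive; parameter values are returned unchanged.
    """
    head = re.match(rf"\s*({_TOKEN})/({_TOKEN})\s*", value)
    if head is None:
        raise ValueError(f"not a media type: {value!r}")
    params = {}
    pos = head.end()
    while pos < len(value):
        param = _PARAM.match(value, pos)
        if param is None:
            raise ValueError(f"malformed parameter in: {value!r}")
        token_value, quoted_value = param.group(2), param.group(3)
        params[param.group(1).lower()] = (
            token_value if token_value is not None else quoted_value
        )
        pos = param.end()
    return head.group(1).lower(), head.group(2).lower(), params
```

A caller that does not recognize a parameter can simply ignore the corresponding dictionary entry, which matches the recommendation to treat the media type as if the unrecognized parameter were absent.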
+ +3.6.1 Canonicalization and Text Defaults + + Internet media types are registered with a canonical form. In + general, an Entity-Body transferred via HTTP must be represented in + the appropriate canonical form prior to its transmission. If the body + has been encoded with a Content-Encoding, the underlying data should + be in canonical form prior to being encoded. + + Media subtypes of the "text" type use CRLF as the text line break + when in canonical form. However, HTTP allows the transport of text + media with plain CR or LF alone representing a line break when used + consistently within the Entity-Body. HTTP applications must accept + CRLF, bare CR, and bare LF as being representative of a line break in + text media received via HTTP. + + + + +Berners-Lee, et al Informational [Page 19] + +RFC 1945 HTTP/1.0 May 1996 + + + In addition, if the text media is represented in a character set that + does not use octets 13 and 10 for CR and LF respectively, as is the + case for some multi-byte character sets, HTTP allows the use of + whatever octet sequences are defined by that character set to + represent the equivalent of CR and LF for line breaks. This + flexibility regarding line breaks applies only to text media in the + Entity-Body; a bare CR or LF should not be substituted for CRLF + within any of the HTTP control structures (such as header fields and + multipart boundaries). + + The "charset" parameter is used with some media types to define the + character set (Section 3.4) of the data. When no explicit charset + parameter is provided by the sender, media subtypes of the "text" + type are defined to have a default charset value of "ISO-8859-1" when + received via HTTP. Data in character sets other than "ISO-8859-1" or + its subsets must be labelled with an appropriate charset value in + order to be consistently interpreted by the recipient. + + Note: Many current HTTP servers provide data using charsets other + than "ISO-8859-1" without proper labelling. 
This situation reduces + interoperability and is not recommended. To compensate for this, + some HTTP user agents provide a configuration option to allow the + user to change the default interpretation of the media type + character set when no charset parameter is given. + +3.6.2 Multipart Types + + MIME provides for a number of "multipart" types -- encapsulations of + several entities within a single message's Entity-Body. The multipart + types registered by IANA [15] do not have any special meaning for + HTTP/1.0, though user agents may need to understand each type in + order to correctly interpret the purpose of each body-part. An HTTP + user agent should follow the same or similar behavior as a MIME user + agent does upon receipt of a multipart type. HTTP servers should not + assume that all HTTP clients are prepared to handle multipart types. + + All multipart types share a common syntax and must include a boundary + parameter as part of the media type value. The message body is itself + a protocol element and must therefore use only CRLF to represent line + breaks between body-parts. Multipart body-parts may contain HTTP + header fields which are significant to the meaning of that part. + +3.7 Product Tokens + + Product tokens are used to allow communicating applications to + identify themselves via a simple product token, with an optional + slash and version designator. Most fields using product tokens also + allow subproducts which form a significant part of the application to + + + +Berners-Lee, et al Informational [Page 20] + +RFC 1945 HTTP/1.0 May 1996 + + + be listed, separated by whitespace. By convention, the products are + listed in order of their significance for identifying the + application. 
+ + product = token ["/" product-version] + product-version = token + + Examples: + + User-Agent: CERN-LineMode/2.15 libwww/2.17b3 + + Server: Apache/0.8.4 + + Product tokens should be short and to the point -- use of them for + advertizing or other non-essential information is explicitly + forbidden. Although any token character may appear in a product- + version, this token should only be used for a version identifier + (i.e., successive versions of the same product should only differ in + the product-version portion of the product value). + +4. HTTP Message + +4.1 Message Types + + HTTP messages consist of requests from client to server and responses + from server to client. + + HTTP-message = Simple-Request ; HTTP/0.9 messages + | Simple-Response + | Full-Request ; HTTP/1.0 messages + | Full-Response + + Full-Request and Full-Response use the generic message format of RFC + 822 [7] for transferring entities. Both messages may include optional + header fields (also known as "headers") and an entity body. The + entity body is separated from the headers by a null line (i.e., a + line with nothing preceding the CRLF). + + Full-Request = Request-Line ; Section 5.1 + *( General-Header ; Section 4.3 + | Request-Header ; Section 5.2 + | Entity-Header ) ; Section 7.1 + CRLF + [ Entity-Body ] ; Section 7.2 + + Full-Response = Status-Line ; Section 6.1 + *( General-Header ; Section 4.3 + | Response-Header ; Section 6.2 + + + +Berners-Lee, et al Informational [Page 21] + +RFC 1945 HTTP/1.0 May 1996 + + + | Entity-Header ) ; Section 7.1 + CRLF + [ Entity-Body ] ; Section 7.2 + + Simple-Request and Simple-Response do not allow the use of any header + information and are limited to a single request method (GET). + + Simple-Request = "GET" SP Request-URI CRLF + + Simple-Response = [ Entity-Body ] + + Use of the Simple-Request format is discouraged because it prevents + the server from identifying the media type of the returned entity. 
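A server has to tell the two request formats apart from the first line alone. The following minimal Python sketch shows one way to do so; it is not part of the specification, classify_request_line is a hypothetical name, and matching "GET" exactly (case-sensitively) in the simple form is an assumption of the sketch.

```python
def classify_request_line(line):
    """Classify the first line of a request as "simple" or "full".

    Simple-Request:            "GET" SP Request-URI CRLF
    First line of Full-Request: Method SP Request-URI SP HTTP-Version CRLF
    """
    parts = line.rstrip("\r\n").split(" ")
    if len(parts) == 2 and parts[0] == "GET" and parts[1]:
        # HTTP/0.9 style: no version field, no headers follow.
        return {"kind": "simple", "method": "GET",
                "uri": parts[1], "version": None}
    if len(parts) == 3 and all(parts) and parts[2].upper().startswith("HTTP/"):
        # HTTP/1.0 style: headers and an optional body may follow.
        return {"kind": "full", "method": parts[0],
                "uri": parts[1], "version": parts[2]}
    raise ValueError(f"not a recognizable request line: {line!r}")
```

A request classified as "simple" would then be answered with a bare entity body and no status line or headers.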
+ +4.2 Message Headers + + HTTP header fields, which include General-Header (Section 4.3), + Request-Header (Section 5.2), Response-Header (Section 6.2), and + Entity-Header (Section 7.1) fields, follow the same generic format as + that given in Section 3.1 of RFC 822 [7]. Each header field consists + of a name followed immediately by a colon (":"), a single space (SP) + character, and the field value. Field names are case-insensitive. + Header fields can be extended over multiple lines by preceding each + extra line with at least one SP or HT, though this is not + recommended. + + HTTP-header = field-name ":" [ field-value ] CRLF + + field-name = token + field-value = *( field-content | LWS ) + + field-content = <the OCTETs making up the field-value + and consisting of either *TEXT or combinations + of token, tspecials, and quoted-string> + + The order in which header fields are received is not significant. + However, it is "good practice" to send General-Header fields first, + followed by Request-Header or Response-Header fields prior to the + Entity-Header fields. + + Multiple HTTP-header fields with the same field-name may be present + in a message if and only if the entire field-value for that header + field is defined as a comma-separated list [i.e., #(values)]. It must + be possible to combine the multiple header fields into one "field- + name: field-value" pair, without changing the semantics of the + message, by appending each subsequent field-value to the first, each + separated by a comma. + + + + +Berners-Lee, et al Informational [Page 22] + +RFC 1945 HTTP/1.0 May 1996 + + +4.3 General Header Fields + + There are a few header fields which have general applicability for + both request and response messages, but which do not apply to the + entity being transferred. These headers apply only to the message + being transmitted. + + General-Header = Date ; Section 10.6 + | Pragma ; Section 10.12 + + General header field names can be extended reliably only in + combination with a change in the protocol version.
However, new or + experimental header fields may be given the semantics of general + header fields if all parties in the communication recognize them to + be general header fields. Unrecognized header fields are treated as + Entity-Header fields. + +5. Request + + A request message from a client to a server includes, within the + first line of that message, the method to be applied to the resource, + the identifier of the resource, and the protocol version in use. For + backwards compatibility with the more limited HTTP/0.9 protocol, + there are two valid formats for an HTTP request: + + Request = Simple-Request | Full-Request + + Simple-Request = "GET" SP Request-URI CRLF + + Full-Request = Request-Line ; Section 5.1 + *( General-Header ; Section 4.3 + | Request-Header ; Section 5.2 + | Entity-Header ) ; Section 7.1 + CRLF + [ Entity-Body ] ; Section 7.2 + + If an HTTP/1.0 server receives a Simple-Request, it must respond with + an HTTP/0.9 Simple-Response. An HTTP/1.0 client capable of receiving + a Full-Response should never generate a Simple-Request. + +5.1 Request-Line + + The Request-Line begins with a method token, followed by the + Request-URI and the protocol version, and ending with CRLF. The + elements are separated by SP characters. No CR or LF are allowed + except in the final CRLF sequence. + + Request-Line = Method SP Request-URI SP HTTP-Version CRLF + + + +Berners-Lee, et al Informational [Page 23] + +RFC 1945 HTTP/1.0 May 1996 + + + Note that the difference between a Simple-Request and the Request- + Line of a Full-Request is the presence of the HTTP-Version field and + the availability of methods other than GET. + +5.1.1 Method + + The Method token indicates the method to be performed on the resource + identified by the Request-URI. The method is case-sensitive. 
+ + Method = "GET" ; Section 8.1 + | "HEAD" ; Section 8.2 + | "POST" ; Section 8.3 + | extension-method + + extension-method = token + + The list of methods acceptable by a specific resource can change + dynamically; the client is notified through the return code of the + response if a method is not allowed on a resource. Servers should + return the status code 501 (not implemented) if the method is + unrecognized or not implemented. + + The methods commonly used by HTTP/1.0 applications are fully defined + in Section 8. + +5.1.2 Request-URI + + The Request-URI is a Uniform Resource Identifier (Section 3.2) and + identifies the resource upon which to apply the request. + + Request-URI = absoluteURI | abs_path + + The two options for Request-URI are dependent on the nature of the + request. + + The absoluteURI form is only allowed when the request is being made + to a proxy. The proxy is requested to forward the request and return + the response. If the request is GET or HEAD and a prior response is + cached, the proxy may use the cached message if it passes any + restrictions in the Expires header field. Note that the proxy may + forward the request on to another proxy or directly to the server + specified by the absoluteURI. In order to avoid request loops, a + proxy must be able to recognize all of its server names, including + any aliases, local variations, and the numeric IP address. An example + Request-Line would be: + + GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.0 + + + + +Berners-Lee, et al Informational [Page 24] + +RFC 1945 HTTP/1.0 May 1996 + + + The most common form of Request-URI is that used to identify a + resource on an origin server or gateway. In this case, only the + absolute path of the URI is transmitted (see Section 3.2.1, + abs_path). 
For example, a client wishing to retrieve the resource + above directly from the origin server would create a TCP connection + to port 80 of the host "www.w3.org" and send the line: + + GET /pub/WWW/TheProject.html HTTP/1.0 + + followed by the remainder of the Full-Request. Note that the absolute + path cannot be empty; if none is present in the original URI, it must + be given as "/" (the server root). + + The Request-URI is transmitted as an encoded string, where some + characters may be escaped using the "% HEX HEX" encoding defined by + RFC 1738 [4]. The origin server must decode the Request-URI in order + to properly interpret the request. + +5.2 Request Header Fields + + The request header fields allow the client to pass additional + information about the request, and about the client itself, to the + server. These fields act as request modifiers, with semantics + equivalent to the parameters on a programming language method + (procedure) invocation. + + Request-Header = Authorization ; Section 10.2 + | From ; Section 10.8 + | If-Modified-Since ; Section 10.9 + | Referer ; Section 10.13 + | User-Agent ; Section 10.15 + + Request-Header field names can be extended reliably only in + combination with a change in the protocol version. However, new or + experimental header fields may be given the semantics of request + header fields if all parties in the communication recognize them to + be request header fields. Unrecognized header fields are treated as + Entity-Header fields. + +6. Response + + After receiving and interpreting a request message, a server responds + in the form of an HTTP response message. 
+ + Response = Simple-Response | Full-Response + + Simple-Response = [ Entity-Body ] + + + + +Berners-Lee, et al Informational [Page 25] + +RFC 1945 HTTP/1.0 May 1996 + + + Full-Response = Status-Line ; Section 6.1 + *( General-Header ; Section 4.3 + | Response-Header ; Section 6.2 + | Entity-Header ) ; Section 7.1 + CRLF + [ Entity-Body ] ; Section 7.2 + + A Simple-Response should only be sent in response to an HTTP/0.9 + Simple-Request or if the server only supports the more limited + HTTP/0.9 protocol. If a client sends an HTTP/1.0 Full-Request and + receives a response that does not begin with a Status-Line, it should + assume that the response is a Simple-Response and parse it + accordingly. Note that the Simple-Response consists only of the + entity body and is terminated by the server closing the connection. + +6.1 Status-Line + + The first line of a Full-Response message is the Status-Line, + consisting of the protocol version followed by a numeric status code + and its associated textual phrase, with each element separated by SP + characters. No CR or LF is allowed except in the final CRLF sequence. + + Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF + + Since a status line always begins with the protocol version and + status code + + "HTTP/" 1*DIGIT "." 1*DIGIT SP 3DIGIT SP + + (e.g., "HTTP/1.0 200 "), the presence of that expression is + sufficient to differentiate a Full-Response from a Simple-Response. + Although the Simple-Response format may allow such an expression to + occur at the beginning of an entity body, and thus cause a + misinterpretation of the message if it was given in response to a + Full-Request, most HTTP/0.9 servers are limited to responses of type + "text/html" and therefore would never generate such a response. + +6.1.1 Status Code and Reason Phrase + + The Status-Code element is a 3-digit integer result code of the + attempt to understand and satisfy the request. 
The Reason-Phrase is + intended to give a short textual description of the Status-Code. The + Status-Code is intended for use by automata and the Reason-Phrase is + intended for the human user. The client is not required to examine or + display the Reason-Phrase. + + + + + + +Berners-Lee, et al Informational [Page 26] + +RFC 1945 HTTP/1.0 May 1996 + + + The first digit of the Status-Code defines the class of response. The + last two digits do not have any categorization role. There are 5 + values for the first digit: + + o 1xx: Informational - Not used, but reserved for future use + + o 2xx: Success - The action was successfully received, + understood, and accepted. + + o 3xx: Redirection - Further action must be taken in order to + complete the request + + o 4xx: Client Error - The request contains bad syntax or cannot + be fulfilled + + o 5xx: Server Error - The server failed to fulfill an apparently + valid request + + The individual values of the numeric status codes defined for + HTTP/1.0, and an example set of corresponding Reason-Phrase's, are + presented below. The reason phrases listed here are only recommended + -- they may be replaced by local equivalents without affecting the + protocol. These codes are fully defined in Section 9. + + Status-Code = "200" ; OK + | "201" ; Created + | "202" ; Accepted + | "204" ; No Content + | "301" ; Moved Permanently + | "302" ; Moved Temporarily + | "304" ; Not Modified + | "400" ; Bad Request + | "401" ; Unauthorized + | "403" ; Forbidden + | "404" ; Not Found + | "500" ; Internal Server Error + | "501" ; Not Implemented + | "502" ; Bad Gateway + | "503" ; Service Unavailable + | extension-code + + extension-code = 3DIGIT + + Reason-Phrase = *<TEXT, excluding CR, LF> + + HTTP status codes are extensible, but the above codes are the only + ones generally recognized in current practice. 
HTTP applications are + not required to understand the meaning of all registered status + + + +Berners-Lee, et al Informational [Page 27] + +RFC 1945 HTTP/1.0 May 1996 + + + codes, though such understanding is obviously desirable. However, + applications must understand the class of any status code, as + indicated by the first digit, and treat any unrecognized response as + being equivalent to the x00 status code of that class, with the + exception that an unrecognized response must not be cached. For + example, if an unrecognized status code of 431 is received by the + client, it can safely assume that there was something wrong with its + request and treat the response as if it had received a 400 status + code. In such cases, user agents should present to the user the + entity returned with the response, since that entity is likely to + include human-readable information which will explain the unusual + status. + +6.2 Response Header Fields + + The response header fields allow the server to pass additional + information about the response which cannot be placed in the Status- + Line. These header fields give information about the server and about + further access to the resource identified by the Request-URI. + + Response-Header = Location ; Section 10.11 + | Server ; Section 10.14 + | WWW-Authenticate ; Section 10.16 + + Response-Header field names can be extended reliably only in + combination with a change in the protocol version. However, new or + experimental header fields may be given the semantics of response + header fields if all parties in the communication recognize them to + be response header fields. Unrecognized header fields are treated as + Entity-Header fields. + +7. Entity + + Full-Request and Full-Response messages may transfer an entity within + some requests and responses. An entity consists of Entity-Header + fields and (usually) an Entity-Body. 
In this section, both sender and + recipient refer to either the client or the server, depending on who + sends and who receives the entity. + + + + + + + + + + + + + +Berners-Lee, et al Informational [Page 28] + +RFC 1945 HTTP/1.0 May 1996 + + +7.1 Entity Header Fields + + Entity-Header fields define optional metainformation about the + Entity-Body or, if no body is present, about the resource identified + by the request. + + Entity-Header = Allow ; Section 10.1 + | Content-Encoding ; Section 10.3 + | Content-Length ; Section 10.4 + | Content-Type ; Section 10.5 + | Expires ; Section 10.7 + | Last-Modified ; Section 10.10 + | extension-header + + extension-header = HTTP-header + + The extension-header mechanism allows additional Entity-Header fields + to be defined without changing the protocol, but these fields cannot + be assumed to be recognizable by the recipient. Unrecognized header + fields should be ignored by the recipient and forwarded by proxies. + +7.2 Entity Body + + The entity body (if any) sent with an HTTP request or response is in + a format and encoding defined by the Entity-Header fields. + + Entity-Body = *OCTET + + An entity body is included with a request message only when the + request method calls for one. The presence of an entity body in a + request is signaled by the inclusion of a Content-Length header field + in the request message headers. HTTP/1.0 requests containing an + entity body must include a valid Content-Length header field. + + For response messages, whether or not an entity body is included with + a message is dependent on both the request method and the response + code. All responses to the HEAD request method must not include a + body, even though the presence of entity header fields may lead one + to believe they do. All 1xx (informational), 204 (no content), and + 304 (not modified) responses must not include a body. 
All other + responses must include an entity body or a Content-Length header + field defined with a value of zero (0). + +7.2.1 Type + + When an Entity-Body is included with a message, the data type of that + body is determined via the header fields Content-Type and Content- + Encoding. These define a two-layer, ordered encoding model: + + + +Berners-Lee, et al Informational [Page 29] + +RFC 1945 HTTP/1.0 May 1996 + + + entity-body := Content-Encoding( Content-Type( data ) ) + + A Content-Type specifies the media type of the underlying data. A + Content-Encoding may be used to indicate any additional content + coding applied to the type, usually for the purpose of data + compression, that is a property of the resource requested. The + default for the content encoding is none (i.e., the identity + function). + + Any HTTP/1.0 message containing an entity body should include a + Content-Type header field defining the media type of that body. If + and only if the media type is not given by a Content-Type header, as + is the case for Simple-Response messages, the recipient may attempt + to guess the media type via inspection of its content and/or the name + extension(s) of the URL used to identify the resource. If the media + type remains unknown, the recipient should treat it as type + "application/octet-stream". + +7.2.2 Length + + When an Entity-Body is included with a message, the length of that + body may be determined in one of two ways. If a Content-Length header + field is present, its value in bytes represents the length of the + Entity-Body. Otherwise, the body length is determined by the closing + of the connection by the server. + + Closing the connection cannot be used to indicate the end of a + request body, since it leaves no possibility for the server to send + back a response. Therefore, HTTP/1.0 requests containing an entity + body must include a valid Content-Length header field. 
If a request + contains an entity body and Content-Length is not specified, and the + server does not recognize or cannot calculate the length from other + fields, then the server should send a 400 (bad request) response. + + Note: Some older servers supply an invalid Content-Length when + sending a document that contains server-side includes dynamically + inserted into the data stream. It must be emphasized that this + will not be tolerated by future versions of HTTP. Unless the + client knows that it is receiving a response from a compliant + server, it should not depend on the Content-Length value being + correct. + +8. Method Definitions + + The set of common methods for HTTP/1.0 is defined below. Although + this set can be expanded, additional methods cannot be assumed to + share the same semantics for separately extended clients and servers. + + + + +Berners-Lee, et al Informational [Page 30] + +RFC 1945 HTTP/1.0 May 1996 + + +8.1 GET + + The GET method means retrieve whatever information (in the form of an + entity) is identified by the Request-URI. If the Request-URI refers + to a data-producing process, it is the produced data which shall be + returned as the entity in the response and not the source text of the + process, unless that text happens to be the output of the process. + + The semantics of the GET method changes to a "conditional GET" if the + request message includes an If-Modified-Since header field. A + conditional GET method requests that the identified resource be + transferred only if it has been modified since the date given by the + If-Modified-Since header, as described in Section 10.9. The + conditional GET method is intended to reduce network usage by + allowing cached entities to be refreshed without requiring multiple + requests or transferring unnecessary data. + +8.2 HEAD + + The HEAD method is identical to GET except that the server must not + return any Entity-Body in the response. 
The metainformation contained + in the HTTP headers in response to a HEAD request should be identical + to the information sent in response to a GET request. This method can + be used for obtaining metainformation about the resource identified + by the Request-URI without transferring the Entity-Body itself. This + method is often used for testing hypertext links for validity, + accessibility, and recent modification. + + There is no "conditional HEAD" request analogous to the conditional + GET. If an If-Modified-Since header field is included with a HEAD + request, it should be ignored. + +8.3 POST + + The POST method is used to request that the destination server accept + the entity enclosed in the request as a new subordinate of the + resource identified by the Request-URI in the Request-Line. POST is + designed to allow a uniform method to cover the following functions: + + o Annotation of existing resources; + + o Posting a message to a bulletin board, newsgroup, mailing list, + or similar group of articles; + + o Providing a block of data, such as the result of submitting a + form [3], to a data-handling process; + + o Extending a database through an append operation. + + + +Berners-Lee, et al Informational [Page 31] + +RFC 1945 HTTP/1.0 May 1996 + + + The actual function performed by the POST method is determined by the + server and is usually dependent on the Request-URI. The posted entity + is subordinate to that URI in the same way that a file is subordinate + to a directory containing it, a news article is subordinate to a + newsgroup to which it is posted, or a record is subordinate to a + database. + + A successful POST does not require that the entity be created as a + resource on the origin server or made accessible for future + reference. That is, the action performed by the POST method might not + result in a resource that can be identified by a URI. 
In this case, + either 200 (ok) or 204 (no content) is the appropriate response + status, depending on whether or not the response includes an entity + that describes the result. + + If a resource has been created on the origin server, the response + should be 201 (created) and contain an entity (preferably of type + "text/html") which describes the status of the request and refers to + the new resource. + + A valid Content-Length is required on all HTTP/1.0 POST requests. An + HTTP/1.0 server should respond with a 400 (bad request) message if it + cannot determine the length of the request message's content. + + Applications must not cache responses to a POST request because the + application has no way of knowing that the server would return an + equivalent response on some future request. + +9. Status Code Definitions + + Each Status-Code is described below, including a description of which + method(s) it can follow and any metainformation required in the + response. + +9.1 Informational 1xx + + This class of status code indicates a provisional response, + consisting only of the Status-Line and optional headers, and is + terminated by an empty line. HTTP/1.0 does not define any 1xx status + codes and they are not a valid response to a HTTP/1.0 request. + However, they may be useful for experimental applications which are + outside the scope of this specification. + +9.2 Successful 2xx + + This class of status code indicates that the client's request was + successfully received, understood, and accepted. + + + + +Berners-Lee, et al Informational [Page 32] + +RFC 1945 HTTP/1.0 May 1996 + + + 200 OK + + The request has succeeded. 
The information returned with the + response is dependent on the method used in the request, as follows: + + GET an entity corresponding to the requested resource is sent + in the response; + + HEAD the response must only contain the header information and + no Entity-Body; + + POST an entity describing or containing the result of the action. + + 201 Created + + The request has been fulfilled and resulted in a new resource being + created. The newly created resource can be referenced by the URI(s) + returned in the entity of the response. The origin server should + create the resource before using this Status-Code. If the action + cannot be carried out immediately, the server must include in the + response body a description of when the resource will be available; + otherwise, the server should respond with 202 (accepted). + + Of the methods defined by this specification, only POST can create a + resource. + + 202 Accepted + + The request has been accepted for processing, but the processing + has not been completed. The request may or may not eventually be + acted upon, as it may be disallowed when processing actually takes + place. There is no facility for re-sending a status code from an + asynchronous operation such as this. + + The 202 response is intentionally non-committal. Its purpose is to + allow a server to accept a request for some other process (perhaps + a batch-oriented process that is only run once per day) without + requiring that the user agent's connection to the server persist + until the process is completed. The entity returned with this + response should include an indication of the request's current + status and either a pointer to a status monitor or some estimate of + when the user can expect the request to be fulfilled. + + 204 No Content + + The server has fulfilled the request but there is no new + information to send back. 
If the client is a user agent, it should + not change its document view from that which caused the request to + + + +Berners-Lee, et al Informational [Page 33] + +RFC 1945 HTTP/1.0 May 1996 + + + be generated. This response is primarily intended to allow input + for scripts or other actions to take place without causing a change + to the user agent's active document view. The response may include + new metainformation in the form of entity headers, which should + apply to the document currently in the user agent's active view. + +9.3 Redirection 3xx + + This class of status code indicates that further action needs to be + taken by the user agent in order to fulfill the request. The action + required may be carried out by the user agent without interaction + with the user if and only if the method used in the subsequent + request is GET or HEAD. A user agent should never automatically + redirect a request more than 5 times, since such redirections usually + indicate an infinite loop. + + 300 Multiple Choices + + This response code is not directly used by HTTP/1.0 applications, + but serves as the default for interpreting the 3xx class of + responses. + + The requested resource is available at one or more locations. + Unless it was a HEAD request, the response should include an entity + containing a list of resource characteristics and locations from + which the user or user agent can choose the one most appropriate. + If the server has a preferred choice, it should include the URL in + a Location field; user agents may use this field value for + automatic redirection. + + 301 Moved Permanently + + The requested resource has been assigned a new permanent URL and + any future references to this resource should be done using that + URL. Clients with link editing capabilities should automatically + relink references to the Request-URI to the new reference returned + by the server, where possible. + + The new URL must be given by the Location field in the response. 
+ Unless it was a HEAD request, the Entity-Body of the response + should contain a short note with a hyperlink to the new URL. + + If the 301 status code is received in response to a request using + the POST method, the user agent must not automatically redirect the + request unless it can be confirmed by the user, since this might + change the conditions under which the request was issued. + + + + + +Berners-Lee, et al Informational [Page 34] + +RFC 1945 HTTP/1.0 May 1996 + + + Note: When automatically redirecting a POST request after + receiving a 301 status code, some existing user agents will + erroneously change it into a GET request. + + 302 Moved Temporarily + + The requested resource resides temporarily under a different URL. + Since the redirection may be altered on occasion, the client should + continue to use the Request-URI for future requests. + + The URL must be given by the Location field in the response. Unless + it was a HEAD request, the Entity-Body of the response should + contain a short note with a hyperlink to the new URI(s). + + If the 302 status code is received in response to a request using + the POST method, the user agent must not automatically redirect the + request unless it can be confirmed by the user, since this might + change the conditions under which the request was issued. + + Note: When automatically redirecting a POST request after + receiving a 302 status code, some existing user agents will + erroneously change it into a GET request. + + 304 Not Modified + + If the client has performed a conditional GET request and access is + allowed, but the document has not been modified since the date and + time specified in the If-Modified-Since field, the server must + respond with this status code and not send an Entity-Body to the + client. Header fields contained in the response should only include + information which is relevant to cache managers or which may have + changed independently of the entity's Last-Modified date. 
Examples + of relevant header fields include: Date, Server, and Expires. A + cache should update its cached entity to reflect any new field + values given in the 304 response. + +9.4 Client Error 4xx + + The 4xx class of status code is intended for cases in which the + client seems to have erred. If the client has not completed the + request when a 4xx code is received, it should immediately cease + sending data to the server. Except when responding to a HEAD request, + the server should include an entity containing an explanation of the + error situation, and whether it is a temporary or permanent + condition. These status codes are applicable to any request method. + + + + + + +Berners-Lee, et al Informational [Page 35] + +RFC 1945 HTTP/1.0 May 1996 + + + Note: If the client is sending data, server implementations on TCP + should be careful to ensure that the client acknowledges receipt + of the packet(s) containing the response prior to closing the + input connection. If the client continues sending data to the + server after the close, the server's controller will send a reset + packet to the client, which may erase the client's unacknowledged + input buffers before they can be read and interpreted by the HTTP + application. + + 400 Bad Request + + The request could not be understood by the server due to malformed + syntax. The client should not repeat the request without + modifications. + + 401 Unauthorized + + The request requires user authentication. The response must include + a WWW-Authenticate header field (Section 10.16) containing a + challenge applicable to the requested resource. The client may + repeat the request with a suitable Authorization header field + (Section 10.2). If the request already included Authorization + credentials, then the 401 response indicates that authorization has + been refused for those credentials. 
If the 401 response contains + the same challenge as the prior response, and the user agent has + already attempted authentication at least once, then the user + should be presented the entity that was given in the response, + since that entity may include relevant diagnostic information. HTTP + access authentication is explained in Section 11. + + 403 Forbidden + + The server understood the request, but is refusing to fulfill it. + Authorization will not help and the request should not be repeated. + If the request method was not HEAD and the server wishes to make + public why the request has not been fulfilled, it should describe + the reason for the refusal in the entity body. This status code is + commonly used when the server does not wish to reveal exactly why + the request has been refused, or when no other response is + applicable. + + 404 Not Found + + The server has not found anything matching the Request-URI. No + indication is given of whether the condition is temporary or + permanent. If the server does not wish to make this information + available to the client, the status code 403 (forbidden) can be + used instead. + + + +Berners-Lee, et al Informational [Page 36] + +RFC 1945 HTTP/1.0 May 1996 + + +9.5 Server Error 5xx + + Response status codes beginning with the digit "5" indicate cases in + which the server is aware that it has erred or is incapable of + performing the request. If the client has not completed the request + when a 5xx code is received, it should immediately cease sending data + to the server. Except when responding to a HEAD request, the server + should include an entity containing an explanation of the error + situation, and whether it is a temporary or permanent condition. + These response codes are applicable to any request method and there + are no required header fields. + + 500 Internal Server Error + + The server encountered an unexpected condition which prevented it + from fulfilling the request. 
+ + 501 Not Implemented + + The server does not support the functionality required to fulfill + the request. This is the appropriate response when the server does + not recognize the request method and is not capable of supporting + it for any resource. + + 502 Bad Gateway + + The server, while acting as a gateway or proxy, received an invalid + response from the upstream server it accessed in attempting to + fulfill the request. + + 503 Service Unavailable + + The server is currently unable to handle the request due to a + temporary overloading or maintenance of the server. The implication + is that this is a temporary condition which will be alleviated + after some delay. + + Note: The existence of the 503 status code does not imply + that a server must use it when becoming overloaded. Some + servers may wish to simply refuse the connection. + +10. Header Field Definitions + + This section defines the syntax and semantics of all commonly used + HTTP/1.0 header fields. For general and entity header fields, both + sender and recipient refer to either the client or the server, + depending on who sends and who receives the message. + + + + +Berners-Lee, et al Informational [Page 37] + +RFC 1945 HTTP/1.0 May 1996 + + +10.1 Allow + + The Allow entity-header field lists the set of methods supported by + the resource identified by the Request-URI. The purpose of this field + is strictly to inform the recipient of valid methods associated with + the resource. The Allow header field is not permitted in a request + using the POST method, and thus should be ignored if it is received + as part of a POST entity. + + Allow = "Allow" ":" 1#method + + Example of use: + + Allow: GET, HEAD + + This field cannot prevent a client from trying other methods. + However, the indications given by the Allow header field value should + be followed. The actual set of allowed methods is defined by the + origin server at the time of each request. 
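   As an illustration only, the "1#method" list syntax of the Allow field
   can be read with a few lines of Python. This is a minimal sketch, not
   part of the specification; the function name parse_allow is
   hypothetical, and the sketch assumes the header has already been
   unfolded into a single line.

```python
from typing import List


def parse_allow(header_line: str) -> List[str]:
    """Return the method tokens listed in an 'Allow: ...' header line.

    Hypothetical helper for illustration; assumes an unfolded header line.
    """
    name, sep, value = header_line.partition(":")
    # Field names are case-insensitive, so compare in lower case.
    if not sep or name.strip().lower() != "allow":
        raise ValueError("not an Allow header field")
    # Method tokens are case-sensitive, so they are kept exactly as sent.
    # Empty list elements (e.g. "GET,,HEAD") are skipped.
    methods = [item.strip() for item in value.split(",") if item.strip()]
    if not methods:
        raise ValueError("Allow requires at least one method")
    return methods


print(parse_allow("Allow: GET, HEAD"))
```

   A client could consult the returned list before choosing a method,
   bearing in mind that the field is informative and the origin server
   decides the allowed set at the time of each request.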
+ + A proxy must not modify the Allow header field even if it does not + understand all the methods specified, since the user agent may have + other means of communicating with the origin server. + + The Allow header field does not indicate what methods are implemented + by the server. + +10.2 Authorization + + A user agent that wishes to authenticate itself with a server-- + usually, but not necessarily, after receiving a 401 response--may do + so by including an Authorization request-header field with the + request. The Authorization field value consists of credentials + containing the authentication information of the user agent for the + realm of the resource being requested. + + Authorization = "Authorization" ":" credentials + + HTTP access authentication is described in Section 11. If a request + is authenticated and a realm specified, the same credentials should + be valid for all other requests within this realm. + + Responses to requests containing an Authorization field are not + cachable. + + + + + + + +Berners-Lee, et al Informational [Page 38] + +RFC 1945 HTTP/1.0 May 1996 + + +10.3 Content-Encoding + + The Content-Encoding entity-header field is used as a modifier to the + media-type. When present, its value indicates what additional content + coding has been applied to the resource, and thus what decoding + mechanism must be applied in order to obtain the media-type + referenced by the Content-Type header field. The Content-Encoding is + primarily used to allow a document to be compressed without losing + the identity of its underlying media type. + + Content-Encoding = "Content-Encoding" ":" content-coding + + Content codings are defined in Section 3.5. An example of its use is + + Content-Encoding: x-gzip + + The Content-Encoding is a characteristic of the resource identified + by the Request-URI. Typically, the resource is stored with this + encoding and is only decoded before rendering or analogous usage. 
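   The decoding step described above can be sketched in Python. This is
   an illustrative sketch under stated assumptions, not a normative
   procedure: the function name decode_entity_body is hypothetical, only
   the gzip coding is handled (using the standard gzip module), and
   "gzip" is treated as equivalent to "x-gzip".

```python
import gzip
from typing import Optional


def decode_entity_body(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Undo the content coding so the bytes match the Content-Type.

    Hypothetical helper for illustration; supports only (x-)gzip.
    """
    if content_encoding is None:
        # No Content-Encoding field: the default coding is the identity.
        return body
    coding = content_encoding.strip().lower()
    if coding in ("x-gzip", "gzip"):
        return gzip.decompress(body)
    # Other codings (for example x-compress) are outside this sketch.
    raise ValueError("unsupported content coding: " + coding)


original = b"<html><body>hello</body></html>"
encoded = gzip.compress(original)
assert decode_entity_body(encoded, "x-gzip") == original
```

   The result of the call is the data in the media type named by the
   Content-Type header field, ready for rendering or analogous usage.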
+ +10.4 Content-Length + + The Content-Length entity-header field indicates the size of the + Entity-Body, in decimal number of octets, sent to the recipient or, + in the case of the HEAD method, the size of the Entity-Body that + would have been sent had the request been a GET. + + Content-Length = "Content-Length" ":" 1*DIGIT + + An example is + + Content-Length: 3495 + + Applications should use this field to indicate the size of the + Entity-Body to be transferred, regardless of the media type of the + entity. A valid Content-Length field value is required on all + HTTP/1.0 request messages containing an entity body. + + Any Content-Length greater than or equal to zero is a valid value. + Section 7.2.2 describes how to determine the length of a response + entity body if a Content-Length is not given. + + Note: The meaning of this field is significantly different from + the corresponding definition in MIME, where it is an optional + field used within the "message/external-body" content-type. In + HTTP, it should be used whenever the entity's length can be + determined prior to being transferred. + + + + +Berners-Lee, et al Informational [Page 39] + +RFC 1945 HTTP/1.0 May 1996 + + +10.5 Content-Type + + The Content-Type entity-header field indicates the media type of the + Entity-Body sent to the recipient or, in the case of the HEAD method, + the media type that would have been sent had the request been a GET. + + Content-Type = "Content-Type" ":" media-type + + Media types are defined in Section 3.6. An example of the field is + + Content-Type: text/html + + Further discussion of methods for identifying the media type of an + entity is provided in Section 7.2.1. + +10.6 Date + + The Date general-header field represents the date and time at which + the message was originated, having the same semantics as orig-date in + RFC 822. The field value is an HTTP-date, as described in Section + 3.3. 
+ + Date = "Date" ":" HTTP-date + + An example is + + Date: Tue, 15 Nov 1994 08:12:31 GMT + + If a message is received via direct connection with the user agent + (in the case of requests) or the origin server (in the case of + responses), then the date can be assumed to be the current date at + the receiving end. However, since the date--as it is believed by the + origin--is important for evaluating cached responses, origin servers + should always include a Date header. Clients should only send a Date + header field in messages that include an entity body, as in the case + of the POST request, and even then it is optional. A received message + which does not have a Date header field should be assigned one by the + recipient if the message will be cached by that recipient or + gatewayed via a protocol which requires a Date. + + In theory, the date should represent the moment just before the + entity is generated. In practice, the date can be generated at any + time during the message origination without affecting its semantic + value. + + Note: An earlier version of this document incorrectly specified + that this field should contain the creation date of the enclosed + Entity-Body. This has been changed to reflect actual (and proper) + + + +Berners-Lee, et al Informational [Page 40] + +RFC 1945 HTTP/1.0 May 1996 + + + usage. + +10.7 Expires + + The Expires entity-header field gives the date/time after which the + entity should be considered stale. This allows information providers + to suggest the volatility of the resource, or a date after which the + information may no longer be valid. Applications must not cache this + entity beyond the date given. The presence of an Expires field does + not imply that the original resource will change or cease to exist + at, before, or after that time. However, information providers that + know or even suspect that a resource will change by a certain date + should include an Expires header with that date. 
The format is an + absolute date and time as defined by HTTP-date in Section 3.3. + + Expires = "Expires" ":" HTTP-date + + An example of its use is + + Expires: Thu, 01 Dec 1994 16:00:00 GMT + + If the date given is equal to or earlier than the value of the Date + header, the recipient must not cache the enclosed entity. If a + resource is dynamic by nature, as is the case with many data- + producing processes, entities from that resource should be given an + appropriate Expires value which reflects that dynamism. + + The Expires field cannot be used to force a user agent to refresh its + display or reload a resource; its semantics apply only to caching + mechanisms, and such mechanisms need only check a resource's + expiration status when a new request for that resource is initiated. + + User agents often have history mechanisms, such as "Back" buttons and + history lists, which can be used to redisplay an entity retrieved + earlier in a session. By default, the Expires field does not apply to + history mechanisms. If the entity is still in storage, a history + mechanism should display it even if the entity has expired, unless + the user has specifically configured the agent to refresh expired + history documents. + + Note: Applications are encouraged to be tolerant of bad or + misinformed implementations of the Expires header. A value of zero + (0) or an invalid date format should be considered equivalent to + an "expires immediately." Although these values are not legitimate + for HTTP/1.0, a robust implementation is always desirable. + + + + + + +Berners-Lee, et al Informational [Page 41] + +RFC 1945 HTTP/1.0 May 1996 + + +10.8 From + + The From request-header field, if given, should contain an Internet + e-mail address for the human user who controls the requesting user + agent. 
The address should be machine-usable, as defined by mailbox in + RFC 822 [7] (as updated by RFC 1123 [6]): + + From = "From" ":" mailbox + + An example is: + + From: webmaster@w3.org + + This header field may be used for logging purposes and as a means for + identifying the source of invalid or unwanted requests. It should not + be used as an insecure form of access protection. The interpretation + of this field is that the request is being performed on behalf of the + person given, who accepts responsibility for the method performed. In + particular, robot agents should include this header so that the + person responsible for running the robot can be contacted if problems + occur on the receiving end. + + The Internet e-mail address in this field may be separate from the + Internet host which issued the request. For example, when a request + is passed through a proxy, the original issuer's address should be + used. + + Note: The client should not send the From header field without the + user's approval, as it may conflict with the user's privacy + interests or their site's security policy. It is strongly + recommended that the user be able to disable, enable, and modify + the value of this field at any time prior to a request. + +10.9 If-Modified-Since + + The If-Modified-Since request-header field is used with the GET + method to make it conditional: if the requested resource has not been + modified since the time specified in this field, a copy of the + resource will not be returned from the server; instead, a 304 (not + modified) response will be returned without any Entity-Body. 
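A minimal Python sketch of this decision, assuming the request would otherwise produce a 200 response, is given below. The function name conditional_get_status is hypothetical. Dates are parsed with email.utils.parsedate_to_datetime from the Python standard library, which accepts the RFC 1123 form shown in this section but is not guaranteed to accept every HTTP-date form of Section 3.3. A date that cannot be parsed, or that lies later than the server's current time, is treated as invalid and the request is handled as a normal GET.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def conditional_get_status(last_modified, if_modified_since, now):
    """Return 304 if the resource is unmodified since the given date, else 200.

    last_modified and now are timezone-aware datetimes; if_modified_since
    is the raw field value, or None when the header field is absent.
    """
    if if_modified_since is None:
        return 200
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # Unparsable date: behave exactly as for a normal GET.
        return 200
    if since.tzinfo is None:
        # Assumption of this sketch: a date without a zone is taken as GMT.
        since = since.replace(tzinfo=timezone.utc)
    if since > now:
        # A date later than the server's current time is invalid.
        return 200
    return 304 if last_modified <= since else 200
```

A server returning 304 sends no Entity-Body; a server returning 200 sends the full response as usual.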
+ + If-Modified-Since = "If-Modified-Since" ":" HTTP-date + + An example of the field is: + + If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT + + + + + +Berners-Lee, et al Informational [Page 42] + +RFC 1945 HTTP/1.0 May 1996 + + + A conditional GET method requests that the identified resource be + transferred only if it has been modified since the date given by the + If-Modified-Since header. The algorithm for determining this includes + the following cases: + + a) If the request would normally result in anything other than + a 200 (ok) status, or if the passed If-Modified-Since date + is invalid, the response is exactly the same as for a + normal GET. A date which is later than the server's current + time is invalid. + + b) If the resource has been modified since the + If-Modified-Since date, the response is exactly the same as + for a normal GET. + + c) If the resource has not been modified since a valid + If-Modified-Since date, the server shall return a 304 (not + modified) response. + + The purpose of this feature is to allow efficient updates of cached + information with a minimum amount of transaction overhead. + +10.10 Last-Modified + + The Last-Modified entity-header field indicates the date and time at + which the sender believes the resource was last modified. The exact + semantics of this field are defined in terms of how the recipient + should interpret it: if the recipient has a copy of this resource + which is older than the date given by the Last-Modified field, that + copy should be considered stale. + + Last-Modified = "Last-Modified" ":" HTTP-date + + An example of its use is + + Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT + + The exact meaning of this header field depends on the implementation + of the sender and the nature of the original resource. For files, it + may be just the file system last-modified time. For entities with + dynamically included parts, it may be the most recent of the set of + last-modify times for its component parts. 
For database gateways, it + may be the last-update timestamp of the record. For virtual objects, + it may be the last time the internal state changed. + + An origin server must not send a Last-Modified date which is later + than the server's time of message origination. In such cases, where + the resource's last modification would indicate some time in the + + + +Berners-Lee, et al Informational [Page 43] + +RFC 1945 HTTP/1.0 May 1996 + + + future, the server must replace that date with the message + origination date. + +10.11 Location + + The Location response-header field defines the exact location of the + resource that was identified by the Request-URI. For 3xx responses, + the location must indicate the server's preferred URL for automatic + redirection to the resource. Only one absolute URL is allowed. + + Location = "Location" ":" absoluteURI + + An example is + + Location: http://www.w3.org/hypertext/WWW/NewLocation.html + +10.12 Pragma + + The Pragma general-header field is used to include implementation- + specific directives that may apply to any recipient along the + request/response chain. All pragma directives specify optional + behavior from the viewpoint of the protocol; however, some systems + may require that behavior be consistent with the directives. + + Pragma = "Pragma" ":" 1#pragma-directive + + pragma-directive = "no-cache" | extension-pragma + extension-pragma = token [ "=" word ] + + When the "no-cache" directive is present in a request message, an + application should forward the request toward the origin server even + if it has a cached copy of what is being requested. This allows a + client to insist upon receiving an authoritative response to its + request. It also allows a client to refresh a cached copy which is + known to be corrupted or stale. 
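The "no-cache" check a caching application performs can be sketched in Python as follows. The function names are hypothetical. Two simplifications are assumed: the field value is split on every comma, so an extension-pragma whose value is a quoted-string containing a comma is not handled, and the directive name is compared case-insensitively as a tolerance measure that the text above does not itself require.

```python
def pragma_directives(field_value):
    """Split a Pragma field value into its comma-separated directives."""
    return [d.strip() for d in field_value.split(",") if d.strip()]


def must_forward_to_origin(pragma_value):
    """Return True when the request carries the "no-cache" directive.

    pragma_value is the raw Pragma field value, or None when the header
    field is absent from the request.
    """
    if pragma_value is None:
        return False
    return any(d.lower() == "no-cache" for d in pragma_directives(pragma_value))
```

When the function returns True, the application forwards the request toward the origin server even if it holds a cached copy.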
+ + Pragma directives must be passed through by a proxy or gateway + application, regardless of their significance to that application, + since the directives may be applicable to all recipients along the + request/response chain. It is not possible to specify a pragma for a + specific recipient; however, any pragma directive not relevant to a + recipient should be ignored by that recipient. + +10.13 Referer + + The Referer request-header field allows the client to specify, for + the server's benefit, the address (URI) of the resource from which + the Request-URI was obtained. This allows a server to generate lists + + + +Berners-Lee, et al Informational [Page 44] + +RFC 1945 HTTP/1.0 May 1996 + + + of back-links to resources for interest, logging, optimized caching, + etc. It also allows obsolete or mistyped links to be traced for + maintenance. The Referer field must not be sent if the Request-URI + was obtained from a source that does not have its own URI, such as + input from the user keyboard. + + Referer = "Referer" ":" ( absoluteURI | relativeURI ) + + Example: + + Referer: http://www.w3.org/hypertext/DataSources/Overview.html + + If a partial URI is given, it should be interpreted relative to the + Request-URI. The URI must not include a fragment. + + Note: Because the source of a link may be private information or + may reveal an otherwise private information source, it is strongly + recommended that the user be able to select whether or not the + Referer field is sent. For example, a browser client could have a + toggle switch for browsing openly/anonymously, which would + respectively enable/disable the sending of Referer and From + information. + +10.14 Server + + The Server response-header field contains information about the + software used by the origin server to handle the request. The field + can contain multiple product tokens (Section 3.7) and comments + identifying the server and any significant subproducts. 
By + convention, the product tokens are listed in order of their + significance for identifying the application. + + Server = "Server" ":" 1*( product | comment ) + + Example: + + Server: CERN/3.0 libwww/2.17 + + If the response is being forwarded through a proxy, the proxy + application must not add its data to the product list. + + Note: Revealing the specific software version of the server may + allow the server machine to become more vulnerable to attacks + against software that is known to contain security holes. Server + implementors are encouraged to make this field a configurable + option. + + + + + +Berners-Lee, et al Informational [Page 45] + +RFC 1945 HTTP/1.0 May 1996 + + + Note: Some existing servers fail to restrict themselves to the + product token syntax within the Server field. + +10.15 User-Agent + + The User-Agent request-header field contains information about the + user agent originating the request. This is for statistical purposes, + the tracing of protocol violations, and automated recognition of user + agents for the sake of tailoring responses to avoid particular user + agent limitations. Although it is not required, user agents should + include this field with requests. The field can contain multiple + product tokens (Section 3.7) and comments identifying the agent and + any subproducts which form a significant part of the user agent. By + convention, the product tokens are listed in order of their + significance for identifying the application. + + User-Agent = "User-Agent" ":" 1*( product | comment ) + + Example: + + User-Agent: CERN-LineMode/2.15 libwww/2.17b3 + + Note: Some current proxy applications append their product + information to the list in the User-Agent field. This is not + recommended, since it makes machine interpretation of these + fields ambiguous. + + Note: Some existing clients fail to restrict themselves to + the product token syntax within the User-Agent field. 
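For illustration, a Python sketch that extracts the product tokens from a Server or User-Agent field value is given below. The function name parse_products is hypothetical. The sketch is a simplification: comments are skipped, nested comments are not handled, and the characters permitted in a token are not validated against Section 2.2.

```python
import re

# Either a parenthesised comment, or a product with an optional
# "/" product-version part. Nested comments are not supported.
_ITEM = re.compile(r"\(([^()]*)\)|([^\s()/]+)(?:/([^\s()]+))?")


def parse_products(field_value):
    """Return (name, version) pairs for each product token, in order.

    version is None when the product carries no "/" product-version part.
    """
    products = []
    for _comment, name, version in _ITEM.findall(field_value):
        if name:
            products.append((name, version or None))
    return products


if __name__ == "__main__":
    print(parse_products("CERN-LineMode/2.15 libwww/2.17b3"))
```

Because the tokens are listed in order of significance, the first pair returned is normally the one that identifies the application.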
+ +10.16 WWW-Authenticate + + The WWW-Authenticate response-header field must be included in 401 + (unauthorized) response messages. The field value consists of at + least one challenge that indicates the authentication scheme(s) and + parameters applicable to the Request-URI. + + WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge + + The HTTP access authentication process is described in Section 11. + User agents must take special care in parsing the WWW-Authenticate + field value if it contains more than one challenge, or if more than + one WWW-Authenticate header field is provided, since the contents of + a challenge may itself contain a comma-separated list of + authentication parameters. + + + + + + +Berners-Lee, et al Informational [Page 46] + +RFC 1945 HTTP/1.0 May 1996 + + +11. Access Authentication + + HTTP provides a simple challenge-response authentication mechanism + which may be used by a server to challenge a client request and by a + client to provide authentication information. It uses an extensible, + case-insensitive token to identify the authentication scheme, + followed by a comma-separated list of attribute-value pairs which + carry the parameters necessary for achieving authentication via that + scheme. + + auth-scheme = token + + auth-param = token "=" quoted-string + + The 401 (unauthorized) response message is used by an origin server + to challenge the authorization of a user agent. This response must + include a WWW-Authenticate header field containing at least one + challenge applicable to the requested resource. + + challenge = auth-scheme 1*SP realm *( "," auth-param ) + + realm = "realm" "=" realm-value + realm-value = quoted-string + + The realm attribute (case-insensitive) is required for all + authentication schemes which issue a challenge. The realm value + (case-sensitive), in combination with the canonical root URL of the + server being accessed, defines the protection space. 
These realms + allow the protected resources on a server to be partitioned into a + set of protection spaces, each with its own authentication scheme + and/or authorization database. The realm value is a string, generally + assigned by the origin server, which may have additional semantics + specific to the authentication scheme. + + A user agent that wishes to authenticate itself with a server-- + usually, but not necessarily, after receiving a 401 response--may do + so by including an Authorization header field with the request. The + Authorization field value consists of credentials containing the + authentication information of the user agent for the realm of the + resource being requested. + + credentials = basic-credentials + | ( auth-scheme #auth-param ) + + The domain over which credentials can be automatically applied by a + user agent is determined by the protection space. If a prior request + has been authorized, the same credentials may be reused for all other + requests within that protection space for a period of time determined + + + +Berners-Lee, et al Informational [Page 47] + +RFC 1945 HTTP/1.0 May 1996 + + + by the authentication scheme, parameters, and/or user preference. + Unless otherwise defined by the authentication scheme, a single + protection space cannot extend outside the scope of its server. + + If the server does not wish to accept the credentials sent with a + request, it should return a 403 (forbidden) response. + + The HTTP protocol does not restrict applications to this simple + challenge-response mechanism for access authentication. Additional + mechanisms may be used, such as encryption at the transport level or + via message encapsulation, and with additional header fields + specifying authentication information. However, these additional + mechanisms are not defined by this specification. + + Proxies must be completely transparent regarding user agent + authentication. 
That is, they must forward the WWW-Authenticate and + Authorization headers untouched, and must not cache the response to a + request containing Authorization. HTTP/1.0 does not provide a means + for a client to be authenticated with a proxy. + +11.1 Basic Authentication Scheme + + The "basic" authentication scheme is based on the model that the user + agent must authenticate itself with a user-ID and a password for each + realm. The realm value should be considered an opaque string which + can only be compared for equality with other realms on that server. + The server will authorize the request only if it can validate the + user-ID and password for the protection space of the Request-URI. + There are no optional authentication parameters. + + Upon receipt of an unauthorized request for a URI within the + protection space, the server should respond with a challenge like the + following: + + WWW-Authenticate: Basic realm="WallyWorld" + + where "WallyWorld" is the string assigned by the server to identify + the protection space of the Request-URI. + + To receive authorization, the client sends the user-ID and password, + separated by a single colon (":") character, within a base64 [5] + encoded string in the credentials. + + basic-credentials = "Basic" SP basic-cookie + + basic-cookie = <base64 [5] encoding of userid-password, + except not limited to 76 char/line> + + + + +Berners-Lee, et al Informational [Page 48] + +RFC 1945 HTTP/1.0 May 1996 + + + userid-password = [ token ] ":" *TEXT + + If the user agent wishes to send the user-ID "Aladdin" and password + "open sesame", it would use the following header field: + + Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ== + + The basic authentication scheme is a non-secure method of filtering + unauthorized access to resources on an HTTP server. It is based on + the assumption that the connection between the client and the server + can be regarded as a trusted carrier. As this is not generally true + on an open network, the basic authentication scheme should be used + accordingly.
In spite of this, clients should implement the scheme in + order to communicate with servers that use it. + +12. Security Considerations + + This section is meant to inform application developers, information + providers, and users of the security limitations in HTTP/1.0 as + described by this document. The discussion does not include + definitive solutions to the problems revealed, though it does make + some suggestions for reducing security risks. + +12.1 Authentication of Clients + + As mentioned in Section 11.1, the Basic authentication scheme is not + a secure method of user authentication, nor does it prevent the + Entity-Body from being transmitted in clear text across the physical + network used as the carrier. HTTP/1.0 does not prevent additional + authentication schemes and encryption mechanisms from being employed + to increase security. + +12.2 Safe Methods + + The writers of client software should be aware that the software + represents the user in their interactions over the Internet, and + should be careful to allow the user to be aware of any actions they + may take which may have an unexpected significance to themselves or + others. + + In particular, the convention has been established that the GET and + HEAD methods should never have the significance of taking an action + other than retrieval. These methods should be considered "safe." This + allows user agents to represent other methods, such as POST, in a + special way, so that the user is made aware of the fact that a + possibly unsafe action is being requested. + + + + + +Berners-Lee, et al Informational [Page 49] + +RFC 1945 HTTP/1.0 May 1996 + + + Naturally, it is not possible to ensure that the server does not + generate side-effects as a result of performing a GET request; in + fact, some dynamic resources consider that a feature. The important + distinction here is that the user did not request the side-effects, + so therefore cannot be held accountable for them. 
+ +12.3 Abuse of Server Log Information + + A server is in the position to save personal data about a user's + requests which may identify their reading patterns or subjects of + interest. This information is clearly confidential in nature and its + handling may be constrained by law in certain countries. People using + the HTTP protocol to provide data are responsible for ensuring that + such material is not distributed without the permission of any + individuals that are identifiable by the published results. + +12.4 Transfer of Sensitive Information + + Like any generic data transfer protocol, HTTP cannot regulate the + content of the data that is transferred, nor is there any a priori + method of determining the sensitivity of any particular piece of + information within the context of any given request. Therefore, + applications should supply as much control over this information as + possible to the provider of that information. Three header fields are + worth special mention in this context: Server, Referer and From. + + Revealing the specific software version of the server may allow the + server machine to become more vulnerable to attacks against software + that is known to contain security holes. Implementors should make the + Server header field a configurable option. + + The Referer field allows reading patterns to be studied and reverse + links drawn. Although it can be very useful, its power can be abused + if user details are not separated from the information contained in + the Referer. Even when the personal information has been removed, the + Referer field may indicate a private document's URI whose publication + would be inappropriate. + + The information sent in the From field might conflict with the user's + privacy interests or their site's security policy, and hence it + should not be transmitted without the user being able to disable, + enable, and modify the contents of the field. 
The user must be able + to set the contents of this field within a user preference or + application defaults configuration. + + We suggest, though do not require, that a convenient toggle interface + be provided for the user to enable or disable the sending of From and + Referer information. + + + +Berners-Lee, et al Informational [Page 50] + +RFC 1945 HTTP/1.0 May 1996 + + +12.5 Attacks Based On File and Path Names + + Implementations of HTTP origin servers should be careful to restrict + the documents returned by HTTP requests to be only those that were + intended by the server administrators. If an HTTP server translates + HTTP URIs directly into file system calls, the server must take + special care not to serve files that were not intended to be + delivered to HTTP clients. For example, Unix, Microsoft Windows, and + other operating systems use ".." as a path component to indicate a + directory level above the current one. On such a system, an HTTP + server must disallow any such construct in the Request-URI if it + would otherwise allow access to a resource outside those intended to + be accessible via the HTTP server. Similarly, files intended for + reference only internally to the server (such as access control + files, configuration files, and script code) must be protected from + inappropriate retrieval, since they might contain sensitive + information. Experience has shown that minor bugs in such HTTP server + implementations have turned into security risks. + +13. Acknowledgments + + This specification makes heavy use of the augmented BNF and generic + constructs defined by David H. Crocker for RFC 822 [7]. Similarly, it + reuses many of the definitions provided by Nathaniel Borenstein and + Ned Freed for MIME [5]. We hope that their inclusion in this + specification will help reduce past confusion over the relationship + between HTTP/1.0 and Internet mail message formats. + + The HTTP protocol has evolved considerably over the past four years. 
+ It has benefited from a large and active developer community--the + many people who have participated on the www-talk mailing list--and + it is that community which has been most responsible for the success + of HTTP and of the World-Wide Web in general. Marc Andreessen, Robert + Cailliau, Daniel W. Connolly, Bob Denny, Jean-Francois Groff, Phillip + M. Hallam-Baker, Hakon W. Lie, Ari Luotonen, Rob McCool, Lou + Montulli, Dave Raggett, Tony Sanders, and Marc VanHeyningen deserve + special recognition for their efforts in defining aspects of the + protocol for early versions of this specification. + + Paul Hoffman contributed sections regarding the informational status + of this document and Appendices C and D. + + + + + + + + + + +Berners-Lee, et al Informational [Page 51] + +RFC 1945 HTTP/1.0 May 1996 + + + This document has benefited greatly from the comments of all those + participating in the HTTP-WG. In addition to those already mentioned, + the following individuals have contributed to this specification: + + Gary Adams Harald Tveit Alvestrand + Keith Ball Brian Behlendorf + Paul Burchard Maurizio Codogno + Mike Cowlishaw Roman Czyborra + Michael A. Dolan John Franks + Jim Gettys Marc Hedlund + Koen Holtman Alex Hopmann + Bob Jernigan Shel Kaphan + Martijn Koster Dave Kristol + Daniel LaLiberte Paul Leach + Albert Lunde John C. Mallery + Larry Masinter Mitra + Jeffrey Mogul Gavin Nicol + Bill Perry Jeffrey Perry + Owen Rees Luigi Rizzo + David Robinson Marc Salomon + Rich Salz Jim Seidman + Chuck Shotton Eric W. Sink + Simon E. Spero Robert S. Thau + Francois Yergeau Mary Ellen Zurko + Jean-Philippe Martin-Flatin + +14. References + + [1] Anklesaria, F., McCahill, M., Lindner, P., Johnson, D., + Torrey, D., and B. Alberti, "The Internet Gopher Protocol: A + Distributed Document Search and Retrieval Protocol", RFC 1436, + University of Minnesota, March 1993. 
+ + [2] Berners-Lee, T., "Universal Resource Identifiers in WWW: A + Unifying Syntax for the Expression of Names and Addresses of + Objects on the Network as used in the World-Wide Web", + RFC 1630, CERN, June 1994. + + [3] Berners-Lee, T., and D. Connolly, "Hypertext Markup Language - + 2.0", RFC 1866, MIT/W3C, November 1995. + + [4] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform + Resource Locators (URL)", RFC 1738, CERN, Xerox PARC, + University of Minnesota, December 1994. + + + + + + + +Berners-Lee, et al Informational [Page 52] + +RFC 1945 HTTP/1.0 May 1996 + + + [5] Borenstein, N., and N. Freed, "MIME (Multipurpose Internet Mail + Extensions) Part One: Mechanisms for Specifying and Describing + the Format of Internet Message Bodies", RFC 1521, Bellcore, + Innosoft, September 1993. + + [6] Braden, R., "Requirements for Internet hosts - Application and + Support", STD 3, RFC 1123, IETF, October 1989. + + [7] Crocker, D., "Standard for the Format of ARPA Internet Text + Messages", STD 11, RFC 822, UDEL, August 1982. + + [8] F. Davis, B. Kahle, H. Morris, J. Salem, T. Shen, R. Wang, + J. Sui, and M. Grinbaum. "WAIS Interface Protocol Prototype + Functional Specification." (v1.5), Thinking Machines + Corporation, April 1990. + + [9] Fielding, R., "Relative Uniform Resource Locators", RFC 1808, + UC Irvine, June 1995. + + [10] Horton, M., and R. Adams, "Standard for interchange of USENET + Messages", RFC 1036 (Obsoletes RFC 850), AT&T Bell + Laboratories, Center for Seismic Studies, December 1987. + + [11] Kantor, B., and P. Lapsley, "Network News Transfer Protocol: + A Proposed Standard for the Stream-Based Transmission of News", + RFC 977, UC San Diego, UC Berkeley, February 1986. + + [12] Postel, J., "Simple Mail Transfer Protocol." STD 10, RFC 821, + USC/ISI, August 1982. + + [13] Postel, J., "Media Type Registration Procedure." RFC 1590, + USC/ISI, March 1994. + + [14] Postel, J., and J. 
Reynolds, "File Transfer Protocol (FTP)", + STD 9, RFC 959, USC/ISI, October 1985. + + [15] Reynolds, J., and J. Postel, "Assigned Numbers", STD 2, RFC + 1700, USC/ISI, October 1994. + + [16] Sollins, K., and L. Masinter, "Functional Requirements for + Uniform Resource Names", RFC 1737, MIT/LCS, Xerox Corporation, + December 1994. + + [17] US-ASCII. Coded Character Set - 7-Bit American Standard Code + for Information Interchange. Standard ANSI X3.4-1986, ANSI, + 1986. + + + + + +Berners-Lee, et al Informational [Page 53] + +RFC 1945 HTTP/1.0 May 1996 + + + [18] ISO-8859. International Standard -- Information Processing -- + 8-bit Single-Byte Coded Graphic Character Sets -- + Part 1: Latin alphabet No. 1, ISO 8859-1:1987. + Part 2: Latin alphabet No. 2, ISO 8859-2, 1987. + Part 3: Latin alphabet No. 3, ISO 8859-3, 1988. + Part 4: Latin alphabet No. 4, ISO 8859-4, 1988. + Part 5: Latin/Cyrillic alphabet, ISO 8859-5, 1988. + Part 6: Latin/Arabic alphabet, ISO 8859-6, 1987. + Part 7: Latin/Greek alphabet, ISO 8859-7, 1987. + Part 8: Latin/Hebrew alphabet, ISO 8859-8, 1988. + Part 9: Latin alphabet No. 5, ISO 8859-9, 1990. + +15. Authors' Addresses + + Tim Berners-Lee + Director, W3 Consortium + MIT Laboratory for Computer Science + 545 Technology Square + Cambridge, MA 02139, U.S.A. + + Fax: +1 (617) 258 8682 + EMail: timbl@w3.org + + + Roy T. Fielding + Department of Information and Computer Science + University of California + Irvine, CA 92717-3425, U.S.A. + + Fax: +1 (714) 824-4056 + EMail: fielding@ics.uci.edu + + + Henrik Frystyk Nielsen + W3 Consortium + MIT Laboratory for Computer Science + 545 Technology Square + Cambridge, MA 02139, U.S.A. + + Fax: +1 (617) 258 8682 + EMail: frystyk@w3.org + + + + + + + + + + +Berners-Lee, et al Informational [Page 54] + +RFC 1945 HTTP/1.0 May 1996 + + +Appendices + + These appendices are provided for informational reasons only -- they + do not form a part of the HTTP/1.0 specification. + +A. 
Internet Media Type message/http + + In addition to defining the HTTP/1.0 protocol, this document serves + as the specification for the Internet media type "message/http". The + following is to be registered with IANA [13]. + + Media Type name: message + + Media subtype name: http + + Required parameters: none + + Optional parameters: version, msgtype + + version: The HTTP-Version number of the enclosed message + (e.g., "1.0"). If not present, the version can be + determined from the first line of the body. + + msgtype: The message type -- "request" or "response". If + not present, the type can be determined from the + first line of the body. + + Encoding considerations: only "7bit", "8bit", or "binary" are + permitted + + Security considerations: none + +B. Tolerant Applications + + Although this document specifies the requirements for the generation + of HTTP/1.0 messages, not all applications will be correct in their + implementation. We therefore recommend that operational applications + be tolerant of deviations whenever those deviations can be + interpreted unambiguously. + + Clients should be tolerant in parsing the Status-Line and servers + tolerant when parsing the Request-Line. In particular, they should + accept any amount of SP or HT characters between fields, even though + only a single SP is required. + + The line terminator for HTTP-header fields is the sequence CRLF. + However, we recommend that applications, when parsing such headers, + recognize a single LF as a line terminator and ignore the leading CR. + + + +Berners-Lee, et al Informational [Page 55] + +RFC 1945 HTTP/1.0 May 1996 + + +C. Relationship to MIME + + HTTP/1.0 uses many of the constructs defined for Internet Mail (RFC + 822 [7]) and the Multipurpose Internet Mail Extensions (MIME [5]) to + allow entities to be transmitted in an open variety of + representations and with extensible mechanisms. 
However, RFC 1521 + discusses mail, and HTTP has a few features that are different than + those described in RFC 1521. These differences were carefully chosen + to optimize performance over binary connections, to allow greater + freedom in the use of new media types, to make date comparisons + easier, and to acknowledge the practice of some early HTTP servers + and clients. + + At the time of this writing, it is expected that RFC 1521 will be + revised. The revisions may include some of the practices found in + HTTP/1.0 but not in RFC 1521. + + This appendix describes specific areas where HTTP differs from RFC + 1521. Proxies and gateways to strict MIME environments should be + aware of these differences and provide the appropriate conversions + where necessary. Proxies and gateways from MIME environments to HTTP + also need to be aware of the differences because some conversions may + be required. + +C.1 Conversion to Canonical Form + + RFC 1521 requires that an Internet mail entity be converted to + canonical form prior to being transferred, as described in Appendix G + of RFC 1521 [5]. Section 3.6.1 of this document describes the forms + allowed for subtypes of the "text" media type when transmitted over + HTTP. + + RFC 1521 requires that content with a Content-Type of "text" + represent line breaks as CRLF and forbids the use of CR or LF outside + of line break sequences. HTTP allows CRLF, bare CR, and bare LF to + indicate a line break within text content when a message is + transmitted over HTTP. + + Where it is possible, a proxy or gateway from HTTP to a strict RFC + 1521 environment should translate all line breaks within the text + media types described in Section 3.6.1 of this document to the RFC + 1521 canonical form of CRLF. 
Note, however, that this may be + complicated by the presence of a Content-Encoding and by the fact + that HTTP allows the use of some character sets which do not use + octets 13 and 10 to represent CR and LF, as is the case for some + multi-byte character sets. + + + + + +Berners-Lee, et al Informational [Page 56] + +RFC 1945 HTTP/1.0 May 1996 + + +C.2 Conversion of Date Formats + + HTTP/1.0 uses a restricted set of date formats (Section 3.3) to + simplify the process of date comparison. Proxies and gateways from + other protocols should ensure that any Date header field present in a + message conforms to one of the HTTP/1.0 formats and rewrite the date + if necessary. + +C.3 Introduction of Content-Encoding + + RFC 1521 does not include any concept equivalent to HTTP/1.0's + Content-Encoding header field. Since this acts as a modifier on the + media type, proxies and gateways from HTTP to MIME-compliant + protocols must either change the value of the Content-Type header + field or decode the Entity-Body before forwarding the message. (Some + experimental applications of Content-Type for Internet mail have used + a media-type parameter of ";conversions=" to perform + an equivalent function as Content-Encoding. However, this parameter + is not part of RFC 1521.) + +C.4 No Content-Transfer-Encoding + + HTTP does not use the Content-Transfer-Encoding (CTE) field of RFC + 1521. Proxies and gateways from MIME-compliant protocols to HTTP must + remove any non-identity CTE ("quoted-printable" or "base64") encoding + prior to delivering the response message to an HTTP client. + + Proxies and gateways from HTTP to MIME-compliant protocols are + responsible for ensuring that the message is in the correct format + and encoding for safe transport on that protocol, where "safe + transport" is defined by the limitations of the protocol being used. 
+ Such a proxy or gateway should label the data with an appropriate + Content-Transfer-Encoding if doing so will improve the likelihood of + safe transport over the destination protocol. + +C.5 HTTP Header Fields in Multipart Body-Parts + + In RFC 1521, most header fields in multipart body-parts are generally + ignored unless the field name begins with "Content-". In HTTP/1.0, + multipart body-parts may contain any HTTP header fields which are + significant to the meaning of that part. + +D. Additional Features + + This appendix documents protocol elements used by some existing HTTP + implementations, but not consistently and correctly across most + HTTP/1.0 applications. Implementors should be aware of these + features, but cannot rely upon their presence in, or interoperability + + + +Berners-Lee, et al Informational [Page 57] + +RFC 1945 HTTP/1.0 May 1996 + + + with, other HTTP/1.0 applications. + +D.1 Additional Request Methods + +D.1.1 PUT + + The PUT method requests that the enclosed entity be stored under the + supplied Request-URI. If the Request-URI refers to an already + existing resource, the enclosed entity should be considered as a + modified version of the one residing on the origin server. If the + Request-URI does not point to an existing resource, and that URI is + capable of being defined as a new resource by the requesting user + agent, the origin server can create the resource with that URI. + + The fundamental difference between the POST and PUT requests is + reflected in the different meaning of the Request-URI. The URI in a + POST request identifies the resource that will handle the enclosed + entity as data to be processed. That resource may be a data-accepting + process, a gateway to some other protocol, or a separate entity that + accepts annotations. 
In contrast, the URI in a PUT request identifies + the entity enclosed with the request -- the user agent knows what URI + is intended and the server should not apply the request to some other + resource. + +D.1.2 DELETE + + The DELETE method requests that the origin server delete the resource + identified by the Request-URI. + +D.1.3 LINK + + The LINK method establishes one or more Link relationships between + the existing resource identified by the Request-URI and other + existing resources. + +D.1.4 UNLINK + + The UNLINK method removes one or more Link relationships from the + existing resource identified by the Request-URI. + +D.2 Additional Header Field Definitions + +D.2.1 Accept + + The Accept request-header field can be used to indicate a list of + media ranges which are acceptable as a response to the request. The + asterisk "*" character is used to group media types into ranges, with + "*/*" indicating all media types and "type/*" indicating all subtypes + + + +Berners-Lee, et al Informational [Page 58] + +RFC 1945 HTTP/1.0 May 1996 + + + of that type. The set of ranges given by the client should represent + what types are acceptable given the context of the request. + +D.2.2 Accept-Charset + + The Accept-Charset request-header field can be used to indicate a + list of preferred character sets other than the default US-ASCII and + ISO-8859-1. This field allows clients capable of understanding more + comprehensive or special-purpose character sets to signal that + capability to a server which is capable of representing documents in + those character sets. + +D.2.3 Accept-Encoding + + The Accept-Encoding request-header field is similar to Accept, but + restricts the content-coding values which are acceptable in the + response. + +D.2.4 Accept-Language + + The Accept-Language request-header field is similar to Accept, but + restricts the set of natural languages that are preferred as a + response to the request. 
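   As an illustration of the media ranges described in D.2.1, the
   following Python sketch checks whether a media type falls within the
   ranges listed in an Accept field value. It is not part of the
   HTTP/1.0 specification. The function names are hypothetical, and
   dropping any ";parameter" suffix from each range is an assumption
   made for simplicity, since this appendix does not define how
   parameters affect matching.

```python
def media_range_matches(media_range, media_type):
    """Return True if media_type (e.g. "text/html") falls within media_range."""
    range_type, _, range_subtype = media_range.strip().lower().partition("/")
    m_type, _, m_subtype = media_type.strip().lower().partition("/")
    if range_type == "*":
        # Only "*/*" is described as covering all media types.
        return range_subtype == "*"
    if range_type != m_type:
        return False
    # "type/*" covers every subtype of that type.
    return range_subtype == "*" or range_subtype == m_subtype


def acceptable(accept_value, media_type):
    """Check media_type against a comma-separated list of media ranges."""
    # Assumption: anything after ";" in a range is ignored here.
    ranges = [r.split(";", 1)[0] for r in accept_value.split(",") if r.strip()]
    return any(media_range_matches(r, media_type) for r in ranges)
```

   For example, acceptable("text/*, image/gif", "text/html") is true,
   while the same field value does not cover "image/png".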
+ +D.2.5 Content-Language + + The Content-Language entity-header field describes the natural + language(s) of the intended audience for the enclosed entity. Note + that this may not be equivalent to all the languages used within the + entity. + +D.2.6 Link + + The Link entity-header field provides a means for describing a + relationship between the entity and some other resource. An entity + may include multiple Link values. Links at the metainformation level + typically indicate relationships like hierarchical structure and + navigation paths. + +D.2.7 MIME-Version + + HTTP messages may include a single MIME-Version general-header field + to indicate what version of the MIME protocol was used to construct + the message. Use of the MIME-Version header field, as defined by RFC + 1521 [5], should indicate that the message is MIME-conformant. + Unfortunately, some older HTTP/1.0 servers send it indiscriminately, + and thus this field should be ignored. + + + + +Berners-Lee, et al Informational [Page 59] + +RFC 1945 HTTP/1.0 May 1996 + + +D.2.8 Retry-After + + The Retry-After response-header field can be used with a 503 (service + unavailable) response to indicate how long the service is expected to + be unavailable to the requesting client. The value of this field can + be either an HTTP-date or an integer number of seconds (in decimal) + after the time of the response. + +D.2.9 Title + + The Title entity-header field indicates the title of the entity. + +D.2.10 URI + + The URI entity-header field may contain some or all of the Uniform + Resource Identifiers (Section 3.2) by which the Request-URI resource + can be identified. There is no guarantee that the resource can be + accessed using the URI(s) specified. 
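   The two value forms allowed for Retry-After in D.2.8 (an HTTP-date
   or a decimal number of seconds after the time of the response) can
   be handled along the lines of the Python sketch below. It is not
   part of the specification: the function name is hypothetical, the
   caller is assumed to supply the response time as a timezone-aware
   datetime, and the standard library date parser used here is only
   relied on for RFC 822/1123 style dates, not for every HTTP-date
   format.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, response_time):
    """Return the delay in seconds implied by a Retry-After value, or None."""
    value = value.strip()
    if value.isascii() and value.isdigit():
        # Integer form: seconds after the time of the response.
        return int(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Unparseable value; the exception type differs by Python version.
        return None
    if when.tzinfo is None:
        # Assumption: treat a date without zone information as GMT.
        when = when.replace(tzinfo=timezone.utc)
    # A date in the past means no further waiting is needed.
    return max(0, int((when - response_time).total_seconds()))
```

   A client might call this with the time at which it received the 503
   response and wait the returned number of seconds before retrying.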
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Berners-Lee, et al Informational [Page 60] + diff --git a/doc/rfc/rfc2616.txt b/doc/rfc/rfc2616.txt new file mode 100644 index 0000000000..45d7d08b8f --- /dev/null +++ b/doc/rfc/rfc2616.txt @@ -0,0 +1,9859 @@ + + + + + + +Network Working Group R. Fielding +Request for Comments: 2616 UC Irvine +Obsoletes: 2068 J. Gettys +Category: Standards Track Compaq/W3C + J. Mogul + Compaq + H. Frystyk + W3C/MIT + L. Masinter + Xerox + P. Leach + Microsoft + T. Berners-Lee + W3C/MIT + June 1999 + + + Hypertext Transfer Protocol -- HTTP/1.1 + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (1999). All Rights Reserved. + +Abstract + + The Hypertext Transfer Protocol (HTTP) is an application-level + protocol for distributed, collaborative, hypermedia information + systems. It is a generic, stateless, protocol which can be used for + many tasks beyond its use for hypertext, such as name servers and + distributed object management systems, through extension of its + request methods, error codes and headers [47]. A feature of HTTP is + the typing and negotiation of data representation, allowing systems + to be built independently of the data being transferred. + + HTTP has been in use by the World-Wide Web global information + initiative since 1990. This specification defines the protocol + referred to as "HTTP/1.1", and is an update to RFC 2068 [33]. + + + + + + +Fielding, et al. 
Standards Track [Page 1] + +RFC 2616 HTTP/1.1 June 1999 + + +Table of Contents + + 1 Introduction ...................................................7 + 1.1 Purpose......................................................7 + 1.2 Requirements .................................................8 + 1.3 Terminology ..................................................8 + 1.4 Overall Operation ...........................................12 + 2 Notational Conventions and Generic Grammar ....................14 + 2.1 Augmented BNF ...............................................14 + 2.2 Basic Rules .................................................15 + 3 Protocol Parameters ...........................................17 + 3.1 HTTP Version ................................................17 + 3.2 Uniform Resource Identifiers ................................18 + 3.2.1 General Syntax ...........................................19 + 3.2.2 http URL .................................................19 + 3.2.3 URI Comparison ...........................................20 + 3.3 Date/Time Formats ...........................................20 + 3.3.1 Full Date ................................................20 + 3.3.2 Delta Seconds ............................................21 + 3.4 Character Sets ..............................................21 + 3.4.1 Missing Charset ..........................................22 + 3.5 Content Codings .............................................23 + 3.6 Transfer Codings ............................................24 + 3.6.1 Chunked Transfer Coding ..................................25 + 3.7 Media Types .................................................26 + 3.7.1 Canonicalization and Text Defaults .......................27 + 3.7.2 Multipart Types ..........................................27 + 3.8 Product Tokens ..............................................28 + 3.9 Quality Values ..............................................29 + 3.10 Language Tags 
...............................................29 + 3.11 Entity Tags .................................................30 + 3.12 Range Units .................................................30 + 4 HTTP Message ..................................................31 + 4.1 Message Types ...............................................31 + 4.2 Message Headers .............................................31 + 4.3 Message Body ................................................32 + 4.4 Message Length ..............................................33 + 4.5 General Header Fields .......................................34 + 5 Request .......................................................35 + 5.1 Request-Line ................................................35 + 5.1.1 Method ...................................................36 + 5.1.2 Request-URI ..............................................36 + 5.2 The Resource Identified by a Request ........................38 + 5.3 Request Header Fields .......................................38 + 6 Response ......................................................39 + 6.1 Status-Line .................................................39 + 6.1.1 Status Code and Reason Phrase ............................39 + 6.2 Response Header Fields ......................................41 + + + +Fielding, et al. 
Standards Track [Page 2] + +RFC 2616 HTTP/1.1 June 1999 + + + 7 Entity ........................................................42 + 7.1 Entity Header Fields ........................................42 + 7.2 Entity Body .................................................43 + 7.2.1 Type .....................................................43 + 7.2.2 Entity Length ............................................43 + 8 Connections ...................................................44 + 8.1 Persistent Connections ......................................44 + 8.1.1 Purpose ..................................................44 + 8.1.2 Overall Operation ........................................45 + 8.1.3 Proxy Servers ............................................46 + 8.1.4 Practical Considerations .................................46 + 8.2 Message Transmission Requirements ...........................47 + 8.2.1 Persistent Connections and Flow Control ..................47 + 8.2.2 Monitoring Connections for Error Status Messages .........48 + 8.2.3 Use of the 100 (Continue) Status .........................48 + 8.2.4 Client Behavior if Server Prematurely Closes Connection ..50 + 9 Method Definitions ............................................51 + 9.1 Safe and Idempotent Methods .................................51 + 9.1.1 Safe Methods .............................................51 + 9.1.2 Idempotent Methods .......................................51 + 9.2 OPTIONS .....................................................52 + 9.3 GET .........................................................53 + 9.4 HEAD ........................................................54 + 9.5 POST ........................................................54 + 9.6 PUT .........................................................55 + 9.7 DELETE ......................................................56 + 9.8 TRACE .......................................................56 + 9.9 CONNECT 
.....................................................57 + 10 Status Code Definitions ......................................57 + 10.1 Informational 1xx ...........................................57 + 10.1.1 100 Continue .............................................58 + 10.1.2 101 Switching Protocols ..................................58 + 10.2 Successful 2xx ..............................................58 + 10.2.1 200 OK ...................................................58 + 10.2.2 201 Created ..............................................59 + 10.2.3 202 Accepted .............................................59 + 10.2.4 203 Non-Authoritative Information ........................59 + 10.2.5 204 No Content ...........................................60 + 10.2.6 205 Reset Content ........................................60 + 10.2.7 206 Partial Content ......................................60 + 10.3 Redirection 3xx .............................................61 + 10.3.1 300 Multiple Choices .....................................61 + 10.3.2 301 Moved Permanently ....................................62 + 10.3.3 302 Found ................................................62 + 10.3.4 303 See Other ............................................63 + 10.3.5 304 Not Modified .........................................63 + 10.3.6 305 Use Proxy ............................................64 + 10.3.7 306 (Unused) .............................................64 + + + +Fielding, et al. 
Standards Track [Page 3] + +RFC 2616 HTTP/1.1 June 1999 + + + 10.3.8 307 Temporary Redirect ...................................65 + 10.4 Client Error 4xx ............................................65 + 10.4.1 400 Bad Request .........................................65 + 10.4.2 401 Unauthorized ........................................66 + 10.4.3 402 Payment Required ....................................66 + 10.4.4 403 Forbidden ...........................................66 + 10.4.5 404 Not Found ...........................................66 + 10.4.6 405 Method Not Allowed ..................................66 + 10.4.7 406 Not Acceptable ......................................67 + 10.4.8 407 Proxy Authentication Required .......................67 + 10.4.9 408 Request Timeout .....................................67 + 10.4.10 409 Conflict ............................................67 + 10.4.11 410 Gone ................................................68 + 10.4.12 411 Length Required .....................................68 + 10.4.13 412 Precondition Failed .................................68 + 10.4.14 413 Request Entity Too Large ............................69 + 10.4.15 414 Request-URI Too Long ................................69 + 10.4.16 415 Unsupported Media Type ..............................69 + 10.4.17 416 Requested Range Not Satisfiable .....................69 + 10.4.18 417 Expectation Failed ..................................70 + 10.5 Server Error 5xx ............................................70 + 10.5.1 500 Internal Server Error ................................70 + 10.5.2 501 Not Implemented ......................................70 + 10.5.3 502 Bad Gateway ..........................................70 + 10.5.4 503 Service Unavailable ..................................70 + 10.5.5 504 Gateway Timeout ......................................71 + 10.5.6 505 HTTP Version Not Supported ...........................71 + 11 Access Authentication 
........................................71 + 12 Content Negotiation ..........................................71 + 12.1 Server-driven Negotiation ...................................72 + 12.2 Agent-driven Negotiation ....................................73 + 12.3 Transparent Negotiation .....................................74 + 13 Caching in HTTP ..............................................74 + 13.1.1 Cache Correctness ........................................75 + 13.1.2 Warnings .................................................76 + 13.1.3 Cache-control Mechanisms .................................77 + 13.1.4 Explicit User Agent Warnings .............................78 + 13.1.5 Exceptions to the Rules and Warnings .....................78 + 13.1.6 Client-controlled Behavior ...............................79 + 13.2 Expiration Model ............................................79 + 13.2.1 Server-Specified Expiration ..............................79 + 13.2.2 Heuristic Expiration .....................................80 + 13.2.3 Age Calculations .........................................80 + 13.2.4 Expiration Calculations ..................................83 + 13.2.5 Disambiguating Expiration Values .........................84 + 13.2.6 Disambiguating Multiple Responses ........................84 + 13.3 Validation Model ............................................85 + 13.3.1 Last-Modified Dates ......................................86 + + + +Fielding, et al. 
Standards Track [Page 4] + +RFC 2616 HTTP/1.1 June 1999 + + + 13.3.2 Entity Tag Cache Validators ..............................86 + 13.3.3 Weak and Strong Validators ...............................86 + 13.3.4 Rules for When to Use Entity Tags and Last-Modified Dates.89 + 13.3.5 Non-validating Conditionals ..............................90 + 13.4 Response Cacheability .......................................91 + 13.5 Constructing Responses From Caches ..........................92 + 13.5.1 End-to-end and Hop-by-hop Headers ........................92 + 13.5.2 Non-modifiable Headers ...................................92 + 13.5.3 Combining Headers ........................................94 + 13.5.4 Combining Byte Ranges ....................................95 + 13.6 Caching Negotiated Responses ................................95 + 13.7 Shared and Non-Shared Caches ................................96 + 13.8 Errors or Incomplete Response Cache Behavior ................97 + 13.9 Side Effects of GET and HEAD ................................97 + 13.10 Invalidation After Updates or Deletions ...................97 + 13.11 Write-Through Mandatory ...................................98 + 13.12 Cache Replacement .........................................99 + 13.13 History Lists .............................................99 + 14 Header Field Definitions ....................................100 + 14.1 Accept .....................................................100 + 14.2 Accept-Charset .............................................102 + 14.3 Accept-Encoding ............................................102 + 14.4 Accept-Language ............................................104 + 14.5 Accept-Ranges ..............................................105 + 14.6 Age ........................................................106 + 14.7 Allow ......................................................106 + 14.8 Authorization ..............................................107 + 14.9 Cache-Control 
..............................................108 + 14.9.1 What is Cacheable .......................................109 + 14.9.2 What May be Stored by Caches ............................110 + 14.9.3 Modifications of the Basic Expiration Mechanism .........111 + 14.9.4 Cache Revalidation and Reload Controls ..................113 + 14.9.5 No-Transform Directive ..................................115 + 14.9.6 Cache Control Extensions ................................116 + 14.10 Connection ...............................................117 + 14.11 Content-Encoding .........................................118 + 14.12 Content-Language .........................................118 + 14.13 Content-Length ...........................................119 + 14.14 Content-Location .........................................120 + 14.15 Content-MD5 ..............................................121 + 14.16 Content-Range ............................................122 + 14.17 Content-Type .............................................124 + 14.18 Date .....................................................124 + 14.18.1 Clockless Origin Server Operation ......................125 + 14.19 ETag .....................................................126 + 14.20 Expect ...................................................126 + 14.21 Expires ..................................................127 + 14.22 From .....................................................128 + + + +Fielding, et al. 
Standards Track [Page 5] + +RFC 2616 HTTP/1.1 June 1999 + + + 14.23 Host .....................................................128 + 14.24 If-Match .................................................129 + 14.25 If-Modified-Since ........................................130 + 14.26 If-None-Match ............................................132 + 14.27 If-Range .................................................133 + 14.28 If-Unmodified-Since ......................................134 + 14.29 Last-Modified ............................................134 + 14.30 Location .................................................135 + 14.31 Max-Forwards .............................................136 + 14.32 Pragma ...................................................136 + 14.33 Proxy-Authenticate .......................................137 + 14.34 Proxy-Authorization ......................................137 + 14.35 Range ....................................................138 + 14.35.1 Byte Ranges ...........................................138 + 14.35.2 Range Retrieval Requests ..............................139 + 14.36 Referer ..................................................140 + 14.37 Retry-After ..............................................141 + 14.38 Server ...................................................141 + 14.39 TE .......................................................142 + 14.40 Trailer ..................................................143 + 14.41 Transfer-Encoding..........................................143 + 14.42 Upgrade ..................................................144 + 14.43 User-Agent ...............................................145 + 14.44 Vary .....................................................145 + 14.45 Via ......................................................146 + 14.46 Warning ..................................................148 + 14.47 WWW-Authenticate .........................................150 + 15 Security Considerations 
.......................................150 + 15.1 Personal Information....................................151 + 15.1.1 Abuse of Server Log Information .........................151 + 15.1.2 Transfer of Sensitive Information .......................151 + 15.1.3 Encoding Sensitive Information in URI's .................152 + 15.1.4 Privacy Issues Connected to Accept Headers ..............152 + 15.2 Attacks Based On File and Path Names .......................153 + 15.3 DNS Spoofing ...............................................154 + 15.4 Location Headers and Spoofing ..............................154 + 15.5 Content-Disposition Issues .................................154 + 15.6 Authentication Credentials and Idle Clients ................155 + 15.7 Proxies and Caching ........................................155 + 15.7.1 Denial of Service Attacks on Proxies....................156 + 16 Acknowledgments .............................................156 + 17 References ..................................................158 + 18 Authors' Addresses ..........................................162 + 19 Appendices ..................................................164 + 19.1 Internet Media Type message/http and application/http ......164 + 19.2 Internet Media Type multipart/byteranges ...................165 + 19.3 Tolerant Applications ......................................166 + 19.4 Differences Between HTTP Entities and RFC 2045 Entities ....167 + + + +Fielding, et al. 
Standards Track [Page 6] + +RFC 2616 HTTP/1.1 June 1999 + + + 19.4.1 MIME-Version ............................................167 + 19.4.2 Conversion to Canonical Form ............................167 + 19.4.3 Conversion of Date Formats ..............................168 + 19.4.4 Introduction of Content-Encoding ........................168 + 19.4.5 No Content-Transfer-Encoding ............................168 + 19.4.6 Introduction of Transfer-Encoding .......................169 + 19.4.7 MHTML and Line Length Limitations .......................169 + 19.5 Additional Features ........................................169 + 19.5.1 Content-Disposition .....................................170 + 19.6 Compatibility with Previous Versions .......................170 + 19.6.1 Changes from HTTP/1.0 ...................................171 + 19.6.2 Compatibility with HTTP/1.0 Persistent Connections ......172 + 19.6.3 Changes from RFC 2068 ...................................172 + 20 Index .......................................................175 + 21 Full Copyright Statement ....................................176 + +1 Introduction + +1.1 Purpose + + The Hypertext Transfer Protocol (HTTP) is an application-level + protocol for distributed, collaborative, hypermedia information + systems. HTTP has been in use by the World-Wide Web global + information initiative since 1990. The first version of HTTP, + referred to as HTTP/0.9, was a simple protocol for raw data transfer + across the Internet. HTTP/1.0, as defined by RFC 1945 [6], improved + the protocol by allowing messages to be in the format of MIME-like + messages, containing metainformation about the data transferred and + modifiers on the request/response semantics. However, HTTP/1.0 does + not sufficiently take into consideration the effects of hierarchical + proxies, caching, the need for persistent connections, or virtual + hosts. 
In addition, the proliferation of incompletely-implemented + applications calling themselves "HTTP/1.0" has necessitated a + protocol version change in order for two communicating applications + to determine each other's true capabilities. + + This specification defines the protocol referred to as "HTTP/1.1". + This protocol includes more stringent requirements than HTTP/1.0 in + order to ensure reliable implementation of its features. + + Practical information systems require more functionality than simple + retrieval, including search, front-end update, and annotation. HTTP + allows an open-ended set of methods and headers that indicate the + purpose of a request [47]. It builds on the discipline of reference + provided by the Uniform Resource Identifier (URI) [3], as a location + (URL) [4] or name (URN) [20], for indicating the resource to which a + + + + + +Fielding, et al. Standards Track [Page 7] + +RFC 2616 HTTP/1.1 June 1999 + + + method is to be applied. Messages are passed in a format similar to + that used by Internet mail [9] as defined by the Multipurpose + Internet Mail Extensions (MIME) [7]. + + HTTP is also used as a generic protocol for communication between + user agents and proxies/gateways to other Internet systems, including + those supported by the SMTP [16], NNTP [13], FTP [18], Gopher [2], + and WAIS [10] protocols. In this way, HTTP allows basic hypermedia + access to resources available from diverse applications. + +1.2 Requirements + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in RFC 2119 [34]. + + An implementation is not compliant if it fails to satisfy one or more + of the MUST or REQUIRED level requirements for the protocols it + implements. 
An implementation that satisfies all the MUST or REQUIRED + level and all the SHOULD level requirements for its protocols is said + to be "unconditionally compliant"; one that satisfies all the MUST + level requirements but not all the SHOULD level requirements for its + protocols is said to be "conditionally compliant." + +1.3 Terminology + + This specification uses a number of terms to refer to the roles + played by participants in, and objects of, the HTTP communication. + + connection + A transport layer virtual circuit established between two programs + for the purpose of communication. + + message + The basic unit of HTTP communication, consisting of a structured + sequence of octets matching the syntax defined in section 4 and + transmitted via the connection. + + request + An HTTP request message, as defined in section 5. + + response + An HTTP response message, as defined in section 6. + + + + + + + + +Fielding, et al. Standards Track [Page 8] + +RFC 2616 HTTP/1.1 June 1999 + + + resource + A network data object or service that can be identified by a URI, + as defined in section 3.2. Resources may be available in multiple + representations (e.g. multiple languages, data formats, size, and + resolutions) or vary in other ways. + + entity + The information transferred as the payload of a request or + response. An entity consists of metainformation in the form of + entity-header fields and content in the form of an entity-body, as + described in section 7. + + representation + An entity included with a response that is subject to content + negotiation, as described in section 12. There may exist multiple + representations associated with a particular response status. + + content negotiation + The mechanism for selecting the appropriate representation when + servicing a request, as described in section 12. The + representation of entities in any response can be negotiated + (including error responses). 
+ + variant + A resource may have one, or more than one, representation(s) + associated with it at any given instant. Each of these + representations is termed a `variant'. Use of the term `variant' + does not necessarily imply that the resource is subject to content + negotiation. + + client + A program that establishes connections for the purpose of sending + requests. + + user agent + The client which initiates a request. These are often browsers, + editors, spiders (web-traversing robots), or other end user tools. + + server + An application program that accepts connections in order to + service requests by sending back responses. Any given program may + be capable of being both a client and a server; our use of these + terms refers only to the role being performed by the program for a + particular connection, rather than to the program's capabilities + in general. Likewise, any server may act as an origin server, + proxy, gateway, or tunnel, switching behavior based on the nature + of each request. + + + + +Fielding, et al. Standards Track [Page 9] + +RFC 2616 HTTP/1.1 June 1999 + + + origin server + The server on which a given resource resides or is to be created. + + proxy + An intermediary program which acts as both a server and a client + for the purpose of making requests on behalf of other clients. + Requests are serviced internally or by passing them on, with + possible translation, to other servers. A proxy MUST implement + both the client and server requirements of this specification. A + "transparent proxy" is a proxy that does not modify the request or + response beyond what is required for proxy authentication and + identification. A "non-transparent proxy" is a proxy that modifies + the request or response in order to provide some added service to + the user agent, such as group annotation services, media type + transformation, protocol reduction, or anonymity filtering.
Except + where either transparent or non-transparent behavior is explicitly + stated, the HTTP proxy requirements apply to both types of + proxies. + + gateway + A server which acts as an intermediary for some other server. + Unlike a proxy, a gateway receives requests as if it were the + origin server for the requested resource; the requesting client + may not be aware that it is communicating with a gateway. + + tunnel + An intermediary program which is acting as a blind relay between + two connections. Once active, a tunnel is not considered a party + to the HTTP communication, though the tunnel may have been + initiated by an HTTP request. The tunnel ceases to exist when both + ends of the relayed connections are closed. + + cache + A program's local store of response messages and the subsystem + that controls its message storage, retrieval, and deletion. A + cache stores cacheable responses in order to reduce the response + time and network bandwidth consumption on future, equivalent + requests. Any client or server may include a cache, though a cache + cannot be used by a server that is acting as a tunnel. + + cacheable + A response is cacheable if a cache is allowed to store a copy of + the response message for use in answering subsequent requests. The + rules for determining the cacheability of HTTP responses are + defined in section 13. Even if a resource is cacheable, there may + be additional constraints on whether a cache can use the cached + copy for a particular request. + + + + +Fielding, et al. Standards Track [Page 10] + +RFC 2616 HTTP/1.1 June 1999 + + + first-hand + A response is first-hand if it comes directly and without + unnecessary delay from the origin server, perhaps via one or more + proxies. A response is also first-hand if its validity has just + been checked directly with the origin server. 
+ + explicit expiration time + The time at which the origin server intends that an entity should + no longer be returned by a cache without further validation. + + heuristic expiration time + An expiration time assigned by a cache when no explicit expiration + time is available. + + age + The age of a response is the time since it was sent by, or + successfully validated with, the origin server. + + freshness lifetime + The length of time between the generation of a response and its + expiration time. + + fresh + A response is fresh if its age has not yet exceeded its freshness + lifetime. + + stale + A response is stale if its age has passed its freshness lifetime. + + semantically transparent + A cache behaves in a "semantically transparent" manner, with + respect to a particular response, when its use affects neither the + requesting client nor the origin server, except to improve + performance. When a cache is semantically transparent, the client + receives exactly the same response (except for hop-by-hop headers) + that it would have received had its request been handled directly + by the origin server. + + validator + A protocol element (e.g., an entity tag or a Last-Modified time) + that is used to find out whether a cache entry is an equivalent + copy of an entity. + + upstream/downstream + Upstream and downstream describe the flow of a message: all + messages flow from upstream to downstream. + + + + + +Fielding, et al. Standards Track [Page 11] + +RFC 2616 HTTP/1.1 June 1999 + + + inbound/outbound + Inbound and outbound refer to the request and response paths for + messages: "inbound" means "traveling toward the origin server", + and "outbound" means "traveling toward the user agent" + +1.4 Overall Operation + + The HTTP protocol is a request/response protocol. 
A client sends a + request to the server in the form of a request method, URI, and + protocol version, followed by a MIME-like message containing request + modifiers, client information, and possible body content over a + connection with a server. The server responds with a status line, + including the message's protocol version and a success or error code, + followed by a MIME-like message containing server information, entity + metainformation, and possible entity-body content. The relationship + between HTTP and MIME is described in appendix 19.4. + + Most HTTP communication is initiated by a user agent and consists of + a request to be applied to a resource on some origin server. In the + simplest case, this may be accomplished via a single connection (v) + between the user agent (UA) and the origin server (O). + + request chain ------------------------> + UA -------------------v------------------- O + <----------------------- response chain + + A more complicated situation occurs when one or more intermediaries + are present in the request/response chain. There are three common + forms of intermediary: proxy, gateway, and tunnel. A proxy is a + forwarding agent, receiving requests for a URI in its absolute form, + rewriting all or part of the message, and forwarding the reformatted + request toward the server identified by the URI. A gateway is a + receiving agent, acting as a layer above some other server(s) and, if + necessary, translating the requests to the underlying server's + protocol. A tunnel acts as a relay point between two connections + without changing the messages; tunnels are used when the + communication needs to pass through an intermediary (such as a + firewall) even when the intermediary cannot understand the contents + of the messages. 
+ + request chain --------------------------------------> + UA -----v----- A -----v----- B -----v----- C -----v----- O + <------------------------------------- response chain + + The figure above shows three intermediaries (A, B, and C) between the + user agent and origin server. A request or response message that + travels the whole chain will pass through four separate connections. + This distinction is important because some HTTP communication options + + + +Fielding, et al. Standards Track [Page 12] + +RFC 2616 HTTP/1.1 June 1999 + + + may apply only to the connection with the nearest, non-tunnel + neighbor, only to the end-points of the chain, or to all connections + along the chain. Although the diagram is linear, each participant may + be engaged in multiple, simultaneous communications. For example, B + may be receiving requests from many clients other than A, and/or + forwarding requests to servers other than C, at the same time that it + is handling A's request. + + Any party to the communication which is not acting as a tunnel may + employ an internal cache for handling requests. The effect of a cache + is that the request/response chain is shortened if one of the + participants along the chain has a cached response applicable to that + request. The following illustrates the resulting chain if B has a + cached copy of an earlier response from O (via C) for a request which + has not been cached by UA or A. + + request chain ----------> + UA -----v----- A -----v----- B - - - - - - C - - - - - - O + <--------- response chain + + Not all responses are usefully cacheable, and some requests may + contain modifiers which place special requirements on cache behavior. + HTTP requirements for cache behavior and cacheable responses are + defined in section 13. + + In fact, there are a wide variety of architectures and configurations + of caches and proxies currently being experimented with or deployed + across the World Wide Web. 
These systems include national hierarchies + of proxy caches to save transoceanic bandwidth, systems that + broadcast or multicast cache entries, organizations that distribute + subsets of cached data via CD-ROM, and so on. HTTP systems are used + in corporate intranets over high-bandwidth links, and for access via + PDAs with low-power radio links and intermittent connectivity. The + goal of HTTP/1.1 is to support the wide diversity of configurations + already deployed while introducing protocol constructs that meet the + needs of those who build web applications that require high + reliability and, failing that, at least reliable indications of + failure. + + HTTP communication usually takes place over TCP/IP connections. The + default port is TCP 80 [19], but other ports can be used. This does + not preclude HTTP from being implemented on top of any other protocol + on the Internet, or on other networks. HTTP only presumes a reliable + transport; any protocol that provides such guarantees can be used; + the mapping of the HTTP/1.1 request and response structures onto the + transport data units of the protocol in question is outside the scope + of this specification. + + + + +Fielding, et al. Standards Track [Page 13] + +RFC 2616 HTTP/1.1 June 1999 + + + In HTTP/1.0, most implementations used a new connection for each + request/response exchange. In HTTP/1.1, a connection may be used for + one or more request/response exchanges, although connections may be + closed for a variety of reasons (see section 8.1). + +2 Notational Conventions and Generic Grammar + +2.1 Augmented BNF + + All of the mechanisms specified in this document are described in + both prose and an augmented Backus-Naur Form (BNF) similar to that + used by RFC 822 [9]. Implementors will need to be familiar with the + notation in order to understand this specification. 
The augmented BNF + includes the following constructs: + + name = definition + The name of a rule is simply the name itself (without any + enclosing "<" and ">") and is separated from its definition by the + equal "=" character. White space is only significant in that + indentation of continuation lines is used to indicate a rule + definition that spans more than one line. Certain basic rules are + in uppercase, such as SP, LWS, HT, CRLF, DIGIT, ALPHA, etc. Angle + brackets are used within definitions whenever their presence will + facilitate discerning the use of rule names. + + "literal" + Quotation marks surround literal text. Unless stated otherwise, + the text is case-insensitive. + + rule1 | rule2 + Elements separated by a bar ("|") are alternatives, e.g., "yes | + no" will accept yes or no. + + (rule1 rule2) + Elements enclosed in parentheses are treated as a single element. + Thus, "(elem (foo | bar) elem)" allows the token sequences "elem + foo elem" and "elem bar elem". + + *rule + The character "*" preceding an element indicates repetition. The + full form is "<n>*<m>element" indicating at least <n> and at most + <m> occurrences of element. Default values are 0 and infinity so + that "*(element)" allows any number, including zero; "1*element" + requires at least one; and "1*2element" allows one or two. + + [rule] + Square brackets enclose optional elements; "[foo bar]" is + equivalent to "*1(foo bar)". + + + +Fielding, et al. Standards Track [Page 14] + +RFC 2616 HTTP/1.1 June 1999 + + + N rule + Specific repetition: "<n>(element)" is equivalent to + "<n>*<n>(element)"; that is, exactly <n> occurrences of (element). + Thus 2DIGIT is a 2-digit number, and 3ALPHA is a string of three + alphabetic characters. + + #rule + A construct "#" is defined, similar to "*", for defining lists of + elements. The full form is "<n>#<m>element" indicating at least + <n> and at most <m> elements, each separated by one or more commas + (",") and OPTIONAL linear white space (LWS).
This makes the usual + form of lists very easy; a rule such as + ( *LWS element *( *LWS "," *LWS element )) + can be shown as + 1#element + Wherever this construct is used, null elements are allowed, but do + not contribute to the count of elements present. That is, + "(element), , (element) " is permitted, but counts as only two + elements. Therefore, where at least one element is required, at + least one non-null element MUST be present. Default values are 0 + and infinity so that "#element" allows any number, including zero; + "1#element" requires at least one; and "1#2element" allows one or + two. + + ; comment + A semi-colon, set off some distance to the right of rule text, + starts a comment that continues to the end of line. This is a + simple way of including useful notes in parallel with the + specifications. + + implied *LWS + The grammar described by this specification is word-based. Except + where noted otherwise, linear white space (LWS) can be included + between any two adjacent words (token or quoted-string), and + between adjacent words and separators, without changing the + interpretation of a field. At least one delimiter (LWS and/or + separators) MUST exist between any two tokens (for the definition + of "token" below), since they would otherwise be interpreted as a + single token. + +2.2 Basic Rules + + The following rules are used throughout this specification to + describe basic parsing constructs. The US-ASCII coded character set + is defined by ANSI X3.4-1986 [21]. + + + + + +Fielding, et al. Standards Track [Page 15] + +RFC 2616 HTTP/1.1 June 1999 + + + OCTET = <any 8-bit sequence of data> + CHAR = <any US-ASCII character (octets 0 - 127)> + UPALPHA = <any US-ASCII uppercase letter "A".."Z"> + LOALPHA = <any US-ASCII lowercase letter "a".."z"> + ALPHA = UPALPHA | LOALPHA + DIGIT = <any US-ASCII digit "0".."9"> + CTL = <any US-ASCII control character + (octets 0 - 31) and DEL (127)> + CR = <US-ASCII CR, carriage return (13)> + LF = <US-ASCII LF, linefeed (10)> + SP = <US-ASCII SP, space (32)> + HT = <US-ASCII HT, horizontal-tab (9)> + <"> = <US-ASCII double-quote mark (34)> + + HTTP/1.1 defines the sequence CR LF as the end-of-line marker for all + protocol elements except the entity-body (see appendix 19.3 for + tolerant applications).
The end-of-line marker within an entity-body + is defined by its associated media type, as described in section 3.7. + + CRLF = CR LF + + HTTP/1.1 header field values can be folded onto multiple lines if the + continuation line begins with a space or horizontal tab. All linear + white space, including folding, has the same semantics as SP. A + recipient MAY replace any linear white space with a single SP before + interpreting the field value or forwarding the message downstream. + + LWS = [CRLF] 1*( SP | HT ) + + The TEXT rule is only used for descriptive field contents and values + that are not intended to be interpreted by the message parser. Words + of *TEXT MAY contain characters from character sets other than ISO- + 8859-1 [22] only when encoded according to the rules of RFC 2047 + [14]. + + TEXT = <any OCTET except CTLs, + but including LWS> + + A CRLF is allowed in the definition of TEXT only as part of a header + field continuation. It is expected that the folding LWS will be + replaced with a single SP before interpretation of the TEXT value. + + Hexadecimal numeric characters are used in several protocol elements. + + HEX = "A" | "B" | "C" | "D" | "E" | "F" + | "a" | "b" | "c" | "d" | "e" | "f" | DIGIT + + + + + +Fielding, et al. Standards Track [Page 16] + +RFC 2616 HTTP/1.1 June 1999 + + + Many HTTP/1.1 header field values consist of words separated by LWS + or special characters. These special characters MUST be in a quoted + string to be used within a parameter value (as defined in section + 3.6). + + token = 1*<any CHAR except CTLs or separators> + separators = "(" | ")" | "<" | ">" | "@" + | "," | ";" | ":" | "\" | <"> + | "/" | "[" | "]" | "?" | "=" + | "{" | "}" | SP | HT + + Comments can be included in some HTTP header fields by surrounding + the comment text with parentheses. Comments are only allowed in + fields containing "comment" as part of their field value definition. + In all other fields, parentheses are considered part of the field + value.
+ + comment = "(" *( ctext | quoted-pair | comment ) ")" + ctext = <any TEXT excluding "(" and ")"> + + A string of text is parsed as a single word if it is quoted using + double-quote marks. + + quoted-string = ( <"> *(qdtext | quoted-pair ) <"> ) + qdtext = <any TEXT except <">> + + The backslash character ("\") MAY be used as a single-character + quoting mechanism only within quoted-string and comment constructs. + + quoted-pair = "\" CHAR + +3 Protocol Parameters + +3.1 HTTP Version + + HTTP uses a "<major>.<minor>" numbering scheme to indicate versions + of the protocol. The protocol versioning policy is intended to allow + the sender to indicate the format of a message and its capacity for + understanding further HTTP communication, rather than the features + obtained via that communication. No change is made to the version + number for the addition of message components which do not affect + communication behavior or which only add to extensible field values. + The <minor> number is incremented when the changes made to the + protocol add features which do not change the general message parsing + algorithm, but which may add to the message semantics and imply + additional capabilities of the sender. The <major> number is + incremented when the format of a message within the protocol is + changed. See RFC 2145 [36] for a fuller explanation. + + + +Fielding, et al. Standards Track [Page 17] + +RFC 2616 HTTP/1.1 June 1999 + + + The version of an HTTP message is indicated by an HTTP-Version field + in the first line of the message. + + HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT + + Note that the major and minor numbers MUST be treated as separate + integers and that each MAY be incremented higher than a single digit. + Thus, HTTP/2.4 is a lower version than HTTP/2.13, which in turn is + lower than HTTP/12.3. Leading zeros MUST be ignored by recipients and + MUST NOT be sent.
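
The comparison rule above (separate integers, leading zeros ignored) can be illustrated with a minimal Python sketch. The function name parse_http_version and the choice to accept only an uppercase "HTTP" prefix are assumptions made for this illustration; they are not taken from this specification.

```python
import re

# HTTP-Version = "HTTP" "/" 1*DIGIT "." 1*DIGIT
# Assumption: only the uppercase literal "HTTP" is accepted in this sketch.
_HTTP_VERSION = re.compile(r"HTTP/([0-9]+)\.([0-9]+)")


def parse_http_version(text):
    """Return (major, minor) as a tuple of integers, or raise ValueError."""
    match = _HTTP_VERSION.fullmatch(text)
    if match is None:
        raise ValueError(f"not an HTTP-Version: {text!r}")
    # int() drops leading zeros, which recipients are required to ignore.
    return int(match.group(1)), int(match.group(2))


# Tuples compare element by element, so the major number is compared
# first and the minor number only breaks ties.
assert parse_http_version("HTTP/2.4") < parse_http_version("HTTP/2.13")
assert parse_http_version("HTTP/2.13") < parse_http_version("HTTP/12.3")
assert parse_http_version("HTTP/01.01") == (1, 1)
```

Comparing the two numbers as integers rather than comparing the version strings is what keeps "2.13" above "2.4"; a plain string or decimal-fraction comparison would order them the other way.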
+ + An application that sends a request or response message that includes + HTTP-Version of "HTTP/1.1" MUST be at least conditionally compliant + with this specification. Applications that are at least conditionally + compliant with this specification SHOULD use an HTTP-Version of + "HTTP/1.1" in their messages, and MUST do so for any message that is + not compatible with HTTP/1.0. For more details on when to send + specific HTTP-Version values, see RFC 2145 [36]. + + The HTTP version of an application is the highest HTTP version for + which the application is at least conditionally compliant. + + Proxy and gateway applications need to be careful when forwarding + messages in protocol versions different from that of the application. + Since the protocol version indicates the protocol capability of the + sender, a proxy/gateway MUST NOT send a message with a version + indicator which is greater than its actual version. If a higher + version request is received, the proxy/gateway MUST either downgrade + the request version, or respond with an error, or switch to tunnel + behavior. + + Due to interoperability problems with HTTP/1.0 proxies discovered + since the publication of RFC 2068[33], caching proxies MUST, gateways + MAY, and tunnels MUST NOT upgrade the request to the highest version + they support. The proxy/gateway's response to that request MUST be in + the same major version as the request. + + Note: Converting between versions of HTTP may involve modification + of header fields required or forbidden by the versions involved. + +3.2 Uniform Resource Identifiers + + URIs have been known by many names: WWW addresses, Universal Document + Identifiers, Universal Resource Identifiers [3], and finally the + combination of Uniform Resource Locators (URL) [4] and Names (URN) + [20]. As far as HTTP is concerned, Uniform Resource Identifiers are + simply formatted strings which identify--via name, location, or any + other characteristic--a resource. 
+ + + +Fielding, et al. Standards Track [Page 18] + +RFC 2616 HTTP/1.1 June 1999 + + +3.2.1 General Syntax + + URIs in HTTP can be represented in absolute form or relative to some + known base URI [11], depending upon the context of their use. The two + forms are differentiated by the fact that absolute URIs always begin + with a scheme name followed by a colon. For definitive information on + URL syntax and semantics, see "Uniform Resource Identifiers (URI): + Generic Syntax and Semantics," RFC 2396 [42] (which replaces RFCs + 1738 [4] and RFC 1808 [11]). This specification adopts the + definitions of "URI-reference", "absoluteURI", "relativeURI", "port", + "host","abs_path", "rel_path", and "authority" from that + specification. + + The HTTP protocol does not place any a priori limit on the length of + a URI. Servers MUST be able to handle the URI of any resource they + serve, and SHOULD be able to handle URIs of unbounded length if they + provide GET-based forms that could generate such URIs. A server + SHOULD return 414 (Request-URI Too Long) status if a URI is longer + than the server can handle (see section 10.4.15). + + Note: Servers ought to be cautious about depending on URI lengths + above 255 bytes, because some older client or proxy + implementations might not properly support these lengths. + +3.2.2 http URL + + The "http" scheme is used to locate network resources via the HTTP + protocol. This section defines the scheme-specific syntax and + semantics for http URLs. + + http_URL = "http:" "//" host [ ":" port ] [ abs_path [ "?" query ]] + + If the port is empty or not given, port 80 is assumed. The semantics + are that the identified resource is located at the server listening + for TCP connections on that port of that host, and the Request-URI + for the resource is abs_path (section 5.1.2). The use of IP addresses + in URLs SHOULD be avoided whenever possible (see RFC 1900 [24]). 
If + the abs_path is not present in the URL, it MUST be given as "/" when + used as a Request-URI for a resource (section 5.1.2). If a proxy + receives a host name which is not a fully qualified domain name, it + MAY add its domain to the host name it received. If a proxy receives + a fully qualified domain name, the proxy MUST NOT change the host + name. + + + + + + + + +Fielding, et al. Standards Track [Page 19] + +RFC 2616 HTTP/1.1 June 1999 + + +3.2.3 URI Comparison + + When comparing two URIs to decide if they match or not, a client + SHOULD use a case-sensitive octet-by-octet comparison of the entire + URIs, with these exceptions: + + - A port that is empty or not given is equivalent to the default + port for that URI-reference; + + - Comparisons of host names MUST be case-insensitive; + + - Comparisons of scheme names MUST be case-insensitive; + + - An empty abs_path is equivalent to an abs_path of "/". + + Characters other than those in the "reserved" and "unsafe" sets (see + RFC 2396 [42]) are equivalent to their ""%" HEX HEX" encoding. + + For example, the following three URIs are equivalent: + + http://abc.com:80/~smith/home.html + http://ABC.com/%7Esmith/home.html + http://ABC.com:/%7esmith/home.html + +3.3 Date/Time Formats + +3.3.1 Full Date + + HTTP applications have historically allowed three different formats + for the representation of date/time stamps: + + Sun, 06 Nov 1994 08:49:37 GMT ; RFC 822, updated by RFC 1123 + Sunday, 06-Nov-94 08:49:37 GMT ; RFC 850, obsoleted by RFC 1036 + Sun Nov 6 08:49:37 1994 ; ANSI C's asctime() format + + The first format is preferred as an Internet standard and represents + a fixed-length subset of that defined by RFC 1123 [8] (an update to + RFC 822 [9]). The second format is in common use, but is based on the + obsolete RFC 850 [12] date format and lacks a four-digit year. 
+ HTTP/1.1 clients and servers that parse the date value MUST accept + all three formats (for compatibility with HTTP/1.0), though they MUST + only generate the RFC 1123 format for representing HTTP-date values + in header fields. See section 19.3 for further information. + + Note: Recipients of date values are encouraged to be robust in + accepting date values that may have been sent by non-HTTP + applications, as is sometimes the case when retrieving or posting + messages via proxies/gateways to SMTP or NNTP. + + + +Fielding, et al. Standards Track [Page 20] + +RFC 2616 HTTP/1.1 June 1999 + + + All HTTP date/time stamps MUST be represented in Greenwich Mean Time + (GMT), without exception. For the purposes of HTTP, GMT is exactly + equal to UTC (Coordinated Universal Time). This is indicated in the + first two formats by the inclusion of "GMT" as the three-letter + abbreviation for time zone, and MUST be assumed when reading the + asctime format. HTTP-date is case sensitive and MUST NOT include + additional LWS beyond that specifically included as SP in the + grammar. + + HTTP-date = rfc1123-date | rfc850-date | asctime-date + rfc1123-date = wkday "," SP date1 SP time SP "GMT" + rfc850-date = weekday "," SP date2 SP time SP "GMT" + asctime-date = wkday SP date3 SP time SP 4DIGIT + date1 = 2DIGIT SP month SP 4DIGIT + ; day month year (e.g., 02 Jun 1982) + date2 = 2DIGIT "-" month "-" 2DIGIT + ; day-month-year (e.g., 02-Jun-82) + date3 = month SP ( 2DIGIT | ( SP 1DIGIT )) + ; month day (e.g., Jun 2) + time = 2DIGIT ":" 2DIGIT ":" 2DIGIT + ; 00:00:00 - 23:59:59 + wkday = "Mon" | "Tue" | "Wed" + | "Thu" | "Fri" | "Sat" | "Sun" + weekday = "Monday" | "Tuesday" | "Wednesday" + | "Thursday" | "Friday" | "Saturday" | "Sunday" + month = "Jan" | "Feb" | "Mar" | "Apr" + | "May" | "Jun" | "Jul" | "Aug" + | "Sep" | "Oct" | "Nov" | "Dec" + + Note: HTTP requirements for the date/time stamp format apply only + to their usage within the protocol stream. 
Clients and servers are + not required to use these formats for user presentation, request + logging, etc. + +3.3.2 Delta Seconds + + Some HTTP header fields allow a time value to be specified as an + integer number of seconds, represented in decimal, after the time + that the message was received. + + delta-seconds = 1*DIGIT + +3.4 Character Sets + + HTTP uses the same definition of the term "character set" as that + described for MIME: + + + + + +Fielding, et al. Standards Track [Page 21] + +RFC 2616 HTTP/1.1 June 1999 + + + The term "character set" is used in this document to refer to a + method used with one or more tables to convert a sequence of octets + into a sequence of characters. Note that unconditional conversion in + the other direction is not required, in that not all characters may + be available in a given character set and a character set may provide + more than one sequence of octets to represent a particular character. + This definition is intended to allow various kinds of character + encoding, from simple single-table mappings such as US-ASCII to + complex table switching methods such as those that use ISO-2022's + techniques. However, the definition associated with a MIME character + set name MUST fully specify the mapping to be performed from octets + to characters. In particular, use of external profiling information + to determine the exact mapping is not permitted. + + Note: This use of the term "character set" is more commonly + referred to as a "character encoding." However, since HTTP and + MIME share the same registry, it is important that the terminology + also be shared. + + HTTP character sets are identified by case-insensitive tokens. The + complete set of tokens is defined by the IANA Character Set registry + [19]. 
+ + charset = token + + Although HTTP allows an arbitrary token to be used as a charset + value, any token that has a predefined value within the IANA + Character Set registry [19] MUST represent the character set defined + by that registry. Applications SHOULD limit their use of character + sets to those defined by the IANA registry. + + Implementors should be aware of IETF character set requirements [38] + [41]. + +3.4.1 Missing Charset + + Some HTTP/1.0 software has interpreted a Content-Type header without + charset parameter incorrectly to mean "recipient should guess." + Senders wishing to defeat this behavior MAY include a charset + parameter even when the charset is ISO-8859-1 and SHOULD do so when + it is known that it will not confuse the recipient. + + Unfortunately, some older HTTP/1.0 clients did not deal properly with + an explicit charset parameter. HTTP/1.1 recipients MUST respect the + charset label provided by the sender; and those user agents that have + a provision to "guess" a charset MUST use the charset from the + + + + + +Fielding, et al. Standards Track [Page 22] + +RFC 2616 HTTP/1.1 June 1999 + + + content-type field if they support that charset, rather than the + recipient's preference, when initially displaying a document. See + section 3.7.1. + +3.5 Content Codings + + Content coding values indicate an encoding transformation that has + been or can be applied to an entity. Content codings are primarily + used to allow a document to be compressed or otherwise usefully + transformed without losing the identity of its underlying media type + and without loss of information. Frequently, the entity is stored in + coded form, transmitted directly, and only decoded by the recipient. + + content-coding = token + + All content-coding values are case-insensitive. HTTP/1.1 uses + content-coding values in the Accept-Encoding (section 14.3) and + Content-Encoding (section 14.11) header fields. 
Although the value + describes the content-coding, what is more important is that it + indicates what decoding mechanism will be required to remove the + encoding. + + The Internet Assigned Numbers Authority (IANA) acts as a registry for + content-coding value tokens. Initially, the registry contains the + following tokens: + + gzip An encoding format produced by the file compression program + "gzip" (GNU zip) as described in RFC 1952 [25]. This format is a + Lempel-Ziv coding (LZ77) with a 32 bit CRC. + + compress + The encoding format produced by the common UNIX file compression + program "compress". This format is an adaptive Lempel-Ziv-Welch + coding (LZW). + + Use of program names for the identification of encoding formats + is not desirable and is discouraged for future encodings. Their + use here is representative of historical practice, not good + design. For compatibility with previous implementations of HTTP, + applications SHOULD consider "x-gzip" and "x-compress" to be + equivalent to "gzip" and "compress" respectively. + + deflate + The "zlib" format defined in RFC 1950 [31] in combination with + the "deflate" compression mechanism described in RFC 1951 [29]. + + + + + + +Fielding, et al. Standards Track [Page 23] + +RFC 2616 HTTP/1.1 June 1999 + + + identity + The default (identity) encoding; the use of no transformation + whatsoever. This content-coding is used only in the Accept- + Encoding header, and SHOULD NOT be used in the Content-Encoding + header. + + New content-coding value tokens SHOULD be registered; to allow + interoperability between clients and servers, specifications of the + content coding algorithms needed to implement a new value SHOULD be + publicly available and adequate for independent implementation, and + conform to the purpose of content coding defined in this section. 
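The content-coding rules of section 3.5 can be illustrated with a minimal sketch. It is not part of RFC 2616. The function name decode_content and the alias table are hypothetical names chosen for illustration. The sketch assumes the Content-Encoding field value lists codings in the order they were applied (as section 14.11 requires), so they are removed in reverse order. The "compress" (LZW) coding is deliberately left unsupported because the Python standard library has no decoder for it.

```python
import gzip
import zlib

# "x-gzip" and "x-compress" SHOULD be treated as equivalent to
# "gzip" and "compress" (section 3.5). Table name is hypothetical.
_ALIASES = {"x-gzip": "gzip", "x-compress": "compress"}


def decode_content(body: bytes, content_encoding: str) -> bytes:
    """Remove the content-codings named in a Content-Encoding value.

    Assumption: codings are listed in the order they were applied,
    so decoding walks the list from last to first.
    """
    # Content-coding values are case-insensitive tokens.
    codings = [c.strip().lower() for c in content_encoding.split(",") if c.strip()]
    for coding in reversed(codings):
        coding = _ALIASES.get(coding, coding)
        if coding == "gzip":
            body = gzip.decompress(body)      # RFC 1952 format
        elif coding == "deflate":
            body = zlib.decompress(body)      # zlib (RFC 1950) wrapping deflate (RFC 1951)
        elif coding == "identity":
            pass                              # no transformation
        else:
            # Includes "compress": no LZW decoder in the standard library.
            raise ValueError(f"unsupported content-coding: {coding}")
    return body
```

A caller would pass the raw entity-body and the Content-Encoding field value, for example decode_content(body, "gzip"). Some deployed servers are known to send raw deflate data without the zlib wrapper; this sketch does not attempt to handle that case.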
+ +3.6 Transfer Codings + + Transfer-coding values are used to indicate an encoding + transformation that has been, can be, or may need to be applied to an + entity-body in order to ensure "safe transport" through the network. + This differs from a content coding in that the transfer-coding is a + property of the message, not of the original entity. + + transfer-coding = "chunked" | transfer-extension + transfer-extension = token *( ";" parameter ) + + Parameters are in the form of attribute/value pairs. + + parameter = attribute "=" value + attribute = token + value = token | quoted-string + + All transfer-coding values are case-insensitive. HTTP/1.1 uses + transfer-coding values in the TE header field (section 14.39) and in + the Transfer-Encoding header field (section 14.41). + + Whenever a transfer-coding is applied to a message-body, the set of + transfer-codings MUST include "chunked", unless the message is + terminated by closing the connection. When the "chunked" transfer- + coding is used, it MUST be the last transfer-coding applied to the + message-body. The "chunked" transfer-coding MUST NOT be applied more + than once to a message-body. These rules allow the recipient to + determine the transfer-length of the message (section 4.4). + + Transfer-codings are analogous to the Content-Transfer-Encoding + values of MIME [7], which were designed to enable safe transport of + binary data over a 7-bit transport service. However, safe transport + has a different focus for an 8bit-clean transfer protocol. In HTTP, + the only unsafe characteristic of message-bodies is the difficulty in + determining the exact body length (section 7.2.2), or the desire to + encrypt data over a shared transport. + + + +Fielding, et al. Standards Track [Page 24] + +RFC 2616 HTTP/1.1 June 1999 + + + The Internet Assigned Numbers Authority (IANA) acts as a registry for + transfer-coding value tokens. 
Initially, the registry contains the + following tokens: "chunked" (section 3.6.1), "identity" (section + 3.6.2), "gzip" (section 3.5), "compress" (section 3.5), and "deflate" + (section 3.5). + + New transfer-coding value tokens SHOULD be registered in the same way + as new content-coding value tokens (section 3.5). + + A server which receives an entity-body with a transfer-coding it does + not understand SHOULD return 501 (Unimplemented), and close the + connection. A server MUST NOT send transfer-codings to an HTTP/1.0 + client. + +3.6.1 Chunked Transfer Coding + + The chunked encoding modifies the body of a message in order to + transfer it as a series of chunks, each with its own size indicator, + followed by an OPTIONAL trailer containing entity-header fields. This + allows dynamically produced content to be transferred along with the + information necessary for the recipient to verify that it has + received the full message. + + Chunked-Body = *chunk + last-chunk + trailer + CRLF + + chunk = chunk-size [ chunk-extension ] CRLF + chunk-data CRLF + chunk-size = 1*HEX + last-chunk = 1*("0") [ chunk-extension ] CRLF + + chunk-extension= *( ";" chunk-ext-name [ "=" chunk-ext-val ] ) + chunk-ext-name = token + chunk-ext-val = token | quoted-string + chunk-data = chunk-size(OCTET) + trailer = *(entity-header CRLF) + + The chunk-size field is a string of hex digits indicating the size of + the chunk. The chunked encoding is ended by any chunk whose size is + zero, followed by the trailer, which is terminated by an empty line. + + The trailer allows the sender to include additional HTTP header + fields at the end of the message. The Trailer header field can be + used to indicate which header fields are included in a trailer (see + section 14.40). + + + + +Fielding, et al. 
Standards Track [Page 25] + +RFC 2616 HTTP/1.1 June 1999 + + + A server using chunked transfer-coding in a response MUST NOT use the + trailer for any header fields unless at least one of the following is + true: + + a)the request included a TE header field that indicates "trailers" is + acceptable in the transfer-coding of the response, as described in + section 14.39; or, + + b)the server is the origin server for the response, the trailer + fields consist entirely of optional metadata, and the recipient + could use the message (in a manner acceptable to the origin server) + without receiving this metadata. In other words, the origin server + is willing to accept the possibility that the trailer fields might + be silently discarded along the path to the client. + + This requirement prevents an interoperability failure when the + message is being received by an HTTP/1.1 (or later) proxy and + forwarded to an HTTP/1.0 recipient. It avoids a situation where + compliance with the protocol would have necessitated a possibly + infinite buffer on the proxy. + + An example process for decoding a Chunked-Body is presented in + appendix 19.4.6. + + All HTTP/1.1 applications MUST be able to receive and decode the + "chunked" transfer-coding, and MUST ignore chunk-extension extensions + they do not understand. + +3.7 Media Types + + HTTP uses Internet Media Types [17] in the Content-Type (section + 14.17) and Accept (section 14.1) header fields in order to provide + open and extensible data typing and type negotiation. + + media-type = type "/" subtype *( ";" parameter ) + type = token + subtype = token + + Parameters MAY follow the type/subtype in the form of attribute/value + pairs (as defined in section 3.6). + + The type, subtype, and parameter attribute names are case- + insensitive. Parameter values might or might not be case-sensitive, + depending on the semantics of the parameter name. 
Linear white space + (LWS) MUST NOT be used between the type and subtype, nor between an + attribute and its value. The presence or absence of a parameter might + be significant to the processing of a media-type, depending on its + definition within the media type registry. + + + +Fielding, et al. Standards Track [Page 26] + +RFC 2616 HTTP/1.1 June 1999 + + + Note that some older HTTP applications do not recognize media type + parameters. When sending data to older HTTP applications, + implementations SHOULD only use media type parameters when they are + required by that type/subtype definition. + + Media-type values are registered with the Internet Assigned Number + Authority (IANA [19]). The media type registration process is + outlined in RFC 1590 [17]. Use of non-registered media types is + discouraged. + +3.7.1 Canonicalization and Text Defaults + + Internet media types are registered with a canonical form. An + entity-body transferred via HTTP messages MUST be represented in the + appropriate canonical form prior to its transmission except for + "text" types, as defined in the next paragraph. + + When in canonical form, media subtypes of the "text" type use CRLF as + the text line break. HTTP relaxes this requirement and allows the + transport of text media with plain CR or LF alone representing a line + break when it is done consistently for an entire entity-body. HTTP + applications MUST accept CRLF, bare CR, and bare LF as being + representative of a line break in text media received via HTTP. In + addition, if the text is represented in a character set that does not + use octets 13 and 10 for CR and LF respectively, as is the case for + some multi-byte character sets, HTTP allows the use of whatever octet + sequences are defined by that character set to represent the + equivalent of CR and LF for line breaks. 
This flexibility regarding + line breaks applies only to text media in the entity-body; a bare CR + or LF MUST NOT be substituted for CRLF within any of the HTTP control + structures (such as header fields and multipart boundaries). + + If an entity-body is encoded with a content-coding, the underlying + data MUST be in a form defined above prior to being encoded. + + The "charset" parameter is used with some media types to define the + character set (section 3.4) of the data. When no explicit charset + parameter is provided by the sender, media subtypes of the "text" + type are defined to have a default charset value of "ISO-8859-1" when + received via HTTP. Data in character sets other than "ISO-8859-1" or + its subsets MUST be labeled with an appropriate charset value. See + section 3.4.1 for compatibility problems. + +3.7.2 Multipart Types + + MIME provides for a number of "multipart" types -- encapsulations of + one or more entities within a single message-body. All multipart + types share a common syntax, as defined in section 5.1.1 of RFC 2046 + + + +Fielding, et al. Standards Track [Page 27] + +RFC 2616 HTTP/1.1 June 1999 + + + [40], and MUST include a boundary parameter as part of the media type + value. The message body is itself a protocol element and MUST + therefore use only CRLF to represent line breaks between body-parts. + Unlike in RFC 2046, the epilogue of any multipart message MUST be + empty; HTTP applications MUST NOT transmit the epilogue (even if the + original multipart contains an epilogue). These restrictions exist in + order to preserve the self-delimiting nature of a multipart message- + body, wherein the "end" of the message-body is indicated by the + ending multipart boundary. + + In general, HTTP treats a multipart message-body no differently than + any other media type: strictly as payload. 
The one exception is the + "multipart/byteranges" type (appendix 19.2) when it appears in a 206 + (Partial Content) response, which will be interpreted by some HTTP + caching mechanisms as described in sections 13.5.4 and 14.16. In all + other cases, an HTTP user agent SHOULD follow the same or similar + behavior as a MIME user agent would upon receipt of a multipart type. + The MIME header fields within each body-part of a multipart message- + body do not have any significance to HTTP beyond that defined by + their MIME semantics. + + In general, an HTTP user agent SHOULD follow the same or similar + behavior as a MIME user agent would upon receipt of a multipart type. + If an application receives an unrecognized multipart subtype, the + application MUST treat it as being equivalent to "multipart/mixed". + + Note: The "multipart/form-data" type has been specifically defined + for carrying form data suitable for processing via the POST + request method, as described in RFC 1867 [15]. + +3.8 Product Tokens + + Product tokens are used to allow communicating applications to + identify themselves by software name and version. Most fields using + product tokens also allow sub-products which form a significant part + of the application to be listed, separated by white space. By + convention, the products are listed in order of their significance + for identifying the application. + + product = token ["/" product-version] + product-version = token + + Examples: + + User-Agent: CERN-LineMode/2.15 libwww/2.17b3 + Server: Apache/0.8.4 + + + + + +Fielding, et al. Standards Track [Page 28] + +RFC 2616 HTTP/1.1 June 1999 + + + Product tokens SHOULD be short and to the point. They MUST NOT be + used for advertising or other non-essential information. 
Although any + token character MAY appear in a product-version, this token SHOULD + only be used for a version identifier (i.e., successive versions of + the same product SHOULD only differ in the product-version portion of + the product value). + +3.9 Quality Values + + HTTP content negotiation (section 12) uses short "floating point" + numbers to indicate the relative importance ("weight") of various + negotiable parameters. A weight is normalized to a real number in + the range 0 through 1, where 0 is the minimum and 1 the maximum + value. If a parameter has a quality value of 0, then content with + this parameter is `not acceptable' for the client. HTTP/1.1 + applications MUST NOT generate more than three digits after the + decimal point. User configuration of these values SHOULD also be + limited in this fashion. + + qvalue = ( "0" [ "." 0*3DIGIT ] ) + | ( "1" [ "." 0*3("0") ] ) + + "Quality values" is a misnomer, since these values merely represent + relative degradation in desired quality. + +3.10 Language Tags + + A language tag identifies a natural language spoken, written, or + otherwise conveyed by human beings for communication of information + to other human beings. Computer languages are explicitly excluded. + HTTP uses language tags within the Accept-Language and Content- + Language fields. + + The syntax and registry of HTTP language tags is the same as that + defined by RFC 1766 [1]. In summary, a language tag is composed of 1 + or more parts: A primary language tag and a possibly empty series of + subtags: + + language-tag = primary-tag *( "-" subtag ) + primary-tag = 1*8ALPHA + subtag = 1*8ALPHA + + White space is not allowed within the tag and all tags are case- + insensitive. The name space of language tags is administered by the + IANA. Example tags include: + + en, en-US, en-cockney, i-cherokee, x-pig-latin + + + + +Fielding, et al. 
Standards Track [Page 29] + +RFC 2616 HTTP/1.1 June 1999 + + + where any two-letter primary-tag is an ISO-639 language abbreviation + and any two-letter initial subtag is an ISO-3166 country code. (The + last three tags above are not registered tags; all but the last are + examples of tags which could be registered in future.) + +3.11 Entity Tags + + Entity tags are used for comparing two or more entities from the same + requested resource. HTTP/1.1 uses entity tags in the ETag (section + 14.19), If-Match (section 14.24), If-None-Match (section 14.26), and + If-Range (section 14.27) header fields. The definition of how they + are used and compared as cache validators is in section 13.3.3. An + entity tag consists of an opaque quoted string, possibly prefixed by + a weakness indicator. + + entity-tag = [ weak ] opaque-tag + weak = "W/" + opaque-tag = quoted-string + + A "strong entity tag" MAY be shared by two entities of a resource + only if they are equivalent by octet equality. + + A "weak entity tag," indicated by the "W/" prefix, MAY be shared by + two entities of a resource only if the entities are equivalent and + could be substituted for each other with no significant change in + semantics. A weak entity tag can only be used for weak comparison. + + An entity tag MUST be unique across all versions of all entities + associated with a particular resource. A given entity tag value MAY + be used for entities obtained by requests on different URIs. The use + of the same entity tag value in conjunction with entities obtained by + requests on different URIs does not imply the equivalence of those + entities. + +3.12 Range Units + + HTTP/1.1 allows a client to request that only part (a range of) the + response entity be included within the response. HTTP/1.1 uses range + units in the Range (section 14.35) and Content-Range (section 14.16) + header fields. An entity can be broken down into subranges according + to various structural units. 
+ + range-unit = bytes-unit | other-range-unit + bytes-unit = "bytes" + other-range-unit = token + + The only range unit defined by HTTP/1.1 is "bytes". HTTP/1.1 + implementations MAY ignore ranges specified using other units. + + + +Fielding, et al. Standards Track [Page 30] + +RFC 2616 HTTP/1.1 June 1999 + + + HTTP/1.1 has been designed to allow implementations of applications + that do not depend on knowledge of ranges. + +4 HTTP Message + +4.1 Message Types + + HTTP messages consist of requests from client to server and responses + from server to client. + + HTTP-message = Request | Response ; HTTP/1.1 messages + + Request (section 5) and Response (section 6) messages use the generic + message format of RFC 822 [9] for transferring entities (the payload + of the message). Both types of message consist of a start-line, zero + or more header fields (also known as "headers"), an empty line (i.e., + a line with nothing preceding the CRLF) indicating the end of the + header fields, and possibly a message-body. + + generic-message = start-line + *(message-header CRLF) + CRLF + [ message-body ] + start-line = Request-Line | Status-Line + + In the interest of robustness, servers SHOULD ignore any empty + line(s) received where a Request-Line is expected. In other words, if + the server is reading the protocol stream at the beginning of a + message and receives a CRLF first, it should ignore the CRLF. + + Certain buggy HTTP/1.0 client implementations generate extra CRLF's + after a POST request. To restate what is explicitly forbidden by the + BNF, an HTTP/1.1 client MUST NOT preface or follow a request with an + extra CRLF. + +4.2 Message Headers + + HTTP header fields, which include general-header (section 4.5), + request-header (section 5.3), response-header (section 6.2), and + entity-header (section 7.1) fields, follow the same generic format as + that given in Section 3.1 of RFC 822 [9]. 
Each header field consists + of a name followed by a colon (":") and the field value. Field names + are case-insensitive. The field value MAY be preceded by any amount + of LWS, though a single SP is preferred. Header fields can be + extended over multiple lines by preceding each extra line with at + least one SP or HT. Applications ought to follow "common form", where + one is known or indicated, when generating HTTP constructs, since + there might exist some implementations that fail to accept anything + + + +Fielding, et al. Standards Track [Page 31] + +RFC 2616 HTTP/1.1 June 1999 + + + beyond the common forms. + + message-header = field-name ":" [ field-value ] + field-name = token + field-value = *( field-content | LWS ) + field-content = <the OCTETs making up the field-value + and consisting of either *TEXT or combinations + of token, separators, and quoted-string> + + The field-content does not include any leading or trailing LWS: + linear white space occurring before the first non-whitespace + character of the field-value or after the last non-whitespace + character of the field-value. Such leading or trailing LWS MAY be + removed without changing the semantics of the field value. Any LWS + that occurs between field-content MAY be replaced with a single SP + before interpreting the field value or forwarding the message + downstream. + + The order in which header fields with differing field names are + received is not significant. However, it is "good practice" to send + general-header fields first, followed by request-header or response- + header fields, and ending with the entity-header fields. + + Multiple message-header fields with the same field-name MAY be + present in a message if and only if the entire field-value for that + header field is defined as a comma-separated list [i.e., #(values)]. + It MUST be possible to combine the multiple header fields into one + "field-name: field-value" pair, without changing the semantics of the + message, by appending each subsequent field-value to the first, each + separated by a comma.
The order in which header fields with the same + field-name are received is therefore significant to the + interpretation of the combined field value, and thus a proxy MUST NOT + change the order of these field values when a message is forwarded. + +4.3 Message Body + + The message-body (if any) of an HTTP message is used to carry the + entity-body associated with the request or response. The message-body + differs from the entity-body only when a transfer-coding has been + applied, as indicated by the Transfer-Encoding header field (section + 14.41). + + message-body = entity-body + | <entity-body encoded as per Transfer-Encoding> + + Transfer-Encoding MUST be used to indicate any transfer-codings + applied by an application to ensure safe and proper transfer of the + message. Transfer-Encoding is a property of the message, not of the + + + +Fielding, et al. Standards Track [Page 32] + +RFC 2616 HTTP/1.1 June 1999 + + + entity, and thus MAY be added or removed by any application along the + request/response chain. (However, section 3.6 places restrictions on + when certain transfer-codings may be used.) + + The rules for when a message-body is allowed in a message differ for + requests and responses. + + The presence of a message-body in a request is signaled by the + inclusion of a Content-Length or Transfer-Encoding header field in + the request's message-headers. A message-body MUST NOT be included in + a request if the specification of the request method (section 5.1.1) + does not allow sending an entity-body in requests. A server SHOULD + read and forward a message-body on any request; if the request method + does not include defined semantics for an entity-body, then the + message-body SHOULD be ignored when handling the request. + + For response messages, whether or not a message-body is included with + a message is dependent on both the request method and the response + status code (section 6.1.1).
All responses to the HEAD request method + MUST NOT include a message-body, even though the presence of entity- + header fields might lead one to believe they do. All 1xx + (informational), 204 (no content), and 304 (not modified) responses + MUST NOT include a message-body. All other responses do include a + message-body, although it MAY be of zero length. + +4.4 Message Length + + The transfer-length of a message is the length of the message-body as + it appears in the message; that is, after any transfer-codings have + been applied. When a message-body is included with a message, the + transfer-length of that body is determined by one of the following + (in order of precedence): + + 1.Any response message which "MUST NOT" include a message-body (such + as the 1xx, 204, and 304 responses and any response to a HEAD + request) is always terminated by the first empty line after the + header fields, regardless of the entity-header fields present in + the message. + + 2.If a Transfer-Encoding header field (section 14.41) is present and + has any value other than "identity", then the transfer-length is + defined by use of the "chunked" transfer-coding (section 3.6), + unless the message is terminated by closing the connection. + + 3.If a Content-Length header field (section 14.13) is present, its + decimal value in OCTETs represents both the entity-length and the + transfer-length. The Content-Length header field MUST NOT be sent + if these two lengths are different (i.e., if a Transfer-Encoding + + + +Fielding, et al. Standards Track [Page 33] + +RFC 2616 HTTP/1.1 June 1999 + + + header field is present). If a message is received with both a + Transfer-Encoding header field and a Content-Length header field, + the latter MUST be ignored. + + 4.If the message uses the media type "multipart/byteranges", and the + transfer-length is not otherwise specified, then this self- + delimiting media type defines the transfer-length.
This media type + MUST NOT be used unless the sender knows that the recipient can parse + it; the presence in a request of a Range header with multiple byte- + range specifiers from a 1.1 client implies that the client can parse + multipart/byteranges responses. + + A range header might be forwarded by a 1.0 proxy that does not + understand multipart/byteranges; in this case the server MUST + delimit the message using methods defined in items 1, 3 or 5 of + this section. + + 5.By the server closing the connection. (Closing the connection + cannot be used to indicate the end of a request body, since that + would leave no possibility for the server to send back a response.) + + For compatibility with HTTP/1.0 applications, HTTP/1.1 requests + containing a message-body MUST include a valid Content-Length header + field unless the server is known to be HTTP/1.1 compliant. If a + request contains a message-body and a Content-Length is not given, + the server SHOULD respond with 400 (bad request) if it cannot + determine the length of the message, or with 411 (length required) if + it wishes to insist on receiving a valid Content-Length. + + All HTTP/1.1 applications that receive entities MUST accept the + "chunked" transfer-coding (section 3.6), thus allowing this mechanism + to be used for messages when the message length cannot be determined + in advance. + + Messages MUST NOT include both a Content-Length header field and a + non-identity transfer-coding. If the message does include a non- + identity transfer-coding, the Content-Length MUST be ignored. + + When a Content-Length is given in a message where a message-body is + allowed, its field value MUST exactly match the number of OCTETs in + the message-body. HTTP/1.1 user agents MUST notify the user when an + invalid length is received and detected.
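The precedence order in section 4.4 can be sketched as a small decision function. This is an illustrative reading of rules 1 through 5, not text from RFC 2616. The function name body_framing, its parameters, and the returned labels ("none", "chunked", "content-length", "byteranges", "close") are all hypothetical. The sketch simplifies rule 2 by always reporting "chunked" for a non-identity Transfer-Encoding and ignores the close-delimited exception. It also assumes, following section 4.3, that a request carrying neither Content-Length nor Transfer-Encoding has no body.

```python
from typing import Mapping, Optional, Tuple


def body_framing(headers: Mapping[str, str], *, is_response: bool,
                 status: Optional[int] = None,
                 request_method: str = "GET") -> Tuple[str, Optional[int]]:
    """Return (framing, length) following the section 4.4 precedence.

    `request_method` is the method of the request being answered when
    is_response is true. Labels and signature are illustrative only.
    """
    # Field names are case-insensitive (section 4.2).
    h = {name.lower(): value.strip() for name, value in headers.items()}

    # Rule 1: responses that MUST NOT include a message-body.
    if is_response:
        bodiless = status is not None and (100 <= status < 200 or status in (204, 304))
        if bodiless or request_method == "HEAD":
            return ("none", 0)

    # Rule 2: any Transfer-Encoding other than "identity" wins,
    # and a Content-Length sent alongside it MUST be ignored.
    transfer_encoding = h.get("transfer-encoding", "")
    if transfer_encoding and transfer_encoding.lower() != "identity":
        return ("chunked", None)

    # Rule 3: Content-Length gives the transfer-length in octets.
    if "content-length" in h:
        length = int(h["content-length"])  # raises ValueError if not decimal
        if length < 0:
            raise ValueError("negative Content-Length")
        return ("content-length", length)

    # Rule 4: self-delimiting multipart/byteranges.
    if h.get("content-type", "").lower().startswith("multipart/byteranges"):
        return ("byteranges", None)

    # Rule 5: only a response can be delimited by closing the connection.
    if is_response:
        return ("close", None)
    return ("none", 0)
```

For example, a 200 response carrying both "Transfer-Encoding: chunked" and "Content-Length: 10" is reported as chunked, matching the requirement that the Content-Length be ignored in that case.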
+ +4.5 General Header Fields + + There are a few header fields which have general applicability for + both request and response messages, but which do not apply to the + entity being transferred. These header fields apply only to the + + + +Fielding, et al. Standards Track [Page 34] + +RFC 2616 HTTP/1.1 June 1999 + + + message being transmitted. + + general-header = Cache-Control ; Section 14.9 + | Connection ; Section 14.10 + | Date ; Section 14.18 + | Pragma ; Section 14.32 + | Trailer ; Section 14.40 + | Transfer-Encoding ; Section 14.41 + | Upgrade ; Section 14.42 + | Via ; Section 14.45 + | Warning ; Section 14.46 + + General-header field names can be extended reliably only in + combination with a change in the protocol version. However, new or + experimental header fields may be given the semantics of general + header fields if all parties in the communication recognize them to + be general-header fields. Unrecognized header fields are treated as + entity-header fields. + +5 Request + + A request message from a client to a server includes, within the + first line of that message, the method to be applied to the resource, + the identifier of the resource, and the protocol version in use. + + Request = Request-Line ; Section 5.1 + *(( general-header ; Section 4.5 + | request-header ; Section 5.3 + | entity-header ) CRLF) ; Section 7.1 + CRLF + [ message-body ] ; Section 4.3 + +5.1 Request-Line + + The Request-Line begins with a method token, followed by the + Request-URI and the protocol version, and ending with CRLF. The + elements are separated by SP characters. No CR or LF is allowed + except in the final CRLF sequence. + + Request-Line = Method SP Request-URI SP HTTP-Version CRLF + + + + + + + + + + + +Fielding, et al. Standards Track [Page 35] + +RFC 2616 HTTP/1.1 June 1999 + + +5.1.1 Method + + The Method token indicates the method to be performed on the + resource identified by the Request-URI. The method is case-sensitive. 
+ + Method = "OPTIONS" ; Section 9.2 + | "GET" ; Section 9.3 + | "HEAD" ; Section 9.4 + | "POST" ; Section 9.5 + | "PUT" ; Section 9.6 + | "DELETE" ; Section 9.7 + | "TRACE" ; Section 9.8 + | "CONNECT" ; Section 9.9 + | extension-method + extension-method = token + + The list of methods allowed by a resource can be specified in an + Allow header field (section 14.7). The return code of the response + always notifies the client whether a method is currently allowed on a + resource, since the set of allowed methods can change dynamically. An + origin server SHOULD return the status code 405 (Method Not Allowed) + if the method is known by the origin server but not allowed for the + requested resource, and 501 (Not Implemented) if the method is + unrecognized or not implemented by the origin server. The methods GET + and HEAD MUST be supported by all general-purpose servers. All other + methods are OPTIONAL; however, if the above methods are implemented, + they MUST be implemented with the same semantics as those specified + in section 9. + +5.1.2 Request-URI + + The Request-URI is a Uniform Resource Identifier (section 3.2) and + identifies the resource upon which to apply the request. + + Request-URI = "*" | absoluteURI | abs_path | authority + + The four options for Request-URI are dependent on the nature of the + request. The asterisk "*" means that the request does not apply to a + particular resource, but to the server itself, and is only allowed + when the method used does not necessarily apply to a resource. One + example would be + + OPTIONS * HTTP/1.1 + + The absoluteURI form is REQUIRED when the request is being made to a + proxy. The proxy is requested to forward the request or service it + from a valid cache, and return the response. Note that the proxy MAY + forward the request on to another proxy or directly to the server + + + +Fielding, et al. Standards Track [Page 36] + +RFC 2616 HTTP/1.1 June 1999 + + + specified by the absoluteURI. 
In order to avoid request loops, a + proxy MUST be able to recognize all of its server names, including + any aliases, local variations, and the numeric IP address. An example + Request-Line would be: + + GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1 + + To allow for transition to absoluteURIs in all requests in future + versions of HTTP, all HTTP/1.1 servers MUST accept the absoluteURI + form in requests, even though HTTP/1.1 clients will only generate + them in requests to proxies. + + The authority form is only used by the CONNECT method (section 9.9). + + The most common form of Request-URI is that used to identify a + resource on an origin server or gateway. In this case the absolute + path of the URI MUST be transmitted (see section 3.2.1, abs_path) as + the Request-URI, and the network location of the URI (authority) MUST + be transmitted in a Host header field. For example, a client wishing + to retrieve the resource above directly from the origin server would + create a TCP connection to port 80 of the host "www.w3.org" and send + the lines: + + GET /pub/WWW/TheProject.html HTTP/1.1 + Host: www.w3.org + + followed by the remainder of the Request. Note that the absolute path + cannot be empty; if none is present in the original URI, it MUST be + given as "/" (the server root). + + The Request-URI is transmitted in the format specified in section + 3.2.1. If the Request-URI is encoded using the "% HEX HEX" encoding + [42], the origin server MUST decode the Request-URI in order to + properly interpret the request. Servers SHOULD respond to invalid + Request-URIs with an appropriate status code. + + A transparent proxy MUST NOT rewrite the "abs_path" part of the + received Request-URI when forwarding it to the next inbound server, + except as noted above to replace a null abs_path with "/". 
+ + Note: The "no rewrite" rule prevents the proxy from changing the + meaning of the request when the origin server is improperly using + a non-reserved URI character for a reserved purpose. Implementors + should be aware that some pre-HTTP/1.1 proxies have been known to + rewrite the Request-URI. + + + + + + +Fielding, et al. Standards Track [Page 37] + +RFC 2616 HTTP/1.1 June 1999 + + +5.2 The Resource Identified by a Request + + The exact resource identified by an Internet request is determined by + examining both the Request-URI and the Host header field. + + An origin server that does not allow resources to differ by the + requested host MAY ignore the Host header field value when + determining the resource identified by an HTTP/1.1 request. (But see + section 19.6.1.1 for other requirements on Host support in HTTP/1.1.) + + An origin server that does differentiate resources based on the host + requested (sometimes referred to as virtual hosts or vanity host + names) MUST use the following rules for determining the requested + resource on an HTTP/1.1 request: + + 1. If Request-URI is an absoluteURI, the host is part of the + Request-URI. Any Host header field value in the request MUST be + ignored. + + 2. If the Request-URI is not an absoluteURI, and the request includes + a Host header field, the host is determined by the Host header + field value. + + 3. If the host as determined by rule 1 or 2 is not a valid host on + the server, the response MUST be a 400 (Bad Request) error message. + + Recipients of an HTTP/1.0 request that lacks a Host header field MAY + attempt to use heuristics (e.g., examination of the URI path for + something unique to a particular host) in order to determine what + exact resource is being requested. + +5.3 Request Header Fields + + The request-header fields allow the client to pass additional + information about the request, and about the client itself, to the + server. 
These fields act as request modifiers, with semantics + equivalent to the parameters on a programming language method + invocation. + + request-header = Accept ; Section 14.1 + | Accept-Charset ; Section 14.2 + | Accept-Encoding ; Section 14.3 + | Accept-Language ; Section 14.4 + | Authorization ; Section 14.8 + | Expect ; Section 14.20 + | From ; Section 14.22 + | Host ; Section 14.23 + | If-Match ; Section 14.24 + + + +Fielding, et al. Standards Track [Page 38] + +RFC 2616 HTTP/1.1 June 1999 + + + | If-Modified-Since ; Section 14.25 + | If-None-Match ; Section 14.26 + | If-Range ; Section 14.27 + | If-Unmodified-Since ; Section 14.28 + | Max-Forwards ; Section 14.31 + | Proxy-Authorization ; Section 14.34 + | Range ; Section 14.35 + | Referer ; Section 14.36 + | TE ; Section 14.39 + | User-Agent ; Section 14.43 + + Request-header field names can be extended reliably only in + combination with a change in the protocol version. However, new or + experimental header fields MAY be given the semantics of request- + header fields if all parties in the communication recognize them to + be request-header fields. Unrecognized header fields are treated as + entity-header fields. + +6 Response + + After receiving and interpreting a request message, a server responds + with an HTTP response message. + + Response = Status-Line ; Section 6.1 + *(( general-header ; Section 4.5 + | response-header ; Section 6.2 + | entity-header ) CRLF) ; Section 7.1 + CRLF + [ message-body ] ; Section 7.2 + +6.1 Status-Line + + The first line of a Response message is the Status-Line, consisting + of the protocol version followed by a numeric status code and its + associated textual phrase, with each element separated by SP + characters. No CR or LF is allowed except in the final CRLF sequence. 
+ + Status-Line = HTTP-Version SP Status-Code SP Reason-Phrase CRLF + +6.1.1 Status Code and Reason Phrase + + The Status-Code element is a 3-digit integer result code of the + attempt to understand and satisfy the request. These codes are fully + defined in section 10. The Reason-Phrase is intended to give a short + textual description of the Status-Code. The Status-Code is intended + for use by automata and the Reason-Phrase is intended for the human + user. The client is not required to examine or display the Reason- + Phrase. + + + +Fielding, et al. Standards Track [Page 39] + +RFC 2616 HTTP/1.1 June 1999 + + + The first digit of the Status-Code defines the class of response. The + last two digits do not have any categorization role. There are 5 + values for the first digit: + + - 1xx: Informational - Request received, continuing process + + - 2xx: Success - The action was successfully received, + understood, and accepted + + - 3xx: Redirection - Further action must be taken in order to + complete the request + + - 4xx: Client Error - The request contains bad syntax or cannot + be fulfilled + + - 5xx: Server Error - The server failed to fulfill an apparently + valid request + + The individual values of the numeric status codes defined for + HTTP/1.1, and an example set of corresponding Reason-Phrase's, are + presented below. The reason phrases listed here are only + recommendations -- they MAY be replaced by local equivalents without + affecting the protocol. 
+ + Status-Code = + "100" ; Section 10.1.1: Continue + | "101" ; Section 10.1.2: Switching Protocols + | "200" ; Section 10.2.1: OK + | "201" ; Section 10.2.2: Created + | "202" ; Section 10.2.3: Accepted + | "203" ; Section 10.2.4: Non-Authoritative Information + | "204" ; Section 10.2.5: No Content + | "205" ; Section 10.2.6: Reset Content + | "206" ; Section 10.2.7: Partial Content + | "300" ; Section 10.3.1: Multiple Choices + | "301" ; Section 10.3.2: Moved Permanently + | "302" ; Section 10.3.3: Found + | "303" ; Section 10.3.4: See Other + | "304" ; Section 10.3.5: Not Modified + | "305" ; Section 10.3.6: Use Proxy + | "307" ; Section 10.3.8: Temporary Redirect + | "400" ; Section 10.4.1: Bad Request + | "401" ; Section 10.4.2: Unauthorized + | "402" ; Section 10.4.3: Payment Required + | "403" ; Section 10.4.4: Forbidden + | "404" ; Section 10.4.5: Not Found + | "405" ; Section 10.4.6: Method Not Allowed + | "406" ; Section 10.4.7: Not Acceptable + + + +Fielding, et al. Standards Track [Page 40] + +RFC 2616 HTTP/1.1 June 1999 + + + | "407" ; Section 10.4.8: Proxy Authentication Required + | "408" ; Section 10.4.9: Request Time-out + | "409" ; Section 10.4.10: Conflict + | "410" ; Section 10.4.11: Gone + | "411" ; Section 10.4.12: Length Required + | "412" ; Section 10.4.13: Precondition Failed + | "413" ; Section 10.4.14: Request Entity Too Large + | "414" ; Section 10.4.15: Request-URI Too Large + | "415" ; Section 10.4.16: Unsupported Media Type + | "416" ; Section 10.4.17: Requested range not satisfiable + | "417" ; Section 10.4.18: Expectation Failed + | "500" ; Section 10.5.1: Internal Server Error + | "501" ; Section 10.5.2: Not Implemented + | "502" ; Section 10.5.3: Bad Gateway + | "503" ; Section 10.5.4: Service Unavailable + | "504" ; Section 10.5.5: Gateway Time-out + | "505" ; Section 10.5.6: HTTP Version not supported + | extension-code + + extension-code = 3DIGIT + Reason-Phrase = *<TEXT, excluding CR, LF> + + HTTP status codes are extensible.
HTTP applications are not required + to understand the meaning of all registered status codes, though such + understanding is obviously desirable. However, applications MUST + understand the class of any status code, as indicated by the first + digit, and treat any unrecognized response as being equivalent to the + x00 status code of that class, with the exception that an + unrecognized response MUST NOT be cached. For example, if an + unrecognized status code of 431 is received by the client, it can + safely assume that there was something wrong with its request and + treat the response as if it had received a 400 status code. In such + cases, user agents SHOULD present to the user the entity returned + with the response, since that entity is likely to include human- + readable information which will explain the unusual status. + +6.2 Response Header Fields + + The response-header fields allow the server to pass additional + information about the response which cannot be placed in the Status- + Line. These header fields give information about the server and about + further access to the resource identified by the Request-URI. + + response-header = Accept-Ranges ; Section 14.5 + | Age ; Section 14.6 + | ETag ; Section 14.19 + | Location ; Section 14.30 + | Proxy-Authenticate ; Section 14.33 + + + +Fielding, et al. Standards Track [Page 41] + +RFC 2616 HTTP/1.1 June 1999 + + + | Retry-After ; Section 14.37 + | Server ; Section 14.38 + | Vary ; Section 14.44 + | WWW-Authenticate ; Section 14.47 + + Response-header field names can be extended reliably only in + combination with a change in the protocol version. However, new or + experimental header fields MAY be given the semantics of response- + header fields if all parties in the communication recognize them to + be response-header fields. Unrecognized header fields are treated as + entity-header fields. 
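The status-code rules of section 6.1.1 (class given by the first digit, unrecognized codes treated as the x00 code of their class and never cached) can be sketched as below. The names `parse_status_line`, `effective_status`, and `KNOWN_CODES` are hypothetical, and the set of "recognized" codes is assumed to be exactly the codes listed in the Status-Code production; a real application would use whatever set it actually understands.

```python
# Class names keyed by the first digit of the Status-Code (section 6.1.1).
STATUS_CLASSES = {
    1: "Informational",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}

# Assumption: the application recognizes exactly the codes listed in
# the Status-Code production (note that 306 is not among them).
KNOWN_CODES = {
    100, 101,
    *range(200, 207),
    *range(300, 306), 307,
    *range(400, 418),
    *range(500, 506),
}


def parse_status_line(line: str) -> tuple:
    """Split 'HTTP-Version SP Status-Code SP Reason-Phrase CRLF'."""
    version, _, rest = line.rstrip("\r\n").partition(" ")
    code, _, reason = rest.partition(" ")
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError(f"malformed Status-Code: {code!r}")
    return version, int(code), reason


def effective_status(code: int) -> int:
    """Map an unrecognized code to the x00 code of its class."""
    return code if code in KNOWN_CODES else (code // 100) * 100


def is_recognized(code: int) -> bool:
    """An unrecognized response MUST NOT be cached."""
    return code in KNOWN_CODES


if __name__ == "__main__":
    version, code, reason = parse_status_line("HTTP/1.1 431 Unusual\r\n")
    print(STATUS_CLASSES[code // 100], effective_status(code), is_recognized(code))
```

With the 431 example from the text, the sketch reports the Client Error class, falls back to 400, and marks the response as not recognized (so a cache would have to skip it).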
+ +7 Entity + + Request and Response messages MAY transfer an entity if not otherwise + restricted by the request method or response status code. An entity + consists of entity-header fields and an entity-body, although some + responses will only include the entity-headers. + + In this section, both sender and recipient refer to either the client + or the server, depending on who sends and who receives the entity. + +7.1 Entity Header Fields + + Entity-header fields define metainformation about the entity-body or, + if no body is present, about the resource identified by the request. + Some of this metainformation is OPTIONAL; some might be REQUIRED by + portions of this specification. + + entity-header = Allow ; Section 14.7 + | Content-Encoding ; Section 14.11 + | Content-Language ; Section 14.12 + | Content-Length ; Section 14.13 + | Content-Location ; Section 14.14 + | Content-MD5 ; Section 14.15 + | Content-Range ; Section 14.16 + | Content-Type ; Section 14.17 + | Expires ; Section 14.21 + | Last-Modified ; Section 14.29 + | extension-header + + extension-header = message-header + + The extension-header mechanism allows additional entity-header fields + to be defined without changing the protocol, but these fields cannot + be assumed to be recognizable by the recipient. Unrecognized header + fields SHOULD be ignored by the recipient and MUST be forwarded by + transparent proxies. + + + +Fielding, et al. Standards Track [Page 42] + +RFC 2616 HTTP/1.1 June 1999 + + +7.2 Entity Body + + The entity-body (if any) sent with an HTTP request or response is in + a format and encoding defined by the entity-header fields. + + entity-body = *OCTET + + An entity-body is only present in a message when a message-body is + present, as described in section 4.3. The entity-body is obtained + from the message-body by decoding any Transfer-Encoding that might + have been applied to ensure safe and proper transfer of the message. 
+ +7.2.1 Type + + When an entity-body is included with a message, the data type of that + body is determined via the header fields Content-Type and Content- + Encoding. These define a two-layer, ordered encoding model: + + entity-body := Content-Encoding( Content-Type( data ) ) + + Content-Type specifies the media type of the underlying data. + Content-Encoding may be used to indicate any additional content + codings applied to the data, usually for the purpose of data + compression, that are a property of the requested resource. There is + no default encoding. + + Any HTTP/1.1 message containing an entity-body SHOULD include a + Content-Type header field defining the media type of that body. If + and only if the media type is not given by a Content-Type field, the + recipient MAY attempt to guess the media type via inspection of its + content and/or the name extension(s) of the URI used to identify the + resource. If the media type remains unknown, the recipient SHOULD + treat it as type "application/octet-stream". + +7.2.2 Entity Length + + The entity-length of a message is the length of the message-body + before any transfer-codings have been applied. Section 4.4 defines + how the transfer-length of a message-body is determined. + + + + + + + + + + + + +Fielding, et al. Standards Track [Page 43] + +RFC 2616 HTTP/1.1 June 1999 + + +8 Connections + +8.1 Persistent Connections + +8.1.1 Purpose + + Prior to persistent connections, a separate TCP connection was + established to fetch each URL, increasing the load on HTTP servers + and causing congestion on the Internet. The use of inline images and + other associated data often require a client to make multiple + requests of the same server in a short amount of time. Analysis of + these performance problems and results from a prototype + implementation are available [26] [30]. Implementation experience and + measurements of actual HTTP/1.1 (RFC 2068) implementations show good + results [39]. 
Alternatives have also been explored, for example, + T/TCP [27]. + + Persistent HTTP connections have a number of advantages: + + - By opening and closing fewer TCP connections, CPU time is saved + in routers and hosts (clients, servers, proxies, gateways, + tunnels, or caches), and memory used for TCP protocol control + blocks can be saved in hosts. + + - HTTP requests and responses can be pipelined on a connection. + Pipelining allows a client to make multiple requests without + waiting for each response, allowing a single TCP connection to + be used much more efficiently, with much lower elapsed time. + + - Network congestion is reduced by reducing the number of packets + caused by TCP opens, and by allowing TCP sufficient time to + determine the congestion state of the network. + + - Latency on subsequent requests is reduced since there is no time + spent in TCP's connection opening handshake. + + - HTTP can evolve more gracefully, since errors can be reported + without the penalty of closing the TCP connection. Clients using + future versions of HTTP might optimistically try a new feature, + but if communicating with an older server, retry with old + semantics after an error is reported. + + HTTP implementations SHOULD implement persistent connections. + + + + + + + + +Fielding, et al. Standards Track [Page 44] + +RFC 2616 HTTP/1.1 June 1999 + + +8.1.2 Overall Operation + + A significant difference between HTTP/1.1 and earlier versions of + HTTP is that persistent connections are the default behavior of any + HTTP connection. That is, unless otherwise indicated, the client + SHOULD assume that the server will maintain a persistent connection, + even after error responses from the server. + + Persistent connections provide a mechanism by which a client and a + server can signal the close of a TCP connection. This signaling takes + place using the Connection header field (section 14.10). 
Once a close + has been signaled, the client MUST NOT send any more requests on that + connection. + +8.1.2.1 Negotiation + + An HTTP/1.1 server MAY assume that a HTTP/1.1 client intends to + maintain a persistent connection unless a Connection header including + the connection-token "close" was sent in the request. If the server + chooses to close the connection immediately after sending the + response, it SHOULD send a Connection header including the + connection-token close. + + An HTTP/1.1 client MAY expect a connection to remain open, but would + decide to keep it open based on whether the response from a server + contains a Connection header with the connection-token close. In case + the client does not want to maintain a connection for more than that + request, it SHOULD send a Connection header including the + connection-token close. + + If either the client or the server sends the close token in the + Connection header, that request becomes the last one for the + connection. + + Clients and servers SHOULD NOT assume that a persistent connection is + maintained for HTTP versions less than 1.1 unless it is explicitly + signaled. See section 19.6.2 for more information on backward + compatibility with HTTP/1.0 clients. + + In order to remain persistent, all messages on the connection MUST + have a self-defined message length (i.e., one not defined by closure + of the connection), as described in section 4.4. + + + + + + + + + +Fielding, et al. Standards Track [Page 45] + +RFC 2616 HTTP/1.1 June 1999 + + +8.1.2.2 Pipelining + + A client that supports persistent connections MAY "pipeline" its + requests (i.e., send multiple requests without waiting for each + response). A server MUST send its responses to those requests in the + same order that the requests were received. + + Clients which assume persistent connections and pipeline immediately + after connection establishment SHOULD be prepared to retry their + connection if the first pipelined attempt fails. 
If a client does + such a retry, it MUST NOT pipeline before it knows the connection is + persistent. Clients MUST also be prepared to resend their requests if + the server closes the connection before sending all of the + corresponding responses. + + Clients SHOULD NOT pipeline requests using non-idempotent methods or + non-idempotent sequences of methods (see section 9.1.2). Otherwise, a + premature termination of the transport connection could lead to + indeterminate results. A client wishing to send a non-idempotent + request SHOULD wait to send that request until it has received the + response status for the previous request. + +8.1.3 Proxy Servers + + It is especially important that proxies correctly implement the + properties of the Connection header field as specified in section + 14.10. + + The proxy server MUST signal persistent connections separately with + its clients and the origin servers (or other proxy servers) that it + connects to. Each persistent connection applies to only one transport + link. + + A proxy server MUST NOT establish a HTTP/1.1 persistent connection + with an HTTP/1.0 client (but see RFC 2068 [33] for information and + discussion of the problems with the Keep-Alive header implemented by + many HTTP/1.0 clients). + +8.1.4 Practical Considerations + + Servers will usually have some time-out value beyond which they will + no longer maintain an inactive connection. Proxy servers might make + this a higher value since it is likely that the client will be making + more connections through the same server. The use of persistent + connections places no requirements on the length (or existence) of + this time-out for either the client or the server. + + + + + +Fielding, et al. Standards Track [Page 46] + +RFC 2616 HTTP/1.1 June 1999 + + + When a client or server wishes to time-out it SHOULD issue a graceful + close on the transport connection. 
Clients and servers SHOULD both + constantly watch for the other side of the transport close, and + respond to it as appropriate. If a client or server does not detect + the other side's close promptly it could cause unnecessary resource + drain on the network. + + A client, server, or proxy MAY close the transport connection at any + time. For example, a client might have started to send a new request + at the same time that the server has decided to close the "idle" + connection. From the server's point of view, the connection is being + closed while it was idle, but from the client's point of view, a + request is in progress. + + This means that clients, servers, and proxies MUST be able to recover + from asynchronous close events. Client software SHOULD reopen the + transport connection and retransmit the aborted sequence of requests + without user interaction so long as the request sequence is + idempotent (see section 9.1.2). Non-idempotent methods or sequences + MUST NOT be automatically retried, although user agents MAY offer a + human operator the choice of retrying the request(s). Confirmation by + user-agent software with semantic understanding of the application + MAY substitute for user confirmation. The automatic retry SHOULD NOT + be repeated if the second sequence of requests fails. + + Servers SHOULD always respond to at least one request per connection, + if at all possible. Servers SHOULD NOT close a connection in the + middle of transmitting a response, unless a network or client failure + is suspected. + + Clients that use persistent connections SHOULD limit the number of + simultaneous connections that they maintain to a given server. A + single-user client SHOULD NOT maintain more than 2 connections with + any server or proxy. A proxy SHOULD use up to 2*N connections to + another server or proxy, where N is the number of simultaneously + active users. These guidelines are intended to improve HTTP response + times and avoid congestion. 
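The recovery rule of section 8.1.4 (retransmit an aborted request sequence automatically only if it is idempotent, and do not repeat the automatic retry if the second attempt fails) might be approximated as follows. This is a sketch under a stated simplification: it checks only that every method in the sequence is one that section 9.1.2 names as idempotent, although that section notes a sequence of idempotent methods can itself be non-idempotent. The names `IDEMPOTENT_METHODS` and `may_retry_automatically` are hypothetical.

```python
# Methods that section 9.1.2 describes as idempotent.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "PUT", "DELETE", "OPTIONS", "TRACE"})


def may_retry_automatically(methods, attempts_made: int) -> bool:
    """Decide whether an aborted request sequence may be resent
    without user interaction.

    methods       -- method names of the aborted sequence, in order
    attempts_made -- how many times the sequence has been sent so far
    """
    # The automatic retry SHOULD NOT be repeated if the second
    # sequence of requests fails, so allow one automatic retry only.
    if attempts_made >= 2:
        return False
    # Simplification: per-method check only; a sequence whose result
    # depends on a value modified later in the same sequence would
    # need application knowledge to detect.
    return all(m.upper() in IDEMPOTENT_METHODS for m in methods)


if __name__ == "__main__":
    print(may_retry_automatically(["GET", "HEAD"], attempts_made=1))
    print(may_retry_automatically(["GET", "POST"], attempts_made=1))
```

When the function returns false, a user agent MAY still offer a human operator the choice of retrying, as the text above allows.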
+ +8.2 Message Transmission Requirements + +8.2.1 Persistent Connections and Flow Control + + HTTP/1.1 servers SHOULD maintain persistent connections and use TCP's + flow control mechanisms to resolve temporary overloads, rather than + terminating connections with the expectation that clients will retry. + The latter technique can exacerbate network congestion. + + + + + +Fielding, et al. Standards Track [Page 47] + +RFC 2616 HTTP/1.1 June 1999 + + +8.2.2 Monitoring Connections for Error Status Messages + + An HTTP/1.1 (or later) client sending a message-body SHOULD monitor + the network connection for an error status while it is transmitting + the request. If the client sees an error status, it SHOULD + immediately cease transmitting the body. If the body is being sent + using a "chunked" encoding (section 3.6), a zero length chunk and + empty trailer MAY be used to prematurely mark the end of the message. + If the body was preceded by a Content-Length header, the client MUST + close the connection. + +8.2.3 Use of the 100 (Continue) Status + + The purpose of the 100 (Continue) status (see section 10.1.1) is to + allow a client that is sending a request message with a request body + to determine if the origin server is willing to accept the request + (based on the request headers) before the client sends the request + body. In some cases, it might either be inappropriate or highly + inefficient for the client to send the body if the server will reject + the message without looking at the body. + + Requirements for HTTP/1.1 clients: + + - If a client will wait for a 100 (Continue) response before + sending the request body, it MUST send an Expect request-header + field (section 14.20) with the "100-continue" expectation. + + - A client MUST NOT send an Expect request-header field (section + 14.20) with the "100-continue" expectation if it does not intend + to send a request body. 
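The two client requirements above can be expressed as a small check. This is only an illustrative sketch; the function name `expect_headers` and its two boolean parameters are hypothetical.

```python
def expect_headers(has_body: bool, wait_for_continue: bool) -> dict:
    """Return the Expect header (if any) a client should send.

    has_body          -- the client intends to send a request body
    wait_for_continue -- the client will wait for 100 (Continue)
                         before sending that body
    """
    if wait_for_continue and not has_body:
        # A client MUST NOT send "100-continue" if it does not intend
        # to send a request body, so this combination is rejected.
        raise ValueError("cannot wait for 100 (Continue) without a request body")
    if wait_for_continue:
        # A client that will wait MUST send the expectation.
        return {"Expect": "100-continue"}
    return {}


if __name__ == "__main__":
    print(expect_headers(has_body=True, wait_for_continue=True))
    print(expect_headers(has_body=True, wait_for_continue=False))
```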
+ + Because of the presence of older implementations, the protocol allows + ambiguous situations in which a client may send "Expect: 100- + continue" without receiving either a 417 (Expectation Failed) status + or a 100 (Continue) status. Therefore, when a client sends this + header field to an origin server (possibly via a proxy) from which it + has never seen a 100 (Continue) status, the client SHOULD NOT wait + for an indefinite period before sending the request body. + + Requirements for HTTP/1.1 origin servers: + + - Upon receiving a request which includes an Expect request-header + field with the "100-continue" expectation, an origin server MUST + either respond with 100 (Continue) status and continue to read + from the input stream, or respond with a final status code. The + origin server MUST NOT wait for the request body before sending + the 100 (Continue) response. If it responds with a final status + code, it MAY close the transport connection or it MAY continue + + + +Fielding, et al. Standards Track [Page 48] + +RFC 2616 HTTP/1.1 June 1999 + + + to read and discard the rest of the request. It MUST NOT + perform the requested method if it returns a final status code. + + - An origin server SHOULD NOT send a 100 (Continue) response if + the request message does not include an Expect request-header + field with the "100-continue" expectation, and MUST NOT send a + 100 (Continue) response if such a request comes from an HTTP/1.0 + (or earlier) client. There is an exception to this rule: for + compatibility with RFC 2068, a server MAY send a 100 (Continue) + status in response to an HTTP/1.1 PUT or POST request that does + not include an Expect request-header field with the "100- + continue" expectation. This exception, the purpose of which is + to minimize any client processing delays associated with an + undeclared wait for 100 (Continue) status, applies only to + HTTP/1.1 requests, and not to requests with any other HTTP- + version value. 
+ + - An origin server MAY omit a 100 (Continue) response if it has + already received some or all of the request body for the + corresponding request. + + - An origin server that sends a 100 (Continue) response MUST + ultimately send a final status code, once the request body is + received and processed, unless it terminates the transport + connection prematurely. + + - If an origin server receives a request that does not include an + Expect request-header field with the "100-continue" expectation, + the request includes a request body, and the server responds + with a final status code before reading the entire request body + from the transport connection, then the server SHOULD NOT close + the transport connection until it has read the entire request, + or until the client closes the connection. Otherwise, the client + might not reliably receive the response message. However, this + requirement is not to be construed as preventing a server from + defending itself against denial-of-service attacks, or from + badly broken client implementations. + + Requirements for HTTP/1.1 proxies: + + - If a proxy receives a request that includes an Expect request- + header field with the "100-continue" expectation, and the proxy + either knows that the next-hop server complies with HTTP/1.1 or + higher, or does not know the HTTP version of the next-hop + server, it MUST forward the request, including the Expect header + field. + + + + + +Fielding, et al. Standards Track [Page 49] + +RFC 2616 HTTP/1.1 June 1999 + + + - If the proxy knows that the version of the next-hop server is + HTTP/1.0 or lower, it MUST NOT forward the request, and it MUST + respond with a 417 (Expectation Failed) status. + + - Proxies SHOULD maintain a cache recording the HTTP version + numbers received from recently-referenced next-hop servers.
+ + - A proxy MUST NOT forward a 100 (Continue) response if the + request message was received from an HTTP/1.0 (or earlier) + client and did not include an Expect request-header field with + the "100-continue" expectation. This requirement overrides the + general rule for forwarding of 1xx responses (see section 10.1). + +8.2.4 Client Behavior if Server Prematurely Closes Connection + + If an HTTP/1.1 client sends a request which includes a request body, + but which does not include an Expect request-header field with the + "100-continue" expectation, and if the client is not directly + connected to an HTTP/1.1 origin server, and if the client sees the + connection close before receiving any status from the server, the + client SHOULD retry the request. If the client does retry this + request, it MAY use the following "binary exponential backoff" + algorithm to be assured of obtaining a reliable response: + + 1. Initiate a new connection to the server + + 2. Transmit the request-headers + + 3. Initialize a variable R to the estimated round-trip time to the + server (e.g., based on the time it took to establish the + connection), or to a constant value of 5 seconds if the round- + trip time is not available. + + 4. Compute T = R * (2**N), where N is the number of previous + retries of this request. + + 5. Wait either for an error response from the server, or for T + seconds (whichever comes first) + + 6. If no error response is received, after T seconds transmit the + body of the request. + + 7. If client sees that the connection is closed prematurely, + repeat from step 1 until the request is accepted, an error + response is received, or the user becomes impatient and + terminates the retry process. + + + + + +Fielding, et al. Standards Track [Page 50] + +RFC 2616 HTTP/1.1 June 1999 + + + If at any point an error status is received, the client + + - SHOULD NOT continue and + + - SHOULD close the connection if it has not completed sending the + request message. 
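The wait time computed in steps 3 and 4 of the "binary exponential backoff" algorithm in section 8.2.4 can be sketched as a single function. The name `backoff_wait` and its parameters are hypothetical; only the formula T = R * (2**N) and the 5-second default come from the text.

```python
from typing import Optional

DEFAULT_RTT_SECONDS = 5.0  # constant used when no round-trip estimate exists


def backoff_wait(previous_retries: int, rtt_seconds: Optional[float] = None) -> float:
    """Return T, the number of seconds to wait before sending the body.

    previous_retries -- N, the number of previous retries of this request
    rtt_seconds      -- R, the estimated round-trip time, or None if unknown
    """
    if previous_retries < 0:
        raise ValueError("previous_retries must be non-negative")
    r = DEFAULT_RTT_SECONDS if rtt_seconds is None else rtt_seconds
    return r * (2 ** previous_retries)


if __name__ == "__main__":
    # Wait times for the first four attempts with no RTT estimate.
    print([backoff_wait(n) for n in range(4)])
```

The client would wait for either an error response or T seconds, whichever comes first, before transmitting the request body (steps 5 and 6).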
+ +9 Method Definitions + + The set of common methods for HTTP/1.1 is defined below. Although + this set can be expanded, additional methods cannot be assumed to + share the same semantics for separately extended clients and servers. + + The Host request-header field (section 14.23) MUST accompany all + HTTP/1.1 requests. + +9.1 Safe and Idempotent Methods + +9.1.1 Safe Methods + + Implementors should be aware that the software represents the user in + their interactions over the Internet, and should be careful to allow + the user to be aware of any actions they might take which may have an + unexpected significance to themselves or others. + + In particular, the convention has been established that the GET and + HEAD methods SHOULD NOT have the significance of taking an action + other than retrieval. These methods ought to be considered "safe". + This allows user agents to represent other methods, such as POST, PUT + and DELETE, in a special way, so that the user is made aware of the + fact that a possibly unsafe action is being requested. + + Naturally, it is not possible to ensure that the server does not + generate side-effects as a result of performing a GET request; in + fact, some dynamic resources consider that a feature. The important + distinction here is that the user did not request the side-effects, + so therefore cannot be held accountable for them. + +9.1.2 Idempotent Methods + + Methods can also have the property of "idempotence" in that (aside + from error or expiration issues) the side-effects of N > 0 identical + requests is the same as for a single request. The methods GET, HEAD, + PUT and DELETE share this property. Also, the methods OPTIONS and + TRACE SHOULD NOT have side effects, and so are inherently idempotent. + + + + + + +Fielding, et al. 
Standards Track [Page 51] + +RFC 2616 HTTP/1.1 June 1999 + + + However, it is possible that a sequence of several requests is non- + idempotent, even if all of the methods executed in that sequence are + idempotent. (A sequence is idempotent if a single execution of the + entire sequence always yields a result that is not changed by a + reexecution of all, or part, of that sequence.) For example, a + sequence is non-idempotent if its result depends on a value that is + later modified in the same sequence. + + A sequence that never has side effects is idempotent, by definition + (provided that no concurrent operations are being executed on the + same set of resources). + +9.2 OPTIONS + + The OPTIONS method represents a request for information about the + communication options available on the request/response chain + identified by the Request-URI. This method allows the client to + determine the options and/or requirements associated with a resource, + or the capabilities of a server, without implying a resource action + or initiating a resource retrieval. + + Responses to this method are not cacheable. + + If the OPTIONS request includes an entity-body (as indicated by the + presence of Content-Length or Transfer-Encoding), then the media type + MUST be indicated by a Content-Type field. Although this + specification does not define any use for such a body, future + extensions to HTTP might use the OPTIONS body to make more detailed + queries on the server. A server that does not support such an + extension MAY discard the request body. + + If the Request-URI is an asterisk ("*"), the OPTIONS request is + intended to apply to the server in general rather than to a specific + resource. Since a server's communication options typically depend on + the resource, the "*" request is only useful as a "ping" or "no-op" + type of method; it does nothing beyond allowing the client to test + the capabilities of the server. 
For example, this can be used to test + a proxy for HTTP/1.1 compliance (or lack thereof). + + If the Request-URI is not an asterisk, the OPTIONS request applies + only to the options that are available when communicating with that + resource. + + A 200 response SHOULD include any header fields that indicate + optional features implemented by the server and applicable to that + resource (e.g., Allow), possibly including extensions not defined by + this specification. The response body, if any, SHOULD also include + information about the communication options. The format for such a + + + +Fielding, et al. Standards Track [Page 52] + +RFC 2616 HTTP/1.1 June 1999 + + + body is not defined by this specification, but might be defined by + future extensions to HTTP. Content negotiation MAY be used to select + the appropriate response format. If no response body is included, the + response MUST include a Content-Length field with a field-value of + "0". + + The Max-Forwards request-header field MAY be used to target a + specific proxy in the request chain. When a proxy receives an OPTIONS + request on an absoluteURI for which request forwarding is permitted, + the proxy MUST check for a Max-Forwards field. If the Max-Forwards + field-value is zero ("0"), the proxy MUST NOT forward the message; + instead, the proxy SHOULD respond with its own communication options. + If the Max-Forwards field-value is an integer greater than zero, the + proxy MUST decrement the field-value when it forwards the request. If + no Max-Forwards field is present in the request, then the forwarded + request MUST NOT include a Max-Forwards field. + +9.3 GET + + The GET method means retrieve whatever information (in the form of an + entity) is identified by the Request-URI. 
If the Request-URI refers + to a data-producing process, it is the produced data which shall be + returned as the entity in the response and not the source text of the + process, unless that text happens to be the output of the process. + + The semantics of the GET method change to a "conditional GET" if the + request message includes an If-Modified-Since, If-Unmodified-Since, + If-Match, If-None-Match, or If-Range header field. A conditional GET + method requests that the entity be transferred only under the + circumstances described by the conditional header field(s). The + conditional GET method is intended to reduce unnecessary network + usage by allowing cached entities to be refreshed without requiring + multiple requests or transferring data already held by the client. + + The semantics of the GET method change to a "partial GET" if the + request message includes a Range header field. A partial GET requests + that only part of the entity be transferred, as described in section + 14.35. The partial GET method is intended to reduce unnecessary + network usage by allowing partially-retrieved entities to be + completed without transferring data already held by the client. + + The response to a GET request is cacheable if and only if it meets + the requirements for HTTP caching described in section 13. + + See section 15.1.3 for security considerations when used for forms. + + + + + + +Fielding, et al. Standards Track [Page 53] + +RFC 2616 HTTP/1.1 June 1999 + + +9.4 HEAD + + The HEAD method is identical to GET except that the server MUST NOT + return a message-body in the response. The metainformation contained + in the HTTP headers in response to a HEAD request SHOULD be identical + to the information sent in response to a GET request. This method can + be used for obtaining metainformation about the entity implied by the + request without transferring the entity-body itself. 
This method is + often used for testing hypertext links for validity, accessibility, + and recent modification. + + The response to a HEAD request MAY be cacheable in the sense that the + information contained in the response MAY be used to update a + previously cached entity from that resource. If the new field values + indicate that the cached entity differs from the current entity (as + would be indicated by a change in Content-Length, Content-MD5, ETag + or Last-Modified), then the cache MUST treat the cache entry as + stale. + +9.5 POST + + The POST method is used to request that the origin server accept the + entity enclosed in the request as a new subordinate of the resource + identified by the Request-URI in the Request-Line. POST is designed + to allow a uniform method to cover the following functions: + + - Annotation of existing resources; + + - Posting a message to a bulletin board, newsgroup, mailing list, + or similar group of articles; + + - Providing a block of data, such as the result of submitting a + form, to a data-handling process; + + - Extending a database through an append operation. + + The actual function performed by the POST method is determined by the + server and is usually dependent on the Request-URI. The posted entity + is subordinate to that URI in the same way that a file is subordinate + to a directory containing it, a news article is subordinate to a + newsgroup to which it is posted, or a record is subordinate to a + database. + + The action performed by the POST method might not result in a + resource that can be identified by a URI. In this case, either 200 + (OK) or 204 (No Content) is the appropriate response status, + depending on whether or not the response includes an entity that + describes the result. + + + +Fielding, et al. 
Standards Track [Page 54] + +RFC 2616 HTTP/1.1 June 1999 + + + If a resource has been created on the origin server, the response + SHOULD be 201 (Created) and contain an entity which describes the + status of the request and refers to the new resource, and a Location + header (see section 14.30). + + Responses to this method are not cacheable, unless the response + includes appropriate Cache-Control or Expires header fields. However, + the 303 (See Other) response can be used to direct the user agent to + retrieve a cacheable resource. + + POST requests MUST obey the message transmission requirements set out + in section 8.2. + + See section 15.1.3 for security considerations. + +9.6 PUT + + The PUT method requests that the enclosed entity be stored under the + supplied Request-URI. If the Request-URI refers to an already + existing resource, the enclosed entity SHOULD be considered as a + modified version of the one residing on the origin server. If the + Request-URI does not point to an existing resource, and that URI is + capable of being defined as a new resource by the requesting user + agent, the origin server can create the resource with that URI. If a + new resource is created, the origin server MUST inform the user agent + via the 201 (Created) response. If an existing resource is modified, + either the 200 (OK) or 204 (No Content) response codes SHOULD be sent + to indicate successful completion of the request. If the resource + could not be created or modified with the Request-URI, an appropriate + error response SHOULD be given that reflects the nature of the + problem. The recipient of the entity MUST NOT ignore any Content-* + (e.g. Content-Range) headers that it does not understand or implement + and MUST return a 501 (Not Implemented) response in such cases. + + If the request passes through a cache and the Request-URI identifies + one or more currently cached entities, those entries SHOULD be + treated as stale. 
Responses to this method are not cacheable. + + The fundamental difference between the POST and PUT requests is + reflected in the different meaning of the Request-URI. The URI in a + POST request identifies the resource that will handle the enclosed + entity. That resource might be a data-accepting process, a gateway to + some other protocol, or a separate entity that accepts annotations. + In contrast, the URI in a PUT request identifies the entity enclosed + with the request -- the user agent knows what URI is intended and the + server MUST NOT attempt to apply the request to some other resource. + If the server desires that the request be applied to a different URI, + + + + +Fielding, et al. Standards Track [Page 55] + +RFC 2616 HTTP/1.1 June 1999 + + + it MUST send a 301 (Moved Permanently) response; the user agent MAY + then make its own decision regarding whether or not to redirect the + request. + + A single resource MAY be identified by many different URIs. For + example, an article might have a URI for identifying "the current + version" which is separate from the URI identifying each particular + version. In this case, a PUT request on a general URI might result in + several other URIs being defined by the origin server. + + HTTP/1.1 does not define how a PUT method affects the state of an + origin server. + + PUT requests MUST obey the message transmission requirements set out + in section 8.2. + + Unless otherwise specified for a particular entity-header, the + entity-headers in the PUT request SHOULD be applied to the resource + created or modified by the PUT. + +9.7 DELETE + + The DELETE method requests that the origin server delete the resource + identified by the Request-URI. This method MAY be overridden by human + intervention (or other means) on the origin server. 
The client cannot + be guaranteed that the operation has been carried out, even if the + status code returned from the origin server indicates that the action + has been completed successfully. However, the server SHOULD NOT + indicate success unless, at the time the response is given, it + intends to delete the resource or move it to an inaccessible + location. + + A successful response SHOULD be 200 (OK) if the response includes an + entity describing the status, 202 (Accepted) if the action has not + yet been enacted, or 204 (No Content) if the action has been enacted + but the response does not include an entity. + + If the request passes through a cache and the Request-URI identifies + one or more currently cached entities, those entries SHOULD be + treated as stale. Responses to this method are not cacheable. + +9.8 TRACE + + The TRACE method is used to invoke a remote, application-layer loop- + back of the request message. The final recipient of the request + SHOULD reflect the message received back to the client as the + entity-body of a 200 (OK) response. The final recipient is either the + + + + +Fielding, et al. Standards Track [Page 56] + +RFC 2616 HTTP/1.1 June 1999 + + + origin server or the first proxy or gateway to receive a Max-Forwards + value of zero (0) in the request (see section 14.31). A TRACE request + MUST NOT include an entity. + + TRACE allows the client to see what is being received at the other + end of the request chain and use that data for testing or diagnostic + information. The value of the Via header field (section 14.45) is of + particular interest, since it acts as a trace of the request chain. + Use of the Max-Forwards header field allows the client to limit the + length of the request chain, which is useful for testing a chain of + proxies forwarding messages in an infinite loop. + + If the request is valid, the response SHOULD contain the entire + request message in the entity-body, with a Content-Type of + "message/http". 
Responses to this method MUST NOT be cached. + +9.9 CONNECT + + This specification reserves the method name CONNECT for use with a + proxy that can dynamically switch to being a tunnel (e.g. SSL + tunneling [44]). + +10 Status Code Definitions + + Each Status-Code is described below, including a description of which + method(s) it can follow and any metainformation required in the + response. + +10.1 Informational 1xx + + This class of status code indicates a provisional response, + consisting only of the Status-Line and optional headers, and is + terminated by an empty line. There are no required headers for this + class of status code. Since HTTP/1.0 did not define any 1xx status + codes, servers MUST NOT send a 1xx response to an HTTP/1.0 client + except under experimental conditions. + + A client MUST be prepared to accept one or more 1xx status responses + prior to a regular response, even if the client does not expect a 100 + (Continue) status message. Unexpected 1xx status responses MAY be + ignored by a user agent. + + Proxies MUST forward 1xx responses, unless the connection between the + proxy and its client has been closed, or unless the proxy itself + requested the generation of the 1xx response. (For example, if a + + + + + + +Fielding, et al. Standards Track [Page 57] + +RFC 2616 HTTP/1.1 June 1999 + + + proxy adds a "Expect: 100-continue" field when it forwards a request, + then it need not forward the corresponding 100 (Continue) + response(s).) + +10.1.1 100 Continue + + The client SHOULD continue with its request. This interim response is + used to inform the client that the initial part of the request has + been received and has not yet been rejected by the server. The client + SHOULD continue by sending the remainder of the request or, if the + request has already been completed, ignore this response. The server + MUST send a final response after the request has been completed. 
See + section 8.2.3 for detailed discussion of the use and handling of this + status code. + +10.1.2 101 Switching Protocols + + The server understands and is willing to comply with the client's + request, via the Upgrade message header field (section 14.42), for a + change in the application protocol being used on this connection. The + server will switch protocols to those defined by the response's + Upgrade header field immediately after the empty line which + terminates the 101 response. + + The protocol SHOULD be switched only when it is advantageous to do + so. For example, switching to a newer version of HTTP is advantageous + over older versions, and switching to a real-time, synchronous + protocol might be advantageous when delivering resources that use + such features. + +10.2 Successful 2xx + + This class of status code indicates that the client's request was + successfully received, understood, and accepted. + +10.2.1 200 OK + + The request has succeeded. The information returned with the response + is dependent on the method used in the request, for example: + + GET an entity corresponding to the requested resource is sent in + the response; + + HEAD the entity-header fields corresponding to the requested + resource are sent in the response without any message-body; + + POST an entity describing or containing the result of the action; + + + + +Fielding, et al. Standards Track [Page 58] + +RFC 2616 HTTP/1.1 June 1999 + + + TRACE an entity containing the request message as received by the + end server. + +10.2.2 201 Created + + The request has been fulfilled and resulted in a new resource being + created. The newly created resource can be referenced by the URI(s) + returned in the entity of the response, with the most specific URI + for the resource given by a Location header field. 
The response + SHOULD include an entity containing a list of resource + characteristics and location(s) from which the user or user agent can + choose the one most appropriate. The entity format is specified by + the media type given in the Content-Type header field. The origin + server MUST create the resource before returning the 201 status code. + If the action cannot be carried out immediately, the server SHOULD + respond with 202 (Accepted) response instead. + + A 201 response MAY contain an ETag response header field indicating + the current value of the entity tag for the requested variant just + created, see section 14.19. + +10.2.3 202 Accepted + + The request has been accepted for processing, but the processing has + not been completed. The request might or might not eventually be + acted upon, as it might be disallowed when processing actually takes + place. There is no facility for re-sending a status code from an + asynchronous operation such as this. + + The 202 response is intentionally non-committal. Its purpose is to + allow a server to accept a request for some other process (perhaps a + batch-oriented process that is only run once per day) without + requiring that the user agent's connection to the server persist + until the process is completed. The entity returned with this + response SHOULD include an indication of the request's current status + and either a pointer to a status monitor or some estimate of when the + user can expect the request to be fulfilled. + +10.2.4 203 Non-Authoritative Information + + The returned metainformation in the entity-header is not the + definitive set as available from the origin server, but is gathered + from a local or a third-party copy. The set presented MAY be a subset + or superset of the original version. For example, including local + annotation information about the resource might result in a superset + of the metainformation known by the origin server. 
Use of this + response code is not required and is only appropriate when the + response would otherwise be 200 (OK). + + + +Fielding, et al. Standards Track [Page 59] + +RFC 2616 HTTP/1.1 June 1999 + + +10.2.5 204 No Content + + The server has fulfilled the request but does not need to return an + entity-body, and might want to return updated metainformation. The + response MAY include new or updated metainformation in the form of + entity-headers, which if present SHOULD be associated with the + requested variant. + + If the client is a user agent, it SHOULD NOT change its document view + from that which caused the request to be sent. This response is + primarily intended to allow input for actions to take place without + causing a change to the user agent's active document view, although + any new or updated metainformation SHOULD be applied to the document + currently in the user agent's active view. + + The 204 response MUST NOT include a message-body, and thus is always + terminated by the first empty line after the header fields. + +10.2.6 205 Reset Content + + The server has fulfilled the request and the user agent SHOULD reset + the document view which caused the request to be sent. This response + is primarily intended to allow input for actions to take place via + user input, followed by a clearing of the form in which the input is + given so that the user can easily initiate another input action. The + response MUST NOT include an entity. + +10.2.7 206 Partial Content + + The server has fulfilled the partial GET request for the resource. + The request MUST have included a Range header field (section 14.35) + indicating the desired range, and MAY have included an If-Range + header field (section 14.27) to make the request conditional. 
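As an illustration only, and not part of the specification, the request side of a partial GET might be sketched as follows. The function names and parameters (partial_get_headers, first_byte, last_byte, validator, is_partial_response) are hypothetical; the sketch assumes a single byte range and a validator (entity tag or HTTP-date) saved from an earlier response.

```python
from typing import Dict, Optional


def partial_get_headers(first_byte: int,
                        last_byte: Optional[int] = None,
                        validator: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for a partial GET of one byte range.

    validator is an entity tag or HTTP-date taken from an earlier
    response. When given, it is sent as If-Range, which makes the
    range request conditional on the entity being unchanged.
    """
    if first_byte < 0:
        raise ValueError("first_byte must be non-negative")
    if last_byte is not None and last_byte < first_byte:
        raise ValueError("last_byte must not precede first_byte")
    # An open-ended range ("bytes=500-") asks for everything from
    # first_byte to the end of the entity.
    end = "" if last_byte is None else str(last_byte)
    headers = {"Range": f"bytes={first_byte}-{end}"}
    if validator is not None:
        headers["If-Range"] = validator
    return headers


def is_partial_response(status: int) -> bool:
    """True when the server answered the Range request with 206."""
    return status == 206
```

A client resuming an interrupted transfer at byte 500 would send the result of partial_get_headers(500, validator=saved_etag), where saved_etag is whatever entity tag it stored earlier, and treat any status other than 206 as a full response.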
+ + The response MUST include the following header fields: + + - Either a Content-Range header field (section 14.16) indicating + the range included with this response, or a multipart/byteranges + Content-Type including Content-Range fields for each part. If a + Content-Length header field is present in the response, its + value MUST match the actual number of OCTETs transmitted in the + message-body. + + - Date + + - ETag and/or Content-Location, if the header would have been sent + in a 200 response to the same request + + + + +Fielding, et al. Standards Track [Page 60] + +RFC 2616 HTTP/1.1 June 1999 + + + - Expires, Cache-Control, and/or Vary, if the field-value might + differ from that sent in any previous response for the same + variant + + If the 206 response is the result of an If-Range request that used a + strong cache validator (see section 13.3.3), the response SHOULD NOT + include other entity-headers. If the response is the result of an + If-Range request that used a weak validator, the response MUST NOT + include other entity-headers; this prevents inconsistencies between + cached entity-bodies and updated headers. Otherwise, the response + MUST include all of the entity-headers that would have been returned + with a 200 (OK) response to the same request. + + A cache MUST NOT combine a 206 response with other previously cached + content if the ETag or Last-Modified headers do not match exactly, + see 13.5.4. + + A cache that does not support the Range and Content-Range headers + MUST NOT cache 206 (Partial) responses. + +10.3 Redirection 3xx + + This class of status code indicates that further action needs to be + taken by the user agent in order to fulfill the request. The action + required MAY be carried out by the user agent without interaction + with the user if and only if the method used in the second request is + GET or HEAD. A client SHOULD detect infinite redirection loops, since + such loops generate network traffic for each redirection. 
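The rule above (automatic redirection only for GET or HEAD, with loop detection) could be sketched roughly as below. This is a minimal illustration under stated assumptions, not a definitive client: fetch is a hypothetical caller-supplied function returning the status code and the Location value, Location is assumed to be an absolute URI, and the names follow_redirects, RedirectLoop and max_hops are invented for the example.

```python
from typing import Callable, Optional, Tuple

# Hypothetical transport hook: takes a URI and returns
# (status code, Location field value or None).
Fetch = Callable[[str], Tuple[int, Optional[str]]]

# Only these statuses are followed here; 304 and 305 are not
# redirections to a new resource URI and are returned to the caller.
FOLLOWED = (300, 301, 302, 303, 307)


class RedirectLoop(Exception):
    """Raised when a redirect chain revisits a URI or grows too long."""


def follow_redirects(uri: str, method: str, fetch: Fetch,
                     max_hops: int = 5) -> Tuple[str, int]:
    """Follow redirects without user interaction for GET and HEAD only.

    Returns the last URI requested and the status it produced. For any
    other method the first response is returned unchanged, so that the
    caller can ask the user before repeating the request.
    """
    seen = {uri}
    for _ in range(max_hops + 1):
        status, location = fetch(uri)
        redirect = status in FOLLOWED and location is not None
        if not redirect or method not in ("GET", "HEAD"):
            return uri, status
        if location in seen:
            raise RedirectLoop(f"redirect loop at {location}")
        seen.add(location)
        uri = location
    raise RedirectLoop(f"more than {max_hops} redirections")
```

The default of five hops mirrors the limit recommended by earlier versions of the specification; it is a choice of this sketch and not a requirement.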
+ + Note: previous versions of this specification recommended a + maximum of five redirections. Content developers should be aware + that there might be clients that implement such a fixed + limitation. + +10.3.1 300 Multiple Choices + + The requested resource corresponds to any one of a set of + representations, each with its own specific location, and agent- + driven negotiation information (section 12) is being provided so that + the user (or user agent) can select a preferred representation and + redirect its request to that location. + + Unless it was a HEAD request, the response SHOULD include an entity + containing a list of resource characteristics and location(s) from + which the user or user agent can choose the one most appropriate. The + entity format is specified by the media type given in the Content- + Type header field. Depending upon the format and the capabilities of + + + + +Fielding, et al. Standards Track [Page 61] + +RFC 2616 HTTP/1.1 June 1999 + + + the user agent, selection of the most appropriate choice MAY be + performed automatically. However, this specification does not define + any standard for such automatic selection. + + If the server has a preferred choice of representation, it SHOULD + include the specific URI for that representation in the Location + field; user agents MAY use the Location field value for automatic + redirection. This response is cacheable unless indicated otherwise. + +10.3.2 301 Moved Permanently + + The requested resource has been assigned a new permanent URI and any + future references to this resource SHOULD use one of the returned + URIs. Clients with link editing capabilities ought to automatically + re-link references to the Request-URI to one or more of the new + references returned by the server, where possible. This response is + cacheable unless indicated otherwise. + + The new permanent URI SHOULD be given by the Location field in the + response. 
Unless the request method was HEAD, the entity of the + response SHOULD contain a short hypertext note with a hyperlink to + the new URI(s). + + If the 301 status code is received in response to a request other + than GET or HEAD, the user agent MUST NOT automatically redirect the + request unless it can be confirmed by the user, since this might + change the conditions under which the request was issued. + + Note: When automatically redirecting a POST request after + receiving a 301 status code, some existing HTTP/1.0 user agents + will erroneously change it into a GET request. + +10.3.3 302 Found + + The requested resource resides temporarily under a different URI. + Since the redirection might be altered on occasion, the client SHOULD + continue to use the Request-URI for future requests. This response + is only cacheable if indicated by a Cache-Control or Expires header + field. + + The temporary URI SHOULD be given by the Location field in the + response. Unless the request method was HEAD, the entity of the + response SHOULD contain a short hypertext note with a hyperlink to + the new URI(s). + + + + + + + +Fielding, et al. Standards Track [Page 62] + +RFC 2616 HTTP/1.1 June 1999 + + + If the 302 status code is received in response to a request other + than GET or HEAD, the user agent MUST NOT automatically redirect the + request unless it can be confirmed by the user, since this might + change the conditions under which the request was issued. + + Note: RFC 1945 and RFC 2068 specify that the client is not allowed + to change the method on the redirected request. However, most + existing user agent implementations treat 302 as if it were a 303 + response, performing a GET on the Location field-value regardless + of the original request method. The status codes 303 and 307 have + been added for servers that wish to make unambiguously clear which + kind of reaction is expected of the client. 
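The difference between the behaviour required for 302 and the behaviour the Note describes can be made concrete with a small sketch. The function name redirected_request and the legacy_302 flag are hypothetical, and keeping HEAD as HEAD on a 303 is a choice of this example and is not stated by the text.

```python
from typing import Tuple


def redirected_request(status: int, method: str,
                       legacy_302: bool = False) -> Tuple[str, bool]:
    """Return (method for the redirected request, needs_confirmation).

    legacy_302=True models the widespread user agent behaviour of
    treating 302 as if it were 303.
    """
    if status not in (301, 302, 303, 307):
        raise ValueError(f"status {status} is not handled by this sketch")
    if method in ("GET", "HEAD"):
        # Retrieval methods may be redirected automatically.
        return method, False
    if status == 303 or (status == 302 and legacy_302):
        # The new URI is retrieved with GET, whatever the original
        # method was.
        return "GET", False
    # 301, 302 and 307 keep the original method, and the user must
    # confirm before a request other than GET or HEAD is repeated.
    return method, True
```

With these assumptions, redirected_request(302, "POST") yields ("POST", True), while redirected_request(302, "POST", legacy_302=True) yields ("GET", False), which is the ambiguity that 303 and 307 were added to remove.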
+ +10.3.4 303 See Other + + The response to the request can be found under a different URI and + SHOULD be retrieved using a GET method on that resource. This method + exists primarily to allow the output of a POST-activated script to + redirect the user agent to a selected resource. The new URI is not a + substitute reference for the originally requested resource. The 303 + response MUST NOT be cached, but the response to the second + (redirected) request might be cacheable. + + The different URI SHOULD be given by the Location field in the + response. Unless the request method was HEAD, the entity of the + response SHOULD contain a short hypertext note with a hyperlink to + the new URI(s). + + Note: Many pre-HTTP/1.1 user agents do not understand the 303 + status. When interoperability with such clients is a concern, the + 302 status code may be used instead, since most user agents react + to a 302 response as described here for 303. + +10.3.5 304 Not Modified + + If the client has performed a conditional GET request and access is + allowed, but the document has not been modified, the server SHOULD + respond with this status code. The 304 response MUST NOT contain a + message-body, and thus is always terminated by the first empty line + after the header fields. + + The response MUST include the following header fields: + + - Date, unless its omission is required by section 14.18.1 + + + + + + + +Fielding, et al. Standards Track [Page 63] + +RFC 2616 HTTP/1.1 June 1999 + + + If a clockless origin server obeys these rules, and proxies and + clients add their own Date to any response received without one (as + already specified by [RFC 2068], section 14.19), caches will operate + correctly. 
+ + - ETag and/or Content-Location, if the header would have been sent + in a 200 response to the same request + + - Expires, Cache-Control, and/or Vary, if the field-value might + differ from that sent in any previous response for the same + variant + + If the conditional GET used a strong cache validator (see section + 13.3.3), the response SHOULD NOT include other entity-headers. + Otherwise (i.e., the conditional GET used a weak validator), the + response MUST NOT include other entity-headers; this prevents + inconsistencies between cached entity-bodies and updated headers. + + If a 304 response indicates an entity not currently cached, then the + cache MUST disregard the response and repeat the request without the + conditional. + + If a cache uses a received 304 response to update a cache entry, the + cache MUST update the entry to reflect any new field values given in + the response. + +10.3.6 305 Use Proxy + + The requested resource MUST be accessed through the proxy given by + the Location field. The Location field gives the URI of the proxy. + The recipient is expected to repeat this single request via the + proxy. 305 responses MUST only be generated by origin servers. + + Note: RFC 2068 was not clear that 305 was intended to redirect a + single request, and to be generated by origin servers only. Not + observing these limitations has significant security consequences. + +10.3.7 306 (Unused) + + The 306 status code was used in a previous version of the + specification, is no longer used, and the code is reserved. + + + + + + + + + + +Fielding, et al. Standards Track [Page 64] + +RFC 2616 HTTP/1.1 June 1999 + + +10.3.8 307 Temporary Redirect + + The requested resource resides temporarily under a different URI. + Since the redirection MAY be altered on occasion, the client SHOULD + continue to use the Request-URI for future requests. This response + is only cacheable if indicated by a Cache-Control or Expires header + field. 
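The cacheability statements made for the individual 3xx codes can be collected into one rough decision function. This is a simplified sketch and not a complete implementation of the caching rules in section 13: the name redirect_is_cacheable is hypothetical, "no-store" is the only directive treated as forbidding caching, and the set of directives taken as an explicit indication of cacheability is an assumption of the example.

```python
from typing import Mapping

# Assumption of this sketch: these Cache-Control directives count as
# an explicit indication that the response may be cached.
ALLOWING = {"max-age", "s-maxage", "public"}


def redirect_is_cacheable(status: int, headers: Mapping[str, str]) -> bool:
    """Approximate the per-status cacheability rules for redirects.

    Statuses other than 300, 301, 302 and 307 are reported as not
    cacheable here, which covers 303 (MUST NOT be cached).
    """
    fields = {name.lower(): value for name, value in headers.items()}
    # Split "max-age=60, public" into the names {"max-age", "public"}.
    directives = {
        part.strip().lower().split("=", 1)[0]
        for part in fields.get("cache-control", "").split(",")
        if part.strip()
    }
    if "no-store" in directives:
        return False
    if status in (300, 301):
        # Cacheable unless indicated otherwise.
        return True
    if status in (302, 307):
        # Only cacheable if Cache-Control or Expires says so.
        return "expires" in fields or bool(directives & ALLOWING)
    return False
```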
+ + The temporary URI SHOULD be given by the Location field in the + response. Unless the request method was HEAD, the entity of the + response SHOULD contain a short hypertext note with a hyperlink to + the new URI(s), since many pre-HTTP/1.1 user agents do not + understand the 307 status. Therefore, the note SHOULD contain the + information necessary for a user to repeat the original request on + the new URI. + + If the 307 status code is received in response to a request other + than GET or HEAD, the user agent MUST NOT automatically redirect the + request unless it can be confirmed by the user, since this might + change the conditions under which the request was issued. + +10.4 Client Error 4xx + + The 4xx class of status code is intended for cases in which the + client seems to have erred. Except when responding to a HEAD request, + the server SHOULD include an entity containing an explanation of the + error situation, and whether it is a temporary or permanent + condition. These status codes are applicable to any request method. + User agents SHOULD display any included entity to the user. + + If the client is sending data, a server implementation using TCP + SHOULD be careful to ensure that the client acknowledges receipt of + the packet(s) containing the response, before the server closes the + input connection. If the client continues sending data to the server + after the close, the server's TCP stack will send a reset packet to + the client, which may erase the client's unacknowledged input buffers + before they can be read and interpreted by the HTTP application. + +10.4.1 400 Bad Request + + The request could not be understood by the server due to malformed + syntax. The client SHOULD NOT repeat the request without + modifications. + + + + + + + + +Fielding, et al. Standards Track [Page 65] + +RFC 2616 HTTP/1.1 June 1999 + + +10.4.2 401 Unauthorized + + The request requires user authentication.
The response MUST include a + WWW-Authenticate header field (section 14.47) containing a challenge + applicable to the requested resource. The client MAY repeat the + request with a suitable Authorization header field (section 14.8). If + the request already included Authorization credentials, then the 401 + response indicates that authorization has been refused for those + credentials. If the 401 response contains the same challenge as the + prior response, and the user agent has already attempted + authentication at least once, then the user SHOULD be presented the + entity that was given in the response, since that entity might + include relevant diagnostic information. HTTP access authentication + is explained in "HTTP Authentication: Basic and Digest Access + Authentication" [43]. + +10.4.3 402 Payment Required + + This code is reserved for future use. + +10.4.4 403 Forbidden + + The server understood the request, but is refusing to fulfill it. + Authorization will not help and the request SHOULD NOT be repeated. + If the request method was not HEAD and the server wishes to make + public why the request has not been fulfilled, it SHOULD describe the + reason for the refusal in the entity. If the server does not wish to + make this information available to the client, the status code 404 + (Not Found) can be used instead. + +10.4.5 404 Not Found + + The server has not found anything matching the Request-URI. No + indication is given of whether the condition is temporary or + permanent. The 410 (Gone) status code SHOULD be used if the server + knows, through some internally configurable mechanism, that an old + resource is permanently unavailable and has no forwarding address. + This status code is commonly used when the server does not wish to + reveal exactly why the request has been refused, or when no other + response is applicable. 
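The choice among 403, 404 and 410 described above can be summarised in a few lines. The function refusal_status and its boolean parameters are hypothetical names for facts a server would have to establish itself; this is a sketch of the decision only, not of any particular server.

```python
def refusal_status(resource_exists: bool,
                   access_refused: bool,
                   known_permanently_gone: bool = False,
                   reveal_refusal: bool = True) -> int:
    """Pick a status code for a request that may have to be refused.

    reveal_refusal=False models a server that does not wish to tell
    the client why the request was refused.
    """
    if not resource_exists:
        # 410 only when the server knows the resource is permanently
        # unavailable and has no forwarding address; otherwise 404.
        return 410 if known_permanently_gone else 404
    if access_refused:
        # 404 can stand in for 403 when the reason is to be withheld.
        return 403 if reveal_refusal else 404
    return 200
```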
+ +10.4.6 405 Method Not Allowed + + The method specified in the Request-Line is not allowed for the + resource identified by the Request-URI. The response MUST include an + Allow header containing a list of valid methods for the requested + resource. + + + + +Fielding, et al. Standards Track [Page 66] + +RFC 2616 HTTP/1.1 June 1999 + + +10.4.7 406 Not Acceptable + + The resource identified by the request is only capable of generating + response entities which have content characteristics not acceptable + according to the accept headers sent in the request. + + Unless it was a HEAD request, the response SHOULD include an entity + containing a list of available entity characteristics and location(s) + from which the user or user agent can choose the one most + appropriate. The entity format is specified by the media type given + in the Content-Type header field. Depending upon the format and the + capabilities of the user agent, selection of the most appropriate + choice MAY be performed automatically. However, this specification + does not define any standard for such automatic selection. + + Note: HTTP/1.1 servers are allowed to return responses which are + not acceptable according to the accept headers sent in the + request. In some cases, this may even be preferable to sending a + 406 response. User agents are encouraged to inspect the headers of + an incoming response to determine if it is acceptable. + + If the response could be unacceptable, a user agent SHOULD + temporarily stop receipt of more data and query the user for a + decision on further actions. + +10.4.8 407 Proxy Authentication Required + + This code is similar to 401 (Unauthorized), but indicates that the + client must first authenticate itself with the proxy. The proxy MUST + return a Proxy-Authenticate header field (section 14.33) containing a + challenge applicable to the proxy for the requested resource. 
The + client MAY repeat the request with a suitable Proxy-Authorization + header field (section 14.34). HTTP access authentication is explained + in "HTTP Authentication: Basic and Digest Access Authentication" + [43]. + +10.4.9 408 Request Timeout + + The client did not produce a request within the time that the server + was prepared to wait. The client MAY repeat the request without + modifications at any later time. + +10.4.10 409 Conflict + + The request could not be completed due to a conflict with the current + state of the resource. This code is only allowed in situations where + it is expected that the user might be able to resolve the conflict + and resubmit the request. The response body SHOULD include enough + + + +Fielding, et al. Standards Track [Page 67] + +RFC 2616 HTTP/1.1 June 1999 + + + information for the user to recognize the source of the conflict. + Ideally, the response entity would include enough information for the + user or user agent to fix the problem; however, that might not be + possible and is not required. + + Conflicts are most likely to occur in response to a PUT request. For + example, if versioning were being used and the entity being PUT + included changes to a resource which conflict with those made by an + earlier (third-party) request, the server might use the 409 response + to indicate that it can't complete the request. In this case, the + response entity would likely contain a list of the differences + between the two versions in a format defined by the response + Content-Type. + +10.4.11 410 Gone + + The requested resource is no longer available at the server and no + forwarding address is known. This condition is expected to be + considered permanent. Clients with link editing capabilities SHOULD + delete references to the Request-URI after user approval. If the + server does not know, or has no facility to determine, whether or not + the condition is permanent, the status code 404 (Not Found) SHOULD be + used instead. 
This response is cacheable unless indicated otherwise. + + The 410 response is primarily intended to assist the task of web + maintenance by notifying the recipient that the resource is + intentionally unavailable and that the server owners desire that + remote links to that resource be removed. Such an event is common for + limited-time, promotional services and for resources belonging to + individuals no longer working at the server's site. It is not + necessary to mark all permanently unavailable resources as "gone" or + to keep the mark for any length of time -- that is left to the + discretion of the server owner. + +10.4.12 411 Length Required + + The server refuses to accept the request without a defined Content- + Length. The client MAY repeat the request if it adds a valid + Content-Length header field containing the length of the message-body + in the request message. + +10.4.13 412 Precondition Failed + + The precondition given in one or more of the request-header fields + evaluated to false when it was tested on the server. This response + code allows the client to place preconditions on the current resource + metainformation (header field data) and thus prevent the requested + method from being applied to a resource other than the one intended. + + + +Fielding, et al. Standards Track [Page 68] + +RFC 2616 HTTP/1.1 June 1999 + + +10.4.14 413 Request Entity Too Large + + The server is refusing to process a request because the request + entity is larger than the server is willing or able to process. The + server MAY close the connection to prevent the client from continuing + the request. + + If the condition is temporary, the server SHOULD include a Retry- + After header field to indicate that it is temporary and after what + time the client MAY try again. + +10.4.15 414 Request-URI Too Long + + The server is refusing to service the request because the Request-URI + is longer than the server is willing to interpret. 
This rare + condition is only likely to occur when a client has improperly + converted a POST request to a GET request with long query + information, when the client has descended into a URI "black hole" of + redirection (e.g., a redirected URI prefix that points to a suffix of + itself), or when the server is under attack by a client attempting to + exploit security holes present in some servers using fixed-length + buffers for reading or manipulating the Request-URI. + +10.4.16 415 Unsupported Media Type + + The server is refusing to service the request because the entity of + the request is in a format not supported by the requested resource + for the requested method. + +10.4.17 416 Requested Range Not Satisfiable + + A server SHOULD return a response with this status code if a request + included a Range request-header field (section 14.35), and none of + the range-specifier values in this field overlap the current extent + of the selected resource, and the request did not include an If-Range + request-header field. (For byte-ranges, this means that the first- + byte-pos of all of the byte-range-spec values were greater than the + current length of the selected resource.) + + When this status code is returned for a byte-range request, the + response SHOULD include a Content-Range entity-header field + specifying the current length of the selected resource (see section + 14.16). This response MUST NOT use the multipart/byteranges content- + type. + + + + + + + +Fielding, et al. Standards Track [Page 69] + +RFC 2616 HTTP/1.1 June 1999 + + +10.4.18 417 Expectation Failed + + The expectation given in an Expect request-header field (see section + 14.20) could not be met by this server, or, if the server is a proxy, + the server has unambiguous evidence that the request could not be met + by the next-hop server. 
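The 416 rule of section 10.4.17 can be sketched for byte ranges as follows. This is a minimal illustration, not a full Range parser: the function name range_status and the tuple encoding of ranges are assumptions made for the example, and the boundary case (a first-byte-pos equal to the current length counts as unsatisfiable) follows the satisfiability rule of section 14.35.1.

```python
def range_status(ranges, current_length, has_if_range=False):
    """Return (status, content_range) for a byte-range request.

    ranges is a list of (first, last) tuples. For a suffix range such
    as "-500", first is None and last holds the suffix length. For an
    open-ended range such as "500-", last is None.
    """
    def satisfiable(first, last):
        if first is None:
            # Suffix range: satisfiable when the suffix length is non-zero.
            return last > 0
        # Ordinary range: it must start inside the current extent.
        return first < current_length

    if any(satisfiable(first, last) for first, last in ranges):
        return 206, None
    if has_if_range:
        # 416 applies only when no If-Range field was sent; this sketch
        # simply falls back to the full entity.
        return 200, None
    # Report the current length so the client can correct its request.
    return 416, "bytes */%d" % current_length
```

For a 1000-byte resource, a request for bytes 2000-2999 yields 416 together with a Content-Range value of "bytes */1000".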
+ +10.5 Server Error 5xx + + Response status codes beginning with the digit "5" indicate cases in + which the server is aware that it has erred or is incapable of + performing the request. Except when responding to a HEAD request, the + server SHOULD include an entity containing an explanation of the + error situation, and whether it is a temporary or permanent + condition. User agents SHOULD display any included entity to the + user. These response codes are applicable to any request method. + +10.5.1 500 Internal Server Error + + The server encountered an unexpected condition which prevented it + from fulfilling the request. + +10.5.2 501 Not Implemented + + The server does not support the functionality required to fulfill the + request. This is the appropriate response when the server does not + recognize the request method and is not capable of supporting it for + any resource. + +10.5.3 502 Bad Gateway + + The server, while acting as a gateway or proxy, received an invalid + response from the upstream server it accessed in attempting to + fulfill the request. + +10.5.4 503 Service Unavailable + + The server is currently unable to handle the request due to a + temporary overloading or maintenance of the server. The implication + is that this is a temporary condition which will be alleviated after + some delay. If known, the length of the delay MAY be indicated in a + Retry-After header. If no Retry-After is given, the client SHOULD + handle the response as it would for a 500 response. + + Note: The existence of the 503 status code does not imply that a + server must use it when becoming overloaded. Some servers may wish + to simply refuse the connection. + + + + +Fielding, et al. Standards Track [Page 70] + +RFC 2616 HTTP/1.1 June 1999 + + +10.5.5 504 Gateway Timeout + + The server, while acting as a gateway or proxy, did not receive a + timely response from the upstream server specified by the URI (e.g. + HTTP, FTP, LDAP) or some other auxiliary server (e.g. 
DNS) it needed + to access in attempting to complete the request. + + Note to implementors: some deployed proxies are known to + return 400 or 500 when DNS lookups time out. + +10.5.6 505 HTTP Version Not Supported + + The server does not support, or refuses to support, the HTTP protocol + version that was used in the request message. The server is + indicating that it is unable or unwilling to complete the request + using the same major version as the client, as described in section + 3.1, other than with this error message. The response SHOULD contain + an entity describing why that version is not supported and what other + protocols are supported by that server. + +11 Access Authentication + + HTTP provides several OPTIONAL challenge-response authentication + mechanisms which can be used by a server to challenge a client + request and by a client to provide authentication information. The + general framework for access authentication, and the specification of + "basic" and "digest" authentication, are specified in "HTTP + Authentication: Basic and Digest Access Authentication" [43]. This + specification adopts the definitions of "challenge" and "credentials" + from that specification. + +12 Content Negotiation + + Most HTTP responses include an entity which contains information for + interpretation by a human user. Naturally, it is desirable to supply + the user with the "best available" entity corresponding to the + request. Unfortunately for servers and caches, not all users have the + same preferences for what is "best," and not all user agents are + equally capable of rendering all entity types. For that reason, HTTP + has provisions for several mechanisms for "content negotiation" -- + the process of selecting the best representation for a given response + when there are multiple representations available.
+ + Note: This is not called "format negotiation" because the + alternate representations may be of the same media type, but use + different capabilities of that type, be in different languages, + etc. + + + + +Fielding, et al. Standards Track [Page 71] + +RFC 2616 HTTP/1.1 June 1999 + + + Any response containing an entity-body MAY be subject to negotiation, + including error responses. + + There are two kinds of content negotiation which are possible in + HTTP: server-driven and agent-driven negotiation. These two kinds of + negotiation are orthogonal and thus may be used separately or in + combination. One method of combination, referred to as transparent + negotiation, occurs when a cache uses the agent-driven negotiation + information provided by the origin server in order to provide + server-driven negotiation for subsequent requests. + +12.1 Server-driven Negotiation + + If the selection of the best representation for a response is made by + an algorithm located at the server, it is called server-driven + negotiation. Selection is based on the available representations of + the response (the dimensions over which it can vary; e.g. language, + content-coding, etc.) and the contents of particular header fields in + the request message or on other information pertaining to the request + (such as the network address of the client). + + Server-driven negotiation is advantageous when the algorithm for + selecting from among the available representations is difficult to + describe to the user agent, or when the server desires to send its + "best guess" to the client along with the first response (hoping to + avoid the round-trip delay of a subsequent request if the "best + guess" is good enough for the user). In order to improve the server's + guess, the user agent MAY include request header fields (Accept, + Accept-Language, Accept-Encoding, etc.) which describe its + preferences for such a response. + + Server-driven negotiation has disadvantages: + + 1. 
It is impossible for the server to accurately determine what + might be "best" for any given user, since that would require + complete knowledge of both the capabilities of the user agent + and the intended use for the response (e.g., does the user want + to view it on screen or print it on paper?). + + 2. Having the user agent describe its capabilities in every + request can be both very inefficient (given that only a small + percentage of responses have multiple representations) and a + potential violation of the user's privacy. + + 3. It complicates the implementation of an origin server and the + algorithms for generating responses to a request. + + + + + +Fielding, et al. Standards Track [Page 72] + +RFC 2616 HTTP/1.1 June 1999 + + + 4. It may limit a public cache's ability to use the same response + for multiple users' requests. + + HTTP/1.1 includes the following request-header fields for enabling + server-driven negotiation through description of user agent + capabilities and user preferences: Accept (section 14.1), Accept- + Charset (section 14.2), Accept-Encoding (section 14.3), Accept- + Language (section 14.4), and User-Agent (section 14.43). However, an + origin server is not limited to these dimensions and MAY vary the + response based on any aspect of the request, including information + outside the request-header fields or within extension header fields + not defined by this specification. + + The Vary header field can be used to express the parameters the + server uses to select a representation that is subject to server- + driven negotiation. See section 13.6 for use of the Vary header field + by caches and section 14.44 for use of the Vary header field by + servers. + +12.2 Agent-driven Negotiation + + With agent-driven negotiation, selection of the best representation + for a response is performed by the user agent after receiving an + initial response from the origin server.
Selection is based on a list + of the available representations of the response included within the + header fields or entity-body of the initial response, with each + representation identified by its own URI. Selection from among the + representations may be performed automatically (if the user agent is + capable of doing so) or manually by the user selecting from a + generated (possibly hypertext) menu. + + Agent-driven negotiation is advantageous when the response would vary + over commonly-used dimensions (such as type, language, or encoding), + when the origin server is unable to determine a user agent's + capabilities from examining the request, and generally when public + caches are used to distribute server load and reduce network usage. + + Agent-driven negotiation suffers from the disadvantage of needing a + second request to obtain the best alternate representation. This + second request is only efficient when caching is used. In addition, + this specification does not define any mechanism for supporting + automatic selection, though it also does not prevent any such + mechanism from being developed as an extension and used within + HTTP/1.1. + + + + + + + +Fielding, et al. Standards Track [Page 73] + +RFC 2616 HTTP/1.1 June 1999 + + + HTTP/1.1 defines the 300 (Multiple Choices) and 406 (Not Acceptable) + status codes for enabling agent-driven negotiation when the server is + unwilling or unable to provide a varying response using server-driven + negotiation. + +12.3 Transparent Negotiation + + Transparent negotiation is a combination of both server-driven and + agent-driven negotiation. When a cache is supplied with a form of the + list of available representations of the response (as in agent-driven + negotiation) and the dimensions of variance are completely understood + by the cache, then the cache becomes capable of performing server- + driven negotiation on behalf of the origin server for subsequent + requests on that resource. 
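As one illustration of a cache performing server-driven negotiation on behalf of the origin server, the sketch below selects among a known list of variants using the Accept-Language request header. It is a simplified example under stated assumptions: the function names and the variants mapping are hypothetical, only the language dimension is handled, and the matching rule (exact tag, or a prefix followed by "-", with the longest matching range winning) is the one given for Accept-Language in section 14.4.

```python
def parse_accept_language(value):
    """Parse an Accept-Language value into (language-range, q) pairs."""
    prefs = []
    for item in value.split(","):
        parts = [p.strip() for p in item.split(";")]
        language_range = parts[0].lower()
        if not language_range:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            if param.lower().startswith("q="):
                q = float(param[2:])
        prefs.append((language_range, q))
    return prefs


def quality_for(tag, prefs):
    """Return the q value of the longest range matching a language tag."""
    tag = tag.lower()
    best_len, best_q = -1, 0.0
    for language_range, q in prefs:
        if language_range == "*":
            length = 0  # wildcard matches anything, with lowest precedence
        elif tag == language_range or tag.startswith(language_range + "-"):
            length = len(language_range)
        else:
            continue
        if length > best_len:
            best_len, best_q = length, q
    return best_q


def choose_variant(variants, accept_language):
    """Return the URI of the preferred variant, or None if none is acceptable.

    variants maps a language tag to the URI of that representation.
    """
    prefs = parse_accept_language(accept_language)
    best_uri, best_q = None, 0.0
    for tag, uri in variants.items():
        q = quality_for(tag, prefs)
        if q > best_q:
            best_uri, best_q = uri, q
    return best_uri
```

A cache holding English and Danish variants would answer a request carrying "da, en-gb;q=0.8, en;q=0.7" with the Danish one, without contacting the origin server.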
+ + Transparent negotiation has the advantage of distributing the + negotiation work that would otherwise be required of the origin + server and also removing the second request delay of agent-driven + negotiation when the cache is able to correctly guess the right + response. + + This specification does not define any mechanism for transparent + negotiation, though it also does not prevent any such mechanism from + being developed as an extension that could be used within HTTP/1.1. + +13 Caching in HTTP + + HTTP is typically used for distributed information systems, where + performance can be improved by the use of response caches. The + HTTP/1.1 protocol includes a number of elements intended to make + caching work as well as possible. Because these elements are + inextricable from other aspects of the protocol, and because they + interact with each other, it is useful to describe the basic caching + design of HTTP separately from the detailed descriptions of methods, + headers, response codes, etc. + + Caching would be useless if it did not significantly improve + performance. The goal of caching in HTTP/1.1 is to eliminate the need + to send requests in many cases, and to eliminate the need to send + full responses in many other cases. The former reduces the number of + network round-trips required for many operations; we use an + "expiration" mechanism for this purpose (see section 13.2). The + latter reduces network bandwidth requirements; we use a "validation" + mechanism for this purpose (see section 13.3). + + Requirements for performance, availability, and disconnected + operation require us to be able to relax the goal of semantic + transparency. The HTTP/1.1 protocol allows origin servers, caches, + + + +Fielding, et al. Standards Track [Page 74] + +RFC 2616 HTTP/1.1 June 1999 + + + and clients to explicitly reduce transparency when necessary. 
+ However, because non-transparent operation may confuse non-expert + users, and might be incompatible with certain server applications + (such as those for ordering merchandise), the protocol requires that + transparency be relaxed + + - only by an explicit protocol-level request when relaxed by + client or origin server + + - only with an explicit warning to the end user when relaxed by + cache or client + + Therefore, the HTTP/1.1 protocol provides these important elements: + + 1. Protocol features that provide full semantic transparency when + this is required by all parties. + + 2. Protocol features that allow an origin server or user agent to + explicitly request and control non-transparent operation. + + 3. Protocol features that allow a cache to attach warnings to + responses that do not preserve the requested approximation of + semantic transparency. + + A basic principle is that it must be possible for the clients to + detect any potential relaxation of semantic transparency. + + Note: The server, cache, or client implementor might be faced with + design decisions not explicitly discussed in this specification. + If a decision might affect semantic transparency, the implementor + ought to err on the side of maintaining transparency unless a + careful and complete analysis shows significant benefits in + breaking transparency. + +13.1.1 Cache Correctness + + A correct cache MUST respond to a request with the most up-to-date + response held by the cache that is appropriate to the request (see + sections 13.2.5, 13.2.6, and 13.12) which meets one of the following + conditions: + + 1. It has been checked for equivalence with what the origin server + would have returned by revalidating the response with the + origin server (section 13.3); + + + + + + + +Fielding, et al. Standards Track [Page 75] + +RFC 2616 HTTP/1.1 June 1999 + + + 2. It is "fresh enough" (see section 13.2). 
In the default case, + this means it meets the least restrictive freshness requirement + of the client, origin server, and cache (see section 14.9); if + the origin server so specifies, it is the freshness requirement + of the origin server alone. + + If a stored response is not "fresh enough" by the most + restrictive freshness requirement of both the client and the + origin server, in carefully considered circumstances the cache + MAY still return the response with the appropriate Warning + header (see section 13.1.5 and 14.46), unless such a response + is prohibited (e.g., by a "no-store" cache-directive, or by a + "no-cache" cache-request-directive; see section 14.9). + + 3. It is an appropriate 304 (Not Modified), 305 (Proxy Redirect), + or error (4xx or 5xx) response message. + + If the cache can not communicate with the origin server, then a + correct cache SHOULD respond as above if the response can be + correctly served from the cache; if not it MUST return an error or + warning indicating that there was a communication failure. + + If a cache receives a response (either an entire response, or a 304 + (Not Modified) response) that it would normally forward to the + requesting client, and the received response is no longer fresh, the + cache SHOULD forward it to the requesting client without adding a new + Warning (but without removing any existing Warning headers). A cache + SHOULD NOT attempt to revalidate a response simply because that + response became stale in transit; this might lead to an infinite + loop. A user agent that receives a stale response without a Warning + MAY display a warning indication to the user. + +13.1.2 Warnings + + Whenever a cache returns a response that is neither first-hand nor + "fresh enough" (in the sense of condition 2 in section 13.1.1), it + MUST attach a warning to that effect, using a Warning general-header. + The Warning header and the currently defined warnings are described + in section 14.46. 
The warning allows clients to take appropriate + action. + + Warnings MAY be used for other purposes, both cache-related and + otherwise. The use of a warning, rather than an error status code, + distinguishes these responses from true failures. + + Warnings are assigned three digit warn-codes. The first digit + indicates whether the Warning MUST or MUST NOT be deleted from a + stored cache entry after a successful revalidation: + + + +Fielding, et al. Standards Track [Page 76] + +RFC 2616 HTTP/1.1 June 1999 + + + 1xx Warnings that describe the freshness or revalidation status of + the response, and so MUST be deleted after a successful + revalidation. 1xx warn-codes MAY be generated by a cache only when + validating a cached entry. They MUST NOT be generated by clients. + + 2xx Warnings that describe some aspect of the entity body or entity + headers that is not rectified by a revalidation (for example, a + lossy compression of the entity bodies) and which MUST NOT be + deleted after a successful revalidation. + + See section 14.46 for the definitions of the codes themselves. + + HTTP/1.0 caches will cache all Warnings in responses, without + deleting the ones in the first category. Warnings in responses that + are passed to HTTP/1.0 caches carry an extra warning-date field, + which prevents a future HTTP/1.1 recipient from believing an + erroneously cached Warning. + + Warnings also carry a warning text. The text MAY be in any + appropriate natural language (perhaps based on the client's Accept + headers), and include an OPTIONAL indication of what character set is + used. + + Multiple warnings MAY be attached to a response (either by the origin + server or by a cache), including multiple warnings with the same code + number. For example, a server might provide the same warning with + texts in both English and Basque. + + When multiple warnings are attached to a response, it might not be + practical or reasonable to display all of them to the user.
This + version of HTTP does not specify strict priority rules for deciding + which warnings to display and in what order, but does suggest some + heuristics. + +13.1.3 Cache-control Mechanisms + + The basic cache mechanisms in HTTP/1.1 (server-specified expiration + times and validators) are implicit directives to caches. In some + cases, a server or client might need to provide explicit directives + to the HTTP caches. We use the Cache-Control header for this purpose. + + The Cache-Control header allows a client or server to transmit a + variety of directives in either requests or responses. These + directives typically override the default caching algorithms. As a + general rule, if there is any apparent conflict between header + values, the most restrictive interpretation is applied (that is, the + one that is most likely to preserve semantic transparency). However, + + + + +Fielding, et al. Standards Track [Page 77] + +RFC 2616 HTTP/1.1 June 1999 + + + in some cases, cache-control directives are explicitly specified as + weakening the approximation of semantic transparency (for example, + "max-stale" or "public"). + + The cache-control directives are described in detail in section 14.9. + +13.1.4 Explicit User Agent Warnings + + Many user agents make it possible for users to override the basic + caching mechanisms. For example, the user agent might allow the user + to specify that cached entities (even explicitly stale ones) are + never validated. Or the user agent might habitually add "Cache- + Control: max-stale=3600" to every request. The user agent SHOULD NOT + default to either non-transparent behavior, or behavior that results + in abnormally ineffective caching, but MAY be explicitly configured + to do so by an explicit action of the user. 
+ + If the user has overridden the basic caching mechanisms, the user + agent SHOULD explicitly indicate to the user whenever this results in + the display of information that might not meet the server's + transparency requirements (in particular, if the displayed entity is + known to be stale). Since the protocol normally allows the user agent + to determine if responses are stale or not, this indication need only + be displayed when this actually happens. The indication need not be a + dialog box; it could be an icon (for example, a picture of a rotting + fish) or some other indicator. + + If the user has overridden the caching mechanisms in a way that would + abnormally reduce the effectiveness of caches, the user agent SHOULD + continually indicate this state to the user (for example, by a + display of a picture of currency in flames) so that the user does not + inadvertently consume excess resources or suffer from excessive + latency. + +13.1.5 Exceptions to the Rules and Warnings + + In some cases, the operator of a cache MAY choose to configure it to + return stale responses even when not requested by clients. This + decision ought not be made lightly, but may be necessary for reasons + of availability or performance, especially when the cache is poorly + connected to the origin server. Whenever a cache returns a stale + response, it MUST mark it as such (using a Warning header) enabling + the client software to alert the user that there might be a potential + problem. + + + + + + + +Fielding, et al. Standards Track [Page 78] + +RFC 2616 HTTP/1.1 June 1999 + + + It also allows the user agent to take steps to obtain a first-hand or + fresh response. For this reason, a cache SHOULD NOT return a stale + response if the client explicitly requests a first-hand or fresh one, + unless it is impossible to comply for technical or policy reasons. 
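The requirement that a stale response be marked with a Warning header can be sketched as below. This is a minimal illustration under stated assumptions: the function name serve_from_cache, the header list representation and the warn-agent "cache.example" are hypothetical, while warn-code 110 with the text "Response is stale" is the one defined in section 14.46.

```python
def serve_from_cache(headers, current_age, freshness_lifetime,
                     agent="cache.example"):
    """Return a copy of the stored headers, marking the response if stale.

    headers is a list of (name, value) tuples; ages are in seconds.
    """
    out = list(headers)
    # A response is fresh only while its lifetime exceeds its age.
    if not freshness_lifetime > current_age:
        out.append(("Warning", '110 %s "Response is stale"' % agent))
    return out
```

The stored entry itself is left untouched, so the Warning is added only to the copy that is sent to the client.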
+ +13.1.6 Client-controlled Behavior + + While the origin server (and to a lesser extent, intermediate caches, + by their contribution to the age of a response) are the primary + source of expiration information, in some cases the client might need + to control a cache's decision about whether to return a cached + response without validating it. Clients do this using several + directives of the Cache-Control header. + + A client's request MAY specify the maximum age it is willing to + accept of an unvalidated response; specifying a value of zero forces + the cache(s) to revalidate all responses. A client MAY also specify + the minimum time remaining before a response expires. Both of these + options increase constraints on the behavior of caches, and so cannot + further relax the cache's approximation of semantic transparency. + + A client MAY also specify that it will accept stale responses, up to + some maximum amount of staleness. This loosens the constraints on the + caches, and so might violate the origin server's specified + constraints on semantic transparency, but might be necessary to + support disconnected operation, or high availability in the face of + poor connectivity. + +13.2 Expiration Model + +13.2.1 Server-Specified Expiration + + HTTP caching works best when caches can entirely avoid making + requests to the origin server. The primary mechanism for avoiding + requests is for an origin server to provide an explicit expiration + time in the future, indicating that a response MAY be used to satisfy + subsequent requests. In other words, a cache can return a fresh + response without first contacting the server. + + Our expectation is that servers will assign future explicit + expiration times to responses in the belief that the entity is not + likely to change, in a semantically significant way, before the + expiration time is reached. This normally preserves semantic + transparency, as long as the server's expiration times are carefully + chosen. 
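The client-controlled directives of section 13.1.6 can be combined with the expiration model in a single check. The sketch below is an assumption-laden illustration, not the specification's algorithm: the function name acceptable_to_client is hypothetical, all values are in seconds, and a max-stale directive without a value is not modelled.

```python
def acceptable_to_client(current_age, freshness_lifetime,
                         max_age=None, min_fresh=None, max_stale=None):
    """Decide whether a cached response may be returned without validation.

    A parameter left as None means the directive was absent from the request.
    """
    # max-age: the response must not be older than the client allows;
    # a value of zero always forces revalidation.
    if max_age is not None and (max_age == 0 or current_age > max_age):
        return False
    remaining = freshness_lifetime - current_age  # not positive when stale
    # min-fresh: the response must stay fresh for at least this long.
    if min_fresh is not None and remaining < min_fresh:
        return False
    if remaining > 0:
        return True
    # max-stale: a stale response is acceptable up to this much staleness.
    return max_stale is not None and -remaining <= max_stale
```

The first two directives can only tighten the result, whereas max-stale is the one case in which a stale response is accepted.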
+ + + + + +Fielding, et al. Standards Track [Page 79] + +RFC 2616 HTTP/1.1 June 1999 + + + The expiration mechanism applies only to responses taken from a cache + and not to first-hand responses forwarded immediately to the + requesting client. + + If an origin server wishes to force a semantically transparent cache + to validate every request, it MAY assign an explicit expiration time + in the past. This means that the response is always stale, and so the + cache SHOULD validate it before using it for subsequent requests. See + section 14.9.4 for a more restrictive way to force revalidation. + + If an origin server wishes to force any HTTP/1.1 cache, no matter how + it is configured, to validate every request, it SHOULD use the "must- + revalidate" cache-control directive (see section 14.9). + + Servers specify explicit expiration times using either the Expires + header, or the max-age directive of the Cache-Control header. + + An expiration time cannot be used to force a user agent to refresh + its display or reload a resource; its semantics apply only to caching + mechanisms, and such mechanisms need only check a resource's + expiration status when a new request for that resource is initiated. + See section 13.13 for an explanation of the difference between caches + and history mechanisms. + +13.2.2 Heuristic Expiration + + Since origin servers do not always provide explicit expiration times, + HTTP caches typically assign heuristic expiration times, employing + algorithms that use other header values (such as the Last-Modified + time) to estimate a plausible expiration time. The HTTP/1.1 + specification does not provide specific algorithms, but does impose + worst-case constraints on their results. Since heuristic expiration + times might compromise semantic transparency, they ought to be used + cautiously, and we encourage origin servers to provide explicit + expiration times as much as possible.
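The order in which a cache might derive a freshness lifetime from explicit and heuristic sources can be sketched as follows. This is an illustration only: the function name freshness_lifetime and its parameters are hypothetical, all times are seconds since a common epoch, and the heuristic (one tenth of the interval since Last-Modified) is the typical fraction mentioned in section 13.2.4, not a required algorithm.

```python
def freshness_lifetime(date_value, max_age=None, expires_value=None,
                       last_modified_value=None, divisor=10):
    """Return (lifetime_in_seconds, is_heuristic) for a stored response."""
    if max_age is not None:
        # The max-age directive takes precedence over Expires.
        return max_age, False
    if expires_value is not None:
        # An expiration time in the past gives a zero lifetime, so the
        # response is always stale.
        return max(0, expires_value - date_value), False
    if last_modified_value is not None:
        # Heuristic: a fraction of the time since the last modification.
        return max(0, (date_value - last_modified_value) // divisor), True
    # Nothing to go on: treat the response as immediately stale.
    return 0, True
```

Returning the is_heuristic flag alongside the lifetime lets the caller treat heuristically fresh responses more cautiously than explicitly fresh ones.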
+ +13.2.3 Age Calculations + + In order to know if a cached entry is fresh, a cache needs to know if + its age exceeds its freshness lifetime. We discuss how to calculate + the latter in section 13.2.4; this section describes how to calculate + the age of a response or cache entry. + + In this discussion, we use the term "now" to mean "the current value + of the clock at the host performing the calculation." Hosts that use + HTTP, but especially hosts running origin servers and caches, SHOULD + use NTP [28] or some similar protocol to synchronize their clocks to + a globally accurate time standard. + + + +Fielding, et al. Standards Track [Page 80] + +RFC 2616 HTTP/1.1 June 1999 + + + HTTP/1.1 requires origin servers to send a Date header, if possible, + with every response, giving the time at which the response was + generated (see section 14.18). We use the term "date_value" to denote + the value of the Date header, in a form appropriate for arithmetic + operations. + + HTTP/1.1 uses the Age response-header to convey the estimated age of + the response message when obtained from a cache. The Age field value + is the cache's estimate of the amount of time since the response was + generated or revalidated by the origin server. + + In essence, the Age value is the sum of the time that the response + has been resident in each of the caches along the path from the + origin server, plus the amount of time it has been in transit along + network paths. + + We use the term "age_value" to denote the value of the Age header, in + a form appropriate for arithmetic operations. + + A response's age can be calculated in two entirely independent ways: + + 1. now minus date_value, if the local clock is reasonably well + synchronized to the origin server's clock. If the result is + negative, the result is replaced by zero. + + 2. age_value, if all of the caches along the response path + implement HTTP/1.1. 
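The two estimates listed above feed into the age calculation that this section summarizes in pseudo-code. A runnable Python transcription of that summary might look as follows. It assumes all values are integer seconds since the epoch (or plain second counts for `age_value`); the function name `current_age` is chosen for this sketch.

```python
def current_age(age_value, date_value, request_time, response_time, now):
    """Compute the current age of a cache entry in seconds.

    age_value:     value of the Age header received with the response
    date_value:    value of the origin server's Date header
    request_time:  local time when the cache sent the request
    response_time: local time when the cache received the response
    now:           current local time
    """
    # Estimate 1: local clock minus Date, never negative.
    apparent_age = max(0, response_time - date_value)
    # Take the more conservative of estimate 1 and the Age header.
    corrected_received_age = max(apparent_age, age_value)
    # Charge the whole round trip to the response's age.
    response_delay = response_time - request_time
    corrected_initial_age = corrected_received_age + response_delay
    # Add the time the entry has been resident in this cache.
    resident_time = now - response_time
    return corrected_initial_age + resident_time
```

Taking the maximum of the two estimates means that a skewed clock or a missing Age header can only make the entry look older, never younger.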
+ + Given that we have two independent ways to compute the age of a + response when it is received, we can combine these as + + corrected_received_age = max(now - date_value, age_value) + + and as long as we have either nearly synchronized clocks or all- + HTTP/1.1 paths, one gets a reliable (conservative) result. + + Because of network-imposed delays, some significant interval might + pass between the time that a server generates a response and the time + it is received at the next outbound cache or client. If uncorrected, + this delay could result in improperly low ages. + + Because the request that resulted in the returned Age value must have + been initiated prior to that Age value's generation, we can correct + for delays imposed by the network by recording the time at which the + request was initiated. Then, when an Age value is received, it MUST + be interpreted relative to the time the request was initiated, not + + + + + +Fielding, et al. Standards Track [Page 81] + +RFC 2616 HTTP/1.1 June 1999 + + + the time that the response was received. This algorithm results in + conservative behavior no matter how much delay is experienced. So, we + compute: + + corrected_initial_age = corrected_received_age + + (now - request_time) + + where "request_time" is the time (according to the local clock) when + the request that elicited this response was sent. + + Summary of age calculation algorithm, when a cache receives a + response: + + /* + * age_value + * is the value of Age: header received by the cache with + * this response. 
+ * date_value + * is the value of the origin server's Date: header + * request_time + * is the (local) time when the cache made the request + * that resulted in this cached response + * response_time + * is the (local) time when the cache received the + * response + * now + * is the current (local) time + */ + + apparent_age = max(0, response_time - date_value); + corrected_received_age = max(apparent_age, age_value); + response_delay = response_time - request_time; + corrected_initial_age = corrected_received_age + response_delay; + resident_time = now - response_time; + current_age = corrected_initial_age + resident_time; + + The current_age of a cache entry is calculated by adding the amount + of time (in seconds) since the cache entry was last validated by the + origin server to the corrected_initial_age. When a response is + generated from a cache entry, the cache MUST include a single Age + header field in the response with a value equal to the cache entry's + current_age. + + The presence of an Age header field in a response implies that a + response is not first-hand. However, the converse is not true, since + the lack of an Age header field in a response does not imply that the + + + + + +Fielding, et al. Standards Track [Page 82] + +RFC 2616 HTTP/1.1 June 1999 + + + response is first-hand unless all caches along the request path are + compliant with HTTP/1.1 (i.e., older HTTP caches did not implement + the Age header field). + +13.2.4 Expiration Calculations + + In order to decide whether a response is fresh or stale, we need to + compare its freshness lifetime to its age. The age is calculated as + described in section 13.2.3; this section describes how to calculate + the freshness lifetime, and to determine if a response has expired. + In the discussion below, the values can be represented in any form + appropriate for arithmetic operations. + + We use the term "expires_value" to denote the value of the Expires + header. 
We use the term "max_age_value" to denote an appropriate + value of the number of seconds carried by the "max-age" directive of + the Cache-Control header in a response (see section 14.9.3). + + The max-age directive takes priority over Expires, so if max-age is + present in a response, the calculation is simply: + + freshness_lifetime = max_age_value + + Otherwise, if Expires is present in the response, the calculation is: + + freshness_lifetime = expires_value - date_value + + Note that neither of these calculations is vulnerable to clock skew, + since all of the information comes from the origin server. + + If none of Expires, Cache-Control: max-age, or Cache-Control: s- + maxage (see section 14.9.3) appears in the response, and the response + does not include other restrictions on caching, the cache MAY compute + a freshness lifetime using a heuristic. The cache MUST attach Warning + 113 to any response whose age is more than 24 hours if such warning + has not already been added. + + Also, if the response does have a Last-Modified time, the heuristic + expiration value SHOULD be no more than some fraction of the interval + since that time. A typical setting of this fraction might be 10%. + + The calculation to determine if a response has expired is quite + simple: + + response_is_fresh = (freshness_lifetime > current_age) + + + + + + +Fielding, et al. Standards Track [Page 83] + +RFC 2616 HTTP/1.1 June 1999 + + +13.2.5 Disambiguating Expiration Values + + Because expiration values are assigned optimistically, it is possible + for two caches to contain fresh values for the same resource that are + different. + + If a client performing a retrieval receives a non-first-hand response + for a request that was already fresh in its own cache, and the Date + header in its existing cache entry is newer than the Date on the new + response, then the client MAY ignore the response. 
If so, it MAY + retry the request with a "Cache-Control: max-age=0" directive (see + section 14.9), to force a check with the origin server. + + If a cache has two fresh responses for the same representation with + different validators, it MUST use the one with the more recent Date + header. This situation might arise because the cache is pooling + responses from other caches, or because a client has asked for a + reload or a revalidation of an apparently fresh cache entry. + +13.2.6 Disambiguating Multiple Responses + + Because a client might be receiving responses via multiple paths, so + that some responses flow through one set of caches and other + responses flow through a different set of caches, a client might + receive responses in an order different from that in which the origin + server sent them. We would like the client to use the most recently + generated response, even if older responses are still apparently + fresh. + + Neither the entity tag nor the expiration value can impose an + ordering on responses, since it is possible that a later response + intentionally carries an earlier expiration time. The Date values are + ordered to a granularity of one second. + + When a client tries to revalidate a cache entry, and the response it + receives contains a Date header that appears to be older than the one + for the existing entry, then the client SHOULD repeat the request + unconditionally, and include + + Cache-Control: max-age=0 + + to force any intermediate caches to validate their copies directly + with the origin server, or + + Cache-Control: no-cache + + to force any intermediate caches to obtain a new copy from the origin + server. + + + +Fielding, et al. Standards Track [Page 84] + +RFC 2616 HTTP/1.1 June 1999 + + + If the Date values are equal, then the client MAY use either response + (or MAY, if it is being extremely prudent, request a new response). 
+ Servers MUST NOT depend on clients being able to choose + deterministically between responses generated during the same second, + if their expiration times overlap. + +13.3 Validation Model + + When a cache has a stale entry that it would like to use as a + response to a client's request, it first has to check with the origin + server (or possibly an intermediate cache with a fresh response) to + see if its cached entry is still usable. We call this "validating" + the cache entry. Since we do not want to have to pay the overhead of + retransmitting the full response if the cached entry is good, and we + do not want to pay the overhead of an extra round trip if the cached + entry is invalid, the HTTP/1.1 protocol supports the use of + conditional methods. + + The key protocol features for supporting conditional methods are + those concerned with "cache validators." When an origin server + generates a full response, it attaches some sort of validator to it, + which is kept with the cache entry. When a client (user agent or + proxy cache) makes a conditional request for a resource for which it + has a cache entry, it includes the associated validator in the + request. + + The server then checks that validator against the current validator + for the entity, and, if they match (see section 13.3.3), it responds + with a special status code (usually, 304 (Not Modified)) and no + entity-body. Otherwise, it returns a full response (including + entity-body). Thus, we avoid transmitting the full response if the + validator matches, and we avoid an extra round trip if it does not + match. + + In HTTP/1.1, a conditional request looks exactly the same as a normal + request for the same resource, except that it carries a special + header (which includes the validator) that implicitly turns the + method (usually, GET) into a conditional. + + The protocol includes both positive and negative senses of cache- + validating conditions. 
That is, it is possible to request either that + a method be performed if and only if a validator matches or if and + only if no validators match. + + + + + + + + +Fielding, et al. Standards Track [Page 85] + +RFC 2616 HTTP/1.1 June 1999 + + + Note: a response that lacks a validator may still be cached, and + served from cache until it expires, unless this is explicitly + prohibited by a cache-control directive. However, a cache cannot + do a conditional retrieval if it does not have a validator for the + entity, which means it will not be refreshable after it expires. + +13.3.1 Last-Modified Dates + + The Last-Modified entity-header field value is often used as a cache + validator. In simple terms, a cache entry is considered to be valid + if the entity has not been modified since the Last-Modified value. + +13.3.2 Entity Tag Cache Validators + + The ETag response-header field value, an entity tag, provides for an + "opaque" cache validator. This might allow more reliable validation + in situations where it is inconvenient to store modification dates, + where the one-second resolution of HTTP date values is not + sufficient, or where the origin server wishes to avoid certain + paradoxes that might arise from the use of modification dates. + + Entity Tags are described in section 3.11. The headers used with + entity tags are described in sections 14.19, 14.24, 14.26 and 14.44. + +13.3.3 Weak and Strong Validators + + Since both origin servers and caches will compare two validators to + decide if they represent the same or different entities, one normally + would expect that if the entity (the entity-body or any entity- + headers) changes in any way, then the associated validator would + change as well. If this is true, then we call this validator a + "strong validator." + + However, there might be cases when a server prefers to change the + validator only on semantically significant changes, and not when + insignificant aspects of the entity change. 
A validator that does not + always change when the resource changes is a "weak validator." + + Entity tags are normally "strong validators," but the protocol + provides a mechanism to tag an entity tag as "weak." One can think of + a strong validator as one that changes whenever the bits of an entity + change, while a weak validator changes whenever the meaning of an entity + changes. Alternatively, one can think of a strong validator as part + of an identifier for a specific entity, while a weak validator is + part of an identifier for a set of semantically equivalent entities. + + Note: One example of a strong validator is an integer that is + incremented in stable storage every time an entity is changed. + + An entity's modification time, if represented with one-second + resolution, could be a weak validator, since it is possible that + the resource might be modified twice during a single second. + + Support for weak validators is optional. However, weak validators + allow for more efficient caching of equivalent objects; for + example, a hit counter on a site is probably good enough if it is + updated every few days or weeks, and any value during that period + is likely "good enough" to be equivalent. + + A "use" of a validator is either when a client generates a request + and includes the validator in a validating header field, or when a + server compares two validators. + + Strong validators are usable in any context. Weak validators are only + usable in contexts that do not depend on exact equality of an entity. + For example, either kind is usable for a conditional GET of a full + entity. However, only a strong validator is usable for a sub-range + retrieval, since otherwise the client might end up with an internally + inconsistent entity. + + Clients MAY issue simple (non-subrange) GET requests with either weak + validators or strong validators.
Clients MUST NOT use weak validators + in other forms of request. + + The only function that the HTTP/1.1 protocol defines on validators is + comparison. There are two validator comparison functions, depending + on whether the comparison context allows the use of weak validators + or not: + + - The strong comparison function: in order to be considered equal, + both validators MUST be identical in every way, and both MUST + NOT be weak. + + - The weak comparison function: in order to be considered equal, + both validators MUST be identical in every way, but either or + both of them MAY be tagged as "weak" without affecting the + result. + + An entity tag is strong unless it is explicitly tagged as weak. + Section 3.11 gives the syntax for entity tags. + + A Last-Modified time, when used as a validator in a request, is + implicitly weak unless it is possible to deduce that it is strong, + using the following rules: + + - The validator is being compared by an origin server to the + actual current validator for the entity and, + + + +Fielding, et al. Standards Track [Page 87] + +RFC 2616 HTTP/1.1 June 1999 + + + - That origin server reliably knows that the associated entity did + not change twice during the second covered by the presented + validator. + + or + + - The validator is about to be used by a client in an If- + Modified-Since or If-Unmodified-Since header, because the client + has a cache entry for the associated entity, and + + - That cache entry includes a Date value, which gives the time + when the origin server sent the original response, and + + - The presented Last-Modified time is at least 60 seconds before + the Date value. 
+ + or + + - The validator is being compared by an intermediate cache to the + validator stored in its cache entry for the entity, and + + - That cache entry includes a Date value, which gives the time + when the origin server sent the original response, and + + - The presented Last-Modified time is at least 60 seconds before + the Date value. + + This method relies on the fact that if two different responses were + sent by the origin server during the same second, but both had the + same Last-Modified time, then at least one of those responses would + have a Date value equal to its Last-Modified time. The arbitrary 60- + second limit guards against the possibility that the Date and Last- + Modified values are generated from different clocks, or at somewhat + different times during the preparation of the response. An + implementation MAY use a value larger than 60 seconds, if it is + believed that 60 seconds is too short. + + If a client wishes to perform a sub-range retrieval on a value for + which it has only a Last-Modified time and no opaque validator, it + MAY do this only if the Last-Modified time is strong in the sense + described here. + + A cache or origin server receiving a conditional request, other than + a full-body GET request, MUST use the strong comparison function to + evaluate the condition. + + These rules allow HTTP/1.1 caches and clients to safely perform sub- + range retrievals on values that have been obtained from HTTP/1.0 + + + +Fielding, et al. Standards Track [Page 88] + +RFC 2616 HTTP/1.1 June 1999 + + + servers. + +13.3.4 Rules for When to Use Entity Tags and Last-Modified Dates + + We adopt a set of rules and recommendations for origin servers, + clients, and caches regarding when various validator types ought to + be used, and for what purposes. + + HTTP/1.1 origin servers: + + - SHOULD send an entity tag validator unless it is not feasible to + generate one. 
+ + - MAY send a weak entity tag instead of a strong entity tag, if + performance considerations support the use of weak entity tags, + or if it is unfeasible to send a strong entity tag. + + - SHOULD send a Last-Modified value if it is feasible to send one, + unless the risk of a breakdown in semantic transparency that + could result from using this date in an If-Modified-Since header + would lead to serious problems. + + In other words, the preferred behavior for an HTTP/1.1 origin server + is to send both a strong entity tag and a Last-Modified value. + + In order to be legal, a strong entity tag MUST change whenever the + associated entity value changes in any way. A weak entity tag SHOULD + change whenever the associated entity changes in a semantically + significant way. + + Note: in order to provide semantically transparent caching, an + origin server must avoid reusing a specific strong entity tag + value for two different entities, or reusing a specific weak + entity tag value for two semantically different entities. Cache + entries might persist for arbitrarily long periods, regardless of + expiration times, so it might be inappropriate to expect that a + cache will never again attempt to validate an entry using a + validator that it obtained at some point in the past. + + HTTP/1.1 clients: + + - If an entity tag has been provided by the origin server, MUST + use that entity tag in any cache-conditional request (using If- + Match or If-None-Match). + + - If only a Last-Modified value has been provided by the origin + server, SHOULD use that value in non-subrange cache-conditional + requests (using If-Modified-Since). + + + +Fielding, et al. Standards Track [Page 89] + +RFC 2616 HTTP/1.1 June 1999 + + + - If only a Last-Modified value has been provided by an HTTP/1.0 + origin server, MAY use that value in subrange cache-conditional + requests (using If-Unmodified-Since:). The user agent SHOULD + provide a way to disable this, in case of difficulty. 
+ + - If both an entity tag and a Last-Modified value have been + provided by the origin server, SHOULD use both validators in + cache-conditional requests. This allows both HTTP/1.0 and + HTTP/1.1 caches to respond appropriately. + + An HTTP/1.1 origin server, upon receiving a conditional request that + includes both a Last-Modified date (e.g., in an If-Modified-Since or + If-Unmodified-Since header field) and one or more entity tags (e.g., + in an If-Match, If-None-Match, or If-Range header field) as cache + validators, MUST NOT return a response status of 304 (Not Modified) + unless doing so is consistent with all of the conditional header + fields in the request. + + An HTTP/1.1 caching proxy, upon receiving a conditional request that + includes both a Last-Modified date and one or more entity tags as + cache validators, MUST NOT return a locally cached response to the + client unless that cached response is consistent with all of the + conditional header fields in the request. + + Note: The general principle behind these rules is that HTTP/1.1 + servers and clients should transmit as much non-redundant + information as is available in their responses and requests. + HTTP/1.1 systems receiving this information will make the most + conservative assumptions about the validators they receive. + + HTTP/1.0 clients and caches will ignore entity tags. Generally, + last-modified values received or used by these systems will + support transparent and efficient caching, and so HTTP/1.1 origin + servers should provide Last-Modified values. In those rare cases + where the use of a Last-Modified value as a validator by an + HTTP/1.0 system could result in a serious problem, then HTTP/1.1 + origin servers should not provide one. 
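The comparison functions of section 13.3.3 and the client rules above can be sketched in Python as follows. This is a simplified illustration, not a complete implementation: it handles entity tags only (a weak tag being one prefixed with `W/`, per section 3.11), it covers only the non-subrange case, and the function names and the dictionary layout of `entry` are hypothetical.

```python
def parse_etag(tag):
    """Split an entity tag into (is_weak, opaque_tag)."""
    if tag.startswith("W/"):
        return True, tag[2:]
    return False, tag


def strong_match(a, b):
    """Strong comparison: identical, and neither tag is weak."""
    weak_a, opaque_a = parse_etag(a)
    weak_b, opaque_b = parse_etag(b)
    return not weak_a and not weak_b and opaque_a == opaque_b


def weak_match(a, b):
    """Weak comparison: identical apart from the weakness marker."""
    return parse_etag(a)[1] == parse_etag(b)[1]


def conditional_headers(entry):
    """Validators a client might send for a non-subrange GET.

    entry: dict with optional "ETag" and "Last-Modified" values copied
    from the cached response (assumed layout).
    """
    headers = {}
    if entry.get("ETag"):
        headers["If-None-Match"] = entry["ETag"]
    if entry.get("Last-Modified"):
        headers["If-Modified-Since"] = entry["Last-Modified"]
    return headers
```

Sending both validators when both are known follows the recommendation above, so that HTTP/1.0 and HTTP/1.1 caches on the path can each respond appropriately.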
+ +13.3.5 Non-validating Conditionals + + The principle behind entity tags is that only the service author + knows the semantics of a resource well enough to select an + appropriate cache validation mechanism, and the specification of any + validator comparison function more complex than byte-equality would + open up a can of worms. Thus, comparisons of any other headers + (except Last-Modified, for compatibility with HTTP/1.0) are never + used for purposes of validating a cache entry. + + + + +Fielding, et al. Standards Track [Page 90] + +RFC 2616 HTTP/1.1 June 1999 + + +13.4 Response Cacheability + + Unless specifically constrained by a cache-control (section 14.9) + directive, a caching system MAY always store a successful response + (see section 13.8) as a cache entry, MAY return it without validation + if it is fresh, and MAY return it after successful validation. If + there is neither a cache validator nor an explicit expiration time + associated with a response, we do not expect it to be cached, but + certain caches MAY violate this expectation (for example, when little + or no network connectivity is available). A client can usually detect + that such a response was taken from a cache by comparing the Date + header to the current time. + + Note: some HTTP/1.0 caches are known to violate this expectation + without providing any Warning. + + However, in some cases it might be inappropriate for a cache to + retain an entity, or to return it in response to a subsequent + request. This might be because absolute semantic transparency is + deemed necessary by the service author, or because of security or + privacy considerations. Certain cache-control directives are + therefore provided so that the server can indicate that certain + resource entities, or portions thereof, are not to be cached + regardless of other considerations. 
+ + Note that section 14.8 normally prevents a shared cache from saving + and returning a response to a previous request if that request + included an Authorization header. + + A response received with a status code of 200, 203, 206, 300, 301 or + 410 MAY be stored by a cache and used in reply to a subsequent + request, subject to the expiration mechanism, unless a cache-control + directive prohibits caching. However, a cache that does not support + the Range and Content-Range headers MUST NOT cache 206 (Partial + Content) responses. + + A response received with any other status code (e.g. status codes 302 + and 307) MUST NOT be returned in a reply to a subsequent request + unless there are cache-control directives or another header(s) that + explicitly allow it. For example, these include the following: an + Expires header (section 14.21); a "max-age", "s-maxage", "must- + revalidate", "proxy-revalidate", "public" or "private" cache-control + directive (section 14.9). + + + + + + + + +Fielding, et al. Standards Track [Page 91] + +RFC 2616 HTTP/1.1 June 1999 + + +13.5 Constructing Responses From Caches + + The purpose of an HTTP cache is to store information received in + response to requests for use in responding to future requests. In + many cases, a cache simply returns the appropriate parts of a + response to the requester. However, if the cache holds a cache entry + based on a previous response, it might have to combine parts of a new + response with what is held in the cache entry. + +13.5.1 End-to-end and Hop-by-hop Headers + + For the purpose of defining the behavior of caches and non-caching + proxies, we divide HTTP headers into two categories: + + - End-to-end headers, which are transmitted to the ultimate + recipient of a request or response. End-to-end headers in + responses MUST be stored as part of a cache entry and MUST be + transmitted in any response formed from a cache entry. 
+ + - Hop-by-hop headers, which are meaningful only for a single + transport-level connection, and are not stored by caches or + forwarded by proxies. + + The following HTTP/1.1 headers are hop-by-hop headers: + + - Connection + - Keep-Alive + - Proxy-Authenticate + - Proxy-Authorization + - TE + - Trailer + - Transfer-Encoding + - Upgrade + + All other headers defined by HTTP/1.1 are end-to-end headers. + + Other hop-by-hop headers MUST be listed in a Connection header + (section 14.10) to be introduced into HTTP/1.1 (or later). + +13.5.2 Non-modifiable Headers + + Some features of the HTTP/1.1 protocol, such as Digest + Authentication, depend on the value of certain end-to-end headers. A + transparent proxy SHOULD NOT modify an end-to-end header unless the + definition of that header requires or specifically allows that. + + A transparent proxy MUST NOT modify any of the following fields in a + request or response, and it MUST NOT add any of these fields if not + already present: + + - Content-Location + + - Content-MD5 + + - ETag + + - Last-Modified + + A transparent proxy MUST NOT modify any of the following fields in a + response: + + - Expires + + but it MAY add any of these fields if not already present. If an + Expires header is added, it MUST be given a field-value identical to + that of the Date header in that response. + + A proxy MUST NOT modify or add any of the following fields in a + message that contains the no-transform cache-control directive, or in + any request: + + - Content-Encoding + + - Content-Range + + - Content-Type + + A non-transparent proxy MAY modify or add these fields to a message + that does not include no-transform, but if it does so, it MUST add a + Warning 214 (Transformation applied) if one does not already appear + in the message (see section 14.46).
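The end-to-end versus hop-by-hop split of section 13.5.1 can be sketched in Python as a filter that a proxy or cache might apply before storing or forwarding a message. The function name and the list-of-pairs header representation are assumptions of this sketch; the Connection value is split on commas without regard to any more elaborate syntax.

```python
# Hop-by-hop header names, lower-cased for case-insensitive matching.
HOP_BY_HOP = frozenset({
    "connection", "keep-alive", "proxy-authenticate",
    "proxy-authorization", "te",
    "trailer",  # the Trailer field defined in section 14.40
    "transfer-encoding", "upgrade",
})


def end_to_end_headers(headers):
    """Return only the end-to-end headers of a message.

    headers: list of (name, value) pairs in message order.
    """
    # Any header named in a Connection field is hop-by-hop as well.
    listed = set()
    for name, value in headers:
        if name.lower() == "connection":
            listed.update(
                token.strip().lower()
                for token in value.split(",") if token.strip()
            )
    drop = HOP_BY_HOP | listed
    return [(n, v) for n, v in headers if n.lower() not in drop]
```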
+ + Warning: unnecessary modification of end-to-end headers might + cause authentication failures if stronger authentication + mechanisms are introduced in later versions of HTTP. Such + authentication mechanisms MAY rely on the values of header fields + not listed here. + + The Content-Length field of a request or response is added or deleted + according to the rules in section 4.4. A transparent proxy MUST + preserve the entity-length (section 7.2.2) of the entity-body, + although it MAY change the transfer-length (section 4.4). + + + + + +Fielding, et al. Standards Track [Page 93] + +RFC 2616 HTTP/1.1 June 1999 + + +13.5.3 Combining Headers + + When a cache makes a validating request to a server, and the server + provides a 304 (Not Modified) response or a 206 (Partial Content) + response, the cache then constructs a response to send to the + requesting client. + + If the status code is 304 (Not Modified), the cache uses the entity- + body stored in the cache entry as the entity-body of this outgoing + response. If the status code is 206 (Partial Content) and the ETag or + Last-Modified headers match exactly, the cache MAY combine the + contents stored in the cache entry with the new contents received in + the response and use the result as the entity-body of this outgoing + response, (see 13.5.4). + + The end-to-end headers stored in the cache entry are used for the + constructed response, except that + + - any stored Warning headers with warn-code 1xx (see section + 14.46) MUST be deleted from the cache entry and the forwarded + response. + + - any stored Warning headers with warn-code 2xx MUST be retained + in the cache entry and the forwarded response. + + - any end-to-end headers provided in the 304 or 206 response MUST + replace the corresponding headers from the cache entry. 
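The three rules above can be sketched in Python as a merge of the stored end-to-end headers with those of a 304 or 206 response. This is a simplified illustration: it assumes headers are held as lists of (name, value) pairs, that both lists have already had hop-by-hop headers removed, and that each Warning field carries a single warning-value beginning with its three-digit warn-code. The function name is hypothetical.

```python
def merge_validated_headers(stored, incoming):
    """Combine cached headers with those of a 304/206 response.

    stored, incoming: lists of (name, value) end-to-end header pairs.
    """
    replaced = {name.lower() for name, _ in incoming}
    merged = []
    for name, value in stored:
        lname = name.lower()
        if lname == "warning":
            # 1xx warn-codes are deleted; 2xx warn-codes are retained
            # even when the incoming response carries Warning headers.
            if not value.lstrip().startswith("1"):
                merged.append((name, value))
            continue
        if lname in replaced:
            # Every stored header with this name gives way to the
            # corresponding header(s) of the incoming response.
            continue
        merged.append((name, value))
    merged.extend(incoming)
    return merged
```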
+ + Unless the cache decides to remove the cache entry, it MUST also + replace the end-to-end headers stored with the cache entry with + corresponding headers received in the incoming response, except for + Warning headers as described immediately above. If a header field- + name in the incoming response matches more than one header in the + cache entry, all such old headers MUST be replaced. + + In other words, the set of end-to-end headers received in the + incoming response overrides all corresponding end-to-end headers + stored with the cache entry (except for stored Warning headers with + warn-code 1xx, which are deleted even if not overridden). + + Note: this rule allows an origin server to use a 304 (Not + Modified) or a 206 (Partial Content) response to update any header + associated with a previous response for the same entity or sub- + ranges thereof, although it might not always be meaningful or + correct to do so. This rule does not allow an origin server to use + a 304 (Not Modified) or a 206 (Partial Content) response to + entirely delete a header that it had provided with a previous + response. + + + +Fielding, et al. Standards Track [Page 94] + +RFC 2616 HTTP/1.1 June 1999 + + +13.5.4 Combining Byte Ranges + + A response might transfer only a subrange of the bytes of an entity- + body, either because the request included one or more Range + specifications, or because a connection was broken prematurely. After + several such transfers, a cache might have received several ranges of + the same entity-body. + + If a cache has a stored non-empty set of subranges for an entity, and + an incoming response transfers another subrange, the cache MAY + combine the new subrange with the existing set if both the following + conditions are met: + + - Both the incoming response and the cache entry have a cache + validator. + + - The two cache validators match using the strong comparison + function (see section 13.3.3). 
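The two conditions above, and the bookkeeping a cache might do once they hold, can be sketched in Python. The sketch assumes that validators are entity tags (so a strong match means identical tags with no `W/` prefix) and that ranges are inclusive (first-byte, last-byte) pairs. The function names are hypothetical, and a Last-Modified time deduced to be strong under section 13.3.3 is not handled.

```python
def can_combine(stored_validator, incoming_validator):
    """True if both validators exist and match strongly."""
    if not stored_validator or not incoming_validator:
        return False
    if stored_validator.startswith("W/") or incoming_validator.startswith("W/"):
        return False
    return stored_validator == incoming_validator


def merge_ranges(ranges):
    """Merge inclusive byte ranges that overlap or are adjacent."""
    merged = []
    for first, last in sorted(ranges):
        if merged and first <= merged[-1][1] + 1:
            # Extend the previous range instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], last))
        else:
            merged.append((first, last))
    return merged
```

A cache would call `merge_ranges` only after `can_combine` returns true for the stored entry and the incoming response.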
+ + If either requirement is not met, the cache MUST use only the most + recent partial response (based on the Date values transmitted with + every response, and using the incoming response if these values are + equal or missing), and MUST discard the other partial information. + +13.6 Caching Negotiated Responses + + Use of server-driven content negotiation (section 12.1), as indicated + by the presence of a Vary header field in a response, alters the + conditions and procedure by which a cache can use the response for + subsequent requests. See section 14.44 for use of the Vary header + field by servers. + + A server SHOULD use the Vary header field to inform a cache of what + request-header fields were used to select among multiple + representations of a cacheable response subject to server-driven + negotiation. The set of header fields named by the Vary field value + is known as the "selecting" request-headers. + + When the cache receives a subsequent request whose Request-URI + specifies one or more cache entries including a Vary header field, + the cache MUST NOT use such a cache entry to construct a response to + the new request unless all of the selecting request-headers present + in the new request match the corresponding stored request-headers in + the original request. + + The selecting request-headers from two requests are defined to match + if and only if the selecting request-headers in the first request can + be transformed to the selecting request-headers in the second request + + + +Fielding, et al. Standards Track [Page 95] + +RFC 2616 HTTP/1.1 June 1999 + + + by adding or removing linear white space (LWS) at places where this + is allowed by the corresponding BNF, and/or combining multiple + message-header fields with the same field name following the rules + about message headers in section 4.2. 
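The matching rule for selecting request-headers can be approximated in Python as follows. The sketch normalizes each selecting header by combining repeated fields with commas and collapsing white space, which is a simplification: it splits on commas without honoring quoted strings and does not consult the BNF of individual headers. The function names and the list-of-pairs representation are assumptions, and a Vary value of "*" is treated as never matching.

```python
def _normalize(headers, name):
    """Canonical form of one request-header, or None if it is absent."""
    values = [v for n, v in headers if n.lower() == name.lower()]
    if not values:
        return None
    # Combine repeated fields, then collapse white space per element.
    combined = ",".join(values)
    return ",".join(" ".join(part.split()) for part in combined.split(","))


def selecting_headers_match(vary, stored_request, new_request):
    """Decide whether a cached variant may serve the new request.

    vary: field-value of the Vary header stored with the response.
    stored_request, new_request: lists of (name, value) pairs.
    """
    names = [n.strip() for n in vary.split(",") if n.strip()]
    if "*" in names:
        return False
    return all(
        _normalize(stored_request, n) == _normalize(new_request, n)
        for n in names
    )
```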
+ + A Vary header field-value of "*" always fails to match and subsequent + requests on that resource can only be properly interpreted by the + origin server. + + If the selecting request header fields for the cached entry do not + match the selecting request header fields of the new request, then + the cache MUST NOT use a cached entry to satisfy the request unless + it first relays the new request to the origin server in a conditional + request and the server responds with 304 (Not Modified), including an + entity tag or Content-Location that indicates the entity to be used. + + If an entity tag was assigned to a cached representation, the + forwarded request SHOULD be conditional and include the entity tags + in an If-None-Match header field from all its cache entries for the + resource. This conveys to the server the set of entities currently + held by the cache, so that if any one of these entities matches the + requested entity, the server can use the ETag header field in its 304 + (Not Modified) response to tell the cache which entry is appropriate. + If the entity-tag of the new response matches that of an existing + entry, the new response SHOULD be used to update the header fields of + the existing entry, and the result MUST be returned to the client. + + If any of the existing cache entries contains only partial content + for the associated entity, its entity-tag SHOULD NOT be included in + the If-None-Match header field unless the request is for a range that + would be fully satisfied by that entry. + + If a cache receives a successful response whose Content-Location + field matches that of an existing cache entry for the same + Request-URI, whose entity-tag differs from that of the existing entry, and + whose Date is more recent than that of the existing entry, the + existing entry SHOULD NOT be returned in response to future requests + and SHOULD be deleted from the cache.
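As an illustration of the If-None-Match rule in this section, the sketch below collects entity tags from a cache's entries for one resource and leaves out partial entries unless the caller states that one of them would fully satisfy the requested range. The `CacheEntry` class and the `range_satisfied_by` parameter are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CacheEntry:
    """Hypothetical cache entry: its entity tag and whether it is partial."""
    etag: Optional[str]
    partial: bool = False


def build_if_none_match(entries: List[CacheEntry],
                        range_satisfied_by: Optional[CacheEntry] = None):
    """Return an If-None-Match field value, or None if no tag qualifies."""
    tags = []
    for entry in entries:
        if entry.etag is None:
            continue  # no entity tag was assigned to this representation
        if entry.partial and entry is not range_satisfied_by:
            continue  # partial content SHOULD NOT be advertised
        tags.append(entry.etag)
    return ", ".join(tags) if tags else None
```

How a cache decides that a partial entry fully satisfies a range request is outside this sketch; the caller is assumed to have made that decision already.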
+ +13.7 Shared and Non-Shared Caches + + For reasons of security and privacy, it is necessary to make a + distinction between "shared" and "non-shared" caches. A non-shared + cache is one that is accessible only to a single user. Accessibility + in this case SHOULD be enforced by appropriate security mechanisms. + All other caches are considered to be "shared." Other sections of + + + + + +Fielding, et al. Standards Track [Page 96] + +RFC 2616 HTTP/1.1 June 1999 + + + this specification place certain constraints on the operation of + shared caches in order to prevent loss of privacy or failure of + access controls. + +13.8 Errors or Incomplete Response Cache Behavior + + A cache that receives an incomplete response (for example, with fewer + bytes of data than specified in a Content-Length header) MAY store + the response. However, the cache MUST treat this as a partial + response. Partial responses MAY be combined as described in section + 13.5.4; the result might be a full response or might still be + partial. A cache MUST NOT return a partial response to a client + without explicitly marking it as such, using the 206 (Partial + Content) status code. A cache MUST NOT return a partial response + using a status code of 200 (OK). + + If a cache receives a 5xx response while attempting to revalidate an + entry, it MAY either forward this response to the requesting client, + or act as if the server failed to respond. In the latter case, it MAY + return a previously received response unless the cached entry + includes the "must-revalidate" cache-control directive (see section + 14.9). + +13.9 Side Effects of GET and HEAD + + Unless the origin server explicitly prohibits the caching of their + responses, the application of GET and HEAD methods to any resources + SHOULD NOT have side effects that would lead to erroneous behavior if + these responses are taken from a cache. 
They MAY still have side + effects, but a cache is not required to consider such side effects in + its caching decisions. Caches are always expected to observe an + origin server's explicit restrictions on caching. + + We note one exception to this rule: since some applications have + traditionally used GETs and HEADs with query URLs (those containing a + "?" in the rel_path part) to perform operations with significant side + effects, caches MUST NOT treat responses to such URIs as fresh unless + the server provides an explicit expiration time. This specifically + means that responses from HTTP/1.0 servers for such URIs SHOULD NOT + be taken from a cache. See section 9.1.1 for related information. + +13.10 Invalidation After Updates or Deletions + + The effect of certain methods performed on a resource at the origin + server might cause one or more existing cache entries to become non- + transparently invalid. That is, although they might continue to be + "fresh," they do not accurately reflect what the origin server would + return for a new request on that resource. + + + +Fielding, et al. Standards Track [Page 97] + +RFC 2616 HTTP/1.1 June 1999 + + + There is no way for the HTTP protocol to guarantee that all such + cache entries are marked invalid. For example, the request that + caused the change at the origin server might not have gone through + the proxy where a cache entry is stored. However, several rules help + reduce the likelihood of erroneous behavior. + + In this section, the phrase "invalidate an entity" means that the + cache will either remove all instances of that entity from its + storage, or will mark these as "invalid" and in need of a mandatory + revalidation before they can be returned in response to a subsequent + request. + + Some HTTP methods MUST cause a cache to invalidate an entity. This is + either the entity referred to by the Request-URI, or by the Location + or Content-Location headers (if present). 
These methods are: + + - PUT + + - DELETE + + - POST + + In order to prevent denial of service attacks, an invalidation based + on the URI in a Location or Content-Location header MUST only be + performed if the host part is the same as in the Request-URI. + + A cache that passes through requests for methods it does not + understand SHOULD invalidate any entities referred to by the + Request-URI. + +13.11 Write-Through Mandatory + + All methods that might be expected to cause modifications to the + origin server's resources MUST be written through to the origin + server. This currently includes all methods except for GET and HEAD. + A cache MUST NOT reply to such a request from a client before having + transmitted the request to the inbound server, and having received a + corresponding response from the inbound server. This does not prevent + a proxy cache from sending a 100 (Continue) response before the + inbound server has sent its final reply. + + The alternative (known as "write-back" or "copy-back" caching) is not + allowed in HTTP/1.1, due to the difficulty of providing consistent + updates and the problems arising from server, cache, or network + failure prior to write-back. + + + + + + +Fielding, et al. Standards Track [Page 98] + +RFC 2616 HTTP/1.1 June 1999 + + +13.12 Cache Replacement + + If a new cacheable (see sections 14.9.2, 13.2.5, 13.2.6 and 13.8) + response is received from a resource while any existing responses for + the same resource are cached, the cache SHOULD use the new response + to reply to the current request. It MAY insert it into cache storage + and MAY, if it meets all other requirements, use it to respond to any + future requests that would previously have caused the old response to + be returned. If it inserts the new response into cache storage the + rules in section 13.5.3 apply. + + Note: a new response that has an older Date header value than + existing cached responses is not cacheable. 
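The invalidation rules of section 13.10 can be sketched as a function that lists the URIs a cache should invalidate after forwarding a request. This is an assumption-laden illustration: the function name, the lower-case header dictionary, and the requirement that `request_uri` is an absolute URI (as a proxy would see it) are choices made for this example only.

```python
from urllib.parse import urljoin, urlsplit

# Methods that MUST cause invalidation, per section 13.10.
INVALIDATING_METHODS = {"PUT", "DELETE", "POST"}
# Methods this hypothetical cache understands (the HTTP/1.1 method set).
KNOWN_METHODS = {"OPTIONS", "GET", "HEAD", "POST", "PUT", "DELETE",
                 "TRACE", "CONNECT"}


def uris_to_invalidate(method, request_uri, response_headers):
    """Return the list of URIs whose cached entities should be invalidated."""
    if method not in KNOWN_METHODS:
        # Unknown method passed through: invalidate the Request-URI only.
        return [request_uri]
    if method not in INVALIDATING_METHODS:
        return []
    targets = [request_uri]
    request_host = urlsplit(request_uri).hostname
    for name in ("location", "content-location"):
        value = response_headers.get(name)
        if not value:
            continue
        absolute = urljoin(request_uri, value)
        # Guard against denial of service: same host part only.
        if urlsplit(absolute).hostname == request_host and absolute not in targets:
            targets.append(absolute)
    return targets
```

A Location or Content-Location value that points at a different host is ignored here, which mirrors the denial-of-service safeguard described above.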
+ +13.13 History Lists + + User agents often have history mechanisms, such as "Back" buttons and + history lists, which can be used to redisplay an entity retrieved + earlier in a session. + + History mechanisms and caches are different. In particular history + mechanisms SHOULD NOT try to show a semantically transparent view of + the current state of a resource. Rather, a history mechanism is meant + to show exactly what the user saw at the time when the resource was + retrieved. + + By default, an expiration time does not apply to history mechanisms. + If the entity is still in storage, a history mechanism SHOULD display + it even if the entity has expired, unless the user has specifically + configured the agent to refresh expired history documents. + + This is not to be construed to prohibit the history mechanism from + telling the user that a view might be stale. + + Note: if history list mechanisms unnecessarily prevent users from + viewing stale resources, this will tend to force service authors + to avoid using HTTP expiration controls and cache controls when + they would otherwise like to. Service authors may consider it + important that users not be presented with error messages or + warning messages when they use navigation controls (such as BACK) + to view previously fetched resources. Even though sometimes such + resources ought not to be cached, or ought to expire quickly, user + interface considerations may force service authors to resort to + other means of preventing caching (e.g. "once-only" URLs) in order + not to suffer the effects of improperly functioning history + mechanisms. + +14 Header Field Definitions + + This section defines the syntax and semantics of all standard + HTTP/1.1 header fields. For entity-header fields, both sender and + recipient refer to either the client or the server, depending on who + sends and who receives the entity.
+ +14.1 Accept + + The Accept request-header field can be used to specify certain media + types which are acceptable for the response. Accept headers can be + used to indicate that the request is specifically limited to a small + set of desired types, as in the case of a request for an in-line + image. + + Accept = "Accept" ":" + #( media-range [ accept-params ] ) + + media-range = ( "*/*" + | ( type "/" "*" ) + | ( type "/" subtype ) + ) *( ";" parameter ) + accept-params = ";" "q" "=" qvalue *( accept-extension ) + accept-extension = ";" token [ "=" ( token | quoted-string ) ] + + The asterisk "*" character is used to group media types into ranges, + with "*/*" indicating all media types and "type/*" indicating all + subtypes of that type. The media-range MAY include media type + parameters that are applicable to that range. + + Each media-range MAY be followed by one or more accept-params, + beginning with the "q" parameter for indicating a relative quality + factor. The first "q" parameter (if any) separates the media-range + parameter(s) from the accept-params. Quality factors allow the user + or user agent to indicate the relative degree of preference for that + media-range, using the qvalue scale from 0 to 1 (section 3.9). The + default value is q=1. + + Note: Use of the "q" parameter name to separate media type + parameters from Accept extension parameters is due to historical + practice. Although this prevents any media type parameter named + "q" from being used with a media range, such an event is believed + to be unlikely given the lack of any "q" parameters in the IANA + media type registry and the rare usage of any media type + parameters in Accept. Future media types are discouraged from + registering any parameter named "q". + + + + + +Fielding, et al. 
Standards Track [Page 100] + +RFC 2616 HTTP/1.1 June 1999 + + + The example + + Accept: audio/*; q=0.2, audio/basic + + SHOULD be interpreted as "I prefer audio/basic, but send me any audio + type if it is the best available after an 80% mark-down in quality." + + If no Accept header field is present, then it is assumed that the + client accepts all media types. If an Accept header field is present, + and if the server cannot send a response which is acceptable + according to the combined Accept field value, then the server SHOULD + send a 406 (not acceptable) response. + + A more elaborate example is + + Accept: text/plain; q=0.5, text/html, + text/x-dvi; q=0.8, text/x-c + + Verbally, this would be interpreted as "text/html and text/x-c are + the preferred media types, but if they do not exist, then send the + text/x-dvi entity, and if that does not exist, send the text/plain + entity." + + Media ranges can be overridden by more specific media ranges or + specific media types. If more than one media range applies to a given + type, the most specific reference has precedence. For example, + + Accept: text/*, text/html, text/html;level=1, */* + + have the following precedence: + + 1) text/html;level=1 + 2) text/html + 3) text/* + 4) */* + + The media type quality factor associated with a given type is + determined by finding the media range with the highest precedence + which matches that type. For example, + + Accept: text/*;q=0.3, text/html;q=0.7, text/html;level=1, + text/html;level=2;q=0.4, */*;q=0.5 + + would cause the following values to be associated: + + text/html;level=1 = 1 + text/html = 0.7 + text/plain = 0.3 + + + +Fielding, et al. Standards Track [Page 101] + +RFC 2616 HTTP/1.1 June 1999 + + + image/jpeg = 0.5 + text/html;level=2 = 0.4 + text/html;level=3 = 0.7 + + Note: A user agent might be provided with a default set of quality + values for certain media ranges. 
However, unless the user agent is + a closed system which cannot interact with other rendering agents, + this default set ought to be configurable by the user. + +14.2 Accept-Charset + + The Accept-Charset request-header field can be used to indicate what + character sets are acceptable for the response. This field allows + clients capable of understanding more comprehensive or special- + purpose character sets to signal that capability to a server which is + capable of representing documents in those character sets. + + Accept-Charset = "Accept-Charset" ":" + 1#( ( charset | "*" )[ ";" "q" "=" qvalue ] ) + + + Character set values are described in section 3.4. Each charset MAY + be given an associated quality value which represents the user's + preference for that charset. The default value is q=1. An example is + + Accept-Charset: iso-8859-5, unicode-1-1;q=0.8 + + The special value "*", if present in the Accept-Charset field, + matches every character set (including ISO-8859-1) which is not + mentioned elsewhere in the Accept-Charset field. If no "*" is present + in an Accept-Charset field, then all character sets not explicitly + mentioned get a quality value of 0, except for ISO-8859-1, which gets + a quality value of 1 if not explicitly mentioned. + + If no Accept-Charset header is present, the default is that any + character set is acceptable. If an Accept-Charset header is present, + and if the server cannot send a response which is acceptable + according to the Accept-Charset header, then the server SHOULD send + an error response with the 406 (not acceptable) status code, though + the sending of an unacceptable response is also allowed. + +14.3 Accept-Encoding + + The Accept-Encoding request-header field is similar to Accept, but + restricts the content-codings (section 3.5) that are acceptable in + the response. + + Accept-Encoding = "Accept-Encoding" ":" + + + +Fielding, et al. 
Standards Track [Page 102] + +RFC 2616 HTTP/1.1 June 1999 + + + 1#( codings [ ";" "q" "=" qvalue ] ) + codings = ( content-coding | "*" ) + + Examples of its use are: + + Accept-Encoding: compress, gzip + Accept-Encoding: + Accept-Encoding: * + Accept-Encoding: compress;q=0.5, gzip;q=1.0 + Accept-Encoding: gzip;q=1.0, identity; q=0.5, *;q=0 + + A server tests whether a content-coding is acceptable, according to + an Accept-Encoding field, using these rules: + + 1. If the content-coding is one of the content-codings listed in + the Accept-Encoding field, then it is acceptable, unless it is + accompanied by a qvalue of 0. (As defined in section 3.9, a + qvalue of 0 means "not acceptable.") + + 2. The special "*" symbol in an Accept-Encoding field matches any + available content-coding not explicitly listed in the header + field. + + 3. If multiple content-codings are acceptable, then the acceptable + content-coding with the highest non-zero qvalue is preferred. + + 4. The "identity" content-coding is always acceptable, unless + specifically refused because the Accept-Encoding field includes + "identity;q=0", or because the field includes "*;q=0" and does + not explicitly include the "identity" content-coding. If the + Accept-Encoding field-value is empty, then only the "identity" + encoding is acceptable. + + If an Accept-Encoding field is present in a request, and if the + server cannot send a response which is acceptable according to the + Accept-Encoding header, then the server SHOULD send an error response + with the 406 (Not Acceptable) status code. + + If no Accept-Encoding field is present in a request, the server MAY + assume that the client will accept any content coding. In this case, + if "identity" is one of the available content-codings, then the + server SHOULD use the "identity" content-coding, unless it has + additional information that a different content-coding is meaningful + to the client. 
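The four acceptability rules above, together with the default for a missing field, can be sketched as a small selection function. This is not a definitive implementation: the function names are hypothetical, malformed qvalues are not handled, `available` is assumed to be non-empty, and treating an unlisted "identity" as a last-resort fallback is an assumption of this sketch (the text only says it is always acceptable).

```python
def parse_accept_encoding(value):
    """Parse an Accept-Encoding field value into {coding: qvalue}."""
    prefs = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        coding, _, params = item.partition(";")
        q = 1.0  # a coding listed without a qvalue is simply acceptable
        name, _, raw = params.partition("=")
        if name.strip().lower() == "q" and raw.strip():
            q = float(raw.strip())
        prefs[coding.strip().lower()] = q
    return prefs


def choose_content_coding(accept_encoding, available):
    """Pick a coding from `available`, or None if none is acceptable.

    `accept_encoding` is the field value, or None if the field is absent.
    """
    if accept_encoding is None:
        # No field present: prefer "identity" when it is available.
        return "identity" if "identity" in available else available[0]
    prefs = parse_accept_encoding(accept_encoding)
    best, best_q, fallback = None, 0.0, None
    for coding in available:
        if coding in prefs:
            q = prefs[coding]          # rule 1
        elif "*" in prefs:
            q = prefs["*"]             # rule 2
        elif coding == "identity":
            fallback = coding          # rule 4: acceptable unless refused
            continue
        else:
            continue
        if q > best_q:                 # rule 3: highest non-zero qvalue
            best, best_q = coding, q
    return best if best is not None else fallback
```

A return value of None corresponds to the case in which the server SHOULD answer with 406 (Not Acceptable).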
+ + Note: If the request does not include an Accept-Encoding field, + and if the "identity" content-coding is unavailable, then + content-codings commonly understood by HTTP/1.0 clients (i.e., + + + +Fielding, et al. Standards Track [Page 103] + +RFC 2616 HTTP/1.1 June 1999 + + + "gzip" and "compress") are preferred; some older clients + improperly display messages sent with other content-codings. The + server might also make this decision based on information about + the particular user-agent or client. + + Note: Most HTTP/1.0 applications do not recognize or obey qvalues + associated with content-codings. This means that qvalues will not + work and are not permitted with x-gzip or x-compress. + +14.4 Accept-Language + + The Accept-Language request-header field is similar to Accept, but + restricts the set of natural languages that are preferred as a + response to the request. Language tags are defined in section 3.10. + + Accept-Language = "Accept-Language" ":" + 1#( language-range [ ";" "q" "=" qvalue ] ) + language-range = ( ( 1*8ALPHA *( "-" 1*8ALPHA ) ) | "*" ) + + Each language-range MAY be given an associated quality value which + represents an estimate of the user's preference for the languages + specified by that range. The quality value defaults to "q=1". For + example, + + Accept-Language: da, en-gb;q=0.8, en;q=0.7 + + would mean: "I prefer Danish, but will accept British English and + other types of English." A language-range matches a language-tag if + it exactly equals the tag, or if it exactly equals a prefix of the + tag such that the first tag character following the prefix is "-". + The special range "*", if present in the Accept-Language field, + matches every tag not matched by any other range present in the + Accept-Language field. 
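The prefix matching rule for language ranges can be illustrated with the short sketch below. The function names are hypothetical. The comparison is case-insensitive because language tags are case-insensitive (section 3.10), and `matching_range` returns the longest matching range, which is the one the specification uses when it assigns a quality factor.

```python
def range_matches(language_range, language_tag):
    """True if the range equals the tag or is a prefix followed by "-"."""
    rng, tag = language_range.lower(), language_tag.lower()
    return tag == rng or tag.startswith(rng + "-")


def matching_range(ranges, language_tag):
    """Return the range from `ranges` that applies to the tag, or None.

    The special range "*" applies only when no other range matches.
    """
    specific = [r for r in ranges
                if r != "*" and range_matches(r, language_tag)]
    if specific:
        return max(specific, key=len)
    return "*" if "*" in ranges else None
```

For example, the range "en" matches the tag "en-gb" but not "eng", because the character after the prefix must be "-".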
+ + Note: This use of a prefix matching rule does not imply that + language tags are assigned to languages in such a way that it is + always true that if a user understands a language with a certain + tag, then this user will also understand all languages with tags + for which this tag is a prefix. The prefix rule simply allows the + use of prefix tags if this is the case. + + The language quality factor assigned to a language-tag by the + Accept-Language field is the quality value of the longest language- + range in the field that matches the language-tag. If no language- + range in the field matches the tag, the language quality factor + assigned is 0. If no Accept-Language header is present in the + request, the server + SHOULD assume that all languages are equally acceptable. If an + Accept-Language header is present, then all languages which are + assigned a quality factor greater than 0 are acceptable. + + It might be contrary to the privacy expectations of the user to send + an Accept-Language header with the complete linguistic preferences of + the user in every request. For a discussion of this issue, see + section 15.1.4. + + As intelligibility is highly dependent on the individual user, it is + recommended that client applications make the choice of linguistic + preference available to the user. If the choice is not made + available, then the Accept-Language header field MUST NOT be given in + the request. + + Note: When making the choice of linguistic preference available to + the user, we remind implementors of the fact that users are not + familiar with the details of language matching as described above, + and should provide appropriate guidance. As an example, users + might assume that on selecting "en-gb", they will be served any + kind of English document if British English is not available.
A + user agent might suggest in such a case to add "en" to get the + best matching behavior. + +14.5 Accept-Ranges + + The Accept-Ranges response-header field allows the server to + indicate its acceptance of range requests for a resource: + + Accept-Ranges = "Accept-Ranges" ":" acceptable-ranges + acceptable-ranges = 1#range-unit | "none" + + Origin servers that accept byte-range requests MAY send + + Accept-Ranges: bytes + + but are not required to do so. Clients MAY generate byte-range + requests without having received this header for the resource + involved. Range units are defined in section 3.12. + + Servers that do not accept any kind of range request for a + resource MAY send + + Accept-Ranges: none + + to advise the client not to attempt a range request. + + + + + +Fielding, et al. Standards Track [Page 105] + +RFC 2616 HTTP/1.1 June 1999 + + +14.6 Age + + The Age response-header field conveys the sender's estimate of the + amount of time since the response (or its revalidation) was + generated at the origin server. A cached response is "fresh" if + its age does not exceed its freshness lifetime. Age values are + calculated as specified in section 13.2.3. + + Age = "Age" ":" age-value + age-value = delta-seconds + + Age values are non-negative decimal integers, representing time in + seconds. + + If a cache receives a value larger than the largest positive + integer it can represent, or if any of its age calculations + overflows, it MUST transmit an Age header with a value of + 2147483648 (2^31). An HTTP/1.1 server that includes a cache MUST + include an Age header field in every response generated from its + own cache. Caches SHOULD use an arithmetic type of at least 31 + bits of range. + +14.7 Allow + + The Allow entity-header field lists the set of methods supported + by the resource identified by the Request-URI. The purpose of this + field is strictly to inform the recipient of valid methods + associated with the resource. 
An Allow header field MUST be + present in a 405 (Method Not Allowed) response. + + Allow = "Allow" ":" #Method + + Example of use: + + Allow: GET, HEAD, PUT + + This field cannot prevent a client from trying other methods. + However, the indications given by the Allow header field value + SHOULD be followed. The actual set of allowed methods is defined + by the origin server at the time of each request. + + The Allow header field MAY be provided with a PUT request to + recommend the methods to be supported by the new or modified + resource. The server is not required to support these methods and + SHOULD include an Allow header in the response giving the actual + supported methods. + + + + + +Fielding, et al. Standards Track [Page 106] + +RFC 2616 HTTP/1.1 June 1999 + + + A proxy MUST NOT modify the Allow header field even if it does not + understand all the methods specified, since the user agent might + have other means of communicating with the origin server. + +14.8 Authorization + + A user agent that wishes to authenticate itself with a server-- + usually, but not necessarily, after receiving a 401 response--does + so by including an Authorization request-header field with the + request. The Authorization field value consists of credentials + containing the authentication information of the user agent for + the realm of the resource being requested. + + Authorization = "Authorization" ":" credentials + + HTTP access authentication is described in "HTTP Authentication: + Basic and Digest Access Authentication" [43]. If a request is + authenticated and a realm specified, the same credentials SHOULD + be valid for all other requests within this realm (assuming that + the authentication scheme itself does not require otherwise, such + as credentials that vary according to a challenge value or using + synchronized clocks). 
+ + When a shared cache (see section 13.7) receives a request + containing an Authorization field, it MUST NOT return the + corresponding response as a reply to any other request, unless one + of the following specific exceptions holds: + + 1. If the response includes the "s-maxage" cache-control + directive, the cache MAY use that response in replying to a + subsequent request. But (if the specified maximum age has + passed) a proxy cache MUST first revalidate it with the origin + server, using the request-headers from the new request to allow + the origin server to authenticate the new request. (This is the + defined behavior for s-maxage.) If the response includes "s- + maxage=0", the proxy MUST always revalidate it before re-using + it. + + 2. If the response includes the "must-revalidate" cache-control + directive, the cache MAY use that response in replying to a + subsequent request. But if the response is stale, all caches + MUST first revalidate it with the origin server, using the + request-headers from the new request to allow the origin server + to authenticate the new request. + + 3. If the response includes the "public" cache-control directive, + it MAY be returned in reply to any subsequent request. + + + + +Fielding, et al. Standards Track [Page 107] + +RFC 2616 HTTP/1.1 June 1999 + + +14.9 Cache-Control + + The Cache-Control general-header field is used to specify directives + that MUST be obeyed by all caching mechanisms along the + request/response chain. The directives specify behavior intended to + prevent caches from adversely interfering with the request or + response. These directives typically override the default caching + algorithms. Cache directives are unidirectional in that the presence + of a directive in a request does not imply that the same directive is + to be given in the response. + + Note that HTTP/1.0 caches might not implement Cache-Control and + might only implement Pragma: no-cache (see section 14.32). 
+ + Cache directives MUST be passed through by a proxy or gateway + application, regardless of their significance to that application, + since the directives might be applicable to all recipients along the + request/response chain. It is not possible to specify a cache- + directive for a specific cache. + + Cache-Control = "Cache-Control" ":" 1#cache-directive + + cache-directive = cache-request-directive + | cache-response-directive + + cache-request-directive = + "no-cache" ; Section 14.9.1 + | "no-store" ; Section 14.9.2 + | "max-age" "=" delta-seconds ; Section 14.9.3, 14.9.4 + | "max-stale" [ "=" delta-seconds ] ; Section 14.9.3 + | "min-fresh" "=" delta-seconds ; Section 14.9.3 + | "no-transform" ; Section 14.9.5 + | "only-if-cached" ; Section 14.9.4 + | cache-extension ; Section 14.9.6 + + cache-response-directive = + "public" ; Section 14.9.1 + | "private" [ "=" <"> 1#field-name <"> ] ; Section 14.9.1 + | "no-cache" [ "=" <"> 1#field-name <"> ]; Section 14.9.1 + | "no-store" ; Section 14.9.2 + | "no-transform" ; Section 14.9.5 + | "must-revalidate" ; Section 14.9.4 + | "proxy-revalidate" ; Section 14.9.4 + | "max-age" "=" delta-seconds ; Section 14.9.3 + | "s-maxage" "=" delta-seconds ; Section 14.9.3 + | cache-extension ; Section 14.9.6 + + cache-extension = token [ "=" ( token | quoted-string ) ] + + + +Fielding, et al. Standards Track [Page 108] + +RFC 2616 HTTP/1.1 June 1999 + + + When a directive appears without any 1#field-name parameter, the + directive applies to the entire request or response. When such a + directive appears with a 1#field-name parameter, it applies only to + the named field or fields, and not to the rest of the request or + response. This mechanism supports extensibility; implementations of + future versions of the HTTP protocol might apply these directives to + header fields not defined in HTTP/1.1. 
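The Cache-Control grammar above can be illustrated with a small parser. This is a sketch under stated assumptions: the function names are hypothetical, backslash escapes inside quoted strings are not handled, and the result layout (None for a bare directive, a list of lower-case field names for "private" and "no-cache", a string otherwise) is a choice made for this example.

```python
def split_outside_quotes(value):
    """Split a field value on commas that are not inside a quoted string."""
    parts, current, in_quotes = [], [], False
    for ch in value:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == "," and not in_quotes:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return [p.strip() for p in parts if p.strip()]


def parse_cache_control(value):
    """Parse a Cache-Control field value into {directive: argument}."""
    directives = {}
    for part in split_outside_quotes(value):
        name, sep, arg = part.partition("=")
        name = name.strip().lower()
        if not sep:
            directives[name] = None  # applies to the whole message
            continue
        arg = arg.strip()
        if len(arg) >= 2 and arg.startswith('"') and arg.endswith('"'):
            arg = arg[1:-1]
        if name in ("private", "no-cache"):
            # 1#field-name: the directive applies only to these fields.
            directives[name] = [f.strip().lower()
                                for f in arg.split(",") if f.strip()]
        else:
            directives[name] = arg
    return directives
```

Splitting on commas outside quotes matters because a field-name list such as `private="Set-Cookie, X-Foo"` itself contains a comma.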
+ + The cache-control directives can be broken down into these general + categories: + + - Restrictions on what are cacheable; these may only be imposed by + the origin server. + + - Restrictions on what may be stored by a cache; these may be + imposed by either the origin server or the user agent. + + - Modifications of the basic expiration mechanism; these may be + imposed by either the origin server or the user agent. + + - Controls over cache revalidation and reload; these may only be + imposed by a user agent. + + - Control over transformation of entities. + + - Extensions to the caching system. + +14.9.1 What is Cacheable + + By default, a response is cacheable if the requirements of the + request method, request header fields, and the response status + indicate that it is cacheable. Section 13.4 summarizes these defaults + for cacheability. The following Cache-Control response directives + allow an origin server to override the default cacheability of a + response: + + public + Indicates that the response MAY be cached by any cache, even if it + would normally be non-cacheable or cacheable only within a non- + shared cache. (See also Authorization, section 14.8, for + additional details.) + + private + Indicates that all or part of the response message is intended for + a single user and MUST NOT be cached by a shared cache. This + allows an origin server to state that the specified parts of the + + + + + +Fielding, et al. Standards Track [Page 109] + +RFC 2616 HTTP/1.1 June 1999 + + + response are intended for only one user and are not a valid + response for requests by other users. A private (non-shared) cache + MAY cache the response. + + Note: This usage of the word private only controls where the + response may be cached, and cannot ensure the privacy of the + message content. 
+ + no-cache + If the no-cache directive does not specify a field-name, then a + cache MUST NOT use the response to satisfy a subsequent request + without successful revalidation with the origin server. This + allows an origin server to prevent caching even by caches that + have been configured to return stale responses to client requests. + + If the no-cache directive does specify one or more field-names, + then a cache MAY use the response to satisfy a subsequent request, + subject to any other restrictions on caching. However, the + specified field-name(s) MUST NOT be sent in the response to a + subsequent request without successful revalidation with the origin + server. This allows an origin server to prevent the re-use of + certain header fields in a response, while still allowing caching + of the rest of the response. + + Note: Most HTTP/1.0 caches will not recognize or obey this + directive. + +14.9.2 What May be Stored by Caches + + no-store + The purpose of the no-store directive is to prevent the + inadvertent release or retention of sensitive information (for + example, on backup tapes). The no-store directive applies to the + entire message, and MAY be sent either in a response or in a + request. If sent in a request, a cache MUST NOT store any part of + either this request or any response to it. If sent in a response, + a cache MUST NOT store any part of either this response or the + request that elicited it. This directive applies to both non- + shared and shared caches. "MUST NOT store" in this context means + that the cache MUST NOT intentionally store the information in + non-volatile storage, and MUST make a best-effort attempt to + remove the information from volatile storage as promptly as + possible after forwarding it. + + Even when this directive is associated with a response, users + might explicitly store such a response outside of the caching + system (e.g., with a "Save As" dialog). 
History buffers MAY store + such responses as part of their normal operation. + + + +Fielding, et al. Standards Track [Page 110] + +RFC 2616 HTTP/1.1 June 1999 + + + The purpose of this directive is to meet the stated requirements + of certain users and service authors who are concerned about + accidental releases of information via unanticipated accesses to + cache data structures. While the use of this directive might + improve privacy in some cases, we caution that it is NOT in any + way a reliable or sufficient mechanism for ensuring privacy. In + particular, malicious or compromised caches might not recognize or + obey this directive, and communications networks might be + vulnerable to eavesdropping. + +14.9.3 Modifications of the Basic Expiration Mechanism + + The expiration time of an entity MAY be specified by the origin + server using the Expires header (see section 14.21). Alternatively, + it MAY be specified using the max-age directive in a response. When + the max-age cache-control directive is present in a cached response, + the response is stale if its current age is greater than the age + value given (in seconds) at the time of a new request for that + resource. The max-age directive on a response implies that the + response is cacheable (i.e., "public") unless some other, more + restrictive cache directive is also present. + + If a response includes both an Expires header and a max-age + directive, the max-age directive overrides the Expires header, even + if the Expires header is more restrictive. This rule allows an origin + server to provide, for a given response, a longer expiration time to + an HTTP/1.1 (or later) cache than to an HTTP/1.0 cache. This might be + useful if certain HTTP/1.0 caches improperly calculate ages or + expiration times, perhaps due to desynchronized clocks. 
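The interaction between max-age and Expires described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the way header values are passed in (an already-parsed integer for max-age, raw HTTP-date strings for Expires and Date) are hypothetical and not part of the specification.

```python
from email.utils import parsedate_to_datetime


def freshness_lifetime(max_age, expires, date):
    """Return the freshness lifetime in seconds, or None if unknown.

    max_age: int seconds from a max-age response directive, or None.
    expires, date: HTTP-date strings from Expires and Date, or None.
    (Hypothetical helper; the argument layout is an assumption.)
    """
    if max_age is not None:
        # max-age overrides Expires, even if Expires is more restrictive.
        return max_age
    if expires is not None and date is not None:
        delta = parsedate_to_datetime(expires) - parsedate_to_datetime(date)
        return int(delta.total_seconds())
    return None


def is_stale(current_age, max_age=None, expires=None, date=None):
    """True if the response is stale for the given current age (seconds)."""
    lifetime = freshness_lifetime(max_age, expires, date)
    if lifetime is None:
        raise ValueError("no explicit expiration information")
    # Follows the "greater than" wording used for max-age in this section.
    return current_age > lifetime
```

With an Expires value one minute after Date, the sketch reports a 60 second lifetime; adding max-age=3600 to the same response raises it to 3600 seconds for caches that understand max-age.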
+ + Many HTTP/1.0 cache implementations will treat an Expires value that + is less than or equal to the response Date value as being equivalent + to the Cache-Control response directive "no-cache". If an HTTP/1.1 + cache receives such a response, and the response does not include a + Cache-Control header field, it SHOULD consider the response to be + non-cacheable in order to retain compatibility with HTTP/1.0 servers. + + Note: An origin server might wish to use a relatively new HTTP + cache control feature, such as the "private" directive, on a + network including older caches that do not understand that + feature. The origin server will need to combine the new feature + with an Expires field whose value is less than or equal to the + Date value. This will prevent older caches from improperly + caching the response. + + + + + + + +Fielding, et al. Standards Track [Page 111] + +RFC 2616 HTTP/1.1 June 1999 + + + s-maxage + If a response includes an s-maxage directive, then for a shared + cache (but not for a private cache), the maximum age specified by + this directive overrides the maximum age specified by either the + max-age directive or the Expires header. The s-maxage directive + also implies the semantics of the proxy-revalidate directive (see + section 14.9.4), i.e., that the shared cache must not use the + entry after it becomes stale to respond to a subsequent request + without first revalidating it with the origin server. The s- + maxage directive is always ignored by a private cache. + + Note that most older caches, not compliant with this specification, + do not implement any cache-control directives. An origin server + wishing to use a cache-control directive that restricts, but does not + prevent, caching by an HTTP/1.1-compliant cache MAY exploit the + requirement that the max-age directive overrides the Expires header, + and the fact that pre-HTTP/1.1-compliant caches do not observe the + max-age directive. 
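As a rough sketch of the s-maxage rules above, the following Python shows how a cache might pick the maximum age to apply depending on whether it is shared. The dictionary representation of parsed directives and both function names are assumptions made for this example only.

```python
def effective_max_age(directives, shared_cache):
    """Pick the maximum age (seconds) a cache should apply, or None.

    directives: dict of parsed response directives, for example
    {"max-age": 600, "s-maxage": 60} (a hypothetical representation).
    """
    if shared_cache and "s-maxage" in directives:
        # For a shared cache, s-maxage overrides max-age and Expires.
        return directives["s-maxage"]
    # A private cache always ignores s-maxage.
    return directives.get("max-age")


def must_revalidate_when_stale(directives, shared_cache):
    """True if a stale entry must not be used without revalidation."""
    if "must-revalidate" in directives:
        return True
    if shared_cache:
        # s-maxage also implies the semantics of proxy-revalidate.
        return "proxy-revalidate" in directives or "s-maxage" in directives
    return False
```

For a response carrying both max-age=600 and s-maxage=60, a shared cache would apply 60 seconds and revalidate once the entry is stale, while a private cache would apply 600 seconds.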
+ + Other directives allow a user agent to modify the basic expiration + mechanism. These directives MAY be specified on a request: + + max-age + Indicates that the client is willing to accept a response whose + age is no greater than the specified time in seconds. Unless the + max-stale directive is also included, the client is not willing to + accept a stale response. + + min-fresh + Indicates that the client is willing to accept a response whose + freshness lifetime is no less than its current age plus the + specified time in seconds. That is, the client wants a response + that will still be fresh for at least the specified number of + seconds. + + max-stale + Indicates that the client is willing to accept a response that has + exceeded its expiration time. If max-stale is assigned a value, + then the client is willing to accept a response that has exceeded + its expiration time by no more than the specified number of + seconds. If no value is assigned to max-stale, then the client is + willing to accept a stale response of any age. + + If a cache returns a stale response, either because of a max-stale + directive on a request, or because the cache is configured to + override the expiration time of a response, the cache MUST attach a + Warning header to the stale response, using Warning 110 (Response is + stale). + + + +Fielding, et al. Standards Track [Page 112] + +RFC 2616 HTTP/1.1 June 1999 + + + A cache MAY be configured to return stale responses without + validation, but only if this does not conflict with any "MUST"-level + requirements concerning cache validation (e.g., a "must-revalidate" + cache-control directive). + + If both the new request and the cached entry include "max-age" + directives, then the lesser of the two values is used for determining + the freshness of the cached entry for that request.
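The request directives above combine into a simple acceptance test. The Python below is a minimal sketch, assuming the cache already knows the current age and freshness lifetime of the stored response; the function name, the dictionary of request directives, and the convention that a valueless max-stale maps to None are all hypothetical choices for this example.

```python
def evaluate_request(current_age, lifetime, request):
    """Decide whether a stored response may be served as-is.

    current_age, lifetime: seconds, taken from the stored response.
    request: dict of request directives; "max-stale" maps to an int,
    or to None when sent without a value (assumed representation).
    Returns (acceptable, warn_code); warn_code is 110 for a stale
    response and None otherwise. "Not acceptable" means the cache has
    to revalidate or forward the request instead.
    """
    staleness = current_age - lifetime  # > 0 means the response is stale

    # max-age: the age must be no greater than the specified time.
    if "max-age" in request and current_age > request["max-age"]:
        return (False, None)

    # min-fresh: lifetime must be no less than current age plus the value.
    if "min-fresh" in request and lifetime < current_age + request["min-fresh"]:
        return (False, None)

    if staleness > 0:
        # Without max-stale the client does not accept a stale response.
        if "max-stale" not in request:
            return (False, None)
        limit = request["max-stale"]
        if limit is not None and staleness > limit:
            return (False, None)
        # A stale response must carry Warning 110 (Response is stale).
        return (True, 110)

    return (True, None)
```

Checking the request max-age against the age separately from the response lifetime gives the same result as taking the lesser of the two max-age values, since a fresh response has to satisfy both limits.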
+ +14.9.4 Cache Revalidation and Reload Controls + + Sometimes a user agent might want or need to insist that a cache + revalidate its cache entry with the origin server (and not just with + the next cache along the path to the origin server), or to reload its + cache entry from the origin server. End-to-end revalidation might be + necessary if either the cache or the origin server has overestimated + the expiration time of the cached response. End-to-end reload may be + necessary if the cache entry has become corrupted for some reason. + + End-to-end revalidation may be requested either when the client does + not have its own local cached copy, in which case we call it + "unspecified end-to-end revalidation", or when the client does have a + local cached copy, in which case we call it "specific end-to-end + revalidation." + + The client can specify these three kinds of action using Cache- + Control request directives: + + End-to-end reload + The request includes a "no-cache" cache-control directive or, for + compatibility with HTTP/1.0 clients, "Pragma: no-cache". Field + names MUST NOT be included with the no-cache directive in a + request. The server MUST NOT use a cached copy when responding to + such a request. + + Specific end-to-end revalidation + The request includes a "max-age=0" cache-control directive, which + forces each cache along the path to the origin server to + revalidate its own entry, if any, with the next cache or server. + The initial request includes a cache-validating conditional with + the client's current validator. + + Unspecified end-to-end revalidation + The request includes "max-age=0" cache-control directive, which + forces each cache along the path to the origin server to + revalidate its own entry, if any, with the next cache or server. + The initial request does not include a cache-validating + + + + +Fielding, et al. 
Standards Track [Page 113] + +RFC 2616 HTTP/1.1 June 1999 + + + conditional; the first cache along the path (if any) that holds a + cache entry for this resource includes a cache-validating + conditional with its current validator. + + max-age + When an intermediate cache is forced, by means of a max-age=0 + directive, to revalidate its own cache entry, and the client has + supplied its own validator in the request, the supplied validator + might differ from the validator currently stored with the cache + entry. In this case, the cache MAY use either validator in making + its own request without affecting semantic transparency. + + However, the choice of validator might affect performance. The + best approach is for the intermediate cache to use its own + validator when making its request. If the server replies with 304 + (Not Modified), then the cache can return its now validated copy + to the client with a 200 (OK) response. If the server replies with + a new entity and cache validator, however, the intermediate cache + can compare the returned validator with the one provided in the + client's request, using the strong comparison function. If the + client's validator is equal to the origin server's, then the + intermediate cache simply returns 304 (Not Modified). Otherwise, + it returns the new entity with a 200 (OK) response. + + If a request includes the no-cache directive, it SHOULD NOT + include min-fresh, max-stale, or max-age. + + only-if-cached + In some cases, such as times of extremely poor network + connectivity, a client may want a cache to return only those + responses that it currently has stored, and not to reload or + revalidate with the origin server. To do this, the client may + include the only-if-cached directive in a request. If it receives + this directive, a cache SHOULD either respond using a cached entry + that is consistent with the other constraints of the request, or + respond with a 504 (Gateway Timeout) status. 
However, if a group + of caches is being operated as a unified system with good internal + connectivity, such a request MAY be forwarded within that group of + caches. + + must-revalidate + Because a cache MAY be configured to ignore a server's specified + expiration time, and because a client request MAY include a max- + stale directive (which has a similar effect), the protocol also + includes a mechanism for the origin server to require revalidation + of a cache entry on any subsequent use. When the must-revalidate + directive is present in a response received by a cache, that cache + MUST NOT use the entry after it becomes stale to respond to a + + + +Fielding, et al. Standards Track [Page 114] + +RFC 2616 HTTP/1.1 June 1999 + + + subsequent request without first revalidating it with the origin + server. (I.e., the cache MUST do an end-to-end revalidation every + time, if, based solely on the origin server's Expires or max-age + value, the cached response is stale.) + + The must-revalidate directive is necessary to support reliable + operation for certain protocol features. In all circumstances an + HTTP/1.1 cache MUST obey the must-revalidate directive; in + particular, if the cache cannot reach the origin server for any + reason, it MUST generate a 504 (Gateway Timeout) response. + + Servers SHOULD send the must-revalidate directive if and only if + failure to revalidate a request on the entity could result in + incorrect operation, such as a silently unexecuted financial + transaction. Recipients MUST NOT take any automated action that + violates this directive, and MUST NOT automatically provide an + unvalidated copy of the entity if revalidation fails. + + Although this is not recommended, user agents operating under + severe connectivity constraints MAY violate this directive but, if + so, MUST explicitly warn the user that an unvalidated response has + been provided. 
The warning MUST be provided on each unvalidated + access, and SHOULD require explicit user confirmation. + + proxy-revalidate + The proxy-revalidate directive has the same meaning as the must- + revalidate directive, except that it does not apply to non-shared + user agent caches. It can be used on a response to an + authenticated request to permit the user's cache to store and + later return the response without needing to revalidate it (since + it has already been authenticated once by that user), while still + requiring proxies that service many users to revalidate each time + (in order to make sure that each user has been authenticated). + Note that such authenticated responses also need the public cache + control directive in order to allow them to be cached at all. + +14.9.5 No-Transform Directive + + no-transform + Implementors of intermediate caches (proxies) have found it useful + to convert the media type of certain entity bodies. A non- + transparent proxy might, for example, convert between image + formats in order to save cache space or to reduce the amount of + traffic on a slow link. + + Serious operational problems occur, however, when these + transformations are applied to entity bodies intended for certain + kinds of applications. For example, applications for medical + + + +Fielding, et al. Standards Track [Page 115] + +RFC 2616 HTTP/1.1 June 1999 + + + imaging, scientific data analysis and those using end-to-end + authentication, all depend on receiving an entity body that is bit + for bit identical to the original entity-body. + + Therefore, if a message includes the no-transform directive, an + intermediate cache or proxy MUST NOT change those headers that are + listed in section 13.5.2 as being subject to the no-transform + directive. This implies that the cache or proxy MUST NOT change + any aspect of the entity-body that is specified by these headers, + including the value of the entity-body itself. 
+ +14.9.6 Cache Control Extensions + + The Cache-Control header field can be extended through the use of one + or more cache-extension tokens, each with an optional assigned value. + Informational extensions (those which do not require a change in + cache behavior) MAY be added without changing the semantics of other + directives. Behavioral extensions are designed to work by acting as + modifiers to the existing base of cache directives. Both the new + directive and the standard directive are supplied, such that + applications which do not understand the new directive will default + to the behavior specified by the standard directive, and those that + understand the new directive will recognize it as modifying the + requirements associated with the standard directive. In this way, + extensions to the cache-control directives can be made without + requiring changes to the base protocol. + + This extension mechanism depends on an HTTP cache obeying all of the + cache-control directives defined for its native HTTP-version, obeying + certain extensions, and ignoring all directives that it does not + understand. + + For example, consider a hypothetical new response directive called + community which acts as a modifier to the private directive. We + define this new directive to mean that, in addition to any non-shared + cache, any cache which is shared only by members of the community + named within its value may cache the response. An origin server + wishing to allow the UCI community to use an otherwise private + response in their shared cache(s) could do so by including + + Cache-Control: private, community="UCI" + + A cache seeing this header field will act correctly even if the cache + does not understand the community cache-extension, since it will also + see and understand the private directive and thus default to the safe + behavior. + + + + + +Fielding, et al. 
Standards Track [Page 116] + +RFC 2616 HTTP/1.1 June 1999 + + + Unrecognized cache-directives MUST be ignored; it is assumed that any + cache-directive likely to be unrecognized by an HTTP/1.1 cache will + be combined with standard directives (or the response's default + cacheability) such that the cache behavior will remain minimally + correct even if the cache does not understand the extension(s). + +14.10 Connection + + The Connection general-header field allows the sender to specify + options that are desired for that particular connection and MUST NOT + be communicated by proxies over further connections. + + The Connection header has the following grammar: + + Connection = "Connection" ":" 1#(connection-token) + connection-token = token + + HTTP/1.1 proxies MUST parse the Connection header field before a + message is forwarded and, for each connection-token in this field, + remove any header field(s) from the message with the same name as the + connection-token. Connection options are signaled by the presence of + a connection-token in the Connection header field, not by any + corresponding additional header field(s), since the additional header + field may not be sent if there are no parameters associated with that + connection option. + + Message headers listed in the Connection header MUST NOT include + end-to-end headers, such as Cache-Control. + + HTTP/1.1 defines the "close" connection option for the sender to + signal that the connection will be closed after completion of the + response. For example, + + Connection: close + + in either the request or the response header fields indicates that + the connection SHOULD NOT be considered `persistent' (section 8.1) + after the current request/response is complete. + + HTTP/1.1 applications that do not support persistent connections MUST + include the "close" connection option in every message. 
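The proxy behavior described above (parse Connection, then remove every header field named by a connection-token) might look like the following Python sketch. The function name and the list-of-pairs header representation are assumptions for illustration, and dropping the Connection field itself is this sketch's reading of the rule that connection options must not be communicated over further connections.

```python
def strip_connection_headers(headers):
    """Return the header fields a proxy may forward.

    headers: list of (name, value) pairs from the received message
    (a hypothetical representation chosen for this example).
    """
    tokens = set()
    for name, value in headers:
        if name.lower() == "connection":
            # The field value is a comma-separated list of connection-tokens.
            tokens.update(t.strip().lower() for t in value.split(",") if t.strip())

    forwarded = []
    for name, value in headers:
        lowered = name.lower()
        # Drop Connection itself and every field named by a connection-token.
        if lowered == "connection" or lowered in tokens:
            continue
        forwarded.append((name, value))
    return forwarded
```

Given a message with "Connection: close, Keep-Alive" and a Keep-Alive header (an illustrative header name), only the remaining end-to-end fields would be forwarded.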
+ + A system receiving an HTTP/1.0 (or lower-version) message that + includes a Connection header MUST, for each connection-token in this + field, remove and ignore any header field(s) from the message with + the same name as the connection-token. This protects against mistaken + forwarding of such header fields by pre-HTTP/1.1 proxies. See section + 19.6.2. + + + +Fielding, et al. Standards Track [Page 117] + +RFC 2616 HTTP/1.1 June 1999 + + +14.11 Content-Encoding + + The Content-Encoding entity-header field is used as a modifier to the + media-type. When present, its value indicates what additional content + codings have been applied to the entity-body, and thus what decoding + mechanisms must be applied in order to obtain the media-type + referenced by the Content-Type header field. Content-Encoding is + primarily used to allow a document to be compressed without losing + the identity of its underlying media type. + + Content-Encoding = "Content-Encoding" ":" 1#content-coding + + Content codings are defined in section 3.5. An example of its use is + + Content-Encoding: gzip + + The content-coding is a characteristic of the entity identified by + the Request-URI. Typically, the entity-body is stored with this + encoding and is only decoded before rendering or analogous usage. + However, a non-transparent proxy MAY modify the content-coding if the + new coding is known to be acceptable to the recipient, unless the + "no-transform" cache-control directive is present in the message. + + If the content-coding of an entity is not "identity", then the + response MUST include a Content-Encoding entity-header (section + 14.11) that lists the non-identity content-coding(s) used. + + If the content-coding of an entity in a request message is not + acceptable to the origin server, the server SHOULD respond with a + status code of 415 (Unsupported Media Type). 
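To make the role of Content-Encoding concrete, here is a small Python sketch that recovers the underlying entity from an encoded entity-body. It assumes the codings are listed in the order in which they were applied, so they are undone in reverse; the decoder table and function name are hypothetical, and only a few well-known codings are covered.

```python
import gzip
import zlib

# Decoders for the content-codings this sketch knows about.
# The HTTP "deflate" coding is the zlib format, hence zlib.decompress.
DECODERS = {
    "gzip": gzip.decompress,
    "x-gzip": gzip.decompress,
    "deflate": zlib.decompress,
    "identity": lambda data: data,
}


def decode_entity_body(body, content_encoding):
    """Undo the content-codings named in a Content-Encoding field value.

    body: bytes of the entity-body as received.
    content_encoding: the field value, e.g. "gzip" or "deflate, gzip".
    Raises ValueError for a coding this sketch does not support.
    """
    codings = [c.strip().lower() for c in content_encoding.split(",") if c.strip()]
    # Codings are listed in the order applied, so decode last-to-first.
    for coding in reversed(codings):
        try:
            decoder = DECODERS[coding]
        except KeyError:
            raise ValueError(f"unsupported content-coding: {coding}")
        body = decoder(body)
    return body
```

An origin server using something like this on request bodies could map the ValueError to a 415 (Unsupported Media Type) response.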
+ + If multiple encodings have been applied to an entity, the content + codings MUST be listed in the order in which they were applied. + Additional information about the encoding parameters MAY be provided + by other entity-header fields not defined by this specification. + +14.12 Content-Language + + The Content-Language entity-header field describes the natural + language(s) of the intended audience for the enclosed entity. Note + that this might not be equivalent to all the languages used within + the entity-body. + + Content-Language = "Content-Language" ":" 1#language-tag + + + + + + + +Fielding, et al. Standards Track [Page 118] + +RFC 2616 HTTP/1.1 June 1999 + + + Language tags are defined in section 3.10. The primary purpose of + Content-Language is to allow a user to identify and differentiate + entities according to the user's own preferred language. Thus, if the + body content is intended only for a Danish-literate audience, the + appropriate field is + + Content-Language: da + + If no Content-Language is specified, the default is that the content + is intended for all language audiences. This might mean that the + sender does not consider it to be specific to any natural language, + or that the sender does not know for which language it is intended. + + Multiple languages MAY be listed for content that is intended for + multiple audiences. For example, a rendition of the "Treaty of + Waitangi," presented simultaneously in the original Maori and English + versions, would call for + + Content-Language: mi, en + + However, just because multiple languages are present within an entity + does not mean that it is intended for multiple linguistic audiences. + An example would be a beginner's language primer, such as "A First + Lesson in Latin," which is clearly intended to be used by an + English-literate audience. In this case, the Content-Language would + properly only include "en". 
+ + Content-Language MAY be applied to any media type -- it is not + limited to textual documents. + +14.13 Content-Length + + The Content-Length entity-header field indicates the size of the + entity-body, in decimal number of OCTETs, sent to the recipient or, + in the case of the HEAD method, the size of the entity-body that + would have been sent had the request been a GET. + + Content-Length = "Content-Length" ":" 1*DIGIT + + An example is + + Content-Length: 3495 + + Applications SHOULD use this field to indicate the transfer-length of + the message-body, unless this is prohibited by the rules in section + 4.4. + + + + + +Fielding, et al. Standards Track [Page 119] + +RFC 2616 HTTP/1.1 June 1999 + + + Any Content-Length greater than or equal to zero is a valid value. + Section 4.4 describes how to determine the length of a message-body + if a Content-Length is not given. + + Note that the meaning of this field is significantly different from + the corresponding definition in MIME, where it is an optional field + used within the "message/external-body" content-type. In HTTP, it + SHOULD be sent whenever the message's length can be determined prior + to being transferred, unless this is prohibited by the rules in + section 4.4. + +14.14 Content-Location + + The Content-Location entity-header field MAY be used to supply the + resource location for the entity enclosed in the message when that + entity is accessible from a location separate from the requested + resource's URI. A server SHOULD provide a Content-Location for the + variant corresponding to the response entity; especially in the case + where a resource has multiple entities associated with it, and those + entities actually have separate locations by which they might be + individually accessed, the server SHOULD provide a Content-Location + for the particular variant which is returned. 
+ + Content-Location = "Content-Location" ":" + ( absoluteURI | relativeURI ) + + The value of Content-Location also defines the base URI for the + entity. + + The Content-Location value is not a replacement for the original + requested URI; it is only a statement of the location of the resource + corresponding to this particular entity at the time of the request. + Future requests MAY specify the Content-Location URI as the request- + URI if the desire is to identify the source of that particular + entity. + + A cache cannot assume that an entity with a Content-Location + different from the URI used to retrieve it can be used to respond to + later requests on that Content-Location URI. However, the Content- + Location can be used to differentiate between multiple entities + retrieved from a single requested resource, as described in section + 13.6. + + If the Content-Location is a relative URI, the relative URI is + interpreted relative to the Request-URI. + + The meaning of the Content-Location header in PUT or POST requests is + undefined; servers are free to ignore it in those cases. + + + +Fielding, et al. Standards Track [Page 120] + +RFC 2616 HTTP/1.1 June 1999 + + +14.15 Content-MD5 + + The Content-MD5 entity-header field, as defined in RFC 1864 [23], is + an MD5 digest of the entity-body for the purpose of providing an + end-to-end message integrity check (MIC) of the entity-body. (Note: a + MIC is good for detecting accidental modification of the entity-body + in transit, but is not proof against malicious attacks.) + + Content-MD5 = "Content-MD5" ":" md5-digest + md5-digest = <base64 of 128 bit MD5 digest as per RFC 1864> + + The Content-MD5 header field MAY be generated by an origin server or + client to function as an integrity check of the entity-body. Only + origin servers or clients MAY generate the Content-MD5 header field; + proxies and gateways MUST NOT generate it, as this would defeat its + value as an end-to-end integrity check.
Any recipient of the entity- + body, including gateways and proxies, MAY check that the digest value + in this header field matches that of the entity-body as received. + + The MD5 digest is computed based on the content of the entity-body, + including any content-coding that has been applied, but not including + any transfer-encoding applied to the message-body. If the message is + received with a transfer-encoding, that encoding MUST be removed + prior to checking the Content-MD5 value against the received entity. + + This has the result that the digest is computed on the octets of the + entity-body exactly as, and in the order that, they would be sent if + no transfer-encoding were being applied. + + HTTP extends RFC 1864 to permit the digest to be computed for MIME + composite media-types (e.g., multipart/* and message/rfc822), but + this does not change how the digest is computed as defined in the + preceding paragraph. + + There are several consequences of this. The entity-body for composite + types MAY contain many body-parts, each with its own MIME and HTTP + headers (including Content-MD5, Content-Transfer-Encoding, and + Content-Encoding headers). If a body-part has a Content-Transfer- + Encoding or Content-Encoding header, it is assumed that the content + of the body-part has had the encoding applied, and the body-part is + included in the Content-MD5 digest as is -- i.e., after the + application. The Transfer-Encoding header field is not allowed within + body-parts. + + Conversion of all line breaks to CRLF MUST NOT be done before + computing or checking the digest: the line break convention used in + the text actually transmitted MUST be left unaltered when computing + the digest. + + + +Fielding, et al. 
Standards Track [Page 121] + +RFC 2616 HTTP/1.1 June 1999 + + + Note: while the definition of Content-MD5 is exactly the same for + HTTP as in RFC 1864 for MIME entity-bodies, there are several ways + in which the application of Content-MD5 to HTTP entity-bodies + differs from its application to MIME entity-bodies. One is that + HTTP, unlike MIME, does not use Content-Transfer-Encoding, and + does use Transfer-Encoding and Content-Encoding. Another is that + HTTP more frequently uses binary content types than MIME, so it is + worth noting that, in such cases, the byte order used to compute + the digest is the transmission byte order defined for the type. + Lastly, HTTP allows transmission of text types with any of several + line break conventions and not just the canonical form using CRLF. + +14.16 Content-Range + + The Content-Range entity-header is sent with a partial entity-body to + specify where in the full entity-body the partial body should be + applied. Range units are defined in section 3.12. + + Content-Range = "Content-Range" ":" content-range-spec + + content-range-spec = byte-content-range-spec + byte-content-range-spec = bytes-unit SP + byte-range-resp-spec "/" + ( instance-length | "*" ) + + byte-range-resp-spec = (first-byte-pos "-" last-byte-pos) + | "*" + instance-length = 1*DIGIT + + The header SHOULD indicate the total length of the full entity-body, + unless this length is unknown or difficult to determine. The asterisk + "*" character means that the instance-length is unknown at the time + when the response was generated. + + Unlike byte-ranges-specifier values (see section 14.35.1), a byte- + range-resp-spec MUST only specify one range, and MUST contain + absolute byte positions for both the first and last byte of the + range. 
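The Content-Range grammar above is small enough to parse with a single regular expression. The following Python is a sketch only: the function name and the tuple it returns are assumptions, and the validity checks implement the rule this section states for last-byte-pos and instance-length.

```python
import re

# bytes-unit SP byte-range-resp-spec "/" ( instance-length | "*" )
_CONTENT_RANGE = re.compile(r"bytes (?:([0-9]+)-([0-9]+)|\*)/([0-9]+|\*)")


def parse_content_range(value):
    """Parse a Content-Range field value into (first, last, length).

    first and last are None when the byte-range-resp-spec is "*";
    length is None when the instance-length is "*".
    Raises ValueError for a malformed or invalid value.
    """
    match = _CONTENT_RANGE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"malformed Content-Range: {value!r}")

    first_s, last_s, length_s = match.groups()
    first = int(first_s) if first_s is not None else None
    last = int(last_s) if last_s is not None else None
    length = None if length_s == "*" else int(length_s)

    if first is not None:
        # Invalid if last-byte-pos is less than first-byte-pos.
        if last < first:
            raise ValueError("last-byte-pos is less than first-byte-pos")
        # Invalid if instance-length is less than or equal to last-byte-pos.
        if length is not None and length <= last:
            raise ValueError("instance-length does not cover last-byte-pos")
    return first, last, length
```

A recipient that gets a ValueError here would ignore the header and any content transferred along with it, as the section requires for an invalid byte-content-range-spec.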
+ + A byte-content-range-spec with a byte-range-resp-spec whose last- + byte-pos value is less than its first-byte-pos value, or whose + instance-length value is less than or equal to its last-byte-pos + value, is invalid. The recipient of an invalid byte-content-range- + spec MUST ignore it and any content transferred along with it. + + A server sending a response with status code 416 (Requested range not + satisfiable) SHOULD include a Content-Range field with a byte-range- + resp-spec of "*". The instance-length specifies the current length of + + + +Fielding, et al. Standards Track [Page 122] + +RFC 2616 HTTP/1.1 June 1999 + + + the selected resource. A response with status code 206 (Partial + Content) MUST NOT include a Content-Range field with a byte-range- + resp-spec of "*". + + Examples of byte-content-range-spec values, assuming that the entity + contains a total of 1234 bytes: + + . The first 500 bytes: + bytes 0-499/1234 + + . The second 500 bytes: + bytes 500-999/1234 + + . All except for the first 500 bytes: + bytes 500-1233/1234 + + . The last 500 bytes: + bytes 734-1233/1234 + + When an HTTP message includes the content of a single range (for + example, a response to a request for a single range, or to a request + for a set of ranges that overlap without any holes), this content is + transmitted with a Content-Range header, and a Content-Length header + showing the number of bytes actually transferred. For example, + + HTTP/1.1 206 Partial content + Date: Wed, 15 Nov 1995 06:25:24 GMT + Last-Modified: Wed, 15 Nov 1995 04:58:08 GMT + Content-Range: bytes 21010-47021/47022 + Content-Length: 26012 + Content-Type: image/gif + + When an HTTP message includes the content of multiple ranges (for + example, a response to a request for multiple non-overlapping + ranges), these are transmitted as a multipart message. The multipart + media type used for this purpose is "multipart/byteranges" as defined + in appendix 19.2. 
See appendix 19.6.3 for a compatibility issue. + + A response to a request for a single range MUST NOT be sent using the + multipart/byteranges media type. A response to a request for + multiple ranges, whose result is a single range, MAY be sent as a + multipart/byteranges media type with one part. A client that cannot + decode a multipart/byteranges message MUST NOT ask for multiple + byte-ranges in a single request. + + When a client requests multiple byte-ranges in one request, the + server SHOULD return them in the order that they appeared in the + request. + + + +Fielding, et al. Standards Track [Page 123] + +RFC 2616 HTTP/1.1 June 1999 + + + If the server ignores a byte-range-spec because it is syntactically + invalid, the server SHOULD treat the request as if the invalid Range + header field did not exist. (Normally, this means return a 200 + response containing the full entity). + + If the server receives a request (other than one including an If- + Range request-header field) with an unsatisfiable Range request- + header field (that is, all of whose byte-range-spec values have a + first-byte-pos value greater than the current length of the selected + resource), it SHOULD return a response code of 416 (Requested range + not satisfiable) (section 10.4.17). + + Note: clients cannot depend on servers to send a 416 (Requested + range not satisfiable) response instead of a 200 (OK) response for + an unsatisfiable Range request-header, since not all servers + implement this request-header. + +14.17 Content-Type + + The Content-Type entity-header field indicates the media type of the + entity-body sent to the recipient or, in the case of the HEAD method, + the media type that would have been sent had the request been a GET. + + Content-Type = "Content-Type" ":" media-type + + Media types are defined in section 3.7. 
An example of the field is + + Content-Type: text/html; charset=ISO-8859-4 + + Further discussion of methods for identifying the media type of an + entity is provided in section 7.2.1. + +14.18 Date + + The Date general-header field represents the date and time at which + the message was originated, having the same semantics as orig-date in + RFC 822. The field value is an HTTP-date, as described in section + 3.3.1; it MUST be sent in RFC 1123 [8]-date format. + + Date = "Date" ":" HTTP-date + + An example is + + Date: Tue, 15 Nov 1994 08:12:31 GMT + + Origin servers MUST include a Date header field in all responses, + except in these cases: + + + + +Fielding, et al. Standards Track [Page 124] + +RFC 2616 HTTP/1.1 June 1999 + + + 1. If the response status code is 100 (Continue) or 101 (Switching + Protocols), the response MAY include a Date header field, at + the server's option. + + 2. If the response status code conveys a server error, e.g. 500 + (Internal Server Error) or 503 (Service Unavailable), and it is + inconvenient or impossible to generate a valid Date. + + 3. If the server does not have a clock that can provide a + reasonable approximation of the current time, its responses + MUST NOT include a Date header field. In this case, the rules + in section 14.18.1 MUST be followed. + + A received message that does not have a Date header field MUST be + assigned one by the recipient if the message will be cached by that + recipient or gatewayed via a protocol which requires a Date. An HTTP + implementation without a clock MUST NOT cache responses without + revalidating them on every use. An HTTP cache, especially a shared + cache, SHOULD use a mechanism, such as NTP [28], to synchronize its + clock with a reliable external standard. + + Clients SHOULD only send a Date header field in messages that include + an entity-body, as in the case of the PUT and POST requests, and even + then it is optional. 
A client without a clock MUST NOT send a Date + header field in a request. + + The HTTP-date sent in a Date header SHOULD NOT represent a date and + time subsequent to the generation of the message. It SHOULD represent + the best available approximation of the date and time of message + generation, unless the implementation has no means of generating a + reasonably accurate date and time. In theory, the date ought to + represent the moment just before the entity is generated. In + practice, the date can be generated at any time during the message + origination without affecting its semantic value. + +14.18.1 Clockless Origin Server Operation + + Some origin server implementations might not have a clock available. + An origin server without a clock MUST NOT assign Expires or Last- + Modified values to a response, unless these values were associated + with the resource by a system or user with a reliable clock. It MAY + assign an Expires value that is known, at or before server + configuration time, to be in the past (this allows "pre-expiration" + of responses without storing separate Expires values for each + resource). + + + + + + +Fielding, et al. Standards Track [Page 125] + +RFC 2616 HTTP/1.1 June 1999 + + +14.19 ETag + + The ETag response-header field provides the current value of the + entity tag for the requested variant. The headers used with entity + tags are described in sections 14.24, 14.26 and 14.44. The entity tag + MAY be used for comparison with other entities from the same resource + (see section 13.3.3). + + ETag = "ETag" ":" entity-tag + + Examples: + + ETag: "xyzzy" + ETag: W/"xyzzy" + ETag: "" + +14.20 Expect + + The Expect request-header field is used to indicate that particular + server behaviors are required by the client. 
+ + Expect = "Expect" ":" 1#expectation + + expectation = "100-continue" | expectation-extension + expectation-extension = token [ "=" ( token | quoted-string ) + *expect-params ] + expect-params = ";" token [ "=" ( token | quoted-string ) ] + + + A server that does not understand or is unable to comply with any of + the expectation values in the Expect field of a request MUST respond + with appropriate error status. The server MUST respond with a 417 + (Expectation Failed) status if any of the expectations cannot be met + or, if there are other problems with the request, some other 4xx + status. + + This header field is defined with extensible syntax to allow for + future extensions. If a server receives a request containing an + Expect field that includes an expectation-extension that it does not + support, it MUST respond with a 417 (Expectation Failed) status. + + Comparison of expectation values is case-insensitive for unquoted + tokens (including the 100-continue token), and is case-sensitive for + quoted-string expectation-extensions. + + + + + + + +Fielding, et al. Standards Track [Page 126] + +RFC 2616 HTTP/1.1 June 1999 + + + The Expect mechanism is hop-by-hop: that is, an HTTP/1.1 proxy MUST + return a 417 (Expectation Failed) status if it receives a request + with an expectation that it cannot meet. However, the Expect + request-header itself is end-to-end; it MUST be forwarded if the + request is forwarded. + + Many older HTTP/1.0 and HTTP/1.1 applications do not understand the + Expect header. + + See section 8.2.3 for the use of the 100 (continue) status. + +14.21 Expires + + The Expires entity-header field gives the date/time after which the + response is considered stale. A stale cache entry may not normally be + returned by a cache (either a proxy cache or a user agent cache) + unless it is first validated with the origin server (or with an + intermediate cache that has a fresh copy of the entity). 
See section + 13.2 for further discussion of the expiration model. + + The presence of an Expires field does not imply that the original + resource will change or cease to exist at, before, or after that + time. + + The format is an absolute date and time as defined by HTTP-date in + section 3.3.1; it MUST be in RFC 1123 date format: + + Expires = "Expires" ":" HTTP-date + + An example of its use is + + Expires: Thu, 01 Dec 1994 16:00:00 GMT + + Note: if a response includes a Cache-Control field with the max- + age directive (see section 14.9.3), that directive overrides the + Expires field. + + HTTP/1.1 clients and caches MUST treat other invalid date formats, + especially including the value "0", as in the past (i.e., "already + expired"). + + To mark a response as "already expired," an origin server sends an + Expires date that is equal to the Date header value. (See the rules + for expiration calculations in section 13.2.4.) + + + + + + + +Fielding, et al. Standards Track [Page 127] + +RFC 2616 HTTP/1.1 June 1999 + + + To mark a response as "never expires," an origin server sends an + Expires date approximately one year from the time the response is + sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one + year in the future. + + The presence of an Expires header field with a date value of some + time in the future on a response that otherwise would by default be + non-cacheable indicates that the response is cacheable, unless + indicated otherwise by a Cache-Control header field (section 14.9). + +14.22 From + + The From request-header field, if given, SHOULD contain an Internet + e-mail address for the human user who controls the requesting user + agent. 
The address SHOULD be machine-usable, as defined by "mailbox" + in RFC 822 [9] as updated by RFC 1123 [8]: + + From = "From" ":" mailbox + + An example is: + + From: webmaster@w3.org + + This header field MAY be used for logging purposes and as a means for + identifying the source of invalid or unwanted requests. It SHOULD NOT + be used as an insecure form of access protection. The interpretation + of this field is that the request is being performed on behalf of the + person given, who accepts responsibility for the method performed. In + particular, robot agents SHOULD include this header so that the + person responsible for running the robot can be contacted if problems + occur on the receiving end. + + The Internet e-mail address in this field MAY be separate from the + Internet host which issued the request. For example, when a request + is passed through a proxy the original issuer's address SHOULD be + used. + + The client SHOULD NOT send the From header field without the user's + approval, as it might conflict with the user's privacy interests or + their site's security policy. It is strongly recommended that the + user be able to disable, enable, and modify the value of this field + at any time prior to a request. + +14.23 Host + + The Host request-header field specifies the Internet host and port + number of the resource being requested, as obtained from the original + URI given by the user or referring resource (generally an HTTP URL, + + + +Fielding, et al. Standards Track [Page 128] + +RFC 2616 HTTP/1.1 June 1999 + + + as described in section 3.2.2). The Host field value MUST represent + the naming authority of the origin server or gateway given by the + original URL. This allows the origin server or gateway to + differentiate between internally-ambiguous URLs, such as the root "/" + URL of a server for multiple host names on a single IP address. 
+ + Host = "Host" ":" host [ ":" port ] ; Section 3.2.2 + + A "host" without any trailing port information implies the default + port for the service requested (e.g., "80" for an HTTP URL). For + example, a request on the origin server for + <http://www.w3.org/pub/WWW/> would properly include: + + GET /pub/WWW/ HTTP/1.1 + Host: www.w3.org + + A client MUST include a Host header field in all HTTP/1.1 request + messages. If the requested URI does not include an Internet host + name for the service being requested, then the Host header field MUST + be given with an empty value. An HTTP/1.1 proxy MUST ensure that any + request message it forwards does contain an appropriate Host header + field that identifies the service being requested by the proxy. All + Internet-based HTTP/1.1 servers MUST respond with a 400 (Bad Request) + status code to any HTTP/1.1 request message which lacks a Host header + field. + + See sections 5.2 and 19.6.1.1 for other requirements relating to + Host. + +14.24 If-Match + + The If-Match request-header field is used with a method to make it + conditional. A client that has one or more entities previously + obtained from the resource can verify that one of those entities is + current by including a list of their associated entity tags in the + If-Match header field. Entity tags are defined in section 3.11. The + purpose of this feature is to allow efficient updates of cached + information with a minimum amount of transaction overhead. It is also + used, on updating requests, to prevent inadvertent modification of + the wrong version of a resource. As a special case, the value "*" + matches any current entity of the resource. + + If-Match = "If-Match" ":" ( "*" | 1#entity-tag ) + + If any of the entity tags match the entity tag of the entity that + would have been returned in the response to a similar GET request + (without the If-Match header) on that resource, or if "*" is given + + + + +Fielding, et al.
Standards Track [Page 129] + +RFC 2616 HTTP/1.1 June 1999 + + + and any current entity exists for that resource, then the server MAY + perform the requested method as if the If-Match header field did not + exist. + + A server MUST use the strong comparison function (see section 13.3.3) + to compare the entity tags in If-Match. + + If none of the entity tags match, or if "*" is given and no current + entity exists, the server MUST NOT perform the requested method, and + MUST return a 412 (Precondition Failed) response. This behavior is + most useful when the client wants to prevent an updating method, such + as PUT, from modifying a resource that has changed since the client + last retrieved it. + + If the request would, without the If-Match header field, result in + anything other than a 2xx or 412 status, then the If-Match header + MUST be ignored. + + The meaning of "If-Match: *" is that the method SHOULD be performed + if the representation selected by the origin server (or by a cache, + possibly using the Vary mechanism, see section 14.44) exists, and + MUST NOT be performed if the representation does not exist. + + A request intended to update a resource (e.g., a PUT) MAY include an + If-Match header field to signal that the request method MUST NOT be + applied if the entity corresponding to the If-Match value (a single + entity tag) is no longer a representation of that resource. This + allows the user to indicate that they do not wish the request to be + successful if the resource has been changed without their knowledge. + Examples: + + If-Match: "xyzzy" + If-Match: "xyzzy", "r2d2xxxx", "c3piozzzz" + If-Match: * + + The result of a request having both an If-Match header field and + either an If-None-Match or an If-Modified-Since header fields is + undefined by this specification. 
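The If-Match rules above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `strong_match` and `if_match_allows` are hypothetical, the header value is assumed to be already split into a list of entity-tag strings (or the single element "*"), and `current_etag` is assumed to be None when no current entity exists.

```python
from typing import Optional, Sequence


def strong_match(a: str, b: str) -> bool:
    """Strong comparison: neither tag may be weak and both must be identical."""
    return not a.startswith("W/") and not b.startswith("W/") and a == b


def if_match_allows(if_match: Sequence[str], current_etag: Optional[str]) -> bool:
    """Return True if the method may proceed, False if a 412 is called for.

    if_match     -- parsed If-Match value, e.g. ['"xyzzy"', '"r2d2xxxx"'] or ['*']
    current_etag -- entity tag of the current entity, or None if none exists
    """
    if current_etag is None:
        # Nothing can match, including "*", when no current entity exists.
        return False
    if list(if_match) == ["*"]:
        return True
    return any(strong_match(tag, current_etag) for tag in if_match)
```

A server built along these lines would answer 412 (Precondition Failed) whenever `if_match_allows` returns False, and otherwise handle the request as if the header were absent.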
+ +14.25 If-Modified-Since + + The If-Modified-Since request-header field is used with a method to + make it conditional: if the requested variant has not been modified + since the time specified in this field, an entity will not be + returned from the server; instead, a 304 (not modified) response will + be returned without any message-body. + + If-Modified-Since = "If-Modified-Since" ":" HTTP-date + + + +Fielding, et al. Standards Track [Page 130] + +RFC 2616 HTTP/1.1 June 1999 + + + An example of the field is: + + If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT + + A GET method with an If-Modified-Since header and no Range header + requests that the identified entity be transferred only if it has + been modified since the date given by the If-Modified-Since header. + The algorithm for determining this includes the following cases: + + a) If the request would normally result in anything other than a + 200 (OK) status, or if the passed If-Modified-Since date is + invalid, the response is exactly the same as for a normal GET. + A date which is later than the server's current time is + invalid. + + b) If the variant has been modified since the If-Modified-Since + date, the response is exactly the same as for a normal GET. + + c) If the variant has not been modified since a valid If- + Modified-Since date, the server SHOULD return a 304 (Not + Modified) response. + + The purpose of this feature is to allow efficient updates of cached + information with a minimum amount of transaction overhead. + + Note: The Range request-header field modifies the meaning of If- + Modified-Since; see section 14.35 for full details. + + Note: If-Modified-Since times are interpreted by the server, whose + clock might not be synchronized with the client. + + Note: When handling an If-Modified-Since header field, some + servers will use an exact date comparison function, rather than a + less-than function, for deciding whether to send a 304 (Not + Modified) response. 
To get best results when sending an If- + Modified-Since header field for cache validation, clients are + advised to use the exact date string received in a previous Last- + Modified header field whenever possible. + + Note: If a client uses an arbitrary date in the If-Modified-Since + header instead of a date taken from the Last-Modified header for + the same request, the client should be aware of the fact that this + date is interpreted in the server's understanding of time. The + client should consider unsynchronized clocks and rounding problems + due to the different encodings of time between the client and + server. This includes the possibility of race conditions if the + document has changed between the time it was first requested and + the If-Modified-Since date of a subsequent request, and the + + + +Fielding, et al. Standards Track [Page 131] + +RFC 2616 HTTP/1.1 June 1999 + + + possibility of clock-skew-related problems if the If-Modified- + Since date is derived from the client's clock without correction + to the server's clock. Corrections for different time bases + between client and server are at best approximate due to network + latency. + + The result of a request having both an If-Modified-Since header field + and either an If-Match or an If-Unmodified-Since header fields is + undefined by this specification. + +14.26 If-None-Match + + The If-None-Match request-header field is used with a method to make + it conditional. A client that has one or more entities previously + obtained from the resource can verify that none of those entities is + current by including a list of their associated entity tags in the + If-None-Match header field. The purpose of this feature is to allow + efficient updates of cached information with a minimum amount of + transaction overhead. It is also used to prevent a method (e.g. PUT) + from inadvertently modifying an existing resource when the client + believes that the resource does not exist. 
+ + As a special case, the value "*" matches any current entity of the + resource. + + If-None-Match = "If-None-Match" ":" ( "*" | 1#entity-tag ) + + If any of the entity tags match the entity tag of the entity that + would have been returned in the response to a similar GET request + (without the If-None-Match header) on that resource, or if "*" is + given and any current entity exists for that resource, then the + server MUST NOT perform the requested method, unless required to do + so because the resource's modification date fails to match that + supplied in an If-Modified-Since header field in the request. + Instead, if the request method was GET or HEAD, the server SHOULD + respond with a 304 (Not Modified) response, including the cache- + related header fields (particularly ETag) of one of the entities that + matched. For all other request methods, the server MUST respond with + a status of 412 (Precondition Failed). + + See section 13.3.3 for rules on how to determine if two entities tags + match. The weak comparison function can only be used with GET or HEAD + requests. + + + + + + + + +Fielding, et al. Standards Track [Page 132] + +RFC 2616 HTTP/1.1 June 1999 + + + If none of the entity tags match, then the server MAY perform the + requested method as if the If-None-Match header field did not exist, + but MUST also ignore any If-Modified-Since header field(s) in the + request. That is, if no entity tags match, then the server MUST NOT + return a 304 (Not Modified) response. + + If the request would, without the If-None-Match header field, result + in anything other than a 2xx or 304 status, then the If-None-Match + header MUST be ignored. (See section 13.3.4 for a discussion of + server behavior when both If-Modified-Since and If-None-Match appear + in the same request.) 
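A minimal Python sketch of the If-None-Match decision described above. The names `tags_match` and `if_none_match_status` are hypothetical, the header is assumed to be already parsed into a list of entity-tag strings, and the interaction with If-Modified-Since is deliberately left out.

```python
from typing import Optional, Sequence


def _opaque(tag: str) -> str:
    """Strip the weakness indicator, leaving the quoted opaque tag."""
    return tag[2:] if tag.startswith("W/") else tag


def tags_match(a: str, b: str, weak_ok: bool) -> bool:
    """Weak comparison ignores the W/ prefix; strong comparison forbids it."""
    if weak_ok:
        return _opaque(a) == _opaque(b)
    return not a.startswith("W/") and not b.startswith("W/") and a == b


def if_none_match_status(method: str,
                         if_none_match: Sequence[str],
                         current_etag: Optional[str]) -> Optional[int]:
    """Return 304 or 412 when the method must not be performed, else None."""
    if current_etag is None:
        return None  # no current entity, so nothing matches (not even "*")
    is_get_or_head = method in ("GET", "HEAD")
    matched = list(if_none_match) == ["*"] or any(
        # Weak comparison is only permitted for GET and HEAD.
        tags_match(tag, current_etag, weak_ok=is_get_or_head)
        for tag in if_none_match
    )
    if not matched:
        return None
    return 304 if is_get_or_head else 412
```

A result of None means the server may carry out the method as if the header were not present.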
+ + The meaning of "If-None-Match: *" is that the method MUST NOT be + performed if the representation selected by the origin server (or by + a cache, possibly using the Vary mechanism, see section 14.44) + exists, and SHOULD be performed if the representation does not exist. + This feature is intended to be useful in preventing races between PUT + operations. + + Examples: + + If-None-Match: "xyzzy" + If-None-Match: W/"xyzzy" + If-None-Match: "xyzzy", "r2d2xxxx", "c3piozzzz" + If-None-Match: W/"xyzzy", W/"r2d2xxxx", W/"c3piozzzz" + If-None-Match: * + + The result of a request having both an If-None-Match header field and + either an If-Match or an If-Unmodified-Since header fields is + undefined by this specification. + +14.27 If-Range + + If a client has a partial copy of an entity in its cache, and wishes + to have an up-to-date copy of the entire entity in its cache, it + could use the Range request-header with a conditional GET (using + either or both of If-Unmodified-Since and If-Match.) However, if the + condition fails because the entity has been modified, the client + would then have to make a second request to obtain the entire current + entity-body. + + The If-Range header allows a client to "short-circuit" the second + request. Informally, its meaning is `if the entity is unchanged, send + me the part(s) that I am missing; otherwise, send me the entire new + entity'. + + If-Range = "If-Range" ":" ( entity-tag | HTTP-date ) + + + + +Fielding, et al. Standards Track [Page 133] + +RFC 2616 HTTP/1.1 June 1999 + + + If the client has no entity tag for an entity, but does have a Last- + Modified date, it MAY use that date in an If-Range header. (The + server can distinguish between a valid HTTP-date and any form of + entity-tag by examining no more than two characters.) 
The If-Range + header SHOULD only be used together with a Range header, and MUST be + ignored if the request does not include a Range header, or if the + server does not support the sub-range operation. + + If the entity tag given in the If-Range header matches the current + entity tag for the entity, then the server SHOULD provide the + specified sub-range of the entity using a 206 (Partial content) + response. If the entity tag does not match, then the server SHOULD + return the entire entity using a 200 (OK) response. + +14.28 If-Unmodified-Since + + The If-Unmodified-Since request-header field is used with a method to + make it conditional. If the requested resource has not been modified + since the time specified in this field, the server SHOULD perform the + requested operation as if the If-Unmodified-Since header were not + present. + + If the requested variant has been modified since the specified time, + the server MUST NOT perform the requested operation, and MUST return + a 412 (Precondition Failed). + + If-Unmodified-Since = "If-Unmodified-Since" ":" HTTP-date + + An example of the field is: + + If-Unmodified-Since: Sat, 29 Oct 1994 19:43:31 GMT + + If the request normally (i.e., without the If-Unmodified-Since + header) would result in anything other than a 2xx or 412 status, the + If-Unmodified-Since header SHOULD be ignored. + + If the specified date is invalid, the header is ignored. + + The result of a request having both an If-Unmodified-Since header + field and either an If-None-Match or an If-Modified-Since header + fields is undefined by this specification. + +14.29 Last-Modified + + The Last-Modified entity-header field indicates the date and time at + which the origin server believes the variant was last modified. + + Last-Modified = "Last-Modified" ":" HTTP-date + + + +Fielding, et al. 
Standards Track [Page 134] + +RFC 2616 HTTP/1.1 June 1999 + + + An example of its use is + + Last-Modified: Tue, 15 Nov 1994 12:45:26 GMT + + The exact meaning of this header field depends on the implementation + of the origin server and the nature of the original resource. For + files, it may be just the file system last-modified time. For + entities with dynamically included parts, it may be the most recent + of the set of last-modify times for its component parts. For database + gateways, it may be the last-update time stamp of the record. For + virtual objects, it may be the last time the internal state changed. + + An origin server MUST NOT send a Last-Modified date which is later + than the server's time of message origination. In such cases, where + the resource's last modification would indicate some time in the + future, the server MUST replace that date with the message + origination date. + + An origin server SHOULD obtain the Last-Modified value of the entity + as close as possible to the time that it generates the Date value of + its response. This allows a recipient to make an accurate assessment + of the entity's modification time, especially if the entity changes + near the time that the response is generated. + + HTTP/1.1 servers SHOULD send Last-Modified whenever feasible. + +14.30 Location + + The Location response-header field is used to redirect the recipient + to a location other than the Request-URI for completion of the + request or identification of a new resource. For 201 (Created) + responses, the Location is that of the new resource which was created + by the request. For 3xx responses, the location SHOULD indicate the + server's preferred URI for automatic redirection to the resource. The + field value consists of a single absolute URI. 
+ + Location = "Location" ":" absoluteURI + + An example is: + + Location: http://www.w3.org/pub/WWW/People.html + + Note: The Content-Location header field (section 14.14) differs + from Location in that the Content-Location identifies the original + location of the entity enclosed in the request. It is therefore + possible for a response to contain header fields for both Location + and Content-Location. Also see section 13.10 for cache + requirements of some methods. + + + +Fielding, et al. Standards Track [Page 135] + +RFC 2616 HTTP/1.1 June 1999 + + +14.31 Max-Forwards + + The Max-Forwards request-header field provides a mechanism with the + TRACE (section 9.8) and OPTIONS (section 9.2) methods to limit the + number of proxies or gateways that can forward the request to the + next inbound server. This can be useful when the client is attempting + to trace a request chain which appears to be failing or looping in + mid-chain. + + Max-Forwards = "Max-Forwards" ":" 1*DIGIT + + The Max-Forwards value is a decimal integer indicating the remaining + number of times this request message may be forwarded. + + Each proxy or gateway recipient of a TRACE or OPTIONS request + containing a Max-Forwards header field MUST check and update its + value prior to forwarding the request. If the received value is zero + (0), the recipient MUST NOT forward the request; instead, it MUST + respond as the final recipient. If the received Max-Forwards value is + greater than zero, then the forwarded message MUST contain an updated + Max-Forwards field with a value decremented by one (1). + + The Max-Forwards header field MAY be ignored for all other methods + defined by this specification and for any extension methods for which + it is not explicitly referred to as part of that method definition. + +14.32 Pragma + + The Pragma general-header field is used to include implementation- + specific directives that might apply to any recipient along the + request/response chain. 
All pragma directives specify optional + behavior from the viewpoint of the protocol; however, some systems + MAY require that behavior be consistent with the directives. + + Pragma = "Pragma" ":" 1#pragma-directive + pragma-directive = "no-cache" | extension-pragma + extension-pragma = token [ "=" ( token | quoted-string ) ] + + When the no-cache directive is present in a request message, an + application SHOULD forward the request toward the origin server even + if it has a cached copy of what is being requested. This pragma + directive has the same semantics as the no-cache cache-directive (see + section 14.9) and is defined here for backward compatibility with + HTTP/1.0. Clients SHOULD include both header fields when a no-cache + request is sent to a server not known to be HTTP/1.1 compliant. + + + + + + +Fielding, et al. Standards Track [Page 136] + +RFC 2616 HTTP/1.1 June 1999 + + + Pragma directives MUST be passed through by a proxy or gateway + application, regardless of their significance to that application, + since the directives might be applicable to all recipients along the + request/response chain. It is not possible to specify a pragma for a + specific recipient; however, any pragma directive not relevant to a + recipient SHOULD be ignored by that recipient. + + HTTP/1.1 caches SHOULD treat "Pragma: no-cache" as if the client had + sent "Cache-Control: no-cache". No new Pragma directives will be + defined in HTTP. + + Note: because the meaning of "Pragma: no-cache" as a response + header field is not actually specified, it does not provide a + reliable replacement for "Cache-Control: no-cache" in a response. + +14.33 Proxy-Authenticate + + The Proxy-Authenticate response-header field MUST be included as part + of a 407 (Proxy Authentication Required) response. The field value + consists of a challenge that indicates the authentication scheme and + parameters applicable to the proxy for this Request-URI.
+ + Proxy-Authenticate = "Proxy-Authenticate" ":" 1#challenge + + The HTTP access authentication process is described in "HTTP + Authentication: Basic and Digest Access Authentication" [43]. Unlike + WWW-Authenticate, the Proxy-Authenticate header field applies only to + the current connection and SHOULD NOT be passed on to downstream + clients. However, an intermediate proxy might need to obtain its own + credentials by requesting them from the downstream client, which in + some circumstances will appear as if the proxy is forwarding the + Proxy-Authenticate header field. + +14.34 Proxy-Authorization + + The Proxy-Authorization request-header field allows the client to + identify itself (or its user) to a proxy which requires + authentication. The Proxy-Authorization field value consists of + credentials containing the authentication information of the user + agent for the proxy and/or realm of the resource being requested. + + Proxy-Authorization = "Proxy-Authorization" ":" credentials + + The HTTP access authentication process is described in "HTTP + Authentication: Basic and Digest Access Authentication" [43] . Unlike + Authorization, the Proxy-Authorization header field applies only to + the next outbound proxy that demanded authentication using the Proxy- + Authenticate field. When multiple proxies are used in a chain, the + + + +Fielding, et al. Standards Track [Page 137] + +RFC 2616 HTTP/1.1 June 1999 + + + Proxy-Authorization header field is consumed by the first outbound + proxy that was expecting to receive credentials. A proxy MAY relay + the credentials from the client request to the next proxy if that is + the mechanism by which the proxies cooperatively authenticate a given + request. + +14.35 Range + +14.35.1 Byte Ranges + + Since all HTTP entities are represented in HTTP messages as sequences + of bytes, the concept of a byte range is meaningful for any HTTP + entity. (However, not all clients and servers need to support byte- + range operations.) 
+ + Byte range specifications in HTTP apply to the sequence of bytes in + the entity-body (not necessarily the same as the message-body). + + A byte range operation MAY specify a single range of bytes, or a set + of ranges within a single entity. + + ranges-specifier = byte-ranges-specifier + byte-ranges-specifier = bytes-unit "=" byte-range-set + byte-range-set = 1#( byte-range-spec | suffix-byte-range-spec ) + byte-range-spec = first-byte-pos "-" [last-byte-pos] + first-byte-pos = 1*DIGIT + last-byte-pos = 1*DIGIT + + The first-byte-pos value in a byte-range-spec gives the byte-offset + of the first byte in a range. The last-byte-pos value gives the + byte-offset of the last byte in the range; that is, the byte + positions specified are inclusive. Byte offsets start at zero. + + If the last-byte-pos value is present, it MUST be greater than or + equal to the first-byte-pos in that byte-range-spec, or the byte- + range-spec is syntactically invalid. The recipient of a byte-range- + set that includes one or more syntactically invalid byte-range-spec + values MUST ignore the header field that includes that byte-range- + set. + + If the last-byte-pos value is absent, or if the value is greater than + or equal to the current length of the entity-body, last-byte-pos is + taken to be equal to one less than the current length of the entity- + body in bytes. + + By its choice of last-byte-pos, a client can limit the number of + bytes retrieved without knowing the size of the entity. + + + + +Fielding, et al. Standards Track [Page 138] + +RFC 2616 HTTP/1.1 June 1999 + + + suffix-byte-range-spec = "-" suffix-length + suffix-length = 1*DIGIT + + A suffix-byte-range-spec is used to specify the suffix of the + entity-body, of a length given by the suffix-length value. (That is, + this form specifies the last N bytes of an entity-body.) If the + entity is shorter than the specified suffix-length, the entire + entity-body is used. 
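The rules above for byte-range-spec and suffix-byte-range-spec can be illustrated with a short Python sketch. The function name `resolve_range` is hypothetical, and the choice to raise ValueError for a syntactically invalid spec and to return None for a spec that selects no bytes is an assumption made for this example only.

```python
import re
from typing import Optional, Tuple

_SPEC = re.compile(r"^([0-9]+)-([0-9]*)$")   # first-byte-pos "-" [last-byte-pos]
_SUFFIX = re.compile(r"^-([0-9]+)$")         # "-" suffix-length


def resolve_range(spec: str, length: int) -> Optional[Tuple[int, int]]:
    """Map one range spec to inclusive (first, last) offsets for an entity-body
    of the given length. Returns None if the spec selects no bytes."""
    suffix = _SUFFIX.match(spec)
    if suffix:
        n = int(suffix.group(1))
        if n == 0 or length == 0:
            return None
        # A suffix longer than the entity selects the entire entity-body.
        return (max(length - n, 0), length - 1)

    m = _SPEC.match(spec)
    if m is None:
        raise ValueError("not a byte-range-spec: %r" % spec)
    first = int(m.group(1))
    last = int(m.group(2)) if m.group(2) else None
    if last is not None and last < first:
        raise ValueError("last-byte-pos is less than first-byte-pos")
    if first >= length:
        return None
    if last is None or last >= length:
        # Absent or oversized last-byte-pos means "to the end of the entity".
        last = length - 1
    return (first, last)
```

For an entity-body of 10000 bytes, `resolve_range("-500", 10000)` and `resolve_range("9500-", 10000)` select the same final 500 bytes.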
+ + If a syntactically valid byte-range-set includes at least one byte- + range-spec whose first-byte-pos is less than the current length of + the entity-body, or at least one suffix-byte-range-spec with a non- + zero suffix-length, then the byte-range-set is satisfiable. + Otherwise, the byte-range-set is unsatisfiable. If the byte-range-set + is unsatisfiable, the server SHOULD return a response with a status + of 416 (Requested range not satisfiable). Otherwise, the server + SHOULD return a response with a status of 206 (Partial Content) + containing the satisfiable ranges of the entity-body. + + Examples of byte-ranges-specifier values (assuming an entity-body of + length 10000): + + - The first 500 bytes (byte offsets 0-499, inclusive): bytes=0- + 499 + + - The second 500 bytes (byte offsets 500-999, inclusive): + bytes=500-999 + + - The final 500 bytes (byte offsets 9500-9999, inclusive): + bytes=-500 + + - Or bytes=9500- + + - The first and last bytes only (bytes 0 and 9999): bytes=0-0,-1 + + - Several legal but not canonical specifications of the second 500 + bytes (byte offsets 500-999, inclusive): + bytes=500-600,601-999 + bytes=500-700,601-999 + +14.35.2 Range Retrieval Requests + + HTTP retrieval requests using conditional or unconditional GET + methods MAY request one or more sub-ranges of the entity, instead of + the entire entity, using the Range request header, which applies to + the entity returned as the result of the request: + + Range = "Range" ":" ranges-specifier + + + +Fielding, et al. Standards Track [Page 139] + +RFC 2616 HTTP/1.1 June 1999 + + + A server MAY ignore the Range header. However, HTTP/1.1 origin + servers and intermediate caches ought to support byte ranges when + possible, since Range supports efficient recovery from partially + failed transfers, and supports efficient partial retrieval of large + entities. 
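The satisfiability rule and the examples in section 14.35.1 can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not a definitive server implementation: the name `resolve_ranges` and the input representation ((first, last) tuples, with `first` set to `None` for a suffix spec and `last` set to `None` when last-byte-pos is absent) are hypothetical, the specs are assumed to be syntactically valid already, and the entity length is assumed to be greater than zero.

```python
def resolve_ranges(specs, length):
    """Resolve parsed byte-range specs against an entity-body length.

    specs  -- list of (first, last) tuples (hypothetical representation):
              (first, last), (first, None) for an open-ended range, or
              (None, n) for a suffix range covering the last n bytes.
    length -- current length of the entity-body in bytes (assumed > 0).

    Returns (status, ranges), where ranges holds inclusive (start, end)
    byte offsets for every satisfiable spec.
    """
    ranges = []
    for first, last in specs:
        if first is None:
            # Suffix spec: satisfiable only with a non-zero suffix-length.
            # If the entity is shorter than the suffix, use all of it.
            if last > 0:
                ranges.append((max(0, length - last), length - 1))
        elif first < length:
            # An absent or too-large last-byte-pos is taken to be length - 1.
            end = length - 1 if last is None or last >= length else last
            ranges.append((first, end))
        # Otherwise first-byte-pos lies beyond the entity: skip this spec.
    if not ranges:
        return 416, []   # Requested range not satisfiable
    return 206, ranges   # Partial Content


# Examples from section 14.35.1, with an entity-body of length 10000.
print(resolve_ranges([(None, 500)], 10000))
print(resolve_ranges([(0, 0), (None, 1)], 10000))
print(resolve_ranges([(10000, None)], 10000))
```

Because a server MAY ignore the Range header altogether, a real server could equally answer any of these requests with a full 200 (OK) response; the sketch only covers the case where ranges are honoured.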
+ + If the server supports the Range header and the specified range or + ranges are appropriate for the entity: + + - The presence of a Range header in an unconditional GET modifies + what is returned if the GET is otherwise successful. In other + words, the response carries a status code of 206 (Partial + Content) instead of 200 (OK). + + - The presence of a Range header in a conditional GET (a request + using one or both of If-Modified-Since and If-None-Match, or + one or both of If-Unmodified-Since and If-Match) modifies what + is returned if the GET is otherwise successful and the + condition is true. It does not affect the 304 (Not Modified) + response returned if the conditional is false. + + In some cases, it might be more appropriate to use the If-Range + header (see section 14.27) in addition to the Range header. + + If a proxy that supports ranges receives a Range request, forwards + the request to an inbound server, and receives an entire entity in + reply, it SHOULD only return the requested range to its client. It + SHOULD store the entire received response in its cache if that is + consistent with its cache allocation policies. + +14.36 Referer + + The Referer[sic] request-header field allows the client to specify, + for the server's benefit, the address (URI) of the resource from + which the Request-URI was obtained (the "referrer", although the + header field is misspelled.) The Referer request-header allows a + server to generate lists of back-links to resources for interest, + logging, optimized caching, etc. It also allows obsolete or mistyped + links to be traced for maintenance. The Referer field MUST NOT be + sent if the Request-URI was obtained from a source that does not have + its own URI, such as input from the user keyboard. + + Referer = "Referer" ":" ( absoluteURI | relativeURI ) + + Example: + + Referer: http://www.w3.org/hypertext/DataSources/Overview.html + + + + +Fielding, et al. 
Standards Track [Page 140] + +RFC 2616 HTTP/1.1 June 1999 + + + If the field value is a relative URI, it SHOULD be interpreted + relative to the Request-URI. The URI MUST NOT include a fragment. See + section 15.1.3 for security considerations. + +14.37 Retry-After + + The Retry-After response-header field can be used with a 503 (Service + Unavailable) response to indicate how long the service is expected to + be unavailable to the requesting client. This field MAY also be used + with any 3xx (Redirection) response to indicate the minimum time the + user-agent is asked wait before issuing the redirected request. The + value of this field can be either an HTTP-date or an integer number + of seconds (in decimal) after the time of the response. + + Retry-After = "Retry-After" ":" ( HTTP-date | delta-seconds ) + + Two examples of its use are + + Retry-After: Fri, 31 Dec 1999 23:59:59 GMT + Retry-After: 120 + + In the latter example, the delay is 2 minutes. + +14.38 Server + + The Server response-header field contains information about the + software used by the origin server to handle the request. The field + can contain multiple product tokens (section 3.8) and comments + identifying the server and any significant subproducts. The product + tokens are listed in order of their significance for identifying the + application. + + Server = "Server" ":" 1*( product | comment ) + + Example: + + Server: CERN/3.0 libwww/2.17 + + If the response is being forwarded through a proxy, the proxy + application MUST NOT modify the Server response-header. Instead, it + SHOULD include a Via field (as described in section 14.45). + + Note: Revealing the specific software version of the server might + allow the server machine to become more vulnerable to attacks + against software that is known to contain security holes. Server + implementors are encouraged to make this field a configurable + option. + + + + +Fielding, et al. 
Standards Track [Page 141] + +RFC 2616 HTTP/1.1 June 1999 + + +14.39 TE + + The TE request-header field indicates what extension transfer-codings + it is willing to accept in the response and whether or not it is + willing to accept trailer fields in a chunked transfer-coding. Its + value may consist of the keyword "trailers" and/or a comma-separated + list of extension transfer-coding names with optional accept + parameters (as described in section 3.6). + + TE = "TE" ":" #( t-codings ) + t-codings = "trailers" | ( transfer-extension [ accept-params ] ) + + The presence of the keyword "trailers" indicates that the client is + willing to accept trailer fields in a chunked transfer-coding, as + defined in section 3.6.1. This keyword is reserved for use with + transfer-coding values even though it does not itself represent a + transfer-coding. + + Examples of its use are: + + TE: deflate + TE: + TE: trailers, deflate;q=0.5 + + The TE header field only applies to the immediate connection. + Therefore, the keyword MUST be supplied within a Connection header + field (section 14.10) whenever TE is present in an HTTP/1.1 message. + + A server tests whether a transfer-coding is acceptable, according to + a TE field, using these rules: + + 1. The "chunked" transfer-coding is always acceptable. If the + keyword "trailers" is listed, the client indicates that it is + willing to accept trailer fields in the chunked response on + behalf of itself and any downstream clients. The implication is + that, if given, the client is stating that either all + downstream clients are willing to accept trailer fields in the + forwarded response, or that it will attempt to buffer the + response on behalf of downstream recipients. + + Note: HTTP/1.1 does not define any means to limit the size of a + chunked response such that a client can be assured of buffering + the entire response. + + 2. 
If the transfer-coding being tested is one of the transfer- + codings listed in the TE field, then it is acceptable unless it + is accompanied by a qvalue of 0. (As defined in section 3.9, a + qvalue of 0 means "not acceptable.") + + + +Fielding, et al. Standards Track [Page 142] + +RFC 2616 HTTP/1.1 June 1999 + + + 3. If multiple transfer-codings are acceptable, then the + acceptable transfer-coding with the highest non-zero qvalue is + preferred. The "chunked" transfer-coding always has a qvalue + of 1. + + If the TE field-value is empty or if no TE field is present, the only + transfer-coding is "chunked". A message with no transfer-coding is + always acceptable. + +14.40 Trailer + + The Trailer general field value indicates that the given set of + header fields is present in the trailer of a message encoded with + chunked transfer-coding. + + Trailer = "Trailer" ":" 1#field-name + + An HTTP/1.1 message SHOULD include a Trailer header field in a + message using chunked transfer-coding with a non-empty trailer. Doing + so allows the recipient to know which header fields to expect in the + trailer. + + If no Trailer header field is present, the trailer SHOULD NOT include + any header fields. See section 3.6.1 for restrictions on the use of + trailer fields in a "chunked" transfer-coding. + + Message header fields listed in the Trailer header field MUST NOT + include the following header fields: + + . Transfer-Encoding + + . Content-Length + + . Trailer + +14.41 Transfer-Encoding + + The Transfer-Encoding general-header field indicates what (if any) + type of transformation has been applied to the message body in order + to safely transfer it between the sender and the recipient. This + differs from the content-coding in that the transfer-coding is a + property of the message, not of the entity. + + Transfer-Encoding = "Transfer-Encoding" ":" 1#transfer-coding + + Transfer-codings are defined in section 3.6. 
An example is: + + Transfer-Encoding: chunked + + + +Fielding, et al. Standards Track [Page 143] + +RFC 2616 HTTP/1.1 June 1999 + + + If multiple encodings have been applied to an entity, the transfer- + codings MUST be listed in the order in which they were applied. + Additional information about the encoding parameters MAY be provided + by other entity-header fields not defined by this specification. + + Many older HTTP/1.0 applications do not understand the Transfer- + Encoding header. + +14.42 Upgrade + + The Upgrade general-header allows the client to specify what + additional communication protocols it supports and would like to use + if the server finds it appropriate to switch protocols. The server + MUST use the Upgrade header field within a 101 (Switching Protocols) + response to indicate which protocol(s) are being switched. + + Upgrade = "Upgrade" ":" 1#product + + For example, + + Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11 + + The Upgrade header field is intended to provide a simple mechanism + for transition from HTTP/1.1 to some other, incompatible protocol. It + does so by allowing the client to advertise its desire to use another + protocol, such as a later version of HTTP with a higher major version + number, even though the current request has been made using HTTP/1.1. + This eases the difficult transition between incompatible protocols by + allowing the client to initiate a request in the more commonly + supported protocol while indicating to the server that it would like + to use a "better" protocol if available (where "better" is determined + by the server, possibly according to the nature of the method and/or + resource being requested). + + The Upgrade header field only applies to switching application-layer + protocols upon the existing transport-layer connection. Upgrade + cannot be used to insist on a protocol change; its acceptance and use + by the server is optional. 
The capabilities and nature of the + application-layer communication after the protocol change is entirely + dependent upon the new protocol chosen, although the first action + after changing the protocol MUST be a response to the initial HTTP + request containing the Upgrade header field. + + The Upgrade header field only applies to the immediate connection. + Therefore, the upgrade keyword MUST be supplied within a Connection + header field (section 14.10) whenever Upgrade is present in an + HTTP/1.1 message. + + + + +Fielding, et al. Standards Track [Page 144] + +RFC 2616 HTTP/1.1 June 1999 + + + The Upgrade header field cannot be used to indicate a switch to a + protocol on a different connection. For that purpose, it is more + appropriate to use a 301, 302, 303, or 305 redirection response. + + This specification only defines the protocol name "HTTP" for use by + the family of Hypertext Transfer Protocols, as defined by the HTTP + version rules of section 3.1 and future updates to this + specification. Any token can be used as a protocol name; however, it + will only be useful if both the client and server associate the name + with the same protocol. + +14.43 User-Agent + + The User-Agent request-header field contains information about the + user agent originating the request. This is for statistical purposes, + the tracing of protocol violations, and automated recognition of user + agents for the sake of tailoring responses to avoid particular user + agent limitations. User agents SHOULD include this field with + requests. The field can contain multiple product tokens (section 3.8) + and comments identifying the agent and any subproducts which form a + significant part of the user agent. By convention, the product tokens + are listed in order of their significance for identifying the + application. 
+ + User-Agent = "User-Agent" ":" 1*( product | comment ) + + Example: + + User-Agent: CERN-LineMode/2.15 libwww/2.17b3 + +14.44 Vary + + The Vary field value indicates the set of request-header fields that + fully determines, while the response is fresh, whether a cache is + permitted to use the response to reply to a subsequent request + without revalidation. For uncacheable or stale responses, the Vary + field value advises the user agent about the criteria that were used + to select the representation. A Vary field value of "*" implies that + a cache cannot determine from the request headers of a subsequent + request whether this response is the appropriate representation. See + section 13.6 for use of the Vary header field by caches. + + Vary = "Vary" ":" ( "*" | 1#field-name ) + + An HTTP/1.1 server SHOULD include a Vary header field with any + cacheable response that is subject to server-driven negotiation. + Doing so allows a cache to properly interpret future requests on that + resource and informs the user agent about the presence of negotiation + + + +Fielding, et al. Standards Track [Page 145] + +RFC 2616 HTTP/1.1 June 1999 + + + on that resource. A server MAY include a Vary header field with a + non-cacheable response that is subject to server-driven negotiation, + since this might provide the user agent with useful information about + the dimensions over which the response varies at the time of the + response. + + A Vary field value consisting of a list of field-names signals that + the representation selected for the response is based on a selection + algorithm which considers ONLY the listed request-header field values + in selecting the most appropriate representation. A cache MAY assume + that the same selection will be made for future requests with the + same values for the listed field names, for the duration of time for + which the response is fresh. 
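The cache behaviour described for Vary can be sketched as a simple comparison of the listed request-header values. This is a minimal Python sketch, not the algorithm of any particular cache: the function name `vary_allows_reuse` and the dictionary representation of headers are assumptions, values are compared by exact match after trimming whitespace, and freshness is assumed to be checked elsewhere.

```python
def vary_allows_reuse(vary_value, stored_request_headers, new_request_headers):
    """Decide whether a stored response may answer a new request.

    vary_value             -- Vary field value from the stored response
    stored_request_headers -- headers of the request that produced it (dict)
    new_request_headers    -- headers of the incoming request (dict)

    Returns False for "*", since the cache cannot tell from the request
    headers alone whether the stored representation is appropriate.
    """
    names = [name.strip().lower() for name in vary_value.split(",") if name.strip()]
    if "*" in names:
        return False

    def normalise(headers):
        # Field names are compared case-insensitively.
        return {key.lower(): value.strip() for key, value in headers.items()}

    stored = normalise(stored_request_headers)
    new = normalise(new_request_headers)
    # Every listed field must carry the same value (or be absent) in both.
    return all(stored.get(name) == new.get(name) for name in names)
```

A response stored with `Vary: Accept-Language` for a request carrying `Accept-Language: en` would be reusable for another request with the same value, but not for one carrying `Accept-Language: fr`.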
+ + The field-names given are not limited to the set of standard + request-header fields defined by this specification. Field names are + case-insensitive. + + A Vary field value of "*" signals that unspecified parameters not + limited to the request-headers (e.g., the network address of the + client), play a role in the selection of the response representation. + The "*" value MUST NOT be generated by a proxy server; it may only be + generated by an origin server. + +14.45 Via + + The Via general-header field MUST be used by gateways and proxies to + indicate the intermediate protocols and recipients between the user + agent and the server on requests, and between the origin server and + the client on responses. It is analogous to the "Received" field of + RFC 822 [9] and is intended to be used for tracking message forwards, + avoiding request loops, and identifying the protocol capabilities of + all senders along the request/response chain. + + Via = "Via" ":" 1#( received-protocol received-by [ comment ] ) + received-protocol = [ protocol-name "/" ] protocol-version + protocol-name = token + protocol-version = token + received-by = ( host [ ":" port ] ) | pseudonym + pseudonym = token + + The received-protocol indicates the protocol version of the message + received by the server or client along each segment of the + request/response chain. The received-protocol version is appended to + the Via field value when the message is forwarded so that information + about the protocol capabilities of upstream applications remains + visible to all recipients. + + + + +Fielding, et al. Standards Track [Page 146] + +RFC 2616 HTTP/1.1 June 1999 + + + The protocol-name is optional if and only if it would be "HTTP". The + received-by field is normally the host and optional port number of a + recipient server or client that subsequently forwarded the message. + However, if the real host is considered to be sensitive information, + it MAY be replaced by a pseudonym. 
If the port is not given, it MAY + be assumed to be the default port of the received-protocol. + + Multiple Via field values represents each proxy or gateway that has + forwarded the message. Each recipient MUST append its information + such that the end result is ordered according to the sequence of + forwarding applications. + + Comments MAY be used in the Via header field to identify the software + of the recipient proxy or gateway, analogous to the User-Agent and + Server header fields. However, all comments in the Via field are + optional and MAY be removed by any recipient prior to forwarding the + message. + + For example, a request message could be sent from an HTTP/1.0 user + agent to an internal proxy code-named "fred", which uses HTTP/1.1 to + forward the request to a public proxy at nowhere.com, which completes + the request by forwarding it to the origin server at www.ics.uci.edu. + The request received by www.ics.uci.edu would then have the following + Via header field: + + Via: 1.0 fred, 1.1 nowhere.com (Apache/1.1) + + Proxies and gateways used as a portal through a network firewall + SHOULD NOT, by default, forward the names and ports of hosts within + the firewall region. This information SHOULD only be propagated if + explicitly enabled. If not enabled, the received-by host of any host + behind the firewall SHOULD be replaced by an appropriate pseudonym + for that host. + + For organizations that have strong privacy requirements for hiding + internal structures, a proxy MAY combine an ordered subsequence of + Via header field entries with identical received-protocol values into + a single such entry. For example, + + Via: 1.0 ricky, 1.1 ethel, 1.1 fred, 1.0 lucy + + could be collapsed to + + Via: 1.0 ricky, 1.1 mertz, 1.0 lucy + + + + + + + +Fielding, et al. 
Standards Track [Page 147] + +RFC 2616 HTTP/1.1 June 1999 + + + Applications SHOULD NOT combine multiple entries unless they are all + under the same organizational control and the hosts have already been + replaced by pseudonyms. Applications MUST NOT combine entries which + have different received-protocol values. + +14.46 Warning + + The Warning general-header field is used to carry additional + information about the status or transformation of a message which + might not be reflected in the message. This information is typically + used to warn about a possible lack of semantic transparency from + caching operations or transformations applied to the entity body of + the message. + + Warning headers are sent with responses using: + + Warning = "Warning" ":" 1#warning-value + + warning-value = warn-code SP warn-agent SP warn-text + [SP warn-date] + + warn-code = 3DIGIT + warn-agent = ( host [ ":" port ] ) | pseudonym + ; the name or pseudonym of the server adding + ; the Warning header, for use in debugging + warn-text = quoted-string + warn-date = <"> HTTP-date <"> + + A response MAY carry more than one Warning header. + + The warn-text SHOULD be in a natural language and character set that + is most likely to be intelligible to the human user receiving the + response. This decision MAY be based on any available knowledge, such + as the location of the cache or user, the Accept-Language field in a + request, the Content-Language field in a response, etc. The default + language is English and the default character set is ISO-8859-1. + + If a character set other than ISO-8859-1 is used, it MUST be encoded + in the warn-text using the method described in RFC 2047 [14]. + + Warning headers can in general be applied to any message, however + some specific warn-codes are specific to caches and can only be + applied to response messages. New Warning headers SHOULD be added + after any existing Warning headers. 
A cache MUST NOT delete any + Warning header that it received with a message. However, if a cache + successfully validates a cache entry, it SHOULD remove any Warning + headers previously attached to that entry except as specified for + + + + +Fielding, et al. Standards Track [Page 148] + +RFC 2616 HTTP/1.1 June 1999 + + + specific Warning codes. It MUST then add any Warning headers received + in the validating response. In other words, Warning headers are those + that would be attached to the most recent relevant response. + + When multiple Warning headers are attached to a response, the user + agent ought to inform the user of as many of them as possible, in the + order that they appear in the response. If it is not possible to + inform the user of all of the warnings, the user agent SHOULD follow + these heuristics: + + - Warnings that appear early in the response take priority over + those appearing later in the response. + + - Warnings in the user's preferred character set take priority + over warnings in other character sets but with identical warn- + codes and warn-agents. + + Systems that generate multiple Warning headers SHOULD order them with + this user agent behavior in mind. + + Requirements for the behavior of caches with respect to Warnings are + stated in section 13.1.2. + + This is a list of the currently-defined warn-codes, each with a + recommended warn-text in English, and a description of its meaning. + + 110 Response is stale + MUST be included whenever the returned response is stale. + + 111 Revalidation failed + MUST be included if a cache returns a stale response because an + attempt to revalidate the response failed, due to an inability to + reach the server. + + 112 Disconnected operation + SHOULD be included if the cache is intentionally disconnected from + the rest of the network for a period of time. 
+ + 113 Heuristic expiration + MUST be included if the cache heuristically chose a freshness + lifetime greater than 24 hours and the response's age is greater + than 24 hours. + + 199 Miscellaneous warning + The warning text MAY include arbitrary information to be presented + to a human user, or logged. A system receiving this warning MUST + NOT take any automated action, besides presenting the warning to + the user. + + + +Fielding, et al. Standards Track [Page 149] + +RFC 2616 HTTP/1.1 June 1999 + + + 214 Transformation applied + MUST be added by an intermediate cache or proxy if it applies any + transformation changing the content-coding (as specified in the + Content-Encoding header) or media-type (as specified in the + Content-Type header) of the response, or the entity-body of the + response, unless this Warning code already appears in the response. + + 299 Miscellaneous persistent warning + The warning text MAY include arbitrary information to be presented + to a human user, or logged. A system receiving this warning MUST + NOT take any automated action. + + If an implementation sends a message with one or more Warning headers + whose version is HTTP/1.0 or lower, then the sender MUST include in + each warning-value a warn-date that matches the date in the response. + + If an implementation receives a message with a warning-value that + includes a warn-date, and that warn-date is different from the Date + value in the response, then that warning-value MUST be deleted from + the message before storing, forwarding, or using it. (This prevents + bad consequences of naive caching of Warning header fields.) If all + of the warning-values are deleted for this reason, the Warning header + MUST be deleted as well. + +14.47 WWW-Authenticate + + The WWW-Authenticate response-header field MUST be included in 401 + (Unauthorized) response messages. 
The field value consists of at + least one challenge that indicates the authentication scheme(s) and + parameters applicable to the Request-URI. + + WWW-Authenticate = "WWW-Authenticate" ":" 1#challenge + + The HTTP access authentication process is described in "HTTP + Authentication: Basic and Digest Access Authentication" [43]. User + agents are advised to take special care in parsing the WWW- + Authenticate field value as it might contain more than one challenge, + or if more than one WWW-Authenticate header field is provided, the + contents of a challenge itself can contain a comma-separated list of + authentication parameters. + +15 Security Considerations + + This section is meant to inform application developers, information + providers, and users of the security limitations in HTTP/1.1 as + described by this document. The discussion does not include + definitive solutions to the problems revealed, though it does make + some suggestions for reducing security risks. + + + +Fielding, et al. Standards Track [Page 150] + +RFC 2616 HTTP/1.1 June 1999 + + +15.1 Personal Information + + HTTP clients are often privy to large amounts of personal information + (e.g. the user's name, location, mail address, passwords, encryption + keys, etc.), and SHOULD be very careful to prevent unintentional + leakage of this information via the HTTP protocol to other sources. + We very strongly recommend that a convenient interface be provided + for the user to control dissemination of such information, and that + designers and implementors be particularly careful in this area. + History shows that errors in this area often create serious security + and/or privacy problems and generate highly adverse publicity for the + implementor's company. + +15.1.1 Abuse of Server Log Information + + A server is in the position to save personal data about a user's + requests which might identify their reading patterns or subjects of + interest. 
This information is clearly confidential in nature and its + handling can be constrained by law in certain countries. People using + the HTTP protocol to provide data are responsible for ensuring that + such material is not distributed without the permission of any + individuals that are identifiable by the published results. + +15.1.2 Transfer of Sensitive Information + + Like any generic data transfer protocol, HTTP cannot regulate the + content of the data that is transferred, nor is there any a priori + method of determining the sensitivity of any particular piece of + information within the context of any given request. Therefore, + applications SHOULD supply as much control over this information as + possible to the provider of that information. Four header fields are + worth special mention in this context: Server, Via, Referer and From. + + Revealing the specific software version of the server might allow the + server machine to become more vulnerable to attacks against software + that is known to contain security holes. Implementors SHOULD make the + Server header field a configurable option. + + Proxies which serve as a portal through a network firewall SHOULD + take special precautions regarding the transfer of header information + that identifies the hosts behind the firewall. In particular, they + SHOULD remove, or replace with sanitized versions, any Via fields + generated behind the firewall. + + The Referer header allows reading patterns to be studied and reverse + links drawn. Although it can be very useful, its power can be abused + if user details are not separated from the information contained in + + + + +Fielding, et al. Standards Track [Page 151] + +RFC 2616 HTTP/1.1 June 1999 + + + the Referer. Even when the personal information has been removed, the + Referer header might indicate a private document's URI whose + publication would be inappropriate. 
+ + The information sent in the From field might conflict with the user's + privacy interests or their site's security policy, and hence it + SHOULD NOT be transmitted without the user being able to disable, + enable, and modify the contents of the field. The user MUST be able + to set the contents of this field within a user preference or + application defaults configuration. + + We suggest, though do not require, that a convenient toggle interface + be provided for the user to enable or disable the sending of From and + Referer information. + + The User-Agent (section 14.43) or Server (section 14.38) header + fields can sometimes be used to determine that a specific client or + server have a particular security hole which might be exploited. + Unfortunately, this same information is often used for other valuable + purposes for which HTTP currently has no better mechanism. + +15.1.3 Encoding Sensitive Information in URI's + + Because the source of a link might be private information or might + reveal an otherwise private information source, it is strongly + recommended that the user be able to select whether or not the + Referer field is sent. For example, a browser client could have a + toggle switch for browsing openly/anonymously, which would + respectively enable/disable the sending of Referer and From + information. + + Clients SHOULD NOT include a Referer header field in a (non-secure) + HTTP request if the referring page was transferred with a secure + protocol. + + Authors of services which use the HTTP protocol SHOULD NOT use GET + based forms for the submission of sensitive data, because this will + cause this data to be encoded in the Request-URI. Many existing + servers, proxies, and user agents will log the request URI in some + place where it might be visible to third parties. 
Servers can use + POST-based form submission instead + +15.1.4 Privacy Issues Connected to Accept Headers + + Accept request-headers can reveal information about the user to all + servers which are accessed. The Accept-Language header in particular + can reveal information the user would consider to be of a private + nature, because the understanding of particular languages is often + + + +Fielding, et al. Standards Track [Page 152] + +RFC 2616 HTTP/1.1 June 1999 + + + strongly correlated to the membership of a particular ethnic group. + User agents which offer the option to configure the contents of an + Accept-Language header to be sent in every request are strongly + encouraged to let the configuration process include a message which + makes the user aware of the loss of privacy involved. + + An approach that limits the loss of privacy would be for a user agent + to omit the sending of Accept-Language headers by default, and to ask + the user whether or not to start sending Accept-Language headers to a + server if it detects, by looking for any Vary response-header fields + generated by the server, that such sending could improve the quality + of service. + + Elaborate user-customized accept header fields sent in every request, + in particular if these include quality values, can be used by servers + as relatively reliable and long-lived user identifiers. Such user + identifiers would allow content providers to do click-trail tracking, + and would allow collaborating content providers to match cross-server + click-trails or form submissions of individual users. Note that for + many users not behind a proxy, the network address of the host + running the user agent will also serve as a long-lived user + identifier. In environments where proxies are used to enhance + privacy, user agents ought to be conservative in offering accept + header configuration options to end users. 
As an extreme privacy + measure, proxies could filter the accept headers in relayed requests. + General purpose user agents which provide a high degree of header + configurability SHOULD warn users about the loss of privacy which can + be involved. + +15.2 Attacks Based On File and Path Names + + Implementations of HTTP origin servers SHOULD be careful to restrict + the documents returned by HTTP requests to be only those that were + intended by the server administrators. If an HTTP server translates + HTTP URIs directly into file system calls, the server MUST take + special care not to serve files that were not intended to be + delivered to HTTP clients. For example, UNIX, Microsoft Windows, and + other operating systems use ".." as a path component to indicate a + directory level above the current one. On such a system, an HTTP + server MUST disallow any such construct in the Request-URI if it + would otherwise allow access to a resource outside those intended to + be accessible via the HTTP server. Similarly, files intended for + reference only internally to the server (such as access control + files, configuration files, and script code) MUST be protected from + inappropriate retrieval, since they might contain sensitive + information. Experience has shown that minor bugs in such HTTP server + implementations have turned into security risks. + + + + +Fielding, et al. Standards Track [Page 153] + +RFC 2616 HTTP/1.1 June 1999 + + +15.3 DNS Spoofing + + Clients using HTTP rely heavily on the Domain Name Service, and are + thus generally prone to security attacks based on the deliberate + mis-association of IP addresses and DNS names. Clients need to be + cautious in assuming the continuing validity of an IP number/DNS name + association. + + In particular, HTTP clients SHOULD rely on their name resolver for + confirmation of an IP number/DNS name association, rather than + caching the result of previous host name lookups. 
Many platforms + already can cache host name lookups locally when appropriate, and + they SHOULD be configured to do so. It is proper for these lookups to + be cached, however, only when the TTL (Time To Live) information + reported by the name server makes it likely that the cached + information will remain useful. + + If HTTP clients cache the results of host name lookups in order to + achieve a performance improvement, they MUST observe the TTL + information reported by DNS. + + If HTTP clients do not observe this rule, they could be spoofed when + a previously-accessed server's IP address changes. As network + renumbering is expected to become increasingly common [24], the + possibility of this form of attack will grow. Observing this + requirement thus reduces this potential security vulnerability. + + This requirement also improves the load-balancing behavior of clients + for replicated servers using the same DNS name and reduces the + likelihood of a user's experiencing failure in accessing sites which + use that strategy. + +15.4 Location Headers and Spoofing + + If a single server supports multiple organizations that do not trust + one another, then it MUST check the values of Location and Content- + Location headers in responses that are generated under control of + said organizations to make sure that they do not attempt to + invalidate resources over which they have no authority. + +15.5 Content-Disposition Issues + + RFC 1806 [35], from which the often implemented Content-Disposition + (see section 19.5.1) header in HTTP is derived, has a number of very + serious security considerations. Content-Disposition is not part of + the HTTP standard, but since it is widely implemented, we are + documenting its use and risks for implementors. See RFC 2183 [49] + (which updates RFC 1806) for details. + + + +Fielding, et al. 
Standards Track [Page 154] + +RFC 2616 HTTP/1.1 June 1999 + + +15.6 Authentication Credentials and Idle Clients + + Existing HTTP clients and user agents typically retain authentication + information indefinitely. HTTP/1.1 does not provide a method for a + server to direct clients to discard these cached credentials. This is + a significant defect that requires further extensions to HTTP. + Circumstances under which credential caching can interfere with the + application's security model include but are not limited to: + + - Clients which have been idle for an extended period following + which the server might wish to cause the client to reprompt the + user for credentials. + + - Applications which include a session termination indication + (such as a `logout' or `commit' button on a page) after which + the server side of the application `knows' that there is no + further reason for the client to retain the credentials. + + This is currently under separate study. There are a number of + workarounds to parts of this problem, and we encourage the use of + password protection in screen savers, idle time-outs, and other + methods which mitigate the security problems inherent in + credential caching. In particular, user agents which cache credentials are + encouraged to provide a readily accessible mechanism for discarding + cached credentials under user control. + +15.7 Proxies and Caching + + By their very nature, HTTP proxies are men-in-the-middle, and + represent an opportunity for man-in-the-middle attacks. Compromise of + the systems on which the proxies run can result in serious security + and privacy problems. Proxies have access to security-related + information, personal information about individual users and + organizations, and proprietary information belonging to users and + content providers.
A compromised proxy, or a proxy implemented or + configured without regard to security and privacy considerations, + might be used in the commission of a wide range of potential attacks. + + Proxy operators should protect the systems on which proxies run as + they would protect any system that contains or transports sensitive + information. In particular, log information gathered at proxies often + contains highly sensitive personal information, and/or information + about organizations. Log information should be carefully guarded, and + appropriate guidelines for use developed and followed. (Section + 15.1.1). + + + + + + +Fielding, et al. Standards Track [Page 155] + +RFC 2616 HTTP/1.1 June 1999 + + + Caching proxies provide additional potential vulnerabilities, since + the contents of the cache represent an attractive target for + malicious exploitation. Because cache contents persist after an HTTP + request is complete, an attack on the cache can reveal information + long after a user believes that the information has been removed from + the network. Therefore, cache contents should be protected as + sensitive information. + + Proxy implementors should consider the privacy and security + implications of their design and coding decisions, and of the + configuration options they provide to proxy operators (especially the + default configuration). + + Users of a proxy need to be aware that they are no trustworthier than + the people who run the proxy; HTTP itself cannot solve this problem. + + The judicious use of cryptography, when appropriate, may suffice to + protect against a broad range of security and privacy attacks. Such + cryptography is beyond the scope of the HTTP/1.1 specification. + +15.7.1 Denial of Service Attacks on Proxies + + They exist. They are hard to defend against. Research continues. + Beware. + +16 Acknowledgments + + This specification makes heavy use of the augmented BNF and generic + constructs defined by David H. 
Crocker for RFC 822 [9]. Similarly, it + reuses many of the definitions provided by Nathaniel Borenstein and + Ned Freed for MIME [7]. We hope that their inclusion in this + specification will help reduce past confusion over the relationship + between HTTP and Internet mail message formats. + + The HTTP protocol has evolved considerably over the years. It has + benefited from a large and active developer community--the many + people who have participated on the www-talk mailing list--and it is + that community which has been most responsible for the success of + HTTP and of the World-Wide Web in general. Marc Andreessen, Robert + Cailliau, Daniel W. Connolly, Bob Denny, John Franks, Jean-Francois + Groff, Phillip M. Hallam-Baker, Hakon W. Lie, Ari Luotonen, Rob + McCool, Lou Montulli, Dave Raggett, Tony Sanders, and Marc + VanHeyningen deserve special recognition for their efforts in + defining early aspects of the protocol. + + This document has benefited greatly from the comments of all those + participating in the HTTP-WG. In addition to those already mentioned, + the following individuals have contributed to this specification: + + + +Fielding, et al. Standards Track [Page 156] + +RFC 2616 HTTP/1.1 June 1999 + + + Gary Adams Ross Patterson + Harald Tveit Alvestrand Albert Lunde + Keith Ball John C. Mallery + Brian Behlendorf Jean-Philippe Martin-Flatin + Paul Burchard Mitra + Maurizio Codogno David Morris + Mike Cowlishaw Gavin Nicol + Roman Czyborra Bill Perry + Michael A. Dolan Jeffrey Perry + David J. Fiander Scott Powers + Alan Freier Owen Rees + Marc Hedlund Luigi Rizzo + Greg Herlihy David Robinson + Koen Holtman Marc Salomon + Alex Hopmann Rich Salz + Bob Jernigan Allan M. Schiffman + Shel Kaphan Jim Seidman + Rohit Khare Chuck Shotton + John Klensin Eric W. Sink + Martijn Koster Simon E. Spero + Alexei Kosut Richard N. Taylor + David M. Kristol Robert S. Thau + Daniel LaLiberte Bill (BearHeart) Weinman + Ben Laurie Francois Yergeau + Paul J. 
Leach Mary Ellen Zurko + Daniel DuBois Josh Cohen + + + Much of the content and presentation of the caching design is due to + suggestions and comments from individuals including: Shel Kaphan, + Paul Leach, Koen Holtman, David Morris, and Larry Masinter. + + Most of the specification of ranges is based on work originally done + by Ari Luotonen and John Franks, with additional input from Steve + Zilles. + + Thanks to the "cave men" of Palo Alto. You know who you are. + + Jim Gettys (the current editor of this document) wishes particularly + to thank Roy Fielding, the previous editor of this document, along + with John Klensin, Jeff Mogul, Paul Leach, Dave Kristol, Koen + Holtman, John Franks, Josh Cohen, Alex Hopmann, Scott Lawrence, and + Larry Masinter for their help. And thanks go particularly to Jeff + Mogul and Scott Lawrence for performing the "MUST/MAY/SHOULD" audit. + + + + + + + +Fielding, et al. Standards Track [Page 157] + +RFC 2616 HTTP/1.1 June 1999 + + + The Apache Group, Anselm Baird-Smith, author of Jigsaw, and Henrik + Frystyk implemented RFC 2068 early, and we wish to thank them for the + discovery of many of the problems that this document attempts to + rectify. + +17 References + + [1] Alvestrand, H., "Tags for the Identification of Languages", RFC + 1766, March 1995. + + [2] Anklesaria, F., McCahill, M., Lindner, P., Johnson, D., Torrey, + D. and B. Alberti, "The Internet Gopher Protocol (a distributed + document search and retrieval protocol)", RFC 1436, March 1993. + + [3] Berners-Lee, T., "Universal Resource Identifiers in WWW", RFC + 1630, June 1994. + + [4] Berners-Lee, T., Masinter, L. and M. McCahill, "Uniform Resource + Locators (URL)", RFC 1738, December 1994. + + [5] Berners-Lee, T. and D. Connolly, "Hypertext Markup Language - + 2.0", RFC 1866, November 1995. + + [6] Berners-Lee, T., Fielding, R. and H. Frystyk, "Hypertext Transfer + Protocol -- HTTP/1.0", RFC 1945, May 1996. + + [7] Freed, N. and N. 
Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part One: Format of Internet Message Bodies", + RFC 2045, November 1996. + + [8] Braden, R., "Requirements for Internet Hosts -- Communication + Layers", STD 3, RFC 1123, October 1989. + + [9] Crocker, D., "Standard for The Format of ARPA Internet Text + Messages", STD 11, RFC 822, August 1982. + + [10] Davis, F., Kahle, B., Morris, H., Salem, J., Shen, T., Wang, R., + Sui, J., and M. Grinbaum, "WAIS Interface Protocol Prototype + Functional Specification," (v1.5), Thinking Machines + Corporation, April 1990. + + [11] Fielding, R., "Relative Uniform Resource Locators", RFC 1808, + June 1995. + + [12] Horton, M. and R. Adams, "Standard for Interchange of USENET + Messages", RFC 1036, December 1987. + + + + + +Fielding, et al. Standards Track [Page 158] + +RFC 2616 HTTP/1.1 June 1999 + + + [13] Kantor, B. and P. Lapsley, "Network News Transfer Protocol", RFC + 977, February 1986. + + [14] Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part + Three: Message Header Extensions for Non-ASCII Text", RFC 2047, + November 1996. + + [15] Nebel, E. and L. Masinter, "Form-based File Upload in HTML", RFC + 1867, November 1995. + + [16] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC 821, + August 1982. + + [17] Postel, J., "Media Type Registration Procedure", RFC 1590, + November 1996. + + [18] Postel, J. and J. Reynolds, "File Transfer Protocol", STD 9, RFC + 959, October 1985. + + [19] Reynolds, J. and J. Postel, "Assigned Numbers", STD 2, RFC 1700, + October 1994. + + [20] Sollins, K. and L. Masinter, "Functional Requirements for + Uniform Resource Names", RFC 1737, December 1994. + + [21] US-ASCII. Coded Character Set - 7-Bit American Standard Code for + Information Interchange. Standard ANSI X3.4-1986, ANSI, 1986. + + [22] ISO-8859. International Standard -- Information Processing -- + 8-bit Single-Byte Coded Graphic Character Sets -- + Part 1: Latin alphabet No. 1, ISO-8859-1:1987. 
+ Part 2: Latin alphabet No. 2, ISO-8859-2, 1987. + Part 3: Latin alphabet No. 3, ISO-8859-3, 1988. + Part 4: Latin alphabet No. 4, ISO-8859-4, 1988. + Part 5: Latin/Cyrillic alphabet, ISO-8859-5, 1988. + Part 6: Latin/Arabic alphabet, ISO-8859-6, 1987. + Part 7: Latin/Greek alphabet, ISO-8859-7, 1987. + Part 8: Latin/Hebrew alphabet, ISO-8859-8, 1988. + Part 9: Latin alphabet No. 5, ISO-8859-9, 1990. + + [23] Meyers, J. and M. Rose, "The Content-MD5 Header Field", RFC + 1864, October 1995. + + [24] Carpenter, B. and Y. Rekhter, "Renumbering Needs Work", RFC + 1900, February 1996. + + [25] Deutsch, P., "GZIP file format specification version 4.3", RFC + 1952, May 1996. + + + +Fielding, et al. Standards Track [Page 159] + +RFC 2616 HTTP/1.1 June 1999 + + + [26] Venkata N. Padmanabhan, and Jeffrey C. Mogul. "Improving HTTP + Latency", Computer Networks and ISDN Systems, v. 28, pp. 25-35, + Dec. 1995. Slightly revised version of paper in Proc. 2nd + International WWW Conference '94: Mosaic and the Web, Oct. 1994, + which is available at + http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/DDay/mogul/HTTPLat + ency.html. + + [27] Joe Touch, John Heidemann, and Katia Obraczka. "Analysis of HTTP + Performance", , + ISI Research Report ISI/RR-98-463, (original report dated Aug. + 1996), USC/Information Sciences Institute, August 1998. + + [28] Mills, D., "Network Time Protocol (Version 3) Specification, + Implementation and Analysis", RFC 1305, March 1992. + + [29] Deutsch, P., "DEFLATE Compressed Data Format Specification + version 1.3", RFC 1951, May 1996. + + [30] S. Spero, "Analysis of HTTP Performance Problems," + http://sunsite.unc.edu/mdma-release/http-prob.html. + + [31] Deutsch, P. and J. Gailly, "ZLIB Compressed Data Format + Specification version 3.3", RFC 1950, May 1996. + + [32] Franks, J., Hallam-Baker, P., Hostetler, J., Leach, P., + Luotonen, A., Sink, E. and L. Stewart, "An Extension to HTTP: + Digest Access Authentication", RFC 2069, January 1997. 
+ + [33] Fielding, R., Gettys, J., Mogul, J., Frystyk, H. and T. + Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC + 2068, January 1997. + + [34] Bradner, S., "Key words for use in RFCs to Indicate Requirement + Levels", BCP 14, RFC 2119, March 1997. + + [35] Troost, R. and Dorner, S., "Communicating Presentation + Information in Internet Messages: The Content-Disposition + Header", RFC 1806, June 1995. + + [36] Mogul, J., Fielding, R., Gettys, J. and H. Frystyk, "Use and + Interpretation of HTTP Version Numbers", RFC 2145, May 1997. + [jg639] + + [37] Palme, J., "Common Internet Message Headers", RFC 2076, February + 1997. [jg640] + + + + + +Fielding, et al. Standards Track [Page 160] + +RFC 2616 HTTP/1.1 June 1999 + + + [38] Yergeau, F., "UTF-8, a transformation format of Unicode and + ISO-10646", RFC 2279, January 1998. [jg641] + + [39] Nielsen, H.F., Gettys, J., Baird-Smith, A., Prud'hommeaux, E., + Lie, H., and C. Lilley. "Network Performance Effects of + HTTP/1.1, CSS1, and PNG," Proceedings of ACM SIGCOMM '97, Cannes + France, September 1997.[jg642] + + [40] Freed, N. and N. Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part Two: Media Types", RFC 2046, November + 1996. [jg643] + + [41] Alvestrand, H., "IETF Policy on Character Sets and Languages", + BCP 18, RFC 2277, January 1998. [jg644] + + [42] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform Resource + Identifiers (URI): Generic Syntax and Semantics", RFC 2396, + August 1998. [jg645] + + [43] Franks, J., Hallam-Baker, P., Hostetler, J., Lawrence, S., + Leach, P., Luotonen, A., Sink, E. and L. Stewart, "HTTP + Authentication: Basic and Digest Access Authentication", RFC + 2617, June 1999. [jg646] + + [44] Luotonen, A., "Tunneling TCP based protocols through Web proxy + servers," Work in Progress. [jg647] + + [45] Palme, J. and A. Hopmann, "MIME E-mail Encapsulation of + Aggregate Documents, such as HTML (MHTML)", RFC 2110, March + 1997. 
+ + [46] Bradner, S., "The Internet Standards Process -- Revision 3", BCP + 9, RFC 2026, October 1996. + + [47] Masinter, L., "Hyper Text Coffee Pot Control Protocol + (HTCPCP/1.0)", RFC 2324, 1 April 1998. + + [48] Freed, N. and N. Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part Five: Conformance Criteria and Examples", + RFC 2049, November 1996. + + [49] Troost, R., Dorner, S. and K. Moore, "Communicating Presentation + Information in Internet Messages: The Content-Disposition Header + Field", RFC 2183, August 1997. + + + + + + + +Fielding, et al. Standards Track [Page 161] + +RFC 2616 HTTP/1.1 June 1999 + + +18 Authors' Addresses + + Roy T. Fielding + Information and Computer Science + University of California, Irvine + Irvine, CA 92697-3425, USA + + Fax: +1 (949) 824-1715 + EMail: fielding@ics.uci.edu + + + James Gettys + World Wide Web Consortium + MIT Laboratory for Computer Science + 545 Technology Square + Cambridge, MA 02139, USA + + Fax: +1 (617) 258 8682 + EMail: jg@w3.org + + + Jeffrey C. Mogul + Western Research Laboratory + Compaq Computer Corporation + 250 University Avenue + Palo Alto, California, 94305, USA + + EMail: mogul@wrl.dec.com + + + Henrik Frystyk Nielsen + World Wide Web Consortium + MIT Laboratory for Computer Science + 545 Technology Square + Cambridge, MA 02139, USA + + Fax: +1 (617) 258 8682 + EMail: frystyk@w3.org + + + Larry Masinter + Xerox Corporation + 3333 Coyote Hill Road + Palo Alto, CA 94034, USA + + EMail: masinter@parc.xerox.com + + + + + +Fielding, et al. Standards Track [Page 162] + +RFC 2616 HTTP/1.1 June 1999 + + + Paul J. 
Leach + Microsoft Corporation + 1 Microsoft Way + Redmond, WA 98052, USA + + EMail: paulle@microsoft.com + + + Tim Berners-Lee + Director, World Wide Web Consortium + MIT Laboratory for Computer Science + 545 Technology Square + Cambridge, MA 02139, USA + + Fax: +1 (617) 258 8682 + EMail: timbl@w3.org + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Fielding, et al. Standards Track [Page 163] + +RFC 2616 HTTP/1.1 June 1999 + + +19 Appendices + +19.1 Internet Media Type message/http and application/http + + In addition to defining the HTTP/1.1 protocol, this document serves + as the specification for the Internet media type "message/http" and + "application/http". The message/http type can be used to enclose a + single HTTP request or response message, provided that it obeys the + MIME restrictions for all "message" types regarding line length and + encodings. The application/http type can be used to enclose a + pipeline of one or more HTTP request or response messages (not + intermixed). The following is to be registered with IANA [17]. + + Media Type name: message + Media subtype name: http + Required parameters: none + Optional parameters: version, msgtype + version: The HTTP-Version number of the enclosed message + (e.g., "1.1"). If not present, the version can be + determined from the first line of the body. + msgtype: The message type -- "request" or "response". If not + present, the type can be determined from the first + line of the body. + Encoding considerations: only "7bit", "8bit", or "binary" are + permitted + Security considerations: none + + Media Type name: application + Media subtype name: http + Required parameters: none + Optional parameters: version, msgtype + version: The HTTP-Version number of the enclosed messages + (e.g., "1.1"). If not present, the version can be + determined from the first line of the body. + msgtype: The message type -- "request" or "response". 
If not + present, the type can be determined from the first + line of the body. + Encoding considerations: HTTP messages enclosed by this type + are in "binary" format; use of an appropriate + Content-Transfer-Encoding is required when + transmitted via E-mail. + Security considerations: none + + + + + + + + + +Fielding, et al. Standards Track [Page 164] + +RFC 2616 HTTP/1.1 June 1999 + + +19.2 Internet Media Type multipart/byteranges + + When an HTTP 206 (Partial Content) response message includes the + content of multiple ranges (a response to a request for multiple + non-overlapping ranges), these are transmitted as a multipart + message-body. The media type for this purpose is called + "multipart/byteranges". + + The multipart/byteranges media type includes two or more parts, each + with its own Content-Type and Content-Range fields. The required + boundary parameter specifies the boundary string used to separate + each body-part. + + Media Type name: multipart + Media subtype name: byteranges + Required parameters: boundary + Optional parameters: none + Encoding considerations: only "7bit", "8bit", or "binary" are + permitted + Security considerations: none + + + For example: + + HTTP/1.1 206 Partial Content + Date: Wed, 15 Nov 1995 06:25:24 GMT + Last-Modified: Wed, 15 Nov 1995 04:58:08 GMT + Content-type: multipart/byteranges; boundary=THIS_STRING_SEPARATES + + --THIS_STRING_SEPARATES + Content-type: application/pdf + Content-range: bytes 500-999/8000 + + ...the first range... + --THIS_STRING_SEPARATES + Content-type: application/pdf + Content-range: bytes 7000-7999/8000 + + ...the second range + --THIS_STRING_SEPARATES-- + + Notes: + + 1) Additional CRLFs may precede the first boundary string in the + entity. + + + + + + +Fielding, et al. Standards Track [Page 165] + +RFC 2616 HTTP/1.1 June 1999 + + + 2) Although RFC 2046 [40] permits the boundary string to be + quoted, some existing implementations handle a quoted boundary + string incorrectly. 
+ + 3) A number of browsers and servers were coded to an early draft + of the byteranges specification to use a media type of + multipart/x-byteranges, which is almost, but not quite + compatible with the version documented in HTTP/1.1. + +19.3 Tolerant Applications + + Although this document specifies the requirements for the generation + of HTTP/1.1 messages, not all applications will be correct in their + implementation. We therefore recommend that operational applications + be tolerant of deviations whenever those deviations can be + interpreted unambiguously. + + Clients SHOULD be tolerant in parsing the Status-Line and servers + tolerant when parsing the Request-Line. In particular, they SHOULD + accept any amount of SP or HT characters between fields, even though + only a single SP is required. + + The line terminator for message-header fields is the sequence CRLF. + However, we recommend that applications, when parsing such headers, + recognize a single LF as a line terminator and ignore the leading CR. + + The character set of an entity-body SHOULD be labeled as the lowest + common denominator of the character codes used within that body, with + the exception that not labeling the entity is preferred over labeling + the entity with the labels US-ASCII or ISO-8859-1. See section 3.7.1 + and 3.4.1. + + Additional rules for requirements on parsing and encoding of dates + and other potential problems with date encodings include: + + - HTTP/1.1 clients and caches SHOULD assume that an RFC-850 date + which appears to be more than 50 years in the future is in fact + in the past (this helps solve the "year 2000" problem). + + - An HTTP/1.1 implementation MAY internally represent a parsed + Expires date as earlier than the proper value, but MUST NOT + internally represent a parsed Expires date as later than the + proper value. + + - All expiration-related calculations MUST be done in GMT. 
The + local time zone MUST NOT influence the calculation or comparison + of an age or expiration time. + + + + +Fielding, et al. Standards Track [Page 166] + +RFC 2616 HTTP/1.1 June 1999 + + + - If an HTTP header incorrectly carries a date value with a time + zone other than GMT, it MUST be converted into GMT using the + most conservative possible conversion. + +19.4 Differences Between HTTP Entities and RFC 2045 Entities + + HTTP/1.1 uses many of the constructs defined for Internet Mail (RFC + 822 [9]) and the Multipurpose Internet Mail Extensions (MIME [7]) to + allow entities to be transmitted in an open variety of + representations and with extensible mechanisms. However, RFC 2045 + discusses mail, and HTTP has a few features that are different from + those described in RFC 2045. These differences were carefully chosen + to optimize performance over binary connections, to allow greater + freedom in the use of new media types, to make date comparisons + easier, and to acknowledge the practice of some early HTTP servers + and clients. + + This appendix describes specific areas where HTTP differs from RFC + 2045. Proxies and gateways to strict MIME environments SHOULD be + aware of these differences and provide the appropriate conversions + where necessary. Proxies and gateways from MIME environments to HTTP + also need to be aware of the differences because some conversions + might be required. + +19.4.1 MIME-Version + + HTTP is not a MIME-compliant protocol. However, HTTP/1.1 messages MAY + include a single MIME-Version general-header field to indicate what + version of the MIME protocol was used to construct the message. Use + of the MIME-Version header field indicates that the message is in + full compliance with the MIME protocol (as defined in RFC 2045[7]). + Proxies/gateways are responsible for ensuring full compliance (where + possible) when exporting HTTP messages to strict MIME environments. + + MIME-Version = "MIME-Version" ":" 1*DIGIT "." 
1*DIGIT + + MIME version "1.0" is the default for use in HTTP/1.1. However, + HTTP/1.1 message parsing and semantics are defined by this document + and not the MIME specification. + +19.4.2 Conversion to Canonical Form + + RFC 2045 [7] requires that an Internet mail entity be converted to + canonical form prior to being transferred, as described in section 4 + of RFC 2049 [48]. Section 3.7.1 of this document describes the forms + allowed for subtypes of the "text" media type when transmitted over + HTTP. RFC 2046 requires that content with a type of "text" represent + line breaks as CRLF and forbids the use of CR or LF outside of line + + + +Fielding, et al. Standards Track [Page 167] + +RFC 2616 HTTP/1.1 June 1999 + + + break sequences. HTTP allows CRLF, bare CR, and bare LF to indicate a + line break within text content when a message is transmitted over + HTTP. + + Where it is possible, a proxy or gateway from HTTP to a strict MIME + environment SHOULD translate all line breaks within the text media + types described in section 3.7.1 of this document to the RFC 2049 + canonical form of CRLF. Note, however, that this might be complicated + by the presence of a Content-Encoding and by the fact that HTTP + allows the use of some character sets which do not use octets 13 and + 10 to represent CR and LF, as is the case for some multi-byte + character sets. + + Implementors should note that conversion will break any cryptographic + checksums applied to the original content unless the original content + is already in canonical form. Therefore, the canonical form is + recommended for any content that uses such checksums in HTTP. + +19.4.3 Conversion of Date Formats + + HTTP/1.1 uses a restricted set of date formats (section 3.3.1) to + simplify the process of date comparison. Proxies and gateways from + other protocols SHOULD ensure that any Date header field present in a + message conforms to one of the HTTP/1.1 formats and rewrite the date + if necessary. 
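The date rewriting described in section 19.4.3 can be sketched as follows. This is only an illustration, assuming a gateway that receives a mail-style (RFC 822) date which may carry a numeric time zone offset; the function name rewrite_date_for_http is hypothetical and not part of any specification. It uses Python's standard email.utils module to parse the value and emit the fixed-length RFC 1123 form in GMT.

```python
from datetime import timezone
from email.utils import format_datetime, parsedate_to_datetime


def rewrite_date_for_http(mail_date):
    """Return mail_date rewritten in the RFC 1123 form preferred by HTTP/1.1.

    Hypothetical helper: raises an exception if the value cannot be parsed.
    """
    parsed = parsedate_to_datetime(mail_date)
    if parsed.tzinfo is None:
        # A "-0000" zone yields a naive datetime; this sketch treats it as GMT.
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Convert to GMT first, since HTTP dates are always expressed in GMT.
    return format_datetime(parsed.astimezone(timezone.utc), usegmt=True)


if __name__ == "__main__":
    print(rewrite_date_for_http("Sun, 06 Nov 1994 09:49:37 +0100"))
```

A value that is already in the RFC 1123 form passes through unchanged, so a gateway could apply such a helper to every Date header field it forwards.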
+ +19.4.4 Introduction of Content-Encoding + + RFC 2045 does not include any concept equivalent to HTTP/1.1's + Content-Encoding header field. Since this acts as a modifier on the + media type, proxies and gateways from HTTP to MIME-compliant + protocols MUST either change the value of the Content-Type header + field or decode the entity-body before forwarding the message. (Some + experimental applications of Content-Type for Internet mail have used + a media-type parameter of ";conversions=<content-coding>" to perform + a function equivalent to Content-Encoding. However, this parameter is + not part of RFC 2045.) + +19.4.5 No Content-Transfer-Encoding + + HTTP does not use the Content-Transfer-Encoding (CTE) field of RFC + 2045. Proxies and gateways from MIME-compliant protocols to HTTP MUST + remove any non-identity CTE ("quoted-printable" or "base64") encoding + prior to delivering the response message to an HTTP client. + + Proxies and gateways from HTTP to MIME-compliant protocols are + responsible for ensuring that the message is in the correct format + and encoding for safe transport on that protocol, where "safe + transport" is defined by the limitations of the protocol being used. + Such a proxy or gateway SHOULD label the data with an appropriate + Content-Transfer-Encoding if doing so will improve the likelihood of + safe transport over the destination protocol. + +19.4.6 Introduction of Transfer-Encoding + + HTTP/1.1 introduces the Transfer-Encoding header field (section + 14.41). Proxies/gateways MUST remove any transfer-coding prior to + forwarding a message via a MIME-compliant protocol.
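As a rough illustration of removing the "chunked" transfer-coding, the following Python sketch reads a chunked body from a binary stream. The function name decode_chunked and its return shape are assumptions made for this example only. Chunk-extensions are ignored, and chunk sizes are parsed leniently, so this is not a strict validator.

```python
import io


def decode_chunked(stream):
    """Decode a chunked body; return (entity_body, trailer_fields).

    Hypothetical helper: `stream` is a binary file-like object positioned
    at the first chunk-size line.
    """
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line.endswith(b"\r\n"):
            raise ValueError("chunk-size line not terminated by CRLF")
        # Anything after ';' is a chunk-extension, which this sketch ignores.
        size_text = size_line.split(b";", 1)[0].strip()
        size = int(size_text.decode("ascii"), 16)
        if size == 0:
            break  # last-chunk reached
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("truncated chunk-data")
        if stream.read(2) != b"\r\n":
            raise ValueError("chunk-data not followed by CRLF")
        body += data

    # Trailer: zero or more entity-header lines, ended by an empty line.
    trailers = []
    while True:
        line = stream.readline()
        if line in (b"\r\n", b""):
            break
        name, _, value = line.partition(b":")
        trailers.append((name.strip().decode("latin-1"),
                         value.strip().decode("latin-1")))
    return bytes(body), trailers


if __name__ == "__main__":
    wire = b"4\r\nWiki\r\n5;ext=1\r\npedia\r\n0\r\nX-Check: ok\r\n\r\n"
    entity_body, trailer_fields = decode_chunked(io.BytesIO(wire))
    # After decoding, Content-Length is simply the length of the body.
    print(len(entity_body), entity_body, trailer_fields)
```

A gateway using such a routine would then set Content-Length to the decoded length, append the trailer fields to the existing header fields, and drop "chunked" from Transfer-Encoding.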
+ + A process for decoding the "chunked" transfer-coding (section 3.6) + can be represented in pseudo-code as: + + length := 0 + read chunk-size, chunk-extension (if any) and CRLF + while (chunk-size > 0) { + read chunk-data and CRLF + append chunk-data to entity-body + length := length + chunk-size + read chunk-size and CRLF + } + read entity-header + while (entity-header not empty) { + append entity-header to existing header fields + read entity-header + } + Content-Length := length + Remove "chunked" from Transfer-Encoding + +19.4.7 MHTML and Line Length Limitations + + HTTP implementations which share code with MHTML [45] implementations + need to be aware of MIME line length limitations. Since HTTP does not + have this limitation, HTTP does not fold long lines. MHTML messages + being transported by HTTP follow all conventions of MHTML, including + line length limitations and folding, canonicalization, etc., since + HTTP transports all message-bodies as payload (see section 3.7.2) and + does not interpret the content or any MIME header lines that might be + contained therein. + +19.5 Additional Features + + RFC 1945 and RFC 2068 document protocol elements used by some + existing HTTP implementations, but not consistently and correctly + across most HTTP/1.1 applications. Implementors are advised to be + aware of these features, but cannot rely upon their presence in, or + interoperability with, other HTTP/1.1 applications. Some of these + + + +Fielding, et al. Standards Track [Page 169] + +RFC 2616 HTTP/1.1 June 1999 + + + describe proposed experimental features, and some describe features + that experimental deployment found lacking that are now addressed in + the base HTTP/1.1 specification. + + A number of other headers, such as Content-Disposition and Title, + from SMTP and MIME are also often implemented (see RFC 2076 [37]). 
+ +19.5.1 Content-Disposition + + The Content-Disposition response-header field has been proposed as a + means for the origin server to suggest a default filename if the user + requests that the content is saved to a file. This usage is derived + from the definition of Content-Disposition in RFC 1806 [35]. + + content-disposition = "Content-Disposition" ":" + disposition-type *( ";" disposition-parm ) + disposition-type = "attachment" | disp-extension-token + disposition-parm = filename-parm | disp-extension-parm + filename-parm = "filename" "=" quoted-string + disp-extension-token = token + disp-extension-parm = token "=" ( token | quoted-string ) + + An example is + + Content-Disposition: attachment; filename="fname.ext" + + The receiving user agent SHOULD NOT respect any directory path + information present in the filename-parm parameter, which is the only + parameter believed to apply to HTTP implementations at this time. The + filename SHOULD be treated as a terminal component only. + + If this header is used in a response with the application/octet- + stream content-type, the implied suggestion is that the user agent + should not display the response, but directly enter a `save response + as...' dialog. + + See section 15.5 for Content-Disposition security issues. + +19.6 Compatibility with Previous Versions + + It is beyond the scope of a protocol specification to mandate + compliance with previous versions. HTTP/1.1 was deliberately + designed, however, to make supporting previous versions easy. It is + worth noting that, at the time of composing this specification + (1996), we would expect commercial HTTP/1.1 servers to: + + - recognize the format of the Request-Line for HTTP/0.9, 1.0, and + 1.1 requests; + + + +Fielding, et al. Standards Track [Page 170] + +RFC 2616 HTTP/1.1 June 1999 + + + - understand any valid request in the format of HTTP/0.9, 1.0, or + 1.1; + + - respond appropriately with a message in the same major version + used by the client. 
+ + And we would expect HTTP/1.1 clients to: + + - recognize the format of the Status-Line for HTTP/1.0 and 1.1 + responses; + + - understand any valid response in the format of HTTP/0.9, 1.0, or + 1.1. + + For most implementations of HTTP/1.0, each connection is established + by the client prior to the request and closed by the server after + sending the response. Some implementations implement the Keep-Alive + version of persistent connections described in section 19.7.1 of RFC + 2068 [33]. + +19.6.1 Changes from HTTP/1.0 + + This section summarizes major differences between versions HTTP/1.0 + and HTTP/1.1. + +19.6.1.1 Changes to Simplify Multi-homed Web Servers and Conserve IP + Addresses + + The requirements that clients and servers support the Host request- + header, report an error if the Host request-header (section 14.23) is + missing from an HTTP/1.1 request, and accept absolute URIs (section + 5.1.2) are among the most important changes defined by this + specification. + + Older HTTP/1.0 clients assumed a one-to-one relationship of IP + addresses and servers; there was no other established mechanism for + distinguishing the intended server of a request than the IP address + to which that request was directed. The changes outlined above will + allow the Internet, once older HTTP clients are no longer common, to + support multiple Web sites from a single IP address, greatly + simplifying large operational Web servers, where allocation of many + IP addresses to a single host has created serious problems. The + Internet will also be able to recover the IP addresses that have been + allocated for the sole purpose of allowing special-purpose domain + names to be used in root-level HTTP URLs. Given the rate of growth of + the Web, and the number of servers already deployed, it is extremely + + + + + +Fielding, et al. 
Standards Track [Page 171] + +RFC 2616 HTTP/1.1 June 1999 + + + important that all implementations of HTTP (including updates to + existing HTTP/1.0 applications) correctly implement these + requirements: + + - Both clients and servers MUST support the Host request-header. + + - A client that sends an HTTP/1.1 request MUST send a Host header. + + - Servers MUST report a 400 (Bad Request) error if an HTTP/1.1 + request does not include a Host request-header. + + - Servers MUST accept absolute URIs. + +19.6.2 Compatibility with HTTP/1.0 Persistent Connections + + Some clients and servers might wish to be compatible with some + previous implementations of persistent connections in HTTP/1.0 + clients and servers. Persistent connections in HTTP/1.0 are + explicitly negotiated as they are not the default behavior. HTTP/1.0 + experimental implementations of persistent connections are faulty, + and the new facilities in HTTP/1.1 are designed to rectify these + problems. The problem was that some existing 1.0 clients may be + sending Keep-Alive to a proxy server that doesn't understand + Connection, which would then erroneously forward it to the next + inbound server, which would establish the Keep-Alive connection and + result in a hung HTTP/1.0 proxy waiting for the close on the + response. The result is that HTTP/1.0 clients must be prevented from + using Keep-Alive when talking to proxies. + + However, talking to proxies is the most important use of persistent + connections, so that prohibition is clearly unacceptable. Therefore, + we need some other mechanism for indicating a persistent connection + is desired, which is safe to use even when talking to an old proxy + that ignores Connection. Persistent connections are the default for + HTTP/1.1 messages; we introduce a new keyword (Connection: close) for + declaring non-persistence. See section 14.10. 
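   The defaults described above can be sketched in Python as follows.
   This is an illustration under stated assumptions, not normative
   text: the function name connection_is_persistent is hypothetical,
   headers are assumed to be available as a simple name-to-value
   mapping, and the HTTP/1.0 branch assumes the Keep-Alive convention
   of RFC 2068. The sketch does not address the proxy problem discussed
   in this section.

```python
def connection_is_persistent(http_version, headers):
    """Decide whether a connection stays open after this message.

    http_version is a (major, minor) tuple; headers maps field names
    to field values. Both shapes are assumptions of this sketch.
    """
    tokens = set()
    for name, value in headers.items():
        if name.lower() == "connection":
            # Connection carries a comma-separated list of tokens.
            tokens.update(t.strip().lower()
                          for t in value.split(",") if t.strip())

    if http_version >= (1, 1):
        # HTTP/1.1: persistent by default unless "close" is declared.
        return "close" not in tokens
    # HTTP/1.0: persistence has to be negotiated explicitly.
    return "keep-alive" in tokens
```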
+ + The original HTTP/1.0 form of persistent connections (the Connection: + Keep-Alive and Keep-Alive header) is documented in RFC 2068. [33] + +19.6.3 Changes from RFC 2068 + + This specification has been carefully audited to correct and + disambiguate key word usage; RFC 2068 had many problems in respect to + the conventions laid out in RFC 2119 [34]. + + Clarified which error code should be used for inbound server failures + (e.g. DNS failures). (Section 10.5.5). + + + +Fielding, et al. Standards Track [Page 172] + +RFC 2616 HTTP/1.1 June 1999 + + + CREATE had a race that required an Etag be sent when a resource is + first created. (Section 10.2.2). + + Content-Base was deleted from the specification: it was not + implemented widely, and there is no simple, safe way to introduce it + without a robust extension mechanism. In addition, it is used in a + similar, but not identical fashion in MHTML [45]. + + Transfer-coding and message lengths all interact in ways that + required fixing exactly when chunked encoding is used (to allow for + transfer encoding that may not be self delimiting); it was important + to straighten out exactly how message lengths are computed. (Sections + 3.6, 4.4, 7.2.2, 13.5.2, 14.13, 14.16) + + A content-coding of "identity" was introduced, to solve problems + discovered in caching. (section 3.5) + + Quality Values of zero should indicate that "I don't want something" + to allow clients to refuse a representation. (Section 3.9) + + The use and interpretation of HTTP version numbers has been clarified + by RFC 2145. Require proxies to upgrade requests to highest protocol + version they support to deal with problems discovered in HTTP/1.0 + implementations (Section 3.1) + + Charset wildcarding is introduced to avoid explosion of character set + names in accept headers. (Section 14.2) + + A case was missed in the Cache-Control model of HTTP/1.1; s-maxage + was introduced to add this missing case. 
(Sections 13.4, 14.8, 14.9, + 14.9.3) + + The Cache-Control: max-age directive was not properly defined for + responses. (Section 14.9.3) + + There are situations where a server (especially a proxy) does not + know the full length of a response but is capable of serving a + byterange request. We therefore need a mechanism to allow byteranges + with a content-range not indicating the full length of the message. + (Section 14.16) + + Range request responses would become very verbose if all meta-data + were always returned; by allowing the server to only send needed + headers in a 206 response, this problem can be avoided. (Section + 10.2.7, 13.5.3, and 14.27) + + + + + + +Fielding, et al. Standards Track [Page 173] + +RFC 2616 HTTP/1.1 June 1999 + + + Fix problem with unsatisfiable range requests; there are two cases: + syntactic problems, and range doesn't exist in the document. The 416 + status code was needed to resolve this ambiguity needed to indicate + an error for a byte range request that falls outside of the actual + contents of a document. (Section 10.4.17, 14.16) + + Rewrite of message transmission requirements to make it much harder + for implementors to get it wrong, as the consequences of errors here + can have significant impact on the Internet, and to deal with the + following problems: + + 1. Changing "HTTP/1.1 or later" to "HTTP/1.1", in contexts where + this was incorrectly placing a requirement on the behavior of + an implementation of a future version of HTTP/1.x + + 2. Made it clear that user-agents should retry requests, not + "clients" in general. + + 3. Converted requirements for clients to ignore unexpected 100 + (Continue) responses, and for proxies to forward 100 responses, + into a general requirement for 1xx responses. + + 4. Modified some TCP-specific language, to make it clearer that + non-TCP transports are possible for HTTP. + + 5. 
Require that the origin server MUST NOT wait for the request + body before it sends a required 100 (Continue) response. + + 6. Allow, rather than require, a server to omit 100 (Continue) if + it has already seen some of the request body. + + 7. Allow servers to defend against denial-of-service attacks and + broken clients. + + This change adds the Expect header and 417 status code. The message + transmission requirements fixes are in sections 8.2, 10.4.18, + 8.1.2.2, 13.11, and 14.20. + + Proxies should be able to add Content-Length when appropriate. + (Section 13.5.2) + + Clean up confusion between 403 and 404 responses. (Section 10.4.4, + 10.4.5, and 10.4.11) + + Warnings could be cached incorrectly, or not updated appropriately. + (Section 13.1.2, 13.2.4, 13.5.2, 13.5.3, 14.9.3, and 14.46) Warning + also needed to be a general header, as PUT or other methods may have + need for it in requests. + + + +Fielding, et al. Standards Track [Page 174] + +RFC 2616 HTTP/1.1 June 1999 + + + Transfer-coding had significant problems, particularly with + interactions with chunked encoding. The solution is that transfer- + codings become as full fledged as content-codings. This involves + adding an IANA registry for transfer-codings (separate from content + codings), a new header field (TE) and enabling trailer headers in the + future. Transfer encoding is a major performance benefit, so it was + worth fixing [39]. TE also solves another, obscure, downward + interoperability problem that could have occurred due to interactions + between authentication trailers, chunked encoding and HTTP/1.0 + clients.(Section 3.6, 3.6.1, and 14.39) + + The PATCH, LINK, UNLINK methods were defined but not commonly + implemented in previous versions of this specification. See RFC 2068 + [33]. + + The Alternates, Content-Version, Derived-From, Link, URI, Public and + Content-Base header fields were defined in previous versions of this + specification, but not commonly implemented. 
See RFC 2068 [33]. + +20 Index + + Please see the PostScript version of this RFC for the INDEX. + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Fielding, et al. Standards Track [Page 175] + +RFC 2616 HTTP/1.1 June 1999 + + +21. Full Copyright Statement + + Copyright (C) The Internet Society (1999). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Fielding, et al. 
Standards Track [Page 176] + diff --git a/doc/rfc/rfc2617.txt b/doc/rfc/rfc2617.txt new file mode 100644 index 0000000000..771aa924a5 --- /dev/null +++ b/doc/rfc/rfc2617.txt @@ -0,0 +1,1907 @@ + + + + + + +Network Working Group J. Franks +Request for Comments: 2617 Northwestern University +Obsoletes: 2069 P. Hallam-Baker +Category: Standards Track Verisign, Inc. + J. Hostetler + AbiSource, Inc. + S. Lawrence + Agranat Systems, Inc. + P. Leach + Microsoft Corporation + A. Luotonen + Netscape Communications Corporation + L. Stewart + Open Market, Inc. + June 1999 + + + HTTP Authentication: Basic and Digest Access Authentication + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (1999). All Rights Reserved. + +Abstract + + "HTTP/1.0", includes the specification for a Basic Access + Authentication scheme. This scheme is not considered to be a secure + method of user authentication (unless used in conjunction with some + external secure system such as SSL [5]), as the user name and + password are passed over the network as cleartext. + + This document also provides the specification for HTTP's + authentication framework, the original Basic authentication scheme + and a scheme based on cryptographic hashes, referred to as "Digest + Access Authentication". It is therefore also intended to serve as a + replacement for RFC 2069 [6]. Some optional elements specified by + RFC 2069 have been removed from this specification due to problems + found since its publication; other new elements have been added for + compatibility, those new elements have been made optional, but are + strongly recommended. 
+ + + +Franks, et al. Standards Track [Page 1] + +RFC 2617 HTTP Authentication June 1999 + + + Like Basic, Digest access authentication verifies that both parties + to a communication know a shared secret (a password); unlike Basic, + this verification can be done without sending the password in the + clear, which is Basic's biggest weakness. As with most other + authentication protocols, the greatest sources of risks are usually + found not in the core protocol itself but in policies and procedures + surrounding its use. + +Table of Contents + + 1 Access Authentication................................ 3 + 1.1 Reliance on the HTTP/1.1 Specification............ 3 + 1.2 Access Authentication Framework................... 3 + 2 Basic Authentication Scheme.......................... 5 + 3 Digest Access Authentication Scheme.................. 6 + 3.1 Introduction...................................... 6 + 3.1.1 Purpose......................................... 6 + 3.1.2 Overall Operation............................... 6 + 3.1.3 Representation of digest values................. 7 + 3.1.4 Limitations..................................... 7 + 3.2 Specification of Digest Headers................... 7 + 3.2.1 The WWW-Authenticate Response Header............ 8 + 3.2.2 The Authorization Request Header................ 11 + 3.2.3 The Authentication-Info Header.................. 15 + 3.3 Digest Operation.................................. 17 + 3.4 Security Protocol Negotiation..................... 18 + 3.5 Example........................................... 18 + 3.6 Proxy-Authentication and Proxy-Authorization...... 19 + 4 Security Considerations.............................. 19 + 4.1 Authentication of Clients using Basic + Authentication.................................... 19 + 4.2 Authentication of Clients using Digest + Authentication.................................... 20 + 4.3 Limited Use Nonce Values.......................... 
21 + 4.4 Comparison of Digest with Basic Authentication.... 22 + 4.5 Replay Attacks.................................... 22 + 4.6 Weakness Created by Multiple Authentication + Schemes........................................... 23 + 4.7 Online dictionary attacks......................... 23 + 4.8 Man in the Middle................................. 24 + 4.9 Chosen plaintext attacks.......................... 24 + 4.10 Precomputed dictionary attacks.................... 25 + 4.11 Batch brute force attacks......................... 25 + 4.12 Spoofing by Counterfeit Servers................... 25 + 4.13 Storing passwords................................. 26 + 4.14 Summary........................................... 26 + 5 Sample implementation................................ 27 + 6 Acknowledgments...................................... 31 + + + +Franks, et al. Standards Track [Page 2] + +RFC 2617 HTTP Authentication June 1999 + + + 7 References........................................... 31 + 8 Authors' Addresses................................... 32 + 9 Full Copyright Statement............................. 34 + +1 Access Authentication + +1.1 Reliance on the HTTP/1.1 Specification + + This specification is a companion to the HTTP/1.1 specification [2]. + It uses the augmented BNF section 2.1 of that document, and relies on + both the non-terminals defined in that document and other aspects of + the HTTP/1.1 specification. + +1.2 Access Authentication Framework + + HTTP provides a simple challenge-response authentication mechanism + that MAY be used by a server to challenge a client request and by a + client to provide authentication information. It uses an extensible, + case-insensitive token to identify the authentication scheme, + followed by a comma-separated list of attribute-value pairs which + carry the parameters necessary for achieving authentication via that + scheme. 
+ + auth-scheme = token + auth-param = token "=" ( token | quoted-string ) + + The 401 (Unauthorized) response message is used by an origin server + to challenge the authorization of a user agent. This response MUST + include a WWW-Authenticate header field containing at least one + challenge applicable to the requested resource. The 407 (Proxy + Authentication Required) response message is used by a proxy to + challenge the authorization of a client and MUST include a Proxy- + Authenticate header field containing at least one challenge + applicable to the proxy for the requested resource. + + challenge = auth-scheme 1*SP 1#auth-param + + Note: User agents will need to take special care in parsing the WWW- + Authenticate or Proxy-Authenticate header field value if it contains + more than one challenge, or if more than one WWW-Authenticate header + field is provided, since the contents of a challenge may itself + contain a comma-separated list of authentication parameters. + + The authentication parameter realm is defined for all authentication + schemes: + + realm = "realm" "=" realm-value + realm-value = quoted-string + + + +Franks, et al. Standards Track [Page 3] + +RFC 2617 HTTP Authentication June 1999 + + + The realm directive (case-insensitive) is required for all + authentication schemes that issue a challenge. The realm value + (case-sensitive), in combination with the canonical root URL (the + absoluteURI for the server whose abs_path is empty; see section 5.1.2 + of [2]) of the server being accessed, defines the protection space. + These realms allow the protected resources on a server to be + partitioned into a set of protection spaces, each with its own + authentication scheme and/or authorization database. The realm value + is a string, generally assigned by the origin server, which may have + additional semantics specific to the authentication scheme. Note that + there may be multiple challenges with the same auth-scheme but + different realms. 
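   The challenge grammar above can be illustrated with a small Python
   sketch. It is not part of this specification. The name
   parse_challenge is hypothetical, and the sketch assumes the input
   holds exactly one challenge; splitting a header field that carries
   several challenges needs the extra care described in the note above
   and is not attempted here.

```python
import re

# One auth-param: token "=" ( token | quoted-string ), followed by an
# optional comma. The token pattern is a simplification of the
# HTTP/1.1 token rule.
_AUTH_PARAM = re.compile(
    r'\s*([^\s=,"]+)\s*=\s*(?:"((?:[^"\\]|\\.)*)"|([^\s,"]+))\s*(?:,|$)'
)


def parse_challenge(value):
    """Split a single challenge into (auth-scheme, {name: value})."""
    scheme, _, rest = value.strip().partition(" ")
    if not scheme or not rest.strip():
        raise ValueError("expected an auth-scheme and at least one auth-param")

    params = {}
    pos = 0
    while pos < len(rest):
        match = _AUTH_PARAM.match(rest, pos)
        if match is None:
            raise ValueError("malformed auth-param at offset %d" % pos)
        name = match.group(1).lower()  # directive names are case-insensitive
        if match.group(2) is not None:
            # quoted-string: remove quoted-pair backslashes
            params[name] = re.sub(r"\\(.)", r"\1", match.group(2))
        else:
            params[name] = match.group(3)
        pos = match.end()
    return scheme, params


scheme, params = parse_challenge('Basic realm="WallyWorld"')
```

   Commas inside a quoted-string (for example a quoted list of values)
   are kept as part of the parameter value, which is why the sketch
   matches parameters one at a time instead of splitting on commas.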
+ + A user agent that wishes to authenticate itself with an origin + server--usually, but not necessarily, after receiving a 401 + (Unauthorized)--MAY do so by including an Authorization header field + with the request. A client that wishes to authenticate itself with a + proxy--usually, but not necessarily, after receiving a 407 (Proxy + Authentication Required)--MAY do so by including a Proxy- + Authorization header field with the request. Both the Authorization + field value and the Proxy-Authorization field value consist of + credentials containing the authentication information of the client + for the realm of the resource being requested. The user agent MUST + choose to use one of the challenges with the strongest auth-scheme it + understands and request credentials from the user based upon that + challenge. + + credentials = auth-scheme #auth-param + + Note that many browsers will only recognize Basic and will require + that it be the first auth-scheme presented. Servers should only + include Basic if it is minimally acceptable. + + The protection space determines the domain over which credentials can + be automatically applied. If a prior request has been authorized, the + same credentials MAY be reused for all other requests within that + protection space for a period of time determined by the + authentication scheme, parameters, and/or user preference. Unless + otherwise defined by the authentication scheme, a single protection + space cannot extend outside the scope of its server. + + If the origin server does not wish to accept the credentials sent + with a request, it SHOULD return a 401 (Unauthorized) response. The + response MUST include a WWW-Authenticate header field containing at + least one (possibly new) challenge applicable to the requested + resource. If a proxy does not accept the credentials sent with a + request, it SHOULD return a 407 (Proxy Authentication Required). 
The + response MUST include a Proxy-Authenticate header field containing a + + + +Franks, et al. Standards Track [Page 4] + +RFC 2617 HTTP Authentication June 1999 + + + (possibly new) challenge applicable to the proxy for the requested + resource. + + The HTTP protocol does not restrict applications to this simple + challenge-response mechanism for access authentication. Additional + mechanisms MAY be used, such as encryption at the transport level or + via message encapsulation, and with additional header fields + specifying authentication information. However, these additional + mechanisms are not defined by this specification. + + Proxies MUST be completely transparent regarding user agent + authentication by origin servers. That is, they must forward the + WWW-Authenticate and Authorization headers untouched, and follow the + rules found in section 14.8 of [2]. Both the Proxy-Authenticate and + the Proxy-Authorization header fields are hop-by-hop headers (see + section 13.5.1 of [2]). + +2 Basic Authentication Scheme + + The "basic" authentication scheme is based on the model that the + client must authenticate itself with a user-ID and a password for + each realm. The realm value should be considered an opaque string + which can only be compared for equality with other realms on that + server. The server will service the request only if it can validate + the user-ID and password for the protection space of the Request-URI. + There are no optional authentication parameters. + + For Basic, the framework above is utilized as follows: + + challenge = "Basic" realm + credentials = "Basic" basic-credentials + + Upon receipt of an unauthorized request for a URI within the + protection space, the origin server MAY respond with a challenge like + the following: + + WWW-Authenticate: Basic realm="WallyWorld" + + where "WallyWorld" is the string assigned by the server to identify + the protection space of the Request-URI. 
A proxy may respond with the + same challenge using the Proxy-Authenticate header field. + + To receive authorization, the client sends the userid and password, + separated by a single colon (":") character, within a base64 [7] + encoded string in the credentials. + + basic-credentials = base64-user-pass + base64-user-pass = <base64 [4] encoding of user-pass, + except not limited to 76 char/line> + user-pass = userid ":" password + userid = *<TEXT excluding ":"> + password = *TEXT + + Userids might be case sensitive. + + If the user agent wishes to send the userid "Aladdin" and password + "open sesame", it would use the following header field: + + Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ== + + A client SHOULD assume that all paths at or deeper than the depth of + the last symbolic element in the path field of the Request-URI also + are within the protection space specified by the Basic realm value of + the current challenge. A client MAY preemptively send the + corresponding Authorization header with requests for resources in + that space without receipt of another challenge from the server. + Similarly, when a client sends a request to a proxy, it may reuse a + userid and password in the Proxy-Authorization header field without + receiving another challenge from the proxy server. See section 4 for + security considerations associated with Basic authentication. + +3 Digest Access Authentication Scheme + +3.1 Introduction + +3.1.1 Purpose + + The protocol referred to as "HTTP/1.0" includes the specification for + a Basic Access Authentication scheme[1]. That scheme is not + considered to be a secure method of user authentication, as the user + name and password are passed over the network in an unencrypted form. + This section provides the specification for a scheme that does not + send the password in cleartext, referred to as "Digest Access + Authentication". + + The Digest Access Authentication scheme is not intended to be a + complete answer to the need for security in the World Wide Web. This + scheme provides no encryption of message content.
The intent is + simply to create an access authentication method that avoids the most + serious flaws of Basic authentication. + +3.1.2 Overall Operation + + Like Basic Access Authentication, the Digest scheme is based on a + simple challenge-response paradigm. The Digest scheme challenges + using a nonce value. A valid response contains a checksum (by + + + +Franks, et al. Standards Track [Page 6] + +RFC 2617 HTTP Authentication June 1999 + + + default, the MD5 checksum) of the username, the password, the given + nonce value, the HTTP method, and the requested URI. In this way, the + password is never sent in the clear. Just as with the Basic scheme, + the username and password must be prearranged in some fashion not + addressed by this document. + +3.1.3 Representation of digest values + + An optional header allows the server to specify the algorithm used to + create the checksum or digest. By default the MD5 algorithm is used + and that is the only algorithm described in this document. + + For the purposes of this document, an MD5 digest of 128 bits is + represented as 32 ASCII printable characters. The bits in the 128 bit + digest are converted from most significant to least significant bit, + four bits at a time to their ASCII presentation as follows. Each four + bits is represented by its familiar hexadecimal notation from the + characters 0123456789abcdef. That is, binary 0000 gets represented by + the character '0', 0001, by '1', and so on up to the representation + of 1111 as 'f'. + +3.1.4 Limitations + + The Digest authentication scheme described in this document suffers + from many known limitations. It is intended as a replacement for + Basic authentication and nothing more. It is a password-based system + and (on the server side) suffers from all the same problems of any + password system. In particular, no provision is made in this protocol + for the initial secure arrangement between user and server to + establish the user's password. 
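   The hexadecimal representation described in section 3.1.3 can be
   illustrated with the following Python sketch. It is not part of
   this specification. The name digest_to_hex is hypothetical, and the
   sample input is arbitrary; the MD5 implementation is assumed to come
   from Python's standard hashlib module.

```python
import hashlib

HEX_DIGITS = "0123456789abcdef"


def digest_to_hex(digest):
    """Render a raw digest as lowercase hexadecimal characters."""
    out = []
    for byte in digest:
        # Most significant four bits first, then the least significant.
        out.append(HEX_DIGITS[byte >> 4])
        out.append(HEX_DIGITS[byte & 0x0F])
    return "".join(out)


raw_digest = hashlib.md5(b"example").digest()  # 16 bytes, i.e. 128 bits
text_digest = digest_to_hex(raw_digest)        # 32 printable characters
```

   In practice hashlib's own hexdigest() method yields the same string;
   the explicit loop is shown only to mirror the four-bits-at-a-time
   description.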
+ + Users and implementors should be aware that this protocol is not as + secure as Kerberos, and not as secure as any client-side private-key + scheme. Nevertheless it is better than nothing, better than what is + commonly used with telnet and ftp, and better than Basic + authentication. + +3.2 Specification of Digest Headers + + The Digest Access Authentication scheme is conceptually similar to + the Basic scheme. The formats of the modified WWW-Authenticate header + line and the Authorization header line are specified below. In + addition, a new header, Authentication-Info, is specified. + + + + + + + + +Franks, et al. Standards Track [Page 7] + +RFC 2617 HTTP Authentication June 1999 + + +3.2.1 The WWW-Authenticate Response Header + + If a server receives a request for an access-protected object, and an + acceptable Authorization header is not sent, the server responds with + a "401 Unauthorized" status code, and a WWW-Authenticate header as + per the framework defined above, which for the digest scheme is + utilized as follows: + + challenge = "Digest" digest-challenge + + digest-challenge = 1#( realm | [ domain ] | nonce | + [ opaque ] |[ stale ] | [ algorithm ] | + [ qop-options ] | [auth-param] ) + + + domain = "domain" "=" <"> URI ( 1*SP URI ) <"> + URI = absoluteURI | abs_path + nonce = "nonce" "=" nonce-value + nonce-value = quoted-string + opaque = "opaque" "=" quoted-string + stale = "stale" "=" ( "true" | "false" ) + algorithm = "algorithm" "=" ( "MD5" | "MD5-sess" | + token ) + qop-options = "qop" "=" <"> 1#qop-value <"> + qop-value = "auth" | "auth-int" | token + + The meanings of the values of the directives used above are as + follows: + + realm + A string to be displayed to users so they know which username and + password to use. This string should contain at least the name of + the host performing the authentication and might additionally + indicate the collection of users who might have access. 
An example + might be "registered_users@gotham.news.com". + + domain + A quoted, space-separated list of URIs, as specified in RFC XURI + [7], that define the protection space. If a URI is an abs_path, it + is relative to the canonical root URL (see section 1.2 above) of + the server being accessed. An absoluteURI in this list may refer to + a different server than the one being accessed. The client can use + this list to determine the set of URIs for which the same + authentication information may be sent: any URI that has a URI in + this list as a prefix (after both have been made absolute) may be + assumed to be in the same protection space. If this directive is + omitted or its value is empty, the client should assume that the + protection space consists of all URIs on the responding server. + + + +Franks, et al. Standards Track [Page 8] + +RFC 2617 HTTP Authentication June 1999 + + + This directive is not meaningful in Proxy-Authenticate headers, for + which the protection space is always the entire proxy; if present + it should be ignored. + + nonce + A server-specified data string which should be uniquely generated + each time a 401 response is made. It is recommended that this + string be base64 or hexadecimal data. Specifically, since the + string is passed in the header lines as a quoted string, the + double-quote character is not allowed. + + The contents of the nonce are implementation dependent. The quality + of the implementation depends on a good choice. A nonce might, for + example, be constructed as the base 64 encoding of + + time-stamp H(time-stamp ":" ETag ":" private-key) + + where time-stamp is a server-generated time or other non-repeating + value, ETag is the value of the HTTP ETag header associated with + the requested entity, and private-key is data known only to the + server. 
With a nonce of this form a server would recalculate the + hash portion after receiving the client authentication header and + reject the request if it did not match the nonce from that header + or if the time-stamp value is not recent enough. In this way the + server can limit the time of the nonce's validity. The inclusion of + the ETag prevents a replay request for an updated version of the + resource. (Note: including the IP address of the client in the + nonce would appear to offer the server the ability to limit the + reuse of the nonce to the same client that originally got it. + However, that would break proxy farms, where requests from a single + user often go through different proxies in the farm. Also, IP + address spoofing is not that hard.) + + An implementation might choose not to accept a previously used + nonce or a previously used digest, in order to protect against a + replay attack. Or, an implementation might choose to use one-time + nonces or digests for POST or PUT requests and a time-stamp for GET + requests. For more details on the issues involved see section 4. + of this document. + + The nonce is opaque to the client. + + opaque + A string of data, specified by the server, which should be returned + by the client unchanged in the Authorization header of subsequent + requests with URIs in the same protection space. It is recommended + that this string be base64 or hexadecimal data. + + + + +Franks, et al. Standards Track [Page 9] + +RFC 2617 HTTP Authentication June 1999 + + + stale + A flag, indicating that the previous request from the client was + rejected because the nonce value was stale. If stale is TRUE + (case-insensitive), the client may wish to simply retry the request + with a new encrypted response, without reprompting the user for a + new username and password. 
The server should only set stale to TRUE + if it receives a request for which the nonce is invalid but with a + valid digest for that nonce (indicating that the client knows the + correct username/password). If stale is FALSE, or anything other + than TRUE, or the stale directive is not present, the username + and/or password are invalid, and new values must be obtained. + + algorithm + A string indicating a pair of algorithms used to produce the digest + and a checksum. If this is not present it is assumed to be "MD5". + If the algorithm is not understood, the challenge should be ignored + (and a different one used, if there is more than one). + + In this document the string obtained by applying the digest + algorithm to the data "data" with secret "secret" will be denoted + by KD(secret, data), and the string obtained by applying the + checksum algorithm to the data "data" will be denoted H(data). The + notation unq(X) means the value of the quoted-string X without the + surrounding quotes. + + For the "MD5" and "MD5-sess" algorithms + + H(data) = MD5(data) + + and + + KD(secret, data) = H(concat(secret, ":", data)) + + i.e., the digest is the MD5 of the secret concatenated with a colon + concatenated with the data. The "MD5-sess" algorithm is intended to + allow efficient 3rd party authentication servers; for the + difference in usage, see the description in section 3.2.2.2. + + qop-options + This directive is optional, but is made so only for backward + compatibility with RFC 2069 [6]; it SHOULD be used by all + implementations compliant with this version of the Digest scheme. + If present, it is a quoted string of one or more tokens indicating + the "quality of protection" values supported by the server. The + value "auth" indicates authentication; the value "auth-int" + indicates authentication with integrity protection; see the + + + + + +Franks, et al. 
Standards Track [Page 10] + +RFC 2617 HTTP Authentication June 1999 + + + descriptions below for calculating the response directive value for + the application of this choice. Unrecognized options MUST be + ignored. + + auth-param + This directive allows for future extensions. Any unrecognized + directive MUST be ignored. + +3.2.2 The Authorization Request Header + + The client is expected to retry the request, passing an Authorization + header line, which is defined according to the framework above, + utilized as follows. + + credentials = "Digest" digest-response + digest-response = 1#( username | realm | nonce | digest-uri + | response | [ algorithm ] | [cnonce] | + [opaque] | [message-qop] | + [nonce-count] | [auth-param] ) + + username = "username" "=" username-value + username-value = quoted-string + digest-uri = "uri" "=" digest-uri-value + digest-uri-value = request-uri ; As specified by HTTP/1.1 + message-qop = "qop" "=" qop-value + cnonce = "cnonce" "=" cnonce-value + cnonce-value = nonce-value + nonce-count = "nc" "=" nc-value + nc-value = 8LHEX + response = "response" "=" request-digest + request-digest = <"> 32LHEX <"> + LHEX = "0" | "1" | "2" | "3" | + "4" | "5" | "6" | "7" | + "8" | "9" | "a" | "b" | + "c" | "d" | "e" | "f" + + The values of the opaque and algorithm fields must be those supplied + in the WWW-Authenticate response header for the entity being + requested. + + response + A string of 32 hex digits computed as defined below, which proves + that the user knows a password + + username + The user's name in the specified realm. + + + + + +Franks, et al. Standards Track [Page 11] + +RFC 2617 HTTP Authentication June 1999 + + + digest-uri + The URI from Request-URI of the Request-Line; duplicated here + because proxies are allowed to change the Request-Line in transit. + + qop + Indicates what "quality of protection" the client has applied to + the message. 
If present, its value MUST be one of the alternatives + the server indicated it supports in the WWW-Authenticate header. + These values affect the computation of the request-digest. Note + that this is a single token, not a quoted list of alternatives as + in WWW- Authenticate. This directive is optional in order to + preserve backward compatibility with a minimal implementation of + RFC 2069 [6], but SHOULD be used if the server indicated that qop + is supported by providing a qop directive in the WWW-Authenticate + header field. + + cnonce + This MUST be specified if a qop directive is sent (see above), and + MUST NOT be specified if the server did not send a qop directive in + the WWW-Authenticate header field. The cnonce-value is an opaque + quoted string value provided by the client and used by both client + and server to avoid chosen plaintext attacks, to provide mutual + authentication, and to provide some message integrity protection. + See the descriptions below of the calculation of the response- + digest and request-digest values. + + nonce-count + This MUST be specified if a qop directive is sent (see above), and + MUST NOT be specified if the server did not send a qop directive in + the WWW-Authenticate header field. The nc-value is the hexadecimal + count of the number of requests (including the current request) + that the client has sent with the nonce value in this request. For + example, in the first request sent in response to a given nonce + value, the client sends "nc=00000001". The purpose of this + directive is to allow the server to detect request replays by + maintaining its own copy of this count - if the same nc-value is + seen twice, then the request is a replay. See the description + below of the construction of the request-digest value. + + auth-param + This directive allows for future extensions. Any unrecognized + directive MUST be ignored. 
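The replay check that the nonce-count directive makes possible can be sketched as follows. This is a minimal illustration in Python, not part of the specification: the class name NonceCountTracker and its in-memory dictionary are assumptions made for the example, and a real server would also expire entries when the nonce itself expires.

```python
import re

# nc-value = 8LHEX: exactly eight lowercase hexadecimal digits.
NC_PATTERN = re.compile(r"[0-9a-f]{8}")


class NonceCountTracker:
    """Hypothetical server-side record of the nc-values seen for each nonce."""

    def __init__(self):
        self._seen = {}  # nonce -> set of integer counts already used

    def accept(self, nonce, nc_value):
        """Return True for a fresh (nonce, nc) pair.

        Return False if the nc-value is malformed or was already seen
        with this nonce, which indicates a replayed request.
        """
        if not NC_PATTERN.fullmatch(nc_value):
            return False
        count = int(nc_value, 16)
        used = self._seen.setdefault(nonce, set())
        if count in used:
            return False
        used.add(count)
        return True
```

A server built along these lines would call accept() once per request after verifying the request-digest, and treat a False result as a failed authentication.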
+ + If a directive or its value is improper, or required directives are + missing, the proper response is 400 Bad Request. If the request- + digest is invalid, then a login failure should be logged, since + repeated login failures from a single client may indicate an attacker + attempting to guess passwords. + + + +Franks, et al. Standards Track [Page 12] + +RFC 2617 HTTP Authentication June 1999 + + + The definition of request-digest above indicates the encoding for its + value. The following definitions show how the value is computed. + +3.2.2.1 Request-Digest + + If the "qop" value is "auth" or "auth-int": + + request-digest = <"> < KD ( H(A1), unq(nonce-value) + ":" nc-value + ":" unq(cnonce-value) + ":" unq(qop-value) + ":" H(A2) + ) > <"> + + If the "qop" directive is not present (this construction is for + compatibility with RFC 2069): + + request-digest = + <"> < KD ( H(A1), unq(nonce-value) ":" H(A2) ) > + <"> + + See below for the definitions for A1 and A2. + +3.2.2.2 A1 + + If the "algorithm" directive's value is "MD5" or is unspecified, then + A1 is: + + A1 = unq(username-value) ":" unq(realm-value) ":" passwd + + where + + passwd = < user's password > + + If the "algorithm" directive's value is "MD5-sess", then A1 is + calculated only once - on the first request by the client following + receipt of a WWW-Authenticate challenge from the server. It uses the + server nonce from that challenge, and the first client nonce value to + construct A1 as follows: + + A1 = H( unq(username-value) ":" unq(realm-value) + ":" passwd ) + ":" unq(nonce-value) ":" unq(cnonce-value) + + This creates a 'session key' for the authentication of subsequent + requests and responses which is different for each "authentication + session", thus limiting the amount of material hashed with any one + key. (Note: see further discussion of the authentication session in + + + +Franks, et al. Standards Track [Page 13] + +RFC 2617 HTTP Authentication June 1999 + + + section 3.3.) 
Because the server need only use the hash of the user + credentials in order to create the A1 value, this construction could + be used in conjunction with a third party authentication service so + that the web server would not need the actual password value. The + specification of such a protocol is beyond the scope of this + specification. + +3.2.2.3 A2 + + If the "qop" directive's value is "auth" or is unspecified, then A2 + is: + + A2 = Method ":" digest-uri-value + + If the "qop" value is "auth-int", then A2 is: + + A2 = Method ":" digest-uri-value ":" H(entity-body) + +3.2.2.4 Directive values and quoted-string + + Note that the value of many of the directives, such as "username- + value", are defined as a "quoted-string". However, the "unq" notation + indicates that surrounding quotation marks are removed in forming the + string A1. Thus if the Authorization header includes the fields + + username="Mufasa", realm=myhost@testrealm.com + + and the user Mufasa has password "Circle Of Life" then H(A1) would be + H(Mufasa:myhost@testrealm.com:Circle Of Life) with no quotation marks + in the digested string. + + No white space is allowed in any of the strings to which the digest + function H() is applied unless that white space exists in the quoted + strings or entity body whose contents make up the string to be + digested. For example, the string A1 illustrated above must be + + Mufasa:myhost@testrealm.com:Circle Of Life + + with no white space on either side of the colons, but with the white + space between the words used in the password value. Likewise, the + other strings digested by H() must not have white space on either + side of the colons which delimit their fields unless that white space + was in the quoted strings or entity body being digested. 
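The definitions of H, KD, A1, A2 and request-digest given above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: it covers only the "MD5" algorithm with qop=auth or no qop, the function name request_digest and its parameter names are chosen for the example, and UTF-8 is assumed for encoding strings before hashing because the text does not name a character encoding.

```python
import hashlib


def H(data: str) -> str:
    """Checksum H(data) for the "MD5" algorithm, as 32 lowercase hex digits."""
    # Assumption: strings are encoded as UTF-8 before hashing.
    return hashlib.md5(data.encode("utf-8")).hexdigest()


def KD(secret: str, data: str) -> str:
    """KD(secret, data) = H(concat(secret, ":", data))."""
    return H(secret + ":" + data)


def request_digest(username, realm, password, method, uri, nonce,
                   nc=None, cnonce=None, qop=None):
    """Return the unquoted request-digest for algorithm "MD5".

    All arguments are the unquoted values, so no quotation marks and no
    extra white space around the colons enter the digested strings.
    """
    a1 = f"{username}:{realm}:{password}"
    a2 = f"{method}:{uri}"
    if qop is None:
        # RFC 2069 compatible form, used when no qop directive is present.
        return KD(H(a1), f"{nonce}:{H(a2)}")
    if qop != "auth":
        raise ValueError("this sketch only handles qop=auth or no qop")
    return KD(H(a1), f"{nonce}:{nc}:{cnonce}:{qop}:{H(a2)}")
```

Called with the values from the example in section 3.5 (username "Mufasa", realm "testrealm@host.com", password "Circle Of Life", method GET, uri "/dir/index.html", nonce "dcd98b7102dd2f0e8b11d0f600bfb0c093", nc "00000001", cnonce "0a4f113b", qop "auth"), the function returns "6629fae49393a05397450978507c4ef1", the response value shown in that example.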
+ + Also note that if integrity protection is applied (qop=auth-int), the + H(entity-body) is the hash of the entity body, not the message body - + it is computed before any transfer encoding is applied by the sender + + + + +Franks, et al. Standards Track [Page 14] + +RFC 2617 HTTP Authentication June 1999 + + + and after it has been removed by the recipient. Note that this + includes multipart boundaries and embedded headers in each part of + any multipart content-type. + +3.2.2.5 Various considerations + + The "Method" value is the HTTP request method as specified in section + 5.1.1 of [2]. The "request-uri" value is the Request-URI from the + request line as specified in section 5.1.2 of [2]. This may be "*", + an "absoluteURL" or an "abs_path" as specified in section 5.1.2 of + [2], but it MUST agree with the Request-URI. In particular, it MUST + be an "absoluteURL" if the Request-URI is an "absoluteURL". The + "cnonce-value" is an optional client-chosen value whose purpose is + to foil chosen plaintext attacks. + + The authenticating server must assure that the resource designated by + the "uri" directive is the same as the resource specified in the + Request-Line; if they are not, the server SHOULD return a 400 Bad + Request error. (Since this may be a symptom of an attack, server + implementers may want to consider logging such errors.) The purpose + of duplicating information from the request URL in this field is to + deal with the possibility that an intermediate proxy may alter the + client's Request-Line. This altered (but presumably semantically + equivalent) request would not result in the same digest as that + calculated by the client. + + Implementers should be aware of how authenticated transactions + interact with shared caches. 
The HTTP/1.1 protocol specifies that + when a shared cache (see section 13.7 of [2]) has received a request + containing an Authorization header and a response from relaying that + request, it MUST NOT return that response as a reply to any other + request, unless one of two Cache-Control (see section 14.9 of [2]) + directives was present in the response. If the original response + included the "must-revalidate" Cache-Control directive, the cache MAY + use the entity of that response in replying to a subsequent request, + but MUST first revalidate it with the origin server, using the + request headers from the new request to allow the origin server to + authenticate the new request. Alternatively, if the original response + included the "public" Cache-Control directive, the response entity + MAY be returned in reply to any subsequent request. + +3.2.3 The Authentication-Info Header + + The Authentication-Info header is used by the server to communicate + some information regarding the successful authentication in the + response. + + + + + +Franks, et al. Standards Track [Page 15] + +RFC 2617 HTTP Authentication June 1999 + + + AuthenticationInfo = "Authentication-Info" ":" auth-info + auth-info = 1#(nextnonce | [ message-qop ] + | [ response-auth ] | [ cnonce ] + | [nonce-count] ) + nextnonce = "nextnonce" "=" nonce-value + response-auth = "rspauth" "=" response-digest + response-digest = <"> *LHEX <"> + + The value of the nextnonce directive is the nonce the server wishes + the client to use for a future authentication response. The server + may send the Authentication-Info header with a nextnonce field as a + means of implementing one-time or otherwise changing nonces. If the + nextnonce field is present the client SHOULD use it when constructing + the Authorization header for its next request. Failure of the client + to do so may result in a request to re-authenticate from the server + with the "stale=TRUE". 
+ + Server implementations should carefully consider the performance + implications of the use of this mechanism; pipelined requests will + not be possible if every response includes a nextnonce directive + that must be used on the next request received by the server. + Consideration should be given to the performance vs. security + tradeoffs of allowing an old nonce value to be used for a limited + time to permit request pipelining. Use of the nonce-count can + retain most of the security advantages of a new server nonce + without the deleterious effects on pipelining. + + message-qop + Indicates the "quality of protection" options applied to the + response by the server. The value "auth" indicates authentication; + the value "auth-int" indicates authentication with integrity + protection. The server SHOULD use the same value for the message- + qop directive in the response as was sent by the client in the + corresponding request. + + The optional response digest in the "response-auth" directive + supports mutual authentication -- the server proves that it knows the + user's secret, and with qop=auth-int also provides limited integrity + protection of the response. The "response-digest" value is calculated + as for the "request-digest" in the Authorization header, except that + if "qop=auth" or is not specified in the Authorization header for the + request, A2 is + + A2 = ":" digest-uri-value + + and if "qop=auth-int", then A2 is + + A2 = ":" digest-uri-value ":" H(entity-body) + + + +Franks, et al. Standards Track [Page 16] + +RFC 2617 HTTP Authentication June 1999 + + + where "digest-uri-value" is the value of the "uri" directive on the + Authorization header in the request. The "cnonce-value" and "nc- + value" MUST be the ones for the client request to which this message + is the response. The "response-auth", "cnonce", and "nonce-count" + directives MUST be present if "qop=auth" or "qop=auth-int" is + specified. 
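The response-digest calculation described above differs from the request-digest only in A2, where the method is left empty. A minimal Python sketch for the qop=auth case follows. The function names are assumptions made for the example, UTF-8 encoding is assumed, and the function takes H(A1) rather than the password because that is all the server needs to hold.

```python
import hashlib


def md5_hex(text: str) -> str:
    """H(data) for the "MD5" algorithm; UTF-8 encoding is an assumption."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def response_digest(ha1, uri, nonce, nc, cnonce, qop="auth"):
    """Return the unquoted rspauth value for a qop=auth exchange.

    ha1 is H(A1); uri, nonce, nc and cnonce are the unquoted values
    taken from the client's Authorization header.
    """
    if qop != "auth":
        # For qop=auth-int, A2 would also carry ":" H(entity-body).
        raise ValueError("this sketch only handles qop=auth")
    a2 = ":" + uri  # same as the request A2, but with no method
    return md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{md5_hex(a2)}")
```

A client that kept its own H(A1) can run the same computation and compare the result with the rspauth value it received, which is how the mutual authentication mentioned above is checked.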
+ + The Authentication-Info header is allowed in the trailer of an HTTP + message transferred via chunked transfer-coding. + +3.3 Digest Operation + + Upon receiving the Authorization header, the server may check its + validity by looking up the password that corresponds to the submitted + username. Then, the server must perform the same digest operation + (e.g., MD5) performed by the client, and compare the result to the + given request-digest value. + + Note that the HTTP server does not actually need to know the user's + cleartext password. As long as H(A1) is available to the server, the + validity of an Authorization header may be verified. + + The client response to a WWW-Authenticate challenge for a protection + space starts an authentication session with that protection space. + The authentication session lasts until the client receives another + WWW-Authenticate challenge from any server in the protection space. A + client should remember the username, password, nonce, nonce count and + opaque values associated with an authentication session to use to + construct the Authorization header in future requests within that + protection space. The Authorization header may be included + preemptively; doing so improves server efficiency and avoids extra + round trips for authentication challenges. The server may choose to + accept the old Authorization header information, even though the + nonce value included might not be fresh. Alternatively, the server + may return a 401 response with a new nonce value, causing the client + to retry the request; by specifying stale=TRUE with this response, + the server tells the client to retry with the new nonce, but without + prompting for a new username and password. + + Because the client is required to return the value of the opaque + directive given to it by the server for the duration of a session, + the opaque data may be used to transport authentication session state + information. 
(Note that any such use can also be accomplished more + easily and safely by including the state in the nonce.) For example, + a server could be responsible for authenticating content that + actually sits on another server. It would achieve this by having the + first 401 response include a domain directive whose value includes a + URI on the second server, and an opaque directive whose value + + + +Franks, et al. Standards Track [Page 17] + +RFC 2617 HTTP Authentication June 1999 + + + contains the state information. The client will retry the request, at + which time the server might respond with a 301/302 redirection, + pointing to the URI on the second server. The client will follow the + redirection, and pass an Authorization header, including the + <opaque> data. + + As with the basic scheme, proxies must be completely transparent in + the Digest access authentication scheme. That is, they must forward + the WWW-Authenticate, Authentication-Info and Authorization headers + untouched. If a proxy wants to authenticate a client before a request + is forwarded to the server, it can be done using the Proxy- + Authenticate and Proxy-Authorization headers described in section 3.6 + below. + +3.4 Security Protocol Negotiation + + It is useful for a server to be able to know which security schemes a + client is capable of handling. + + It is possible that a server may want to require Digest as its + authentication method, even if the server does not know that the + client supports it. A client is encouraged to fail gracefully if the + server specifies only authentication schemes it cannot handle. + +3.5 Example + + The following example assumes that an access-protected document is + being requested from the server via a GET request. The URI of the + document is "http://www.nowhere.org/dir/index.html". Both client and + server know that the username for this document is "Mufasa", and the + password is "Circle Of Life" (with one space between each of the + three words). 
+ + The first time the client requests the document, no Authorization + header is sent, so the server responds with: + + HTTP/1.1 401 Unauthorized + WWW-Authenticate: Digest + realm="testrealm@host.com", + qop="auth,auth-int", + nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093", + opaque="5ccc069c403ebaf9f0171e9517f40e41" + + The client may prompt the user for the username and password, after + which it will respond with a new request, including the following + Authorization header: + + + + + +Franks, et al. Standards Track [Page 18] + +RFC 2617 HTTP Authentication June 1999 + + + Authorization: Digest username="Mufasa", + realm="testrealm@host.com", + nonce="dcd98b7102dd2f0e8b11d0f600bfb0c093", + uri="/dir/index.html", + qop=auth, + nc=00000001, + cnonce="0a4f113b", + response="6629fae49393a05397450978507c4ef1", + opaque="5ccc069c403ebaf9f0171e9517f40e41" + +3.6 Proxy-Authentication and Proxy-Authorization + + The digest authentication scheme may also be used for authenticating + users to proxies, proxies to proxies, or proxies to origin servers by + use of the Proxy-Authenticate and Proxy-Authorization headers. These + headers are instances of the Proxy-Authenticate and Proxy- + Authorization headers specified in sections 10.33 and 10.34 of the + HTTP/1.1 specification [2] and their behavior is subject to + restrictions described there. The transactions for proxy + authentication are very similar to those already described. Upon + receiving a request which requires authentication, the proxy/server + must issue the "407 Proxy Authentication Required" response with a + "Proxy-Authenticate" header. The digest-challenge used in the + Proxy-Authenticate header is the same as that for the WWW- + Authenticate header as defined above in section 3.2.1. + + The client/proxy must then re-issue the request with a Proxy- + Authorization header, with directives as specified for the + Authorization header in section 3.2.2 above. 
+ + On subsequent responses, the server sends Proxy-Authentication-Info + with directives the same as those for the Authentication-Info header + field. + + Note that in principle a client could be asked to authenticate itself + to both a proxy and an end-server, but never in the same response. + +4 Security Considerations + +4.1 Authentication of Clients using Basic Authentication + + The Basic authentication scheme is not a secure method of user + authentication, nor does it in any way protect the entity, which is + transmitted in cleartext across the physical network used as the + carrier. HTTP does not prevent additional authentication schemes and + encryption mechanisms from being employed to increase security or the + addition of enhancements (such as schemes to use one-time passwords) + to Basic authentication. + + + +Franks, et al. Standards Track [Page 19] + +RFC 2617 HTTP Authentication June 1999 + + + The most serious flaw in Basic authentication is that it results in + the essentially cleartext transmission of the user's password over + the physical network. It is this problem which Digest Authentication + attempts to address. + + Because Basic authentication involves the cleartext transmission of + passwords it SHOULD NOT be used (without enhancements) to protect + sensitive or valuable information. + + A common use of Basic authentication is for identification purposes + -- requiring the user to provide a user name and password as a means + of identification, for example, for purposes of gathering accurate + usage statistics on a server. When used in this way it is tempting to + think that there is no danger in its use if illicit access to the + protected documents is not a major concern. This is only correct if + the server issues both user name and password to the users and in + particular does not allow the user to choose his or her own password. 
+ The danger arises because naive users frequently reuse a single + password to avoid the task of maintaining multiple passwords. + + If a server permits users to select their own passwords, then the + threat is not only unauthorized access to documents on the server but + also unauthorized access to any other resources on other systems that + the user protects with the same password. Furthermore, in the + server's password database, many of the passwords may also be users' + passwords for other sites. The owner or administrator of such a + system could therefore expose all users of the system to the risk of + unauthorized access to all those sites if this information is not + maintained in a secure fashion. + + Basic Authentication is also vulnerable to spoofing by counterfeit + servers. If a user can be led to believe that he is connecting to a + host containing information protected by Basic authentication when, + in fact, he is connecting to a hostile server or gateway, then the + attacker can request a password, store it for later use, and feign an + error. This type of attack is not possible with Digest + Authentication. Server implementers SHOULD guard against the + possibility of this sort of counterfeiting by gateways or CGI + scripts. In particular it is very dangerous for a server to simply + turn over a connection to a gateway. That gateway can then use the + persistent connection mechanism to engage in multiple transactions + with the client while impersonating the original server in a way that + is not detectable by the client. + +4.2 Authentication of Clients using Digest Authentication + + Digest Authentication does not provide a strong authentication + mechanism, when compared to public key based mechanisms, for example. + + + +Franks, et al. Standards Track [Page 20] + +RFC 2617 HTTP Authentication June 1999 + + + However, it is significantly stronger than (e.g.) 
CRAM-MD5, which has + been proposed for use with LDAP [10], POP and IMAP (see RFC 2195 + [9]). It is intended to replace the much weaker and even more + dangerous Basic mechanism. + + Digest Authentication offers no confidentiality protection beyond + protecting the actual password. All of the rest of the request and + response are available to an eavesdropper. + + Digest Authentication offers only limited integrity protection for + the messages in either direction. If qop=auth-int mechanism is used, + those parts of the message used in the calculation of the WWW- + Authenticate and Authorization header field response directive values + (see section 3.2 above) are protected. Most header fields and their + values could be modified as a part of a man-in-the-middle attack. + + Many needs for secure HTTP transactions cannot be met by Digest + Authentication. For those needs TLS or SHTTP are more appropriate + protocols. In particular Digest authentication cannot be used for any + transaction requiring confidentiality protection. Nevertheless many + functions remain for which Digest authentication is both useful and + appropriate. Any service in present use that uses Basic should be + switched to Digest as soon as practical. + +4.3 Limited Use Nonce Values + + The Digest scheme uses a server-specified nonce to seed the + generation of the request-digest value (as specified in section + 3.2.2.1 above). As shown in the example nonce in section 3.2.1, the + server is free to construct the nonce such that it may only be used + from a particular client, for a particular resource, for a limited + period of time or number of uses, or any other restrictions. Doing + so strengthens the protection provided against, for example, replay + attacks (see 4.5). However, it should be noted that the method + chosen for generating and checking the nonce also has performance and + resource implications. 
For example, a server may choose to allow + each nonce value to be used only once by maintaining a record of + whether or not each recently issued nonce has been returned and + sending a next-nonce directive in the Authentication-Info header + field of every response. This protects against even an immediate + replay attack, but has a high cost checking nonce values, and perhaps + more important will cause authentication failures for any pipelined + requests (presumably returning a stale nonce indication). Similarly, + incorporating a request-specific element such as the Etag value for a + resource limits the use of the nonce to that version of the resource + and also defeats pipelining. Thus it may be useful to do so for + methods with side effects but have unacceptable performance for those + that do not. + + + +Franks, et al. Standards Track [Page 21] + +RFC 2617 HTTP Authentication June 1999 + + +4.4 Comparison of Digest with Basic Authentication + + Both Digest and Basic Authentication are very much on the weak end of + the security strength spectrum. But a comparison between the two + points out the utility, even necessity, of replacing Basic by Digest. + + The greatest threat to the type of transactions for which these + protocols are used is network snooping. This kind of transaction + might involve, for example, online access to a database whose use is + restricted to paying subscribers. With Basic authentication an + eavesdropper can obtain the password of the user. This not only + permits him to access anything in the database, but, often worse, + will permit access to anything else the user protects with the same + password. + + By contrast, with Digest Authentication the eavesdropper only gets + access to the transaction in question and not to the user's password. + The information gained by the eavesdropper would permit a replay + attack, but only with a request for the same document, and even that + may be limited by the server's choice of nonce. 
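The time-limited nonce suggested in section 3.2.1, the base 64 encoding of a time-stamp followed by H(time-stamp ":" ETag ":" private-key), can be sketched in Python as shown here. This is one possible reading of that construction and not a definitive implementation: the function names, the single space separating the two fields, the 300 second lifetime and the three result strings are all assumptions made for the example.

```python
import base64
import hashlib
import hmac
import time


def _hash(ts, etag, private_key):
    """H(time-stamp ":" ETag ":" private-key) using MD5, as in the text."""
    return hashlib.md5(f"{ts}:{etag}:{private_key}".encode("utf-8")).hexdigest()


def make_nonce(etag, private_key, now=None):
    """Build a nonce that embeds its own issue time (whole seconds)."""
    ts = str(int(time.time() if now is None else now))
    raw = f"{ts} {_hash(ts, etag, private_key)}"
    return base64.b64encode(raw.encode("ascii")).decode("ascii")


def check_nonce(nonce, etag, private_key, max_age=300, now=None):
    """Return "ok", "stale" or "invalid" for a nonce sent back by a client."""
    try:
        decoded = base64.b64decode(nonce, validate=True).decode("ascii")
        ts, digest = decoded.split(" ")
        issued = int(ts)
    except ValueError:
        # Covers bad base64, non-ASCII bytes, a wrong field count and a
        # non-numeric time-stamp.
        return "invalid"
    # Recalculate the hash portion and compare in constant time.
    if not hmac.compare_digest(digest, _hash(ts, etag, private_key)):
        return "invalid"
    current = time.time() if now is None else now
    if current < issued:
        return "invalid"
    return "ok" if current - issued <= max_age else "stale"
```

No per-nonce state is kept on the server in this sketch. A "stale" result corresponds to the case where the server would answer 401 with stale=TRUE, provided the request-digest computed with that nonce is otherwise valid.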
+ +4.5 Replay Attacks + + A replay attack against Digest authentication would usually be + pointless for a simple GET request since an eavesdropper would + already have seen the only document he could obtain with a replay. + This is because the URI of the requested document is digested in the + client request and the server will only deliver that document. By + contrast under Basic Authentication once the eavesdropper has the + user's password, any document protected by that password is open to + him. + + Thus, for some purposes, it is necessary to protect against replay + attacks. A good Digest implementation can do this in various ways. + The server created "nonce" value is implementation dependent, but if + it contains a digest of the client IP, a time-stamp, the resource + ETag, and a private server key (as recommended above) then a replay + attack is not simple. An attacker must convince the server that the + request is coming from a false IP address and must cause the server + to deliver the document to an IP address different from the address + to which it believes it is sending the document. An attack can only + succeed in the period before the time-stamp expires. Digesting the + client IP and time-stamp in the nonce permits an implementation which + does not maintain state between transactions. + + For applications where no possibility of replay attack can be + tolerated the server can use one-time nonce values which will not be + honored for a second use. This requires the overhead of the server + + + +Franks, et al. Standards Track [Page 22] + +RFC 2617 HTTP Authentication June 1999 + + + remembering which nonce values have been used until the nonce time- + stamp (and hence the digest built with it) has expired, but it + effectively protects against replay attacks. + + An implementation must give special attention to the possibility of + replay attacks with POST and PUT requests. 
Unless the server employs + one-time or otherwise limited-use nonces and/or insists on the use of + the integrity protection of qop=auth-int, an attacker could replay + valid credentials from a successful request with counterfeit form + data or other message body. Even with the use of integrity protection + most metadata in header fields is not protected. Proper nonce + generation and checking provides some protection against replay of + previously used valid credentials, but see 4.8. + +4.6 Weakness Created by Multiple Authentication Schemes + + An HTTP/1.1 server may return multiple challenges with a 401 + (Unauthorized) response, and each challenge may use a different + auth-scheme. A user agent MUST choose to use the strongest auth- + scheme it understands and request credentials from the user based + upon that challenge. + + Note that many browsers will only recognize Basic and will require + that it be the first auth-scheme presented. Servers should only + include Basic if it is minimally acceptable. + + When the server offers choices of authentication schemes using the + WWW-Authenticate header, the strength of the resulting authentication + is only as good as that of the weakest of the authentication + schemes. See section 4.8 below for discussion of particular attack + scenarios that exploit multiple authentication schemes. + +4.7 Online dictionary attacks + + If the attacker can eavesdrop, then it can test any overheard + nonce/response pairs against a list of common words. Such a list is + usually much smaller than the total number of possible passwords. The + cost of computing the response for each password on the list is paid + once for each challenge. + + The server can mitigate this attack by not allowing users to select + passwords that are in a dictionary. + + + + + + + + + +Franks, et al.
Standards Track [Page 23] + +RFC 2617 HTTP Authentication June 1999 + + +4.8 Man in the Middle + + Both Basic and Digest authentication are vulnerable to "man in the + middle" (MITM) attacks, for example, from a hostile or compromised + proxy. Clearly, this would present all the problems of eavesdropping. + But it also offers some additional opportunities to the attacker. + + A possible man-in-the-middle attack would be to add a weak + authentication scheme to the set of choices, hoping that the client + will use one that exposes the user's credentials (e.g. password). For + this reason, the client should always use the strongest scheme that + it understands from the choices offered. + + An even better MITM attack would be to remove all offered choices, + replacing them with a challenge that requests only Basic + authentication, then uses the cleartext credentials from the Basic + authentication to authenticate to the origin server using the + stronger scheme it requested. A particularly insidious way to mount + such a MITM attack would be to offer a "free" proxy caching service + to gullible users. + + User agents should consider measures such as presenting a visual + indication at the time of the credentials request of what + authentication scheme is to be used, or remembering the strongest + authentication scheme ever requested by a server and produce a + warning message before using a weaker one. It might also be a good + idea for the user agent to be configured to demand Digest + authentication in general, or from specific sites. + + Or, a hostile proxy might spoof the client into making a request the + attacker wanted rather than one the client wanted. Of course, this is + still much harder than a comparable attack against Basic + Authentication. + +4.9 Chosen plaintext attacks + + With Digest authentication, a MITM or a malicious server can + arbitrarily choose the nonce that the client will use to compute the + response. 
This is called a "chosen plaintext" attack. The ability to + choose the nonce is known to make cryptanalysis much easier [8]. + + However, no way to analyze the MD5 one-way function used by Digest + using chosen plaintext is currently known. + + The countermeasure against this attack is for clients to be + configured to require the use of the optional "cnonce" directive; + this allows the client to vary the input to the hash in a way not + chosen by the attacker. + + + +Franks, et al. Standards Track [Page 24] + +RFC 2617 HTTP Authentication June 1999 + + +4.10 Precomputed dictionary attacks + + With Digest authentication, if the attacker can execute a chosen + plaintext attack, the attacker can precompute the response for many + common words to a nonce of its choice, and store a dictionary of + (response, password) pairs. Such precomputation can often be done in + parallel on many machines. It can then use the chosen plaintext + attack to acquire a response corresponding to that challenge, and + just look up the password in the dictionary. Even if most passwords + are not in the dictionary, some might be. Since the attacker gets to + pick the challenge, the cost of computing the response for each + password on the list can be amortized over finding many passwords. A + dictionary with 100 million password/response pairs would take about + 3.2 gigabytes of disk storage. + + The countermeasure against this attack is for clients to be + configured to require the use of the optional "cnonce" directive. + +4.11 Batch brute force attacks + + With Digest authentication, a MITM can execute a chosen plaintext + attack, and can gather responses from many users to the same nonce. + It can then find all the passwords within any subset of password + space that would generate one of the nonce/response pairs in a single + pass over that space. It also reduces the time to find the first + password by a factor equal to the number of nonce/response pairs + gathered.
This search of the password space can often be done in + parallel on many machines, and even a single machine can search large + subsets of the password space very quickly -- reports exist of + searching all passwords with six or fewer letters in a few hours. + + The countermeasure against this attack is for clients to be + configured to require the use of the optional "cnonce" directive. + +4.12 Spoofing by Counterfeit Servers + + Basic Authentication is vulnerable to spoofing by counterfeit + servers. If a user can be led to believe that she is connecting to a + host containing information protected by a password she knows, when + in fact she is connecting to a hostile server, then the hostile + server can request a password, store it away for later use, and feign + an error. This type of attack is more difficult with Digest + Authentication -- but the client must know to demand that Digest + authentication be used, perhaps using some of the techniques + described above to counter "man-in-the-middle" attacks. Again, the + user can be helped in detecting this attack by a visual indication of + the authentication mechanism in use with appropriate guidance in + interpreting the implications of each scheme. + + + +Franks, et al. Standards Track [Page 25] + +RFC 2617 HTTP Authentication June 1999 + + +4.13 Storing passwords + + Digest authentication requires that the authenticating agent (usually + the server) store some data derived from the user's name and password + in a "password file" associated with a given realm. Normally this + might contain pairs consisting of username and H(A1), where H(A1) is + the digested value of the username, realm, and password as described + above. + + The security implications of this are that if this password file is + compromised, then an attacker gains immediate access to documents on + the server using this realm.
Unlike, say, a standard UNIX password + file, this information need not be decrypted in order to access + documents in the server realm associated with this file. On the other + hand, decryption, or more likely a brute force attack, would be + necessary to obtain the user's password. This is the reason that the + realm is part of the digested data stored in the password file. It + means that if one Digest authentication password file is compromised, + it does not automatically compromise others with the same username + and password (though it does expose them to brute force attack). + + There are two important security consequences of this. First the + password file must be protected as if it contained unencrypted + passwords, because for the purpose of accessing documents in its + realm, it effectively does. + + A second consequence of this is that the realm string should be + unique among all realms which any single user is likely to use. In + particular a realm string should include the name of the host doing + the authentication. The inability of the client to authenticate the + server is a weakness of Digest Authentication. + +4.14 Summary + + By modern cryptographic standards Digest Authentication is weak. But + for a large range of purposes it is valuable as a replacement for + Basic Authentication. It remedies some, but not all, weaknesses of + Basic Authentication. Its strength may vary depending on the + implementation. In particular the structure of the nonce (which is + dependent on the server implementation) may affect the ease of + mounting a replay attack. A range of server options is appropriate + since, for example, some implementations may be willing to accept the + server overhead of one-time nonces or digests to eliminate the + possibility of replay. Others may be satisfied with a nonce like the one + recommended above restricted to a single IP address and a single ETag + or with a limited lifetime. + + + + + +Franks, et al.
Standards Track [Page 26] + +RFC 2617 HTTP Authentication June 1999 + + + The bottom line is that *any* compliant implementation will be + relatively weak by cryptographic standards, but *any* compliant + implementation will be far superior to Basic Authentication. + +5 Sample implementation + + The following code implements the calculations of H(A1), H(A2), + request-digest and response-digest, and a test program which computes + the values used in the example of section 3.5. It uses the MD5 + implementation from RFC 1321. + + File "digcalc.h": + +#define HASHLEN 16 +typedef char HASH[HASHLEN]; +#define HASHHEXLEN 32 +typedef char HASHHEX[HASHHEXLEN+1]; +#define IN +#define OUT + +/* calculate H(A1) as per HTTP Digest spec */ +void DigestCalcHA1( + IN char * pszAlg, + IN char * pszUserName, + IN char * pszRealm, + IN char * pszPassword, + IN char * pszNonce, + IN char * pszCNonce, + OUT HASHHEX SessionKey + ); + +/* calculate request-digest/response-digest as per HTTP Digest spec */ +void DigestCalcResponse( + IN HASHHEX HA1, /* H(A1) */ + IN char * pszNonce, /* nonce from server */ + IN char * pszNonceCount, /* 8 hex digits */ + IN char * pszCNonce, /* client nonce */ + IN char * pszQop, /* qop-value: "", "auth", "auth-int" */ + IN char * pszMethod, /* method from the request */ + IN char * pszDigestUri, /* requested URL */ + IN HASHHEX HEntity, /* H(entity body) if qop="auth-int" */ + OUT HASHHEX Response /* request-digest or response-digest */ + ); + +File "digcalc.c": + +#include <global.h> +#include <md5.h> + + + +Franks, et al.
Standards Track [Page 27] + +RFC 2617 HTTP Authentication June 1999 + + +#include <string.h> +#include "digcalc.h" + +void CvtHex( + IN HASH Bin, + OUT HASHHEX Hex + ) +{ + unsigned short i; + unsigned char j; + + for (i = 0; i < HASHLEN; i++) { + j = (Bin[i] >> 4) & 0xf; + if (j <= 9) + Hex[i*2] = (j + '0'); + else + Hex[i*2] = (j + 'a' - 10); + j = Bin[i] & 0xf; + if (j <= 9) + Hex[i*2+1] = (j + '0'); + else + Hex[i*2+1] = (j + 'a' - 10); + }; + Hex[HASHHEXLEN] = '\0'; +}; + +/* calculate H(A1) as per spec */ +void DigestCalcHA1( + IN char * pszAlg, + IN char * pszUserName, + IN char * pszRealm, + IN char * pszPassword, + IN char * pszNonce, + IN char * pszCNonce, + OUT HASHHEX SessionKey + ) +{ + MD5_CTX Md5Ctx; + HASH HA1; + + MD5Init(&Md5Ctx); + MD5Update(&Md5Ctx, pszUserName, strlen(pszUserName)); + MD5Update(&Md5Ctx, ":", 1); + MD5Update(&Md5Ctx, pszRealm, strlen(pszRealm)); + MD5Update(&Md5Ctx, ":", 1); + MD5Update(&Md5Ctx, pszPassword, strlen(pszPassword)); + MD5Final(HA1, &Md5Ctx); + if (stricmp(pszAlg, "md5-sess") == 0) { + + + +Franks, et al.
Standards Track [Page 28] + +RFC 2617 HTTP Authentication June 1999 + + + MD5Init(&Md5Ctx); + MD5Update(&Md5Ctx, HA1, HASHLEN); + MD5Update(&Md5Ctx, ":", 1); + MD5Update(&Md5Ctx, pszNonce, strlen(pszNonce)); + MD5Update(&Md5Ctx, ":", 1); + MD5Update(&Md5Ctx, pszCNonce, strlen(pszCNonce)); + MD5Final(HA1, &Md5Ctx); + }; + CvtHex(HA1, SessionKey); +}; + +/* calculate request-digest/response-digest as per HTTP Digest spec */ +void DigestCalcResponse( + IN HASHHEX HA1, /* H(A1) */ + IN char * pszNonce, /* nonce from server */ + IN char * pszNonceCount, /* 8 hex digits */ + IN char * pszCNonce, /* client nonce */ + IN char * pszQop, /* qop-value: "", "auth", "auth-int" */ + IN char * pszMethod, /* method from the request */ + IN char * pszDigestUri, /* requested URL */ + IN HASHHEX HEntity, /* H(entity body) if qop="auth-int" */ + OUT HASHHEX Response /* request-digest or response-digest */ + ) +{ + MD5_CTX Md5Ctx; + HASH HA2; + HASH RespHash; + HASHHEX HA2Hex; + + // calculate H(A2) + MD5Init(&Md5Ctx); + MD5Update(&Md5Ctx, pszMethod, strlen(pszMethod)); + MD5Update(&Md5Ctx, ":", 1); + MD5Update(&Md5Ctx, pszDigestUri, strlen(pszDigestUri)); + if (stricmp(pszQop, "auth-int") == 0) { + MD5Update(&Md5Ctx, ":", 1); + MD5Update(&Md5Ctx, HEntity, HASHHEXLEN); + }; + MD5Final(HA2, &Md5Ctx); + CvtHex(HA2, HA2Hex); + + // calculate response + MD5Init(&Md5Ctx); + MD5Update(&Md5Ctx, HA1, HASHHEXLEN); + MD5Update(&Md5Ctx, ":", 1); + MD5Update(&Md5Ctx, pszNonce, strlen(pszNonce)); + MD5Update(&Md5Ctx, ":", 1); + if (*pszQop) { + + + +Franks, et al. 
Standards Track [Page 29] + +RFC 2617 HTTP Authentication June 1999 + + + MD5Update(&Md5Ctx, pszNonceCount, strlen(pszNonceCount)); + MD5Update(&Md5Ctx, ":", 1); + MD5Update(&Md5Ctx, pszCNonce, strlen(pszCNonce)); + MD5Update(&Md5Ctx, ":", 1); + MD5Update(&Md5Ctx, pszQop, strlen(pszQop)); + MD5Update(&Md5Ctx, ":", 1); + }; + MD5Update(&Md5Ctx, HA2Hex, HASHHEXLEN); + MD5Final(RespHash, &Md5Ctx); + CvtHex(RespHash, Response); +}; + +File "digtest.c": + + +#include <stdio.h> +#include "digcalc.h" + +void main(int argc, char ** argv) { + + char * pszNonce = "dcd98b7102dd2f0e8b11d0f600bfb0c093"; + char * pszCNonce = "0a4f113b"; + char * pszUser = "Mufasa"; + char * pszRealm = "testrealm@host.com"; + char * pszPass = "Circle Of Life"; + char * pszAlg = "md5"; + char szNonceCount[9] = "00000001"; + char * pszMethod = "GET"; + char * pszQop = "auth"; + char * pszURI = "/dir/index.html"; + HASHHEX HA1; + HASHHEX HA2 = ""; + HASHHEX Response; + + DigestCalcHA1(pszAlg, pszUser, pszRealm, pszPass, pszNonce, +pszCNonce, HA1); + DigestCalcResponse(HA1, pszNonce, szNonceCount, pszCNonce, pszQop, + pszMethod, pszURI, HA2, Response); + printf("Response = %s\n", Response); +}; + + + + + + + + + + + +Franks, et al. Standards Track [Page 30] + +RFC 2617 HTTP Authentication June 1999 + + +6 Acknowledgments + + Eric W. Sink, of AbiSource, Inc., was one of the original authors + before the specification underwent substantial revision. + + In addition to the authors, valuable discussion instrumental in + creating this document has come from Peter J. Churchyard, Ned Freed, + and David M. Kristol. + + Jim Gettys and Larry Masinter edited this document for update. + +7 References + + [1] Berners-Lee, T., Fielding, R. and H. Frystyk, "Hypertext + Transfer Protocol -- HTTP/1.0", RFC 1945, May 1996. + + [2] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., + Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- + HTTP/1.1", RFC 2616, June 1999.
+ + [3] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321, April + 1992. + + [4] Freed, N. and N. Borenstein. "Multipurpose Internet Mail + Extensions (MIME) Part One: Format of Internet Message Bodies", + RFC 2045, November 1996. + + [5] Dierks, T. and C. Allen "The TLS Protocol, Version 1.0", RFC + 2246, January 1999. + + [6] Franks, J., Hallam-Baker, P., Hostetler, J., Leach, P., + Luotonen, A., Sink, E. and L. Stewart, "An Extension to HTTP : + Digest Access Authentication", RFC 2069, January 1997. + + [7] Berners Lee, T, Fielding, R. and L. Masinter, "Uniform Resource + Identifiers (URI): Generic Syntax", RFC 2396, August 1998. + + [8] Kaliski, B.,Robshaw, M., "Message Authentication with MD5", + CryptoBytes, Spring 1995, RSA Inc, + (http://www.rsa.com/rsalabs/pubs/cryptobytes/spring95/md5.htm) + + [9] Klensin, J., Catoe, R. and P. Krumviede, "IMAP/POP AUTHorize + Extension for Simple Challenge/Response", RFC 2195, September + 1997. + + [10] Morgan, B., Alvestrand, H., Hodges, J., Wahl, M., + "Authentication Methods for LDAP", Work in Progress. + + + + +Franks, et al. Standards Track [Page 31] + +RFC 2617 HTTP Authentication June 1999 + + +8 Authors' Addresses + + John Franks + Professor of Mathematics + Department of Mathematics + Northwestern University + Evanston, IL 60208-2730, USA + + EMail: john@math.nwu.edu + + + Phillip M. Hallam-Baker + Principal Consultant + Verisign Inc. + 301 Edgewater Place + Suite 210 + Wakefield MA 01880, USA + + EMail: pbaker@verisign.com + + + Jeffery L. Hostetler + Software Craftsman + AbiSource, Inc. + 6 Dunlap Court + Savoy, IL 61874 + + EMail: jeff@AbiSource.com + + + Scott D. Lawrence + Agranat Systems, Inc. + 5 Clocktower Place, Suite 400 + Maynard, MA 01754, USA + + EMail: lawrence@agranat.com + + + Paul J. Leach + Microsoft Corporation + 1 Microsoft Way + Redmond, WA 98052, USA + + EMail: paulle@microsoft.com + + + + + + + +Franks, et al.
Standards Track [Page 32] + +RFC 2617 HTTP Authentication June 1999 + + + Ari Luotonen + Member of Technical Staff + Netscape Communications Corporation + 501 East Middlefield Road + Mountain View, CA 94043, USA + + + Lawrence C. Stewart + Open Market, Inc. + 215 First Street + Cambridge, MA 02142, USA + + EMail: stewart@OpenMarket.com + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Franks, et al. Standards Track [Page 33] + +RFC 2617 HTTP Authentication June 1999 + + +9. Full Copyright Statement + + Copyright (C) The Internet Society (1999). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
+ +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Franks, et al. Standards Track [Page 34] + diff --git a/doc/rfc/rfc2817.txt b/doc/rfc/rfc2817.txt new file mode 100644 index 0000000000..d7b7e703bb --- /dev/null +++ b/doc/rfc/rfc2817.txt @@ -0,0 +1,731 @@ + + + + + + +Network Working Group R. Khare +Request for Comments: 2817 4K Associates / UC Irvine +Updates: 2616 S. Lawrence +Category: Standards Track Agranat Systems, Inc. + May 2000 + + + Upgrading to TLS Within HTTP/1.1 + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2000). All Rights Reserved. + +Abstract + + This memo explains how to use the Upgrade mechanism in HTTP/1.1 to + initiate Transport Layer Security (TLS) over an existing TCP + connection. This allows unsecured and secured HTTP traffic to share + the same well known port (in this case, http: at 80 rather than + https: at 443). It also enables "virtual hosting", so a single HTTP + + TLS server can disambiguate traffic intended for several hostnames at + a single IP address. + + Since HTTP/1.1 [1] defines Upgrade as a hop-by-hop mechanism, this + memo also documents the HTTP CONNECT method for establishing end-to- + end tunnels across HTTP proxies. Finally, this memo establishes new + IANA registries for public HTTP status codes, as well as public or + private Upgrade product tokens. + + This memo does NOT affect the current definition of the 'https' URI + scheme, which already defines a separate namespace + (http://example.org/ and https://example.org/ are not equivalent). 
+ + + + + + + + + + + +Khare & Lawrence Standards Track [Page 1] + +RFC 2817 HTTP Upgrade to TLS May 2000 + + +Table of Contents + + 1. Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . 2 + 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 + 2.1 Requirements Terminology . . . . . . . . . . . . . . . . . . . 4 + 3. Client Requested Upgrade to HTTP over TLS . . . . . . . . . . 4 + 3.1 Optional Upgrade . . . . . . . . . . . . . . . . . . . . . . . 4 + 3.2 Mandatory Upgrade . . . . . . . . . . . . . . . . . . . . . . 4 + 3.3 Server Acceptance of Upgrade Request . . . . . . . . . . . . . 4 + 4. Server Requested Upgrade to HTTP over TLS . . . . . . . . . . 5 + 4.1 Optional Advertisement . . . . . . . . . . . . . . . . . . . . 5 + 4.2 Mandatory Advertisement . . . . . . . . . . . . . . . . . . . 5 + 5. Upgrade across Proxies . . . . . . . . . . . . . . . . . . . . 6 + 5.1 Implications of Hop By Hop Upgrade . . . . . . . . . . . . . . 6 + 5.2 Requesting a Tunnel with CONNECT . . . . . . . . . . . . . . . 6 + 5.3 Establishing a Tunnel with CONNECT . . . . . . . . . . . . . . 7 + 6. Rationale for the use of a 4xx (client error) Status Code . . 7 + 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 + 7.1 HTTP Status Code Registry . . . . . . . . . . . . . . . . . . 8 + 7.2 HTTP Upgrade Token Registry . . . . . . . . . . . . . . . . . 8 + 8. Security Considerations . . . . . . . . . . . . . . . . . . . 9 + 8.1 Implications for the https: URI Scheme . . . . . . . . . . . . 10 + 8.2 Security Considerations for CONNECT . . . . . . . . . . . . . 10 + References . . . . . . . . . . . . . . . . . . . . . . . . . . 10 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 11 + A. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 12 + Full Copyright Statement . . . . . . . . . . . . . . . . . . . 13 + +1. 
Motivation + + The historical practice of deploying HTTP over SSL3 [3] has + distinguished the combination from HTTP alone by a unique URI scheme + and the TCP port number. The scheme 'http' meant the HTTP protocol + alone on port 80, while 'https' meant the HTTP protocol over SSL on + port 443. Parallel well-known port numbers have similarly been + requested -- and in some cases, granted -- to distinguish between + secured and unsecured use of other application protocols (e.g. + snews, ftps). This approach effectively halves the number of + available well known ports. + + At the Washington DC IETF meeting in December 1997, the Applications + Area Directors and the IESG reaffirmed that the practice of issuing + parallel "secure" port numbers should be deprecated. The HTTP/1.1 + Upgrade mechanism can apply Transport Layer Security [6] to an open + HTTP connection. + + + + + + +Khare & Lawrence Standards Track [Page 2] + +RFC 2817 HTTP Upgrade to TLS May 2000 + + + In the nearly two years since, there has been broad acceptance of the + concept behind this proposal, but little interest in implementing + alternatives to port 443 for generic Web browsing. In fact, nothing + in this memo affects the current interpretation of https: URIs. + However, new application protocols built atop HTTP, such as the + Internet Printing Protocol [7], call for just such a mechanism in + order to move ahead in the IETF standards process. + + The Upgrade mechanism also solves the "virtual hosting" problem. + Rather than allocating multiple IP addresses to a single host, an + HTTP/1.1 server will use the Host: header to disambiguate the + intended web service. As HTTP/1.1 usage has grown more prevalent, + more ISPs are offering name-based virtual hosting, thus delaying IP + address space exhaustion. + + TLS (and SSL) have been hobbled by the same limitation as earlier + versions of HTTP: the initial handshake does not specify the intended + hostname, relying exclusively on the IP address. 
Using a cleartext + HTTP/1.1 Upgrade: preamble to the TLS handshake -- choosing the + certificates based on the initial Host: header -- will allow ISPs to + provide secure name-based virtual hosting as well. + +2. Introduction + + TLS, a.k.a., SSL (Secure Sockets Layer), establishes a private end- + to-end connection, optionally including strong mutual authentication, + using a variety of cryptosystems. Initially, a handshake phase uses + three subprotocols to set up a record layer, authenticate endpoints, + set parameters, as well as report errors. Then, there is an ongoing + layered record protocol that handles encryption, compression, and + reassembly for the remainder of the connection. The latter is + intended to be completely transparent. For example, there is no + dependency between TLS's record markers and/or certificates and + HTTP/1.1's chunked encoding or authentication. + + Either the client or server can use the HTTP/1.1 [1] Upgrade + mechanism (Section 14.42) to indicate that a TLS-secured connection + is desired or necessary. This memo defines the "TLS/1.0" Upgrade + token, and a new HTTP Status Code, "426 Upgrade Required". + + Section 3 and Section 4 describe the operation of a directly + connected client and server. Intermediate proxies must establish an + end-to-end tunnel before applying those operations, as explained in + Section 5. + + + + + + + +Khare & Lawrence Standards Track [Page 3] + +RFC 2817 HTTP Upgrade to TLS May 2000 + + +2.1 Requirements Terminology + + Keywords "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT" and + "MAY" that appear in this document are to be interpreted as described + in RFC 2119 [11]. + +3. Client Requested Upgrade to HTTP over TLS + + When the client sends an HTTP/1.1 request with an Upgrade header + field containing the token "TLS/1.0", it is requesting the server to + complete the current HTTP/1.1 request after switching to TLS/1.0.
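As an illustration of such a request, the sketch below formats the header block a client might send. It is a minimal example in C under stated assumptions, not text from the memo: the function name build_tls_upgrade_request and its parameters are hypothetical, and the caller supplies the method, request target and host.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not defined by RFC 2817): write an HTTP/1.1
 * request that asks the server to switch to TLS/1.0. The Connection
 * header carries the "Upgrade" keyword, as HTTP/1.1 requires whenever
 * an Upgrade header is present.
 * Returns the request length, or -1 if buf is too small. */
int build_tls_upgrade_request(char *buf, size_t len,
                              const char *method,
                              const char *target,
                              const char *host)
{
    int n = snprintf(buf, len,
                     "%s %s HTTP/1.1\r\n"
                     "Host: %s\r\n"
                     "Upgrade: TLS/1.0\r\n"
                     "Connection: Upgrade\r\n"
                     "\r\n",
                     method, target, host);

    if (n < 0 || (size_t)n >= len)
        return -1;      /* encoding error or truncated output */
    return n;
}
```

Passing "OPTIONS" and "*" produces the kind of request a client would send when it cannot accept an unsecured response; passing "GET" and a resource produces an optional offer that the server may answer in the clear.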
+ +3.1 Optional Upgrade + + A client MAY offer to switch to secured operation during any clear + HTTP request when an unsecured response would be acceptable: + + GET http://example.bank.com/acct_stat.html?749394889300 HTTP/1.1 + Host: example.bank.com + Upgrade: TLS/1.0 + Connection: Upgrade + + In this case, the server MAY respond to the clear HTTP operation + normally, OR switch to secured operation (as detailed in the next + section). + + Note that HTTP/1.1 [1] specifies "the upgrade keyword MUST be + supplied within a Connection header field (section 14.10) whenever + Upgrade is present in an HTTP/1.1 message". + +3.2 Mandatory Upgrade + + If an unsecured response would be unacceptable, a client MUST send an + OPTIONS request first to complete the switch to TLS/1.0 (if + possible). + + OPTIONS * HTTP/1.1 + Host: example.bank.com + Upgrade: TLS/1.0 + Connection: Upgrade + +3.3 Server Acceptance of Upgrade Request + + As specified in HTTP/1.1 [1], if the server is prepared to initiate + the TLS handshake, it MUST send the intermediate "101 Switching + Protocols" and MUST include an Upgrade response header specifying the + tokens of the protocol stack it is switching to: + + + + +Khare & Lawrence Standards Track [Page 4] + +RFC 2817 HTTP Upgrade to TLS May 2000 + + + HTTP/1.1 101 Switching Protocols + Upgrade: TLS/1.0, HTTP/1.1 + Connection: Upgrade + + Note that the protocol tokens listed in the Upgrade header of a 101 + Switching Protocols response specify an ordered 'bottom-up' stack. + + As specified in HTTP/1.1 [1], Section 10.1.2: "The server will + switch protocols to those defined by the response's Upgrade header + field immediately after the empty line which terminates the 101 + response". + + Once the TLS handshake completes successfully, the server MUST + continue with the response to the original request. Any TLS handshake + failure MUST lead to disconnection, per the TLS error alert + specification. + +4.
Server Requested Upgrade to HTTP over TLS + + The Upgrade response header field advertises possible protocol + upgrades a server MAY accept. In conjunction with the "426 Upgrade + Required" status code, a server can advertise the exact protocol + upgrade(s) that a client MUST accept to complete the request. + +4.1 Optional Advertisement + + As specified in HTTP/1.1 [1], the server MAY include an Upgrade + header in any response other than 101 or 426 to indicate a + willingness to switch to any (combination) of the protocols listed. + +4.2 Mandatory Advertisement + + A server MAY indicate that a client request can not be completed + without TLS using the "426 Upgrade Required" status code, which MUST + include an Upgrade header field specifying the token of the + required TLS version. + + HTTP/1.1 426 Upgrade Required + Upgrade: TLS/1.0, HTTP/1.1 + Connection: Upgrade + + The server SHOULD include a message body in the 426 response which + indicates in human readable form the reason for the error and + describes any alternative courses which may be available to the user. + + Note that even if a client is willing to use TLS, it must use the + operations in Section 3 to proceed; the TLS handshake cannot begin + immediately after the 426 response. + + + +Khare & Lawrence Standards Track [Page 5] + +RFC 2817 HTTP Upgrade to TLS May 2000 + + +5. Upgrade across Proxies + + As a hop-by-hop header, Upgrade is negotiated between each pair of + HTTP counterparties. If a User Agent sends a request with an Upgrade + header to a proxy, it is requesting a change to the protocol between + itself and the proxy, not an end-to-end change. + + Since TLS, in particular, requires end-to-end connectivity to provide + authentication and prevent man-in-the-middle attacks, this memo + specifies the CONNECT method to establish a tunnel across proxies. + + Once a tunnel is established, any of the operations in Section 3 can + be used to establish a TLS connection.
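Whether it is connected directly or through a tunnel, a client applying the operations of Sections 3 and 4 has to tell three outcomes apart from the status line alone: the server is switching (101), the server insists on an upgrade (426), or no protocol switch took place. A rough sketch in C follows; the enum and the function name classify_status_line are hypothetical and are not defined by this memo.

```c
#include <stdio.h>

/* Hypothetical classification of a response to an Upgrade-bearing request. */
enum upgrade_outcome {
    OUTCOME_SWITCHING,        /* 101: TLS handshake follows the blank line    */
    OUTCOME_UPGRADE_REQUIRED, /* 426: retry using the operations in Section 3 */
    OUTCOME_OTHER,            /* any other status: no protocol switch         */
    OUTCOME_MALFORMED         /* not a parsable HTTP status line              */
};

/* Parse the status code out of a line such as
 * "HTTP/1.1 426 Upgrade Required" and classify it. */
enum upgrade_outcome classify_status_line(const char *line)
{
    int major, minor, code;

    if (sscanf(line, "HTTP/%d.%d %d", &major, &minor, &code) != 3)
        return OUTCOME_MALFORMED;
    if (code == 101)
        return OUTCOME_SWITCHING;
    if (code == 426)
        return OUTCOME_UPGRADE_REQUIRED;
    return OUTCOME_OTHER;
}
```

A real client would also inspect the Upgrade header of a 101 or 426 response to learn which protocol stack is meant; this sketch deliberately looks at the status code only.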
+ +5.1 Implications of Hop By Hop Upgrade + + If an origin server receives an Upgrade header from a proxy and + responds with a 101 Switching Protocols response, it is changing the + protocol only on the connection between the proxy and itself. + Similarly, a proxy might return a 101 response to its client to + change the protocol on that connection independently of the protocols + it is using to communicate toward the origin server. + + These scenarios also complicate diagnosis of a 426 response. Since + Upgrade is a hop-by-hop header, a proxy that does not recognize 426 + might remove the accompanying Upgrade header and prevent the client + from determining the required protocol switch. If a client receives + a 426 status without an accompanying Upgrade header, it will need to + request an end to end tunnel connection as described in Section 5.2 + and repeat the request in order to obtain the required upgrade + information. + + This hop-by-hop definition of Upgrade was a deliberate choice. It + allows for incremental deployment on either side of proxies, and for + optimized protocols between cascaded proxies without the knowledge of + the parties that are not a part of the change. + +5.2 Requesting a Tunnel with CONNECT + + A CONNECT method requests that a proxy establish a tunnel connection + on its behalf. The Request-URI portion of the Request-Line is always + an 'authority' as defined by URI Generic Syntax [2], which is to say + the host name and port number destination of the requested connection + separated by a colon: + + CONNECT server.example.com:80 HTTP/1.1 + Host: server.example.com:80 + + + + +Khare & Lawrence Standards Track [Page 6] + +RFC 2817 HTTP Upgrade to TLS May 2000 + + + Other HTTP mechanisms can be used normally with the CONNECT method -- + except end-to-end protocol Upgrade requests, of course, since the + tunnel must be established first. 
+ + For example, proxy authentication might be used to establish the + authority to create a tunnel: + + CONNECT server.example.com:80 HTTP/1.1 + Host: server.example.com:80 + Proxy-Authorization: basic aGVsbG86d29ybGQ= + + Like any other pipelined HTTP/1.1 request, data to be tunneled may be + sent immediately after the blank line. The usual caveats also apply: + data may be discarded if the eventual response is negative, and the + connection may be reset with no response if more than one TCP segment + is outstanding. + +5.3 Establishing a Tunnel with CONNECT + + Any successful (2xx) response to a CONNECT request indicates that the + proxy has established a connection to the requested host and port, + and has switched to tunneling the current connection to that server + connection. + + It may be the case that the proxy itself can only reach the requested + origin server through another proxy. In this case, the first proxy + SHOULD make a CONNECT request of that next proxy, requesting a tunnel + to the authority. A proxy MUST NOT respond with any 2xx status code + unless it has either a direct or tunnel connection established to the + authority. + + An origin server which receives a CONNECT request for itself MAY + respond with a 2xx status code to indicate that a connection is + established. + + If at any point either one of the peers gets disconnected, any + outstanding data that came from that peer will be passed to the other + one, and after that also the other connection will be terminated by + the proxy. If there is outstanding data to that peer undelivered, + that data will be discarded. + +6. Rationale for the use of a 4xx (client error) Status Code + + Reliable, interoperable negotiation of Upgrade features requires an + unambiguous failure signal. The 426 Upgrade Required status code + allows a server to definitively state the precise protocol extensions + a given resource must be served with. 
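A client's reaction to a 426 response, including the case from Section 5.1 where a proxy has stripped the Upgrade header, might be sketched as follows. The function name `next_step_for_status`, the action labels, and the convention that `headers` is a dict keyed by lower-cased header names are all assumptions of this example.

```python
def next_step_for_status(status, headers):
    """Decide what a client should do after receiving a response.

    Hypothetical helper. Returns (action, tokens) where action is one of
    "proceed", "upgrade", or "tunnel-and-retry".
    """
    if status != 426:
        # Anything other than 426 is outside the scope of this sketch.
        return ("proceed", [])
    # Upgrade is a comma-separated list of product tokens.
    raw = headers.get("upgrade", "")
    tokens = [t.strip() for t in raw.split(",") if t.strip()]
    if tokens:
        # The server stated exactly which protocol(s) the client must switch to.
        return ("upgrade", tokens)
    # A 426 without Upgrade suggests a proxy removed the hop-by-hop header:
    # request an end-to-end tunnel (Section 5.2) and repeat the request.
    return ("tunnel-and-retry", [])
```

The three-way result mirrors the reasoning in the text: an explicit token list is an unambiguous signal, while a bare 426 forces the client to bypass the intermediary before it can learn the required upgrade.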
+ + + + +Khare & Lawrence Standards Track [Page 7] + +RFC 2817 HTTP Upgrade to TLS May 2000 + + + It might at first appear that the response should have been some form + of redirection (a 3xx code), by analogy to an old-style redirection + to an https: URI. User agents that do not understand Upgrade: + preclude this. + + Suppose that a 3xx code had been assigned for "Upgrade Required"; a + user agent that did not recognize it would treat it as 300. It would + then properly look for a "Location" header in the response and + attempt to repeat the request at the URL in that header field. Since + it did not know to Upgrade to incorporate the TLS layer, it would at + best fail again at the new URL. + +7. IANA Considerations + + IANA shall create registries for two name spaces, as described in BCP + 26 [10]: + + o HTTP Status Codes + o HTTP Upgrade Tokens + +7.1 HTTP Status Code Registry + + The HTTP Status Code Registry defines the name space for the Status- + Code token in the Status line of an HTTP response. The initial + values for this name space are those specified by: + + 1. Draft Standard for HTTP/1.1 [1] + 2. Web Distributed Authoring and Versioning [4] [defines 420-424] + 3. WebDAV Advanced Collections [5] (Work in Progress) [defines 425] + 4. Section 6 [defines 426] + + Values to be added to this name space SHOULD be subject to review in + the form of a standards track document within the IETF Applications + Area. Any such document SHOULD be traceable through statuses of + either 'Obsoletes' or 'Updates' to the Draft Standard for + HTTP/1.1 [1]. + +7.2 HTTP Upgrade Token Registry + + The HTTP Upgrade Token Registry defines the name space for product + tokens used to identify protocols in the Upgrade HTTP header field. + Each registered token should be associated with one or a set of + specifications, and with contact information. 
+ + The Draft Standard for HTTP/1.1 [1] specifies that these tokens obey + the production for 'product': + + + + + +Khare & Lawrence Standards Track [Page 8] + +RFC 2817 HTTP Upgrade to TLS May 2000 + + + product = token ["/" product-version] + product-version = token + + Registrations should be allowed on a First Come First Served basis as + described in BCP 26 [10]. These specifications need not be IETF + documents or be subject to IESG review, but should obey the following + rules: + + 1. A token, once registered, stays registered forever. + 2. The registration MUST name a responsible party for the + registration. + 3. The registration MUST name a point of contact. + 4. The registration MAY name the documentation required for the + token. + 5. The responsible party MAY change the registration at any time. + The IANA will keep a record of all such changes, and make them + available upon request. + 6. The responsible party for the first registration of a "product" + token MUST approve later registrations of a "version" token + together with that "product" token before they can be registered. + 7. If absolutely required, the IESG MAY reassign the responsibility + for a token. This will normally only be used in the case when a + responsible party cannot be contacted. + + This specification defines the protocol token "TLS/1.0" as the + identifier for the protocol specified by The TLS Protocol [6]. + + It is NOT required that specifications for upgrade tokens be made + publicly available, but the contact information for the registration + SHOULD be. + +8. Security Considerations + + The potential for a man-in-the-middle attack (deleting the Upgrade + header) remains the same as current, mixed http/https practice: + + o Removing the Upgrade header is similar to rewriting web pages to + change https:// links to http:// links. + o The risk is only present if the server is willing to vend such + information over both a secure and an insecure channel in the + first place. 
+ o If the client knows for a fact that a server is TLS-compliant, it + can insist on it by only sending an Upgrade request with a no-op + method like OPTIONS. + o Finally, as the https: specification warns, "users should + carefully examine the certificate presented by the server to + determine if it meets their expectations". + + + + +Khare & Lawrence Standards Track [Page 9] + +RFC 2817 HTTP Upgrade to TLS May 2000 + + + Furthermore, for clients that do not explicitly try to invoke TLS, + servers can use the Upgrade header in any response other than 101 or + 426 to advertise TLS compliance. Since TLS compliance should be + considered a feature of the server and not the resource at hand, it + should be sufficient to send it once, and let clients cache that + fact. + +8.1 Implications for the https: URI Scheme + + While nothing in this memo affects the definition of the 'https' URI + scheme, widespread adoption of this mechanism for HyperText content + could use 'http' to identify both secure and non-secure resources. + + The choice of what security characteristics are required on the + connection is left to the client and server. This allows either + party to use any information available in making this determination. + For example, user agents may rely on user preference settings or + information about the security of the network such as 'TLS required + on all POST operations not on my local net', or servers may apply + resource access rules such as 'the FORM on this page must be served + and submitted using TLS'. + +8.2 Security Considerations for CONNECT + + A generic TCP tunnel is fraught with security risks. First, such + authorization should be limited to a small number of known ports. + The Upgrade: mechanism defined here only requires onward tunneling at + port 80. Second, since tunneled data is opaque to the proxy, there + are additional risks to tunneling to other well-known or reserved + ports. 
A putative HTTP client CONNECTing to port 25 could relay spam + via SMTP, for example. + +References + + [1] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., + Leach, P. and T. Berners-Lee, "Hypertext Transfer Protocol -- + HTTP/1.1", RFC 2616, June 1999. + + [2] Berners-Lee, T., Fielding, R. and L. Masinter, "URI Generic + Syntax", RFC 2396, August 1998. + + [3] Rescorla, E., "HTTP Over TLS", RFC 2818, May 2000. + + [4] Goland, Y., Whitehead, E., Faizi, A., Carter, S. and D. Jensen, + "Web Distributed Authoring and Versioning", RFC 2518, February + 1999. + + + + + +Khare & Lawrence Standards Track [Page 10] + +RFC 2817 HTTP Upgrade to TLS May 2000 + + + [5] Slein, J., Whitehead, E.J., et al., "WebDAV Advanced Collections + Protocol", Work In Progress. + + [6] Dierks, T. and C. Allen, "The TLS Protocol", RFC 2246, January + 1999. + + [7] Herriot, R., Butler, S., Moore, P. and R. Turner, "Internet + Printing Protocol/1.0: Encoding and Transport", RFC 2565, April + 1999. + + [8] Luotonen, A., "Tunneling TCP based protocols through Web proxy + servers", Work In Progress. (Also available in: Luotonen, Ari. + Web Proxy Servers, Prentice-Hall, 1997 ISBN:0136806120.) + + [9] Rose, M., "Writing I-Ds and RFCs using XML", RFC 2629, June + 1999. + + [10] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA + Considerations Section in RFCs", BCP 26, RFC 2434, October 1998. + + [11] Bradner, S., "Key words for use in RFCs to Indicate Requirement + Levels", BCP 14, RFC 2119, March 1997. + +Authors' Addresses + + Rohit Khare + 4K Associates / UC Irvine + 3207 Palo Verde + Irvine, CA 92612 + US + + Phone: +1 626 806 7574 + EMail: rohit@4K-associates.com + URI: http://www.4K-associates.com/ + + + Scott Lawrence + Agranat Systems, Inc. 
+ 5 Clocktower Place + Suite 400 + Maynard, MA 01754 + US + + Phone: +1 978 461 0888 + EMail: lawrence@agranat.com + URI: http://www.agranat.com/ + + + + + +Khare & Lawrence Standards Track [Page 11] + +RFC 2817 HTTP Upgrade to TLS May 2000 + + +Appendix A. Acknowledgments + + The CONNECT method was originally described in a Work in Progress + titled, "Tunneling TCP based protocols through Web proxy servers", + [8] by Ari Luotonen of Netscape Communications Corporation. It was + widely implemented by HTTP proxies, but was never made a part of any + IETF Standards Track document. The method name CONNECT was reserved, + but not defined in [1]. + + The definition provided here is derived directly from that earlier + memo, with some editorial changes and conformance to the stylistic + conventions since established in other HTTP specifications. + + Additional Thanks to: + + o Paul Hoffman for his work on the STARTTLS command extension for + ESMTP. + o Roy Fielding for assistance with the rationale behind Upgrade: + and its interaction with OPTIONS. + o Eric Rescorla for his work on standardizing the existing https: + practice to compare with. + o Marshall Rose, for the xml2rfc document type description and tools + [9]. + o Jim Whitehead, for sorting out the current range of available HTTP + status codes. + o Henrik Frystyk Nielsen, whose work on the Mandatory extension + mechanism pointed out a hop-by-hop Upgrade still requires + tunneling. + o Harald Alvestrand for improvements to the token registration + rules. + + + + + + + + + + + + + + + + + + + + + +Khare & Lawrence Standards Track [Page 12] + +RFC 2817 HTTP Upgrade to TLS May 2000 + + +Full Copyright Statement + + Copyright (C) The Internet Society (2000). All Rights Reserved. 
+ + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Khare & Lawrence Standards Track [Page 13] + diff --git a/doc/rfc/rfc2818.txt b/doc/rfc/rfc2818.txt new file mode 100644 index 0000000000..219a1c427f --- /dev/null +++ b/doc/rfc/rfc2818.txt @@ -0,0 +1,395 @@ + + + + + + +Network Working Group E. Rescorla +Request for Comments: 2818 RTFM, Inc. +Category: Informational May 2000 + + + HTTP Over TLS + +Status of this Memo + + This memo provides information for the Internet community. It does + not specify an Internet standard of any kind. 
Distribution of this + memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2000). All Rights Reserved. + +Abstract + + This memo describes how to use TLS to secure HTTP connections over + the Internet. Current practice is to layer HTTP over SSL (the + predecessor to TLS), distinguishing secured traffic from insecure + traffic by the use of a different server port. This document + documents that practice using TLS. A companion document describes a + method for using HTTP/TLS over the same port as normal HTTP + [RFC2817]. + +Table of Contents + + 1. Introduction . . . . . . . . . . . . . . . . . . . . . . 2 + 1.1. Requirements Terminology . . . . . . . . . . . . . . . 2 + 2. HTTP Over TLS . . . . . . . . . . . . . . . . . . . . . . 2 + 2.1. Connection Initiation . . . . . . . . . . . . . . . . . 2 + 2.2. Connection Closure . . . . . . . . . . . . . . . . . . 2 + 2.2.1. Client Behavior . . . . . . . . . . . . . . . . . . . 3 + 2.2.2. Server Behavior . . . . . . . . . . . . . . . . . . . 3 + 2.3. Port Number . . . . . . . . . . . . . . . . . . . . . . 4 + 2.4. URI Format . . . . . . . . . . . . . . . . . . . . . . 4 + 3. Endpoint Identification . . . . . . . . . . . . . . . . . 4 + 3.1. Server Identity . . . . . . . . . . . . . . . . . . . . 4 + 3.2. Client Identity . . . . . . . . . . . . . . . . . . . . 5 + References . . . . . . . . . . . . . . . . . . . . . . . . . 6 + Security Considerations . . . . . . . . . . . . . . . . . . 6 + Author's Address . . . . . . . . . . . . . . . . . . . . . . 6 + Full Copyright Statement . . . . . . . . . . . . . . . . . . 7 + + + + + + +Rescorla Informational [Page 1] + +RFC 2818 HTTP Over TLS May 2000 + + +1. Introduction + + HTTP [RFC2616] was originally used in the clear on the Internet. + However, increased use of HTTP for sensitive applications has + required security measures. SSL, and its successor TLS [RFC2246] were + designed to provide channel-oriented security. 
This document + describes how to use HTTP over TLS. + +1.1. Requirements Terminology + + Keywords "MUST", "MUST NOT", "REQUIRED", "SHOULD", "SHOULD NOT" and + "MAY" that appear in this document are to be interpreted as described + in [RFC2119]. + +2. HTTP Over TLS + + Conceptually, HTTP/TLS is very simple. Simply use HTTP over TLS + precisely as you would use HTTP over TCP. + +2.1. Connection Initiation + + The agent acting as the HTTP client should also act as the TLS + client. It should initiate a connection to the server on the + appropriate port and then send the TLS ClientHello to begin the TLS + handshake. When the TLS handshake has finished, the client may then + initiate the first HTTP request. All HTTP data MUST be sent as TLS + "application data". Normal HTTP behavior, including retained + connections, should be followed. + +2.2. Connection Closure + + TLS provides a facility for secure connection closure. When a valid + closure alert is received, an implementation can be assured that no + further data will be received on that connection. TLS + implementations MUST initiate an exchange of closure alerts before + closing a connection. A TLS implementation MAY, after sending a + closure alert, close the connection without waiting for the peer to + send its closure alert, generating an "incomplete close". Note that + an implementation which does this MAY choose to reuse the session. + This SHOULD only be done when the application knows (typically + through detecting HTTP message boundaries) that it has received all + the message data that it cares about. + + As specified in [RFC2246], any implementation which receives a + connection close without first receiving a valid closure alert (a + "premature close") MUST NOT reuse that session.
Note that a + premature close does not call into question the security of the data + already received, but simply indicates that subsequent data might + + + +Rescorla Informational [Page 2] + +RFC 2818 HTTP Over TLS May 2000 + + + have been truncated. Because TLS is oblivious to HTTP + request/response boundaries, it is necessary to examine the HTTP data + itself (specifically the Content-Length header) to determine whether + the truncation occurred inside a message or between messages. + +2.2.1. Client Behavior + + Because HTTP uses connection closure to signal end of server data, + client implementations MUST treat any premature closes as errors and + the data received as potentially truncated. While in some cases the + HTTP protocol allows the client to find out whether truncation took + place so that, if it received the complete reply, it may tolerate + such errors following the principle to "[be] strict when sending and + tolerant when receiving" [RFC1958], often truncation does not show in + the HTTP protocol data; two cases in particular deserve special note: + + A HTTP response without a Content-Length header. Since data length + in this situation is signalled by connection close a premature + close generated by the server cannot be distinguished from a + spurious close generated by an attacker. + + A HTTP response with a valid Content-Length header closed before + all data has been read. Because TLS does not provide document + oriented protection, it is impossible to determine whether the + server has miscomputed the Content-Length or an attacker has + truncated the connection. + + There is one exception to the above rule. When encountering a + premature close, a client SHOULD treat as completed all requests for + which it has received as much data as specified in the Content-Length + header. + + A client detecting an incomplete close SHOULD recover gracefully. It + MAY resume a TLS session closed in this fashion. 
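The client-side rules above can be condensed into a small decision function. This is a sketch under stated assumptions: the name `classify_response_end`, the three result labels, and the treatment of a clean close that arrives short of Content-Length as "incomplete" are choices made for this example, and chunked transfer coding is ignored.

```python
def classify_response_end(got_close_alert, content_length, bytes_received):
    """Classify how an HTTP/TLS response body ended.

    Hypothetical helper.
    got_close_alert: True if a valid TLS closure alert preceded the close.
    content_length:  declared Content-Length, or None if the header is absent.
    bytes_received:  number of body bytes actually read.
    """
    if content_length is not None and bytes_received >= content_length:
        # The exception in the text: all declared data arrived, so the
        # request is treated as completed even after a premature close.
        return "complete"
    if got_close_alert:
        if content_length is None:
            # Length is delimited by the close, and the close was authenticated.
            return "complete"
        # Assumption: a clean close short of Content-Length is an HTTP-level
        # error, not evidence of a truncation attack.
        return "incomplete"
    # Premature close with missing or unsatisfied Content-Length: an honest
    # server close cannot be told apart from one forged by an attacker.
    return "possibly-truncated"
```

The ordering of the checks matters: the Content-Length test comes first because it is the only evidence that survives a premature close.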
+ + Clients MUST send a closure alert before closing the connection. + Clients which are unprepared to receive any more data MAY choose not + to wait for the server's closure alert and simply close the + connection, thus generating an incomplete close on the server side. + +2.2.2. Server Behavior + + RFC 2616 permits an HTTP client to close the connection at any time, + and requires servers to recover gracefully. In particular, servers + SHOULD be prepared to receive an incomplete close from the client, + since the client can often determine when the end of server data is. + Servers SHOULD be willing to resume TLS sessions closed in this + fashion. + + + +Rescorla Informational [Page 3] + +RFC 2818 HTTP Over TLS May 2000 + + + Implementation note: In HTTP implementations which do not use + persistent connections, the server ordinarily expects to be able to + signal end of data by closing the connection. When Content-Length is + used, however, the client may have already sent the closure alert and + dropped the connection. + + Servers MUST attempt to initiate an exchange of closure alerts with + the client before closing the connection. Servers MAY close the + connection after sending the closure alert, thus generating an + incomplete close on the client side. + +2.3. Port Number + + The first data that an HTTP server expects to receive from the client + is the Request-Line production. The first data that a TLS server (and + hence an HTTP/TLS server) expects to receive is the ClientHello. + Consequently, common practice has been to run HTTP/TLS over a + separate port in order to distinguish which protocol is being used. + When HTTP/TLS is being run over a TCP/IP connection, the default port + is 443. This does not preclude HTTP/TLS from being run over another + transport. TLS only presumes a reliable connection-oriented data + stream. + +2.4. 
URI Format + + HTTP/TLS is differentiated from HTTP URIs by using the 'https' + protocol identifier in place of the 'http' protocol identifier. An + example URI specifying HTTP/TLS is: + + https://www.example.com/~smith/home.html + +3. Endpoint Identification + +3.1. Server Identity + + In general, HTTP/TLS requests are generated by dereferencing a URI. + As a consequence, the hostname for the server is known to the client. + If the hostname is available, the client MUST check it against the + server's identity as presented in the server's Certificate message, + in order to prevent man-in-the-middle attacks. + + If the client has external information as to the expected identity of + the server, the hostname check MAY be omitted. (For instance, a + client may be connecting to a machine whose address and hostname are + dynamic but the client knows the certificate that the server will + present.) In such cases, it is important to narrow the scope of + acceptable certificates as much as possible in order to prevent man + + + + +Rescorla Informational [Page 4] + +RFC 2818 HTTP Over TLS May 2000 + + + in the middle attacks. In special cases, it may be appropriate for + the client to simply ignore the server's identity, but it must be + understood that this leaves the connection open to active attack. + + If a subjectAltName extension of type dNSName is present, that MUST + be used as the identity. Otherwise, the (most specific) Common Name + field in the Subject field of the certificate MUST be used. Although + the use of the Common Name is existing practice, it is deprecated and + Certification Authorities are encouraged to use the dNSName instead. + + Matching is performed using the matching rules specified by + [RFC2459]. If more than one identity of a given type is present in + the certificate (e.g., more than one dNSName name, a match in any one + of the set is considered acceptable.) 
Names may contain the wildcard + character * which is considered to match any single domain name + component or component fragment. E.g., *.a.com matches foo.a.com but + not bar.foo.a.com. f*.com matches foo.com but not bar.com. + + In some cases, the URI is specified as an IP address rather than a + hostname. In this case, the iPAddress subjectAltName must be present + in the certificate and must exactly match the IP in the URI. + + If the hostname does not match the identity in the certificate, user + oriented clients MUST either notify the user (clients MAY give the + user the opportunity to continue with the connection in any case) or + terminate the connection with a bad certificate error. Automated + clients MUST log the error to an appropriate audit log (if available) + and SHOULD terminate the connection (with a bad certificate error). + Automated clients MAY provide a configuration setting that disables + this check, but MUST provide a setting which enables it. + + Note that in many cases the URI itself comes from an untrusted + source. The above-described check provides no protection against + attacks where this source is compromised. For example, if the URI was + obtained by clicking on an HTML page which was itself obtained + without using HTTP/TLS, a man in the middle could have replaced the + URI. In order to prevent this form of attack, users should carefully + examine the certificate presented by the server to determine if it + meets their expectations. + +3.2. Client Identity + + Typically, the server has no external knowledge of what the client's + identity ought to be and so checks (other than that the client has a + certificate chain rooted in an appropriate CA) are not possible. If a + server has such knowledge (typically from some source external to + HTTP or TLS) it SHOULD check the identity as described above. 
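The server identity check in Section 3.1 (dNSName takes precedence over Common Name, and `*` matches a single domain name component or component fragment) can be sketched as below. The function names are assumptions for this example, the code follows the wording of the memo literally, and a real client would normally rely on its TLS library's own verification instead.

```python
import re


def _label_matches(pattern, label):
    """Match one DNS label against a pattern label that may contain '*'."""
    if not label:
        return False
    # '*' stands for any run of characters inside a single label (never a dot).
    regex = "".join("[^.]*" if ch == "*" else re.escape(ch) for ch in pattern)
    return re.fullmatch(regex, label, re.IGNORECASE) is not None


def hostname_matches(pattern, hostname):
    """Return True if hostname matches a certificate name such as '*.a.com'."""
    p_labels = pattern.rstrip(".").split(".")
    h_labels = hostname.rstrip(".").split(".")
    # A wildcard covers one component only, so the label counts must agree.
    if len(p_labels) != len(h_labels):
        return False
    return all(_label_matches(p, h) for p, h in zip(p_labels, h_labels))


def certificate_matches(hostname, dns_names, common_name=None):
    """Apply the precedence rule: dNSName entries first, Common Name otherwise."""
    if dns_names:
        # A match against any one dNSName entry is acceptable.
        return any(hostname_matches(name, hostname) for name in dns_names)
    return common_name is not None and hostname_matches(common_name, hostname)
```

With these definitions, `hostname_matches("*.a.com", "foo.a.com")` is true while `hostname_matches("*.a.com", "bar.foo.a.com")` is false, matching the examples given in Section 3.1.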
+ + + + +Rescorla Informational [Page 5] + +RFC 2818 HTTP Over TLS May 2000 + + +References + + [RFC2459] Housley, R., Ford, W., Polk, W. and D. Solo, "Internet + Public Key Infrastructure: Part I: X.509 Certificate and + CRL Profile", RFC 2459, January 1999. + + [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, + L., Leach, P. and T. Berners-Lee, "Hypertext Transfer + Protocol, HTTP/1.1", RFC 2616, June 1999. + + [RFC2119] Bradner, S., "Key Words for use in RFCs to indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC2246] Dierks, T. and C. Allen, "The TLS Protocol", RFC 2246, + January 1999. + + [RFC2817] Khare, R. and S. Lawrence, "Upgrading to TLS Within + HTTP/1.1", RFC 2817, May 2000. + +Security Considerations + + This entire document is about security. + +Author's Address + + Eric Rescorla + RTFM, Inc. + 30 Newell Road, #16 + East Palo Alto, CA 94303 + + Phone: (650) 328-8631 + EMail: ekr@rtfm.com + + + + + + + + + + + + + + + + + + + +Rescorla Informational [Page 6] + +RFC 2818 HTTP Over TLS May 2000 + + +Full Copyright Statement + + Copyright (C) The Internet Society (2000). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. 
+ + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Rescorla Informational [Page 7] + diff --git a/doc/rfc/rfc2964.txt b/doc/rfc/rfc2964.txt new file mode 100644 index 0000000000..0fe0008565 --- /dev/null +++ b/doc/rfc/rfc2964.txt @@ -0,0 +1,451 @@ + + + + + + +Network Working Group K. Moore +Request for Comments: 2964 University of Tennessee +BCP: 44 N. Freed +Category: Best Current Practice Innosoft + October 2000 + + + Use of HTTP State Management + +Status of this Memo + + This document specifies an Internet Best Current Practices for the + Internet Community, and requests discussion and suggestions for + improvements. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2000). All Rights Reserved. + +IESG Note + + The IESG notes that this mechanism makes use of the .local top-level + domain (TLD) internally when handling host names that don't contain + any dots, and that this mechanism might not work in the expected way + should an actual .local TLD ever be registered. + +Abstract + + The mechanisms described in "HTTP State Management Mechanism" (RFC- + 2965), and its predecessor (RFC-2109), can be used for many different + purposes. However, some current and potential uses of the protocol + are controversial because they have significant user privacy and + security implications. 
This memo identifies specific uses of + Hypertext Transfer Protocol (HTTP) State Management protocol which + are either (a) not recommended by the IETF, or (b) believed to be + harmful, and discouraged. This memo also details additional privacy + considerations which are not covered by the HTTP State Management + protocol specification. + +1. Introduction + + The HTTP State Management mechanism is both useful and controversial. + It is useful because numerous applications of HTTP benefit from the + ability to save state between HTTP transactions, without encoding + such state in URLs. It is controversial because the mechanism has + been used to accomplish things for which it was not designed and is + not well-suited. Some of these uses have attracted a great deal of + public criticism because they threaten to violate the privacy of web + + + +Moore & Freed Best Current Practice [Page 1] + +RFC 2964 Use of HTTP State Management October 2000 + + + users, specifically by leaking potentially sensitive information to + third parties such as the Web sites a user has visited. There are + also other uses of HTTP State Management which are inappropriate even + though they do not threaten user privacy. + + This memo therefore identifies uses of the HTTP State Management + protocol specified in RFC-2965 which are not recommended by the IETF, + or which are believed to be harmful and are therefore discouraged. + + This document occasionally uses terms that appear in capital letters. + When the terms "MUST", "MUST NOT", "SHOULD", "SHOULD NOT", and "MAY" + appear capitalized, they are being used to indicate particular + requirements of this specification. A discussion of the meanings of + the terms "MUST", "SHOULD", and "MAY" appears in [RFC-1123]; the + terms "MUST NOT" and "SHOULD NOT" are logical extensions of this + usage. + +2. 
Uses of HTTP State Management + + The purpose of HTTP State Management is to allow an HTTP-based + service to create stateful "sessions" which persist across multiple + HTTP transactions. A single session may involve transactions with + multiple server hosts. Multiple client hosts may also be involved in + a single session when the session data for a particular user is + shared between client hosts (e.g., via a networked file system). In + other words, the "session" retains state between a "user" and a + "service", not between particular hosts. + + It's important to realize that similar capabilities may also be + achieved using the "bare" HTTP protocol, and/or dynamically-generated + HTML, without the State Management extensions. For example, state + information can be transmitted from the service to the user by + embedding a session identifier in one or more URLs which appear in + HTTP redirects, or dynamically generated HTML; and the state + information may be returned from the user to the service when such + URLs appear in a GET or POST request. HTML forms can also be used to + pass state information from the service to the user and back, without + the user being aware of this happening. + + However, the HTTP State Management facility does provide an increase + in functionality over ordinary HTTP and HTML. In practice, this + additional functionality includes: + + (1) The ability to exchange URLs between users, of resources + accessed during stateful sessions, without leaking the state + information associated with those sessions. (e.g. "Here's the + URL for the FooCorp web catalog entry for those sandals that + you wanted.") + + + +Moore & Freed Best Current Practice [Page 2] + +RFC 2964 Use of HTTP State Management October 2000 + + + (2) The ability to maintain session state without "cache-busting". + That is, separating the session state from the URL allows a web + cache to maintain only a single copy of the named resource. 
If + the state is maintained in session-specific URLs, the cache + would likely have to maintain several identical copies of the + resource. + + (3) The ability to implement sessions with minimal server + configuration and minimal protocol overhead, as compared to + other techniques of maintaining session state. + + (4) The ability to associate the user with session state whenever a + user accesses the service, regardless of whether the user + enters through a particular "home page" or "portal". + + (5) The ability to save session information in stable storage, so + that a "session" can be maintained across client invocations, + system reboots, and client or system crashes. + +2.1. Recommended Uses + + Use of HTTP State Management is appropriate whenever it is desirable + to maintain state between a user and a service across multiple HTTP + transactions, provided that: + + (1) the user is aware that session state is being maintained and + consents to it, + + (2) the user has the ability to delete the state associated with + such a session at any time, + + (3) the information obtained through the ability to track the + user's usage of the service is not disclosed to other parties + without the user's explicit consent, and + + (4) session information itself cannot contain sensitive information + and cannot be used to obtain sensitive information that is not + otherwise available to an eavesdropper. + + This last point is important because cookies are usually sent in the + clear and hence are readily available to eavesdroppers. 
+ + An example of such a recommended use would be a "shopping cart", + where the existence of the shopping cart is explicitly made known to + the user, the user can explicitly "empty" his or her shopping cart + (either by requesting that it be emptied or by purchasing those + + + + + +Moore & Freed Best Current Practice [Page 3] + +RFC 2964 Use of HTTP State Management October 2000 + + + items) and thus cause the shared state to be discarded, and the + service asserts that it will not disclose the user's shopping or + browsing habits to third parties without the user's consent. + + Note that the HTTP State Management protocol effectively allows a + service provider to refuse to provide a service, or provide a reduced + level of service, if the user or a user's client fails to honor a + request to maintain session state. Absent legal prohibition to the + contrary, the server MAY refuse to provide the service, or provide a + reduced level of service, under these conditions. As a purely + practical consideration, services designed to utilize HTTP State + Management may be unable to function properly if the client does not + provide it. Such servers SHOULD gracefully handle such conditions + and explain to the user why the full level of service is not + available. + +2.2. Problematic Uses + + The following uses of HTTP State Management are deemed inappropriate + and contrary to this specification: + +2.2.1. Leakage of Information to Third Parties + + HTTP State Management MUST NOT be used to leak information about the + user or the user's browsing habits to other parties besides the user + or service, without the user's explicit consent. Such usage is + prohibited even if the user's name or other externally-assigned + identifier are not exposed to other parties, because the state + management mechanism itself provides an identifier which can be used + to compile information about the user. 
+ + Because such practices encourage users to defeat HTTP State + Management mechanisms, they tend to reduce the effectiveness of HTTP + State Management, and are therefore considered detrimental to the + operation of the web. + +2.2.2. Use as an Authentication Mechanism + + It is generally inappropriate to use the HTTP State Management + protocol as an authentication mechanism. HTTP State Management is + not designed with such use in mind, and safeguards for protection of + authentication credentials are lacking in both the protocol + specification and in widely deployed HTTP clients and servers. Most + HTTP sessions are not encrypted and "cookies" may therefore be + exposed to passive eavesdroppers. Furthermore, HTTP clients and + servers typically store "cookies" in cleartext with little or no + protection against exposure. HTTP State Management therefore SHOULD + + + + +Moore & Freed Best Current Practice [Page 4] + +RFC 2964 Use of HTTP State Management October 2000 + + + NOT be used as an authentication mechanism to protect information + from being exposed to unauthorized parties, even if the HTTP sessions + are encrypted. + + The prohibition against using HTTP State Management for + authentication includes both its use to protect information which is + provided by the service, and its use to protect potentially sensitive + information about the user which is entrusted to the service's care. + For example, it would be inappropriate to expose a user's name, + address, telephone number, or billing information to a client that + merely presented a cookie which had been previously associated with + the user. + + Similarly, HTTP State Management SHOULD NOT be used to authenticate + user requests if unauthorized requests might have undesirable side- + effects for the user, unless the user is aware of the potential for + such side-effects and explicitly consents to such use. 
For example, + a service which allowed a user to order merchandise with a single + "click", based entirely on the user's stored "cookies", could + inconvenience the user by requiring her to dispute charges to her + credit card, and/or return the unwanted merchandise, in the event + that the cookies were exposed to third parties. + + Some uses of HTTP State Management to identify users may be + relatively harmless, for example, if the only information which can + be thus exposed belongs to the service, and the service will suffer + little harm from the exposure of such information. + +3. User Interface Considerations for HTTP State Management + + HTTP State Management has been very controversial because of its + potential to expose information about a user's browsing habits to + third parties, without the knowledge or consent of the user. While + such exposure is possible, this is less a flaw in the protocol itself + than a failure of HTTP client implementations (and of some providers + of HTTP-based services) to protect users' interests. + + As implied above, there are other ways to maintain session state than + using HTTP State Management, and therefore other ways in which users' + browsing habits can be tracked. Indeed, it is difficult to imagine + how the HTTP protocol or an HTTP client could actually prevent a + service from disclosing a user's "click trail" to other parties if + the service chose to do so. Protection of such information from + inappropriate exposure must therefore be the responsibility of the + service. HTTP client implementations inherently cannot provide such + protection, though they can implement countermeasures which make it + more difficult for HTTP State Management to be used as the mechanism + by which such information is exposed. 
+ + + +Moore & Freed Best Current Practice [Page 5] + +RFC 2964 Use of HTTP State Management October 2000 + + + It is arguable that HTTP clients should provide more protection in + general against inappropriate exposure of tracking information, + regardless of whether the exposure were facilitated by use of HTTP + State Management or by some other means. However, issues related to + other mechanisms are beyond the scope of this memo. + +3.1. Capabilities Required of an HTTP Client + + A user's willingness to consent to use of HTTP State Management is + likely to vary from one service to another, according to whether the + user trusts the service to use the information appropriately and to + limit its exposure to other parties. The user therefore SHOULD be + able to control whether his client supports a service's request to + use HTTP State Management, on a per-service basis. In particular: + + (1) Clients MUST NOT respond to HTTP State Management requests + unless explicitly enabled by the user. + + (2) Clients SHOULD provide an effective interface which allows + users to review, and approve or refuse, any particular requests + from a server to maintain state information, before the client + provides any state information to the server. + + (3) Clients SHOULD provide an effective interface which allows + users to instruct their clients to ignore all requests from a + particular service to maintain state information, on a per- + service basis, immediately in response to any particular + request from a server, before the client provides any state + information to the server. + + (4) Clients SHOULD provide an effective interface which allows a + user to disable future transmission of any state information to + a service, and/or discard any saved state information for that + service, even though the user has previously approved a + service's request to maintain state information. 
+ + (5) Clients SHOULD provide an effective interface which allows a + user to terminate a previous request not to retain state + management information for a given service. + +3.2. Limitations of the domain-match algorithm + + The domain-match algorithm in RFC-2965 section 2 is intended as a + heuristic to allow a client to "guess" whether or not two domains are + part of the same service. There are few rules about how domain names + can be used, and the structure of domain names and how they are + delegated varies from one top-level domain to another (i.e. the + client cannot tell which part of the domain was assigned to the + + + +Moore & Freed Best Current Practice [Page 6] + +RFC 2964 Use of HTTP State Management October 2000 + + + service). Therefore NO string comparison algorithm (including the + domain-match algorithm) can be relied on to distinguish a domain that + belongs to a particular service, from a domain that belongs to + another party. + + As stated above, each service is ultimately responsible for ensuring + that user information is not inappropriately leaked to third parties. + Leaking information to third parties via State Management by careful + selection of domain names, or by assigning domain names to hosts + maintained by third parties, is at least as inappropriate as leaking + the same information by other means. + +4. Security Considerations + + This entire memo is about security considerations. + +5. Authors' Addresses + + Keith Moore + University of Tennessee Computer Science Department + 1122 Volunteer Blvd, Suite 203 + Knoxville TN, 37996-3450 + + EMail: moore@cs.utk.edu + + + Ned Freed + Innosoft International, Inc. + 1050 Lakes Drive + West Covina, CA 81790 + + EMail: ned.freed@innosoft.com + +6. References + + [RFC 1123] Braden, R., "Requirements for Internet Hosts -- + Application and Support", STD 3, RFC 1123, October 1989. + + [RFC 2965] Kristol, D. and L. Montulli, "HTTP State Management + Mechanism", RFC 2965, October 2000. 
+ + [RFC 2109] Kristol, D. and L. Montulli, "HTTP State Management + Mechanism", RFC 2109, February 1997. + + + + + + + + +Moore & Freed Best Current Practice [Page 7] + +RFC 2964 Use of HTTP State Management October 2000 + + +7. Full Copyright Statement + + Copyright (C) The Internet Society (2000). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. 
+ + + + + + + + + + + + + + + + + + + +Moore & Freed Best Current Practice [Page 8] + diff --git a/doc/rfc/rfc2965.txt b/doc/rfc/rfc2965.txt new file mode 100644 index 0000000000..8a4d02b176 --- /dev/null +++ b/doc/rfc/rfc2965.txt @@ -0,0 +1,1459 @@ + + + + + + +Network Working Group D. Kristol +Request for Comments: 2965 Bell Laboratories, Lucent Technologies +Obsoletes: 2109 L. Montulli +Category: Standards Track Epinions.com, Inc. + October 2000 + + + HTTP State Management Mechanism + +Status of this Memo + + This document specifies an Internet standards track protocol for the + Internet community, and requests discussion and suggestions for + improvements. Please refer to the current edition of the "Internet + Official Protocol Standards" (STD 1) for the standardization state + and status of this protocol. Distribution of this memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2000). All Rights Reserved. + +IESG Note + + The IESG notes that this mechanism makes use of the .local top-level + domain (TLD) internally when handling host names that don't contain + any dots, and that this mechanism might not work in the expected way + should an actual .local TLD ever be registered. + +Abstract + + This document specifies a way to create a stateful session with + Hypertext Transfer Protocol (HTTP) requests and responses. It + describes three new headers, Cookie, Cookie2, and Set-Cookie2, which + carry state information between participating origin servers and user + agents. The method described here differs from Netscape's Cookie + proposal [Netscape], but it can interoperate with HTTP/1.0 user + agents that use Netscape's method. (See the HISTORICAL section.) + + This document reflects implementation experience with RFC 2109 and + obsoletes it. + +1. TERMINOLOGY + + The terms user agent, client, server, proxy, origin server, and + http_URL have the same meaning as in the HTTP/1.1 specification + [RFC2616]. 
The terms abs_path and absoluteURI have the same meaning + as in the URI Syntax specification [RFC2396]. + + + + +Kristol & Montulli Standards Track [Page 1] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + Host name (HN) means either the host domain name (HDN) or the numeric + Internet Protocol (IP) address of a host. The fully qualified domain + name is preferred; use of numeric IP addresses is strongly + discouraged. + + The terms request-host and request-URI refer to the values the client + would send to the server as, respectively, the host (but not port) + and abs_path portions of the absoluteURI (http_URL) of the HTTP + request line. Note that request-host is a HN. + + The term effective host name is related to host name. If a host name + contains no dots, the effective host name is that name with the + string .local appended to it. Otherwise the effective host name is + the same as the host name. Note that all effective host names + contain at least one dot. + + The term request-port refers to the port portion of the absoluteURI + (http_URL) of the HTTP request line. If the absoluteURI has no + explicit port, the request-port is the HTTP default, 80. The + request-port of a cookie is the request-port of the request in which + a Set-Cookie2 response header was returned to the user agent. + + Host names can be specified either as an IP address or a HDN string. + Sometimes we compare one host name with another. (Such comparisons + SHALL be case-insensitive.) Host A's name domain-matches host B's if + + * their host name strings string-compare equal; or + + * A is a HDN string and has the form NB, where N is a non-empty + name string, B has the form .B', and B' is a HDN string. (So, + x.y.com domain-matches .Y.com but not Y.com.) + + Note that domain-match is not a commutative operation: a.b.c.com + domain-matches .c.com, but not the reverse. 
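The effective host name and domain-match definitions above can be sketched in a few lines of Python. This is an illustrative reading of the text, not code from the RFC or from Squid: the function names `is_ip_address`, `effective_host_name`, and `domain_match` are hypothetical, and the sketch assumes that "HDN string" can be approximated as "anything that does not parse as a numeric IP address" (it does not validate that B' is a well-formed host domain name).

```python
import ipaddress


def is_ip_address(host: str) -> bool:
    """Return True if host parses as a numeric IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def effective_host_name(host: str) -> str:
    """A host name with no dots gets the string ".local" appended."""
    return host if "." in host else host + ".local"


def domain_match(a: str, b: str) -> bool:
    """Return True if host name a domain-matches host name b.

    Comparisons are case-insensitive. Either the strings are equal, or
    a is a host domain name of the form N + b, where N is non-empty and
    b starts with a dot.
    """
    a, b = a.lower(), b.lower()
    if a == b:
        return True
    if is_ip_address(a):
        # Assumption: an IP address can only match by exact equality.
        return False
    return b.startswith(".") and len(a) > len(b) and a.endswith(b)
```

With these definitions, `domain_match("x.y.com", ".Y.com")` holds while `domain_match("x.y.com", "Y.com")` does not, and swapping the arguments of `domain_match("a.b.c.com", ".c.com")` changes the result, which reflects the non-commutativity noted above.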
+ + The reach R of a host name H is defined as follows: + + * If + + - H is the host domain name of a host; and, + + - H has the form A.B; and + + - A has no embedded (that is, interior) dots; and + + - B has at least one embedded dot, or B is the string "local". + then the reach of H is .B. + + + + +Kristol & Montulli Standards Track [Page 2] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + * Otherwise, the reach of H is H. + + For two strings that represent paths, P1 and P2, P1 path-matches P2 + if P2 is a prefix of P1 (including the case where P1 and P2 string- + compare equal). Thus, the string /tec/waldo path-matches /tec. + + Because it was used in Netscape's original implementation of state + management, we will use the term cookie to refer to the state + information that passes between an origin server and user agent, and + that gets stored by the user agent. + +1.1 Requirements + + The key words "MAY", "MUST", "MUST NOT", "OPTIONAL", "RECOMMENDED", + "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT" in this + document are to be interpreted as described in RFC 2119 [RFC2119]. + +2. STATE AND SESSIONS + + This document describes a way to create stateful sessions with HTTP + requests and responses. Currently, HTTP servers respond to each + client request without relating that request to previous or + subsequent requests; the state management mechanism allows clients + and servers that wish to exchange state information to place HTTP + requests and responses within a larger context, which we term a + "session". This context might be used to create, for example, a + "shopping cart", in which user selections can be aggregated before + purchase, or a magazine browsing system, in which a user's previous + reading affects which offerings are presented. + + Neither clients nor servers are required to support cookies. A + server MAY refuse to provide content to a client that does not return + the cookies it sends. + +3. 
DESCRIPTION + + We describe here a way for an origin server to send state information + to the user agent, and for the user agent to return the state + information to the origin server. The goal is to have a minimal + impact on HTTP and user agents. + +3.1 Syntax: General + + The two state management headers, Set-Cookie2 and Cookie, have common + syntactic properties involving attribute-value pairs. The following + grammar uses the notation, and tokens DIGIT (decimal digits), token + + + + + +Kristol & Montulli Standards Track [Page 3] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + (informally, a sequence of non-special, non-white space characters), + and http_URL from the HTTP/1.1 specification [RFC2616] to describe + their syntax. + + av-pairs = av-pair *(";" av-pair) + av-pair = attr ["=" value] ; optional value + attr = token + value = token | quoted-string + + Attributes (names) (attr) are case-insensitive. White space is + permitted between tokens. Note that while the above syntax + description shows value as optional, most attrs require them. + + NOTE: The syntax above allows whitespace between the attribute and + the = sign. + +3.2 Origin Server Role + + 3.2.1 General The origin server initiates a session, if it so + desires. To do so, it returns an extra response header to the + client, Set-Cookie2. (The details follow later.) + + A user agent returns a Cookie request header (see below) to the + origin server if it chooses to continue a session. The origin server + MAY ignore it or use it to determine the current state of the + session. It MAY send back to the client a Set-Cookie2 response + header with the same or different information, or it MAY send no + Set-Cookie2 header at all. The origin server effectively ends a + session by sending the client a Set-Cookie2 header with Max-Age=0. + + Servers MAY return Set-Cookie2 response headers with any response. 
+ User agents SHOULD send Cookie request headers, subject to other + rules detailed below, with every request. + + An origin server MAY include multiple Set-Cookie2 headers in a + response. Note that an intervening gateway could fold multiple such + headers into a single header. + + + + + + + + + + + + + + +Kristol & Montulli Standards Track [Page 4] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + 3.2.2 Set-Cookie2 Syntax The syntax for the Set-Cookie2 response + header is + + set-cookie = "Set-Cookie2:" cookies + cookies = 1#cookie + cookie = NAME "=" VALUE *(";" set-cookie-av) + NAME = attr + VALUE = value + set-cookie-av = "Comment" "=" value + | "CommentURL" "=" <"> http_URL <"> + | "Discard" + | "Domain" "=" value + | "Max-Age" "=" value + | "Path" "=" value + | "Port" [ "=" <"> portlist <"> ] + | "Secure" + | "Version" "=" 1*DIGIT + portlist = 1#portnum + portnum = 1*DIGIT + + Informally, the Set-Cookie2 response header comprises the token Set- + Cookie2:, followed by a comma-separated list of one or more cookies. + Each cookie begins with a NAME=VALUE pair, followed by zero or more + semi-colon-separated attribute-value pairs. The syntax for + attribute-value pairs was shown earlier. The specific attributes and + the semantics of their values follows. The NAME=VALUE attribute- + value pair MUST come first in each cookie. The others, if present, + can occur in any order. If an attribute appears more than once in a + cookie, the client SHALL use only the value associated with the first + appearance of the attribute; a client MUST ignore values after the + first. + + The NAME of a cookie MAY be the same as one of the attributes in this + specification. However, because the cookie's NAME must come first in + a Set-Cookie2 response header, the NAME and its VALUE cannot be + confused with an attribute-value pair. + + NAME=VALUE + REQUIRED. The name of the state information ("cookie") is NAME, + and its value is VALUE. 
NAMEs that begin with $ are reserved and + MUST NOT be used by applications. + + The VALUE is opaque to the user agent and may be anything the + origin server chooses to send, possibly in a server-selected + printable ASCII encoding. "Opaque" implies that the content is of + interest and relevance only to the origin server. The content + may, in fact, be readable by anyone that examines the Set-Cookie2 + header. + + + +Kristol & Montulli Standards Track [Page 5] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + Comment=value + OPTIONAL. Because cookies can be used to derive or store private + information about a user, the value of the Comment attribute + allows an origin server to document how it intends to use the + cookie. The user can inspect the information to decide whether to + initiate or continue a session with this cookie. Characters in + value MUST be in UTF-8 encoding. [RFC2279] + + CommentURL="http_URL" + OPTIONAL. Because cookies can be used to derive or store private + information about a user, the CommentURL attribute allows an + origin server to document how it intends to use the cookie. The + user can inspect the information identified by the URL to decide + whether to initiate or continue a session with this cookie. + + Discard + OPTIONAL. The Discard attribute instructs the user agent to + discard the cookie unconditionally when the user agent terminates. + + Domain=value + OPTIONAL. The value of the Domain attribute specifies the domain + for which the cookie is valid. If an explicitly specified value + does not start with a dot, the user agent supplies a leading dot. + + Max-Age=value + OPTIONAL. The value of the Max-Age attribute is delta-seconds, + the lifetime of the cookie in seconds, a decimal non-negative + integer. To handle cached cookies correctly, a client SHOULD + calculate the age of the cookie according to the age calculation + rules in the HTTP/1.1 specification [RFC2616]. 
When the age is + greater than delta-seconds seconds, the client SHOULD discard the + cookie. A value of zero means the cookie SHOULD be discarded + immediately. + + Path=value + OPTIONAL. The value of the Path attribute specifies the subset of + URLs on the origin server to which this cookie applies. + + Port[="portlist"] + OPTIONAL. The Port attribute restricts the port to which a cookie + may be returned in a Cookie request header. Note that the syntax + REQUIREs quotes around the OPTIONAL portlist even if there is only + one portnum in portlist. + + + + + + + + +Kristol & Montulli Standards Track [Page 6] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + Secure + OPTIONAL. The Secure attribute (with no value) directs the user + agent to use only (unspecified) secure means to contact the origin + server whenever it sends back this cookie, to protect the + confidentially and authenticity of the information in the cookie. + + The user agent (possibly with user interaction) MAY determine what + level of security it considers appropriate for "secure" cookies. + The Secure attribute should be considered security advice from the + server to the user agent, indicating that it is in the session's + interest to protect the cookie contents. When it sends a "secure" + cookie back to a server, the user agent SHOULD use no less than + the same level of security as was used when it received the cookie + from the server. + + Version=value + REQUIRED. The value of the Version attribute, a decimal integer, + identifies the version of the state management specification to + which the cookie conforms. For this specification, Version=1 + applies. + + 3.2.3 Controlling Caching An origin server must be cognizant of the + effect of possible caching of both the returned resource and the + Set-Cookie2 header. Caching "public" documents is desirable. 
For + example, if the origin server wants to use a public document such as + a "front door" page as a sentinel to indicate the beginning of a + session for which a Set-Cookie2 response header must be generated, + the page SHOULD be stored in caches "pre-expired" so that the origin + server will see further requests. "Private documents", for example + those that contain information strictly private to a session, SHOULD + NOT be cached in shared caches. + + If the cookie is intended for use by a single user, the Set-Cookie2 + header SHOULD NOT be cached. A Set-Cookie2 header that is intended + to be shared by multiple users MAY be cached. + + The origin server SHOULD send the following additional HTTP/1.1 + response headers, depending on circumstances: + + * To suppress caching of the Set-Cookie2 header: + + Cache-control: no-cache="set-cookie2" + + and one of the following: + + * To suppress caching of a private document in shared caches: + + Cache-control: private + + + +Kristol & Montulli Standards Track [Page 7] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + * To allow caching of a document and require that it be validated + before returning it to the client: + + Cache-Control: must-revalidate, max-age=0 + + * To allow caching of a document, but to require that proxy + caches (not user agent caches) validate it before returning it + to the client: + + Cache-Control: proxy-revalidate, max-age=0 + + * To allow caching of a document and request that it be validated + before returning it to the client (by "pre-expiring" it): + + Cache-control: max-age=0 + + Not all caches will revalidate the document in every case. + + HTTP/1.1 servers MUST send Expires: old-date (where old-date is a + date long in the past) on responses containing Set-Cookie2 response + headers unless they know for certain (by out of band means) that + there are no HTTP/1.0 proxies in the response chain. 
HTTP/1.1 + servers MAY send other Cache-Control directives that permit caching + by HTTP/1.1 proxies in addition to the Expires: old-date directive; + the Cache-Control directive will override the Expires: old-date for + HTTP/1.1 proxies. + +3.3 User Agent Role + + 3.3.1 Interpreting Set-Cookie2 The user agent keeps separate track + of state information that arrives via Set-Cookie2 response headers + from each origin server (as distinguished by name or IP address and + port). The user agent MUST ignore attribute-value pairs whose + attribute it does not recognize. The user agent applies these + defaults for optional attributes that are missing: + + Discard The default behavior is dictated by the presence or absence + of a Max-Age attribute. + + Domain Defaults to the effective request-host. (Note that because + there is no dot at the beginning of effective request-host, + the default Domain can only domain-match itself.) + + Max-Age The default behavior is to discard the cookie when the user + agent exits. + + Path Defaults to the path of the request URL that generated the + Set-Cookie2 response, up to and including the right-most /. + + + +Kristol & Montulli Standards Track [Page 8] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + Port The default behavior is that a cookie MAY be returned to any + request-port. + + Secure If absent, the user agent MAY send the cookie over an + insecure channel. + + 3.3.2 Rejecting Cookies To prevent possible security or privacy + violations, a user agent rejects a cookie according to rules below. + The goal of the rules is to try to limit the set of servers for which + a cookie is valid, based on the values of the Path, Domain, and Port + attributes and the request-URI, request-host and request-port. + + A user agent rejects (SHALL NOT store its information) if the Version + attribute is missing. 
Moreover, a user agent rejects (SHALL NOT + store its information) if any of the following is true of the + attributes explicitly present in the Set-Cookie2 response header: + + * The value for the Path attribute is not a prefix of the + request-URI. + + * The value for the Domain attribute contains no embedded dots, + and the value is not .local. + + * The effective host name that derives from the request-host does + not domain-match the Domain attribute. + + * The request-host is a HDN (not IP address) and has the form HD, + where D is the value of the Domain attribute, and H is a string + that contains one or more dots. + + * The Port attribute has a "port-list", and the request-port was + not in the list. + + Examples: + + * A Set-Cookie2 from request-host y.x.foo.com for Domain=.foo.com + would be rejected, because H is y.x and contains a dot. + + * A Set-Cookie2 from request-host x.foo.com for Domain=.foo.com + would be accepted. + + * A Set-Cookie2 with Domain=.com or Domain=.com., will always be + rejected, because there is no embedded dot. + + * A Set-Cookie2 with Domain=ajax.com will be accepted, and the + value for Domain will be taken to be .ajax.com, because a dot + gets prepended to the value. + + + + +Kristol & Montulli Standards Track [Page 9] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + * A Set-Cookie2 with Port="80,8000" will be accepted if the + request was made to port 80 or 8000 and will be rejected + otherwise. + + * A Set-Cookie2 from request-host example for Domain=.local will + be accepted, because the effective host name for the request- + host is example.local, and example.local domain-matches .local. 
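The rejection rules and examples above could be expressed roughly as follows. This is a minimal sketch under stated assumptions, not the RFC's reference algorithm: the function name `reject_reason` and the `attrs` dictionary (lower-case attribute names, with `"port"` holding a list of integers when a port list was given) are hypothetical, and parsing of the Set-Cookie2 header itself is out of scope. The sketch returns a short reason string when the cookie would be rejected and `None` when it would be accepted.

```python
import ipaddress
from typing import Optional


def _is_ip(host: str) -> bool:
    """Return True if host parses as a numeric IP address."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def _domain_match(a: str, b: str) -> bool:
    """Case-insensitive domain-match of host name a against b."""
    a, b = a.lower(), b.lower()
    if a == b:
        return True
    return (not _is_ip(a)) and b.startswith(".") and len(a) > len(b) and a.endswith(b)


def reject_reason(request_host: str, request_port: int, request_uri: str,
                  attrs: dict) -> Optional[str]:
    """Return why a cookie is rejected, or None if it is accepted."""
    if "version" not in attrs:
        return "Version attribute is missing"

    path = attrs.get("path")
    if path is not None and not request_uri.startswith(path):
        return "Path is not a prefix of the request-URI"

    domain = attrs.get("domain")
    if domain is not None:
        domain = domain.lower()
        if not domain.startswith("."):
            domain = "." + domain  # the user agent supplies a leading dot
        # "Embedded" dots exclude the leading and any trailing dot.
        if "." not in domain.strip(".") and domain != ".local":
            return "Domain has no embedded dots and is not .local"
        host = request_host.lower()
        effective = host if "." in host else host + ".local"
        if not _domain_match(effective, domain):
            return "effective host name does not domain-match Domain"
        if not _is_ip(host) and host.endswith(domain):
            h = host[: -len(domain)]  # the H in the form HD
            if "." in h:
                return "H contains one or more dots"

    ports = attrs.get("port")
    if ports and request_port not in ports:
        return "request-port is not in the Port list"
    return None
```

Checked against the examples in the text: a cookie from `y.x.foo.com` with `Domain=.foo.com` is rejected, one from `x.foo.com` is accepted, `Domain=.com` is rejected, and a cookie from request-host `example` with `Domain=.local` is accepted. The `Domain=ajax.com` example needs a request-host to evaluate; a host such as `www.ajax.com` is an assumption made here for illustration.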
+ + 3.3.3 Cookie Management If a user agent receives a Set-Cookie2 + response header whose NAME is the same as that of a cookie it has + previously stored, the new cookie supersedes the old when: the old + and new Domain attribute values compare equal, using a case- + insensitive string-compare; and, the old and new Path attribute + values string-compare equal (case-sensitive). However, if the Set- + Cookie2 has a value for Max-Age of zero, the (old and new) cookie is + discarded. Otherwise a cookie persists (resources permitting) until + whichever happens first, then gets discarded: its Max-Age lifetime is + exceeded; or, if the Discard attribute is set, the user agent + terminates the session. + + Because user agents have finite space in which to store cookies, they + MAY also discard older cookies to make space for newer ones, using, + for example, a least-recently-used algorithm, along with constraints + on the maximum number of cookies that each origin server may set. + + If a Set-Cookie2 response header includes a Comment attribute, the + user agent SHOULD store that information in a human-readable form + with the cookie and SHOULD display the comment text as part of a + cookie inspection user interface. + + If a Set-Cookie2 response header includes a CommentURL attribute, the + user agent SHOULD store that information in a human-readable form + with the cookie, or, preferably, SHOULD allow the user to follow the + http_URL link as part of a cookie inspection user interface. + + The cookie inspection user interface may include a facility whereby a + user can decide, at the time the user agent receives the Set-Cookie2 + response header, whether or not to accept the cookie. 
A potentially + confusing situation could arise if the following sequence occurs: + + * the user agent receives a cookie that contains a CommentURL + attribute; + + * the user agent's cookie inspection interface is configured so + that it presents a dialog to the user before the user agent + accepts the cookie; + + + + + +Kristol & Montulli Standards Track [Page 10] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + * the dialog allows the user to follow the CommentURL link when + the user agent receives the cookie; and, + + * when the user follows the CommentURL link, the origin server + (or another server, via other links in the returned content) + returns another cookie. + + The user agent SHOULD NOT send any cookies in this context. The user + agent MAY discard any cookie it receives in this context that the + user has not, through some user agent mechanism, deemed acceptable. + + User agents SHOULD allow the user to control cookie destruction, but + they MUST NOT extend the cookie's lifetime beyond that controlled by + the Discard and Max-Age attributes. An infrequently-used cookie may + function as a "preferences file" for network applications, and a user + may wish to keep it even if it is the least-recently-used cookie. One + possible implementation would be an interface that allows the + permanent storage of a cookie through a checkbox (or, conversely, its + immediate destruction). + + Privacy considerations dictate that the user have considerable + control over cookie management. The PRIVACY section contains more + information. + + 3.3.4 Sending Cookies to the Origin Server When it sends a request + to an origin server, the user agent includes a Cookie request header + if it has stored cookies that are applicable to the request, based on + + * the request-host and request-port; + + * the request-URI; + + * the cookie's age. 
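As a rough illustration of the applicability test listed above, the sketch below picks stored cookies by host, port, path, and age, then builds a Cookie header. It follows the Domain, Port, Path, and Max-Age selection rules and the Path ordering rule that this section states further on. The class and function names are hypothetical. The sketch assumes the caller passes the effective host name, ignores IP-address hosts, and emits only $Path (not $Domain or $Port) to stay short.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class StoredCookie:
    name: str
    value: str
    domain: str                 # Domain attribute, or the exact host by default
    path: str
    # None: no Port attribute; (): Port without a value; otherwise a port-list
    port_list: Optional[Tuple[int, ...]] = None
    received_port: int = 80
    expired: bool = False


def domain_match(host, domain):
    """Simplified domain-match for host domain names (no IP handling)."""
    host, domain = host.lower(), domain.lower()
    return host == domain or (
        domain.startswith(".") and len(host) > len(domain)
        and host.endswith(domain))


def applicable(cookie, host, port, path):
    """Apply Max-Age, Domain, Port, and Path selection, in that order."""
    if cookie.expired:
        return False
    if not domain_match(host, cookie.domain):
        return False
    if cookie.port_list is not None:
        # Port without a value restricts the cookie to the port it came from.
        allowed = cookie.port_list or (cookie.received_port,)
        if port not in allowed:
            return False
    return path.startswith(cookie.path)


def cookie_header(cookies, host, port, path, version=1):
    """Return the Cookie header line, or None if no cookie applies."""
    chosen = sorted((c for c in cookies if applicable(c, host, port, path)),
                    key=lambda c: len(c.path), reverse=True)
    if not chosen:
        return None
    parts = ['$Version="%d"' % version]
    for c in chosen:
        parts.append('%s="%s"' % (c.name, c.value))
        parts.append('$Path="%s"' % c.path)
    return "Cookie: " + "; ".join(parts)
```

With two Part_Number cookies stored for /acme and /acme/ammo, a request for /acme/ammo/list yields both cookies with the /acme/ammo one first, and a request for /acme/parts/ yields only the /acme cookie, which is the behavior Example 2 in section 4.2 describes.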
+ + The syntax for the header is: + +cookie = "Cookie:" cookie-version 1*((";" | ",") cookie-value) +cookie-value = NAME "=" VALUE [";" path] [";" domain] [";" port] +cookie-version = "$Version" "=" value +NAME = attr +VALUE = value +path = "$Path" "=" value +domain = "$Domain" "=" value +port = "$Port" [ "=" <"> value <"> ] + + The value of the cookie-version attribute MUST be the value from the + Version attribute of the corresponding Set-Cookie2 response header. + Otherwise the value for cookie-version is 0. The value for the path + + + +Kristol & Montulli Standards Track [Page 11] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + attribute MUST be the value from the Path attribute, if one was + present, of the corresponding Set-Cookie2 response header. Otherwise + the attribute SHOULD be omitted from the Cookie request header. The + value for the domain attribute MUST be the value from the Domain + attribute, if one was present, of the corresponding Set-Cookie2 + response header. Otherwise the attribute SHOULD be omitted from the + Cookie request header. + + The port attribute of the Cookie request header MUST mirror the Port + attribute, if one was present, in the corresponding Set-Cookie2 + response header. That is, the port attribute MUST be present if the + Port attribute was present in the Set-Cookie2 header, and it MUST + have the same value, if any. Otherwise, if the Port attribute was + absent from the Set-Cookie2 header, the attribute likewise MUST be + omitted from the Cookie request header. + + Note that there is neither a Comment nor a CommentURL attribute in + the Cookie request header corresponding to the ones in the Set- + Cookie2 response header. The user agent does not return the comment + information to the origin server. + + The user agent applies the following rules to choose applicable + cookie-values to send in Cookie request headers from among all the + cookies it has received. 
+ + Domain Selection + The origin server's effective host name MUST domain-match the + Domain attribute of the cookie. + + Port Selection + There are three possible behaviors, depending on the Port + attribute in the Set-Cookie2 response header: + + 1. By default (no Port attribute), the cookie MAY be sent to any + port. + + 2. If the attribute is present but has no value (e.g., Port), the + cookie MUST only be sent to the request-port it was received + from. + + 3. If the attribute has a port-list, the cookie MUST only be + returned if the new request-port is one of those listed in + port-list. + + Path Selection + The request-URI MUST path-match the Path attribute of the cookie. + + + + + +Kristol & Montulli Standards Track [Page 12] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + Max-Age Selection + Cookies that have expired should have been discarded and thus are + not forwarded to an origin server. + + If multiple cookies satisfy the criteria above, they are ordered in + the Cookie header such that those with more specific Path attributes + precede those with less specific. Ordering with respect to other + attributes (e.g., Domain) is unspecified. + + Note: For backward compatibility, the separator in the Cookie header + is semi-colon (;) everywhere. A server SHOULD also accept comma (,) + as the separator between cookie-values for future compatibility. + + 3.3.5 Identifying What Version is Understood: Cookie2 The Cookie2 + request header facilitates interoperation between clients and servers + that understand different versions of the cookie specification. 
When + the client sends one or more cookies to an origin server, if at least + one of those cookies contains a $Version attribute whose value is + different from the version that the client understands, then the + client MUST also send a Cookie2 request header, the syntax for which + is + + cookie2 = "Cookie2:" cookie-version + + Here the value for cookie-version is the highest version of cookie + specification (currently 1) that the client understands. The client + needs to send at most one such request header per request. + + 3.3.6 Sending Cookies in Unverifiable Transactions Users MUST have + control over sessions in order to ensure privacy. (See PRIVACY + section below.) To simplify implementation and to prevent an + additional layer of complexity where adequate safeguards exist, + however, this document distinguishes between transactions that are + verifiable and those that are unverifiable. A transaction is + verifiable if the user, or a user-designated agent, has the option to + review the request-URI prior to its use in the transaction. A + transaction is unverifiable if the user does not have that option. + Unverifiable transactions typically arise when a user agent + automatically requests inlined or embedded entities or when it + resolves redirection (3xx) responses from an origin server. + Typically the origin transaction, the transaction that the user + initiates, is verifiable, and that transaction may directly or + indirectly induce the user agent to make unverifiable transactions. + + An unverifiable transaction is to a third-party host if its request- + host U does not domain-match the reach R of the request-host O in the + origin transaction. 
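The third-party test in the preceding paragraph can be sketched as follows. This is an illustration only: the function names are hypothetical, and the definitions of "reach" and "domain-match" are taken from the terminology section of RFC 2965, which is outside this excerpt. Host names are assumed to be passed in as effective host names.

```python
import ipaddress


def is_ip(host):
    """True if host is an IP address literal rather than a host domain name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def domain_match(host, domain):
    """Equal strings match; otherwise a host domain name matches a domain
    that starts with a dot and forms a proper suffix of the host."""
    host, domain = host.lower(), domain.lower()
    if host == domain:
        return True
    if is_ip(host):
        return False
    return (domain.startswith(".") and len(host) > len(domain)
            and host.endswith(domain))


def reach(host):
    """Reach of H: if H has the form A.B, A has no dots, and B has an
    embedded dot or is "local", the reach is .B; otherwise it is H."""
    h = host.lower()
    if is_ip(h):
        return h
    first, dot, rest = h.partition(".")
    if dot and first and ("." in rest.strip(".") or rest == "local"):
        return "." + rest
    return h


def is_third_party(request_host, origin_host):
    """U is third-party if it does not domain-match the reach of O."""
    return not domain_match(request_host, reach(origin_host))
```

Under these definitions, an inlined image fetched from images.foo.com while the user visits www.foo.com is not a third-party transaction, whereas a fetch from a host in an unrelated domain is.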
+ + + + +Kristol & Montulli Standards Track [Page 13] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + When it makes an unverifiable transaction, a user agent MUST disable + all cookie processing (i.e., MUST NOT send cookies, and MUST NOT + accept any received cookies) if the transaction is to a third-party + host. + + This restriction prevents a malicious service author from using + unverifiable transactions to induce a user agent to start or continue + a session with a server in a different domain. The starting or + continuation of such sessions could be contrary to the privacy + expectations of the user, and could also be a security problem. + + User agents MAY offer configurable options that allow the user agent, + or any autonomous programs that the user agent executes, to ignore + the above rule, so long as these override options default to "off". + + (N.B. Mechanisms may be proposed that will automate overriding the + third-party restrictions under controlled conditions.) + + Many current user agents already provide a review option that would + render many links verifiable. For instance, some user agents display + the URL that would be referenced for a particular link when the mouse + pointer is placed over that link. The user can therefore determine + whether to visit that site before causing the browser to do so. + (Though not implemented on current user agents, a similar technique + could be used for a button used to submit a form -- the user agent + could display the action to be taken if the user were to select that + button.) However, even this would not make all links verifiable; for + example, links to automatically loaded images would not normally be + subject to "mouse pointer" verification. + + Many user agents also provide the option for a user to view the HTML + source of a document, or to save the source to an external file where + it can be viewed by another application. 
While such an option does + provide a crude review mechanism, some users might not consider it + acceptable for this purpose. + +3.4 How an Origin Server Interprets the Cookie Header + + A user agent returns much of the information in the Set-Cookie2 + header to the origin server when the request-URI path-matches the + Path attribute of the cookie. When it receives a Cookie header, the + origin server SHOULD treat cookies with NAMEs whose prefix is $ + specially, as an attribute for the cookie. + + + + + + + + +Kristol & Montulli Standards Track [Page 14] + +RFC 2965 HTTP State Management Mechanism October 2000 + + +3.5 Caching Proxy Role + + One reason for separating state information from both a URL and + document content is to facilitate the scaling that caching permits. + To support cookies, a caching proxy MUST obey these rules already in + the HTTP specification: + + * Honor requests from the cache, if possible, based on cache + validity rules. + + * Pass along a Cookie request header in any request that the + proxy must make of another server. + + * Return the response to the client. Include any Set-Cookie2 + response header. + + * Cache the received response subject to the control of the usual + headers, such as Expires, + + Cache-control: no-cache + + and + + Cache-control: private + + * Cache the Set-Cookie2 subject to the control of the usual + header, + + Cache-control: no-cache="set-cookie2" + + (The Set-Cookie2 header should usually not be cached.) + + Proxies MUST NOT introduce Set-Cookie2 (Cookie) headers of their own + in proxy responses (requests). + +4. EXAMPLES + +4.1 Example 1 + + Most detail of request and response headers has been omitted. Assume + the user agent has no stored cookies. + + 1. User Agent -> Server + + POST /acme/login HTTP/1.1 + [form data] + + User identifies self via a form. + + + +Kristol & Montulli Standards Track [Page 15] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + 2. 
Server -> User Agent + + HTTP/1.1 200 OK + Set-Cookie2: Customer="WILE_E_COYOTE"; Version="1"; Path="/acme" + + Cookie reflects user's identity. + + 3. User Agent -> Server + + POST /acme/pickitem HTTP/1.1 + Cookie: $Version="1"; Customer="WILE_E_COYOTE"; $Path="/acme" + [form data] + + User selects an item for "shopping basket". + + 4. Server -> User Agent + + HTTP/1.1 200 OK + Set-Cookie2: Part_Number="Rocket_Launcher_0001"; Version="1"; + Path="/acme" + + Shopping basket contains an item. + + 5. User Agent -> Server + + POST /acme/shipping HTTP/1.1 + Cookie: $Version="1"; + Customer="WILE_E_COYOTE"; $Path="/acme"; + Part_Number="Rocket_Launcher_0001"; $Path="/acme" + [form data] + + User selects shipping method from form. + + 6. Server -> User Agent + + HTTP/1.1 200 OK + Set-Cookie2: Shipping="FedEx"; Version="1"; Path="/acme" + + New cookie reflects shipping method. + + 7. User Agent -> Server + + POST /acme/process HTTP/1.1 + Cookie: $Version="1"; + Customer="WILE_E_COYOTE"; $Path="/acme"; + Part_Number="Rocket_Launcher_0001"; $Path="/acme"; + Shipping="FedEx"; $Path="/acme" + [form data] + + + +Kristol & Montulli Standards Track [Page 16] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + User chooses to process order. + + 8. Server -> User Agent + + HTTP/1.1 200 OK + + Transaction is complete. + + The user agent makes a series of requests on the origin server, after + each of which it receives a new cookie. All the cookies have the + same Path attribute and (default) domain. Because the request-URIs + all path-match /acme, the Path attribute of each cookie, each request + contains all the cookies received so far. + +4.2 Example 2 + + This example illustrates the effect of the Path attribute. All + detail of request and response headers has been omitted. Assume the + user agent has no stored cookies. 
+ + Imagine the user agent has received, in response to earlier requests, + the response headers + + Set-Cookie2: Part_Number="Rocket_Launcher_0001"; Version="1"; + Path="/acme" + + and + + Set-Cookie2: Part_Number="Riding_Rocket_0023"; Version="1"; + Path="/acme/ammo" + + A subsequent request by the user agent to the (same) server for URLs + of the form /acme/ammo/... would include the following request + header: + + Cookie: $Version="1"; + Part_Number="Riding_Rocket_0023"; $Path="/acme/ammo"; + Part_Number="Rocket_Launcher_0001"; $Path="/acme" + + Note that the NAME=VALUE pair for the cookie with the more specific + Path attribute, /acme/ammo, comes before the one with the less + specific Path attribute, /acme. Further note that the same cookie + name appears more than once. + + A subsequent request by the user agent to the (same) server for a URL + of the form /acme/parts/ would include the following request header: + + + + + +Kristol & Montulli Standards Track [Page 17] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + Cookie: $Version="1"; Part_Number="Rocket_Launcher_0001"; + $Path="/acme" + + Here, the second cookie's Path attribute /acme/ammo is not a prefix + of the request URL, /acme/parts/, so the cookie does not get + forwarded to the server. + +5. IMPLEMENTATION CONSIDERATIONS + + Here we provide guidance on likely or desirable details for an origin + server that implements state management. + +5.1 Set-Cookie2 Content + + An origin server's content should probably be divided into disjoint + application areas, some of which require the use of state + information. The application areas can be distinguished by their + request URLs. The Set-Cookie2 header can incorporate information + about the application areas by setting the Path attribute for each + one. + + The session information can obviously be clear or encoded text that + describes state. However, if it grows too large, it can become + unwieldy. 
Therefore, an implementor might choose for the session + information to be a key to a server-side resource. Of course, using + a database creates some problems that this state management + specification was meant to avoid, namely: + + 1. keeping real state on the server side; + + 2. how and when to garbage-collect the database entry, in case the + user agent terminates the session by, for example, exiting. + +5.2 Stateless Pages + + Caching benefits the scalability of WWW. Therefore it is important + to reduce the number of documents that have state embedded in them + inherently. For example, if a shopping-basket-style application + always displays a user's current basket contents on each page, those + pages cannot be cached, because each user's basket's contents would + be different. On the other hand, if each page contains just a link + that allows the user to "Look at My Shopping Basket", the page can be + cached. + + + + + + + + +Kristol & Montulli Standards Track [Page 18] + +RFC 2965 HTTP State Management Mechanism October 2000 + + +5.3 Implementation Limits + + Practical user agent implementations have limits on the number and + size of cookies that they can store. In general, user agents' cookie + support should have no fixed limits. They should strive to store as + many frequently-used cookies as possible. Furthermore, general-use + user agents SHOULD provide each of the following minimum capabilities + individually, although not necessarily simultaneously: + + * at least 300 cookies + + * at least 4096 bytes per cookie (as measured by the characters + that comprise the cookie non-terminal in the syntax description + of the Set-Cookie2 header, and as received in the Set-Cookie2 + header) + + * at least 20 cookies per unique host or domain name + + User agents created for specific purposes or for limited-capacity + devices SHOULD provide at least 20 cookies of 4096 bytes, to ensure + that the user can interact with a session-based origin server. 
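One way a user agent might combine these limits with the least-recently-used policy mentioned in the Cookie Management section is sketched below. The class name and its interface are hypothetical. Using the stated minima (300 cookies, 20 per host, 4096 bytes) as caps is an assumption made for illustration; the text only requires that at least this much be supported.

```python
from collections import OrderedDict


class BoundedCookieJar:
    """Cookie store with size caps and least-recently-used eviction."""

    def __init__(self, max_total=300, max_per_host=20, max_bytes=4096):
        self.max_total = max_total
        self.max_per_host = max_per_host
        self.max_bytes = max_bytes
        # (host, name, path) -> raw cookie text, oldest use first
        self._store = OrderedDict()

    def put(self, host, name, path, raw):
        """Store a cookie; return False if it is too large to keep whole."""
        if len(raw.encode("utf-8")) > self.max_bytes:
            return False  # drop the cookie entirely, never truncate it
        key = (host.lower(), name, path)
        self._store.pop(key, None)   # a new cookie supersedes the old one
        self._store[key] = raw
        # Evict this host's least recently used cookies beyond the per-host cap.
        host_keys = [k for k in self._store if k[0] == key[0]]
        for k in host_keys[:max(0, len(host_keys) - self.max_per_host)]:
            del self._store[k]
        # Then enforce the overall cap, again oldest first.
        while len(self._store) > self.max_total:
            self._store.popitem(last=False)
        return True

    def get(self, host, name, path):
        """Return the raw cookie text and mark it as recently used."""
        key = (host.lower(), name, path)
        if key not in self._store:
            return None
        self._store.move_to_end(key)
        return self._store[key]
```

A limited-capacity device could construct the jar with max_total=20 to match the reduced minimum described above.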
+ + The information in a Set-Cookie2 response header MUST be retained in + its entirety. If for some reason there is inadequate space to store + the cookie, it MUST be discarded, not truncated. + + Applications should use as few and as small cookies as possible, and + they should cope gracefully with the loss of a cookie. + + 5.3.1 Denial of Service Attacks User agents MAY choose to set an + upper bound on the number of cookies to be stored from a given host + or domain name or on the size of the cookie information. Otherwise a + malicious server could attempt to flood a user agent with many + cookies, or large cookies, on successive responses, which would force + out cookies the user agent had received from other servers. However, + the minima specified above SHOULD still be supported. + +6. PRIVACY + + Informed consent should guide the design of systems that use cookies. + A user should be able to find out how a web site plans to use + information in a cookie and should be able to choose whether or not + those policies are acceptable. Both the user agent and the origin + server must assist informed consent. + + + + + + + +Kristol & Montulli Standards Track [Page 19] + +RFC 2965 HTTP State Management Mechanism October 2000 + + +6.1 User Agent Control + + An origin server could create a Set-Cookie2 header to track the path + of a user through the server. Users may object to this behavior as + an intrusive accumulation of information, even if their identity is + not evident. (Identity might become evident, for example, if a user + subsequently fills out a form that contains identifying information.) + This state management specification therefore requires that a user + agent give the user control over such a possible intrusion, although + the interface through which the user is given this control is left + unspecified. However, the control mechanisms provided SHALL at least + allow the user + + * to completely disable the sending and saving of cookies. 
+ + * to determine whether a stateful session is in progress. + + * to control the saving of a cookie on the basis of the cookie's + Domain attribute. + + Such control could be provided, for example, by mechanisms + + * to notify the user when the user agent is about to send a + cookie to the origin server, to offer the option not to begin a + session. + + * to display a visual indication that a stateful session is in + progress. + + * to let the user decide which cookies, if any, should be saved + when the user concludes a window or user agent session. + + * to let the user examine and delete the contents of a cookie at + any time. + + A user agent usually begins execution with no remembered state + information. It SHOULD be possible to configure a user agent never + to send Cookie headers, in which case it can never sustain state with + an origin server. (The user agent would then behave like one that is + unaware of how to handle Set-Cookie2 response headers.) + + When the user agent terminates execution, it SHOULD let the user + discard all state information. Alternatively, the user agent MAY ask + the user whether state information should be retained; the default + should be "no". If the user chooses to retain state information, it + would be restored the next time the user agent runs. + + + + + +Kristol & Montulli Standards Track [Page 20] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + NOTE: User agents should probably be cautious about using files to + store cookies long-term. If a user runs more than one instance of + the user agent, the cookies could be commingled or otherwise + corrupted. + +6.2 Origin Server Role + + An origin server SHOULD promote informed consent by adding CommentURL + or Comment information to the cookies it sends. CommentURL is + preferred because of the opportunity to provide richer information in + a multiplicity of languages. 
+ +6.3 Clear Text + + The information in the Set-Cookie2 and Cookie headers is unprotected. + As a consequence: + + 1. Any sensitive information that is conveyed in them is exposed + to intruders. + + 2. A malicious intermediary could alter the headers as they travel + in either direction, with unpredictable results. + + These facts imply that information of a personal and/or financial + nature should only be sent over a secure channel. For less sensitive + information, or when the content of the header is a database key, an + origin server should be vigilant to prevent a bad Cookie value from + causing failures. + + A user agent in a shared user environment poses a further risk. + Using a cookie inspection interface, User B could examine the + contents of cookies that were saved when User A used the machine. + +7. SECURITY CONSIDERATIONS + +7.1 Protocol Design + + The restrictions on the value of the Domain attribute, and the rules + concerning unverifiable transactions, are meant to reduce the ways + that cookies can "leak" to the "wrong" site. The intent is to + restrict cookies to one host, or a closely related set of hosts. + Therefore a request-host is limited as to what values it can set for + Domain. We consider it acceptable for hosts host1.foo.com and + host2.foo.com to share cookies, but not a.com and b.com. + + Similarly, a server can set a Path only for cookies that are related + to the request-URI. + + + + +Kristol & Montulli Standards Track [Page 21] + +RFC 2965 HTTP State Management Mechanism October 2000 + + +7.2 Cookie Spoofing + + Proper application design can avoid spoofing attacks from related + domains. Consider: + + 1. User agent makes request to victim.cracker.edu, gets back + cookie session_id="1234" and sets the default domain + victim.cracker.edu. + + 2. User agent makes request to spoof.cracker.edu, gets back cookie + session-id="1111", with Domain=".cracker.edu". + + 3. 
User agent makes request to victim.cracker.edu again, and + passes + + Cookie: $Version="1"; session_id="1234", + $Version="1"; session_id="1111"; $Domain=".cracker.edu" + + The server at victim.cracker.edu should detect that the second + cookie was not one it originated by noticing that the Domain + attribute is not for itself and ignore it. + +7.3 Unexpected Cookie Sharing + + A user agent SHOULD make every attempt to prevent the sharing of + session information between hosts that are in different domains. + Embedded or inlined objects may cause particularly severe privacy + problems if they can be used to share cookies between disparate + hosts. For example, a malicious server could embed cookie + information for host a.com in a URI for a CGI on host b.com. User + agent implementors are strongly encouraged to prevent this sort of + exchange whenever possible. + +7.4 Cookies For Account Information + + While it is common practice to use them this way, cookies are not + designed or intended to be used to hold authentication information, + such as account names and passwords. Unless such cookies are + exchanged over an encrypted path, the account information they + contain is highly vulnerable to perusal and theft. + +8. OTHER, SIMILAR, PROPOSALS + + Apart from RFC 2109, three other proposals have been made to + accomplish similar goals. This specification began as an amalgam of + Kristol's State-Info proposal [DMK95] and Netscape's Cookie proposal + [Netscape]. + + + + +Kristol & Montulli Standards Track [Page 22] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + Brian Behlendorf proposed a Session-ID header that would be user- + agent-initiated and could be used by an origin server to track + "clicktrails". It would not carry any origin-server-defined state, + however. Phillip Hallam-Baker has proposed another client-defined + session ID mechanism for similar purposes. 
+ + While both session IDs and cookies can provide a way to sustain + stateful sessions, their intended purpose is different, and, + consequently, the privacy requirements for them are different. A + user initiates session IDs to allow servers to track progress through + them, or to distinguish multiple users on a shared machine. Cookies + are server-initiated, so the cookie mechanism described here gives + users control over something that would otherwise take place without + the users' awareness. Furthermore, cookies convey rich, server- + selected information, whereas session IDs comprise user-selected, + simple information. + +9. HISTORICAL + +9.1 Compatibility with Existing Implementations + + Existing cookie implementations, based on the Netscape specification, + use the Set-Cookie (not Set-Cookie2) header. User agents that + receive in the same response both a Set-Cookie and Set-Cookie2 + response header for the same cookie MUST discard the Set-Cookie + information and use only the Set-Cookie2 information. Furthermore, a + user agent MUST assume, if it received a Set-Cookie2 response header, + that the sending server complies with this document and will + understand Cookie request headers that also follow this + specification. + + New cookies MUST replace both equivalent old- and new-style cookies. + That is, if a user agent that follows both this specification and + Netscape's original specification receives a Set-Cookie2 response + header, and the NAME and the Domain and Path attributes match (per + the Cookie Management section) a Netscape-style cookie, the + Netscape-style cookie MUST be discarded, and the user agent MUST + retain only the cookie adhering to this specification. + + Older user agents that do not understand this specification, but that + do understand Netscape's original specification, will not recognize + the Set-Cookie2 response header and will receive and send cookies + according to the older specification. 
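The replacement rule for mixed old-style and new-style cookies can be sketched as below. The function names and the dictionary layout are hypothetical. The key compares Domain case-insensitively and Path case-sensitively, as the Cookie Management section specifies; comparing NAME exactly is an assumption of this sketch.

```python
def cookie_key(name, domain, path):
    """Identity of a cookie: NAME, Domain (case-insensitive), Path."""
    return (name, domain.lower(), path)


def store_response_cookies(jar, old_style, new_style):
    """Record the cookies from one response in jar (a dict).

    old_style and new_style are lists of (name, domain, path, value)
    tuples parsed from Set-Cookie and Set-Cookie2 headers respectively.
    """
    new_keys = {cookie_key(n, d, p) for (n, d, p, _value) in new_style}
    for n, d, p, value in old_style:
        key = cookie_key(n, d, p)
        # Set-Cookie information is discarded when the same response
        # carries a Set-Cookie2 header for the same cookie.
        if key not in new_keys:
            jar[key] = ("netscape", value)
    for n, d, p, value in new_style:
        # A new-style cookie replaces any equivalent stored cookie,
        # whether that one was Netscape-style or new-style.
        jar[cookie_key(n, d, p)] = ("rfc2965", value)
    return jar
```

The sketch leaves out Max-Age handling and attribute storage; it only shows which of two equivalent cookies survives.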
+ + + + + + + + +Kristol & Montulli Standards Track [Page 23] + +RFC 2965 HTTP State Management Mechanism October 2000 + + + A user agent that supports both this specification and Netscape-style + cookies SHOULD send a Cookie request header that follows the older + Netscape specification if it received the cookie in a Set-Cookie + response header and not in a Set-Cookie2 response header. However, + it SHOULD send the following request header as well: + + Cookie2: $Version="1" + + The Cookie2 header advises the server that the user agent understands + new-style cookies. If the server understands new-style cookies, as + well, it SHOULD continue the stateful session by sending a Set- + Cookie2 response header, rather than Set-Cookie. A server that does + not understand new-style cookies will simply ignore the Cookie2 + request header. + +9.2 Caching and HTTP/1.0 + + Some caches, such as those conforming to HTTP/1.0, will inevitably + cache the Set-Cookie2 and Set-Cookie headers, because there was no + mechanism to suppress caching of headers prior to HTTP/1.1. This + caching can lead to security problems. Documents transmitted by an + origin server along with Set-Cookie2 and Set-Cookie headers usually + either will be uncachable, or will be "pre-expired". As long as + caches obey instructions not to cache documents (following Expires: + <a particular time in the past> or Pragma: no-cache (HTTP/1.0), or Cache- + control: no-cache (HTTP/1.1)) uncachable documents present no + problem. However, pre-expired documents may be stored in caches. + They require validation (a conditional GET) on each new request, but + some cache operators loosen the rules for their caches, and sometimes + serve expired documents without first validating them. This + combination of factors can lead to cookies meant for one user later + being sent to another user.
The Set-Cookie2 and Set-Cookie headers + are stored in the cache, and, although the document is stale + (expired), the cache returns the document in response to later + requests, including cached headers. + +10. ACKNOWLEDGEMENTS + + This document really represents the collective efforts of the HTTP + Working Group of the IETF and, particularly, the following people, in + addition to the authors: Roy Fielding, Yaron Goland, Marc Hedlund, + Ted Hardie, Koen Holtman, Shel Kaphan, Rohit Khare, Foteos Macrides, + David W. Morris. + + + + + + + + +Kristol & Montulli Standards Track [Page 24] + +RFC 2965 HTTP State Management Mechanism October 2000 + + +11. AUTHORS' ADDRESSES + + David M. Kristol + Bell Laboratories, Lucent Technologies + 600 Mountain Ave. Room 2A-333 + Murray Hill, NJ 07974 + + Phone: (908) 582-2250 + Fax: (908) 582-1239 + EMail: dmk@bell-labs.com + + + Lou Montulli + Epinions.com, Inc. + 2037 Landings Dr. + Mountain View, CA 94301 + + EMail: lou@montulli.org + +12. REFERENCES + + [DMK95] Kristol, D.M., "Proposed HTTP State-Info Mechanism", + available at , September, 1995. + + [Netscape] "Persistent Client State -- HTTP Cookies", available at + , + undated. + + [RFC2109] Kristol, D. and L. Montulli, "HTTP State Management + Mechanism", RFC 2109, February 1997. + + [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate + Requirement Levels", BCP 14, RFC 2119, March 1997. + + [RFC2279] Yergeau, F., "UTF-8, a transformation format of Unicode + and ISO-10646", RFC 2279, January 1998. + + [RFC2396] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform + Resource Identifiers (URI): Generic Syntax", RFC 2396, + August 1998. + + [RFC2616] Fielding, R., Gettys, J., Mogul, J., Frystyk, H. and T. + Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", + RFC 2616, June 1999. + + + + + + +Kristol & Montulli Standards Track [Page 25] + +RFC 2965 HTTP State Management Mechanism October 2000 + + +13. 
Full Copyright Statement + + Copyright (C) The Internet Society (2000). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. + +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Kristol & Montulli Standards Track [Page 26] + diff --git a/doc/rfc/rfc3310.txt b/doc/rfc/rfc3310.txt new file mode 100644 index 0000000000..edd2affbd0 --- /dev/null +++ b/doc/rfc/rfc3310.txt @@ -0,0 +1,1011 @@ + + + + + + +Network Working Group A. Niemi +Request for Comments: 3310 Nokia +Category: Informational J. Arkko + V. 
Torvinen + Ericsson + September 2002 + + + Hypertext Transfer Protocol (HTTP) Digest Authentication + Using Authentication and Key Agreement (AKA) + +Status of this Memo + + This memo provides information for the Internet community. It does + not specify an Internet standard of any kind. Distribution of this + memo is unlimited. + +Copyright Notice + + Copyright (C) The Internet Society (2002). All Rights Reserved. + +Abstract + + This memo specifies an Authentication and Key Agreement (AKA) based + one-time password generation mechanism for Hypertext Transfer + Protocol (HTTP) Digest access authentication. The HTTP + Authentication Framework includes two authentication schemes: Basic + and Digest. Both schemes employ a shared secret based mechanism for + access authentication. The AKA mechanism performs user + authentication and session key distribution in Universal Mobile + Telecommunications System (UMTS) networks. AKA is a challenge- + response based mechanism that uses symmetric cryptography. + + + + + + + + + + + + + + + + + + + +Niemi, et. al. Informational [Page 1] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + +Table of Contents + + 1. Introduction and Motivation . . . . . . . . . . . . . . . . . 2 + 1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 + 1.2 Conventions . . . . . . . . . . . . . . . . . . . . . . . . . 4 + 2. AKA Mechanism Overview . . . . . . . . . . . . . . . . . . . . 4 + 3. Specification of Digest AKA . . . . . . . . . . . . . . . . . 5 + 3.1 Algorithm Directive . . . . . . . . . . . . . . . . . . . . . 5 + 3.2 Creating a Challenge . . . . . . . . . . . . . . . . . . . . . 6 + 3.3 Client Authentication . . . . . . . . . . . . . . . . . . . . 7 + 3.4 Synchronization Failure . . . . . . . . . . . . . . . . . . . 7 + 3.5 Server Authentication . . . . . . . . . . . . . . . . . . . . 8 + 4. Example Digest AKA Operation . . . . . . . . . . . . . . . . . 8 + 5. Security Considerations . . . . . . . . . . . 
. . . . . . . . 12 + 5.1 Authentication of Clients using Digest AKA . . . . . . . . . . 13 + 5.2 Limited Use of Nonce Values . . . . . . . . . . . . . . . . . 13 + 5.3 Multiple Authentication Schemes and Algorithms . . . . . . . . 14 + 5.4 Online Dictionary Attacks . . . . . . . . . . . . . . . . . . 14 + 5.5 Session Protection . . . . . . . . . . . . . . . . . . . . . . 14 + 5.6 Replay Protection . . . . . . . . . . . . . . . . . . . . . . 15 + 5.7 Improvements to AKA Security . . . . . . . . . . . . . . . . . 15 + 6. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 15 + 6.1 Registration Template . . . . . . . . . . . . . . . . . . . . 16 + Normative References . . . . . . . . . . . . . . . . . . . . . 16 + Informative References . . . . . . . . . . . . . . . . . . . . 16 + A. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 17 + Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 17 + Full Copyright Statement . . . . . . . . . . . . . . . . . . . 18 + +1. Introduction and Motivation + + The Hypertext Transfer Protocol (HTTP) Authentication Framework, + described in RFC 2617 [2], includes two authentication schemes: Basic + and Digest. Both schemes employ a shared secret based mechanism for + access authentication. The Basic scheme is inherently insecure in + that it transmits user credentials in plain text. The Digest scheme + improves security by hiding user credentials with cryptographic + hashes, and additionally by providing limited message integrity. + + The Authentication and Key Agreement (AKA) [6] mechanism performs + authentication and session key distribution in Universal Mobile + Telecommunications System (UMTS) networks. AKA is a challenge- + response based mechanism that uses symmetric cryptography. AKA is + typically run in a UMTS IM Services Identity Module (ISIM), which + resides on a smart card like device that also provides tamper + resistant storage of shared secrets. + + + + + +Niemi, et. al. 
Informational [Page 2] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + + This document specifies a mapping of AKA parameters onto HTTP Digest + authentication. In essence, this mapping enables the usage of AKA as + a one-time password generation mechanism for Digest authentication. + + As the Session Initiation Protocol (SIP) [3] Authentication Framework + closely follows the HTTP Authentication Framework, Digest AKA is + directly applicable to SIP as well as any other embodiment of HTTP + Digest. + +1.1 Terminology + + This chapter explains the terminology used in this document. + + AKA + Authentication and Key Agreement. + + AuC + Authentication Center. The network element in mobile networks + that can authorize users either in GSM or in UMTS networks. + + AUTN + Authentication Token. A 128 bit value generated by the AuC, which + together with the RAND parameter authenticates the server to the + client. + + AUTS + Authentication Token. A 112 bit value generated by the client + upon experiencing an SQN synchronization failure. + + CK + Cipher Key. An AKA session key for encryption. + + IK + Integrity Key. An AKA session key for integrity check. + + ISIM + IP Multimedia Services Identity Module. + + PIN + Personal Identification Number. Commonly assigned passcodes for + use with automatic cash machines, smart cards, etc. + + RAND + Random Challenge. Generated by the AuC using the SQN. + + RES + Authentication Response. Generated by the ISIM. + + + + +Niemi, et. al. Informational [Page 3] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + + SIM + Subscriber Identity Module. GSM counter part for ISIM. + + SQN + Sequence Number. Both AuC and ISIM maintain the value of the SQN. + + UMTS + Universal Mobile Telecommunications System. + + XRES + Expected Authentication Response. In a successful authentication + this is equal to RES. 
+ +1.2 Conventions + + The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", + "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this + document are to be interpreted as described in BCP 14, RFC 2119 [1]. + +2. AKA Mechanism Overview + + This chapter describes the AKA operation in detail: + + 1. A shared secret K is established beforehand between the ISIM and + the Authentication Center (AuC). The secret is stored in the + ISIM, which resides on a smart card like, tamper resistant device. + + 2. The AuC of the home network produces an authentication vector AV, + based on the shared secret K and a sequence number SQN. The + authentication vector contains a random challenge RAND, network + authentication token AUTN, expected authentication result XRES, a + session key for integrity check IK, and a session key for + encryption CK. + + 3. The authentication vector is downloaded to a server. Optionally, + the server can also download a batch of AVs, containing more than + one authentication vector. + + 4. The server creates an authentication request, which contains the + random challenge RAND, and the network authenticator token AUTN. + + 5. The authentication request is delivered to the client. + + 6. Using the shared secret K and the sequence number SQN, the client + verifies the AUTN with the ISIM. If the verification is + successful, the network has been authenticated. The client then + produces an authentication response RES, using the shared secret K + and the random challenge RAND. + + + +Niemi, et. al. Informational [Page 4] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + + 7. The authentication response, RES, is delivered to the server. + + 8. The server compares the authentication response RES with the + expected response, XRES. If the two match, the user has been + successfully authenticated, and the session keys, IK and CK, can + be used for protecting further communications between the client + and the server. 
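
The eight steps above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the real AKA functions are specified in 3GPP TS 33.102 [6], whereas the hypothetical helper `_prf` below substitutes HMAC-SHA256; the AUTN here is simplified to SQN followed by a MAC (16 bytes total); and the SQN check is reduced to "strictly greater than the last value seen". All function and class names are invented for this sketch.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass


def _prf(key: bytes, label: bytes, *parts: bytes) -> bytes:
    """Stand-in keyed function (assumption); NOT a real AKA function."""
    return hmac.new(key, label + b"".join(parts), hashlib.sha256).digest()


@dataclass
class AuthVector:
    rand: bytes  # random challenge
    autn: bytes  # network authentication token (simplified layout)
    xres: bytes  # expected response
    ik: bytes    # integrity key
    ck: bytes    # cipher key


class SyncFailure(Exception):
    """Raised by the client when the SQN in AUTN is out of range."""


def make_vector(k: bytes, sqn: int) -> AuthVector:
    """Step 2: the AuC derives an authentication vector from K and SQN."""
    rand = os.urandom(16)
    sqn_bytes = sqn.to_bytes(6, "big")
    mac = _prf(k, b"mac", sqn_bytes, rand)[:10]
    return AuthVector(
        rand=rand,
        autn=sqn_bytes + mac,
        xres=_prf(k, b"res", rand)[:8],
        ik=_prf(k, b"ik", rand)[:16],
        ck=_prf(k, b"ck", rand)[:16],
    )


def client_respond(k: bytes, last_sqn: int, rand: bytes, autn: bytes) -> bytes:
    """Step 6: verify AUTN (authenticating the network), then produce RES."""
    sqn_bytes, mac = autn[:6], autn[6:]
    expected = _prf(k, b"mac", sqn_bytes, rand)[:10]
    if not hmac.compare_digest(mac, expected):
        raise ValueError("AUTN verification failed")
    if int.from_bytes(sqn_bytes, "big") <= last_sqn:
        raise SyncFailure("sequence number out of range")
    return _prf(k, b"res", rand)[:8]


def server_verify(av: AuthVector, res: bytes) -> bool:
    """Step 8: the server compares RES with XRES."""
    return hmac.compare_digest(av.xres, res)
```

A caller would run `make_vector` on the AuC side, deliver `rand` and `autn` to the client, pass them to `client_respond`, and hand the result to `server_verify`. A `SyncFailure` corresponds to the out-of-sync case described in the next paragraph.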
+ + When verifying the AUTN, the client may detect that the sequence + numbers between the client and the server have fallen out of sync. + In this case, the client produces a synchronization parameter AUTS, + using the shared secret K and the client sequence number SQN. The + AUTS parameter is delivered to the network in the authentication + response, and the authentication can be tried again based on + authentication vectors generated with the synchronized sequence + number. + + For a specification of the AKA mechanism and the generation of the + cryptographic parameters AUTN, RES, IK, CK, and AUTS, see reference + 3GPP TS 33.102 [6]. + +3. Specification of Digest AKA + + In general, the Digest AKA operation is identical to the Digest + operation in RFC 2617 [2]. This chapter specifies the parts in which + Digest AKA extends the Digest operation. The notation used in the + Augmented BNF definitions for the new and modified syntax elements in + this section is as used in SIP [3], and any elements not defined in + this section are as defined in SIP and the documents to which it + refers. + +3.1 Algorithm Directive + + In order to direct the client into using AKA for authentication + instead of the standard password system, the RFC 2617 defined + algorithm directive is overloaded in Digest AKA: + + algorithm = "algorithm" EQUAL ( aka-namespace + / algorithm-value ) + aka-namespace = aka-version "-" algorithm-value + aka-version = "AKAv" 1*DIGIT + algorithm-value = ( "MD5" / "MD5-sess" / token ) + + algorithm + A string indicating the algorithm used in producing the digest and + the checksum. If the directive is not understood, the nonce + SHOULD be ignored, and another challenge (if one is present) + should be used instead. The default aka-version is "AKAv1". + + + +Niemi, et. al. Informational [Page 5] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + + Further AKA versions can be specified, with version numbers + assigned by IANA [7]. 
When the algorithm directive is not + present, it is assumed to be "MD5". This indicates that AKA is + not used to produce the Digest password. + + Example: + + algorithm=AKAv1-MD5 + + If the entropy of the used RES value is limited (e.g., only 32 + bits), reuse of the same RES value in authenticating subsequent + requests and responses is NOT RECOMMENDED. Such a RES value + SHOULD only be used as a one-time password, and algorithms such as + "MD5-sess", which limit the amount of material hashed with a + single key, by producing a session key for authentication, SHOULD + NOT be used. + +3.2 Creating a Challenge + + In order to deliver the AKA authentication challenge to the client in + Digest AKA, the nonce directive defined in RFC 2617 is extended: + + nonce = "nonce" EQUAL ( aka-nonce + / nonce-value ) + aka-nonce = LDQUOT aka-nonce-value RDQUOT + aka-nonce-value = <base64 encoding of RAND, AUTN, and + server specific data> + + nonce + A parameter, which is populated with the Base64 [4] encoding of + the concatenation of the AKA authentication challenge RAND, the + AKA AUTN token, and optionally some server specific data, as in + Figure 1. + + Example: + + nonce="MzQ0a2xrbGtmbGtsZm9wb2tsc2tqaHJzZXNy9uQyMzMzMzQK=" + + 0 1 2 3 + 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | + | RAND | + | | + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | + | AUTN | + | | + | | + +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | Server Data... + +-+-+-+-+-+-+-+-+-+-+-+ + + Figure 1: Generating the nonce value. + + If the server receives a client authentication containing the "auts" + parameter defined in Section 3.4, that includes a valid AKA AUTS + parameter, the server MUST use it to generate a new challenge to the + client.
Note that when the AUTS is present, the included "response" + parameter is calculated using an empty password (password of ""), + instead of a RES. + +3.3 Client Authentication + + When a client receives a Digest AKA authentication challenge, it + extracts the RAND and AUTN from the "nonce" parameter, and assesses + the AUTN token provided by the server. If the client successfully + authenticates the server with the AUTN, and determines that the SQN + used in generating the challenge is within the expected range, the AKA + algorithms are run with the RAND challenge and shared secret K. + + The resulting AKA RES parameter is treated as a "password" when + calculating the response directive of RFC 2617. + +3.4 Synchronization Failure + + For indicating an AKA sequence number synchronization failure, and to + re-synchronize the SQN in the AuC using the AUTS token, a new + directive is defined for the "digest-response" of the "Authorization" + request header defined in RFC 2617: + + auts = "auts" EQUAL auts-param + auts-param = LDQUOT auts-value RDQUOT + auts-value = <base64 encoding of AUTS> + + + auts + A string carrying a base64 encoded AKA AUTS parameter. This + directive is used to re-synchronize the server side SQN. If the + directive is present, the client doesn't use any password when + calculating its credentials. Instead, the client MUST calculate + its credentials using an empty password (password of ""). + + Example: + + auts="CjkyMzRfOiwg5CfkJ2UK=" + + Upon receiving the "auts" parameter, the server will check the + validity of the parameter value using the shared secret K. A valid + AUTS parameter is used to re-synchronize the SQN in the AuC. The + synchronized SQN is then used to generate a fresh authentication + vector AV, with which the client is then re-challenged.
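
As a rough illustration of Sections 3.2 through 3.4, the Python sketch below packs RAND and AUTN into a nonce, unpacks them again, and computes an RFC 2617 response with qop "auth" or "auth-int" in which the password is either RES or, when "auts" is sent, the empty string. The function names are hypothetical. Passing RES to the hash as raw bytes is an assumption of this sketch, and the 16-byte RAND length is read off Figure 1.

```python
import base64
import hashlib


def build_nonce(rand: bytes, autn: bytes, server_data: bytes = b"") -> str:
    """Base64-encode RAND || AUTN || optional server data (Figure 1)."""
    if len(rand) != 16 or len(autn) != 16:
        raise ValueError("RAND and AUTN are expected to be 16 bytes each")
    return base64.b64encode(rand + autn + server_data).decode("ascii")


def parse_nonce(nonce: str):
    """Split a Digest AKA nonce back into (RAND, AUTN, server data)."""
    raw = base64.b64decode(nonce)
    if len(raw) < 32:
        raise ValueError("nonce too short to carry RAND and AUTN")
    return raw[:16], raw[16:32], raw[32:]


def _md5_hex(data: bytes) -> str:
    return hashlib.md5(data).hexdigest()


def digest_response(username: str, realm: str, password: bytes,
                    method: str, uri: str, nonce: str,
                    nc: str, cnonce: str, qop: str = "auth",
                    body: bytes = b"") -> str:
    """RFC 2617 request-digest for algorithm MD5 with qop present.

    For Digest AKA, `password` is RES; with "auts" it is b"".
    `body` is only hashed when qop is "auth-int".
    """
    ha1 = _md5_hex(f"{username}:{realm}:".encode() + password)
    a2 = f"{method}:{uri}"
    if qop == "auth-int":
        a2 += ":" + _md5_hex(body)
    ha2 = _md5_hex(a2.encode())
    return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}".encode())
```

A client that detects a synchronization failure would call `digest_response` with `password=b""` and add the base64 encoded AUTS as the "auts" directive; otherwise it passes the RES bytes produced by the ISIM.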
+ +3.5 Server Authentication + + Even though AKA provides inherent mutual authentication with the AKA + AUTN token, mutual authentication mechanisms provided by Digest may + still be useful in order to provide message integrity. + + In Digest AKA, the server uses the AKA XRES parameter as "password" + when calculating the "response-auth" of the "Authentication-Info" + header defined in RFC 2617. + +4. Example Digest AKA Operation + + Figure 2 below describes a message flow describing a Digest AKA + process of authenticating a SIP request, namely the SIP REGISTER + request. + + + + + + + + + + + + + + +Niemi, et. al. Informational [Page 8] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + + Client Server + + | 1) REGISTER | + |------------------------------------------------------>| + | | + | +-----------------------------+ + | | Server runs AKA algorithms, | + | | generates RAND and AUTN. | + | +-----------------------------+ + | | + | 2) 401 Unauthorized | + | WWW-Authenticate: Digest | + | (RAND, AUTN delivered) | + |<------------------------------------------------------| + | | + +------------------------------------+ | + | Client runs AKA algorithms on ISIM,| | + | verifies AUTN, derives RES | | + | and session keys. | | + +------------------------------------+ | + | | + | 3) REGISTER | + | Authorization: Digest (RES is used) | + |------------------------------------------------------>| + | | + | +------------------------------+ + | | Server checks the given RES, | + | | and finds it correct. | + | +------------------------------+ + | | + | 4) 200 OK | + | Authentication-Info: (XRES is used) | + |<------------------------------------------------------| + | | + + Figure 2: Message flow representing a successful authentication. + + 1) Initial request + + REGISTER sip:home.mobile.biz SIP/2.0 + + + + + + + + + + + +Niemi, et. al. 
Informational [Page 9] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + + 2) Response containing a challenge + + SIP/2.0 401 Unauthorized + WWW-Authenticate: Digest + realm="RoamingUsers@mobile.biz", + nonce="CjPk9mRqNuT25eRkajM09uTl9nM09uTl9nMz5OX25PZz==", + qop="auth,auth-int", + opaque="5ccc069c403ebaf9f0171e9517f40e41", + algorithm=AKAv1-MD5 + + 3) Request containing credentials + + REGISTER sip:home.mobile.biz SIP/2.0 + Authorization: Digest + username="jon.dough@mobile.biz", + realm="RoamingUsers@mobile.biz", + nonce="CjPk9mRqNuT25eRkajM09uTl9nM09uTl9nMz5OX25PZz==", + uri="sip:home.mobile.biz", + qop=auth-int, + nc=00000001, + cnonce="0a4f113b", + response="6629fae49393a05397450978507c4ef1", + opaque="5ccc069c403ebaf9f0171e9517f40e41" + + 4) Successful response + + SIP/2.0 200 OK + Authentication-Info: + qop=auth-int, + rspauth="6629fae49393a05397450978507c4ef1", + cnonce="0a4f113b", + nc=00000001 + + + + + + + + + + + + + + + + + + + +Niemi, et. al. Informational [Page 10] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + + Figure 3 below describes a message flow describing a Digest AKA + authentication process, in which there is a synchronization failure. + + Client Server + + | 1) REGISTER | + |------------------------------------------------------>| + | | + | +-----------------------------+ + | | Server runs AKA algorithms, | + | | generates RAND and AUTN. | + | +-----------------------------+ + | | + | 2) 401 Unauthorized | + | WWW-Authenticate: Digest | + | (RAND, AUTN delivered) | + |<------------------------------------------------------| + | | + +------------------------------------+ | + | Client runs AKA algorithms on ISIM,| | + | verifies the AUTN, but discovers | | + | that it contains an invalid | | + | sequence number. The client then | | + | generates an AUTS token. 
| | + +------------------------------------+ | + | | + | 3) REGISTER | + | Authorization: Digest (AUTS is delivered) | + |------------------------------------------------------>| + | | + | +-----------------------+ + | | Server performs | + | | re-synchronization | + | | using AUTS and RAND. | + | +-----------------------+ + | | + | 4) 401 Unauthorized | + | WWW-Authenticate: Digest | + | (re-synchronized RAND, | + | AUTN delivered) | + |<------------------------------------------------------| + | | + + Figure 3: Message flow representing an authentication synchronization + failure. + + + + + + +Niemi, et. al. Informational [Page 11] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + + 1) Initial request + + REGISTER sip:home.mobile.biz SIP/2.0 + + 2) Response containing a challenge + + SIP/2.0 401 Unauthorized + WWW-Authenticate: Digest + realm="RoamingUsers@mobile.biz", + qop="auth", + nonce="CjPk9mRqNuT25eRkajM09uTl9nM09uTl9nMz5OX25PZz==", + opaque="5ccc069c403ebaf9f0171e9517f40e41", + algorithm=AKAv1-MD5 + + 3) Request containing credentials + + REGISTER sip:home.mobile.biz SIP/2.0 + Authorization: Digest + username="jon.dough@mobile.biz", + realm="RoamingUsers@mobile.biz", + nonce="CjPk9mRqNuT25eRkajM09uTl9nM09uTl9nMz5OX25PZz==", + uri="sip:home.mobile.biz", + qop=auth, + nc=00000001, + cnonce="0a4f113b", + response="4429ffe49393c02397450934607c4ef1", + opaque="5ccc069c403ebaf9f0171e9517f40e41", + auts="5PYxMuX2NOT2NeQ=" + + 4) Response containing a new challenge + + SIP/2.0 401 Unauthorized + WWW-Authenticate: Digest + realm="RoamingUsers@mobile.biz", + qop="auth,auth-int", + nonce="9uQzNPbk9jM05Pbl5Pbl5DIz9uTl9uTl9jM0NTHk9uXk==", + opaque="dcd98b7102dd2f0e8b11d0f600bfb0c093", + algorithm=AKAv1-MD5 + +5. Security Considerations + + In general, Digest AKA is vulnerable to the same security threats as + HTTP authentication [2]. This chapter discusses the relevant + exceptions. + + + + + + + +Niemi, et. al. 
Informational [Page 12] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + +5.1 Authentication of Clients using Digest AKA + + AKA is typically -- though this isn't a theoretical limitation -- run + on an ISIM application that usually resides in a tamper resistant + smart card. Interfaces to the ISIM exist, which enable the host + device to request authentication to be performed on the card. + However, these interfaces do not allow access to the long-term secret + outside the ISIM, and the authentication can only be performed if the + device accessing the ISIM has knowledge of a PIN code, shared between + the user and the ISIM. Such PIN codes are typically obtained from + user input, and are usually required when the device is powered on. + + The use of tamper resistant cards with secure interfaces implies that + Digest AKA is typically more secure than regular Digest + implementations, as neither possession of the host device nor Trojan + Horses in the software give access to the long term secret. Where a + PIN scheme is used, the user is also authenticated when the device is + powered on. However, there may be a difference in the resulting + security of Digest AKA, compared to traditional Digest + implementations, depending of course on whether those implementations + cache/store passwords that are received from the user. + +5.2 Limited Use of Nonce Values + + The Digest scheme uses server-specified nonce values to seed the + generation of the request-digest value. The server is free to + construct the nonce in such a way, that it may only be used from a + particular client, for a particular resource, for a limited period of + time or number of uses, or any other restrictions. Doing so + strengthens the protection provided against, for example, replay + attacks. + + Digest AKA limits the applicability of a nonce value to a particular + ISIM. Typically, the ISIM is accessible only to one client device at + a time. 
However, the nonce values are strong and secure even though + limited to a particular ISIM. Additionally, this requires that the + server is provided with the client identity before an authentication + challenge can be generated. If a client identity is not available, + an additional round trip is needed to acquire it. Such a case is + analogous to an AKA synchronization failure. + + A server may allow each nonce value to be used only once by sending a + next-nonce directive in the Authentication-Info header field of every + response. However, this may cause a synchronization failure, and + consequently some additional round trips in AKA, if the same SQN + space is also used for other access schemes at the same time. + + + + + +Niemi, et. al. Informational [Page 13] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + +5.3 Multiple Authentication Schemes and Algorithms + + In HTTP authentication, a user agent MUST choose the strongest + authentication scheme it understands and request credentials from the + user, based upon that challenge. + + In general, using passwords generated by Digest AKA with other HTTP + authentication schemes is not recommended even though the realm + values or protection domains would coincide. In these cases, a + password should be requested from the end-user instead. Digest AKA + passwords MUST NOT be re-used with such HTTP authentication schemes, + which send the password in clear. In particular, AKA passwords MUST + NOT be re-used with HTTP Basic. + + The same principle must be applied within a scheme if several + algorithms are supported. A client receiving an HTTP Digest + challenge with several available algorithms MUST choose the strongest + algorithm it understands. For example, Digest with "AKAv1-MD5" would + be stronger than Digest with "MD5". 
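
The selection rule above, together with the algorithm grammar of Section 3.1, might be sketched in Python as follows. The names `parse_algorithm` and `choose_strongest` are hypothetical. The ranking is an assumption of this sketch: the text only states that "AKAv1-MD5" is stronger than "MD5", so here any AKA variant outranks a non-AKA one and a higher AKA version outranks a lower one.

```python
import re
from typing import Iterable, NamedTuple, Optional, Set

# aka-namespace = aka-version "-" algorithm-value ; aka-version = "AKAv" 1*DIGIT
_AKA_NAMESPACE = re.compile(r"AKAv([0-9]+)-(.+)")


class Algorithm(NamedTuple):
    aka_version: Optional[int]  # None when AKA is not used
    base: str                   # e.g. "MD5" or "MD5-sess"


def parse_algorithm(value: Optional[str]) -> Algorithm:
    """Parse an algorithm directive; a missing directive means "MD5"."""
    if value is None:
        return Algorithm(None, "MD5")
    match = _AKA_NAMESPACE.fullmatch(value)
    if match:
        return Algorithm(int(match.group(1)), match.group(2))
    return Algorithm(None, value)


def choose_strongest(offered: Iterable[Optional[str]],
                     supported: Set[str]) -> Optional[str]:
    """Pick the strongest offered algorithm that the client understands."""
    best = None
    for value in offered:
        name = "MD5" if value is None else value
        if name not in supported:
            continue
        algo = parse_algorithm(name)
        # Assumed ordering: AKA before non-AKA, then by AKA version.
        rank = (algo.aka_version is not None, algo.aka_version or 0)
        if best is None or rank > best[0]:
            best = (rank, name)
    return None if best is None else best[1]
```

With challenges offering "MD5" and "AKAv1-MD5" and a client supporting both, `choose_strongest` returns "AKAv1-MD5"; a client that understands neither offered value gets `None` and cannot answer the challenge.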
+ +5.4 Online Dictionary Attacks + + Since user-selected passwords are typically quite simple, it has been + proposed that servers should not accept passwords for HTTP Digest, + which are in the dictionary [2]. This potential threat does not + exist in HTTP Digest AKA because the algorithm will use ISIM + originated passwords. However, the end-user must still be careful + with PIN codes. Even though HTTP Digest AKA password requests are + never displayed to the end-user, she will be authenticated to the + ISIM via a PIN code. Commonly known initial PIN codes are typically + installed to the ISIM during manufacturing and if the end-users do + not change them, there is a danger that an unauthorized user may be + able to use the device. Naturally this requires that the + unauthorized user has access to the physical device, and that the + end-user has not changed the initial PIN code. For this reason, + end-users are strongly encouraged to change their PIN codes when they + receive an ISIM. + +5.5 Session Protection + + Digest AKA is able to generate additional session keys for integrity + (IK) and confidentiality (CK) protection. Even though this document + does not specify the use of these additional keys, they may be used + for creating additional security within HTTP authentication or some + other security mechanism. + + + + + + +Niemi, et. al. Informational [Page 14] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + +5.6 Replay Protection + + AKA allows sequence numbers to be tracked for each authentication, + with the SQN parameter. This allows authentications to be replay + protected even if the RAND parameter happened to be the same for two + authentication requests. More importantly, this offers additional + protection for the case where an attacker replays an old + authentication request sent by the network. The client will be able + to detect that the request is old, and refuse authentication. 
This + proves liveliness of the authentication request even in the case + where a MitM attacker tries to trick the client into providing an + authentication response, and then replaces parts of the message with + something else. In other words, a client challenged by Digest AKA is + not vulnerable for chosen plain text attacks. Finally, frequent + sequence number errors would reveal an attack where the tamper + resistant card has been cloned and is being used in multiple devices. + + The downside of sequence number tracking is that servers must hold + more information for each user than just their long-term secret, + namely the current SQN value. However, this information is typically + not stored in the SIP nodes, but in dedicated authentication servers + instead. + +5.7 Improvements to AKA Security + + Even though AKA is perceived as a secure mechanism, Digest AKA is + able to improve it. More specifically, the AKA parameters carried + between the client and the server during authentication may be + protected along with other parts of the message by using Digest AKA. + This is not possible with plain AKA. + +6. IANA Considerations + + This document specifies an aka-version namespace in Section 3.1 which + requires a central coordinating body. The body responsible for this + coordination is the Internet Assigned Numbers Authority (IANA). + + The default aka-version defined in this document is "AKAv1". + Following the policies outlined in [5], versions above 1 are + allocated as Expert Review. + + Registrations with the IANA MUST include the version number being + registered, including the "AKAv" prefix. For example, a registration + for "AKAv2" would potentially be a valid one, whereas a registration + for "FOOv2" or "2" would not be valid. Further, the registration + MUST include contact information for the party responsible for the + registration. + + + + +Niemi, et. al. 
Informational [Page 15] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + + As this document defines the default aka-version, the initial IANA + registration for aka-version values will contain an entry for + "AKAv1". + +6.1 Registration Template + + To: ietf-digest-aka@iana.org + Subject: Registration of a new AKA version + + Version identifier: + + (Must contain a valid aka-version value, + as described in section 3.1.) + + Person & email address to contact for further information: + + (Must contain contact information for the + person(s) responsible for the registration.) + +Normative References + + [1] Bradner, S., "Key words for use in RFCs to Indicate Requirement + Levels", BCP 14, RFC 2119, March 1997. + + [2] Franks, J., Hallam-Baker, P., Hostetler, J., Lawrence, S., + Leach, P., Luotonen, A. and L. Stewart, "HTTP Authentication: + Basic and Digest Access Authentication", RFC 2617, June 1999. + + [3] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, A., + Peterson, J., Sparks, R., Handley, M. and E. Schooler, "SIP: + Session Initiation Protocol", RFC 3261, June 2002. + + [4] Freed, N. and N. Borenstein, "Multipurpose Internet Mail + Extensions (MIME) Part One: Format of Internet Message Bodies", + RFC 2045, November 1996. + +Informative References + + [5] Narten, T. and H. Alvestrand, "Guidelines for Writing an IANA + Considerations Section in RFCs", BCP 26, RFC 2434, October 1998. + + [6] 3rd Generation Partnership Project, "Security Architecture + (Release 4)", TS 33.102, December 2001. + + [7] http://www.iana.org, "Assigned Numbers". + + + + + + +Niemi, et. al. Informational [Page 16] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + +Appendix A. Acknowledgements + + The authors would like to thank Sanjoy Sen, Jonathan Rosenberg, Pete + McCann, Tao Haukka, Ilkka Uusitalo, Henry Haverinen, John Loughney, + Allison Mankin and Greg Rose. + +Authors' Addresses + + Aki Niemi + Nokia + P.O. 
Box 301 + NOKIA GROUP, FIN 00045 + Finland + + Phone: +358 50 389 1644 + EMail: aki.niemi@nokia.com + + + Jari Arkko + Ericsson + Hirsalantie 1 + Jorvas, FIN 02420 + Finland + + Phone: +358 40 5079256 + EMail: jari.arkko@ericsson.com + + + Vesa Torvinen + Ericsson + Joukahaisenkatu 1 + Turku, FIN 20520 + Finland + + Phone: +358 40 7230822 + EMail: vesa.torvinen@ericsson.fi + + + + + + + + + + + + + + + +Niemi, et. al. Informational [Page 17] + +RFC 3310 HTTP Digest Authentication Using AKA September 2002 + + +Full Copyright Statement + + Copyright (C) The Internet Society (2002). All Rights Reserved. + + This document and translations of it may be copied and furnished to + others, and derivative works that comment on or otherwise explain it + or assist in its implementation may be prepared, copied, published + and distributed, in whole or in part, without restriction of any + kind, provided that the above copyright notice and this paragraph are + included on all such copies and derivative works. However, this + document itself may not be modified in any way, such as by removing + the copyright notice or references to the Internet Society or other + Internet organizations, except as needed for the purpose of + developing Internet standards in which case the procedures for + copyrights defined in the Internet Standards process must be + followed, or as required to translate it into languages other than + English. + + The limited permissions granted above are perpetual and will not be + revoked by the Internet Society or its successors or assigns. + + This document and the information contained herein is provided on an + "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING + TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING + BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION + HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF + MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. 
+ +Acknowledgement + + Funding for the RFC Editor function is currently provided by the + Internet Society. + + + + + + + + + + + + + + + + + + + +Niemi, et. al. Informational [Page 18] +