From: (no author) <(no author)@unknown> Date: Thu, 20 Feb 1997 04:26:07 +0000 (+0000) Subject: This commit was manufactured by cvs2svn to create tag 'APACHE_1_2b7'. X-Git-Tag: APACHE_1_2b7^0 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=ea1935cdf60fbb4c35de972d4e0bcc801bb66f1f;p=thirdparty%2Fapache%2Fhttpd.git This commit was manufactured by cvs2svn to create tag 'APACHE_1_2b7'. git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/tags/APACHE_1_2b7@77651 13f79535-47bb-0310-9956-ffa450edef68 --- diff --git a/docs/docroot/apache_pb.gif b/docs/docroot/apache_pb.gif deleted file mode 100644 index 3a1c139fc42..00000000000 Binary files a/docs/docroot/apache_pb.gif and /dev/null differ diff --git a/docs/manual/bind.html.en b/docs/manual/bind.html.en deleted file mode 100644 index cb1fa0daacf..00000000000 --- a/docs/manual/bind.html.en +++ /dev/null @@ -1,100 +0,0 @@ - -Setting which addresses and ports Apache uses - - - -

Setting which addresses and ports Apache uses

- -

- -When Apache starts, it connects to some port and address on the -local machine and waits for incoming requests. By default, it -listens to all addresses on the machine, and to the port -as specified by the Port directive in the server configuration. -However, it can be told to listen to more the one port, or to listen -to only selected addresses, or a combination. This is often combined -with the Virtual Host feature which determines how Apache -responds to different IP addresses, hostnames and ports.

- -There are two directives used to restrict or specify which addresses -and ports Apache listens to. - -

BindAddress is used to restrict the server to listening to - a single address, and can be used to permit multiple Apache servers - on the same machine listening to different IP addresses. -
Listen can be used to make a single Apache server listen - to more than one address and/or port. -

- -

BindAddress

-Syntax: BindAddress [ * | IP-address | hostname ]
-Default: BindAddress *
-Context: server config
-Status: Core

- -Makes the server listen to just the specified address. If the argument -is *, the server listens to all addresses. The port listened to -is set with the Port directive. Only one BindAddress -should be used. - -

Listen

-Syntax: Listen [ port | IP-address:port ]
-Default: none
-Context: server config
-Status: Core

- -Listen can be used instead of BindAddress and -Port. It tells the server to accept incoming requests on the -specified port or address-and-port combination. If the first format is -used, with a port number only, the server listens to the given port on -all interfaces, instead of the port given by the Port -directive. If an IP address is given as well as a port, the server -will listen on the given port and interface.

Multiple Listen -directives may be used to specify a number of addresses and ports to -listen to. The server will respond to requests from any of the listed -addresses and ports.

- -For example, to make the server accept connections on both port -80 and port 8000, use: -

-   Listen 80
-   Listen 8000
-

- -To make the server accept connections on two specified -interfaces and port numbers, use -

-   Listen 192.170.2.1:80
-   Listen 192.170.2.5:8000
-

- -

How this works with Virtual Hosts

- -BindAddress and Listen do not implement Virtual Hosts. They tell the -main server what addresses and ports to listen to. If no -<VirtualHost> directives are used, the server will behave the -same for all accepted requests. However, <VirtualHost> can be -used to specify a different behavour for one or more of the addresses -and ports. To implement a VirtualHost, the server must first be told -to listen to the address and port to be used. Then a -<VirtualHost> section should be created for a specified address -and port to set the behavior of this virtual host. Note that if the -<VirtualHost> is set for an address and port that the server is -not listening to, it cannot be accessed. - -

PATH_INFO Changes in the CGI Environment

- -

Overview

- -

As implemented in Apache 1.1.1 and earlier versions, the method -Apache used to create PATH_INFO in the CGI environment was -counterintiutive, and could result in crashes in certain cases. In -Apache 1.2 and beyond, this behavior has changed. Although this -results in some compatibility problems with certain legacy CGI -applications, the Apache 1.2 behavior is still compatible with the -CGI/1.1 specification, and CGI scripts can be easily modified (see below). - -

The Problem

- -

Apache 1.1.1 and earlier implemented the PATH_INFO and SCRIPT_NAME -environment variables by looking at the filename, not the URL. While -this resulted in the correct values in many cases, when the filesystem -path was overloaded to contain path information, it could result in -errant behavior. For example, if the following appeared in a config -file: -

-     Alias /cgi-ralph /usr/local/httpd/cgi-bin/user.cgi/ralph
-

In this case, user.cgi is the CGI script, the "/ralph" -is information to be passed onto the CGI. If this configuration was in -place, and a request came for "/cgi-ralph/script/", the -code would set PATH_INFO to "/ralph/script", and -SCRIPT_NAME to "/cgi-". Obviously, the latter is -incorrect. In certain cases, this could even cause the server to -crash.

- -

The Solution

- -

Apache 1.2 and later now determine SCRIPT_NAME and PATH_INFO by -looking directly at the URL, and determining how much of the URL is -client-modifiable, and setting PATH_INFO to it. To use the above -example, PATH_INFO would be set to "/script", and -SCRIPT_NAME to "/cgi-ralph". This makes sense and results -in no server behavior problems. It also permits the script to be -gauranteed that -"http://$SERVER_NAME:$SERVER_PORT$SCRIPT_NAME$PATH_INFO" -will always be an accessable URL that points to the current script, -something which was not neccessarily true with previous versions of -Apache. - -

However, the "/ralph" -information from the Alias directive is lost. This is -unfortunate, but we feel that using the filesystem to pass along this -sort of information is not a recommended method, and a script making -use of it "deserves" not to work. Apache 1.2b3 and later, however, do -provide a workaround. - -

Compatibility with Previous Servers

- -

It may be neccessary for a script that was designed for earlier -versions of Apache or other servers to need the information that the -old PATH_INFO variable provided. For this purpose, Apache 1.2 (1.2b3 -and later) sets an additional variable, FILEPATH_INFO. This -environment variable contains the value that PATH_INFO would have had -with Apache 1.1.1.

- -

A script that wishes to work with both Apache 1.2 and earlier -versions can simply test for the existance of FILEPATH_INFO, and use -it if available. Otherwise, it can use PATH_INFO. For example, in -Perl, one might use: -

-    $path_info = $ENV{'FILEPATH_INFO'} || $ENV{'PATH_INFO'};
-

- -

By doing this, a script can work with all servers supporting the -CGI/1.1 specification, including all versions of Apache.

- - - - - diff --git a/docs/manual/content-negotiation.html.en b/docs/manual/content-negotiation.html.en deleted file mode 100644 index 45388ff2943..00000000000 --- a/docs/manual/content-negotiation.html.en +++ /dev/null @@ -1,419 +0,0 @@ - - - -Apache Content Negotiation - - - - -

Content Negotiation

- -Apache's support for content negotiation has been updated to meet the -HTTP/1.1 specification. It can choose the best representation of a -resource based on the browser-supplied preferences for media type, -languages, character set and encoding. It is also implements a -couple of features to give more intelligent handling of requests from -browsers which send incomplete negotiation information.

- -Content negotiation is provided by the -mod_negotiation module, -which is compiled in by default. - -

- -

About Content Negotiation

- -A resource may be available in several different representations. For -example, it might be available in different languages or different -media types, or a combination. One way of selecting the most -appropriate choice is to give the user an index page, and let them -select. However it is often possible for the server to choose -automatically. This works because browsers can send as part of each -request information about what representations they prefer. For -example, a browser could indicate that it would like to see -information in French, if possible, else English will do. Browsers -indicate their preferences by headers in the request. To request only -French representations, the browser would send - -

-  Accept-Language: fr
-

- -Note that this preference will only be applied when there is a choice -of representations and they vary by language. -

- -As an example of a more complex request, this browser has been -configured to accept French and English, but prefer French, and to -accept various media types, preferring HTML over plain text or other -text types, and prefering GIF or jpeg over other media types, but also -allowing any other media type as a last resort: - -

-  Accept-Language: fr; q=1.0, en; q=0.5
-  Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6,
-        image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1
-

- -Apache 1.2 supports 'server driven' content negotiation, as defined in -the HTTP/1.1 specification. It fully supports the Accept, -Accept-Language, Accept-Charset and Accept-Encoding request headers. -

- -The terms used in content negotiation are: a resource is an -item which can be requested of a server, which might be selected as -the result of a content negotiation algorithm. If a resource is -available in several formats, these are called representations -or variants. The ways in which the variants for a particular -resource vary are called the dimensions of negotiation. - -

Negotiation in Apache

- -In order to negotiate a resource, the server needs to be given -information about each of the variants. This is done in one of two -ways: - -

Using a type map (i.e., a *.var file) which - names the files containing the variants explicitly -
Or using a 'MultiViews' search, where the server does an implicit - filename pattern match, and chooses from among the results. -

- -

Using a type-map file

- -A type map is a document which is associated with the handler -named type-map (or, for backwards-compatibility with -older Apache configurations, the mime type -application/x-type-map). Note that to use this feature, -you've got to have a SetHandler some place which defines a -file suffix as type-map; this is best done with a -

-
-  AddHandler type-map var
-
-

-in srm.conf. See comments in the sample config files for -details.

- -Type map files have an entry for each available variant; these entries -consist of contiguous RFC822-format header lines. Entries for -different variants are separated by blank lines. Blank lines are -illegal within an entry. It is conventional to begin a map file with -an entry for the combined entity as a whole (although this -is not required, and if present will be ignored). An example -map file is: -

-
-  URI: foo
-
-  URI: foo.en.html
-  Content-type: text/html
-  Content-language: en
-
-  URI: foo.fr.de.html
-  Content-type: text/html; charset=iso-8859-2
-  Content-language: fr, de
-

- -If the variants have different source qualities, that may be indicated -by the "qs" parameter to the media type, as in this picture (available -as jpeg, gif, or ASCII-art): -

-  URI: foo
-
-  URI: foo.jpeg
-  Content-type: image/jpeg; qs=0.8
-
-  URI: foo.gif
-  Content-type: image/gif; qs=0.5
-
-  URI: foo.txt
-  Content-type: text/plain; qs=0.01
-
-

- -qs values can vary between 0.000 and 1.000. Note that any variant with -a qs value of 0.000 will never be chosen. Variants with no 'qs' -parameter value are given a qs factor of 1.0.

- -The full list of headers recognized is: - -

URI: -: uri of the file containing the variant (of the given media - type, encoded with the given content encoding). These are - interpreted as URLs relative to the map file; they must be on - the same server (!), and they must refer to files to which the - client would be granted access if they were to be requested - directly. -
Content-type: -: media type --- charset, level and "qs" parameters may be given. These - are often referred to as MIME types; typical media types are - image/gif, text/plain, or - text/html; level=3. -
Content-language: -: The languages of the variant, specified as an internet standard - language code (e.g., en for English, - kr for Korean, etc.). -
Content-encoding: -: If the file is compressed, or otherwise encoded, rather than - containing the actual raw data, this says how that was done. - For compressed files (the only case where this generally comes - up), content encoding should be - x-compress, or x-gzip, as appropriate. -
Content-length: -: The size of the file. Clients can ask to receive a given media - type only if the variant isn't too big; specifying a content - length in the map allows the server to compare against these - thresholds without checking the actual file. -

- -

Multiviews

- -This is a per-directory option, meaning it can be set with an -Options directive within a <Directory>, -<Location> or <Files> -section in access.conf, or (if AllowOverride -is properly set) in .htaccess files. Note that -Options All does not set MultiViews; you -have to ask for it by name. (Fixing this is a one-line change to -http_core.h). - -

- -The effect of MultiViews is as follows: if the server -receives a request for /some/dir/foo, if -/some/dir has MultiViews enabled, and -/some/dir/foo does not exist, then the server reads the -directory looking for files named foo.*, and effectively fakes up a -type map which names all those files, assigning them the same media -types and content-encodings it would have if the client had asked for -one of them by name. It then chooses the best match to the client's -requirements, and forwards them along. - -

- -This applies to searches for the file named by the -DirectoryIndex directive, if the server is trying to -index a directory; if the configuration files specify -

-
-  DirectoryIndex index
-
-

then the server will arbitrate between index.html -and index.html3 if both are present. If neither are -present, and index.cgi is there, the server will run it. - -

- -If one of the files found when reading the directive is a CGI script, -it's not obvious what should happen. The code gives that case -special treatment --- if the request was a POST, or a GET with -QUERY_ARGS or PATH_INFO, the script is given an extremely high quality -rating, and generally invoked; otherwise it is given an extremely low -quality rating, which generally causes one of the other views (if any) -to be retrieved. - -

The Negotiation Algorithm

- -After Apache has obtained a list of the variants for a given resource, -either from a type-map file or from the filenames in the directory, it -applies a algorithm to decide on the 'best' variant to return, if -any. To do this it calculates a quality value for each variant in each -of the dimensions of variance. It is not necessary to know any of the -details of how negotaion actually takes place in order to use Apache's -content negotation features. However the rest of this document -explains in detail the algorithm used for those interested.

- -In some circumstances, Apache can 'fiddle' the quality factor of a -particular dimension to achive a better result. The ways Apache can -fiddle quality factors is explained in more detail below. - -

Dimensions of Negotation

- - -

Dimension -	Notes -
Media Type -	Browser indicates preferences on Accept: header. Each item -can have an associated quality factor. Variant description can also -have a quality factor. -
Language -	Browser indicates preferneces on Accept-Language: header. Each -item -can have a quality factor. Variants can be associated with none, one -or more languages. -
Encoding -	Browser indicates preference with Accept-Encoding: header. -
Charset -	Browser indicates preference with Accept-Charset: header. Variants -can indicate a charset as a parameter of the media type. -

- -

Apache Negotiation Algorithm

- -Apache uses an algorithm to select the 'best' variant (if any) to -return to the browser. This algorithm is not configurable. It operates -like this: -

- -

-Firstly, for each dimension of the negotiation, the appropriate -Accept header is checked and a quality assigned to this each -variant. If the Accept header for any dimension means that this -variant is not acceptable, eliminate it. If no variants remain, go -to step 4. - -
Select the 'best' variant by a process of elimination. Each of -the following tests is applied in order. Any variants not selected at -each stage are eliminated. After each test, if only one variant -remains, it is selected as the best match. If more than one variant -remains, move onto the next test. - -
1. Multiply the quality factor from the Accept header with the - quality-of-source factor for this variant's media type, and select - the variants with the highest value - -
2. Select the variants with the highest language quality factor - -
3. Select the variants with the best language match, using either the - order of languages on the LanguagePriority directive (if present), - else the order of languages on the Accept-Language header. - -
4. Select the variants with the highest 'level' media parameter - (used to give the version of text/html media types). - -
5. Select only unencoded variants, if there is a mix of encoded - and non-encoded variants. If either all variants are encoded - or all variants are not encoded, select all. - -
6. Select only variants with acceptable charset media parameters, - as given on the Accept-Charset header line. Charset ISO-8859-1 - is always acceptable. Variants not associated with a particular - charset are assumed to be in ISO-8859-1. - -
7. Select the variants with the smallest content length - -
8. Select the first variant of those remaining (this will be either the -first listed in the type-map file, or the first read from the directory) -and go to stage 3. - -
- -
The algorithm has now selected one 'best' variant, so return - it as the response. The HTTP response header Vary is set to indicate the - dimensions of negotation (browsers and caches can use this - information when caching the resource). End. - -
To get here means no variant was selected (because non are acceptable - to the browser). Return a 406 status (meaning "No acceptable representation") - with a response body consisting of an HTML document listing the - available variants. Also set the HTTP Vary header to indicate the - dimensions of variance. - -

Fiddling with Quality Values

- -Apache sometimes changes the quality values from what would be -expected by a strict interpretation of the algorithm above. This is to -get a better result from the algorithm for browsers which do not send -full or accurate information. Some of the most popular browsers send -Accept header information which would otherwise result in the -selection of the wrong variant in many cases. If a browser -sends full and correct information these fiddles will not -be applied. -

- -

Media Types and Wildcards

- -The Accept: request header indicates preferences for media types. It -can also include 'wildcard' media types, such as "image/*" or "*/*" -where the * matches any string. So a request including: -

-  Accept: image/*, */*
-

- -would indicate that any type starting "image/" is acceptable, -as is any other type (so the first "image/*" is redundant). Some -browsers routinely send wildcards in addition to explicit types they -can handle. For example: -

-  Accept: text/html, text/plain, image/gif, image/jpeg, */*
-

- -The intention of this is to indicate that the explicitly -listed types are preferred, but if a different representation is -available, that is ok too. However under the basic algorithm, as given -above, the */* wildcard has exactly equal preference to all the other -types, so they are not being preferred. The browser should really have -sent a request with a lower quality (preference) value for *.*, such -as: -

-  Accept: text/html, text/plain, image/gif, image/jpeg, */*; q=0.01
-

- -The explicit types have no quality factor, so they default to a -preference of 1.0 (the highest). The wildcard */* is given -a low preference of 0.01, so other types will only be returned if -no variant matches an explicitly listed type. -

- -If the Accept: header contains no q factors at all, Apache sets -the q value of "*/*", if present, to 0.01 to emulate the desired -behaviour. It also sets the q value of wildcards of the format -"type/*" to 0.02 (so these are preferred over matches against -"*/*". If any media type on the Accept: header contains a q factor, -these special values are not applied, so requests from browsers -which send the correct information to start with work as expected. - -

Variants with no Language

- -If some of the variants for a particular resource have a language -attribute, and some do not, those variants with no language -are given a very low language quality factor of 0.001.

- -The reason for setting this language quality factor for -variant with no language to a very low value is to allow -for a default variant which can be supplied if none of the -other variants match the browser's language preferences. - -For example, consider the situation with three variants: - -

foo.en.html, language en -
foo.fr.html, language en -
foo.html, no language -

- -The meaning of a variant with no language is that it is -always acceptable to the browser. If the request Accept-Language -header includes either en or fr (or both) one of foo.en.html -or foo.fr.html will be returned. If the browser does not list -either en or fr as acceptable, foo.html will be returned instead. - -

Note on Caching

- -When a cache stores a document, it associates it with the request URL. -The next time that URL is requested, the cache can use the stored -document, provided it is still within date. But if the resource is -subject to content negotiation at the server, this would result in -only the first requested variant being cached, and subsequent cache -hits could return the wrong response. To prevent this, -Apache normally marks all responses that are returned after content negotiation -as non-cacheable by HTTP/1.0 clients. Apache also supports the HTTP/1.1 -protocol features to allow cacheing of negotiated responses.

- -For requests which come from a HTTP/1.0 compliant client (either a -browser or a cache), the directive CacheNegotiatedDocs can be -used to allow caching of responses which were subject to negotiation. -This directive can be given in the server config or virtual host, and -takes no arguments. It has no effect on requests from HTTP/1.1 -clients. - - - - diff --git a/docs/manual/custom-error.html.en b/docs/manual/custom-error.html.en deleted file mode 100644 index 434d98a4b3a..00000000000 --- a/docs/manual/custom-error.html.en +++ /dev/null @@ -1,141 +0,0 @@ - - -Custom error responses - - - - -

Custom error responses

- -

Purpose - -

Additional functionality. Allows webmasters to configure the response of - Apache to some error or problem. - -

Customizable responses can be defined to be activated in the - event of a server detected error or problem. - -

e.g. if a script crashes and produces a "500 Server Error" - response, then this response can be replaced with either some - friendlier text or by a redirection to another URL (local or - external). - -

- -

Old behavior - -

NCSA httpd 1.3 would return some boring old error/problem message - which would often be meaningless to the user, and would provide no - means of logging the symptoms which caused it.
- -

- -

New behavior - -

The server can be asked to; -

Display some other text, instead of the NCSA hard coded messages, or -
redirect to a local URL, or -
redirect to an external URL. -

- -

Redirecting to another URL can be useful, but only if some information - can be passed which can then be used to explain and/or log the error/problem - more clearly. - -

To achieve this, Apache will define new CGI-like environment - variables, e.g. - -

-REDIRECT_HTTP_ACCEPT=*/*, image/gif, image/x-xbitmap, image/jpeg -REDIRECT_HTTP_USER_AGENT=Mozilla/1.1b2 (X11; I; HP-UX A.09.05 9000/712) -REDIRECT_PATH=.:/bin:/usr/local/bin:/etc -REDIRECT_QUERY_STRING= -REDIRECT_REMOTE_ADDR=121.345.78.123 -REDIRECT_REMOTE_HOST=ooh.ahhh.com -REDIRECT_SERVER_NAME=crash.bang.edu -REDIRECT_SERVER_PORT=80 -REDIRECT_SERVER_SOFTWARE=Apache/0.8.15 -REDIRECT_URL=/cgi-bin/buggy.pl -

- -

note the REDIRECT_ prefix. - -

At least REDIRECT_URL and REDIRECT_QUERY_STRING will - be passed to the new URL (assuming it's a cgi-script or a cgi-include). The - other variables will exist only if they existed prior to the error/problem.

- -

Configuration - -

Use of "ErrorDocument" is enabled for .htaccess files when the - "FileInfo" override is allowed. - -

Here are some examples... - -

-ErrorDocument 500 /cgi-bin/crash-recover -ErrorDocument 500 "Sorry, our script crashed. Oh dear -ErrorDocument 500 http://xxx/ -ErrorDocument 404 /Lame_excuses/not_found.html -ErrorDocument 401 /Subscription/how_to_subscribe.html -

- -

The syntax is, - -

ErrorDocument -<3-digit-code> action - -

where the action can be, - -

Text to be displayed. Prefix the text with a quote ("). Whatever - follows the quote is displayed. Note: the (") prefix isn't - displayed. - -
An external URL to redirect to. - -
A local URL to redirect to. - -

- -

Custom error responses and redirects

- -

Purpose - -: Apache's behavior to redirected URLs has been modified so that additional - environment variables are available to a script/server-include.
- -
Old behavior - -: Standard CGI vars were made available to a script which has been - redirected to. No indication of where the redirection came from was provided. - -
- -
New behavior -: - -A new batch of environment variables will be initialized for use by a -script which has been redirected to. Each new variable will have the -prefix REDIRECT_. REDIRECT_ environment -variables are created from the CGI environment variables which existed -prior to the redirect, they are renamed with a REDIRECT_ -prefix, i.e. HTTP_USER_AGENT becomes -REDIRECT_HTTP_USER_AGENT. In addition to these new -variables, Apache will define REDIRECT_URL and -REDIRECT_STATUS to help the script trace its origin. -Both the original URL and the URL being redirected to can be logged in -the access log. - -

- - - - diff --git a/docs/manual/developer/API.html b/docs/manual/developer/API.html deleted file mode 100644 index 7bb98773ed2..00000000000 --- a/docs/manual/developer/API.html +++ /dev/null @@ -1,986 +0,0 @@ - -Apache API notes - - - -

Apache API notes

- -These are some notes on the Apache API and the data structures you -have to deal with, etc. They are not yet nearly complete, but -hopefully, they will help you get your bearings. Keep in mind that -the API is still subject to change as we gain experience with it. -(See the TODO file for what might be coming). However, -it will be easy to adapt modules to any changes that are made. -(We have more modules to adapt than you do). -

- -A few notes on general pedagogical style here. In the interest of -conciseness, all structure declarations here are incomplete --- the -real ones have more slots that I'm not telling you about. For the -most part, these are reserved to one component of the server core or -another, and should be altered by modules with caution. However, in -some cases, they really are things I just haven't gotten around to -yet. Welcome to the bleeding edge.

- -Finally, here's an outline, to give you some bare idea of what's -coming up, and in what order: - -

Basic concepts. - -
Handlers, Modules, and Requests -
A brief tour of a module -
-
How handlers work - -
A brief tour of the request_rec -
Where request_rec structures come from -
Handling requests, declining, and returning error codes -
Special considerations for response handlers -
Special considerations for authentication handlers -
Special considerations for logging handlers -
-
Resource allocation and resource pools -
Configuration, commands and the like - -
Per-directory configuration structures -
Command handling -
Side notes --- per-server configuration, virtual servers, etc. -
-

- -

Basic concepts.

- -We begin with an overview of the basic concepts behind the -API, and how they are manifested in the code. - -

Handlers, Modules, and Requests

- -Apache breaks down request handling into a series of steps, more or -less the same way the Netscape server API does (although this API has -a few more stages than NetSite does, as hooks for stuff I thought -might be useful in the future). These are: - -

URI -> Filename translation -
Auth ID checking [is the user who they say they are?] -
Auth access checking [is the user authorized here?] -
Access checking other than auth -
Determining MIME type of the object requested -
`Fixups' --- there aren't any of these yet, but the phase is - intended as a hook for possible extensions like - SetEnv, which don't really fit well elsewhere. -
Actually sending a response back to the client. -
Logging the request -

- -These phases are handled by looking at each of a succession of -modules, looking to see if each of them has a handler for the -phase, and attempting invoking it if so. The handler can typically do -one of three things: - -

Handle the request, and indicate that it has done so - by returning the magic constant OK. -
Decline to handle the request, by returning the magic - integer constant DECLINED. In this case, the - server behaves in all respects as if the handler simply hadn't - been there. -
Signal an error, by returning one of the HTTP error codes. - This terminates normal handling of the request, although an - ErrorDocument may be invoked to try to mop up, and it will be - logged in any case. -

- -Most phases are terminated by the first module that handles them; -however, for logging, `fixups', and non-access authentication -checking, all handlers always run (barring an error). Also, the -response phase is unique in that modules may declare multiple handlers -for it, via a dispatch table keyed on the MIME type of the requested -object. Modules may declare a response-phase handler which can handle -any request, by giving it the key */* (i.e., a -wildcard MIME type specification). However, wildcard handlers are -only invoked if the server has already tried and failed to find a more -specific response handler for the MIME type of the requested object -(either none existed, or they all declined).

- -The handlers themselves are functions of one argument (a -request_rec structure. vide infra), which returns an -integer, as above.

- -

A brief tour of a module

- -At this point, we need to explain the structure of a module. Our -candidate will be one of the messier ones, the CGI module --- this -handles both CGI scripts and the ScriptAlias config file -command. It's actually a great deal more complicated than most -modules, but if we're going to have only one example, it might as well -be the one with its fingers in every place.

- -Let's begin with handlers. In order to handle the CGI scripts, the -module declares a response handler for them. Because of -ScriptAlias, it also has handlers for the name -translation phase (to recognise ScriptAliased URIs), the -type-checking phase (any ScriptAliased request is typed -as a CGI script).

- -The module needs to maintain some per (virtual) -server information, namely, the ScriptAliases in effect; -the module structure therefore contains pointers to a functions which -builds these structures, and to another which combines two of them (in -case the main server and a virtual server both have -ScriptAliases declared).

- -Finally, this module contains code to handle the -ScriptAlias command itself. This particular module only -declares one command, but there could be more, so modules have -command tables which declare their commands, and describe -where they are permitted, and how they are to be invoked.

- -A final note on the declared types of the arguments of some of these -commands: a pool is a pointer to a resource pool -structure; these are used by the server to keep track of the memory -which has been allocated, files opened, etc., either to service a -particular request, or to handle the process of configuring itself. -That way, when the request is over (or, for the configuration pool, -when the server is restarting), the memory can be freed, and the files -closed, en masse, without anyone having to write explicit code to -track them all down and dispose of them. Also, a -cmd_parms structure contains various information about -the config file being read, and other status information, which is -sometimes of use to the function which processes a config-file command -(such as ScriptAlias). - -With no further ado, the module itself: - -

-/* Declarations of handlers. */
-
-int translate_scriptalias (request_rec *);
-int type_scriptalias (request_rec *);
-int cgi_handler (request_rec *);
-
-/* Subsidiary dispatch table for response-phase handlers, by MIME type */
-
-handler_rec cgi_handlers[] = {
-{ "application/x-httpd-cgi", cgi_handler },
-{ NULL }
-};
-
-/* Declarations of routines to manipulate the module's configuration
- * info.  Note that these are returned, and passed in, as void *'s;
- * the server core keeps track of them, but it doesn't, and can't,
- * know their internal structure.
- */
-
-void *make_cgi_server_config (pool *);
-void *merge_cgi_server_config (pool *, void *, void *);
-
-/* Declarations of routines to handle config-file commands */
-
-extern char *script_alias(cmd_parms *, void *per_dir_config, char *fake,
-                          char *real);
-
-command_rec cgi_cmds[] = {
-{ "ScriptAlias", script_alias, NULL, RSRC_CONF, TAKE2,
-    "a fakename and a realname"},
-{ NULL }
-};
-
-module cgi_module = {
-   STANDARD_MODULE_STUFF,
-   NULL,                     /* initializer */
-   NULL,                     /* dir config creator */
-   NULL,                     /* dir merger --- default is to override */
-   make_cgi_server_config,   /* server config */
-   merge_cgi_server_config,  /* merge server config */
-   cgi_cmds,                 /* command table */
-   cgi_handlers,             /* handlers */
-   translate_scriptalias,    /* filename translation */
-   NULL,                     /* check_user_id */
-   NULL,                     /* check auth */
-   NULL,                     /* check access */
-   type_scriptalias,         /* type_checker */
-   NULL,                     /* fixups */
-   NULL                      /* logger */
-};
-

- -

How handlers work

- -The sole argument to handlers is a request_rec structure. -This structure describes a particular request which has been made to -the server, on behalf of a client. In most cases, each connection to -the client generates only one request_rec structure.

- -

A brief tour of the `request_rec`

- -The request_rec contains pointers to a resource pool -which will be cleared when the server is finished handling the -request; to structures containing per-server and per-connection -information, and most importantly, information on the request itself.

- -The most important such information is a small set of character -strings describing attributes of the object being requested, including -its URI, filename, content-type and content-encoding (these being filled -in by the translation and type-check handlers which handle the -request, respectively).

- -Other commonly used data items are tables giving the MIME headers on -the client's original request, MIME headers to be sent back with the -response (which modules can add to at will), and environment variables -for any subprocesses which are spawned off in the course of servicing -the request. These tables are manipulated using the -table_get and table_set routines.

- -Finally, there are pointers to two data structures which, in turn, -point to per-module configuration structures. Specifically, these -hold pointers to the data structures which the module has built to -describe the way it has been configured to operate in a given -directory (via .htaccess files or -<Directory> sections), for private data it has -built in the course of servicing the request (so modules' handlers for -one phase can pass `notes' to their handlers for other phases). There -is another such configuration vector in the server_rec -data structure pointed to by the request_rec, which -contains per (virtual) server configuration data.

- -Here is an abridged declaration, giving the fields most commonly used:

- -

-struct request_rec {
-
-  pool *pool;
-  conn_rec *connection;
-  server_rec *server;
-
-  /* What object is being requested */
-  
-  char *uri;
-  char *filename;
-  char *path_info;
-  char *args;           /* QUERY_ARGS, if any */
-  struct stat finfo;    /* Set by server core;
-                         * st_mode set to zero if no such file */
-  
-  char *content_type;
-  char *content_encoding;
-  
-  /* MIME header environments, in and out.  Also, an array containing
-   * environment variables to be passed to subprocesses, so people can
-   * write modules to add to that environment.
-   *
-   * The difference between headers_out and err_headers_out is that
-   * the latter are printed even on error, and persist across internal
-   * redirects (so the headers printed for ErrorDocument handlers will
-   * have them).
-   */
-  
-  table *headers_in;
-  table *headers_out;
-  table *err_headers_out;
-  table *subprocess_env;
-
-  /* Info about the request itself... */
-  
-  int header_only;     /* HEAD request, as opposed to GET */
-  char *protocol;      /* Protocol, as given to us, or HTTP/0.9 */
-  char *method;        /* GET, HEAD, POST, etc. */
-  int method_number;   /* M_GET, M_POST, etc. */
-
-  /* Info for logging */
-
-  char *the_request;
-  int bytes_sent;
-
-  /* A flag which modules can set, to indicate that the data being
-   * returned is volatile, and clients should be told not to cache it.
-   */
-
-  int no_cache;
-
-  /* Various other config info which may change with .htaccess files
-   * These are config vectors, with one void* pointer for each module
-   * (the thing pointed to being the module's business).
-   */
-  
-  void *per_dir_config;   /* Options set in config files, etc. */
-  void *request_config;   /* Notes on *this* request */
-  
-};
-
-

- -

Where request_rec structures come from

- -Most request_rec structures are built by reading an HTTP -request from a client, and filling in the fields. However, there are -a few exceptions: - -

If the request is to an imagemap, a type map (i.e., a - *.var file), or a CGI script which returned a - local `Location:', then the resource which the user requested - is going to be ultimately located by some URI other than what - the client originally supplied. In this case, the server does - an internal redirect, constructing a new - request_rec for the new URI, and processing it - almost exactly as if the client had requested the new URI - directly.
- -
If some handler signaled an error, and an - ErrorDocument is in scope, the same internal - redirect machinery comes into play.
- -
Finally, a handler occasionally needs to investigate `what - would happen if' some other request were run. For instance, - the directory indexing module needs to know what MIME type - would be assigned to a request for each directory entry, in - order to figure out what icon to use.
- - Such handlers can construct a sub-request, using the - functions sub_req_lookup_file and - sub_req_lookup_uri; this constructs a new - request_rec structure and processes it as you - would expect, up to but not including the point of actually - sending a response. (These functions skip over the access - checks if the sub-request is for a file in the same directory - as the original request).
- - (Server-side includes work by building sub-requests and then - actually invoking the response handler for them, via the - function run_sub_request). -

- -

Handling requests, declining, and returning error codes

- -As discussed above, each handler, when invoked to handle a particular -request_rec, has to return an int to -indicate what happened. That can either be - -

OK --- the request was handled successfully. This may or may - not terminate the phase. -
DECLINED --- no erroneous condition exists, but the module - declines to handle the phase; the server tries to find another. -
an HTTP error code, which aborts handling of the request. -

- -Note that if the error code returned is REDIRECT, then -the module should put a Location in the request's -headers_out, to indicate where the client should be -redirected to.

- -

Special considerations for response handlers

- -Handlers for most phases do their work by simply setting a few fields -in the request_rec structure (or, in the case of access -checkers, simply by returning the correct error code). However, -response handlers have to actually send a request back to the client.

- -They should begin by sending an HTTP response header, using the -function send_http_header. (You don't have to do -anything special to skip sending the header for HTTP/0.9 requests; the -function figures out on its own that it shouldn't do anything). If -the request is marked header_only, that's all they should -do; they should return after that, without attempting any further -output.

- -Otherwise, they should produce a request body which responds to the -client as appropriate. The primitives for this are rputc -and rprintf, for internally generated output, and -send_fd, to copy the contents of some FILE * -straight to the client.

- -At this point, you should more or less understand the following piece -of code, which is the handler which handles GET requests -which have no more specific handler; it also shows how conditional -GETs can be handled, if it's desirable to do so in a -particular response handler --- set_last_modified checks -against the If-modified-since value supplied by the -client, if any, and returns an appropriate code (which will, if -nonzero, be USE_LOCAL_COPY). No similar considerations apply for -set_content_length, but it returns an error code for -symmetry.

- -

-int default_handler (request_rec *r)
-{
-    int errstatus;
-    FILE *f;
-    
-    if (r->method_number != M_GET) return DECLINED;
-    if (r->finfo.st_mode == 0) return NOT_FOUND;
-
-    if ((errstatus = set_content_length (r, r->finfo.st_size))
-        || (errstatus = set_last_modified (r, r->finfo.st_mtime)))
-        return errstatus;
-    
-    f = fopen (r->filename, "r");
-
-    if (f == NULL) {
-        log_reason("file permissions deny server access",
-                   r->filename, r);
-        return FORBIDDEN;
-    }
-      
-    register_timeout ("send", r);
-    send_http_header (r);
-
-    if (!r->header_only) send_fd (f, r);
-    pfclose (r->pool, f);
-    return OK;
-}
-

- -Finally, if all of this is too much of a challenge, there are a few -ways out of it. First off, as shown above, a response handler which -has not yet produced any output can simply return an error code, in -which case the server will automatically produce an error response. -Secondly, it can punt to some other handler by invoking -internal_redirect, which is how the internal redirection -machinery discussed above is invoked. A response handler which has -internally redirected should always return OK.

- -(Invoking internal_redirect from handlers which are -not response handlers will lead to serious confusion). - -

Special considerations for authentication handlers

- -Stuff that should be discussed here in detail: - -

Authentication-phase handlers not invoked unless auth is - configured for the directory. -
Common auth configuration stored in the core per-dir - configuration; it has accessors auth_type, - auth_name, and requires. -
Common routines, to handle the protocol end of things, at least - for HTTP basic authentication (get_basic_auth_pw, - which sets the connection->user structure field - automatically, and note_basic_auth_failure, which - arranges for the proper WWW-Authenticate: header - to be sent back). -

- -

Special considerations for logging handlers

- -When a request has internally redirected, there is the question of -what to log. Apache handles this by bundling the entire chain of -redirects into a list of request_rec structures which are -threaded through the r->prev and r->next -pointers. The request_rec which is passed to the logging -handlers in such cases is the one which was originally built for the -initial request from the client; note that the bytes_sent field will -only be correct in the last request in the chain (the one for which a -response was actually sent). - -

Resource allocation and resource pools

- -One of the problems of writing and designing a server-pool server is -that of preventing leakage, that is, allocating resources (memory, -open files, etc.), without subsequently releasing them. The resource -pool machinery is designed to make it easy to prevent this from -happening, by allowing resource to be allocated in such a way that -they are automatically released when the server is done with -them.

- -The way this works is as follows: the memory which is allocated, file -opened, etc., to deal with a particular request are tied to a -resource pool which is allocated for the request. The pool -is a data structure which itself tracks the resources in question.

- -When the request has been processed, the pool is cleared. At -that point, all the memory associated with it is released for reuse, -all files associated with it are closed, and any other clean-up -functions which are associated with the pool are run. When this is -over, we can be confident that all the resource tied to the pool have -been released, and that none of them have leaked.

- -Server restarts, and allocation of memory and resources for per-server -configuration, are handled in a similar way. There is a -configuration pool, which keeps track of resources which were -allocated while reading the server configuration files, and handling -the commands therein (for instance, the memory that was allocated for -per-server module configuration, log files and other files that were -opened, and so forth). When the server restarts, and has to reread -the configuration files, the configuration pool is cleared, and so the -memory and file descriptors which were taken up by reading them the -last time are made available for reuse.

- -It should be noted that use of the pool machinery isn't generally -obligatory, except for situations like logging handlers, where you -really need to register cleanups to make sure that the log file gets -closed when the server restarts (this is most easily done by using the -function pfopen, which also -arranges for the underlying file descriptor to be closed before any -child processes, such as for CGI scripts, are execed), or -in case you are using the timeout machinery (which isn't yet even -documented here). However, there are two benefits to using it: -resources allocated to a pool never leak (even if you allocate a -scratch string, and just forget about it); also, for memory -allocation, palloc is generally faster than -malloc.

- -We begin here by describing how memory is allocated to pools, and then -discuss how other resources are tracked by the resource pool -machinery. - -

Allocation of memory in pools

- -Memory is allocated to pools by calling the function -palloc, which takes two arguments, one being a pointer to -a resource pool structure, and the other being the amount of memory to -allocate (in chars). Within handlers for handling -requests, the most common way of getting a resource pool structure is -by looking at the pool slot of the relevant -request_rec; hence the repeated appearance of the -following idiom in module code: - -

-int my_handler(request_rec *r)
-{
-    struct my_structure *foo;
-    ...
-
-    foo = (foo *)palloc (r->pool, sizeof(my_structure));
-}
-

- -Note that there is no pfree --- -palloced memory is freed only when the associated -resource pool is cleared. This means that palloc does not -have to do as much accounting as malloc(); all it does in -the typical case is to round up the size, bump a pointer, and do a -range check.

- -(It also raises the possibility that heavy use of palloc -could cause a server process to grow excessively large. There are -two ways to deal with this, which are dealt with below; briefly, you -can use malloc, and try to be sure that all of the memory -gets explicitly freed, or you can allocate a sub-pool of -the main pool, allocate your memory in the sub-pool, and clear it out -periodically. The latter technique is discussed in the section on -sub-pools below, and is used in the directory-indexing code, in order -to avoid excessive storage allocation when listing directories with -thousands of files). - -

Allocating initialized memory

- -There are functions which allocate initialized memory, and are -frequently useful. The function pcalloc has the same -interface as palloc, but clears out the memory it -allocates before it returns it. The function pstrdup -takes a resource pool and a char * as arguments, and -allocates memory for a copy of the string the pointer points to, -returning a pointer to the copy. Finally pstrcat is a -varargs-style function, which takes a pointer to a resource pool, and -at least two char * arguments, the last of which must be -NULL. It allocates enough memory to fit copies of each -of the strings, as a unit; for instance: - -

-     pstrcat (r->pool, "foo", "/", "bar", NULL);
-

- -returns a pointer to 8 bytes worth of memory, initialized to -"foo/bar". - -

Tracking open files, etc.

- -As indicated above, resource pools are also used to track other sorts -of resources besides memory. The most common are open files. The -routine which is typically used for this is pfopen, which -takes a resource pool and two strings as arguments; the strings are -the same as the typical arguments to fopen, e.g., - -

-     ...
-     FILE *f = pfopen (r->pool, r->filename, "r");
-
-     if (f == NULL) { ... } else { ... }
-

- -There is also a popenf routine, which parallels the -lower-level open system call. Both of these routines -arrange for the file to be closed when the resource pool in question -is cleared.

- -Unlike the case for memory, there are functions to close -files allocated with pfopen, and popenf, -namely pfclose and pclosef. (This is -because, on many systems, the number of files which a single process -can have open is quite limited). It is important to use these -functions to close files allocated with pfopen and -popenf, since to do otherwise could cause fatal errors on -systems such as Linux, which react badly if the same -FILE* is closed more than once.

- -(Using the close functions is not mandatory, since the -file will eventually be closed regardless, but you should consider it -in cases where your module is opening, or could open, a lot of files). - -

Other sorts of resources --- cleanup functions

- -More text goes here. Describe the the cleanup primitives in terms of -which the file stuff is implemented; also, spawn_process. - -

Fine control --- creating and dealing with sub-pools, with a note -on sub-requests

- -On rare occasions, too-free use of palloc() and the -associated primitives may result in undesirably profligate resource -allocation. You can deal with such a case by creating a -sub-pool, allocating within the sub-pool rather than the main -pool, and clearing or destroying the sub-pool, which releases the -resources which were associated with it. (This really is a -rare situation; the only case in which it comes up in the standard -module set is in case of listing directories, and then only with -very large directories. Unnecessary use of the primitives -discussed here can hair up your code quite a bit, with very little -gain).

- -The primitive for creating a sub-pool is make_sub_pool, -which takes another pool (the parent pool) as an argument. When the -main pool is cleared, the sub-pool will be destroyed. The sub-pool -may also be cleared or destroyed at any time, by calling the functions -clear_pool and destroy_pool, respectively. -(The difference is that clear_pool frees resources -associated with the pool, while destroy_pool also -deallocates the pool itself. In the former case, you can allocate new -resources within the pool, and clear it again, and so forth; in the -latter case, it is simply gone).

- -One final note --- sub-requests have their own resource pools, which -are sub-pools of the resource pool for the main request. The polite -way to reclaim the resources associated with a sub request which you -have allocated (using the sub_req_lookup_... functions) -is destroy_sub_request, which frees the resource pool. -Before calling this function, be sure to copy anything that you care -about which might be allocated in the sub-request's resource pool into -someplace a little less volatile (for instance, the filename in its -request_rec structure).

- -(Again, under most circumstances, you shouldn't feel obliged to call -this function; only 2K of memory or so are allocated for a typical sub -request, and it will be freed anyway when the main request pool is -cleared. It is only when you are allocating many, many sub-requests -for a single main request that you should seriously consider the -destroy... functions). - -

Configuration, commands and the like

- -One of the design goals for this server was to maintain external -compatibility with the NCSA 1.3 server --- that is, to read the same -configuration files, to process all the directives therein correctly, -and in general to be a drop-in replacement for NCSA. On the other -hand, another design goal was to move as much of the server's -functionality into modules which have as little as possible to do with -the monolithic server core. The only way to reconcile these goals is -to move the handling of most commands from the central server into the -modules.

- -However, just giving the modules command tables is not enough to -divorce them completely from the server core. The server has to -remember the commands in order to act on them later. That involves -maintaining data which is private to the modules, and which can be -either per-server, or per-directory. Most things are per-directory, -including in particular access control and authorization information, -but also information on how to determine file types from suffixes, -which can be modified by AddType and -DefaultType directives, and so forth. In general, the -governing philosophy is that anything which can be made -configurable by directory should be; per-server information is -generally used in the standard set of modules for information like -Aliases and Redirects which come into play -before the request is tied to a particular place in the underlying -file system.

- -Another requirement for emulating the NCSA server is being able to -handle the per-directory configuration files, generally called -.htaccess files, though even in the NCSA server they can -contain directives which have nothing at all to do with access -control. Accordingly, after URI -> filename translation, but before -performing any other phase, the server walks down the directory -hierarchy of the underlying filesystem, following the translated -pathname, to read any .htaccess files which might be -present. The information which is read in then has to be -merged with the applicable information from the server's own -config files (either from the <Directory> sections -in access.conf, or from defaults in -srm.conf, which actually behaves for most purposes almost -exactly like <Directory />).

- -Finally, after having served a request which involved reading -.htaccess files, we need to discard the storage allocated -for handling them. That is solved the same way it is solved wherever -else similar problems come up, by tying those structures to the -per-transaction resource pool.

- -

Per-directory configuration structures

- -Let's look out how all of this plays out in mod_mime.c, -which defines the file typing handler which emulates the NCSA server's -behavior of determining file types from suffixes. What we'll be -looking at, here, is the code which implements the -AddType and AddEncoding commands. These -commands can appear in .htaccess files, so they must be -handled in the module's private per-directory data, which in fact, -consists of two separate tables for MIME types and -encoding information, and is declared as follows: - -

-typedef struct {
-    table *forced_types;      /* Additional AddTyped stuff */
-    table *encoding_types;    /* Added with AddEncoding... */
-} mime_dir_config;
-

- -When the server is reading a configuration file, or -<Directory> section, which includes one of the MIME -module's commands, it needs to create a mime_dir_config -structure, so those commands have something to act on. It does this -by invoking the function it finds in the module's `create per-dir -config slot', with two arguments: the name of the directory to which -this configuration information applies (or NULL for -srm.conf), and a pointer to a resource pool in which the -allocation should happen.

- -(If we are reading a .htaccess file, that resource pool -is the per-request resource pool for the request; otherwise it is a -resource pool which is used for configuration data, and cleared on -restarts. Either way, it is important for the structure being created -to vanish when the pool is cleared, by registering a cleanup on the -pool if necessary).

- -For the MIME module, the per-dir config creation function just -pallocs the structure above, and a creates a couple of -tables to fill it. That looks like this: - -

-void *create_mime_dir_config (pool *p, char *dummy)
-{
-    mime_dir_config *new =
-      (mime_dir_config *) palloc (p, sizeof(mime_dir_config));
-
-    new->forced_types = make_table (p, 4);
-    new->encoding_types = make_table (p, 4);
-    
-    return new;
-}
-

- -Now, suppose we've just read in a .htaccess file. We -already have the per-directory configuration structure for the next -directory up in the hierarchy. If the .htaccess file we -just read in didn't have any AddType or -AddEncoding commands, its per-directory config structure -for the MIME module is still valid, and we can just use it. -Otherwise, we need to merge the two structures somehow.

- -To do that, the server invokes the module's per-directory config merge -function, if one is present. That function takes three arguments: -the two structures being merged, and a resource pool in which to -allocate the result. For the MIME module, all that needs to be done -is overlay the tables from the new per-directory config structure with -those from the parent: - -

-void *merge_mime_dir_configs (pool *p, void *parent_dirv, void *subdirv)
-{
-    mime_dir_config *parent_dir = (mime_dir_config *)parent_dirv;
-    mime_dir_config *subdir = (mime_dir_config *)subdirv;
-    mime_dir_config *new =
-      (mime_dir_config *)palloc (p, sizeof(mime_dir_config));
-
-    new->forced_types = overlay_tables (p, subdir->forced_types,
-                                        parent_dir->forced_types);
-    new->encoding_types = overlay_tables (p, subdir->encoding_types,
-                                          parent_dir->encoding_types);
-
-    return new;
-}
-

- -As a note --- if there is no per-directory merge function present, the -server will just use the subdirectory's configuration info, and ignore -the parent's. For some modules, that works just fine (e.g., for the -includes module, whose per-directory configuration information -consists solely of the state of the XBITHACK), and for -those modules, you can just not declare one, and leave the -corresponding structure slot in the module itself NULL.

- -

Command handling

- -Now that we have these structures, we need to be able to figure out -how to fill them. That involves processing the actual -AddType and AddEncoding commands. To find -commands, the server looks in the module's command table. -That table contains information on how many arguments the commands -take, and in what formats, where it is permitted, and so forth. That -information is sufficient to allow the server to invoke most -command-handling functions with pre-parsed arguments. Without further -ado, let's look at the AddType command handler, which -looks like this (the AddEncoding command looks basically -the same, and won't be shown here): - -

-char *add_type(cmd_parms *cmd, mime_dir_config *m, char *ct, char *ext)
-{
-    if (*ext == '.') ++ext;
-    table_set (m->forced_types, ext, ct);
-    return NULL;
-}
-

- -This command handler is unusually simple. As you can see, it takes -four arguments, two of which are pre-parsed arguments, the third being -the per-directory configuration structure for the module in question, -and the fourth being a pointer to a cmd_parms structure. -That structure contains a bunch of arguments which are frequently of -use to some, but not all, commands, including a resource pool (from -which memory can be allocated, and to which cleanups should be tied), -and the (virtual) server being configured, from which the module's -per-server configuration data can be obtained if required.

- -Another way in which this particular command handler is unusually -simple is that there are no error conditions which it can encounter. -If there were, it could return an error message instead of -NULL; this causes an error to be printed out on the -server's stderr, followed by a quick exit, if it is in -the main config files; for a .htaccess file, the syntax -error is logged in the server error log (along with an indication of -where it came from), and the request is bounced with a server error -response (HTTP error status, code 500).

- -The MIME module's command table has entries for these commands, which -look like this: - -

-command_rec mime_cmds[] = {
-{ "AddType", add_type, NULL, OR_FILEINFO, TAKE2, 
-    "a mime type followed by a file extension" },
-{ "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2, 
-    "an encoding (e.g., gzip), followed by a file extension" },
-{ NULL }
-};
-

- -The entries in these tables are: - -

The name of the command -
The function which handles it -
a (void *) pointer, which is passed in the - cmd_parms structure to the command handler --- - this is useful in case many similar commands are handled by the - same function. -
A bit mask indicating where the command may appear. There are - mask bits corresponding to each AllowOverride - option, and an additional mask bit, RSRC_CONF, - indicating that the command may appear in the server's own - config files, but not in any .htaccess - file. -
A flag indicating how many arguments the command handler wants - pre-parsed, and how they should be passed in. - TAKE2 indicates two pre-parsed arguments. Other - options are TAKE1, which indicates one pre-parsed - argument, FLAG, which indicates that the argument - should be On or Off, and is passed in - as a boolean flag, RAW_ARGS, which causes the - server to give the command the raw, unparsed arguments - (everything but the command name itself). There is also - ITERATE, which means that the handler looks the - same as TAKE1, but that if multiple arguments are - present, it should be called multiple times, and finally - ITERATE2, which indicates that the command handler - looks like a TAKE2, but if more arguments are - present, then it should be called multiple times, holding the - first argument constant. -
Finally, we have a string which describes the arguments that - should be present. If the arguments in the actual config file - are not as required, this string will be used to help give a - more specific error message. (You can safely leave this - NULL). -

- -Finally, having set this all up, we have to use it. This is -ultimately done in the module's handlers, specifically for its -file-typing handler, which looks more or less like this; note that the -per-directory configuration structure is extracted from the -request_rec's per-directory configuration vector by using -the get_module_config function. - -

-int find_ct(request_rec *r)
-{
-    int i;
-    char *fn = pstrdup (r->pool, r->filename);
-    mime_dir_config *conf = (mime_dir_config *)
-             get_module_config(r->per_dir_config, &mime_module);
-    char *type;
-
-    if (S_ISDIR(r->finfo.st_mode)) {
-        r->content_type = DIR_MAGIC_TYPE;
-        return OK;
-    }
-    
-    if((i=rind(fn,'.')) < 0) return DECLINED;
-    ++i;
-
-    if ((type = table_get (conf->encoding_types, &fn[i])))
-    {
-        r->content_encoding = type;
-
-        /* go back to previous extension to try to use it as a type */
-
-        fn[i-1] = '\0';
-        if((i=rind(fn,'.')) < 0) return OK;
-        ++i;
-    }
-
-    if ((type = table_get (conf->forced_types, &fn[i])))
-    {
-        r->content_type = type;
-    }
-    
-    return OK;
-}
-
-

- -

Side notes --- per-server configuration, virtual servers, etc.

- -The basic ideas behind per-server module configuration are basically -the same as those for per-directory configuration; there is a creation -function and a merge function, the latter being invoked where a -virtual server has partially overridden the base server configuration, -and a combined structure must be computed. (As with per-directory -configuration, the default if no merge function is specified, and a -module is configured in some virtual server, is that the base -configuration is simply ignored).

- -The only substantial difference is that when a command needs to -configure the per-server private module data, it needs to go to the -cmd_parms data to get at it. Here's an example, from the -alias module, which also indicates how a syntax error can be returned -(note that the per-directory configuration argument to the command -handler is declared as a dummy, since the module doesn't actually have -per-directory config data): - -

-char *add_redirect(cmd_parms *cmd, void *dummy, char *f, char *url)
-{
-    server_rec *s = cmd->server;
-    alias_server_conf *conf = (alias_server_conf *)
-            get_module_config(s->module_config,&alias_module);
-    alias_entry *new = push_array (conf->redirects);
-
-    if (!is_url (url)) return "Redirect to non-URL";
-    
-    new->fake = f; new->real = url;
-    return NULL;
-}
-

- - diff --git a/docs/manual/handler.html.en b/docs/manual/handler.html.en deleted file mode 100644 index ea0f61c7e90..00000000000 --- a/docs/manual/handler.html.en +++ /dev/null @@ -1,134 +0,0 @@ - - - -Apache's Handler Use - - - - -

Apache's Handler Use

- -

What is a Handler

- -

A "handler" is an internal Apache representation of the action to be -performed when a file is called. Generally, files have implicit -handlers, based on the file type. Normally, all files are simply -served by the server, but certain file typed are "handled" -separately. For example, you may use a type of -"application/x-httpd-cgi" to invoke CGI scripts.

- -

Apache 1.1 adds the additional ability to use handlers -explicitly. Either based on filename extensions or on location, these -handlers are unrelated to file type. This is advantageous both because -it is a more elegant solution, but it also allows for both a type -and a handler to be associated with a file.

- -

Handlers can either be built into the server or to a module, or -they can be added with the Action directive. The built-in -handlers in the standard distribution are as follows:

- -

send-as-is: - Send file with HTTP headers as is. - (mod_asis) -
cgi-script: - Treat the file as a CGI script. - (mod_cgi) -
imap-file: - Imagemap rule file. - (mod_imap) -
server-info: - Get the server's configuration information - (mod_info) -
server-parsed: - Parse for server-side includes - (mod_include) -
server-status: - Get the server's status report - (mod_status) -
type-map: - Parse as a type map file for content negotiation - (mod_negotiation) -

- -

- -

AddHandler

- -Syntax: <AddHandler handler-name extension>
-Context: server config, virtual host, directory, .htaccess
-Status: Base
-Module: mod_mime - -

AddHandler maps the filename extension extension to the -handler handler-name. For example, to activate CGI scripts -with the file extension ".cgi", you might use: -

-    AddHandler cgi-script cgi
-

- -

Once that has been put into your srm.conf or httpd.conf file, any -file ending with ".cgi" will be treated as a CGI -program.

- -

SetHandler

- -Syntax: <SetHandler handler-name>
-Context: directory, .htaccess
-Status: Base
-Module: mod_mime - -

When placed into an .htaccess file or a -<Directory> or <Location section, -this directive forces all matching files to be parsed through the -handler given by handler-name. For example, if you had a -directory you wanted to be parsed entirely as imagemap rule files, -regardless of extension, you might put the following into an -.htaccess file in that directory: -

-    SetHandler imap-file
-

Another example: if you wanted to have the server display a status -report whenever a URL of http://servername/status was -called, you might put the following into access.conf: -

-    <Location /status>
-    SetHandler server-status
-    </Location>
-

- -

Programmer's Note

- -

In order to implement the handler features, an addition has been -made to the Apache API that you may wish to -make use of. Specifically, a new record has been added to the -request_rec structure:

-    char *handler
-

If you wish to have your module engage a handler, you need only to -set r->handler to the name of the handler at any time -prior to the invoke_handler stage of the -request. Handlers are implemented as they were before, albeit using -the handler name instead of a content type. While it is not -necessary, the naming convention for handlers is to use a -dash-separated word, with no slashes, so as to not invade the media -type name-space.

- - - - - diff --git a/docs/manual/install.html.en b/docs/manual/install.html.en deleted file mode 100644 index a9b7619b9d1..00000000000 --- a/docs/manual/install.html.en +++ /dev/null @@ -1,239 +0,0 @@ - - - -Compiling and Installing Apache - - - - - -

Compiling and Installing Apache 1.2

- -If you wish to download and install an earlier version of Apache please -read Compiling and Installing Apache 1.1. - -

Downloading Apache

- -Information on the latest version of Apache can be found on the Apache -web server at http://www.apache.org/. This will -list the current release, any more recent beta-test release, together -with details of mirror web and anonymous ftp sites. - -

- -If you downloaded a binary distribution, skip to Installing Apache. Otherwise read the next section -for how to compile the server. - -

Compiling Apache

- -Compiling Apache consists of three steps: Firstly select which Apache -modules you want to include into the server. Secondly create a -configuration for your operating system. Thirdly compile the -executable. -

- -All configuration of Apache is performed in the src -directory of the Apache distribution. Change into this directory. - -

- Select modules to compile into Apache in the - Configuration file. Uncomment lines corresponding to - those optional modules you wish to include (among the Module lines - at the bottom of the file), or add new lines corresponding to - additional modules you have downloaded or written. (See API.html for preliminary docs on how to - write Apache modules). Advanced users can comment out some of the - default modules if they are sure they will not need them (be careful - though, since many of the default modules are vital for the correct - operation and security of the server). -
- - You should also read the instructions in the Configuration - file to see if you need to set any of the Rule lines. - - -
- Configure Apache for your operating system. Normally you can just - type run the Configure script as given below. However - if this fails or you have any special requirements (e.g. to include - an additional library required by an optional module) you might need - to edit one or more of the following options in the - Configuration file: - EXTRA_CFLAGS, LIBS, LFLAGS, INCLUDES. -
- - Run the Configure script: -
-
```
-    % Configure
-    Using 'Configuration' as config file
-     + configured for <whatever> platform
-     + setting C compiler to <whatever> *
-     + setting C compiler optimization-level to <whatever> *
-    %
-   
```
-
- - (*: Depending on Configuration and your system, Configure - make not print these lines. That's OK).
- - This generates a Makefile for use in stage 3. It also creates a - Makefile in the support directory, for compilation of the optional - support programs. -
- - (If you want to maintain multiple configurations, you can give a - option to Configure to tell it to read an alternative - Configuration file, such as Configure -file - Configuration.ai). -
- -
- Type make. -

- -The modules we place in the Apache distribution are the ones we have -tested and are used regularly by various members of the Apache -development group. Additional modules contributed by members or third -parties with specific needs or functions are available at <URL:http://www.apache.org/dist/contrib/modules/>. -There are instructions on that page for linking these modules into the -core Apache code. - -

Installing Apache

- -You will have a binary file called httpd in the -src directory. A binary distribution of Apache will -supply this file.

- -The next step is to install the program and configure it. Apache is -designed to be configured and run from the same set of directories -where it is compiled. If you want to run it from somewhere else, make -a directory and copy the conf, logs and -icons directories into it.

- -The next step is to edit the configuration files for the server. This -consists of setting up various directives in up to three -central configuration files. By default, these files are located in -the conf directory and are called srm.conf, -access.conf and httpd.conf. To help you get -started there are same files in the conf directory of the -distribution, called srm.conf-dist, -access.conf-dist and httpd.conf-dist. Copy -or rename these files to the names without the -dist. -Then edit each of the files. Read the comments in each file carefully. -Failure to setup these files correctly could lead to your server not -working or being insecure. You should also have an additional file in -the conf directory called mime.types. This -file usually does not need editing. - -

- -First edit httpd.conf. This sets up general attributes -about the server: the port number, the user it runs as, etc. Next -edit the srm.conf file; this sets up the root of the -document tree, special functions like server-parsed HTML or internal -imagemap parsing, etc. Finally, edit the access.conf -file to at least set the base cases of access. - -

- -In addition to these three files, the server behavior can be configured -on a directory-by-directory basis by using .htaccess -files in directories accessed by the server. - -

Starting and Stopping the Server

- -To start the server, simply run httpd. This will look for -httpd.conf in the location compiled into the code (by -default /usr/locale/etc/httpd/conf/httpd.conf). If -this file is somewhere else, you can give the real -location with the -f argument. For example: - -

-    /usr/local/etc/apache/src/httpd -f /usr/local/etc/apache/conf/httpd.conf
-

- -If all goes well this will return to the command prompt almost -immediately. This indicates that the server is now up and running. If -anything goes wrong during the initiallisation of the server you will -see an error message on the screen. - -If the server started ok, you can now use your browser to -connect to the server and read the documentation. If you are running -the browser on the same machine as the server and using the default -port of 80, a suitable URL to enter into your browser is - -

-    http://localhost/
-

- -

- -Note that when the server starts it will create a number of -child processes to handle the requests. If you started Apache -as the root user, the parent process will continue to run as root -while the children will change to the user as given in the httpd.conf -file. - -

- -If when you run httpd it complained about being unable to -"bind" to an address, then either some other process is already using -the port you have configured Apache to use, or you are running httpd -as a normal user but trying to use port below 1024 (such as the -default port 80). - -

- -If the server is not running, read the error message displayed -when you run httpd. You should also check the server -error_log for additional information (with the default configuration, -this will be located in the file error_log in the -logs directory). - -

- -If you want your server to continue running after a system reboot, you -should add a call to httpd to your system startup files -(typically rc.local or a file in an -rc.N directory). This will start Apache as root. -Before doing this ensure that your server is properly configured -for security and access restrictions. - -

- -To stop Apache send the parent process a TERM signal. The PID of this -process is written to the file httpd.pid in the -logs directory (unless configured otherwise). Do not -attempt to kill the child processes because they will be renewed by -the parent. A typical command to stop the server is: - -

-    kill -TERM `cat /usr/local/etc/apache/logs/httpd.pid`
-

- -

- -For more information about Apache command line options, configuration -and log files, see Starting Apache. For a -reference guide to all Apache directives supported by the distributed -modules, see the Apache directives. - -

Compiling Support Programs

- -In addition to the main httpd server which is compiled -and configured as above, Apache includes a number of support programs. -These are not compiled by default. The support programs are in the -support directory of the distribution. To compile -the support programs, change into this directory and type -

-    make
-

- - - - diff --git a/docs/manual/invoking.html.en b/docs/manual/invoking.html.en deleted file mode 100644 index 8134e720720..00000000000 --- a/docs/manual/invoking.html.en +++ /dev/null @@ -1,116 +0,0 @@ - - - -Starting Apache - - - - -

Starting Apache

- -

Invoking Apache

-The httpd program is usually run as a daemon which executes -continuously, handling requests. It is possible to invoke Apache by -the Internet daemon inetd each time a connection to the HTTP -service is made (use the -ServerType directive) -but this is not recommended. - -

Command line options

-The following options are recognized on the httpd command line: -

-d serverroot -: Set the initial value for the -ServerRoot variable to -serverroot. This can be overridden by the ServerRoot command in the -configuration file. The default is /usr/local/etc/httpd. - -
-f config -: Execute the commands in the file config on startup. If -config does not begin with a /, then it is taken to be a -path relative to the ServerRoot. The -default is conf/httpd.conf. - -
-X -: Run in single-process mode, for internal debugging purposes only; the -daemon does not detach from the terminal or fork any children. Do NOT -use this mode to provide ordinary web service. - -
-v -: Print the version of httpd, and then exit. - -
-h -: Give a list of directives together with expected arguments and -places where the directive is valid. (New in Apache 1.2) - -
-l -: Give a list of all modules compiled into the server. - -
-? -: Print a list of the httpd options, and then exit. -

- -

Configuration files

-The server will read three files for configuration directives. Any directive -may appear in any of these files. The the names of these files are taken -to be relative to the server root; this is set by the -ServerRoot directive, or the --d command line flag. - -Conventionally, the files are: -

conf/httpd.conf -: Contains directives that control the operation of the server daemon. -The filename may be overridden with the -f command line flag. - -
conf/srm.conf -: Contains directives that control the specification of documents that -the server can provide to clients. The filename may be overridden with -the ResourceConfig directive. - -
conf/access.conf -: Contains directives that control access to documents. -The filename may be overridden with the -AccessConfig directive. -

-However, these conventions need not be adhered to. -

-The server also reads a file containing mime document types; the filename -is set by the TypesConfig directive, -and is conf/mime.types by default. - -

Log files

security warning

-Anyone who can write to the directory where Apache is writing a -log file can almost certainly gain access to the uid that the server is -started as, which is normally root. Do NOT give people write -access to the directory the logs are stored in without being aware of -the consequences; see the security tips -document for details. -

pid file

-On daemon startup, it saves the process id of the parent httpd process to -the file logs/httpd.pid. This filename can be changed with the -PidFile directive. The process-id is for -use by the administrator in restarting and terminating the daemon; -A HUP signal causes the daemon to re-read its configuration files and -a TERM signal causes it to die gracefully. -

-If the process dies (or is killed) abnormally, then it will be necessary to -kill the children httpd processes. - -

Error log

-The server will log error messages to a log file, logs/error_log -by default. The filename can be set using the -ErrorLog directive; different error logs can -be set for different virtual hosts. - -

Transfer log

-The server will typically log each request to a transfer file, -logs/access_log by default. The filename can be set using a -TransferLog directive; different -transfer logs can be set for different virtual -hosts. - - - - diff --git a/docs/manual/platform/perf-bsd44.html b/docs/manual/platform/perf-bsd44.html deleted file mode 100644 index 495579eca3e..00000000000 --- a/docs/manual/platform/perf-bsd44.html +++ /dev/null @@ -1,217 +0,0 @@ - - -Running a High-Performance Web Server for BSD - - - - - - -

Running a High-Performance Web Server for BSD

- -Like other OS's, the listen queue is often the first limit hit. The -following are comments from "Aaron Gifford <agifford@InfoWest.COM>" -on how to fix this on BSDI 1.x, 2.x, and FreeBSD 2.0 (and earlier): - -

- -Edit the following two files: -

/usr/include/sys/socket.h - /usr/src/sys/sys/socket.h

-In each file, look for the following: -

-    /*
-     * Maximum queue length specifiable by listen.
-     */
-    #define SOMAXCONN       5
-

- -Just change the "5" to whatever appears to work. I bumped the two -machines I was having problems with up to 32 and haven't noticed the -problem since. - -

- -After the edit, recompile the kernel and recompile the Apache server -then reboot. - -

- -FreeBSD 2.1 seems to be perfectly happy, with SOMAXCONN -set to 32 already. - -

- - -Addendum for very heavily loaded BSD servers
- -from Chuck Murcko <chuck@telebase.com> - -

- -If you're running a really busy BSD Apache server, the following are useful -things to do if the system is acting sluggish:

- -

Run vmstat to check memory usage, page/swap rates, etc. - -
Run netstat -m to check mbuf usage - -
Run fstat to check file descriptor usage - -

- -These utilities give you an idea what you'll need to tune in your kernel, -and whether it'll help to buy more RAM. - -Here are some BSD kernel config parameters (actually BSDI, but pertinent to -FreeBSD and other 4.4-lite derivatives) from a system getting heavy usage. -The tools mentioned above were used, and the system memory was increased to -48 MB before these tuneups. Other system parameters remained unchanged. - -

- -

-maxusers        256
-

- -Maxusers drives a lot of other kernel parameters: - -

Maximum # of processes - -
Maximum # of processes per user - -
System wide open files limit - -
Per-process open files limit - -
Maximum # of mbuf clusters - -
Proc/pgrp hash table size - -

- -The actual formulae for these derived parameters are in -/usr/src/sys/conf/param.c. -These calculated parameters can also be overridden (in part) by specifying -your own values in the kernel configuration file: - -

-# Network options. NMBCLUSTERS defines the number of mbuf clusters and
-# defaults to 256. This machine is a server that handles lots of traffic,
-# so we crank that value.
-options         SOMAXCONN=256           # max pending connects
-options         NMBCLUSTERS=4096        # mbuf clusters at 4096
-
-#
-# Misc. options
-#
-options         CHILD_MAX=512           # maximum number of child processes
-options         OPEN_MAX=512            # maximum fds (breaks RPC svcs)
-

- -SOMAXCONN is not derived from maxusers, so you'll always need to increase -that yourself. We used a value guaranteed to be larger than Apache's -default for the listen() of 128, currently. - -

- -In many cases, NMBCLUSTERS must be set much larger than would appear -necessary at first glance. The reason for this is that if the browser -disconnects in mid-transfer, the socket fd associated with that particular -connection ends up in the TIME_WAIT state for several minutes, during -which time its mbufs are not yet freed. Another reason is that, on server -timeouts, some connections end up in FIN_WAIT_2 state forever, because -this state doesn't time out on the server, and the browser never sent -a final FIN. For more details see the -FIN_WAIT_2 page. - -

- -Some more info on mbuf clusters (from sys/mbuf.h): -

-/*
- * Mbufs are of a single size, MSIZE (machine/machparam.h), which
- * includes overhead.  An mbuf may add a single "mbuf cluster" of size
- * MCLBYTES (also in machine/machparam.h), which has no additional overhead
- * and is used instead of the internal data area; this is done when
- * at least MINCLSIZE of data must be stored.
- */
-

- -

- -CHILD_MAX and OPEN_MAX are set to allow up to 512 child processes (different -than the maximum value for processes per user ID) and file descriptors. -These values may change for your particular configuration (a higher OPEN_MAX -value if you've got modules or CGI scripts opening lots of connections or -files). If you've got a lot of other activity besides httpd on the same -machine, you'll have to set NPROC higher still. In this example, the NPROC -value derived from maxusers proved sufficient for our load. - -

- -Caveats - -

- -Be aware that your system may not boot with a kernel that is configured -to use more resources than you have available system RAM. ALWAYS -have a known bootable kernel available when tuning your system this way, -and use the system tools beforehand to learn if you need to buy more -memory before tuning. - -

- -RPC services will fail when the value of OPEN_MAX is larger than 256. -This is a function of the original implementations of the RPC library, -which used a byte value for holding file descriptors. BSDI has partially -addressed this limit in its 2.1 release, but a real fix may well await -the redesign of RPC itself. - -

- -Finally, there's the hard limit of child processes configured in Apache. - -

- -For versions of Apache later than 1.0.5 you'll need to change the -definition for HARD_SERVER_LIMIT in httpd.h and recompile -if you need to run more than the default 150 instances of httpd. - -

- -From conf/httpd.conf-dist: - -

-# Limit on total number of servers running, i.e., limit on the number
-# of clients who can simultaneously connect --- if this limit is ever
-# reached, clients will be LOCKED OUT, so it should NOT BE SET TOO LOW.
-# It is intended mainly as a brake to keep a runaway server from taking
-# Unix with it as it spirals down...
-
-MaxClients 150
-

- -Know what you're doing if you bump this value up, and make sure you've -done your system monitoring, RAM expansion, and kernel tuning beforehand. -Then you're ready to service some serious hits! - -

- -Thanks to Tony Sanders and Chris Torek at BSDI for their -helpful suggestions and information. - -

- -

More welcome!

- -If you have tips to contribute, send mail to brian@organic.com - - - - diff --git a/docs/manual/platform/perf-dec.html b/docs/manual/platform/perf-dec.html deleted file mode 100644 index ef75f81b06c..00000000000 --- a/docs/manual/platform/perf-dec.html +++ /dev/null @@ -1,271 +0,0 @@ - -Performance Tuning Tips for Digital Unix - - - -

Performance Tuning Tips for Digital Unix

- -Below is a set of newsgroup posts made by an engineer from DEC in -response to queries about how to modify DEC's Digital Unix OS for more -heavily loaded web sites. Copied with permission. - -

- -

Update

-From: Jeffrey Mogul
-Date: Fri, 28 Jun 96 16:07:56 MDT
- -

The advice given in the README file regarding the - "tcbhashsize" variable is incorrect. The largest value - this should be set to is 1024. Setting it any higher - will have the perverse result of disabling the hashing - mechanism. - -

Patch ID OSF350-146 has been superseded by -

- Patch ID OSF350-195 for V3.2C
- Patch ID OSF360-350195 for V3.2D -

- Patch IDs for V3.2E and V3.2F should be available soon. - There is no known reason why the Patch ID OSF360-350195 - won't work on these releases, but such use is not officially - supported by Digital. This patch kit will not be needed for - V3.2G when it is released. - - -

- - -

-From           mogul@pa.dec.com (Jeffrey Mogul)
-Organization   DEC Western Research
-Date           30 May 1996 00:50:25 GMT
-Newsgroups     comp.unix.osf.osf1
-Message-ID     <4oirch$bc8@usenet.pa.dec.com>
-Subject        Re: Web Site Performance
-References     1
-
-
-
-In article <skoogDs54BH.9pF@netcom.com> skoog@netcom.com (Jim Skoog) writes:
->Where are the performance bottlenecks for Alpha AXP running the
->Netscape Commerce Server 1.12 with high volume internet traffic?
->We are evaluating network performance for a variety of Alpha AXP
->runing DEC UNIX 3.2C, which run DEC's seal firewall and behind
->that Alpha 1000 and 2100 webservers.
-
-Our experience (running such Web servers as altavista.digital.com
-and www.digital.com) is that there is one important kernel tuning
-knob to adjust in order to get good performance on V3.2C.  You
-need to patch the kernel global variable "somaxconn" (use dbx -k
-to do this) from its default value of 8 to something much larger.
-
-How much larger?  Well, no larger than 32767 (decimal).  And
-probably no less than about 2048, if you have a really high volume
-(millions of hits per day), like AltaVista does.
-
-This change allows the system to maintain more than 8 TCP
-connections in the SYN_RCVD state for the HTTP server.  (You
-can use "netstat -An |grep SYN_RCVD" to see how many such
-connections exist at any given instant).
-
-If you don't make this change, you might find that as the load gets
-high, some connection attempts take a very long time.  And if a lot
-of your clients disconnect from the Internet during the process of
-TCP connection establishment (this happens a lot with dialup
-users), these "embryonic" connections might tie up your somaxconn
-quota of SYN_RCVD-state connections.  Until the kernel times out
-these embryonic connections, no other connections will be accepted,
-and it will appear as if the server has died.
-
-The default value for somaxconn in Digital UNIX V4.0 will be quite
-a bit larger than it has been in previous versions (we inherited
-this default from 4.3BSD).
-
-Digital UNIX V4.0 includes some other performance-related changes
-that significantly improve its maximum HTTP connection rate.  However,
-we've been using V3.2C systems to front-end for altavista.digital.com
-with no obvious performance bottlenecks at the millions-of-hits-per-day
-level.
-
-We have some Webstone performance results available at
-        http://www.digital.com/info/alphaserver/news/webff.html
-I'm not sure if these were done using V4.0 or an earlier version
-of Digital UNIX, although I suspect they were done using a test
-version of V4.0.
-
--Jeff
-
-
-
-----------------------------------------------------------------------------
-
-From           mogul@pa.dec.com (Jeffrey Mogul)
-Organization   DEC Western Research
-Date           31 May 1996 21:01:01 GMT
-Newsgroups     comp.unix.osf.osf1
-Message-ID     <4onmmd$mmd@usenet.pa.dec.com>
-Subject        Digital UNIX V3.2C Internet tuning patch info
-
-----------------------------------------------------------------------------
-
-Something that probably few people are aware of is that Digital
-has a patch kit available for Digital UNIX V3.2C that may improve
-Internet performance, especially for busy web servers.
-
-This patch kit is one way to increase the value of somaxconn,
-which I discussed in a message here a day or two ago.
-
-I've included in this message the revised README file for this
-patch kit below.  Note that the original README file in the patch
-kit itself may be an earlier version; I'm told that the version
-below is the right one.
-
-Sorry, this patch kit is NOT available for other versions of Digital
-UNIX.  Most (but not quite all) of these changes also made it into V4.0,
-so the description of the various tuning parameters in this README
-file might be useful to people running V4.0 systems.
-
-This patch kit does not appear to be available (yet?) from
-        http://www.service.digital.com/html/patch_service.html
-so I guess you'll have to call Digital's Customer Support to get it.
-
--Jeff
-
-DESCRIPTION: Digital UNIX Network tuning patch
-
-             Patch ID: OSF350-146
-
-             SUPERSEDED PATCHES: OSF350-151, OSF350-158
-
-        This set of files improves the performance of the network
-        subsystem on a system being used as a web server.  There are
-        additional tunable parameters included here, to be used
-        cautiously by an informed system administrator.
-
-TUNING
-
-        To tune the web server, the number of simultaneous socket
-        connection requests are limited by:
-
-        somaxconn               Sets the maximum number of pending requests
-                                allowed to wait on a listening socket.  The
-                                default value in Digital UNIX V3.2 is 8.
-                                This patch kit increases the default to 1024,
-                                which matches the value in Digital UNIX V4.0.
-
-        sominconn               Sets the minimum number of pending connections
-                                allowed on a listening socket.  When a user
-                                process calls listen with a backlog less
-                                than sominconn, the backlog will be set to
-                                sominconn.  sominconn overrides somaxconn.
-                                The default value is 1.
-
-        The effectiveness of tuning these parameters can be monitored by
-        the sobacklog variables available in the kernel:
-
-        sobacklog_hiwat         Tracks the maximum pending requests to any
-                                socket.  The initial value is 0.
-
-        sobacklog_drops         Tracks the number of drops exceeding the
-                                socket set backlog limit.  The initial
-                                value is 0.
-
-        somaxconn_drops         Tracks the number of drops exceeding the
-                                somaxconn limit.  When sominconn is larger
-                                than somaxconn, tracks the number of drops
-                                exceeding sominconn.  The initial value is 0.
-
-        TCP timer parameters also affect performance. Tuning the following
-        require some knowledge of the characteristics of the network.
-
-        tcp_msl                 Sets the tcp maximum segment lifetime.
-                                This is the maximum lifetime in half
-                                seconds that a packet can be in transit
-                                on the network.  This value, when doubled,
-                                is the length of time a connection remains
-                                in the TIME_WAIT state after a incoming
-                                close request is processed.  The unit is
-                                specified in 1/2 seconds, the initial
-                                value is 60.
-
-        tcp_rexmit_interval_min
-                                Sets the minimum TCP retransmit interval.
-                                For some WAN networks the default value may
-                                be too short, causing unnecessary duplicate
-                                packets to be sent.  The unit is specified
-                                in 1/2 seconds, the initial value is 1.
-
-        tcp_keepinit            This is the amount of time a partially
-                                established connection will sit on the listen
-                                queue before timing out (e.g. if a client
-                                sends a SYN but never answers our SYN/ACK).
-                                Partially established connections tie up slots
-                                on the listen queue.  If the queue starts to
-                                fill with connections in SYN_RCVD state,
-                                tcp_keepinit can be decreased to make those
-                                partial connects time out sooner.  This should
-                                be used with caution, since there might be
-                                legitimate clients that are taking a while
-                                to respond to SYN/ACK.  The unit is specified
-                                in 1/2 seconds, the default value is 150
-                                (ie. 75 seconds).
-
-        The hashlist size for the TCP inpcb lookup table is regulated by:
-
-        tcbhashsize             The number of hash buckets used for the
-                                TCP connection table used in the kernel.
-                                The initial value is 32.  For best results,
-                                should be specified as a power of 2.  For
-                                busy Web servers, set this to 2048 or more.
-
-        The hashlist size for the interface alias table is regulated by:
-
-        inifaddr_hsize          The number of hash buckets used for the
-                                interface alias table used in the kernel.
-                                The initial value is 32.  For best results,
-                                should be specified as a power of 2.
-
-        ipport_userreserved     The maximum number of concurrent non-reserved,
-                                dynamically allocated ports.  Default range
-                                is 1025-5000.  The maximum value is 65535.
-                                This limits the numer of times you can
-                                simultaneously telnet or ftp out to connect
-                                to other systems.
-
-        tcpnodelack             Don't delay acknowledging TCP data; this
-                                can sometimes improve performance of locally
-                                run CAD packages.  Default is value is 0,
-                                the enabled value is 1.
-
-                           Digital UNIX version:
-
-                                  V3.2C
-Feature                    V3.2C  patch  V4.0
- =======                    =====  =====  ====
-somaxconn                   X      X      X
-sominconn                   -      X      X
-sobacklog_hiwat             -      X      -
-sobacklog_drops             -      X      -
-somaxconn_drops             -      X      -
-tcpnodelack                 X      X      X
-tcp_keepidle                X      X      X
-tcp_keepintvl               X      X      X
-tcp_keepcnt                 -      X      X
-tcp_keepinit                -      X      X
-TCP keepalive per-socket    -      -      X
-tcp_msl                     -      X      -
-tcp_rexmit_interval_min     -      X      -
-TCP inpcb hashing           -      X      X
-tcbhashsize                 -      X      X
-interface alias hashing     -      X      X
-inifaddr_hsize              -      X      X
-ipport_userreserved         -      X      -
-sysconfig -q inet           -      -      X
-sysconfig -q socket         -      -      X
-
-

- - - diff --git a/docs/manual/platform/perf.html b/docs/manual/platform/perf.html deleted file mode 100644 index d743f84a8da..00000000000 --- a/docs/manual/platform/perf.html +++ /dev/null @@ -1,134 +0,0 @@ - - - -Hints on Running a High-Performance Web Server - - - - -

Hints on Running a High-Performance Web Server

- -Running Apache on a heavily loaded web server, one often encounters -problems related to the machine and OS configuration. "Heavy" is -relative, of course - but if you are seeing more than a couple hits -per second on a sustained basis you should consult the pointers on -this page. In general the suggestions involve how to tune your kernel -for the heavier TCP load, hardware/software conflicts that arise, etc. - -

A/UX (Apple's UNIX) -
BSD-based (BSDI, FreeBSD, etc) -
Digital UNIX -
Hewlett-Packard -
Linux -
SGI -
Solaris -
SunOS 4.x -

- -

- - -

A/UX (Apple's UNIX)

- - -If you are running Apache on A/UX, a page that gives some helpful -performance hints (concerning the listen() queue and using -virtual hosts) -can be found here - -

- - -

BSD-based (BSDI, FreeBSD, etc)

- - -Quick and -detailed -performance tuning hints for BSD-derived systems. - -

- - -

Digital UNIX

- - -We have some newsgroup postings on how to -tune Digital UNIX 3.2 and 4.0. - -

- - -

Hewlett-Packard

- - -Some documentation on tuning HP machines can be found at http://www.software.hp.com/internet/perf/tuning.html. - -

- - -

Linux

- - -The most common problem on Linux shows up on heavily-loaded systems -where the whole server will appear to freeze for a couple of minutes -at a time, and then come back to life. This has been traced to a -listen() queue overload - certain Linux implementations have a low -value set for the incoming connection queue which can cause problems. -Please see our Using Apache on -Linux page for more info on how to fix this. - -

- - -

SGI

- -

-WebFORCE Web Server Tuning Guidelines for IRIX 5.3, -<http://www.sgi.com/Products/WebFORCE/TuningGuide.html> -

- -

- - -

Solaris 2.4

- - -The Solaris 2.4 TCP implementation has a few inherent limitations that -only became apparent under heavy loads. This has been fixed to some -extent in 2.5 (and completely revamped in 2.6), but for now consult -the following URL for tips on how to expand the capabilities if you -are finding slowdowns and lags are hurting performance. - -

- -

- - -

SunOS 4.x

- - -More information on tuning SOMAXCONN on SunOS can be found at - -http://www.islandnet.com/~mark/somaxconn.html. - -

- -

More welcome!

- -If you have tips to contribute, send mail to brian@organic.com - - - - diff --git a/docs/manual/suexec.html.en b/docs/manual/suexec.html.en deleted file mode 100644 index 53ad269e6f9..00000000000 --- a/docs/manual/suexec.html.en +++ /dev/null @@ -1,155 +0,0 @@ - -Apache SetUserID Support - - - -

Apache suEXEC Support

- -

What is suEXEC?

-The suEXEC feature, introduced in Apache 1.2 provides the ability to -run CGI programs under user ids different from the user id of the -calling web-server. Used properly, this feature can reduce considerably the -insecurity of allowing users to run CGI programs. At the same time, improperly -configured, this facility can crash your computer, burn your house down and -steal all the money from your retirement fund. :-) If you aren't -familiar with managing setuid root programs and the security issues they -present, we highly recommend that you not consider using this feature.

- -

Enabling suEXEC Support

-Having said all that, enabling this feature is purposefully difficult with -the intent that it will only be installed by users determined to use it and -is not part of the normal install/compile process.

- -

Configuring the suEXEC wrapper

cd support [ENTER]

-Edit the suexec.h file and change the following macros to match your -local Apache installation.

-From support/suexec.h --

-/*
- * HTTPD_USER -- Define as the username under which Apache normally
- *               runs.  This is the only user allowed to execute
- *               this program.
- */
-#define HTTPD_USER "www"
-
-/*
- * LOG_EXEC -- Define this as a filename if you want all suEXEC
- *             transactions and errors logged for auditing and
- *             debugging purposes.
- */
-#define LOG_EXEC "/usr/local/etc/httpd/logs/cgi.log"
-
-/*
- * DOC_ROOT -- Define as the DocumentRoot set for Apache.  This
- *             will be the only hierarchy (aside from UserDirs)
- *             that can be used for suEXEC behavior.
- */
-#define DOC_ROOT "/usr/local/etc/httpd/htdocs"
-
-/*
- * SAFE_PATH -- Define a safe PATH environment to pass to CGI executables.
- *
- */
-#define SAFE_PATH "/usr/local/bin:/usr/bin:/bin"
-

Compiling the suEXEC wrapper

cc suexec.c -o suexec [ENTER]

-This should create the suexec wrapper executable. - -

Compiling Apache for suEXEC support

-From src/httpd.h --

-/* The path to the suEXEC wrapper */
-#ifndef SUEXEC_BIN
-#define SUEXEC_BIN "/usr/local/etc/httpd/sbin/suexec"
-#endif
-

-If your installation requires location of the wrapper program in a different -directory, edit src/httpd.h and recompile your Apache server. See Compiling and Installing Apache for more info on this process.

- -

Installing the suEXEC wrapper

suexec

SUEXEC_BIN

-In order for the wrapper to set the user id for execution requests it must me installed -as owner root and must have the setuserid execution bit set for file modes. -If you are not running a root user shell, do so now and execute the following -commands.

- -chown root /usr/local/etc/httpd/sbin/suexec [ENTER]

-chmod 4711 /usr/local/etc/httpd/sbin/suexec [ENTER]

- -Change the path to the suEXEC wrapper to match your system installation. -

- -

- - -

Security Model of suEXEC

-The suEXEC wrapper supplied with Apache performs the following security -checks before it will execute any program passed to it for execution. -

User executing the wrapper must be a valid user on this system. -
User executing the wrapper must be the compiled in HTTPD_USER. -
The command that the request wishes to execute must not contain a /. -
The command being executed must reside under the compiled in DOC_ROOT. -
The current working directory must be a directory. -
The current working directory must not be writable by group or other. -
The command being executed cannot be a symbolic link. -
The command being executed cannot be writable by group or other. -
The command being executed cannot be a setuid or setgid program. -
The target UID and GID must be a valid user and group on this system. -
The target UID and GID to execute as, must match the UID and GID of the directory. -
The target execution UID and GID must not be the privledged ID 0. -

-If any of these issues are too restrictive, or do not seem restrictive enough, you are -welcome to install your own version of the wrapper. We've given you the rope, now go -have fun with it. :-) - -

- -

Using suEXEC

-After properly installing the suexec wrapper executable, you must kill and restart -the Apache server. A simple kill -1 `cat httpd.pid` will not be enough. -Upon startup of the web-server, if Apache finds a properly configured suexec wrapper, -it will print the following message to the console.

- -Configuring Apache for use with suexec wrapper.

- -If you don't see this message at server startup, the server is most likely not finding the -wrapper program where it expects it, or the executable is not installed setuid root. Check your installation and try again.

- -One way to use suEXEC is through the User and Group directives in VirtualHost definitions. By setting these directives to values -different from the main server user id, all requests for CGI resources will be executed as -the User and Group defined for that <VirtualHost>. If only one or -neither of these directives are specified for a <VirtualHost> then the main -server userid is assumed.

- -suEXEC can also be used to to execute CGI programs as the user to which the request -is being directed. This is accomplished by using the ~ character prefixing the -user id for whom execution is desired. The only requirement needed for this feature to work -is for CGI execution to be enabled for the user and that the script must meet the scrutiny of the security checks above. - -

- -

Debugging suEXEC

-The suEXEC wrapper will write log information to the location defined in the suexec.h as indicated above. If you feel you have configured and installed the wrapper properly, -have a look at this log and the error_log for the server to see where you may have gone astray. - - - - -

Setting which addresses and ports Apache uses

How this works with Virtual Hosts

See also

PATH_INFO Changes in the CGI Environment

Content Negotiation

About Content Negotiation

Negotiation in Apache

Using a type-map file

Multiviews

The Negotiation Algorithm

Dimensions of Negotation

Apache Negotiation Algorithm

Media Types and Wildcards

Variants with no Language

Note on Caching

Custom error responses

Custom error responses and redirects

Apache API notes

Allocation of memory in pools

Allocating initialized memory

Tracking open files, etc.

Other sorts of resources --- cleanup functions

Fine control --- creating and dealing with sub-pools, with a note -on sub-requests

Apache's Handler Use

What is a Handler

Directives

Programmer's Note

Compiling and Installing Apache 1.2

Downloading Apache

Compiling Apache

Starting and Stopping the Server

Compiling Support Programs

Starting Apache

Invoking Apache

Command line options

Configuration files

Log files

security warning

pid file

Error log

Transfer log

Running a High-Performance Web Server for BSD

More welcome!

Performance Tuning Tips for Digital Unix

Update

Hints on Running a High-Performance Web Server

A/UX (Apple's UNIX)

BSD-based (BSDI, FreeBSD, etc)

Digital UNIX

Hewlett-Packard

Linux

SGI

Solaris 2.4

SunOS 4.x

More welcome!

Apache suEXEC Support

What is suEXEC?

Enabling suEXEC Support

Configuring the suEXEC wrapper

Compiling the suEXEC wrapper

Compiling Apache for suEXEC support

Installing the suEXEC wrapper

Security Model of suEXEC

Using suEXEC

Debugging suEXEC