From: Ken Coar
Date: Tue, 21 Apr 1998 12:31:50 +0000 (+0000)
Subject: Add Dean's description of what pools there are to the soi-disant
X-Git-Tag: djg_nspr_split~4
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=3c3ad42d01050055e9e39b1f275d1d6bb6b853a7;p=thirdparty%2Fapache%2Fhttpd.git

Add Dean's description of what pools there are to the soi-disant
documentation.

Submitted by: Dean Gaudet

git-svn-id: https://svn.apache.org/repos/asf/httpd/httpd/trunk@80993 13f79535-47bb-0310-9956-ffa450edef68
---

diff --git a/docs/manual/developer/API.html b/docs/manual/developer/API.html
index 33ec861dade..39189225888 100644
--- a/docs/manual/developer/API.html
+++ b/docs/manual/developer/API.html
@@ -494,27 +494,30 @@
only be correct in the last request in the chain (the one for which a response was actually sent).

Resource allocation and resource pools

- +

One of the problems of writing and designing a server-pool server is that of preventing leakage, that is, allocating resources (memory, open files, etc.), without subsequently releasing them. The resource pool machinery is designed to make it easy to prevent this from happening, by allowing resources to be allocated in such a way that they are automatically released when the server is done with -them.

- +them. +

+

The way this works is as follows: the memory which is allocated, files opened, etc., to deal with a particular request are tied to a resource pool which is allocated for the request. The pool -is a data structure which itself tracks the resources in question.

- +is a data structure which itself tracks the resources in question. +

+

When the request has been processed, the pool is cleared. At that point, all the memory associated with it is released for reuse, all files associated with it are closed, and any other clean-up functions which are associated with the pool are run. When this is over, we can be confident that all the resources tied to the pool have -been released, and that none of them have leaked.

- +been released, and that none of them have leaked. +

+

Server restarts, and allocation of memory and resources for per-server configuration, are handled in a similar way. There is a configuration pool, which keeps track of resources which were @@ -524,8 +527,9 @@ per-server module configuration, log files and other files that were opened, and so forth). When the server restarts, and has to reread the configuration files, the configuration pool is cleared, and so the memory and file descriptors which were taken up by reading them the -last time are made available for reuse.

- +last time are made available for reuse. +

+

It should be noted that use of the pool machinery isn't generally obligatory, except for situations like logging handlers, where you really need to register cleanups to make sure that the log file gets @@ -538,14 +542,15 @@ documented here). However, there are two benefits to using it: resources allocated to a pool never leak (even if you allocate a scratch string, and just forget about it); also, for memory allocation, ap_palloc is generally faster than -malloc.

- +malloc. +

+

We begin here by describing how memory is allocated to pools, and then discuss how other resources are tracked by the resource pool machinery. - +

Allocation of memory in pools

- +

Memory is allocated to pools by calling the function ap_palloc, which takes two arguments, one being a pointer to a resource pool structure, and the other being the amount of memory to @@ -554,7 +559,7 @@ requests, the most common way of getting a resource pool structure is by looking at the pool slot of the relevant request_rec; hence the repeated appearance of the following idiom in module code: - +

 int my_handler(request_rec *r)
 {
@@ -564,14 +569,15 @@ int my_handler(request_rec *r)
     foo = (foo *)ap_palloc (r->pool, sizeof(my_structure));
 }
 
- +

Note that there is no ap_pfree --- ap_palloced memory is freed only when the associated resource pool is cleared. This means that ap_palloc does not have to do as much accounting as malloc(); all it does in the typical case is to round up the size, bump a pointer, and do a -range check.

- +range check. +
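The round-up, bump, and range-check sequence described above can be sketched with a toy allocator. This is only an illustration under stated assumptions, not Apache's implementation: `bump_pool`, `bump_init`, `bump_palloc`, `bump_clear`, the single fixed 4096-byte block, and the 8-byte alignment are all invented for the example. The sketch simply returns NULL when its one block is full, which a real pool would presumably handle by obtaining more memory.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes chosen for the sketch only. */
#define BUMP_POOL_BYTES 4096
#define BUMP_ALIGN 8

/* Toy pool: one fixed block plus the offset of its first free byte. */
typedef struct bump_pool {
    union {
        unsigned char bytes[BUMP_POOL_BYTES];
        double force_alignment;   /* keeps the block suitably aligned */
    } block;
    size_t used;
} bump_pool;

void bump_init(bump_pool *p)
{
    p->used = 0;
}

/* Round the size up, range-check it, then bump the free offset. */
void *bump_palloc(bump_pool *p, size_t nbytes)
{
    size_t rounded;
    void *result;

    if (nbytes > BUMP_POOL_BYTES)
        return NULL;              /* also keeps the rounding from overflowing */
    rounded = (nbytes + BUMP_ALIGN - 1) & ~(size_t)(BUMP_ALIGN - 1);
    if (rounded > BUMP_POOL_BYTES - p->used)
        return NULL;              /* range check: block exhausted */
    result = p->block.bytes + p->used;
    p->used += rounded;           /* the "bump" */
    assert(p->used <= BUMP_POOL_BYTES);
    return result;
}

/* There is no per-allocation free: clearing releases everything at once. */
void bump_clear(bump_pool *p)
{
    p->used = 0;
}
```

Because nothing is tracked per allocation, clearing the pool is a single assignment, which is where the reduced accounting relative to malloc() comes from in this simplified model.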

+

(It also raises the possibility that heavy use of ap_palloc could cause a server process to grow excessively large. There are two ways to deal with this, which are dealt with below; briefly, you @@ -582,9 +588,9 @@ periodically. The latter technique is discussed in the section on sub-pools below, and is used in the directory-indexing code, in order to avoid excessive storage allocation when listing directories with thousands of files). - +

Allocating initialized memory

- +

There are functions which allocate initialized memory, and are frequently useful. The function ap_pcalloc has the same interface as ap_palloc, but clears out the memory it @@ -596,34 +602,157 @@ varargs-style function, which takes a pointer to a resource pool, and at least two char * arguments, the last of which must be NULL. It allocates enough memory to fit copies of each of the strings, as a unit; for instance: - +

      ap_pstrcat (r->pool, "foo", "/", "bar", NULL);
 
- +

returns a pointer to 8 bytes worth of memory, initialized to "foo/bar". - +
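The "enough memory to fit copies of each of the strings, as a unit" behaviour can be sketched as a two-pass varargs function: measure first, then copy. This is a stand-in written for illustration, not the ap_pstrcat source; `toy_strcat_all` is a hypothetical name, and it takes its buffer from malloc() where a pool-based version would take it from the pool.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: joins a NULL-terminated argument list of strings
 * into one freshly allocated buffer. The caller frees the result here;
 * with a pool, the pool would own it instead. */
char *toy_strcat_all(const char *first, ...)
{
    va_list ap;
    const char *s;
    size_t total = 0;
    char *result;
    char *cursor;

    /* Pass 1: add up the lengths of all the pieces. */
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *))
        total += strlen(s);
    va_end(ap);

    /* One allocation for the whole result, plus the terminating NUL. */
    result = malloc(total + 1);
    if (result == NULL)
        return NULL;

    /* Pass 2: copy each piece into place. */
    cursor = result;
    va_start(ap, first);
    for (s = first; s != NULL; s = va_arg(ap, char *)) {
        size_t len = strlen(s);
        memcpy(cursor, s, len);
        cursor += len;
    }
    va_end(ap);
    assert((size_t)(cursor - result) == total);
    *cursor = '\0';
    return result;
}
```

For the arguments "foo", "/", "bar" the result holds seven characters plus the terminator, which matches the 8 bytes quoted above.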

+

Commonly-used pools in the Apache Web server

+

+A pool is really defined by its lifetime more than anything else. There +are some static pools in http_main which are passed to various +non-http_main functions as arguments at opportune times. Here they are: +

+
+
permanent_pool +
+
+
    +
  • never passed to anything else, this is the ancestor of all pools +
  • +
+
+
pconf +
+
+
    +
  • subpool of permanent_pool +
  • +
  • created at the beginning of a config "cycle"; exists until the + server is terminated or restarts; passed to all config-time + routines, either via cmd->pool, or as the "pool *p" argument on + those which don't take pools +
  • +
  • passed to the module init() functions +
  • +
+
+
ptemp +
+
+
    +
  • sorry I lie, this pool isn't called this currently in 1.3, I + renamed it this in my pthreads development. I'm referring to + the use of ptrans in the parent... contrast this with the later + definition of ptrans in the child. +
  • +
  • subpool of permanent_pool +
  • +
  • created at the beginning of a config "cycle"; exists until the + end of config parsing; passed to config-time routines via + cmd->temp_pool. Somewhat of a "bastard child" because it isn't + available everywhere. Used for temporary scratch space which + may be needed by some config routines but which is deleted at + the end of config. +
  • +
+
+
pchild +
+
+
    +
  • subpool of permanent_pool +
  • +
  • created when a child is spawned (or a thread is created); lives + until that child (thread) is destroyed +
  • +
  • passed to the module child_init functions +
  • +
  • destruction happens right after the child_exit functions are + called... (which may explain why I think child_exit is redundant + and unneeded) +
  • +
+
+
ptrans +
+
+
    +
  • should be a subpool of pchild, but currently is a subpool of + permanent_pool, see above +
  • +
  • cleared by the child before going into the accept() loop to receive + a connection +
  • +
  • used as connection->pool +
  • +
+
+
r->pool +
+
+
    +
  • for the main request this is a subpool of connection->pool; for + subrequests it is a subpool of the parent request's pool. +
  • +
  • exists until the end of the request (i.e., destroy_sub_req, or + in child_main after process_request has finished) +
  • +
  • note that r itself is allocated from r->pool; i.e., r->pool is + first created and then r is the first thing palloc()d from it +
  • +
+
+
+

+For almost everything folks do, r->pool is the pool to use. But you +can see how other lifetimes, such as pchild, are useful to some +modules... such as modules that need to open a database connection once +per child, and wish to clean it up when the child dies. +

+

+You can also see how some bugs have manifested themselves, such as setting
+connection->user to a value from r->pool -- in this case connection exists
+for the lifetime of ptrans, which is longer than r->pool (especially if
+r->pool is a subrequest!). So the correct thing to do is to allocate
+from connection->pool.
+

+

+And there was another interesting bug in mod_include/mod_cgi. You'll see +in those that they do this test to decide if they should use r->pool +or r->main->pool. In this case the resource that they are registering +for cleanup is a child process. If it were registered in r->pool, +then the code would wait() for the child when the subrequest finishes. +With mod_include this could be any old #include, and the delay can be up +to 3 seconds... and happened quite frequently. Instead the subprocess +is registered in r->main->pool which causes it to be cleaned up when +the entire request is done -- i.e., after the output has been sent to +the client and logging has happened. +
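The pool-selection test described above might look roughly like the following. The structures are stripped-down, hypothetical stand-ins (`toy_pool`, `toy_request`, `pool_for_subprocess`), and the sketch assumes that the `main` field is NULL for a main request and points at the top-level request for a subrequest.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real structures carry far more state. */
typedef struct toy_pool {
    const char *label;
} toy_pool;

typedef struct toy_request {
    toy_pool *pool;
    struct toy_request *main;   /* assumed NULL unless this is a subrequest */
} toy_request;

/* Pick the pool whose lifetime should govern a spawned child process:
 * the main request's pool, so the wait happens after the whole request
 * is done rather than when a subrequest finishes. */
toy_pool *pool_for_subprocess(const toy_request *r)
{
    assert(r != NULL);
    return (r->main != NULL) ? r->main->pool : r->pool;
}
```

The point of the design is that the cleanup is attached to the longer-lived pool on purpose, so the potentially slow wait is deferred until the response has been sent.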

Tracking open files, etc.

- +

As indicated above, resource pools are also used to track other sorts of resources besides memory. The most common are open files. The routine which is typically used for this is ap_pfopen, which takes a resource pool and two strings as arguments; the strings are -the same as the typical arguments to fopen, e.g., - +the same as the typical arguments to fopen, e.g., +

      ...
      FILE *f = ap_pfopen (r->pool, r->filename, "r");
 
      if (f == NULL) { ... } else { ... }
 
- +

There is also an ap_popenf routine, which parallels the lower-level open system call. Both of these routines arrange for the file to be closed when the resource pool in question -is cleared.

- +is cleared. +
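One way to picture the file tracking is a per-pool list of open streams: opening adds an entry, an explicit close removes the entry before closing, and clearing closes whatever is still listed. The sketch below is an assumption-laden model, not the Apache code; `file_pool`, `toy_note_file`, `toy_pfopen`, `toy_pfclose`, and `toy_clear_files` are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical bookkeeping: one list node per stream the pool owns. */
typedef struct toy_file_entry {
    FILE *fp;
    struct toy_file_entry *next;
} toy_file_entry;

typedef struct file_pool {
    toy_file_entry *files;
} file_pool;

/* Record an already opened stream so that clearing the pool closes it. */
FILE *toy_note_file(file_pool *p, FILE *fp)
{
    toy_file_entry *e;

    assert(p != NULL);
    if (fp == NULL)
        return NULL;
    e = malloc(sizeof *e);
    if (e == NULL) {
        fclose(fp);
        return NULL;
    }
    e->fp = fp;
    e->next = p->files;
    p->files = e;
    return fp;
}

/* Open a file and tie it to the pool in one step. */
FILE *toy_pfopen(file_pool *p, const char *name, const char *mode)
{
    return toy_note_file(p, fopen(name, mode));
}

/* Explicit close: drop the entry first, so a later clear cannot close
 * the same stream a second time. Returns EOF for untracked streams. */
int toy_pfclose(file_pool *p, FILE *fp)
{
    toy_file_entry **link;

    for (link = &p->files; *link != NULL; link = &(*link)->next) {
        if ((*link)->fp == fp) {
            toy_file_entry *e = *link;
            *link = e->next;
            free(e);
            return fclose(fp);
        }
    }
    return EOF;
}

/* Close everything still tracked; returns how many streams were closed. */
int toy_clear_files(file_pool *p)
{
    int closed = 0;

    while (p->files != NULL) {
        toy_file_entry *e = p->files;
        p->files = e->next;
        fclose(e->fp);
        free(e);
        ++closed;
    }
    return closed;
}
```

In this model, calling plain fclose() on a tracked stream would leave a stale entry behind, and the later clear would then close the same stream again.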

+

Unlike the case for memory, there are functions to close files allocated with ap_pfopen, and ap_popenf, namely ap_pfclose and ap_pclosef. (This is @@ -632,17 +761,26 @@ can have open is quite limited). It is important to use these functions to close files allocated with ap_pfopen and ap_popenf, since to do otherwise could cause fatal errors on systems such as Linux, which react badly if the same -FILE* is closed more than once.

- +FILE* is closed more than once. +

+

(Using the close functions is not mandatory, since the file will eventually be closed regardless, but you should consider it in cases where your module is opening, or could open, a lot of files). - +

Other sorts of resources --- cleanup functions

- +
More text goes here. Describe the cleanup primitives in terms of which the file stuff is implemented; also, spawn_process. - +
+

+Pool cleanups live until clear_pool() is called: clear_pool(a) recursively
+calls destroy_pool() on all subpools of a; then calls all the cleanups for a;
+then releases all the memory for a. destroy_pool(a) calls clear_pool(a)
+and then releases the pool structure itself. That is, clear_pool(a) doesn't
+delete a; it just frees up all the resources, and you can start using it
+again immediately.
+
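The ordering described above (destroy sub-pools, run cleanups, release memory, and only in the destroy case release the pool itself) can be modelled with a small pool tree. Everything here is a hypothetical sketch rather than the Apache source: `tree_pool`, `tree_make_sub_pool`, `tree_register_cleanup`, `tree_clear_pool`, `tree_destroy_pool`, and `tree_count_cleanup` are invented names, and memory blocks are left out entirely.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct tree_cleanup {
    void (*fn)(void *);
    void *data;
    struct tree_cleanup *next;
} tree_cleanup;

typedef struct tree_pool {
    struct tree_pool *parent;
    struct tree_pool *first_child;
    struct tree_pool *next_sibling;
    tree_cleanup *cleanups;
} tree_pool;

/* Create a pool; pass NULL as the parent for a root pool. */
tree_pool *tree_make_sub_pool(tree_pool *parent)
{
    tree_pool *p = calloc(1, sizeof *p);

    if (p == NULL)
        return NULL;
    p->parent = parent;
    if (parent != NULL) {
        p->next_sibling = parent->first_child;
        parent->first_child = p;
    }
    return p;
}

/* Returns 0 on success, -1 if the cleanup could not be recorded. */
int tree_register_cleanup(tree_pool *p, void (*fn)(void *), void *data)
{
    tree_cleanup *c = malloc(sizeof *c);

    if (c == NULL)
        return -1;
    c->fn = fn;
    c->data = data;
    c->next = p->cleanups;
    p->cleanups = c;
    return 0;
}

void tree_destroy_pool(tree_pool *p);

/* Clear: the pool itself survives and can be reused immediately. */
void tree_clear_pool(tree_pool *p)
{
    /* 1. Destroy every sub-pool; each call unlinks one child. */
    while (p->first_child != NULL)
        tree_destroy_pool(p->first_child);
    /* 2. Run, then forget, this pool's cleanups. */
    while (p->cleanups != NULL) {
        tree_cleanup *c = p->cleanups;
        p->cleanups = c->next;
        c->fn(c->data);
        free(c);
    }
    /* 3. A real pool would release its memory blocks at this point. */
}

/* Destroy: clear, unlink from the parent, release the structure. */
void tree_destroy_pool(tree_pool *p)
{
    tree_clear_pool(p);
    if (p->parent != NULL) {
        tree_pool **link = &p->parent->first_child;
        while (*link != p) {
            assert(*link != NULL);
            link = &(*link)->next_sibling;
        }
        *link = p->next_sibling;
    }
    free(p);
}

/* Example cleanup: counts how often it has been run. */
void tree_count_cleanup(void *data)
{
    ++*(int *)data;
}
```

Clearing a parent in this model runs the child's cleanups before the parent's own, and the parent can accept new cleanups straight afterwards.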

Fine control --- creating and dealing with sub-pools, with a note on sub-requests

diff --git a/docs/manual/misc/API.html b/docs/manual/misc/API.html
index 33ec861dade..39189225888 100644
--- a/docs/manual/misc/API.html
+++ b/docs/manual/misc/API.html
@@ -494,27 +494,30 @@
only be correct in the last request in the chain (the one for which a response was actually sent).

Resource allocation and resource pools

- +

One of the problems of writing and designing a server-pool server is that of preventing leakage, that is, allocating resources (memory, open files, etc.), without subsequently releasing them. The resource pool machinery is designed to make it easy to prevent this from happening, by allowing resources to be allocated in such a way that they are automatically released when the server is done with -them.

- +them. +

+

The way this works is as follows: the memory which is allocated, files opened, etc., to deal with a particular request are tied to a resource pool which is allocated for the request. The pool -is a data structure which itself tracks the resources in question.

- +is a data structure which itself tracks the resources in question. +

+

When the request has been processed, the pool is cleared. At that point, all the memory associated with it is released for reuse, all files associated with it are closed, and any other clean-up functions which are associated with the pool are run. When this is over, we can be confident that all the resources tied to the pool have -been released, and that none of them have leaked.

- +been released, and that none of them have leaked. +

+

Server restarts, and allocation of memory and resources for per-server configuration, are handled in a similar way. There is a configuration pool, which keeps track of resources which were @@ -524,8 +527,9 @@ per-server module configuration, log files and other files that were opened, and so forth). When the server restarts, and has to reread the configuration files, the configuration pool is cleared, and so the memory and file descriptors which were taken up by reading them the -last time are made available for reuse.

- +last time are made available for reuse. +

+

It should be noted that use of the pool machinery isn't generally obligatory, except for situations like logging handlers, where you really need to register cleanups to make sure that the log file gets @@ -538,14 +542,15 @@ documented here). However, there are two benefits to using it: resources allocated to a pool never leak (even if you allocate a scratch string, and just forget about it); also, for memory allocation, ap_palloc is generally faster than -malloc.

- +malloc. +

+

We begin here by describing how memory is allocated to pools, and then discuss how other resources are tracked by the resource pool machinery. - +

Allocation of memory in pools

- +

Memory is allocated to pools by calling the function ap_palloc, which takes two arguments, one being a pointer to a resource pool structure, and the other being the amount of memory to @@ -554,7 +559,7 @@ requests, the most common way of getting a resource pool structure is by looking at the pool slot of the relevant request_rec; hence the repeated appearance of the following idiom in module code: - +

 int my_handler(request_rec *r)
 {
@@ -564,14 +569,15 @@ int my_handler(request_rec *r)
     foo = (foo *)ap_palloc (r->pool, sizeof(my_structure));
 }
 
- +

Note that there is no ap_pfree --- ap_palloced memory is freed only when the associated resource pool is cleared. This means that ap_palloc does not have to do as much accounting as malloc(); all it does in the typical case is to round up the size, bump a pointer, and do a -range check.

- +range check. +

+

(It also raises the possibility that heavy use of ap_palloc could cause a server process to grow excessively large. There are two ways to deal with this, which are dealt with below; briefly, you @@ -582,9 +588,9 @@ periodically. The latter technique is discussed in the section on sub-pools below, and is used in the directory-indexing code, in order to avoid excessive storage allocation when listing directories with thousands of files). - +

Allocating initialized memory

- +

There are functions which allocate initialized memory, and are frequently useful. The function ap_pcalloc has the same interface as ap_palloc, but clears out the memory it @@ -596,34 +602,157 @@ varargs-style function, which takes a pointer to a resource pool, and at least two char * arguments, the last of which must be NULL. It allocates enough memory to fit copies of each of the strings, as a unit; for instance: - +

      ap_pstrcat (r->pool, "foo", "/", "bar", NULL);
 
- +

returns a pointer to 8 bytes worth of memory, initialized to "foo/bar". - +

+

Commonly-used pools in the Apache Web server

+

+A pool is really defined by its lifetime more than anything else. There +are some static pools in http_main which are passed to various +non-http_main functions as arguments at opportune times. Here they are: +

+
+
permanent_pool +
+
+
    +
  • never passed to anything else, this is the ancestor of all pools +
  • +
+
+
pconf +
+
+
    +
  • subpool of permanent_pool +
  • +
  • created at the beginning of a config "cycle"; exists until the + server is terminated or restarts; passed to all config-time + routines, either via cmd->pool, or as the "pool *p" argument on + those which don't take pools +
  • +
  • passed to the module init() functions +
  • +
+
+
ptemp +
+
+
    +
  • sorry I lie, this pool isn't called this currently in 1.3, I + renamed it this in my pthreads development. I'm referring to + the use of ptrans in the parent... contrast this with the later + definition of ptrans in the child. +
  • +
  • subpool of permanent_pool +
  • +
  • created at the beginning of a config "cycle"; exists until the + end of config parsing; passed to config-time routines via + cmd->temp_pool. Somewhat of a "bastard child" because it isn't + available everywhere. Used for temporary scratch space which + may be needed by some config routines but which is deleted at + the end of config. +
  • +
+
+
pchild +
+
+
    +
  • subpool of permanent_pool +
  • +
  • created when a child is spawned (or a thread is created); lives + until that child (thread) is destroyed +
  • +
  • passed to the module child_init functions +
  • +
  • destruction happens right after the child_exit functions are + called... (which may explain why I think child_exit is redundant + and unneeded) +
  • +
+
+
ptrans +
+
+
    +
  • should be a subpool of pchild, but currently is a subpool of + permanent_pool, see above +
  • +
  • cleared by the child before going into the accept() loop to receive + a connection +
  • +
  • used as connection->pool +
  • +
+
+
r->pool +
+
+
    +
  • for the main request this is a subpool of connection->pool; for + subrequests it is a subpool of the parent request's pool. +
  • +
  • exists until the end of the request (i.e., destroy_sub_req, or + in child_main after process_request has finished) +
  • +
  • note that r itself is allocated from r->pool; i.e., r->pool is + first created and then r is the first thing palloc()d from it +
  • +
+
+
+

+For almost everything folks do, r->pool is the pool to use. But you +can see how other lifetimes, such as pchild, are useful to some +modules... such as modules that need to open a database connection once +per child, and wish to clean it up when the child dies. +

+

+You can also see how some bugs have manifested themselves, such as setting
+connection->user to a value from r->pool -- in this case connection exists
+for the lifetime of ptrans, which is longer than r->pool (especially if
+r->pool is a subrequest!). So the correct thing to do is to allocate
+from connection->pool.
+

+

+And there was another interesting bug in mod_include/mod_cgi. You'll see +in those that they do this test to decide if they should use r->pool +or r->main->pool. In this case the resource that they are registering +for cleanup is a child process. If it were registered in r->pool, +then the code would wait() for the child when the subrequest finishes. +With mod_include this could be any old #include, and the delay can be up +to 3 seconds... and happened quite frequently. Instead the subprocess +is registered in r->main->pool which causes it to be cleaned up when +the entire request is done -- i.e., after the output has been sent to +the client and logging has happened. +

Tracking open files, etc.

- +

As indicated above, resource pools are also used to track other sorts of resources besides memory. The most common are open files. The routine which is typically used for this is ap_pfopen, which takes a resource pool and two strings as arguments; the strings are -the same as the typical arguments to fopen, e.g., - +the same as the typical arguments to fopen, e.g., +

      ...
      FILE *f = ap_pfopen (r->pool, r->filename, "r");
 
      if (f == NULL) { ... } else { ... }
 
- +

There is also an ap_popenf routine, which parallels the lower-level open system call. Both of these routines arrange for the file to be closed when the resource pool in question -is cleared.

- +is cleared. +

+

Unlike the case for memory, there are functions to close files allocated with ap_pfopen, and ap_popenf, namely ap_pfclose and ap_pclosef. (This is @@ -632,17 +761,26 @@ can have open is quite limited). It is important to use these functions to close files allocated with ap_pfopen and ap_popenf, since to do otherwise could cause fatal errors on systems such as Linux, which react badly if the same -FILE* is closed more than once.

- +FILE* is closed more than once. +

+

(Using the close functions is not mandatory, since the file will eventually be closed regardless, but you should consider it in cases where your module is opening, or could open, a lot of files). - +

Other sorts of resources --- cleanup functions

- +
More text goes here. Describe the cleanup primitives in terms of which the file stuff is implemented; also, spawn_process. - +
+

+Pool cleanups live until clear_pool() is called: clear_pool(a) recursively
+calls destroy_pool() on all subpools of a; then calls all the cleanups for a;
+then releases all the memory for a. destroy_pool(a) calls clear_pool(a)
+and then releases the pool structure itself. That is, clear_pool(a) doesn't
+delete a; it just frees up all the resources, and you can start using it
+again immediately.
+

Fine control --- creating and dealing with sub-pools, with a note on sub-requests