From: amosjeffries <> Date: Sun, 20 Jan 2008 16:48:41 +0000 (+0000) Subject: Add major additional information pages. X-Git-Tag: BASIC_TPROXY4~177 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=8b651fb31fea43cef240f7dd77095e417a81254a;p=thirdparty%2Fsquid.git Add major additional information pages. * These pages are for discourses on major components not suitable for writing into the code pages. --- diff --git a/doc/Programming-Guide/01_Main.dox b/doc/Programming-Guide/01_Main.dox new file mode 100644 index 0000000000..0692fa8798 --- /dev/null +++ b/doc/Programming-Guide/01_Main.dox @@ -0,0 +1,55 @@ +/** +\mainpage Squid 3.x Developer Programming Guide + +\section Abstract Abstract + +\par + Squid is a WWW Cache application developed by the National Laboratory + for Applied Network Research and members of the Web Caching community. + Squid is implemented as a single, non-blocking process based around + a BSD select() loop. This document describes the operation of the Squid + source code and is intended to be used by others who wish to customize + or improve it. + + +\section Introduction Introduction + +\par + The Squid source code has evolved more from empirical + observation and tinkering, rather than a solid design + process. It carries a legacy of being "touched" by + numerous individuals, each with somewhat different techniques + and terminology. + +\par + Squid is a single-process proxy server. Every request is + handled by the main process, with the exception of FTP. + However, Squid does not use a "threads package" such as + Pthreads. While this might be easier to code, it suffers + from portability and performance problems. Instead Squid + maintains data structures and state information for each + active request. + +\par + The code is often difficult to follow because there are no + explicit state variables for the active requests. 
Instead, + thread execution progresses as a sequence of "callback + functions" which get executed when I/O is ready to occur, + or some other event has happened. As a callback function + completes, it is responsible for registering the next + callback function for subsequent I/O. + +\par + Note there is only a pseudo-consistent naming scheme. In + most cases functions are named like \c moduleFooBar() . + However, there are also some functions named like + \c module_foo_bar() . + +\par + Note that the Squid source changes rapidly, and while we + do make some effort to document code as we go some parts + of the documentation may be left out. If you find any + inconsistencies, please feel free to notify + http://www.squid-cache.org/Support/contact.dyn the Squid Developers. + + */ diff --git a/doc/Programming-Guide/02_CodingConventions.dox b/doc/Programming-Guide/02_CodingConventions.dox new file mode 100644 index 0000000000..abee9694cb --- /dev/null +++ b/doc/Programming-Guide/02_CodingConventions.dox @@ -0,0 +1,191 @@ +/** +\page Conventions Coding and Other Conventions used in Squid + +\section Coding Code Conventions +\par + Most custom types and tools are documented in the code or the relevant + portions of this manual. Some key points apply globally however. + +\section FWT Fixed Width types + +\par + If you need to use specific width types - such as + a 16 bit unsigned integer, use one of the following types. To access + them simply include "config.h". + +\verbatim + int16_t - 16 bit signed. + u_int16_t - 16 bit unsigned. + int32_t - 32 bit signed. + u_int32_t - 32 bit unsigned. + int64_t - 64 bit signed. + u_int64_t - 64 bit unsigned. +\endverbatim + +\section Documentation Documentation Conventions +\par + Now that documentation is generated automatically from the sources + some common comment conventions need to be adopted. 
+ + +\subsection CommentComponents API vs Internal Component Commenting + +\par + First among these is a clear separation between component API + and Internal operations. API functions and objects should always be + commented in the *.h file for the component. Internal logic and + objects should be commented in the *.cc file where they are defined. + The group is to be defined in the component's main files with the + overview paragraphs about the API usage or component structure. + +\par + With C++ classes it is easy to separate API and Internals with the C++ + public: and private: distinctions on whichever class defines the + component API. An Internal group may not be required if there are no + additional items in the Internals (rare as globals are common in Squid). + +\par + With unconverted modules still coded in plain C, the task is harder. + In these cases two sub-groups must be defined, *API and *Internal, into + which individual functions, variables, etc. are grouped using + the \b \\ingroup tag. The API group is usually a sub-group of Components + and the Internal is always a sub-group of the API. + +\par Rule of thumb: + For both items, if it is referenced from elsewhere in the code or + defined in the .h file it should be part of the API. + Everything else should be in the Internals group and kept out of the .h file. + +\subsection FunctionComments Function/Method Comments + +\par + All descriptions may be more than one line, and while whitespace formatting is + ignored by doxygen, it is good to keep it clear for manual reading of the code. + +\par + Any text directly following a \b \\par tag will be highlighted in bold + automatically (like all the 'For Examples' below) so be careful what is placed + there. + + +\subsubsection PARAM Function Parameters + +\par + Function and Method parameters MUST be named in both the definition and in + the declaration, and they also MUST be the same text. 
The doxygen parser + needs them to be identical to accurately link the two with documentation, + particularly when linking a function with the documentation of the parameter label itself. + +\par + Each function that takes parameters should have the possible range of values + commented in the pre-function descriptor. For API functions this is as usual + in the .h file, for Internal functions it is in the .(cc|cci) file. + +\par + The \b \\param tag is used to describe these. It takes a direction attribute, + one of [in], [out], or [in,out], written directly after the tag name, + then the name of the function parameter being documented, + followed by an optional description of what the parameter represents. + +\par For Example: +\verbatim +/** + \param[out] g Buffer to receive something + \param[in] glen Length of buffer available to write + */ +void +X::getFubar(char *g, int glen) +... +\endverbatim + + +\subsubsection RETVAL Return Values + +\par + Each function that returns a value should have the possible range of values + commented in the pre-function descriptor. +\par + The \b \\retval tag is used to describe these. It takes one required parameter; + the value or range of values returned, + followed by an optional description of what that value means and why it is returned. + +\par For Example: +\verbatim +/** + \retval 0 when FUBAR does not start with 'F' + \retval 1 when FUBAR starts with 'F' + */ +int +X::saidFubar() +... +\endverbatim + +\par Alternatively + when a state or other context-dependent object is returned the \b \\return + tag is used. It is followed by a description of the object and ideally its + content. + + +\subsubsection FLOW Function Actions / Internal Flows + +\par Simple functions + do not need a detailed description of their operation. + The \link PARAM input parameters \endlink and \link RETVAL Return \endlink + value should be enough for any developer to understand the function. + +\par Long or Complex Functions + do however need some commenting. 
+ A well-designed function does all its operations in distinct blocks: + \arg Input validation + \arg Processing on some state + \arg Processing on the output of that earlier processing + \arg etc, etc. + +\par + Each of these design blocks inside the function should be given a comment + indicating what it does. The comments should begin with + \verbatim /** \par \endverbatim + The resulting function description will then contain a paragraph on each of the + blocks in the order they occur in the function. + +\par For example: +\verbatim +/** + \param g The buffer to be used + \param glen Length of buffer provided + \param state Object of type X storing foo + */ +void +fubar(char *g, int glen, X *state) { +\endverbatim + Designed validation part of the function +\verbatim + /** \par + * When g is NULL or glen is less than 1 nothing is done */ + if(g == NULL || glen < 1) + return; + + /** \par + * When glen is longer than the accepted length it gets truncated */ + if(glen > MAX_FOO) glen = MAX_FOO; +\endverbatim + now we get on to the active part of the function +\verbatim + /** \par + * Appends up to MAX_FOO bytes from g onto the end of state->foo + * then passes the state off to FUBAR. + * No check for null-termination is done. + */ + xmemcpy(state->foo_end_ptr, g, glen); + state->foo_end_ptr += glen; + fubar(state); +} +\endverbatim + +\par + Of course, this is a very simple example. This type of comment should only be + needed in the larger functions with many side effects. + A function this small could reasonably have all its commenting done just ahead of + the parameter description. 
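A compilable variant of the example above may make the block-comment convention easier to try out. It is only a sketch under stated assumptions: the `X` structure, its `foo` and `foo_end_ptr` members, and the value of `MAX_FOO` are hypothetical stand-ins invented for illustration, `std::memcpy` is used in place of `xmemcpy`, and an extra bounds check has been added so the sketch cannot overrun its own buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for illustration only; not Squid definitions.
const std::size_t MAX_FOO = 8;

struct X {
    char foo[64];        // storage being appended to
    char *foo_end_ptr;   // first unused byte in foo
    X() : foo_end_ptr(foo) {}
};

/**
 \param[in]     g      The buffer to be used
 \param[in]     glen   Length of buffer provided
 \param[in,out] state  Object of type X storing foo
 */
void
fubar(const char *g, std::size_t glen, X *state)
{
    /** \par
     * When g or state is NULL, or glen is 0, nothing is done */
    if (g == NULL || state == NULL || glen == 0)
        return;

    /** \par
     * When glen is longer than the accepted length it gets truncated */
    if (glen > MAX_FOO)
        glen = MAX_FOO;

    /** \par
     * The copy is also limited to the space left in state->foo */
    const std::size_t used = static_cast<std::size_t>(state->foo_end_ptr - state->foo);
    const std::size_t space = sizeof(state->foo) - used;
    if (glen > space)
        glen = space;
    if (glen == 0)
        return;

    /** \par
     * Appends the bytes from g onto the end of state->foo.
     * No check for null-termination is done. */
    std::memcpy(state->foo_end_ptr, g, glen);
    state->foo_end_ptr += glen;
}
```

Calling `fubar("abcdefghijkl", 12, &s)` on a fresh `X s` copies only the first `MAX_FOO` bytes, so each commented block corresponds to one observable behaviour of the function.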
+ + */ diff --git a/doc/Programming-Guide/03_MajorComponents.dox b/doc/Programming-Guide/03_MajorComponents.dox new file mode 100644 index 0000000000..0216a8bd73 --- /dev/null +++ b/doc/Programming-Guide/03_MajorComponents.dox @@ -0,0 +1,351 @@ +/** +\ingroup Component + +\section Overview of Squid Components + +\par Squid consists of the following major components + +\section ClientSideSocket Client Side Socket + +\par + Here new client connections are accepted, parsed, and + reply data sent. Per-connection state information is held + in a data structure called ConnStateData. Per-request + state information is stored in the clientSocketContext + structure. With HTTP/1.1 we may have multiple requests from + a single TCP connection. +\todo DOCS: find out what has replaced clientSocketContext since it seems to not exist now. + +\section ClientSideRequest Client Side Request +\par + This is where requests are processed. We determine if the + request is to be redirected, if it passes access lists, + and setup the initial client stream for internal requests. + Temporary state for this processing is held in a + clientRequestContext. +\todo DOCS: find out what has replaced clientRequestContext since it seems not to exist now. + +\section ClientSideReply Client Side Reply +\par + This is where we determine if the request is cache HIT, + REFRESH, MISS, etc. This involves querying the store + (possibly multiple times) to work through Vary lists and + the list. Per-request state information is stored + in the clientReplyContext. + +\section StorageManager Storage Manager +\par + The Storage Manager is the glue between client and server + sides. Every object saved in the cache is allocated a + StoreEntry structure. While the object is being + accessed, it also has a MemObject structure. +\par + Squid can quickly locate cached objects because it keeps + (in memory) a hash table of all StoreEntry's. The + keys for the hash table are MD5 checksums of the objects + URI. 
In addition there is also a storage policy such + as LRU that keeps track of the objects and determines + the removal order when space needs to be reclaimed. + For the LRU policy this is implemented as a doubly linked + list. +\par + For each object the StoreEntry maps to a cache_dir + and location via sdirno and sfileno. For the "ufs" store + this file number (sfileno) is converted to a disk pathname + by a simple modulo of L2 and L1, but other storage drivers may + map sfileno in other ways. A cache swap file consists + of two parts: the cache metadata, and the object data. + Note the object data includes the full HTTP reply---headers + and body. The HTTP reply headers are not the same as the + cache metadata. +\par + Client-side requests register themselves with a StoreEntry + to be notified when new data arrives. Multiple clients + may receive data via a single StoreEntry. For POST + and PUT requests, this process works in reverse. Server-side + functions are notified when additional data is read from + the client. + +\section RequestForwarding Request Forwarding + +\section PeerSelection Peer Selection +\par + These functions are responsible for selecting one (or none) + of the neighbor caches as the appropriate forwarding + location. + +\section AccessControl Access Control +\par + These functions are responsible for allowing or denying a + request, based on a number of different parameters. These + parameters include the client's IP address, the hostname + of the requested resource, the request method, etc. Some + of the necessary information may not be immediately available, + for example the origin server's IP address. In these cases, + the ACL routines initiate lookups for the necessary + information and continue the access control checks when + the information is available. + +\section AuthenticationFramework Authentication Framework +\par + These functions are responsible for handling HTTP + authentication. 
They follow a modular framework allowing + different authentication schemes to be added at will. For + information on working with the authentication schemes see + the chapter Authentication Framework. + +\section NetworkCommunication Network Communication +\par + These are the routines for communicating over TCP and UDP + network sockets. Here is where sockets are opened, closed, + read, and written. In addition, note that the heart of + Squid (comm_select() or comm_poll()) exists here, + even though it handles all file descriptors, not just + network sockets. These routines do not support queuing + multiple blocks of data for writing. Consequently, a + callback occurs for every write request. +\todo DOCS: decide what to do for comm_poll() since it's either obsolete or uses other names. + +\section FileDiskIO File/Disk I/O +\par + Routines for reading and writing disk files (and FIFOs). + Reasons for separating network and disk I/O functions are + partly historical, and partly because of different behaviors. + For example, we don't worry about getting a "No space left + on device" error for network sockets. The disk I/O routines + support queuing of multiple blocks for writing. In some + cases, it is possible to merge multiple blocks into a single + write request. The write callback does not necessarily + occur for every write request. + +\section Neighbors Neighbors +\par + Maintains the list of neighbor caches. Sends and receives + ICP messages to neighbors. Decides which neighbors to + query for a given request. File: neighbors.c. + +\section FQDNCache IP/FQDN Cache +\par + A cache of name-to-address and address-to-name lookups. + These are hash tables keyed on the names and addresses. + ipcache_nbgethostbyname() and fqdncache_nbgethostbyaddr() + implement the non-blocking lookups. Files: ipcache.c, + fqdncache.c. + +\section CacheManager Cache Manager +\par + This provides access to certain information needed by the + cache administrator. 
A companion program, cachemgr.cgi, + can be used to make this information available via a Web + browser. Cache manager requests to Squid are made with a + special URL of the form +\code + cache_object://hostname/operation +\endcode + The cache manager provides essentially "read-only" access + to information. It does not provide a method for configuring + Squid while it is running. +\todo DOCS: get cachemgr.cgi documenting + +\section NetworkMeasurementDB Network Measurement Database +\par + In a number of situations, Squid finds it useful to know the + estimated network round-trip time (RTT) between itself and + origin servers. A particularly useful example is + the peer selection algorithm. By making RTT measurements, a + Squid cache will know if it, or one of its neighbors, is closest + to a given origin server. The actual measurements are made + with the pinger program, described below. The measured + values are stored in a database indexed under two keys. The + primary index field is the /24 prefix of the origin server's + IP address. Secondly, there is a hash table of fully-qualified host + names that have data structures with links to the appropriate + network entry. This allows Squid to quickly look up measurements + when given either an IP address, or a host name. The /24 prefix + aggregation is used to reduce the overall database size. File: + net_db.c. + +\section Redirectors Redirectors +\par + Squid has the ability to rewrite requests from clients. After + checking the ACL access controls, but before checking for cache hits, + requested URLs may optionally be written to an external + redirector process. This program, which can be highly + customized, may return a new URL to replace the original request. + Common applications for this feature are extended access controls + and local mirroring. File: redirect.c. + +\section ASN Autonomous System Numbers +\par + Squid supports Autonomous System (AS) numbers as another + access control element. 
The routines in asn.c + query databases which map AS numbers into lists of CIDR + prefixes. These results are stored in a radix tree which + allows fast searching of the AS number for a given IP address. + +\section ConfigurationFileParsing Configuration File Parsing +\par + The primary configuration file specification is in the file + cf.data.pre. A simple utility program, cf_gen, + reads the cf.data.pre file and generates cf_parser.c + and squid.conf. cf_parser.c is included directly + into cache_cf.c at compile time. +\todo DOCS: get cf.data.pre documenting +\todo DOCS: get squid.conf documenting +\todo DOCS: get cf_gen documenting and linking. + +\section Callback Data Allocator +\par + Squid's extensive use of callback functions makes it very + susceptible to memory access errors. Care must be taken + so that the callback_data memory is still valid when + the callback function is executed. The routines in cbdata.c + provide a uniform method for managing callback data memory, + canceling callbacks, and preventing erroneous memory accesses. +\todo DOCS: get callback_data (object?) linking or replacement named. + +\section RefCountDataAllocator Refcount Data Allocator +\since Squid 3.0 +\par + Manual reference counting, such as cbdata uses, is error prone + and time consuming for the programmer. C++'s operator overloading + allows us to create automatic reference counting pointers that will + free objects when they are no longer needed. With some care these + objects can be passed to functions needing Callback Data pointers. +\todo DOCS: get cbdata documenting and linking. + +\section Debugging Debugging +\par + Squid includes extensive debugging statements to assist in + tracking down bugs and strange behavior. Every debug statement + is assigned a section and level. Usually, every debug statement + in the same source file has the same section. Levels are chosen + depending on how much output will be generated, or how useful the + provided information will be. 
The \em debug_options line + in the configuration file determines which debug statements will + be shown and which will not. The \em debug_options line + assigns a maximum level for every section. If a given debug + statement has a level less than or equal to the configured + level for that section, it will be shown. This description + probably sounds more complicated than it really is. + File: debug.c. Note that debugs() itself is a macro. +\todo DOCS: get debugs() documenting as if it was a function. + +\section ErrorGeneration Error Generation +\par + The routines in errorpage.c generate error messages from + a template file and specific request parameters. This allows + for customized error messages and multilingual support. + +\section EventQueue Event Queue +\par + The routines in event.c maintain a linked-list event + queue for functions to be executed at a future time. The + event queue is used for periodic functions such as performing + cache replacement, cleaning swap directories, as well as one-time + functions such as ICP query timeouts. + +\section FiledescriptorManagement Filedescriptor Management +\par + Here we track the number of filedescriptors in use, and the + number of bytes which has been read from or written to each + file descriptor. + + +\section HashtableSupport Hashtable Support +\par + These routines implement generic hash tables. A hash table + is created with a function for hashing the key values, and a + function for comparing the key values. + +\section HTTPAnonymization HTTP Anonymization +\par + These routines support anonymizing of HTTP requests leaving + the cache. Either specific request headers will be removed + (the "standard" mode), or only specific request headers + will be allowed (the "paranoid" mode). + +\section DelayPools Delay Pools +\par + Delay pools provide bandwidth regulation by restricting the rate + at which squid reads from a server before sending to a client. 
They + do not prevent cache hits from being sent at maximal capacity. Delay + pools can aggregate the bandwidth from multiple machines and users + to provide more or less general restrictions. + +\section ICPSupport Internet Cache Protocol +\par + Here we implement the Internet Cache Protocol. This + protocol is documented in the RFC 2186 and RFC 2187. + The bulk of code is in the icp_v2.c file. The + other, icp_v3.c is a single function for handling + ICP queries from Netcache/Netapp caches; they use + a different version number and a slightly different message + format. +\todo DOCS: get RFCs linked from ietf + +\section IdentLookups Ident Lookups +\par + These routines support RFC 931 (http://www.ietf.org/rfc/rfc931.txt) + "Ident" lookups. An ident + server running on a host will report the user name associated + with a connected TCP socket. Some sites use this facility for + access control and logging purposes. + +\section MemoryManagement Memory Management +\par + These routines allocate and manage pools of memory for + frequently-used data structures. When the \em memory_pools + configuration option is enabled, unused memory is not actually + freed. Instead it is kept for future use. This may result + in more efficient use of memory at the expense of a larger + process size. + +\section MulticastSupport Multicast Support +\par + Currently, multicast is only used for ICP queries. The + routines in this file implement joining a UDP + socket to a multicast group (or groups), and setting + the multicast TTL value on outgoing packets. + +\section PresistentConnections Persistent Server Connections +\par + These routines manage idle, persistent HTTP connections + to origin servers and neighbor caches. Idle sockets + are indexed in a hash table by their socket address + (IP address and port number). Up to 10 idle sockets + will be kept for each socket address, but only for + 15 seconds. After 15 seconds, idle socket connections + are closed. 
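The pooling rule described above (idle sockets indexed by socket address, at most 10 per address, each kept for 15 seconds) can be sketched as follows. This is a minimal illustration and not Squid's persistent-connection code: the names `IdlePool`, `IdleSocket`, `push()`, and `pop()` are hypothetical, `std::unordered_map` stands in for Squid's own hash table, the current time is passed in explicitly as seconds, and refusing an eleventh socket is an assumption about what happens at the limit.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>

// Limits taken from the text above; all names here are hypothetical.
const std::size_t kMaxIdlePerAddress = 10;
const long kIdleTimeoutSeconds = 15;

struct IdleSocket {
    int fd;          // descriptor of the idle connection
    long idleSince;  // time (in seconds) at which it became idle
};

class IdlePool {
public:
    // Store an idle descriptor under "ip:port". Returns false when the
    // address already holds the maximum; the caller should then close fd.
    bool push(const std::string &addr, int fd, long now) {
        std::deque<IdleSocket> &q = idle_[addr];
        expire(q, now);
        if (q.size() >= kMaxIdlePerAddress)
            return false;
        IdleSocket s = { fd, now };
        q.push_back(s);
        return true;
    }

    // Fetch the most recently parked descriptor for addr, or -1 if none
    // is still within the idle timeout.
    int pop(const std::string &addr, long now) {
        auto it = idle_.find(addr);
        if (it == idle_.end())
            return -1;
        expire(it->second, now);
        if (it->second.empty()) {
            idle_.erase(it);
            return -1;
        }
        const int fd = it->second.back().fd;
        it->second.pop_back();
        return fd;
    }

private:
    // Drop entries idle for the full timeout. Entries are stored oldest
    // first; a real pool would also close each dropped descriptor.
    static void expire(std::deque<IdleSocket> &q, long now) {
        while (!q.empty() && now - q.front().idleSince >= kIdleTimeoutSeconds)
            q.pop_front();
    }

    std::unordered_map<std::string, std::deque<IdleSocket> > idle_;
};
```

Passing the time in as a parameter, rather than reading a clock inside the class, is purely a choice made to keep the sketch easy to exercise; the real routines are driven by Squid's own event and timeout machinery.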
+ +\section RefreshRules Refresh Rules +\par + These routines decide whether a cached object is stale or fresh, + based on the \em refresh_pattern configuration options. + If an object is fresh, it can be returned as a cache hit. + If it is stale, then it must be revalidated with an + If-Modified-Since request. + +\section SNMPSupport SNMP Support +\par + These routines implement SNMP for Squid. At the present time, + we have made almost all of the cachemgr information available + via SNMP. + +\section URNSupport URN Support +\par + We are experimenting with URN support in Squid version 1.2. + Note, we're not talking full-blown generic URNs here. This + is primarily targeted toward using URNs as a smart way + of handling lists of mirror sites. For more details, please + see (http://squid.nlanr.net/Squid/urn-support.html) URN Support in Squid. + +\section ESI ESI +\par + ESI is an implementation of Edge Side Includes (http://www.esi.org). + ESI is implemented as a client side stream and a small + modification to client_side_reply.c to check whether + ESI should be inserted into the reply stream or not. + + */ diff --git a/doc/Programming-Guide/05_TypicalRequestFlow.dox b/doc/Programming-Guide/05_TypicalRequestFlow.dox new file mode 100644 index 0000000000..9cce99e0b5 --- /dev/null +++ b/doc/Programming-Guide/05_TypicalRequestFlow.dox @@ -0,0 +1,72 @@ +/** +\page 05_TypicalRequestFlow Flow of a Typical Request + +\par +\li A client connection is accepted by the client-side socket + support and parsed, or is directly created via + clientBeginRequest(). + +\li The access controls are checked. The client-side-request builds + an ACL state data structure and registers a callback function + for notification when access control checking is completed. + +\li After the access controls have been verified, the request + may be redirected. 
+ +\li The client-side-request is forwarded up the client stream + to GetMoreData() which looks for the requested object in the + cache, and/or Vary: versions of the same. If it is a cache hit, + then the client-side registers its interest in the + StoreEntry. Otherwise, Squid needs to forward the request, + perhaps with an If-Modified-Since header. + +\li The request-forwarding process begins with protoDispatch(). + This function begins the peer selection procedure, which + may involve sending ICP queries and receiving ICP replies. + The peer selection procedure also involves checking + configuration options such as \em never_direct and + \em always_direct. + +\li When the ICP replies (if any) have been processed, we end + up at protoStart(). This function calls an appropriate + protocol-specific function for forwarding the request. + Here we will assume it is an HTTP request. + +\li The HTTP module first opens a connection to the origin + server or cache peer. If there is no idle persistent socket + available, a new connection request is given to the Network + Communication module with a callback function. The + comm.c routines may try establishing a connection + multiple times before giving up. + +\li When a TCP connection has been established, HTTP builds a + request buffer and submits it for writing on the socket. + It then registers a read handler to receive and process + the HTTP reply. + +\li As the reply is initially received, the HTTP reply headers + are parsed and placed into a reply data structure. As + reply data is read, it is appended to the StoreEntry. + Every time data is appended to the StoreEntry, the + client-side is notified of the new data via a callback + function. The rate at which reading occurs is regulated by + the delay pools routines, via the deferred read mechanism. + +\li As the client-side is notified of new data, it copies the + data from the StoreEntry and submits it for writing on the + client socket. 
+ +\li As data is appended to the StoreEntry, and the client(s) + read it, the data may be submitted for writing to disk. + +\li When the HTTP module finishes reading the reply from the + upstream server, it marks the StoreEntry as "complete". + The server socket is either closed or given to the persistent + connection pool for future use. + +\li When the client-side has written all of the object data, + it unregisters itself from the StoreEntry. At the + same time it either waits for another request from the + client, or closes the client connection. + +*/ diff --git a/doc/Programming-Guide/AccessControls.dox b/doc/Programming-Guide/AccessControls.dox new file mode 100644 index 0000000000..acd3ffe0ab --- /dev/null +++ b/doc/Programming-Guide/AccessControls.dox @@ -0,0 +1,16 @@ +/** +\defgroup ACLAPI Access Controls +\ingroup Components + +\par + These functions are responsible for allowing or denying a + request, based on a number of different parameters. These + parameters include the client's IP address, the hostname + of the requested resource, the request method, etc. Some + of the necessary information may not be immediately available, + for example the origin server's IP address. In these cases, + the ACL routines initiate lookups for the necessary + information and continues the access control checks when + the information is available. + + */ diff --git a/doc/Programming-Guide/BasicAuthentication.dox b/doc/Programming-Guide/BasicAuthentication.dox new file mode 100644 index 0000000000..45ce6b3a94 --- /dev/null +++ b/doc/Programming-Guide/BasicAuthentication.dox @@ -0,0 +1,43 @@ +/** +\defgroup AuthAPIBasic Basic Authentication +\ingroup AuthAPI + +\par +Basic authentication provides a username and password. 
These +are written to the authentication module processes on a single +line, separated by a space: +\code + +\endcode + +\par + The authentication module process reads username, password pairs + on stdin and returns either "OK" or "ERR" on stdout for + each input line. + +\par + The following simple perl script demonstrates how the + authentication module works. This script allows any + user named "Dirk" (without checking the password) + and allows any user that uses the password "Sekrit": + +\code +#!/usr/bin/perl -w +$|=1; # no buffering, important! +while (<>) { + chop; + ($u,$p) = split; + $ans = &check($u,$p); + print "$ans\n"; +} + +sub check { + local($u,$p) = @_; + return 'ERR' unless (defined $p && defined $u); + return 'OK' if ('Dirk' eq $u); + return 'OK' if ('Sekrit' eq $p); + return 'ERR'; +} +\endcode + + */ diff --git a/doc/Programming-Guide/DelayPools.dox b/doc/Programming-Guide/DelayPools.dox new file mode 100644 index 0000000000..cea810d0e9 --- /dev/null +++ b/doc/Programming-Guide/DelayPools.dox @@ -0,0 +1,49 @@ +/** +\page 10_DelayPools Delay Pools + +\section Introduction Introduction +\par + A DelayPool is a Composite used to manage bandwidth for any request + assigned to the pool by an access expression. DelayIds are used + to manage the bandwidth on a given request, whereas a DelayPool + manages the bandwidth availability and assigned DelayIds. + +\section ExtendingDelayPools Extending Delay Pools +\par + A CompositePoolNode is the base type for all members of a DelayPool. + Any child must implement the RefCounting primitives, as well as five + delay pool functions: + \li stats() - provide cachemanager statistics for itself. + \li dump() - generate squid.conf syntax for the current configuration of the item. + \li update() - allocate more bandwidth to all buckets in the item. + \li parse() - accept squid.conf syntax for the item, and configure for use appropriately. + \li id() - return a DelayId entry for the current item. 
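The five functions listed above can be pictured with a small stand-alone sketch. It is not the real CompositePoolNode interface: the class names `SketchPoolNode`, `AggregateSketch`, and `SketchDelayId`, every signature, the use of standard streams, the `restore/max` text form, and the choice to start with an empty bucket are all assumptions made for illustration, and the RefCounting primitives are left out entirely.

```cpp
#include <algorithm>
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

class SketchPoolNode;

// Hypothetical stand-in for a DelayId entry: it only records its owner.
struct SketchDelayId {
    const SketchPoolNode *owner;
};

// Hypothetical base type mirroring the five functions listed above.
class SketchPoolNode {
public:
    virtual ~SketchPoolNode() {}
    virtual void stats(std::ostream &out) const = 0;  // statistics report
    virtual void dump(std::ostream &out) const = 0;   // emit configuration text
    virtual void update(int seconds) = 0;             // add bandwidth to buckets
    virtual void parse(std::istream &in) = 0;         // read configuration text
    virtual SketchDelayId id() const = 0;             // hand out an id entry
};

// One aggregate bucket refilled at restore_ bytes per second up to max_.
class AggregateSketch : public SketchPoolNode {
public:
    AggregateSketch() : restore_(0), max_(0), level_(0) {}

    void stats(std::ostream &out) const override {
        out << "level=" << level_ << " max=" << max_;
    }
    void dump(std::ostream &out) const override {
        out << restore_ << '/' << max_;
    }
    void update(int seconds) override {
        if (seconds > 0)
            level_ = std::min(max_, level_ + restore_ * seconds);
    }
    void parse(std::istream &in) override {
        long restore = 0, max = 0;
        char slash = 0;
        // Accept "restore/max"; leave the node untouched on malformed input.
        if (in >> restore >> slash >> max && slash == '/' && restore >= 0 && max >= 0) {
            restore_ = restore;
            max_ = max;
            level_ = 0;  // assumption: a freshly configured bucket starts empty
        }
    }
    SketchDelayId id() const override {
        SketchDelayId d = { this };
        return d;
    }
    long level() const { return level_; }

private:
    long restore_, max_, level_;
};
```

Feeding the text `1000/4000` to `parse()` and then calling `update(2)` leaves 2000 bytes in the bucket, and further updates stop at the 4000-byte cap, which is the behaviour the `update()` entry in the list describes.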
+ +\par + A DelayIdComposite is the base type for all delay Ids. Concrete + delay Ids must implement the refcounting primitives, as well as two + delay id functions: + \li bytesWanted() - return the largest amount of bytes that this delay id allows by policy. + \li bytesIn() - record the use of bandwidth by the request(s) that this delayId is monitoring. + +\par + Composite creation is currently under design review, so see the + DelayPool class and follow the parse() code path for details. + +\section NeatExtensions Neat things that could be done. +\par + With the composite structure, some neat things have become possible. + For instance: + +\par Dynamically defined pool arrangements. + For instance an aggregate (class 1) combined with the per-class-C-net tracking of a + class 3 pool, without the individual host tracking. This differs + from a class 3 pool with -1/-1 in the host bucket, because no memory + or cpu would be used on hosts, whereas with a class 3 pool, they are + allocated and used. + +\par Per request bandwidth limits. + A delayId that contains its own bucket could limit each request + independently to a given policy, with no aggregate restrictions. + + */ diff --git a/doc/Programming-Guide/Groups.dox b/doc/Programming-Guide/Groups.dox new file mode 100644 index 0000000000..b77eac85c4 --- /dev/null +++ b/doc/Programming-Guide/Groups.dox @@ -0,0 +1,93 @@ +/** + \defgroup POD POD Classes + * + \par + * Classes which encapsulate POD (plain old data) in such a way + * that they can be used as POD themselves and passed around Squid. + * These objects should have a formal API for safe handling of their + * content, but it MUST NOT depend on anything other than itself + * or the standard C++ libraries. + */ + +/** + \defgroup Components Squid Components + */ + +/** + \defgroup ServerProtocol Server-Side Protocols + \ingroup Components + \par + * These routines are responsible for forwarding cache misses + * to other servers, depending on the protocol. 
Cache misses + * may be forwarded to either origin servers or other proxy + * caches. + * All requests to other proxies are sent as HTTP requests. + * All requests to origin servers are sent in that server's protocol. + * + \par + * WAIS and Gopher don't receive much + * attention because they comprise a relatively insignificant + * portion of Internet traffic. + */ + +/** + \defgroup libsquid Squid Library + * + \par + * These objects are provided publicly through libsquid.la + */ + +/** + \defgroup Tests Unit Testing + * + \par + * Any good application has a set of tests to ensure it stays + * in a good condition. Squid tends to use cppunit tests. + \par + * It is preferable to automate tests for units of functionality. There + * is a boilerplate for tests in "src/tests/testBoilerplate.[cc|h]". New + * tests need to be added to src/Makefile.am to build and run them during + * "make check". To add a new test script, just copy the references to + * testBoilerplate in Makefile.am adjusting the name, and likewise copy the + * source files. If you are testing an already tested area you may be able + * to just add new test cases to an existing script. For example, to test the store + * some more just edit tests/testStore.h and add a new unit test method + * name. + */ + +/** + \defgroup Callbacks Event Callback Functions + * + \par + * Squid uses events to process asynchronous actions. + * These methods are registered as callbacks to receive notice whenever a + * specific event occurs. + */ + +/** + \defgroup Timeouts Timeouts + \todo DOCS: document Timeouts. 
+ */ + +/** + \defgroup ServerProtocolHTTP HTTP + \ingroup ServerProtocol + \todo Write Documentation about HTTP + */ + +/** + \defgroup ServerProtocolFTPAPI Server-Side FTP API + \ingroup ServerProtocol + */ + +/** + \defgroup ServerProtocolWAIS WAIS + \ingroup ServerProtocol + \todo Write Documentation about WAIS + */ + +/** + \defgroup ServerProtocolPassthru Passthru + \ingroup ServerProtocol + \todo Write Documentation about Passthru + */ diff --git a/doc/Programming-Guide/Makefile.dox b/doc/Programming-Guide/Makefile.dox new file mode 100644 index 0000000000..98f501a2d5 --- /dev/null +++ b/doc/Programming-Guide/Makefile.dox @@ -0,0 +1,55 @@ +/** + \page 03_Makefile Altering Squid Makefiles + * + \section MakefileWhich1 Which file to edit. + \par + * Each directory in the Squid sources is largely self-sufficient. + * \b Makefile.in is auto-generated by autotools based on the + * \b configure.in and \b Makefile.am files. + * + \par + * In general your additions should go in \b Makefile.am + * + * + \section MakefileUnitTests Adding new Unit Tests + * + \par + * To alter or add new tests for a class where a set of tests + * already exists, you should simply edit the \b tests/testX.(h|cc) files + * for that class. + * + \par + * When a new class needs testing you will need to add some variables + * to Makefile.am telling autotools what to build. These variables are: + * + \subsection _SOURCES tests_testX_SOURCES= ... + \par + * The list of .(h|cc) files that need linking to the class. + * Most tests \b should use the actual Squid code, though there are \b stub_X.cc + * files available that simplify some of the more complex optional components. + * + \subsection _LDFLAGS tests_testX_LDFLAGS= ... + \par + * In most cases it should be just \b \$(LIBADD_DL). + * + \subsection _DEPENDENCIES tests_testX_DEPENDENCIES= ... + \par + * This is a list of the additional module *.a files that need linking. 
+ * All unit tests require: \b \@SQUID_CPPUNIT_LA\@ + * + \subsection _LDADD tests_testX_LDADD= ... + \par + * This is a list of the additional module libraries that need linking. + * All unit tests require: \b \@SQUID_CPPUNIT_LIBS\@ + \par + * + \subsection LIBS Modules available for *_DEPENDENCIES and *_LDADD + * + \par Linking ~/lib/* code: + \li *_LDADD= \b -L../lib \b -lmiscutil ... + \li *_DEPENDENCIES= \b \$(top_builddir)/lib/libmiscutil.a ... + * + \par Linking ~/src/auth/* code: + \li *_LDADD= \b libauth.la ... + * + */ diff --git a/doc/Programming-Guide/StorageManager.dox b/doc/Programming-Guide/StorageManager.dox new file mode 100644 index 0000000000..96e443cff4 --- /dev/null +++ b/doc/Programming-Guide/StorageManager.dox @@ -0,0 +1,40 @@ +/** + \defgroup StorageManager Storage Manager + \ingroup Components + * + \par + The Storage Manager is the glue between client and server + sides. Every object saved in the cache is allocated a + StoreEntry structure. While the object is being + accessed, it also has a MemObject structure. + +\par + Squid can quickly locate cached objects because it keeps + (in memory) a hash table of all StoreEntry structures. The + keys for the hash table are MD5 checksums of the object's + URI. In addition there is also a storage policy such + as LRU that keeps track of the objects and determines + the removal order when space needs to be reclaimed. + For the LRU policy this is implemented as a doubly linked + list. + +\par + For each object the StoreEntry maps to a cache_dir + and location via sdirn and sfilen. For the "ufs" store + this file number (sfilen) is converted to a disk pathname + by a simple modulo of L2 and L1, but other storage drivers may + map sfilen in other ways. A cache swap file consists + of two parts: the cache metadata, and the object data. + Note the object data includes the full HTTP reply (headers + and body). The HTTP reply headers are not the same as the + cache metadata. 
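The "simple modulo of L2 and L1" mapping can be sketched as follows. This is one plausible reading only: the helper name sketchUfsPath, the two-level hexadecimal directory layout and the exact arithmetic are assumptions made for illustration, not something the text above specifies.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: map a swap file number onto a two-level directory tree.
// l1 and l2 are the assumed numbers of first-level and second-level directories.
std::string sketchUfsPath(const std::string &cacheDir,
                          unsigned int l1, unsigned int l2,
                          unsigned int sfilen)
{
    char buf[64];
    std::snprintf(buf, sizeof(buf), "/%02X/%02X/%08X",
                  ((sfilen / l2) / l2) % l1, // first-level directory (assumed formula)
                  (sfilen / l2) % l2,        // second-level directory (assumed formula)
                  sfilen);                   // file name is the file number in hex
    return cacheDir + buf;
}
```

Under these assumptions, consecutive file numbers fill one second-level directory before moving to the next, which keeps the number of files per directory bounded by l2.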
+ +\par + Client-side requests register themselves with a StoreEntry + to be notified when new data arrives. Multiple clients + may receive data via a single StoreEntry. For POST + and PUT requests, this process works in reverse. Server-side + functions are notified when additional data is read from + the client. + + */