Willy Tarreau [Sun, 18 Mar 2007 15:22:39 +0000 (16:22 +0100)]
[MAJOR] completed the HTTP response processing.
Now the response is correctly processed in the backend first
then in the frontend. It has followed intensive tests to
catch regressions, and everything seems OK now, but the code
is young anyway.
Willy Tarreau [Sun, 4 Mar 2007 17:17:17 +0000 (18:17 +0100)]
[MINOR] code factoring : capture_headers() serves requests and responses
Both request and response captures will have to parse headers following
the same methods. It's better to factorize the code, hence the new
capture_headers() function.
Willy Tarreau [Sat, 3 Mar 2007 12:54:32 +0000 (13:54 +0100)]
[CLEANUP] renamed several HTTP structures
Some parts of HTTP processing were incorrectly called "request" while
they are messages or transactions. The following structure members
have changed :
Willy Tarreau [Sun, 11 Feb 2007 23:59:08 +0000 (00:59 +0100)]
[MINOR] slightly optimize time calculation for rbtree
The new rbtree-based scheduler makes heavy use of tv_cmp2(), and
this function becomes a huge CPU eater. Refine it a little bit in
order to slightly reduce CPU usage.
Willy Tarreau [Thu, 1 Feb 2007 22:15:45 +0000 (23:15 +0100)]
[BUG] segfault on some erroneous configurations
If captures were configured in a TCP-only listener, and
the logs were enabled, the proxy could segfault when
trying to scan the capture buffer which was NULL. Such
an erroneous configuration will not be possible anymore
soon, but let's avoid the problem for now by detecting
the NULL condition.
Willy Tarreau [Fri, 26 Jan 2007 22:49:01 +0000 (23:49 +0100)]
[RELEASE] Released version 1.3.7 with the following changes :
- fix critical bug introduced with 1.3.6 : an empty request header
may lead to a crash due to missing pointer assignment
- hdr_idx might be left uninitialized in debug mode
- fixed build on FreeBSD due to missing fd_set declaration
Willy Tarreau [Fri, 26 Jan 2007 22:39:38 +0000 (23:39 +0100)]
[CRITICAL] an empty header may lead to a crash
A missing pointer assignment in case of an empty header
will result in this header's length being 65535, causing
a SEGV when accessing the next header. It should not be
possible to exploit this problem to run arbitrary code
because the crash occurs while reading the data.
Willy Tarreau [Thu, 25 Jan 2007 11:03:42 +0000 (12:03 +0100)]
[BUG] hdr_idx might be left uninitialized in some cases
When a request is invalid during RQ_BEFORE AND the debug mode is active,
the hdr_idx might be used uninitialized. Let's initialize it right after
the accept() for now.
Willy Tarreau [Wed, 24 Jan 2007 17:20:50 +0000 (18:20 +0100)]
[BUILD] fix build on FreeBSD (missing fd_set declaration)
Sorin Pop reported a patch to fix build on FreeBSD.
The file common/standard.h used an fd_set in a declaration
but did not include enough headers for it to be known.
Willy Tarreau [Mon, 22 Jan 2007 07:55:47 +0000 (08:55 +0100)]
[MAJOR] invalid header offset broke cookies and authentication
Since the request is no longer part of the headers, cookies and
authentication did not work anymore. Obvious fix is to add the
request offset to the start pointer.
Willy Tarreau [Sun, 21 Jan 2007 23:56:46 +0000 (00:56 +0100)]
[RELEASE] Released 1.3.6 with the following changes :
- stats now support the HEAD method too
- extracted http request from the session
- huge rework of the HTTP parser which is now a 28-state FSM.
- linux-style likely/unlikely macros for optimization hints
- do not create a server socket when there's no server
Willy Tarreau [Sun, 21 Jan 2007 18:16:41 +0000 (19:16 +0100)]
[MAJOR] huge rework of the HTTP request FSM
The HTTP parser has been rewritten for better compliance to RFC2616.
The same parser is now usable for both requests and responses, and
it now supports HTTP/0.9 as well as multi-line headers. It has also
been improved for speed ; a typicial HTTP request is parsed in about
2 microseconds on a 1 GHz processor.
The monitor-uri check has been moved so that the requests are not
logged. The httpclose option now tries to change as little as
possible in the request, and does not affect the first header if
it is already set to 'close'. HTTP/0.9 requests are converted to
HTTP/1.0 before being forwarded.
Headers and request transformations are now distinct. The headers
list is updated after each insertion/removal/transformation. The
request is re-parsed and checked after each transformation. It is
not possible anymore to remove a request, and requests which lead
to invalid request lines are now rejected.
Willy Tarreau [Sat, 20 Jan 2007 10:07:46 +0000 (11:07 +0100)]
[MINOR] do not create a socket if there is no server
Since the distinction of backends and frontends, it has become
possible that some requests reach a frontend which has no
backend parameters. We must not create a socket on the backend
side just to destroy it later in such a case. The real problem
comes from the dispatch mode not being explictly stated.
Willy Tarreau [Sun, 7 Jan 2007 14:46:13 +0000 (15:46 +0100)]
[MEDIUM] separate the http request from the session (step 1)
A struct http_req has been created to collect every information
related to an HTTP request being processed. Right now, it is
still in the struct session but the frontier is clear now.
Willy Tarreau [Sun, 7 Jan 2007 12:47:30 +0000 (13:47 +0100)]
[MEDIUM] Stats: add support for the HEAD method
There are browsers which sometimes send HEAD requests to the stats
page, but it was not handled so it returned a 503 server error or
was simply sent to the default backend servers.
Now with a HEAD request, the stats return the headers and finish
there. Normally, other methods should be blocked so that the stats
page really catches the whole URI. Other methods would need to cause
a 405 Method not allowed to be returned.
Willy Tarreau [Sun, 7 Jan 2007 01:47:01 +0000 (02:47 +0100)]
[RELEASE] Released 1.3.5 with the following major changes :
- added complete support and doc for TCP Splicing
- replaced the wait-queue linked list with an rbtree.
- stats: swap color sets for active and backup servers
- try to guess server check port when unset
- a few bugfixes and cleanups
Willy Tarreau [Sun, 7 Jan 2007 01:40:09 +0000 (02:40 +0100)]
[MINOR] try to guess server check port when unset
When a server has no port specified and there is a check
enabled on it, the check is disabled because the port is
unknown. However, people expect the "listen" line to set
the check port just like it sets the server's port. Now,
if a port is specified in the listen or in the first bind
and nowhere else, it will be used for the checks as well.
Willy Tarreau [Sat, 6 Jan 2007 23:38:00 +0000 (00:38 +0100)]
[MAJOR] replace the wait-queue linked list with an rbtree.
This patch from Sin Yu makes use of an rbtree for the wait queue,
which will solve the slowdown problem encountered when timeouts
are heterogenous in the configuration. The next step will be to
turn maintain_proxies() into a per-proxy task so that we won't
have to scan them all after each poll() loop.
Willy Tarreau [Sun, 7 Jan 2007 01:03:04 +0000 (02:03 +0100)]
[MAJOR] complete support and doc for tcp-splicing
The tcp-splicing code has been merged, and a doc has been written.
A configuration example has been derived from the previous content
switching sample.
Willy Tarreau [Sat, 6 Jan 2007 20:09:17 +0000 (21:09 +0100)]
[MINOR] the options table now sets the prerequisite checks
Some options will need some checks (or initializations) to be performed
before starting everything. The cfg_opts table has been extended to
allow storing of options-dependant checks.
Willy Tarreau [Mon, 1 Jan 2007 23:44:53 +0000 (00:44 +0100)]
[RELEASE] released 1.3.4
Released 1.3.4 with the following major changes :
- support for cttproxy on the server side to present the client
address to the server.
- added support for SO_REUSEPORT on Linux (needs kernel patch)
- new RFC2616-compliant HTTP request parser with header indexing
- split proxies in frontends, rulesets and backends
- implemented the 'req[i]setbe' to select a backend depending
on the contents
- added the 'default_backend' keyword to select a default BE.
- new stats page featuring FEs and BEs + bytes in both dirs
- improved log format to indicate the backend and the time in ms.
- lots of cleanups
Willy Tarreau [Mon, 1 Jan 2007 22:32:30 +0000 (23:32 +0100)]
[CRITICAL] fixed memory leak in session_free()
Since the introduction of hdr_idx, session_free() had not
been updated to free the header ! It implied a consumption
of about 400 bytes per new session.
Willy Tarreau [Mon, 1 Jan 2007 20:38:07 +0000 (21:38 +0100)]
[MAJOR] udpated the stats page to clearly distinguish FEs and BEs
The stats page could not tell the difference between a FE and a BE.
It has been revamped to indicate all relevant information. The font
is also slightly smaller in order for all the info to fit into small
screens. The data output path has been greatly simplified to use
string chunks.
Willy Tarreau [Sat, 30 Dec 2006 23:24:10 +0000 (00:24 +0100)]
[MEDIUM] use an array to store most common options
Most common options are now stored in an array which eases
the parsing and which also permits reporting of ignored
options depending on the proxy's capabilities (back/front).
Willy Tarreau [Sat, 30 Dec 2006 22:43:54 +0000 (23:43 +0100)]
[MINOR] option httpclose is now checked both in FE and BE
The "httpclose" option affects both frontend and backend, so it
was logical to check for its presence at both places. A request
which traverses either a frontend or a backend with this option
set will have a "Connection: close" header appended.
Willy Tarreau [Sat, 30 Dec 2006 10:54:15 +0000 (11:54 +0100)]
[MEDIUM] updated log format to report frontend and backend
The log format has been slightly updated to separately report the name
of the frontend and the name of the backend. The accept date has been
enhanced to report the millisecond. The number of remaining connections
has also been updated and their order reversed, to include the number
of connections on the frontend. The new log format is now :
- $1: IP:port
- $2: accept date in this format : [dd/mm/YYYY:HH:MM:SS.ttt]
- $3: frontend name
- $4: backend name '/' server name
- $5: req time '/' queue time '/' conn time '/' header time '/' total time
- $6: HTTP status code
- $7: number of bytes returned
- $8: captures (request)
- $9: captures (response)
- $10: completion flags
- $11: remaining conns on process '/' frontend '/' backend '/' server
- $12: srv queue size '/' backend queue size
- $13..: '"' full request '"'
Willy Tarreau [Fri, 29 Dec 2006 13:19:17 +0000 (14:19 +0100)]
[MAJOR] distinguish between frontend, backend, ruleset and listen
The notion of capabilities has been added to the proxy so that we
know whether a proxy supports frontend, backend, or rulesets. Given
this, some parameters are optionnal, some are ignored with a warning
and others are forbidden. It is now possible to write valid two level
configs without binding to dummy address/ports.
Willy Tarreau [Thu, 28 Dec 2006 23:10:33 +0000 (00:10 +0100)]
[MEDIUM] split fe->maxconn into fe->maxconn and be->fullconn
The maxconn argument is used only for the listeners, and the
fullconn is used only for the backends. If unset, it inherits
maxconn's value which itself can inherit the default or the
global value (we might need to change this).
Willy Tarreau [Wed, 27 Dec 2006 16:18:38 +0000 (17:18 +0100)]
[MEDIUM] session logging is now defined by the frontend
To solve the logging maze, it has been decided that the frontend
and nothing else will define how a session will be logged. It might
change in the future but at least this choice allows all sort of
fantasies.
Willy Tarreau [Sun, 24 Dec 2006 16:47:20 +0000 (17:47 +0100)]
[MEDIUM] errorloc now checked first from backend then from frontend
It is now possible to define an errorloc in the backend as well as
in the frontend. The backend's will be used first, and if undefined,
then the frontend's will be used instead. If none is used, then the
original error messages will be used.
Willy Tarreau [Sat, 23 Dec 2006 19:51:41 +0000 (20:51 +0100)]
[MINOR] store HTTP error messages into a chunk array
HTTP error messages were all specific cases handled by an IF.
Now they are all in an array so that it will be easier to add
new ones. Also, the return functions now use chunks as inputs
so that it should be easier to provide alternative return
messages if needed.
Willy Tarreau [Sat, 23 Dec 2006 10:12:04 +0000 (11:12 +0100)]
[BUILD] makefile now detects and uses git to set the version
If git is found during the build process, then it will be used
to set the version, the commit number and the commit date. This
way, it will not be needed anymore to update the code to change
the version. The version is the last tag, and the commit number
is the number of commits since the last tag.
Willy Tarreau [Sun, 17 Dec 2006 22:15:24 +0000 (23:15 +0100)]
[MAJOR] merged the 'setbe' actions to switch the backend on a regex
Sin Yu's patch to permit to change the proxy from a regex was merged
with little changes :
- req_cap/rsp_cap are not reassigned to the new proxy, they stay
attached to the frontend
- the actions have been renamed "reqsetbe" and "reqisetbe" for
"set BackEnd".
- the buffer is not reset after the switch, instead, the headers are
parsed again by the backend
- in Sin's patch, it was theorically possible to switch multiple times,
but the switching track was lost, making it impossible to apply
server responsesin the reverse order. Now switching is limited to
1 action (separation between frontend and backend) but the filters
remain.
Now it will be extremely easy to add other switching conditions, such
as host matching, URI matching, etc...
There's still a hard work to be done on the logs and stats.
Willy Tarreau [Sun, 17 Dec 2006 21:55:52 +0000 (22:55 +0100)]
[MEDIUM] tried to clean the logs up a little bit
The logs have become a real mess. It is now very hard to tell which
frontend/backend will impose its configuration for the logs. This
needs a complete rework but at least it should work.
Willy Tarreau [Sun, 17 Dec 2006 21:14:12 +0000 (22:14 +0100)]
[MEDIUM] separated nbconn into feconn and beconn
The nbconn attribute in the proxies was not relevant anymore because
a frontend A may use backend B and both of them must account for their
respective connections. For this reason, there now are two separate
counters for frontend and backend connections.
The stats page has been updated to reflect the backend, but a separate
line entry for the frontend with error counts would be good.
Note that as of now, beconn may be higher than maxconn, because maxconn
applies to the frontend, while beconn may be increased due to sessions
passed from another frontend.
Willy Tarreau [Sun, 17 Dec 2006 18:31:23 +0000 (19:31 +0100)]
[MAJOR] reworked ->be, ->fe and ->fi in sessions
There was a confusion about the way to find filters and backend
parameters from sessions. The chaining has been changed between
the session and the proxy.
Now, a session knows only two proxies : one frontend (->fe) and
one backend (->be). Each proxy has a link to the proxy providing
filters and to the proxy providing backend parameters (both self
by default).
The captures (cookies and headers) have been attached to the
frontend's filters for now.
The uri_auth and the statistics are attached to the backend's
filters so that the uri can depend on a hostname for instance.
Willy Tarreau [Sun, 17 Dec 2006 17:02:30 +0000 (18:02 +0100)]
[MINOR] add the fiprm and beprm indirections to struct proxy
A proxy will be able to borrow parameters from another one.
In particular, the filters will be inheritable from another
proxy, and the backend parameters too.
Willy Tarreau [Sun, 17 Dec 2006 13:52:38 +0000 (14:52 +0100)]
[MEDIUM] moved uri_auth check to a separate function
The check of uri_auth is now in a separate function which is
checked after every backend switch, so that it will be possible
to have an uri_auth for the frontend and another one for the
backend.
Willy Tarreau [Sun, 17 Dec 2006 11:05:00 +0000 (12:05 +0100)]
[MEDIUM] optimized the request parser a bit more
Some while() constructs are not very efficient with gcc, yet they are
used to scan all the text in the start line and the headers. Replacing
them with more efficient (but ugly) loops provides a global gain of
about 2%, which is not bad at all !
Willy Tarreau [Sun, 17 Dec 2006 07:37:22 +0000 (08:37 +0100)]
[MEDIUM] reorganized request handling to prepare for content-switching
The filters are now iterated for FE, FI, BE.
Some grey areas remain :
- uri_auth has been propagated to the backend, but in fact it
should be checked at every level (fe, fi, be), depending
where it is declared, and before the filters.
- the HTTP method and URI should be stored and propagated everywhere
they are used. For this, we would need to first apply filters to
be aware of filter changes which affect them.
- there seems to be no need anymore for hdr_idx[0] being empty.
It may contain the start line, which will slightly improve
performance and make the code easier to read.
Willy Tarreau [Sat, 16 Dec 2006 23:05:15 +0000 (00:05 +0100)]
[MEDIUM] move all HTTP Request-related session material to struct hreq
The req_cap, hdr_state, hdr_idx, auth_hdr and req_line have been moved
to a dedicated hreq structure in the session. It makes is easier to
add HTTP-specific fields such as SOR (start of request) and EOF (end
of headers).
It also made it possible to fix two bugs introduced by last commit :
- end of headers not correctly detected
- hdr_idx not freed upon one specific error during session creation
When the backend side will be reworked, it should rely on a similar
structure.
Willy Tarreau [Mon, 4 Dec 2006 23:05:46 +0000 (00:05 +0100)]
[MAJOR] finished replacement of the client-side HTTP parser with a new one
The code is working again, but not as clean as it could be.
Many blocks should still move to dedicated functions. req->h
must be removed everywhere and updated everytime needed.
A few functions or macros should take care of the headers
during header insertion/deletion/change.
Willy Tarreau [Mon, 4 Dec 2006 01:26:12 +0000 (02:26 +0100)]
[MAJOR] replaced the client-side HTTP parser with a new one
The new parser uses an FSM to strictly follow RFC2616.
Headers are indexed and parsed only once they're all available.
That way, complex regexes make more sense.
HTTP processing is now performed in several phases by calling
multiple functions, making the code cleaner and easier to read.
Note that req[i]pass does not work anymore because it would
require that we mark a header to be ignored. What is really
needed is to have the ability to add an exception to a matching
(match xx except yy).
Several bugs have been fixed in appsession during the conversion
to the new FSM (method length and recovery on malloc errors).
The code does build and work with the debug examples, but is
not usable yet to connect to anything as it does not forward
the requests yet.
Willy Tarreau [Sun, 3 Dec 2006 14:21:35 +0000 (15:21 +0100)]
[MEDIUM] added the hdr_idx structure for future HTTP header indexing
This structure will consume 4 bytes per header to keep track of
headers within a request or a response without having to parse
the whole request for each regex. As it's not possible to allocate
only 4 bytes, we define a max number of HTTP headers. We set it
to (BUFSIZE+79)/80 so that 8kB buffers can contain 100 headers
(like Apache), resulting in 400 bytes dedicated to indexation,
or about 400/(2*8kB) ~= 2.4% of the memory usage.
Willy Tarreau [Sat, 2 Dec 2006 19:12:09 +0000 (20:12 +0100)]
[BUG] implemented support for multi-line headers as required by RFC2616.
This patch was added in 1.2.9 but was then incidentely reverted by
manipulation error when merging next patch (enforce max number of
conns). It's now merged again.
Willy Tarreau [Thu, 30 Nov 2006 10:40:23 +0000 (11:40 +0100)]
[MAJOR] separate sess->proxy into sess->{fe,fi,be}
The references to the proxy from the session have been turned into
Frontend (fe), Filters (fi) and Backend (be). This should ease the
migration to the L7 switching features. Next step will be to kill
the struct proxy and have 3 independant structs instead, each
referenced from entities called listener, frontend, filters and
backend.
Willy Tarreau [Mon, 13 Nov 2006 00:22:38 +0000 (01:22 +0100)]
[MEDIUM] add support for SO_REUSEPORT on Linux
SO_REUSEPORT does not exist on Linux but the checks are available in
the code. With a little patch, it's possible to implement the feature,
but the value of SO_REUSEPORT will still have to be known from userland.
This patch adds a workaround to this problem by figuring out the value
for the one used by SO_REUSEADDR.
Willy Tarreau [Sun, 12 Nov 2006 22:57:19 +0000 (23:57 +0100)]
[MAJOR] support for source binding via cttproxy
Using the cttproxy kernel patch, it's possible to bind to any source
address. It is highly recommended to use the 03-natdel patch with the
other ones.
A new keyword appears as a complement to the "source" keyword : "usesrc".
The source address is mandatory and must be valid on the interface which
will see the packets. The "usesrc" option supports "client" (for full
client_ip:client_port spoofing), "client_ip" (for client_ip spoofing)
and any 'IP[:port]' combination to pretend to be another machine.
Right now, the source binding is missing from server health-checks if
set to another address. It must be implemented (think restricted firewalls).
The doc is still missing too.
Willy Tarreau [Sun, 15 Oct 2006 22:03:35 +0000 (00:03 +0200)]
[RELEASE] released 1.3.3
Released 1.3.3 with the following changes :
- fix broken redispatch option in case the connection has already
been marked "in progress" (ie: nearly always).
- support regparm on x86 to speed up some often called functions
- removed a few useless calls to gettimeofday() in log functions.
- lots of 'const char*' cleanups
- turn every FD_* into functions which are faster on recent CPUs
- builds again on OpenBSD and Solaris