This commit introduces HTTP compression using the zlib library.
http_response_forward_body has been modified to call the compression
functions.
This feature includes 3 algorithms: identity, gzip and deflate:
* identity: this is mostly for debugging, and it was useful for
developping the compression feature. With Content-Length in input, it
is making each chunk with the data available in the current buffer.
With chunks in input, it is rechunking, the output chunks will be
bigger or smaller depending of the size of the input chunk and the
size of the buffer. Identity does not apply any change on data.
* gzip: same as identity, but applying a gzip compression. The data
are deflated using the Z_NO_FLUSH flag in zlib. When there is no more
data in the input buffer, it flushes the data in the output buffer
(Z_SYNC_FLUSH). At the end of data, when it receives the last chunk in
input, or when there is no more data to read, it writes the end of
data with Z_FINISH and the ending chunk.
* deflate: same as gzip, but with deflate algorithm and zlib format.
Note that this algorithm has ambiguous support on many browsers and
no support at all from recent ones. It is strongly recommended not
to use it for anything else than experimentation.
You can't choose the compression ratio at the moment, it will be set to
Z_BEST_SPEED (1), as tests have shown very little benefit in terms of
compression ration when going above for HTML contents, at the cost of
a massive CPU impact.
Compression will be activated depending of the Accept-Encoding request
header. With identity, it does not take care of that header.
To build HAProxy with zlib support, use USE_ZLIB=1 in the make
parameters.
This work was initially started by David Du Colombier at Exceliance.
Willy Tarreau [Thu, 25 Oct 2012 17:04:45 +0000 (19:04 +0200)]
CLEANUP: http: rename HTTP_MSG_DATA_CRLF state
This state's name is confusing as it is only used with chunked encoding
and makes newcomers think it's also related to the content-length. Let's
call it CHUNK_CRLF to clear any doubt on this.
Willy Tarreau [Thu, 25 Oct 2012 22:58:22 +0000 (00:58 +0200)]
OPTIM: tools: inline hex2i()
This tiny function was not inlined because initially not much used.
However it's been used un the chunk parser for a while and it became
one of the most CPU-cycle eater there. By inlining it, the chunk parser
speed was increased by 74 %. We're almost 3 times faster than original
with just the last 4 commits.
Willy Tarreau [Thu, 25 Oct 2012 22:21:52 +0000 (00:21 +0200)]
OPTIM: channel: inline channel_forward's fast path
Most calls to channel_forward() are performed with short byte counts and
are already optimized in channel_forward() taking just a few instructions.
Thus it's a waste of CPU cycles to call a function for this, let's just
inline the short byte count case and fall back to the common one for
remaining situations.
Doing so has increased the chunked encoding parser's performance by 12% !
Cyril Bonté [Wed, 24 Oct 2012 22:01:06 +0000 (00:01 +0200)]
MEDIUM: http: accept IPv6 values with (s)hdr_ip acl
Commit ceb4ac9c states that IPv6 values are accepted by "hdr_ip" acl,
but the code didn't allow it. This patch provides the ability to accept IPv6
values.
Cyril Bonté [Tue, 23 Oct 2012 19:28:31 +0000 (21:28 +0200)]
BUG/MEDIUM: acls using IPv6 subnets patterns incorrectly match IPs
Some tests revealed that IPs not in the range of IPv6 subnets incorrectly
matched (for example "acl BUG src 2804::/16" applied to a src IP "127.0.0.1").
This is caused by the acl_match_ip() function applies a mask in host byte
order, whereas it should be in network byte order.
Willy Tarreau [Mon, 22 Oct 2012 21:17:18 +0000 (23:17 +0200)]
MEDIUM: cli: allow the stats socket to be bound to a specific set of processes
Using "stats bind-process", it becomes possible to indicate to haproxy which
process will get the incoming connections to the stats socket. It will also
shut down the warning when nbproc > 1.
Willy Tarreau [Mon, 22 Oct 2012 20:47:55 +0000 (22:47 +0200)]
BUG/MAJOR: connection: risk of crash on certain tricky close scenario
In some circumstances, if the connection to the server is aborted while
some data were planned to be sent and the poller reported an ability to
send, then conn_fd_handler() would still call conn->data->send(), causing
the data layer to dereference the now NULL conn->xprt and crash.
So we have to check for conn->xprt validity before calling the data
layer.
This issue was introduced after 1.5-dev12 so it does not need any backport
and does not affect any released version.
Special thanks go to Cristian Ditoiu who once again provided amazing help
to troubleshoot this bug !
Willy Tarreau [Mon, 22 Oct 2012 17:32:55 +0000 (19:32 +0200)]
MEDIUM: listener: provide a fallback for accept4() when not supported
It happens that on some systems, the libc is recent enough to permit
building with accept4() but the kernel does not support it. The result
is then a disaster since no connection is accepted. We now detect this
and automatically fall back to accept() and fcntl() when this happens.
Please consider the following patch for configuration.txt to clarify meaning
of bufsize, maxrewrite and the size of HTTP request which can be processed.
Willy Tarreau [Sat, 20 Oct 2012 08:38:09 +0000 (10:38 +0200)]
BUG/MEDIUM: http: set DONTWAIT on data when switching to tunnel mode
Jaroslaw Bojar diagnosed an issue when haproxy switches to tunnel mode
after a transfer. The response data are sent with the MSG_MORE flag,
causing them to be needlessly queued in the kernel. In order to fix this,
we set the CF_NEVER_WAIT flag on the channels when switching to tunnel
mode.
One issue remained with client-side keep-alive : if the response is sent
before the end of the request, it suffers the same issue for the same
reason. This is easily addressed by setting the CF_SEND_DONTWAIT flag
on the channel when the response has been parsed and we're waiting for
the other side.
The same issue is present in 1.4 so the fix must be backported.
Willy Tarreau [Fri, 19 Oct 2012 18:52:18 +0000 (20:52 +0200)]
MINOR: ssl: improve socket behaviour upon handshake abort.
While checking haproxy's SSL stack with www.ssllabs.com, it appeared that
immediately closing upon a failed handshake caused a TCP reset to be emitted.
This is because OpenSSL does not consume pending data in the socket buffers.
One side effect is that if the reset packet is lost, the client might not get
it. So now when a handshake fails, we try to clean the socket buffers before
closing, resulting in a clean FIN instead of an RST.
Willy Tarreau [Fri, 19 Oct 2012 17:49:09 +0000 (19:49 +0200)]
MEDIUM: sample: pass an empty list instead of a null for fetch args
ACL and sample fetches use args list and it is really not convenient to
check for null args everywhere. Now for empty args we pass a constant
list of end of lists. It will allow us to remove many useless checks.
Willy Tarreau [Fri, 19 Oct 2012 14:47:23 +0000 (16:47 +0200)]
MINOR: sample: accept fetch keywords without parenthesis
fetch keywords which support arguments do not support being called
without parenthesis even if all arguments are optional. Let's fix
this to allow fetch keywords without parenthesis as is already done
in ACLs.
Willy Tarreau [Fri, 19 Oct 2012 13:18:06 +0000 (15:18 +0200)]
MINOR: chunk: provide string compare functions
It's sometimes needed to be able to compare a zero-terminated string with a
chunk, so we now have two functions to do that, one strcmp() equivalent and
one strcasecmp() equivalent.
Willy Tarreau [Thu, 18 Oct 2012 16:57:14 +0000 (18:57 +0200)]
MEDIUM: ssl: add support for the "npn" bind keyword
The ssl_npn match could not work by itself because clients do not use
the NPN extension unless the server advertises the protocols it supports.
Thanks to Simone Bordet for the explanations on how to get it right.
Willy Tarreau [Sat, 13 Oct 2012 12:33:58 +0000 (14:33 +0200)]
OPTIM: connection: pack the struct target
The struct target contains one int and one pointer, causing it to be
64-bit aligned on 64-bit platforms. By marking it "packed", we can
save 8 bytes in struct connection and as many in struct session on
such platforms.
Willy Tarreau [Sat, 13 Oct 2012 09:09:14 +0000 (11:09 +0200)]
CLEANUP: session: remove term_trace which is not used anymore
This field was used to trace precisely where a session was terminated
but it did not survive code rearchitecture and was not used at all
anymore. Let's get rid of it.
Willy Tarreau [Sat, 13 Oct 2012 08:05:56 +0000 (10:05 +0200)]
OPTIM: channel: reorganize struct members to improve cache efficiency
Now that the buffer is moved out of the channel, it is possible to move
the pointer earlier in the struct and reorder some fields. This new
ordering improves overall performance by 2%, mainly saved in the HTTP
parsers and data transfers.
Willy Tarreau [Fri, 12 Oct 2012 21:49:43 +0000 (23:49 +0200)]
MAJOR: channel: replace the struct buffer with a pointer to a buffer
With this commit, we now separate the channel from the buffer. This will
allow us to replace buffers on the fly without touching the channel. Since
nobody is supposed to keep a reference to a buffer anymore, doing so is not
a problem and will also permit some copy-less data manipulation.
Interestingly, these changes have shown a 2% performance increase on some
workloads, probably due to a better cache placement of data.
Willy Tarreau [Fri, 12 Oct 2012 20:51:15 +0000 (22:51 +0200)]
CLEANUP: http: use 'chn' to name channel variables, not 'buf'
These "buf" were confusing as they were really refering to channels. At
most places, a buffer was really all what was needed, so a struct buffer
was used instead. It is possible that the performance has slightly increased
by the removal of pointer offset in many pointer operations by directly
using the buffer pointer instead of the channel pointer.
Willy Tarreau [Fri, 12 Oct 2012 18:17:54 +0000 (20:17 +0200)]
MEDIUM: log: report SSL ciphers and version in logs using logformat %sslc/%sslv
These two new log-format tags report the SSL protocol version (%sslv) and the
SSL ciphers (%sslc) used for the connection with the client. For instance, to
append these information just after the client's IP/port address information
on an HTTP log line, use the following configuration :
Willy Tarreau [Fri, 12 Oct 2012 16:01:49 +0000 (18:01 +0200)]
MEDIUM: log: add a new LW_XPRT flag to pin the transport layer
This flag will have to be set on log tags which require transport layer
information. They will prevent the conn_xprt_close() call from releasing
the transport layer too early.
Willy Tarreau [Fri, 12 Oct 2012 15:50:05 +0000 (17:50 +0200)]
MEDIUM: connection: add a flag to hold the transport layer
When we start logging SSL information, we need the SSL struct to be
present even past the conn_xprt_close() call. In order to achieve this,
we should use refcounting on the connection and the transport layer. At
the moment it's not worth using plain refcounting as only the logs require
this, so instead of real refcounting we just use a flag which will be set
by the log subsystem when SSL data need to be logged.
What happens then is that the xprt->close() call is ignored and the
transport layer is closed again during session_free(), after the log
line is emitted.
Willy Tarreau [Fri, 12 Oct 2012 15:42:13 +0000 (17:42 +0200)]
BUG/MEDIUM: session: enable the conn_session_update() callback
This callback was introduced by commit 9683e9a0 but never enabled because
the CO_FL_WAKE_DATA flag was not set. The result is that this function is
never called when an SSL handshake fails, so the connection is only closed
on timeout.
Willy Tarreau [Fri, 12 Oct 2012 15:36:40 +0000 (17:36 +0200)]
BUG/MINOR: session: fix some leftover from debug code
Commit 82569f91 moved the health and monitor-net checks to session.c
but a debug test introduced 0& to disable MSG_DONTWAIT in the recv()
call and this debug code remained there. Since the socket is marked
non-blocking, there should be no effect but it's dangerous to keep
such a thing here.
Willy Tarreau [Fri, 12 Oct 2012 15:00:05 +0000 (17:00 +0200)]
MEDIUM: connection: always unset the transport layer upon close
When calling conn_xprt_close(), we always clear the transport pointer
so that all transport layers leave the connection in the same state after
a close. This will also make it safer and cheaper to call conn_xprt_close()
multiple times if needed.
Willy Tarreau [Fri, 12 Oct 2012 12:56:11 +0000 (14:56 +0200)]
MEDIUM: log: suffix the frontend's name with '~' when using SSL
Until now it was not possible to know from the logs whether the incoming
connection was made over SSL or not. In order to address this in the existing
log formats, a new log format %ft was introduced, to log the frontend's name
suffixed with its transport layer. The only transport layer in use right now
is '~' for SSL, so that existing log formats for non-SSL traffic are not
affected at all, and SSL log formats have the frontend's name suffixed with
'~'.
The TCP, HTTP and CLF log format now use %ft instead of %f. This does not
affect existing log formats which still make use of %f however.
Emeric Brun [Thu, 11 Oct 2012 14:11:36 +0000 (16:11 +0200)]
MINOR: ssl: add statements 'verify', 'ca-file' and 'crl-file' on servers.
It now becomes possible to verify the server's certificate using the "verify"
directive. This one only supports "none" and "required", as it does not make
much sense to also support "optional" here.
Willy Tarreau [Wed, 10 Oct 2012 21:04:25 +0000 (23:04 +0200)]
MEDIUM: ssl: move "server" keyword SSL options parsing to ssl_sock.c
All SSL-specific "server" keywords are now processed in ssl_sock.c. At
the moment, there is no more "not implemented" hint when SSL is disabled,
but keywords could be added in server.c if needed.
Willy Tarreau [Wed, 10 Oct 2012 06:56:47 +0000 (08:56 +0200)]
MINOR: standard: make indent_msg() support empty messages
indent_msg() is called with dynamically generated messages, so these
may be empty (NULL) when an empty list is being dumped. Support this
and return a NULL too.
Willy Tarreau [Wed, 10 Oct 2012 06:27:36 +0000 (08:27 +0200)]
MINOR: server: add minimal infrastructure to parse keywords
Just like with the "bind" lines, we'll switch the "server" line
parsing to keyword registration. The code is essentially the same
as for bind keywords, with minor changes such as support for the
default-server keywords and support for variable argument count.
Willy Tarreau [Wed, 10 Oct 2012 14:49:28 +0000 (16:49 +0200)]
MINOR: halog: add a parameter to limit output line count
Sometimes it's useful to limit the output to a number of lines, for
example when output is already sorted (eg: 10 slowest URLs, ...). Now
we can use -m for this.
Willy Tarreau [Mon, 8 Oct 2012 18:11:03 +0000 (20:11 +0200)]
MEDIUM: listener: add support for linux's accept4() syscall
On Linux, accept4() does the same as accept() except that it allows
the caller to specify some flags to set on the resulting socket. We
use this to set the O_NONBLOCK flag and thus to save one fcntl()
call in each connection. The effect is a small performance gain of
around 1%.
The option is automatically enabled when target linux2628 is set, or
when the USE_ACCEPT4 Makefile variable is set. If the libc is too old
to provide the equivalent function, this is automatically detected and
our own function is used instead. In any case it is possible to force
the use of our implementation with USE_MY_ACCEPT4.
Willy Tarreau [Fri, 5 Oct 2012 20:41:26 +0000 (22:41 +0200)]
BUG/MAJOR: ensure that hdr_idx is always reserved when L7 fetches are used
Baptiste Assmann reported a bug causing a crash on recent versions when
sticking rules were set on layer 7 in a TCP proxy. The bug is easier to
reproduce with the "defer-accept" option on the "bind" line in order to
have some contents to parse when the connection is accepted. The issue
is that the acl_prefetch_http() function called from HTTP fetches relies
on hdr_idx to be preinitialized, which is not the case if there is no L7
ACL.
The solution consists in adding a new SMP_CAP_L7 flag to fetches to indicate
that they are expected to work on L7 data, so that the proxy knows that the
hdr_idx has to be initialized. This is already how ACL and HTTP mode are
handled.
Emeric Brun [Fri, 5 Oct 2012 13:47:31 +0000 (15:47 +0200)]
MINOR: ssl: add defines LISTEN_DEFAULT_CIPHERS and CONNECT_DEFAULT_CIPHERS.
These ones are used to set the default ciphers suite on "bind" lines and
"server" lines respectively, instead of using OpenSSL's defaults. These
are probably mainly useful for distro packagers.
Willy Tarreau [Fri, 5 Oct 2012 19:29:37 +0000 (21:29 +0200)]
BUG: connection: fix regression from commit 9e272bf9
Commit 9e272bf9 broke connection setup in TCP mode, the comment was
misleading and obviously wrong, as after a connection is established,
we *do* have none of the CONNECT* flags. However we can never have
them all at the same time, so let's use this to trigger a detection.
Willy Tarreau [Thu, 4 Oct 2012 22:04:16 +0000 (00:04 +0200)]
MEDIUM: checks: enable the PROXY protocol with health checks
When health checks are configured on a server which has the send-proxy
directive and no "port" nor "addr" settings, the health check connections
will automatically use the PROXY protocol. If "port" or "addr" are set,
the "check-send-proxy" directive may be used to force the protocol.
MAJOR: checks: completely use the connection transport layer
With this change, we now use the connection's transport layer to receive
and send data during health checks. It even becomes possible to send data
in multiple times, which was not possible before.
The transport layer used is the same as the one used for the traffic, unless
a specific address and/or port is specified for the checks using "port" or
"addr", in which case the transport layer defaults to raw_sock. An option
will be provided to force SSL checks on different IP/ports later.
Connection errors and timeouts are still reported.
Some situations where strerror() was able to report a precise error after
a failed connect() in the past might not be reported with as much precision
anymore, but the error message was already meaningless. During the tests,
no situation was found where a message became less precise.
MEDIUM: check: add the ctrl and transport layers in the server check structure
Since it's possible for the checks to use a different protocol or transport layer
than the prod traffic, we need to have them referenced in the server. The
SSL checks are not enabled yet, but the transport layers are completely used.
MEDIUM: checks: use real buffers to store requests and responses
Till now the request was made in the trash and sent to the network at
once, and the response was read into a preallocated char[]. Now we
allocate a full buffer for both the request and the response, and make
use of it.
Some of the operations will probably be replaced later with buffer macros
but the point was to ensure we could migrate to use the data layers soon.
One nice improvement caused by this change is that requests are now formed
at the beginning of the check and may safely be sent in multiple chunks if
needed.
REORG: server: move the check-specific parts into a check subsection
The health checks in the servers are becoming a real mess, move them
into their own subsection. We'll soon need to have a struct buffer to
replace the char * as well as check-specific protocol and transport
layers.
MAJOR: checks: make use of the connection layer to send checks
This is a first step, we now use the connection layer without the data
layers (send/recv are still used by hand). The connection is established
using tcp_connect_server() and raw_sock is assumed and forced for now.
fdtab is not manipulated anymore and polling is managed via the connection
layer.
It becomes quite clear that the server needs a second ->ctrl and ->xprt
dedicated to the checks.
Willy Tarreau [Thu, 4 Oct 2012 21:55:57 +0000 (23:55 +0200)]
MEDIUM: connection: add a new local send-proxy transport callback
This callback sends a PROXY protocol line on the outgoing connection,
with the local and remote endpoint information. This is used for local
connections (eg: health checks) where the other end needs to have a
valid address and no connection is relayed.
Willy Tarreau [Thu, 4 Oct 2012 22:10:55 +0000 (00:10 +0200)]
REORG: connection: move the PROXY protocol management to connection.c
It was previously in frontend.c but there is no reason for this anymore
considering that all the information involved is in the connection itself
only. Theorically this should be in the socket layer but we don't have
this yet.
Willy Tarreau [Thu, 4 Oct 2012 20:21:15 +0000 (22:21 +0200)]
MEDIUM: connection: automatically disable polling on error
We absolutely want to disable FD polling after an error is detected,
otherwise the data layer has to do it and it's far from being obvious
at these layers.
The way we did it was a bit tricky in conn_update_*_polling and
conn_*_polling_changes. However it has almost no impact on performance
and code size both for the fast and slow path.
We'll now be able to remove some flag updates in the stream interface.
Willy Tarreau [Thu, 4 Oct 2012 18:20:46 +0000 (20:20 +0200)]
MEDIUM: connection: it's not the data layer's role to validate the connection
Till now we used to perform the L4_CONN check in the data layer
(eg: stream interface) but that does not make sense, because some transport
layers will imply that the connection is opened (eg: SSL), and also because
the complexity to check for this is higher in the data layer than in the
transport layer. This is so much true that some read0 cases did not validate
the connection.
So as of now, the transport layer is responsible for clearing L4_CONN when
it detects an activity, and the data layer may safely rely on this flag. This
only impacts a minor change in raw_sock and stream_interface for now.
Willy Tarreau [Wed, 3 Oct 2012 19:17:23 +0000 (21:17 +0200)]
MEDIUM: session: register a data->wake callback to process errors
The connection layer will soon call ->wake() only when errors happen, and
not ->init(). So make the session layer use this callback to detect errors
and abort connections.
Willy Tarreau [Wed, 3 Oct 2012 19:12:16 +0000 (21:12 +0200)]
MEDIUM: connection: make it possible for data->wake to return an error
Just like ->init(), ->wake() may now be used to return an error and
abort the connection. Currently this is not used but will be with
embryonic sessions.
Willy Tarreau [Wed, 3 Oct 2012 19:04:48 +0000 (21:04 +0200)]
MEDIUM: connection: only call the data->wake callback on activity
We now check the connection flags for changes in order not to call the
data->wake callback when there is no activity. Activity means a change
on any of the CO_FL_*_SH, CO_FL_ERROR, CO_FL_CONNECTED, CO_FL_WAIT_CONN*
flags, as well as a call to data->recv or data->send.
Willy Tarreau [Wed, 3 Oct 2012 18:00:18 +0000 (20:00 +0200)]
MEDIUM: connection: reorganize connection flags
The connection flags have progressively been added one after the other
and were not very well organized. Some of them are often used together
and a number of operations are performed on the DATA/SOCK ENA/POL flags.
Thus, they have been reorganized so that flags that work together are
close to each other (allows immediate operands on ARM) and that polling
changes can be detected with fewer operations using a simple shift and
xor. The handshakes are now the last ones so that it will be easier to
add new ones after without risking a collision. All activity-related
flags are also grouped together.
Willy Tarreau [Tue, 2 Oct 2012 23:39:48 +0000 (01:39 +0200)]
MEDIUM: connection: use a generic data-layer init() callback
The generic data-layer init callback is now used after the transport
layer is complete and before calling the data layer recv/send callbacks.
This allows the session to switch from the embryonic session data layer
to the complete stream interface data layer, by making conn_session_complete()
the data layer's init callback.
It sill looks awkwards that the init() callback must be used opon error,
but except by adding yet another one, it does not seem to be mergeable
into another function (eg: it should probably not be merged with ->wake
to avoid unneeded calls during the handshake, though semantically that
would make sense).