Willy Tarreau [Sat, 17 Dec 2011 15:34:27 +0000 (16:34 +0100)]
BUG: http: re-enable TCP quick-ack upon incomplete HTTP requests
By default we disable TCP quick-acking on HTTP requests so that we
avoid sending a pure ACK immediately followed by the HTTP response.
However, if the client sends an incomplete request in a short packet,
its TCP stack might wait for this packet to be ACKed before sending
the rest of the request, delaying incoming requests by up to 40-200ms.
We can detect this undesirable situation when parsing the request :
- if an incomplete request is received
- if a full request is received and uses chunked encoding or advertises
a content-length larger than the data available in the buffer
In these situations, we re-enable TCP quick-ack if we had previously
disabled it.
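The decision described above can be sketched as follows. This is only an
illustration and not the actual haproxy code: need_quickack() and
set_quickack() are hypothetical helper names, and TCP_QUICKACK is a
Linux-specific socket option, so the sketch degrades to a no-op elsewhere.

```c
#include <assert.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: returns 1 when more data is expected from the
 * client, i.e. when delaying the ACK could stall the rest of the request.
 * content_length is 0 when no content-length was advertised.
 */
int need_quickack(int req_complete, int chunked,
                  long content_length, long body_bytes_buffered)
{
    assert(content_length >= 0 && body_bytes_buffered >= 0);

    if (!req_complete)
        return 1;               /* incomplete request */
    if (chunked)
        return 1;               /* chunked body: more data will follow */
    return content_length > body_bytes_buffered;
}

/* Hypothetical helper: switches quick-ack on or off for a socket.
 * Returns the setsockopt() result, or 0 where the option does not exist.
 */
int set_quickack(int fd, int on)
{
#ifdef TCP_QUICKACK
    return setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &on, sizeof(on));
#else
    (void)fd;
    (void)on;
    return 0;
#endif
}
```

A caller would invoke set_quickack(fd, 1) only when need_quickack() returns
1 and quick-ack had previously been disabled on that connection.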
Willy Tarreau [Mon, 12 Dec 2011 16:23:41 +0000 (17:23 +0100)]
MINOR: acl: add support for TLS server name matching using SNI
Server Name Indication (SNI) is a TLS extension which makes a client
present the name of the server it is connecting to in the client hello.
It allows a transparent proxy to take a decision based on the beginning
of an SSL/TLS stream without deciphering it.
The new ACL "req_ssl_sni" matches the name extracted from the TLS
handshake against a list of names which may be loaded from a file if
needed.
Willy Tarreau [Sun, 11 Dec 2011 21:37:06 +0000 (22:37 +0100)]
OPTIM: stream_sock: save a failed recv syscall when splice returns EAGAIN
When splice() returns EAGAIN, on old kernels it could be caused by a read
shutdown which was not detected. Due to this behaviour, we had to fall
back to recv(), which in turn says if it's a real EAGAIN or a shutdown.
Since this behaviour was fixed in 2.6.27.14, on more recent kernels we'd
prefer to avoid the fallback to recv() when possible. For this, we set a
variable the first time splice() detects a shutdown, to indicate that it
works. We can then rely on this variable to adjust our behaviour.
Doing this alone increased the overall performance by about 1% on medium
sized objects.
Willy Tarreau [Sun, 11 Dec 2011 21:11:47 +0000 (22:11 +0100)]
OPTIM: stream_sock: reduce the amount of in-flight spliced data
First, it's a waste not to call chk_snd() when spliced data are available,
because the pipe can almost always be transferred into the outgoing socket
buffers. Starting from now, when we splice data in, we immediately try to
send them. This results in fewer pipes used, and possibly less kernel memory
in use at once.
Second, if a pipe cannot be transferred into the outgoing socket buffers,
it means this buffer is full. There's no point trying again then, as space
will almost never be available, resulting in a useless syscall returning
EAGAIN.
Willy Tarreau [Mon, 14 Nov 2011 13:09:27 +0000 (14:09 +0100)]
BUG: ebtree: ebst_lookup() could return the wrong entry
(from ebtree 6.0.7)
Julien Thomas provided a reproducible test case where a string lookup
could return the wrong node. The issue is caused by the jump to a node
which contains fewer bits in common than the previous node, making the
string_equal_bits() function return -1. We must not remember more bits
than the number on the node, otherwise we can be tempted to trust them
while they can change while running down.
For a valid test case, enter : "0", "WW", "W", "S", and lookup "W".
Previously, "S" was returned.
Note: string-based ebtrees are used in haproxy in ACL, peers and
stick-tables. ACLs are not affected because all patterns are
interchangeable. stick-tables are not affected because lookups are
performed using ebmb_lookup(). Only peers might be affected, though
it is not easy to confirm or rule out the issue.
CLEANUP: ebtree: remove 4-year old harmless typo in duplicates insertion code
(from ebtree 6.0.7)
This typo has been there since we introduced duplicates. It refers to a
"struct eb_troot *", which is never declared anywhere, yet the compiler
apparently doesn't complain about it. Amazing...
CLEANUP: ebtree: clarify licence and update to 6.0.6
(from ebtree 6.0.6)
This version is mainly aimed at clarifying the fact that the ebtree license
is LGPL. Some files used to indicate LGPL and other ones GPL, while the goal
clearly is to have it LGPL. A LICENSE file has also been added.
No code is affected, but it's better to have the local tree in sync anyway.
CLEANUP: ebtree: remove a few annoying signedness warnings
(from ebtree 6.0.6)
Care has been taken not to make the code bigger (it even got smaller
due to a possible simplification).
(cherry picked from commit 7a2c1df646049c7daac52677ec11ed63048cd150)
Willy Tarreau [Wed, 30 Nov 2011 17:02:24 +0000 (18:02 +0100)]
BUG: tcp: option nolinger does not work on backends
Daniel Rankov reported that "option nolinger" is ineffective on backends.
The reason is that it is set on the file descriptor only, which does not
prevent haproxy from performing a clean shutdown() before closing. We must
set the flag on the stream_interface instead if we want an RST to be emitted
upon active close.
Willy Tarreau [Mon, 28 Nov 2011 12:40:49 +0000 (13:40 +0100)]
BUG: buffers: don't return a negative value on buffer_total_space_res()
In commit 4b517ca93aaaead8aa6143aa2836dc96417653c6 (MEDIUM: buffers:
add some new primitives and rework existing ones), we forgot to check
if buffer_max_len() < l.
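A minimal sketch of the missing check, assuming the free space is computed
as the maximum usable length minus the amount of data already stored.
total_space_res() here is a simplified stand-in taking plain integers, not
the real buffer_total_space_res() signature.

```c
#include <assert.h>

/* Simplified stand-in: max_len plays the role of buffer_max_len() and
 * len the role of the buffer's current data length. The result is
 * clamped so that it can never be negative.
 */
int total_space_res(int max_len, int len)
{
    assert(max_len >= 0 && len >= 0);

    if (max_len < len)
        return 0;               /* more data than allowed: no room left */
    return max_len - len;
}
```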
Willy Tarreau [Fri, 25 Nov 2011 19:33:58 +0000 (20:33 +0100)]
MEDIUM: buffers: add some new primitives and rework existing ones
A number of primitives were missing for buffer management, and some
of them were particularly awkward to use. Specifically, the functions
used to compute free space could not always be used depending on what was
wrapping in the buffers. Some documentation has been added about how
the buffers work and their properties. Some functions are still missing
such as a buffer replacement which would support wrapping buffers.
This patch removes the limitation to 2 loggers.
Loggers are now stored in linked lists.
Using "global log", the global loggers list content is added at the end
of the current proxy list. Each "log" entry is added at the end of
the proxy list.
Willy Tarreau [Mon, 31 Oct 2011 12:49:26 +0000 (13:49 +0100)]
MINOR: config: tolerate server "cookie" setting in non-HTTP mode
Up to now, if a cookie value was specified on a server when the proxy was
in TCP mode, it would cause a fatal error. Now we only report a warning,
since the cookie will be ignored. This makes it easier to generate configs
from scripts.
Willy Tarreau [Mon, 31 Oct 2011 10:53:20 +0000 (11:53 +0100)]
BUG/MEDIUM: checks: fix slowstart behaviour when server tracking is in use
Ludovic Levesque reported and diagnosed an annoying bug. When a server is
configured to track another one and has a slowstart interval set, it's
assigned a minimal weight when the tracked server goes back up but keeps
this weight forever.
This is because the throttling during the warmup phase is only computed
in the health checking function.
After several attempts to resolve the issue, the only real solution is to
split the check processing task in two tasks, one for the checks and one
for the warmup. Each server with a slowstart setting has a warmup task
which is responsible for updating the server's weight after a down to up
transition. The task does not run in other situations.
In the end, the fix is neither complex nor long and should be backported
to 1.4 since the issue was detected there first.
Willy Tarreau [Fri, 28 Oct 2011 13:35:33 +0000 (15:35 +0200)]
CLEANUP: rename possibly confusing struct field "tracked"
When reading the code, the "tracked" member of a server makes one
think the server is tracked while it's the opposite, it's a pointer
to the server being tracked. This is particularly true in constructs
such as :
if (srv->tracked) {
Since it's the second time I get caught misunderstanding it, let's
rename it to "track" to avoid the confusion.
Willy Tarreau [Fri, 28 Oct 2011 12:16:49 +0000 (14:16 +0200)]
BUG/MINOR: fix a segfault when parsing a config with undeclared peers
Baptiste Assmann reported that a config where a non-existing peers
section is referenced by a stick-table causes a segfault after displaying
the error. This is caused by the freeing of the peers. Setting it to NULL
after displaying the error fixes the issue.
Willy Tarreau [Mon, 24 Oct 2011 17:14:41 +0000 (19:14 +0200)]
MEDIUM: tune.http.maxhdr makes it possible to configure the maximum number of HTTP headers
For a long time, the max number of headers was taken as a part of the buffer
size. Since the header size can be configured at runtime, it does not make
much sense anymore.
Nothing was making it necessary to have a static value, so let's turn this into
a tunable with a default value of 101 which equals what was previously used.
Willy Tarreau [Mon, 24 Oct 2011 16:15:04 +0000 (18:15 +0200)]
OPTIM/MINOR: move the hdr_idx pools out of the proxy struct
It makes no sense to have one pointer to the hdr_idx pool in each proxy
struct since these pools do not depend on the proxy. Let's have a common
pool instead as it is already the case for other types.
Willy Tarreau [Sun, 23 Oct 2011 19:14:29 +0000 (21:14 +0200)]
OPTIM/MINOR: make it possible to change pipe size (tune.pipesize)
By default, pipes use the system's default size. But sometimes when
using TCP splicing, it can improve performance to increase pipe sizes,
especially if it is suspected that pipes are not filled and that many
calls to splice() are performed. This has an impact on the kernel's
memory footprint, so this must not be changed if impacts are not understood.
Willy Tarreau [Fri, 21 Oct 2011 16:51:57 +0000 (18:51 +0200)]
OPTIM/MINOR: move struct sockaddr_storage to the tail of structs
Struct sockaddr_storage is huge (128 bytes) and severely impacts the
cache. It also displaces other struct members, causing them to have
larger relative offsets. By moving these few occurrences to the end
of the structs which host them, we can reduce the code size by no less
than 2 kB !
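The effect on member offsets can be illustrated with two hypothetical
structs which are not taken from haproxy. With the address first, every
other member sits past the large address block. With the address last, the
frequently used members keep small offsets, which on x86 can be encoded
with shorter displacements.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical layout with the large address at the head. */
struct conn_addr_first {
    struct sockaddr_storage addr;
    int fd;
    unsigned int flags;
};

/* Same members with the large address moved to the tail. */
struct conn_addr_last {
    int fd;
    unsigned int flags;
    struct sockaddr_storage addr;
};

size_t flags_offset_addr_first(void)
{
    return offsetof(struct conn_addr_first, flags);
}

size_t flags_offset_addr_last(void)
{
    return offsetof(struct conn_addr_last, flags);
}

/* Checks that moving the address to the tail shrinks the offset of the
 * small members below the size of the address itself.
 */
void layout_selfcheck(void)
{
    assert(flags_offset_addr_first() > sizeof(struct sockaddr_storage));
    assert(flags_offset_addr_last() < sizeof(struct sockaddr_storage));
}
```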
Willy Tarreau [Mon, 17 Oct 2011 10:24:55 +0000 (12:24 +0200)]
DOC: indicate that cookie "prefix" and "indirect" should not be mixed
When prefix and indirect are used together, a client which connects to
a server with a cookie will never get any cookie update from this server,
because the update will be removed by the "indirect" option.
MINOR: remove the client/server side distinction in SI addresses
Stream interfaces used to distinguish between client and server addresses
because they were previously of different types (sockaddr_storage for the
client, sockaddr_in for the server). This is not the case anymore, and this
distinction is confusing at best and has caused a number of regressions to
be introduced in the process of converting everything to full-ipv6. We can
now remove this and have a much cleaner code.
BUG/MINOR: don't use a wrong port when connecting to a server with mapped ports
Nick Chalk reported that a connection to a server which has no port specified
used twice the port number. The reason is that the port number was taken from
the wrong part of the address, the client's destination address was used as the
base port instead of the server's configured address.
MINOR: acl: add new matches for header/path/url length
This patch introduces hdr_len, path_len and url_len for matching the
lengths of these respective parts against integers. This can be used to
detect abuse or empty headers.
BUG/MEDIUM: don't trim last spaces from headers consisting only of spaces
Commit 588bd4 fixed header parsing so that trailing spaces were not part
of the returned string. Unfortunately, if a header only had spaces, the
last spaces were trimmed past the beginning of the value, causing a negative
length to be returned.
A quick code review shows that there should be no impact since the only
places where the vlen is used either compare it to a specific value or
expect explicit contents (eg: digits).
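A minimal sketch of a trailing-space trim which cannot run past the
beginning of the value. trimmed_len() is a hypothetical helper, not the
function touched by this patch. The point is only that the loop is bounded
by the start of the value, so a header made only of spaces yields a length
of zero instead of a negative one.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the length of value[0..len) once trailing spaces and tabs are
 * dropped. The scan stops at index 0, so the result is never negative.
 */
size_t trimmed_len(const char *value, size_t len)
{
    assert(value != NULL);

    while (len > 0 && (value[len - 1] == ' ' || value[len - 1] == '\t'))
        len--;
    return len;
}
```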
Released version 1.5-dev7 with the following main changes :
- [BUG] fix binary stick-tables
- [MINOR] http: *_dom matching header functions now also split on ":"
- [BUG] checks: fix support of Mysqld >= 5.5 for mysql-check
- [MINOR] acl: add srv_conn acl to count connections on a specific backend server
- [MINOR] check: add redis check support
- [DOC] small fixes to clearly distinguish between keyword and variables
- [MINOR] halog: add support for termination code matching (-tcn/-TCN)
- [DOC] Minor spelling fixes and grammatical enhancements
- [CLEANUP] dumpstats: make symbols static where possible
- [MINOR] Break out dumping table
- [MINOR] Break out processing of clear table
- [MINOR] Allow listing of stick table by key
- [MINOR] Break out all stick table socat command parsing
- [MINOR] More flexible clearing of stick table
- [MINOR] Allow showing and clearing by key of ipv6 stick tables
- [MINOR] Allow showing and clearing by key of integer stick tables
- [MINOR] Allow showing and clearing by key of string stick tables
- [CLEANUP] Remove assigned but unused variables
- [CLEANUP] peers.h: fix declarations
- [CLEANUP] session.c: Make functions static where possible
- [MINOR] Add active connection list to server
- [MINOR] Allow shutdown of sessions when a server becomes unavailable
- [MINOR] Add down termination condition
- [MINOR] Make appsess{,ion}_refresh static
- [MINOR] Add rdp_cookie pattern fetch function
- [CLEANUP] Remove unnecessary casts
- [MINOR] Add non-stick server option
- [MINOR] Consistently use error in tcp_parse_tcp_req()
- [MINOR] Consistently free expr on error in cfg_parse_listen()
- [MINOR] Free rdp_cookie_name on deinit()
- [MINOR] Free tcp rules on deinit()
- [MINOR] Free stick table pool on deinit()
- [MINOR] Free stick rules on deinit()
- [MEDIUM] Fix stick-table replication on soft-restart
- [MEDIUM] Correct ipmask() logic
- [MINOR] Correct type in table dump examples
- [MINOR] Fix build error in stream_int_register_handler()
- [MINOR] Use DPRINTF in assign_server()
- [BUG] checks: http-check expect could fail a check on multi-packet responses
- [DOC] fix minor typo in the "dispatch" doc
- [BUG] proto_tcp: fix address binding on remote source
- [MINOR] http: don't report the "haproxy" word on the monitoring response
- [REORG] http: move HTTP error codes back to proto_http.h
- [MINOR] http: make the "HTTP 200" status code configurable.
- [MINOR] http: partially revert the chunking optimization for now
- [MINOR] stream_sock: always clear BF_EXPECT_MORE upon complete transfer
- [CLEANUP] stream_sock: remove unneeded FL_TCP and factor out test
- [MEDIUM] http: add support for "http-no-delay"
- [OPTIM] http: optimize chunking again in non-interactive mode
- [OPTIM] stream_sock: avoid fast-forwarding of partial data
- [OPTIM] stream_sock: don't use splice on too small payloads
- [MINOR] config: make it possible to specify a cookie even without a server
- [BUG] stats: support url-encoded forms
- [MINOR] config: automatically compute a default fullconn value
- [CLEANUP] config: remove some left-over printf debugging code from previous patch
- [DOC] add missing entry for stick store-response
- [MEDIUM] http: add support for 'cookie' and 'set-cookie' patterns
- [BUG] halog: correctly handle truncated last line
- [MINOR] halog: make SKIP_CHAR stop on field delimiters
- [MINOR] halog: add support for HTTP log matching (-H)
- [MINOR] halog: gain back performance before SKIP_CHAR fix
- [OPTIM] halog: cache some common fields positions
- [OPTIM] halog: check once for correct line format and reuse the pointer
- [OPTIM] halog: remove many 'if' by using a function pointer for the filters
- [OPTIM] halog: remove support for tab delimiters in input data
- [BUG] session: risk of crash on out of memory (1.5-dev regression)
- [MINOR] session: try to emit a 500 response on memory allocation errors
- [OPTIM] stream_sock: reduce the default number of accepted connections at once
- [BUG] stream_sock: disable listener when system resources are exhausted
- [MEDIUM] proxy: add a PAUSED state to listeners and move socket tricks out of proxy.c
- [BUG] stream_sock: ensure orphan listeners don't accept too many connections
- [MINOR] listeners: add listen_full() to mark a listener full
- [MINOR] listeners: add support for queueing resource limited listeners
- [MEDIUM] listeners: put listeners in queue upon resource shortage
- [MEDIUM] listeners: queue proxy-bound listeners at the proxy's
- [MEDIUM] listeners: don't stop proxies when global maxconn is reached
- [MEDIUM] listeners: don't change listeners states anymore in maintain_proxies
- [CLEANUP] proxy: rename a few proxy states (PR_STIDLE and PR_STRUN)
- [MINOR] stats: report a "WAITING" state for sockets waiting for resource
- [MINOR] proxy: make session rate-limit more accurate
- [MINOR] sessions: only wake waiting listeners up if rate limit is OK
- [BUG] proxy: peers must only be stopped once, not upon every call to maintain_proxies
- [CLEANUP] proxy: merge maintain_proxies() operation inside a single loop
- [MINOR] task: new function task_schedule() to schedule a wake up
- [MAJOR] proxy: finally get rid of maintain_proxies()
- [BUG] proxy: stats frontend and peers were missing many initializers
- [MEDIUM] listeners: add a global listener management task
- [MINOR] proxy: make findproxy() return proxies from numeric IDs too
- [DOC] fix typos, "#" is a sharp, not a dash
- [MEDIUM] stats: add support for changing frontend's maxconn at runtime
- [MEDIUM] checks: group health checks methods by values and save option bits
- [MINOR] session-counters: add the ability to clear the counters
- [BUG] check: http-check expect + regex would crash in defaults section
- [MEDIUM] http: make x-forwarded-for addition conditional
- [REORG] build: move syscall redefinition to specific places
- [CLEANUP] update the year in the copyright banner
- [BUG] possible crash in 'show table' on stats socket
- [BUG] checks: use the correct destination port for sending checks
- [BUG] backend: risk of picking a wrong port when mapping is used with crossed families
- [MINOR] make use of set_host_port() and get_host_port() to get rid of family mismatches
- [DOC] fixed a few "sensible" -> "sensitive" errors
- [MINOR] make use of addr_to_str() and get_host_port() to replace many inet_ntop()
- [BUG] http: trailing white spaces must also be trimmed after headers
- [MINOR] stats: display "<NONE>" instead of the frontend name when unknown
- [MINOR] http: take a capture of too large requests and responses
- [MINOR] http: take a capture of truncated responses
- [MINOR] http: take a capture of bad content-lengths.
- [DOC] add a few old and uncommitted docs
- [CLEANUP] cfgparse: fix reported options for the "bind" keyword
- [MINOR] halog: add -hs/-HS to filter by HTTP status code range
- [MINOR] halog: support backslash-escaped quotes
- [CLEANUP] remove dirty left-over of a debugging message
- [MEDIUM] stats: disable complex socket reservation for stats socket
- [CLEANUP] remove a useless test in manage_global_listener_queue()
- [MEDIUM] stats: add the "set maxconn" setting to the command line interface
- [MEDIUM] add support for global.maxconnrate to limit the per-process conn rate.
- [MINOR] stats: report the current and max global connection rates
- [MEDIUM] stats: add the ability to adjust the global maxconnrate
- [BUG] peers: don't pre-allocate 65000 connections to each peer
- [MEDIUM] don't limit peers nor stats socket to maxconn nor maxconnrate
- [BUG] peers: the peer frontend must not emit any log
- [CLEANUP] proxy: make pause_proxy() perform the required controls and emit the logs
- [BUG] peers: don't keep a peers section which has a NULL frontend
- [BUG] peers: ensure the peers are resumed if they were paused
- [MEDIUM] stats: add the ability to enable/disable/shutdown a frontend at runtime
- [MEDIUM] session: make session_shutdown() an independent function
- [MEDIUM] stats: offer the possibility to kill a session from the CLI
- [CLEANUP] stats: centralize tests for backend/server inputs on the CLI
- [MEDIUM] stats: offer the possibility to kill sessions by server
- [MINOR] halog: do not consider byte 0x8A as end of line
- [MINOR] frontend: ensure debug message length is always initialized
- [OPTIM] halog: make fgets parse more bytes by blocks
- [OPTIM] halog: add assembly version of the field lookup code
- [MEDIUM] poll: add a measurement of idle vs work time
- [CLEANUP] startup: report only the basename in the usage message
- [MINOR] startup: add an option to change to a new directory
- [OPTIM] task: don't scan the run queue if we know it's empty
- [BUILD] stats: stdint is not present on solaris
- [DOC] update the README file to reflect new naming rules for patches
- [MINOR] stats: report the number of requests intercepted by the frontend
- [DOC] update ROADMAP file
[MINOR] stats: report the number of requests intercepted by the frontend
These requests are mainly monitor requests, as well as stats requests when
the stats are processed by the frontend. Having this counter helps explain
the difference in number of sessions that is sometimes observed between a
frontend and a backend.
[MINOR] startup: add an option to change to a new directory
Passing -C <dir> causes haproxy to chdir to <dir> before loading
any file. The argument may be passed anywhere on the command line.
A typical use case is :
[MEDIUM] poll: add a measurement of idle vs work time
We now measure the work and idle times in order to report the idle
time in the stats. It's expected that we'll be able to use it at
other places later.
[OPTIM] halog: add assembly version of the field lookup code
Gcc tries to be a bit too smart in these small loops and the result is
that on i386 we waste a lot of time there. By recoding these loops in
assembly, we save up to 23% total processing time on i386! The savings
on x86_64 are much lower, probably because there are more registers and
gcc has to do fewer tricks. However, those savings vary a lot between gcc
versions and even cause harm on some of them (eg: 4.4) because gcc does
not know how to optimize the code once inlined.
However, by recoding field_start() in C to try to match the assembly
code as much as possible, we can significantly reduce its execution
time without risking the negative impacts. Thus, the assembly version
is less interesting there but still worth being used on some compilers.
[OPTIM] halog: make fgets parse more bytes by blocks
By adding a "landing area" at the end of the buffer, it becomes safe to
parse more bytes at once. On 32-bit this makes fgets run about 4% faster
but it does not save anything on 64-bit.
[MINOR] http: *_dom matching header functions now also split on ":"
*_dom is mostly used for matching Host headers, and host headers may
include port numbers. To avoid having to create multiple rules with
and without :<port-number> in hdr_dom rules, change the *_dom
matching functions to also handle : as a delimiter.
Typically there are rules like this in haproxy.cfg:
acl is_foo hdr_dom(host) www.foo.com
Most clients send "Host: www.foo.com" in their HTTP header, but some
send "Host: www.foo.com:80" (which is allowed), and the above
rule will now work for those clients as well.
[Note: patch was edited before merge, any unexpected bug is mine /willy]
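The matching rule can be sketched as below. This is a simplified
illustration, not the haproxy implementation: dom_match() is a hypothetical
name and the delimiter set is reduced to "." and ":" for the example. A
pattern matches when it appears in the value and is bounded on each side by
a delimiter or by the start or end of the value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Delimiters assumed for this sketch only. */
static int is_delim(char c)
{
    return c == '.' || c == ':';
}

/* Returns 1 if pat appears in value as a whole delimited component. */
int dom_match(const char *value, const char *pat)
{
    size_t plen;
    const char *p;

    assert(value != NULL && pat != NULL);

    plen = strlen(pat);
    if (plen == 0)
        return 0;

    for (p = value; (p = strstr(p, pat)) != NULL; p++) {
        int left_ok = (p == value) || is_delim(p[-1]);
        int right_ok = (p[plen] == '\0') || is_delim(p[plen]);

        if (left_ok && right_ok)
            return 1;
    }
    return 0;
}
```

With ":" accepted as a delimiter, "www.foo.com:80" matches the pattern
"www.foo.com", while "www.foo.community" still does not.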
[MINOR] halog: do not consider byte 0x8A as end of line
A bug in the algorithm used to find an LF in multiple bytes at once
made byte 0x80 trigger detection of byte 0x00, thus 0x8A matches byte
0x0A. In practice, this issue never happens since byte 0x8A won't be
displayed in logs (or it will be encoded). This could still possibly
happen in mixed logs.
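This class of bug can be reproduced with a common word-at-a-time zero-byte
test. The code below is an assumption about the kind of algorithm involved
and is not copied from halog. The naive variant ignores bit 7 of each byte,
so 0x80 is reported as 0x00, and once the word is XORed with 0x0A0A0A0A,
byte 0x8A is reported as an LF.

```c
#include <assert.h>
#include <stdint.h>

/* Naive test: non-zero if any byte of v is 0x00, but also if it is 0x80,
 * because bit 7 of each byte is masked out before the addition.
 */
static inline uint32_t has_zero_naive(uint32_t v)
{
    return ~((v & 0x7F7F7F7FU) + 0x7F7F7F7FU) & 0x80808080U;
}

/* Exact test: OR-ing v back in rejects bytes which have bit 7 set. */
static inline uint32_t has_zero(uint32_t v)
{
    return ~(((v & 0x7F7F7F7FU) + 0x7F7F7F7FU) | v) & 0x80808080U;
}

/* An LF (0x0A) becomes 0x00 once XORed with 0x0A. */
static inline uint32_t has_lf_naive(uint32_t v)
{
    return has_zero_naive(v ^ 0x0A0A0A0AU);
}

static inline uint32_t has_lf(uint32_t v)
{
    return has_zero(v ^ 0x0A0A0A0AU);
}

/* Shows the false positive on 0x8A and its absence in the exact test. */
void lf_scan_selfcheck(void)
{
    assert(has_lf_naive(0x4141418AU) != 0); /* false positive */
    assert(has_lf(0x4141418AU) == 0);       /* 0x8A is not an LF */
    assert(has_lf(0x41410A41U) != 0);       /* real LF is found */
}
```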
[MEDIUM] session: make session_shutdown() an independent function
We already had the ability to kill a connection, but it was only
for the checks. Now we can do this for any session, and for this we
add a specific flag "K" to the logs.
[MEDIUM] stats: add the ability to enable/disable/shutdown a frontend at runtime
The stats socket now allows the admin to disable, enable or shutdown a frontend.
This can be used when a bug is discovered in a configuration and it's desirable
to fix it but the rules in place don't allow changing a running config. Thus it
becomes possible to kill the frontend to release the port and start a new one in
a separate process.
This can also be used to temporarily make haproxy return TCP resets to incoming
requests to pretend the service is not bound. For instance, this may be useful
to quickly flush a very deep SYN backlog.
The frontend check and lookup code was factored with the "set maxconn" usage.
[BUG] peers: ensure the peers are resumed if they were paused
Upon an incoming soft restart request, we first pause all frontends and
peers. If the caller changes its mind and asks us to resume (eg: failed
binding), we must resume all the frontends and peers. Unfortunately the
peers were not resumed.
The code was arranged to avoid code duplication (which used to hide the
issue till now).
[BUG] peers: don't keep a peers section which has a NULL frontend
If a peers section has no peer named as the local peer, we must destroy
it, otherwise a NULL peer frontend remains in the lists and a segfault
can happen upon a soft restart.
We also now report the missing peer name in order to help troubleshooting.
[BUG] peers: the peer frontend must not emit any log
Peers' frontends must have logging disabled by default, which was not
the case, so logs were randomly emitted upon restart, sometimes causing
a new process to fail to replace the old one.
[BUG] peers: don't pre-allocate 65000 connections to each peer
This made sense a long time ago but since the maxconn is dynamically
computed from the tracking tables, it does not make any sense anymore
and will harm future changes.
[MINOR] stats: report the current and max global connection rates
The HTML page reports the current process connection rate, and the
"show info" command on the stats socket also reports the conn rate
limit and the max conn rate that was once reached.
Note that the max value can be cleared using "clear counters".
[MEDIUM] add support for global.maxconnrate to limit the per-process conn rate.
This one enforces a per-process connection rate limit, regardless of what
may be set per frontend. It can be a way to limit the CPU usage of a process
being severely attacked.
The side effect is that the global process connection rate is now measured
for each incoming connection, so it will be possible to report it.
[MEDIUM] stats: add the "set maxconn" setting to the command line interface
This option makes it possible to change the global maxconn setting within
the limit that was set by the initial value, which is now reported as the
hard maxconn value. This makes it possible to immediately accept more
concurrent connections or to stop accepting new ones until the number of
connections drops below the indicated setting.
The main use of this option is on systems where many haproxy instances
are loaded and admins need to re-adjust resource sharing at run time
to regain a bit of fairness between processes.
[MEDIUM] stats: disable complex socket reservation for stats socket
The way the unix socket is initialized is awkward. Some of the settings are put
in the socket itself, other ones in the backend. And more importantly the
global.maxsock value is adjusted so that the stats socket evades the global
maxconn value. This complicates maxsock computations for nothing, since the
stats socket is not supposed to receive hundreds of concurrent connections when
the global maxconn is very low. What is needed however is to ensure that there
are always connections left for the stats socket even when traffic sockets are
saturated, but this guarantee is not offered anymore by current code.
So as of now, the stats socket is subject to the global maxconn limitation just
as any other socket until a reservation mechanism is implemented.
[MINOR] halog: support backslash-escaped quotes
Some syslog servers escape quotes, which makes the resulting logs unusable
for URL processing since the parser looks for the first field beginning
with a quote. It now also supports fields starting with a backslash and
a quote in order to address this. No performance impact was measured.
[MINOR] halog: add -hs/-HS to filter by HTTP status code range
The code was merged with the error code checking which is very similar and
which shares the same information. The new test adds about 1% slowdown to
error checking but makes it more reliable when facing wrongly formatted
status codes.
[MINOR] http: take a capture of bad content-lengths.
Sometimes a bad content-length header is encountered and this causes
an abort. It's hard to debug without a trace, so let's take a capture
of the contents when this happens.
[MINOR] http: take a capture of truncated responses
If a server starts to respond but stops before the body, then we
capture the truncated response. We don't do this on the request
because it would happen too often upon stupid attacks.
[BUG] http: trailing white spaces must also be trimmed after headers
Trailing spaces after headers were not trimmed, only the leading ones
were. An issue was detected today with a content-length value which
was padded with spaces and which was rejected. Recent updates to the
http-bis draft made it a lot more clear that such spaces must be ignored,
so this is what this patch does.
[MINOR] make use of addr_to_str() and get_host_port() to replace many inet_ntop()
Many inet_ntop calls were partially right, which was hard to detect given
the complex combinations. Some of them were relying on the listener's proto
instead of the address itself, which could have been different when dealing
with an accept-proxy connection.
The new addr_to_str() function does the dirty job and returns the family, which
makes it particularly suited to calls from switch/case statements. A large number
of if/else statements were removed and the stats output could even be cleaned up
in the case of session dump.
As a side effect of doing this, the resulting code is smaller by almost 1kB.
All changed parts have been tested and provided expected output.
Willy Tarreau [Sat, 27 Aug 2011 10:07:49 +0000 (12:07 +0200)]
[BUG] backend: risk of picking a wrong port when mapping is used with crossed families
A similar issue as the previous one causes port mapping to fail in some
combinations of client and server address families. Using the macros fixes
the issue.
Willy Tarreau [Sat, 27 Aug 2011 09:51:36 +0000 (11:51 +0200)]
[BUG] checks: use the correct destination port for sending checks
Among the many switch/case statements added for the IPv6 changes,
one was wrong and caused the check port to be ignored for outgoing
connections because the socket's family was not taken at the right
place. Use the set_host_port() macro instead to fix the issue.
The same cleanup could be performed at a number of other places
and should follow shortly.
Special thanks to Stephane Bakhos of Techboom for reporting a
detailed analysis of this bug.
Willy Tarreau [Wed, 24 Aug 2011 06:23:34 +0000 (08:23 +0200)]
[BUG] possible crash in 'show table' on stats socket
Patch d5b9fd95 was missing an initialisation of "ctx.table.target", which caused
"show table" to segfault if it was issued after a "show errors" (target pointer == -1).
Willy Tarreau [Mon, 22 Aug 2011 15:12:02 +0000 (17:12 +0200)]
[REORG] build: move syscall redefinition to specific places
Some older libc don't define splice() and don't define _syscall*()
either, which causes build errors if splicing is enabled.
To solve this, we now split the syscall redefinition into two layers :
- one file per syscall (epoll, splice)
- one common file to declare the _syscall*() macros
The code is cleaner because files using the syscalls just have to include
their respective file. It's not advised to merge multiple syscall families
into the same file if all are not intended to be used simultaneously, because
defining unused static functions causes warnings to be emitted during build.
As a result, the new USE_MY_SPLICE parameter was added in order to be able
to define the splice() syscall separately.
Willy Tarreau [Fri, 19 Aug 2011 20:57:24 +0000 (22:57 +0200)]
[MEDIUM] http: make x-forwarded-for addition conditional
If "option forwardfor" has the "if-none" argument, then the header is
only added when the request did not already have one. This option has
security implications, and should not be set blindly.
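The decision described above can be sketched as follows. This is only an
illustration: demo_should_add_xff() and its arguments are hypothetical and
do not reflect HAProxy's internal header handling. Note that an existing
header may have been supplied by the client itself, which is one reason
the option should not be set blindly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <strings.h>

/* Hypothetical helper: decide whether an X-Forwarded-For header should be
 * appended, given the names of the headers already present in the request.
 * Header names are compared case-insensitively.
 */
static bool demo_should_add_xff(const char *const *hdr_names, size_t count,
                                bool if_none)
{
    size_t i;

    if (!if_none)
        return true;        /* without "if-none": always add the header */

    for (i = 0; i < count; i++) {
        if (strcasecmp(hdr_names[i], "X-Forwarded-For") == 0)
            return false;   /* one is already present: leave it alone */
    }
    return true;            /* none found: add ours */
}
```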
Willy Tarreau [Fri, 19 Aug 2011 18:04:17 +0000 (20:04 +0200)]
[BUG] check: http-check expect + regex would crash in defaults section
Manoj Kumar reported a case where haproxy would crash upon start-up. The
cause was an "http-check expect" statement declared in the defaults section,
which caused a NULL regex to be used during the check. This statement is not
allowed in defaults sections precisely because this requires saving a copy
of the regex in the default proxy. But the check was not made to prevent it
from being declared there, hence the issue.
Instead of adding code to detect its abnormal use, we decided to implement
it. It was not that complex because the expect_str part was not used
with regexes, so it could hold the string form of the regex in order to
compile it again for every backend (there's no way to clone regexes).
This patch has been tested and works. So it's both a bugfix and a minor
feature enhancement.
It should be backported to 1.4 though it's not critical since the config
was not supposed to be supported.
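The approach of keeping the string form and recompiling it for every
backend can be sketched with POSIX regexes. The structure and function
names below are hypothetical and the real code differs; the point is only
that each backend owns its own compiled regex built from the shared
string.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical stand-in for the check settings of a proxy. */
struct demo_check {
    const char *expect_str;  /* string form of the regex, can be shared */
    regex_t expect_regex;    /* compiled form, private to each backend */
    int compiled;            /* non-zero once expect_regex is usable */
};

/* Inherit the pattern from the defaults and compile a private copy.
 * Returns 0 on success, -1 if there is no pattern or it fails to compile.
 */
static int demo_inherit_expect(struct demo_check *be,
                               const struct demo_check *defaults)
{
    be->expect_str = defaults->expect_str;
    be->compiled = 0;
    if (be->expect_str == NULL)
        return -1;
    if (regcomp(&be->expect_regex, be->expect_str,
                REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    be->compiled = 1;
    return 0;
}

/* Returns non-zero if the response matches the backend's compiled regex. */
static int demo_expect_matches(const struct demo_check *be,
                               const char *response)
{
    return be->compiled &&
           regexec(&be->expect_regex, response, 0, NULL, 0) == 0;
}
```

Each backend that compiled a regex is expected to release it with
regfree() when it is no longer needed.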
Simon Horman [Fri, 12 Aug 2011 23:03:48 +0000 (08:03 +0900)]
[MEDIUM] Fix stick-table replication on soft-restart
"[MINOR] session: add a pointer to the new target into the session" (664beb8)
introduced a regression by changing the type of a peer's target from
TARG_TYPE_PROXY to TARG_TYPE_NONE. The effect of this is that during
a soft-restart the new process no longer tries to connect to the
old process to replicate its stick tables.
This patch sets the type of a peer's target as TARG_TYPE_PROXY and
replication on soft-restart works once again.
Hervé COMMOWICK [Wed, 10 Aug 2011 15:42:41 +0000 (17:42 +0200)]
[MINOR] halog: add support for termination code matching (-tcn/-TCN)
It is now possible to filter by termination code with -tcn <termcode>, to be
able to track one kind of error, for example after counting it with -tc.
Using -TCN <termcode> gives you the opposite.
Willy Tarreau [Sat, 6 Aug 2011 15:05:02 +0000 (17:05 +0200)]
[MEDIUM] checks: group health checks methods by values and save option bits
Adding health checks has become a real pain, with cross-references to all
checks everywhere because they're all a single bit. Since they're all
exclusive, let's change this to have a check number only. We reserve 4
bits allowing up to 16 checks (15+tcp), only 7 of which are currently
used. The code has shrunk by almost 1kB and we saved a few option bits.
The "dispatch" option has been moved to px->options, making a few tests
a bit cleaner.
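A minimal sketch of storing an exclusive check number in a 4-bit field
instead of one bit per check. All names and values here are hypothetical
and are not the constants used in the actual code.

```c
#include <stdint.h>

/* Hypothetical layout: bits 0-3 of the options word hold the check number,
 * the remaining bits stay available for unrelated options.
 */
#define DEMO_CHK_MASK   0x0000000Fu
#define DEMO_CHK_TCP    0x0u    /* plain TCP check, no specific method */
#define DEMO_CHK_HTTP   0x1u
#define DEMO_CHK_REDIS  0x2u
#define DEMO_OPT_OTHER  0x10u   /* some unrelated option bit */

/* Replace the check number while preserving all other option bits. */
static uint32_t demo_set_check(uint32_t options, uint32_t check)
{
    return (options & ~DEMO_CHK_MASK) | (check & DEMO_CHK_MASK);
}

/* Extract the check number from the options word. */
static uint32_t demo_get_check(uint32_t options)
{
    return options & DEMO_CHK_MASK;
}
```

Since only one number fits in the field, selecting a new check method
implicitly clears the previous one, which removes the need for
cross-references between checks.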
Hervé COMMOWICK [Fri, 5 Aug 2011 14:23:48 +0000 (16:23 +0200)]
[MINOR] check: add redis check support
This patch provides a new "option redis-check" statement to enable server
health checks based on the redis PING request
(http://www.redis.io/commands/ping).
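A PING-based check can be sketched as below. The request is written in the
Redis multi-bulk form and the server is expected to answer "+PONG"; the
exact bytes sent by the patch are an assumption here, and the demo_* names
are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Assumed request: a one-element multi-bulk command containing "PING". */
static const char demo_redis_ping[] = "*1\r\n$4\r\nPING\r\n";

/* Returns non-zero if the buffer starts with the expected "+PONG" reply. */
static int demo_redis_reply_ok(const char *buf, size_t len)
{
    static const char pong[] = "+PONG\r\n";
    size_t n = sizeof(pong) - 1;    /* length without the trailing NUL */

    return len >= n && memcmp(buf, pong, n) == 0;
}
```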
Willy Tarreau [Tue, 2 Aug 2011 09:49:05 +0000 (11:49 +0200)]
[MEDIUM] stats: add support for changing frontend's maxconn at runtime
The new "set maxconn frontend XXX" statement on the stats socket allows
the admin to change a frontend's maxconn value. If some connections are
queued, they will immediately be accepted up to the new limit. If the
limit is lowered, acceptance of new connections might be delayed. This can
be used to temporarily reduce or increase the impact of a specific frontend's
traffic on the whole process.
Willy Tarreau [Mon, 1 Aug 2011 18:57:55 +0000 (20:57 +0200)]
[MEDIUM] listeners: add a global listener management task
This global task is used to periodically check for end of resource shortage
and to try to enable queued listeners again. This is important in case some
temporary system-wide shortage is encountered, so that we don't have to wait
for an existing connection to be released before checking the queue again.
For situations where listeners are queued due to the global maxconn being
reached, the task is woken up at least every second. For situations where
a system resource shortage is detected (memory, sockets, ...) the task is
woken up at least every 100 ms. That way, recovery from severe events can
still be achieved under acceptable conditions.
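The two wake-up intervals described above can be sketched as a small
selection function. The names are hypothetical, and giving priority to the
shorter delay when both conditions hold is an assumption, not something
stated in the patch.

```c
#include <stdbool.h>

#define DEMO_WAKE_MAXCONN_MS   1000  /* global maxconn reached */
#define DEMO_WAKE_SHORTAGE_MS   100  /* memory, sockets, ... shortage */

/* Returns the delay in milliseconds before the task should look at the
 * queued listeners again, or -1 when no periodic wake-up is needed.
 */
static int demo_next_wakeup_ms(bool maxconn_reached, bool resource_shortage)
{
    if (resource_shortage)
        return DEMO_WAKE_SHORTAGE_MS;   /* assumed: shorter delay wins */
    if (maxconn_reached)
        return DEMO_WAKE_MAXCONN_MS;
    return -1;
}
```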