git.ipfire.org Git - thirdparty/haproxy.git/log

[BUG] proxy: connection rate limiting was eating lots of CPU

The rate-limit feature relied on a timer to define how long a frontend
must remain idle. It was not considering the pending connections, so it
was almost always ready to be used again and only the accept's limit was
preventing new connections from coming in. By accounting for the pending
connection, we can compute a correct delay and effectively make the
frontend go idle for that (short) time.

[BUG] http: automatically close response if req is aborted

Latest BF_READ_ATTACHED fix has unveiled a nice issue with the way
HTTP requests and responses are forwarded. The case where the request
aborts after the response has responded (POST with early response)
forgot to re-enable auto-close on the response. In fact it still
worked thanks to a side effect as long as BF_READ_ATTACHED was there
to force the states to be resynced (and the flags). Since last fix,
the missing auto-close causes CLOSE_WAIT connections when the client
aborts too late during a data transfer.

The right fix consists in considering the situation where the client
experiences an error and to explicitly abort the transfer. There is
no need to wake the response analysers up for that since they'd have
no added value and the analysers flags are cleared. However for a
future usage, that might help (eg: stickiness, ...).

This fix should be backported to 1.4 if the previous one is backported
too. After all the non-reg tests, the risks to see a problem arise
without both patches seems low, and both patches touch sensible areas
of the code. So there's no hurry.

[BUG] session: clear BF_READ_ATTACHED before next I/O

The BF_READ_ATTACHED flag was created to wake analysers once after
a connection was established. It turns out that this flag is never
cleared once set, so even if there is no event, some analysers are
still evaluated for no reason.

The bug was introduced with commit ea38854d34675d5472319c453b7027af42fe8aab.
It may cause slightly increased CPU usages during data transfers, maybe
even quite noticeable once when transferring transfer-encoded data,
due to the fact that the request analysers are being checked for every
chunk.

This fix must be backported in 1.4 after all non-reg tests have been
completed.

[BUG] client: always ensure to zero rep->analysers

The response analyser was not emptied upon creation of a new session. In
fact it was always zero just because last session leaved it in a zero state,
but in case of shared pools this cannot be guaranteed. The net effect is
that it was possible to have some HTTP (or any other) analysers on the
response path of a stats unix socket, which would reject the response.

This fix must be backported to 1.4.

[BUG] http: the transaction must be initialized even in TCP mode (part 2)

Commit 4605e3b641cebbdbb2ee5726e5bbc3c03a2d7b5e was not enough, because
connections passing from a TCP frontend to an HTTP backend without any
ACL and via a "default_backend" statement were still working on non-initialized
data. An initialization was missing in the session_set_backend() function, next
to the initialization of hdr_idx.

[CONTRIB] halog: report per-server status codes, errors and response times

It's sometimes very useful to be able to monitor a production status in real
time by comparing servers behaviours. Now halog is able to do this when called
with "-srv". It reports various fields for each server found in a log, including
statuses, total reqs, valid reqs, percent of valid reqs, average connection time,
average response time.

[BUG] consistent hash: balance on all servers, not only 2 !

It was once reported at least by Dirk Taggesell that the consistent
hash had a very poor distribution, making use of only two servers.
Jeff Persch analysed the code and found the root cause. Consistent
hash makes use of the server IDs, which are completed after the chash
array initialization. This implies that each server which does not
have an explicit "id" parameter will be merged at the same place in
the chash tree and that in the end, only the first or last servers
may be used.

The now obvious fix (thanks to Jeff) is to assign the missing IDs
earlier. However, it should be clearly understood that changing a
hash algorithm on live systems will rebalance the whole system.
Anyway, the only affected users will be the ones for which the
system is quite unbalanced already. The ones who fix their IDs are
not affected at all.

Kudos to Jeff for spotting that bug which got merged 3 days after
the consistent hashing !

[BUG] tcp: dropped connections must be counted as "denied" not "failed"

This probably was a copy-paste typo from the initial tcp-request feature.
This must be backported to 1.4 and possibly 1.3.

[BUG] http: the transaction must be initialized even in TCP mode

When running in pure TCP mode with a traffic inspection rule to detect
HTTP protocol, we have to initialize the HTTP transaction too. The
effect of not doing this was that some incoming connections could have
been matched as carrying HTTP protocol eventhough this was not the case.

[BUG] http: dispatch and http_proxy modes were broken for a long time

Both dispatch and http_proxy modes were broken since 1.4-dev5 when
the adjustment of server health based on response codes was introduced.
In fact, in these modes, s->srv == NULL. The result is a plain segfault.
It should have been noted critical, but the fact that it remained 6
months without being noticed indicates that almost nobody uses these
modes anymore. Also, the crash is immediate upon first request.

Further versions should not be affected anymore since it's planned to
have a dummy server instead of these annoying NULL pointers.

[DEV] open new 1.5 development branch

This new branch is based on 1.4.6, which 1.5-dev0 is 100% equivalent to.
The roadmap has been updated.

[RELEASE] Released version 1.4.6

Released version 1.4.6 with the following main changes :
    - [BUILD] ebtree: update to v6.0.1 to remove references to dprintf()
    - [CLEANUP] acl: make use of eb_is_empty() instead of open coding the tree's emptiness test
    - [MINOR] acl: add srv_is_up() to check that a specific server is up or not
    - [DOC] add a few precisions about the use of RDP cookies

[DOC] add a few precisions about the use of RDP cookies

RDP cookies are not necessarily easy to implement because they require
some configuration on the servers. Add a few hints so that people know
what to check on their servers.

[MINOR] acl: add srv_is_up() to check that a specific server is up or not

This ACL was missing in complex setups where the status of a remote site
has to be considered in switching decisions. Until there, using a server's
status in an ACL required to have a dedicated backend, which is a bit heavy
when multiple servers have to be monitored.

[CLEANUP] acl: make use of eb_is_empty() instead of open coding the tree's emptiness test

Since ebtree 6.0.1, we now have eb_is_empty() which is cleaner and safer to
use than checking the left pointer of the tree, so let's use that.

[BUILD] ebtree: update to v6.0.1 to remove references to dprintf()

Cyril Bonté reported a build breakage with glibc >= 2.10 due to a name clash
in dprintf(). Let's remove it, it's unused anyway.

[RELEASE] Released version 1.4.5

Released version 1.4.5 with the following main changes :
    - [DOC] report minimum kernel version for tproxy in the Makefile
    - [MINOR] add the "ignore-persist" option to conditionally ignore persistence
    - [DOC] add the "ignore-persist" option to conditionally ignore persistence
    - [DOC] fix ignore-persist/force-persist documentation
    - [BUG] cttproxy: socket fd leakage in check_cttproxy_version
    - [DOC] doc/configuration.txt: fix typos
    - [MINOR] option http-pretend-keepalive is both for FEs and BEs
    - [MINOR] fix possible crash in debug mode with invalid responses
    - [MINOR] halog: add support for statisticts on status codes
    - [OPTIM] halog: use a faster zero test in fgets()
    - [OPTIM] halog: minor speedup by using unlikely()
    - [OPTIM] halog: speed up fgets2-64 by about 10%
    - [DOC] refresh the README file and merge the CONTRIB file into it
    - [MINOR] acl: support loading values from files
    - [MEDIUM] ebtree: upgrade to version 6.0
    - [MINOR] acl trees: add flags and union members to store values in trees
    - [MEDIUM] acl: add ability to insert patterns in trees
    - [MEDIUM] acl: add tree-based lookups of exact strings
    - [MEDIUM] acl: add tree-based lookups of networks
    - [MINOR] acl: ignore empty lines and comments in pattern files
    - [MINOR] stick-tables: add support for "stick on hdr"

[MINOR] stick-tables: add support for "stick on hdr"

It is now possible to stick on an IP address found in a HTTP header. Right
now only the last occurrence of the header can be used, which is generally
enough for most uses. Also, the header extraction rule only knows how to
convert the header to IP. Later it will be usable as a plain string with
an implicit conversion, and the syntax will not change.

[MINOR] acl: ignore empty lines and comments in pattern files

Most often, pattern files used by ACLs will be produced by tools
which emit some comments (eg: geolocation lists). It's very annoying
to have to clean the files before using them, and it does not make
much sense to be able to support patterns we already can't input in
the config file. So this patch makes the pattern file loader skip
lines beginning with a sharp and the empty ones, and strips leading
spaces and tabs.

[MEDIUM] acl: add tree-based lookups of networks

Networks patterns loaded from files for longest match ACL testing
will now be arranged into a prefix tree. This is possible thanks to
the new prefix features in ebtree v6.0. Longest match testing is
slightly slower than exact data maching. However, the measured impact
of running at 42000 requests per second and testing whether the IP
address found in a header belongs to a list of 52000 networks or
not is 3% CPU (increase from 66% to 69%). This is low enough to
permit true geolocation based on huge tables.

[MEDIUM] acl: add tree-based lookups of exact strings

Now if some ACL patterns are loaded from a file and the operation is
an exact string match, the data will be arranged in a tree, yielding
a significant performance boost on large data sets. Note that this
only works when case is sensitive.

A new dedicated function, acl_lookup_str(), has been created for this
matching. It is called for every possible input data to test and it
looks the tree up for the data. Since the keywords are loosely typed,
we would have had to add a new columns to all keywords to adjust the
function depending on the type. Instead, we just compare on the match
function. We call acl_lookup_str() when we could use acl_match_str().
The tree lookup is performed first, then the remaining patterns are
attempted if the tree returned nothing.

A quick test shows that when matching a header against a list of 52000
network names, haproxy uses 68% of one core on a core2-duo 3.2 GHz at
42000 requests per second, versus 66% without any rule, which means
only a 2% CPU increase for 52000 rules. Doing the same test without
the tree leads to 100% CPU at 6900 requests/s. Also it was possible
to run the same test at full speed with about 50 sets of 52000 rules
without any measurable performance drop.

[MEDIUM] acl: add ability to insert patterns in trees

The code is now ready to support loading pattern from filesinto trees. For
that, it will be required that the ACL keyword has a flag ACL_MAY_LOOKUP
and that the expr is case sensitive. When that is true, the pattern will
have a flag ACL_PAT_F_TREE_OK to indicate that it is possible to feed the
tree instead of a usual pattern if the parsing function is able to do this.
The tree's root is pre-initialized in the pattern's value so that the
function can easily find it. At that point, if the parsing function decides
to use the tree, it just sets ACL_PAT_F_TREE in the return flags so that
the caller knows the tree has been used and the pattern can be recycled.

That way it will be possible to load some patterns into the tree when it
is compatible, and other ones as linear linked lists. A good example of
this might be IPv4 network entries : right now we support holes in masks,
but this very rare feature is not compatible with binary lookup in trees.
So the parser will be able to decide itself whether the pattern can go to
the tree or not.

[MINOR] acl trees: add flags and union members to store values in trees

If we want to be able to match ACLs against a lot of possible values, we
need to put those values in trees. That will only work for exact matches,
which is normally just what is needed.

Right now, only IPv4 and string matching are planned, but others might come
later.

[MEDIUM] ebtree: upgrade to version 6.0

This version adds support for prefix-based matching of memory blocks,
as well as some code-size and performance improvements on the generic
code. It provides a prefix insertion and longest match which are
compatible with the rest of the common features (walk, duplicates,
delete, ...). This is typically used for network address matching. The
longest-match code is a bit slower than the original memory block
handling code, so they have not been merged together into generic
code. Still it's possible to perform about 10 million networks lookups
per second in a set of 50000, so this should be enough for most usages.

This version also fixes some bugs in parts that were not used, so there
is no need to backport them.

[MINOR] acl: support loading values from files

The "acl XXX -f <file>" syntax was supported but nothing was read from
the file. This is now possible. All lines are merged verbatim, even if
they contain spaces (useful for user-agents). There are shortcomings
though. The worst one is that error reporting is too approximative.

[DOC] refresh the README file and merge the CONTRIB file into it

Patrick Mézard reported that it was a bit awkward to have the CONTRIB
and contrib entries in the source archive since those can conflict on
case-insensitive file systems. That made a good opportunity to refresh
the README file and to remove that old outdated file.

[DOC] doc/configuration.txt: fix typos

[BUG] cttproxy: socket fd leakage in check_cttproxy_version

in cttproxy.c check_cttproxy_version socket is not closed before function
returned. Although it is called only once, I think it is better to close
the socket.

[OPTIM] halog: speed up fgets2-64 by about 10%

This version uses more 64-bit lookups and two 32-bit lookups
to converge faster. This saves about 10% performance.

[OPTIM] halog: minor speedup by using unlikely()

By moving the filter-specific code out of the loop, we can slightly
speed it up (3%).

[OPTIM] halog: use a faster zero test in fgets()

A new idea came up to detect the presence of a null byte in a word.
It saves several operations compared to the previous one, and eliminates
the jumps (about 6 instructions which can run 2-by-2 in parallel).

This sole optimisation improved the line count speed by about 30%.

[MINOR] halog: add support for statisticts on status codes

Using "-st", halog outputs number of requests by status codes.

[MINOR] fix possible crash in debug mode with invalid responses

When trying to display an invalid request or response we received,
we must at least check that we have identified something looking
like a start of message, otherwise we can dereference a NULL pointer.

[MINOR] option http-pretend-keepalive is both for FEs and BEs

The config parser only accepted it in frontends.

[DOC] fix ignore-persist/force-persist documentation

Shame on me, I didn't correctly document the "ignore-persist" statement
(convinced I used it like this in my tests, which is not the case at all...)
This fixes the doc and updates the proxy keyword matrix to add "force-persist".

[DOC] add the "ignore-persist" option to conditionally ignore persistence

Document the "ignore-persist" option.

[MINOR] add the "ignore-persist" option to conditionally ignore persistence

This is used to disable persistence depending on some conditions (for
example using an ACL matching static files or a specific User-Agent).
You can see it as a complement to "force-persist".

In the configuration file, the force-persist/ignore-persist declaration
order define the rules priority.

Used with the "appsesion" keyword, it can also help reducing memory usage,
as the session won't be hashed the persistence is ignored.

[DOC] report minimum kernel version for tproxy in the Makefile

[RELEASE] Released version 1.4.4

Released version 1.4.4 with the following main changes :
    - [BUG] appsession should match the whole cookie name
    - [CLEANUP] proxy: move PR_O_SSL3_CHK to options2 to release one flag
    - [MEDIUM] backend: move the transparent proxy address selection to backend
    - [MINOR] add very fast IP parsing functions
    - [MINOR] add new tproxy flags for dynamic source address binding
    - [MEDIUM] add ability to connect to a server from an IP found in a header
    - [BUILD] config: last patch breaks build without CONFIG_HAP_LINUX_TPROXY
    - [MINOR] http: make it possible to pretend keep-alive when doing close
    - [MINOR] config: report "default-server" instead of "(null)" in error messages

[BUG] appsession should match the whole cookie name

I met a strange behaviour with appsession.

I firstly thought this was a regression due to one of my previous patch
but after testing with a 1.3.15.12 version, I also could reproduce it.

To illustrate, the configuration contains :
  appsession PHPSESSID len 32 timeout 1h

Then I call a short PHP script containing :
  setcookie("P", "should not match")

When calling this script thru haproxy, the cookie "P" matches the appsession rule :
Dumping hashtable 0x11f05c8
        table[1572]:    should+not+match

Shouldn't it be ignored ?
If you confirm, I'll send a patch for 1.3 and 1.4 branches to check that the
cookie length is equal to the appsession name length.

This is due to the comparison length, where the cookie length is took into
account instead of the appsession name length. Using the appsession name
length would allow ASPSESSIONIDXXX (+ check that memcmp won't go after the
buffer size).

Also, while testing, I noticed that HEAD requests where not available for
URIs containing the appsession parameter. 1.4.3 patch fixes an horrible
segfault I missed in a previous patch when appsession is not in the
configuration and HAProxy is compiled with DEBUG_HASH.

[MINOR] config: report "default-server" instead of "(null)" in error messages

When an error is reported in a default-server entry, we want to have
that name in the error message instead of "(null)".

[MINOR] http: make it possible to pretend keep-alive when doing close

Some servers do not completely conform with RFC2616 requirements for
keep-alive when they receive a request with "Connection: close". More
specifically, they don't bother using chunked encoding, so the client
never knows whether the response is complete or not. One immediately
visible effect is that haproxy cannot maintain client connections alive.
The second issue is that truncated responses may be cached on clients
in case of network error or timeout.

Óscar Frías Barranco reported this issue on Tomcat 6.0.20, and
Patrik Nilsson with Jetty 6.1.21.

Cyril Bonté proposed this smart idea of pretending we run keep-alive
with the server and closing it at the last moment as is already done
with option forceclose. The advantage is that we only change one
emitted header but not the overall behaviour.

Since some servers such as nginx are able to close the connection
very quickly and save network packets when they're aware of the
close negociation in advance, we don't enable this behaviour by
default.

"option http-pretend-keepalive" will have to be used for that, in
conjunction with "option http-server-close".

[BUILD] config: last patch breaks build without CONFIG_HAP_LINUX_TPROXY

A few ifdefs were missing in the config validity check function.

[MEDIUM] add ability to connect to a server from an IP found in a header

Using get_ip_from_hdr2() we can look for occurrence #X or #-X and
extract the IP it contains. This is typically designed for use with
the X-Forwarded-For header.

Using "usesrc hdr_ip(name,occ)", it becomes possible to use the IP address
found in <name>, and possibly specify occurrence number <occ>, as the
source to connect to a server. This is possible both in a server and in
a backend's source statement. This is typically used to use the source
IP previously set by a upstream proxy.

[MINOR] add new tproxy flags for dynamic source address binding

This patch adds a new TPROXY bind type, TPROXY_DYN, to indicate to the
TCP connect function that we want to bind to the address passed in
argument.

[MINOR] add very fast IP parsing functions

Those functions were previouly used in my firewall log parser,
and are particularly suited for use with http headers.

[MEDIUM] backend: move the transparent proxy address selection to backend

The transparent proxy address selection was set in the TCP connect function
which is not the most appropriate place since this function has limited
access to the amount of parameters which could produce a source address.

Instead, now we determine the source address in backend.c:connect_server(),
right after calling assign_server_address() and we assign this address in
the session and pass it to the TCP connect function. This cannot be performed
in assign_server_address() itself because in some cases (transparent mode,
dispatch mode or http_proxy mode), we assign the address somewhere else.

This change will open the ability to bind to addresses extracted from many
other criteria (eg: from a header).

[CLEANUP] proxy: move PR_O_SSL3_CHK to options2 to release one flag

We'll need another flag in the 'options' member close to PR_O_TPXY_*,
and all are used, so let's move this easy one to options2 (which are
already used for SQL checks).

[RELEASE] Released version 1.4.3

Released version 1.4.3 with the following main changes :
    - [CLEANUP] stats: remove printf format warning in stats_dump_full_sess_to_buffer()
    - [MEDIUM] session: better fix for connection to servers with closed input
    - [DOC] indicate in the doc how to bind to port ranges
    - [BUG] backend: L7 hashing must not be performed on incomplete requests
    - [TESTS] add a simple program to test connection resets
    - [MINOR] cli: "show errors" should display "backend <NONE>" when backend was not used
    - [MINOR] config: emit warnings when HTTP-only options are used in TCP mode
    - [MINOR] config: allow "slowstart 0s"
    - [BUILD] 'make tags' did not consider files ending in '.c'
    - [MINOR] checks: add the ability to disable a server in the config

[MINOR] checks: add the ability to disable a server in the config

Adding the "disabled" keyword on a server line disables it. It can
then be enabled again on the unix socket.

[BUILD] 'make tags' did not consider files ending in '.c'

A missing parenthesis made the output of find apply only to files
'.h' making the tags useless.

[MINOR] config: allow "slowstart 0s"

Sometimes it's useful to be able to disable slowstart by setting "slowstart 0".

[MINOR] config: emit warnings when HTTP-only options are used in TCP mode

It's very common to see people getting trapped by HTTP-only options
which don't work in TCP proxies. To help them definitely get rid of
those configs, let's emit warnings for all options and statements
which are not supported in their mode. That includes all HTTP-only
options, the cookies and the stats.

In order to ensure internal config correctness, the options are also
disabled.

[MINOR] cli: "show errors" should display "backend <NONE>" when backend was not used

It was disturbing to see a backend name associated with a bad request
when this "backend" was in fact the frontend. Instead, we now display
"backend <NONE>" if the "backend" has no backend capability :

> show errors

[25/Mar/2010:06:44:25.394] frontend fe (#1): invalid request
src 127.0.0.1, session #0, backend <NONE> (#-1), server <NONE> (#-1)
request length 45 bytes, error at position 0:

[TESTS] add a simple program to test connection resets

The "reset" program may be used to send strings then reset connections.

[BUG] backend: L7 hashing must not be performed on incomplete requests

Isidore Li reported an occasional segfault when using URL hashing, and
kindly provided backtraces and core files to help debugging.

The problem was triggered by reset connections before the URL was sent,
and was due to the same bug which was fixed by commit e45997661bf
(connections were attempted in case of connection abort). While that
bug was already fixed, it appeared that the same segfault could be
triggered when URL hashing is configured in an HTTP backend when the
frontend runs in TCP mode and no URL was seen. It is totally abnormal
to try to hash a null URL, as well as to process any kind of L7 hashing
when a full request was not seen.

This additional fix now ensures that layer7 hashing is not performed on
incomplete requests.

[DOC] indicate in the doc how to bind to port ranges

This part was missing from the new doc.

[MEDIUM] session: better fix for connection to servers with closed input

The following patch fixed an issue but brought another one :
296897 [MEDIUM] connect to servers even when the input has already been closed

The new issue is that when a connection is inspected and aborted using
TCP inspect rules, now it is sent to the server before being closed. So
that test is not satisfying. A probably better way is not to prevent a
connection from establishing if only BF_SHUTW_NOW is set but BF_SHUTW
is not. That way, the BF_SHUTW flag is not set if the request has any
data pending, which still fixes the stats issue, but does not let any
empty connection pass through.

Also, as a safety measure, we extend buffer_abort() to automatically
disable the BF_AUTO_CONNECT flag. While it appears to always be OK,
it is by pure luck, so better safe than sorry.

[CLEANUP] stats: remove printf format warning in stats_dump_full_sess_to_buffer()

This warning was first reported by Ross West on FreeBSD, then by
Holger Just on OpenSolaris. It also happens on 64bit Linux. However,
fixing the format to use long int complains on 32bit Linux where
ptrdiff_t is apparently different. Better cast the pointer difference
to an int then.

[RELEASE] Released version 1.4.2

Released version 1.4.2 with the following main changes :
    - [CLEANUP] product branch update
    - [DOC] Some more documentation cleanups
    - [BUG] clf logs segfault when capturing a non existant header
    - [OPTIM] config: only allocate check buffer when checks are enabled
    - [MEDIUM] checks: support multi-packet health check responses
    - [CLEANUP] session: remove duplicate test
    - [BUG] http: don't wait for response data to leave buffer is client has left
    - [MINOR] proto_uxst: set accept_date upon accept() to the wall clock time
    - [MINOR] stats: don't send empty lines in "show errors"
    - [MINOR] stats: make the data dump function reusable for other purposes
    - [MINOR] stats socket: add show sess <id> to dump details about a session
    - [BUG] stats: connection reset counters must be plain ascii, not HTML
    - [BUG] url_param hash may return a down server
    - [MINOR] force null-termination of hostname
    - [MEDIUM] connect to servers even when the input has already been closed
    - [BUG] don't merge anonymous ACLs !
    - [BUG] config: fix endless loop when parsing "on-error"
    - [MINOR] http: don't mark a server as failed when it returns 501/505
    - [OPTIM] checks: try to detect the end of response without polling again
    - [BUG] checks: don't report an error when recv() returns an error after data
    - [BUG] checks: don't abort when second poll returns an error
    - [MINOR] checks: make shutdown() silently fail
    - [BUG] http: fix truncated responses on chunk encoding when size divides buffer size
    - [BUG] init: unconditionally catch SIGPIPE
    - [BUG] checks: don't wait for a close to start parsing the response

[OPTIM] config: only allocate check buffer when checks are enabled

To save a little memory, the check_data buffer is only allocated
for the servers that are checked.

[WT: this patch saves 80 MB of RAM on the test config with 5000 servers]

[BUG] checks: don't wait for a close to start parsing the response

Cyril Bonté reported a regression introduced with very last changes
on the checks code, which causes failed checks on if the server does
not close the connection in time. This happens on HTTP/1.1 checks or
on SMTP checks for instance.

This fix consists in restoring the old behaviour of parsing as soon
as something is available in the response buffer, and waiting for
more data if some are missing. This also helps releasing connections
earlier (eg: a GET check will not have to download the whole object).

[BUG] init: unconditionally catch SIGPIPE

Apparently some systems define MSG_NOSIGNAL but do not necessarily
check it (or maybe binaries are built somewhere and used on older
versions). There were reports of very recent FreeBSD setups causing
SIGPIPEs, while older ones catch the signal. Recent FreeBSD manpages
indeed define MSG_NOSIGNAL.

So let's now unconditionnaly catch the signal. It's useless not to do
it for the rare cases where it's not needed (linux 2.4 and below).

[BUG] http: fix truncated responses on chunk encoding when size divides buffer size

Bernhard Krieger reported truncated HTTP responses in presence of some
specific chunk-encoded data, and kindly offered complete traces of the
issue which made it easy to reproduce it.

Those traces showed that the chunks were of exactly 8192 bytes, chunk
size and CRLF included, which was exactly half the size of the buffer.
In this situation, the function http_chunk_skip_crlf() could erroneously
try to parse a CRLF after the chunk believing there were more data
pending, because the number of bytes present in the buffer was considered
instead of the number of remaining bytes to be parsed.

[MINOR] checks: make shutdown() silently fail

Shutdown may fail for instance after an RST. So we must not report
any error for that.

[BUG] checks: don't abort when second poll returns an error

Now that the response may be fragmented, we may receive early notifications
of aborts in return of poll(), as indicated below, which currently cause
an early error detection :

  21:11:21.036600 epoll_wait(3, {{EPOLLIN, {u32=7, u64=7}}}, 8, 993) = 1
  21:11:21.054361 gettimeofday({1268770281, 54467}, NULL) = 0
  21:11:21.054540 recv(7, "H"..., 8030, 0) = 1
  21:11:21.054694 recv(7, 0x967e759, 8029, 0) = -1 EAGAIN (Resource temporarily unavailable)
  21:11:21.054843 epoll_wait(3, {{EPOLLIN|EPOLLERR|EPOLLHUP, {u32=7, u64=7}}}, 8, 975) = 1
  21:11:21.060274 gettimeofday({1268770281, 60386}, NULL) = 0
  21:11:21.060454 close(7)                = 0

Just as in stream_sock, we must not believe poll() without attempting to receive,
which fixes the issue :

  21:11:59.402207 recv(7, "H"..., 8030, 0) = 1
  21:11:59.402362 recv(7, 0x8b5c759, 8029, 0) = -1 EAGAIN (Resource temporarily unavailable)
  21:11:59.402511 epoll_wait(3, {{EPOLLIN|EPOLLERR|EPOLLHUP, {u32=7, u64=7}}}, 8, 974) = 1
  21:11:59.407242 gettimeofday({1268770319, 407353}, NULL) = 0
  21:11:59.407425 recv(7, "TTP/1.0 200 OK\r\n"..., 8029, 0) = 16
  21:11:59.407606 recv(7, 0x8b5c769, 8013, 0) = -1 ECONNRESET (Connection reset by peer)
  21:11:59.407753 shutdown(7, 2 /* send and receive */) = -1 ENOTCONN (Transport endpoint is not connected)

[BUG] checks: don't report an error when recv() returns an error after data

This happens when a server immediately closes the connection after
the response without lingering or when we close before the end of
the data. We get an RST which translates into a late error. We must
not declare an error without checking that the contents are OK.

[OPTIM] checks: try to detect the end of response without polling again

Since the recv() call returns every time it succeeds, we always need
to calls with one intermediate poll before detecting the end of response :

  20:20:03.958207 recv(7, "HTTP/1.1 200\r\nConnection: close\r\n"..., 8030, 0) = 145
  20:20:03.958365 epoll_wait(3, {{EPOLLIN, {u32=7, u64=7}}}, 8, 1000) = 1
  20:20:03.958543 gettimeofday({1268767203, 958626}, NULL) = 0
  20:20:03.958694 recv(7, ""..., 7885, 0) = 0
  20:20:03.958833 shutdown(7, 2 /* send and receive */) = 0

Let's read as long as we can, that way we can detect end of connections
in the same call, which is much more efficient especially for LBs with
hundreds of servers :

  20:29:58.797019 recv(7, "HTTP/1.1 200\r\nConnection: close\r\n"..., 8030, 0) = 145
  20:29:58.797182 recv(7, ""..., 7885, 0) = 0
  20:29:58.797356 shutdown(7, 2 /* send and receive */) = 0

[MEDIUM] checks: support multi-packet health check responses

We are seeing both real servers repeatedly going on- and off-line with
a period of tens of seconds. Packet tracing, stracing, and adding
debug code to HAProxy itself has revealed that the real servers are
always responding correctly, but HAProxy is sometimes receiving only
part of the response.

It appears that the real servers are sending the test page as three
separate packets. HAProxy receives the contents of one, two, or three
packets, apparently randomly. Naturally, the health check only
succeeds when all three packets' data are seen by HAProxy. If HAProxy
and the real servers are modified to use a plain HTML page for the
health check, the response is in the form of a single packet and the
checks do not fail.

(...)
I've added buffer and length variables to struct server, and allocated
space with the rest of the server initialisation.

(...)
It seems to be working fine in my tests, and handles check responses
that are bigger than the buffer.

[MINOR] http: don't mark a server as failed when it returns 501/505

Those two codes can be triggered on demand by client requests.
We must not fail a server on them.

Ideally we should ignore a certain amount of status codes which do
not indicate life nor death.

[BUG] config: fix endless loop when parsing "on-error"

An arg index increment was missing causing the same arg to be parsed
in an endless loop. Proabably a merge conflict that remained undetected.

[BUG] don't merge anonymous ACLs !

The new anonymous ACL feature was buggy. If several ones are
declared, the first rule is always matched because all of them
share the same internal name (".noname"). Now we simply declare
them with an empty name and ensure that we disable any merging
when the name is empty.

[BUG] clf logs segfault when capturing a non existant header

Hi Willy,

Please find a small patch to prevent haproxy segfaulting when logging captured headers in CLF format.

Example config to reproduce the bug :
listen test :10080
log 127.0.0.1 local7 debug err
mode http
option httplog clf
capture request header NonExistantHeader len 16

--
Cyril Bonté

[MEDIUM] connect to servers even when the input has already been closed

The BF_AUTO_CLOSE flag prevented a connection from establishing on
a server if the other side's input channel was already closed. This
is wrong because there may be pending data to be sent.

This was causing an issue with stats, as noticed and reported by
Cyril Bonté. Since the stats are now handled as a server, sometimes
concurrent accesses were causing one of the connections to send the
shutdown(write) before the connection to the stats function was
established, which aborted it early.

This fix causes the BF_AUTO_CLOSE flag to be checked only when the
connection on the outgoing stream interface has reached an established
state. That way we're still able to connect, send the request then
close.

[MINOR] force null-termination of hostname

Marcello Gorlani reported that at least on FreeBSD, a long hostname
was reported with garbage on the stats page. POSIX does not make it
mandatory for gethostname() to NULL-terminate the string in case of
truncation, and at least FreeBSD appears not to do it. So let's
force null-termination to keep safe.

[DOC] Some more documentation cleanups

Since the last documentation cleanups, I've found more typos that I kept
in a corner instead of sending you a mail just for one character :)

--
Cyril Bonté

[CLEANUP] product branch update

today I've noticed that the stats page still displays v1.3 in the
"Updates" link, due to the PRODUCT_BRANCH value in version.h, then
it's maybe time to send you the result (notice that the patch updates
PRODUCT_BRANCH to "1.4").

--
Cyril Bonté

[BUG] url_param hash may return a down server

Jozef Hovan reported a bug sometimes causing a down server to be
used in url_param hashing mode.

This happens if the following conditions are met :
  - the backend contains more than one server with at least two
    of different weights
  - all servers but one are down
  - the server which is not down has a weight which does not divide
    all the other ones

Example: 3 servers with 20,20,10, the first one remains up.

The problem is caused by an optimisation in recalc_server_map()
which only fills the first map slot when only one server is up,
because all LB algorithms are optimized to use entry zero when
only one server is up... All but url_param. When doing the modulus,
we can return a position which is greater than zero and use an
entry which still refers to a server which has since been stopped.

One solution could be to optimize the url_param algo to proceed
as the other ones, but the fact that was wrong implies that we
can repeat the same bug later. So let's first correctly initialize
the map in order to avoid that trap.

[BUG] stats: connection reset counters must be plain ascii, not HTML

Using U2H*() to put numbers in title tags is a bad idea as those
appear in the HTML itself. Problem reported by Laurent Dolosor.

[MINOR] stats socket: add show sess <id> to dump details about a session

When trying to spot some complex bugs, it's often needed to access
information on stuck sessions, which is quite difficult. This new
command helps one get detailed information about a session, with
flags, timers, states, etc... The buffer data are not dumped yet.

[MINOR] stats: make the data dump function reusable for other purposes

The dump_error_line() function was limited to dump error buffers while
it's perfectly suitable to dump anything else.

[MINOR] stats: don't send empty lines in "show errors"

Empty lines indicate end of dump, so it's important not to do that.
Send a single space instead.

[MINOR] proto_uxst: set accept_date upon accept() to the wall clock time

This accept_date field was not set and will be reported in the stats as
the connection's accept date.

[BUG] http: don't wait for response data to leave buffer is client has left

In case of pipelined requests, if the client aborts before reading response
N-1, haproxy waits forever for the data to leave the buffer before parsing
the next response.

[CLEANUP] session: remove duplicate test

This duplicate test should have been removed with the loop rework but was forgotten.
It was harmless, but disassembly shows that it prevents gcc from correctly optimizing
the loop.

[RELEASE] Released version 1.4.1

Released version 1.4.1 with the following main changes :
    - [BUG] Clear-cookie path issue
    - [DOC] fix typo on stickiness rules
    - [BUILD] fix BSD and OSX makefiles for missing files
    - [BUILD] includes order breaks OpenBSD build
    - [BUILD] fix some build warnings on Solaris with is* macros
    - [BUG] logs: don't report "last data" when we have just closed after an error
    - [BUG] logs: don't report "proxy request" when server closes early
    - [BUILD] fix platform-dependant build issues related to crypt()
    - [STATS] count transfer aborts caused by client and by server
    - [STATS] frontend requests were not accounted for failed requests
    - [MINOR] report total number of processed connections when stopping a proxy
    - [DOC] be more clear about the limitation to one single monitor-net entry

[DOC] be more clear about the limitation to one single monitor-net entry

It was not clear in the doc that only one monitor-net entry is supported.

[BUG] Clear-cookie path issue

We have been using haproxy to balance a not very well written application
(http://www.blackboard.com/). Using the "insert postonly indirect" cookie
method, I was attempting to remove the cookie when users would logout,
allowing the machine to re-balance for the next user (this application is
used in school computer labs, so a computer might stay on the whole day
but be used on and off).

I was having a lot of trouble because when the cookie was set, it was with
"Path=/", but when being cleared there was no "Path" in the set cookie
header, and because the logout page was in a different place of the
website (which I couldn't change), the cookie would not be cleared. I
don't know if this would be a problem for anyone other than me (as our
HTTP application is so un-adjustable), but just in case, I have included
the patch I used. Maybe it will help someone else.

[ WT: this was a correct fix, and I also added the same missing path to
the set-cookie option ]

[MINOR] report total number of processed connections when stopping a proxy

It's sometimes convenient to know if a proxy has processed any connection
at all when stopping it. Since a soft restart causes the "Proxy stopped"
message to be logged for each proxy, let's add the number of connections
so that it's possible afterwards to check whether a proxy had received
any connection.

[STATS] frontend requests were not accounted for failed requests

But failed requests were accounted for, resulting in more failures
than requests.

[STATS] count transfer aborts caused by client and by server

Often we need to understand why some transfers were aborted or what
constitutes server response errors. With those two counters, it is
now possible to detect an unexpected transfer abort during a data
phase (eg: too short HTTP response), and to know what part of the
server response errors may in fact be assigned to aborted transfers.

[BUILD] fix platform-dependant build issues related to crypt()

Holger Just and Ross West reported build issues on FreeBSD and
Solaris that were initially caused by the definition of
_XOPEN_SOURCE at the top of auth.c, which was required on Linux
to avoid a build warning.

Krzysztof Oledzki found that using _GNU_SOURCE instead also worked
on Linux and did not cause any issue on several versions of FreeBSD.
Solaris still reported a warning this time, which was fixed by
including <crypt.h>, which itself is not present on FreeBSD nor on
all Linux toolchains.

So by adding a new build option (NEED_CRYPT_H), we can get Solaris
to get crypt() working and stop complaining at the same time, without
impacting other platforms.

This fix was tested at least on several linux toolchains (at least
uclibc, glibc 2.2.5, 2.3.6 and 2.7), on FreeBSD 4 to 8, Solaris 8
(which needs crypt.h), and AIX 5.3 (without crypt.h).

Every time it builds without a warning.

[BUG] logs: don't report "proxy request" when server closes early

A copy-paste typo and a missing check were causing the logs to
report "PR" instead of "SD" when a server closes before sending
full data. Also, the log would erroneously report 502 while in
fact the correct response will already have been transmitted.

[BUG] logs: don't report "last data" when we have just closed after an error

Some people have reported seeing "SL" flags in their logs quite often while
this should never happen. The reason was that then a server error is detected,
we close the connection to that server and when we decide what state we were
in, we see the connection is closed, and deduce it was the last data transfer,
which is wrong. We should report DATA if the previous state was an established
state, which this patch does.

Now logs correctly report "SD" and not "SL" when a server resets a connection
before the end of the transfer.

[BUILD] fix some build warnings on Solaris with is* macros

isalnum, isdigit and friends are really annoying because they take
an int in which we should pass an unsigned char, while strings
everywhere use chars. Solaris uses macros relying on an array for
those functions, which easily triggers some warnings showing where
we have mistakenly passed a char instead of an unsigned char or an
int. Those warnings may indicate real bugs on some platforms
depending on the implementation.

[BUILD] includes order breaks OpenBSD build

Jeff Buchbinder reported that OpenBSD build broke on compat.h,
and that this patch fixes the issue.

[BUILD] fix BSD and OSX makefiles for missing files

Jeff Buchbinder reported that OpenBSD build broke because recently
added files were not ported to BSD and OSX makefiles.

[DOC] fix typo on stickiness rules

[RELEASE] Released version 1.4.0

Released version 1.4.0 with the following main changes :
    - [MINOR] stats: report maint state for tracking servers too
    - [DOC] fix summary to add pattern extraction
    - [DOC] Documentation cleanups
    - [BUG] cfgparse memory leak and missing free calls in deinit()
    - [BUG] pxid/puid/luid: don't shift IDs when some of them are forced
    - [EXAMPLES] add auth.cfg
    - [BUG] uri_auth: ST_SHLGNDS should be 0x00000008 not 0x0000008
    - [BUG] uri_auth: do not attemp to convert uri_auth -> http-request more than once
    - [BUILD] auth: don't use unnamed unions
    - [BUG] config: report unresolvable host names as errors
    - [BUILD] fix build breakage with DEBUG_FULL
    - [DOC] fix a typo about timeout check and clarify the explanation.
    - [MEDIUM] http: don't use trash to realign large buffers
    - [STATS] report HTTP requests (total and rate) in frontends
    - [STATS] separate frontend and backend HTTP stats
    - [MEDIUM] http: revert to use a swap buffer for realignment
    - [MINOR] stats: report the request rate in frontends as cell titles
    - [MINOR] stats: mark areas with an underline when tooltips are available
    - [DOC] reorder some entries to maintain the alphabetical order
    - [DOC] cleanup of the keyword matrix

[DOC] cleanup of the keyword matrix

The keyword matrix was barely readable due to the long lines.
Also let's repeat the legend every 24 lines.