git.ipfire.org Git - thirdparty/haproxy.git/log

[MEDIUM] implement a monotonic internal clock

If the system date is set backwards while haproxy is running,
some scheduled events are delayed by the amount of time the
clock went backwards. This is particularly problematic on
systems where the date is set at boot, because it seldom
happens that health-checks do not get sent for a few hours.

Before switching to use clock_gettime() on systems which
provide it, we can at least ensure that the clock is not
going backwards and maintain two clocks : the "date" which
represents what the user wants to see (mostly for logs),
and an internal date stored in "now", used for scheduled
events.

[DOC] documentation for the "retries" parameter was missing.

[BUG] fix the dequeuing logic to ensure that all requests get served

The dequeuing logic was completely wrong. First, a task was assigned
to all servers to process the queue, but this task was never scheduled
and was only woken up on session free. Second, there was no reservation
of server entries when a task was assigned a server. This means that
as long as the task was not connected to the server, its presence was
not accounted for. This was causing trouble when detecting whether or
not a server had reached maxconn. Third, during a redispatch, a session
could lose its place at the server's and get blocked because another
session at the same moment would have stolen the entry. Fourth, the
redispatch option did not work when maxqueue was reached for a server,
and it was not possible to do so without indefinitely hanging a session.

The root cause of all those problems was the lack of pre-reservation of
connections at the server's, and the lack of tracking of servers during
a redispatch. Everything relied on combinations of flags which could
appear similarly in quite distinct situations.

This patch is a major rework but there was no other solution, as the
internal logic was deeply flawed. The resulting code is cleaner, more
understandable, uses less magics and is overall more robust.

As an added bonus, "option redispatch" now works when maxqueue has
been reached on a server.

[BUG] log: reported queue position was offed-by-one

The reported queue position in the logs was 0 for the first pending request
in the queue, which is wrong because it means that one request will have to
be completed before the queued one may execute. It caused the undesired side
effect that 0/0 was reported when either 0 or 1 request was pending in the
queue. Thus, we have to increment the queue size before reporting the value.

[BUG] queue management: wake oldest request in queues

When a server terminates a connection, the next session in its
own queue was immediately processed. Because of this, if all
server queues are always filled, then no new anonymous request
will be processed. Consider oldest request between global and
server queues to choose from which to pick the request.

An improvement over this will consist in adding a configurable
offset when comparing expiration dates, so that cookie-less
requests can get either less or more priority.

[BUG] event pollers must not wait if a task exists in the run queue

Under some circumstances, a task may already lie in the run queue
(eg: inter-task wakeup). It is disastrous to wait for an event in
this case because some processing gets delayed.

[DEBUG] add a TRACE macro to facilitate runtime data extraction

The new TRACE macro is used almost like fprintf, except that a session
has to be passed instead of the file descriptor. It displays infos about
where it is called, session ptr and id, etc...

[BUILD] make install should depend on haproxy not "all"

Reported by Cherife Li : just doing a "make install" fails because it
depends on "all" which is equivalent to "help" if no TARGET was specified.
Make it depend on "haproxy" instead.

[MEDIUM] add support for conditional HTTP redirection

A new "redirect" keyword adds the ability to send an HTTP 301/302/303
redirection to either an absolute location or to a prefix followed by
the original URI. The redirection is conditionned by ACL rules, so it
becomes very easy to move parts of a site to another site using this.

This work was almost entirely done at Exceliance by Emeric Brun.

A test-case has been added in the tests/ directory.

[MEDIUM] Fix memory freeing at exit, part 2

- free oldpids
- call free(exp->preg), not only regfree(exp->preg): req_exp, rsp_exp
- build a list of unique uri_auths and eventually free it
- prune_acl_cond/free for switching_rules
- add a callback pointer to free ptr from acl_pattern (used for regexs) and execute it

==1180== malloc/free: in use at exit: 0 bytes in 0 blocks.
==1180== malloc/free: 5,599 allocs, 5,599 frees, 4,220,556 bytes allocated.
==1180== All heap blocks were freed -- no leaks are possible.

[MEDIUM] Fix memory freeing at exit

New functions implemented:
- deinit_pollers: called at the end of deinit())
- prune_acl: called via list_for_each_entry_safe

Add missing pool_destroy2 calls:
- p->hdr_idx_pool
- pool2_tree64

Implement all task stopping:
- health-check: needs new "struct task" in the struct server
- queue processing: queue_mgt
- appsess_refresh: appsession_refresh

before (idle system):
==6079== LEAK SUMMARY:
==6079==    definitely lost: 1,112 bytes in 75 blocks.
==6079==    indirectly lost: 53,356 bytes in 2,090 blocks.
==6079==      possibly lost: 52 bytes in 1 blocks.
==6079==    still reachable: 150,996 bytes in 504 blocks.
==6079==         suppressed: 0 bytes in 0 blocks.

after (idle system):
==6945== LEAK SUMMARY:
==6945==    definitely lost: 7,644 bytes in 137 blocks.
==6945==    indirectly lost: 9,913 bytes in 587 blocks.
==6945==      possibly lost: 0 bytes in 0 blocks.
==6945==    still reachable: 0 bytes in 0 blocks.
==6945==         suppressed: 0 bytes in 0 blocks.

before (running system for ~2m):
==9343== LEAK SUMMARY:
==9343==    definitely lost: 1,112 bytes in 75 blocks.
==9343==    indirectly lost: 54,199 bytes in 2,122 blocks.
==9343==      possibly lost: 52 bytes in 1 blocks.
==9343==    still reachable: 151,128 bytes in 509 blocks.
==9343==         suppressed: 0 bytes in 0 blocks.

after (running system for ~2m):
==11616== LEAK SUMMARY:
==11616==    definitely lost: 7,644 bytes in 137 blocks.
==11616==    indirectly lost: 9,981 bytes in 591 blocks.
==11616==      possibly lost: 0 bytes in 0 blocks.
==11616==    still reachable: 4 bytes in 1 blocks.
==11616==         suppressed: 0 bytes in 0 blocks.

Still not perfect but significant improvement.

[BUG/CLEANUP] cookiedomain -> cookie_domain rename + free(p->cookie_domain)

Rename cookiedomain -> cookie_domain to be consistent with current
naming scheme. Also make sure cookie_domain is deallocated at deinit()

[MEDIUM] detect streaming buffers and tag them as such

Add the ability to detect streaming buffers, and set a
flag indicating it. It will later serve us in order to
dynamically resize them, and to prioritize file descriptors
during polls.

[MEDIUM] reduce risk of event starvation in ev_sepoll

If too many events are set for spec I/O, those ones can starve the
polled events. Experiments show that when polled events starve, they
quickly turn into spec I/O, making the situation even worse. While
we can reduce the number of polled events processed at once, we
cannot do this on speculative events because most of them are new
ones (avg 2/3 new - 1/3 old from experiments).

The solution against this problem relies on those two factors :
  1) one FD registered as a spec event cannot be polled at the same time
  2) even during very high loads, we will almost never be interested in
     simultaneous read and write streaming on the same FD.

The first point implies that during starvation, we will not have more than
half of our FDs in the poll list, otherwise it means there is less than that
in the spec list, implying there is no starvation.

The second point implies that we're statically only interested in half of
the maximum number of file descriptors at once, because we will unlikely
have simultaneous read and writes for a same buffer during long periods.

So, if we make it possible to drain maxsock/2/2 during peak loads, then we
can ensure that there will be no starvation effect. This means that we must
always allocate maxsock/4 events for the poller.

Last, sepoll uses an optimization consisting in reducing the number of calls
to epoll_wait() to once every too polls. However, when dealing with many
spec events, we can wait very long and skipping epoll_wait() every second
time increases latency. For this reason, we try to detect if we are beyond
a reasonable limit and stop doing so at this stage.

[DOC] update the README file with new build options

[MINOR] Allow to specify a domain for a cookie

This patch allows to specify a domain used when inserting a cookie
providing a session stickiness. Usefull for example with wildcard domains.

The patch adds one new variable to the struct proxy: cookiedomain.
When set the domain is appended to a Set-Cookie header.

Domain name is validated using the new invalid_domainchar() function.
It is basically invalid_char() limited to [A-Za-z0-9_.-]. Yes, the test
is too trivial and does not cover all wrong situations, but the main
purpose is to detect most common mistakes, not intentional abuses.

The underscore ("_") character is not RFC-valid but as it is
often (mis)used so I decided to allow it.

[MEDIUM] upgrade to ebtree v4.0

New ebtree also supports unique keys. This is useful for counters.

[MEDIUM] add support for URI hash depth and length limits

This patch adds two optional arguments "len" and "depth" to
"balance uri". They are used to limit the length in characters
of the analysis, as well as the number of directory components
it applies to.

[BUILD] fix build with gcc 4.3

For Fedora 9 gcc 4.3 will be shipping as a feature, and right now haproxy does
not compile with gcc 4.3.

It appears that there is a reordering of headers or something along those lines,
This is the patch that gets haproxy to compile with gcc 4.3. I'm not sure if
this is the correct approach you would want to use, so please correct me.

If this works for you, I'll go ahead and put this patch in the src rpm until a
release of haproxy which compiles with gcc 4.3 is released.

[TESTS] add a debug patch to help trigger the stats bug

About: [BUG] Flush buffers also where there are exactly 0 bytes left

I'm also attaching a debug patch that helps to trigger this bug.

Without the fix:
# echo -ne "GET /haproxy?stats;csv;norefresh HTTP/1.0\r\n\r\n"|nc 127.0.0.1
801|wc -c
16384

With the fix:
# echo -ne "GET /haproxy?stats;csv;norefresh HTTP/1.0\r\n\r\n"|nc 127.0.0.1
801|wc -c
33089

Best regards,

Krzysztof Oledzki

[BUG] Flush buffers also where there are exactly 0 bytes left

I noticed it was possible to get truncated http/csv stats. Sometimes.
Usually the problem disappeared as fast as it appeared, but once it
happend that my http-stats page was truncated for about one hour.
It was quite weird as it happened independently for csv and http
output and it took me some time to track & fix this bug.

Both buffer_write & buffer_write_chunk used to return 0 in two
situations: is case of success or where there was exactly 0 bytes
left. The first one is intentional but I believe the second one
is not as it was not possible to distinguish between successful
write and unsuccessful one, which means that if the buffer was 100%
filled, it was never flushed and it was not possible to write
more data.

This patch fixes this problem.

[RELEASE] Released version 1.3.15

Released version 1.3.15 with the following main changes :
    - [BUILD] Added support for 'make install'
    - [BUILD] Added 'install-man' make target for installing the man page
    - [BUILD] Added 'install-bin' make target
    - [BUILD] Added 'install-doc' make target
    - [BUILD] Removed "/" after '$(DESTDIR)' in install targets
    - [BUILD] Changed 'install' target to install the binaries first
    - [BUILD] Replace hardcoded 'LD = gcc' with 'LD = $(CC)'
    - [MEDIUM]: Inversion for options
    - [MEDIUM]: Count retries and redispatches also for servers, fix redistribute_pending, extend logs, %d->%u cleanup
    - [BUG]: Restore clearing t->logs.bytes
    - [MEDIUM]: rework checks handling
    - [DOC] Update a "contrib" file with a hint about a scheme used for formathing subjects
    - [MEDIUM] Implement "track [<backend>/]<server>"
    - [MINOR] Implement persistent id for proxies and servers
    - [BUG] Don't increment server connections too much + fix retries
    - [MEDIUM]: Prevent redispatcher from selecting the same server, version #3
    - [MAJOR] proto_uxst rework -> SNMP support
    - [BUG] appsession lookup in URL does not work
    - [BUG] transparent proxy address was ignored in backend
    - [BUG] hot reconfiguration failed because of a wrong error check
    - [DOC] big update to the configuration manual
    - [DOC] large update to the configuration manual
    - [DOC] document more options
    - [BUILD] major rework of the GNU Makefile
    - [STATS] add support for "show info" on the unix socket
    - [DOC] document options forwardfor to logasap
    - [MINOR] add support for the "backlog" parameter
    - [OPTIM] introduce global parameter "tune.maxaccept"
    - [MEDIUM] introduce "timeout http-request" in frontends
    - [MINOR] tarpit timeout is also allowed in backends
    - [BUG] increment server connections for each connect()
    - [MEDIUM] add a turn-around state of one second after a connection failure
    - [BUG] fix typo in redispatched connection
    - [DOC] document options nolinger to ssl-hello-chk
    - [DOC] added documentation for "option tcplog" to "use_backend"
    - [BUG] connect_server: server might not exist when sending error report
    - [MEDIUM] support fully transparent proxy on Linux (USE_LINUX_TPROXY)
    - [MEDIUM] add non-local bind to connect() on Linux
    - [MINOR] add transparent proxy support for balabit's Tproxy v4
    - [BUG] use backend's source and not server's source with tproxy
    - [BUG] fix overlapping server flags
    - [MEDIUM] fix server health checks source address selection
    - [BUG] build failed on CONFIG_HAP_LINUX_TPROXY without CONFIG_HAP_CTTPROXY
    - [DOC] added "server", "source" and "stats" keywords
    - [DOC] all server parameters have been documented
    - [DOC] document all req* and rsp* keywords.
    - [DOC] added documentation about HTTP header manipulations
    - [BUG] log response byte count, not request
    - [BUILD] code did not build in full debug mode
    - [BUG] fix truncated responses with sepoll
    - [MINOR] use s->frt_addr as the server's address in transparent proxy
    - [MINOR] fix configuration hint about timeouts
    - [DOC] minor cleanup of the doc and notice to contributors
    - [MINOR] report correct section type for unknown keywords.
    - [BUILD] update MacOS Makefile to build on newer versions
    - [DOC] fix erroneous "useallbackups" option in the doc
    - [DOC] applied small fixes from early readers
    - [MINOR] add configuration support for "redir" server keyword
    - [MEDIUM] completely implement the server redirection method
    - [TESTS] add a test case for the server redirection mechanism
    - [DOC] add a configuration entry for "server ... redir <prefix>"
    - [BUILD] backend.c and checks.c did not build without tproxy !
    - Revert "[BUILD] backend.c and checks.c did not build without tproxy !"
    - [BUILD] backend.c and checks.c did not build without tproxy !
    - [OPTIM] used unsigned ints for HTTP state and message offsets
    - [OPTIM] GCC4's builtin_expect() is suboptimal
    - [BUG] failed conns were sometimes incremented in the frontend!
    - [BUG] timeout.check was not pre-set to eternity
    - [TESTS] add test-pollers.cfg to easily report pollers in use
    - [BUG] do not apply timeout.connect in checks if unset
    - [BUILD] ensure that makefile understands USE_DLMALLOC=1
    - [MINOR] silent gcc for a wrong warning
    - [CLEANUP] update .gitignore to ignore more temporary files
    - [CLEANUP] report dlmalloc's source path only if explictly specified
    - [BUG] str2sun could leak a small buffer in case of error during parsing
    - [BUG] option allbackups was not working anymore in roundrobin mode
    - [MAJOR] implementation of the "leastconn" load balancing algorithm
    - [BUILD] ensure that users don't build without setting the target anymore.
    - [DOC] document the leastconn LB algo
    - [MEDIUM] fix stats socket limitation to 16 kB
    - [DOC] fix unescaped space in httpchk example.
    - [BUG] fix double-decrement of server connections
    - [TESTS] add a test case for port mapping
    - [TESTS] add a benchmark for integer hashing
    - [TESTS] add new methods in ip-hash test file
    - [MAJOR] implement parameter hashing for POST requests

[BUILD] fix build of POST analysis code with gcc < 3

move variable declarations at beginning of blocks.

[MAJOR] implement parameter hashing for POST requests

This patch extends the "url_param" load balancing method by introducing
the "check_post" option. Using this option enables analysis of the beginning
of POST requests to search for the specified URL parameter.

The patch also fixes a few minor typos in comments that were discovered
during code review.

[TESTS] add new methods in ip-hash test file

added methods to provide a better hash with small input sets

[TESTS] add a benchmark for integer hashing

If we want to support netmasks for IP address hashing,
we will need something better than a pure modulus, otherwise
people with even numbers of servers will get surprizes.

Bob Jenkins is known for his works on hashing, and his site
has a lot of very interesting researches and algorithms for
integer hashing. He also points to the work of Thomas Wang
who has similar findings.

The program here tests their algorithms in order to find one
well suited for IP address hashing.

[TESTS] add a test case for port mapping

[BUG] fix double-decrement of server connections

If a client does a sudden dirty close (CL_STCLOSE) during a server
connect turn-around, then the number of server connections is
decremented twice. This causes huge problems on the affected
server because when its connection number becomes negative, it
overflows and prevents the server from accepting new connections
due to an apparent saturation.

The fix consists in not decrementing the counter if the server is
in a turn-around state.

[DOC] fix unescaped space in httpchk example.

Lars Braeuer reported a missing space in the example which drives
readers in wrong direction.

[BUILD] Replace hardcoded 'LD = gcc' with 'LD = $(CC)'

haproxy relies on linking the binary using gcc, so there is no real need to
hardcode both (CC and LD). Setting 'LD = $(CC)' will make the build system
a bit more cross-compile friendly because only the right cross-compiler has
to be passed via make.

[MEDIUM] fix stats socket limitation to 16 kB

Due to the way the stats socket work, it was not possible to
maintain the information related to the command entered, so
after filling a whole buffer, the request was lost and it was
considered that there was nothing to write anymore.

The major reason was that some flags were passed directly
during the first call to stats_dump_raw() instead of being
stored persistently in the session.

To definitely fix this problem, flags were added to the stats
member of the session structure.

A second problem appeared. When the stats were produced, a first
call to client_retnclose() was performed, then one or multiple
subsequent calls to buffer_write_chunks() were done. But once the
stats buffer was full and a reschedule operated, the buffer was
flushed, the write flag cleared from the buffer and nothing was
done to re-arm it.

For this reason, a check was added in the proto_uxst_stats()
function in order to re-call the client FSM when data were added
by stats_dump_raw(). Finally, the whole unix stats dump FSM was
rewritten to avoid all the magics it depended on. It is now
simpler and looks more like the HTTP one.

[DOC] document the leastconn LB algo

[BUILD] Changed 'install' target to install the binaries first

[BUILD] Removed "/" after '$(DESTDIR)' in install targets

SBINDIR, MANDIR and DOCDIR use an absolute path

[BUILD] Added 'install-doc' make target

This change is also introducing a new make variable called DOCDIR,
which is set to "$(PREFIX)/doc/haproxy" by default.

[BUILD] Added 'install-bin' make target

[BUILD] Added 'install-man' make target for installing the man page

This change is also introducing a new variable in the Makefile called
MANDIR, which is set to "$PREFIX/share/man" by default.

[BUILD] Added support for 'make install'

To be flexible while installing haproxy following variables have been
added to the Makefile:
- DESTDIR useful i.e. while installing in a sandbox (not set by default)
- PREFIX defines the default install prefix (default: /usr/local)
- SBINDIR defines the dir the haproxy binary gets installed
(default: $PREFIX/sbin)

[BUILD] ensure that users don't build without setting the target anymore.

Too often, people report performance issues on Linux 2.6 because they don't
use the available optimizations. We need to ensure that people are aware of
the available features, and for this, we must force them to choose a target
OS (or "generic"), but at least prevent them from blindly building for a
generic target.

[MAJOR] implementation of the "leastconn" load balancing algorithm

The new "leastconn" LB algorithm selects the server which has the
least established or pending connections. The weights are considered,
so that a server with a weight of 20 will get twice as many connections
as the server with a weight of 10.

The algorithm respects the minconn/maxconn settings, as well as the
slowstart since it is a dynamic algorithm. It also correctly supports
backup servers (one and all).

It is generally suited for protocols with long sessions (such as remote
terminals and databases), as it will ensure that upon restart, a server
with no connection will take all new ones until its load is balanced
with others.

A test configuration has been added in order to ease regression testing.

[BUG] option allbackups was not working anymore in roundrobin mode

Commit 3168223a7b33a1d5aad1e11b8f2ad917645d7f27 broke option
"allbackups" in roundrobin mode due to an erroneous structure
member replacement in backend.c. The PR_O_USE_ALL_BK flag was
not tested in the right member anymore.

This bug uncoverred another one, by which all backup servers would
be used whatever the option's value, if all of them had been seen
as simultaneously failed at one moment.

This patch fixes the two stupid errors. Correctness has been tested
using the test-fwrr.cfg config example.

[BUG] str2sun could leak a small buffer in case of error during parsing

Matt Farnsworth reported a memory leak in str2sun() in case a too large
socket path is passed. The bug is very minor because it only happens
once during config parsing, but has to be fixed nevertheless. The patch
Matt provided could even be improved by completely removing the useless
strdup() in this function.

[CLEANUP] report dlmalloc's source path only if explictly specified

There's no point in reporting dlmalloc's source path if it was the
default one.

[CLEANUP] update .gitignore to ignore more temporary files

[MINOR] silent gcc for a wrong warning

gcc believes that avoididx may be used uninitialized, which is wrong.

[MAJOR] proto_uxst rework -> SNMP support

Currently there is a ~16KB limit for a data size passed via unix socket.
It is caused by a trivial bug ttat is going to fixed soon, however
in most cases there is no need to dump a full stats.

This patch makes possible to select a scope of dumped data by extending
current "show stat" to "show stat [<iid> <type> <sid>]":
- iid is a proxy id, -1 to dump all proxies
- type selects type of dumpable objects: 1 for frontend, 2 for backend, 4 for
   server, -1 for all types. Values can be ORed, for example:
     1+2=3   -> frontend+backend.
     1+2+4=7 -> frontend+backend+server.
- sid is a service id, -1 to dump everything from the selected proxy.

To do this I implemented a new session flag (SN_STAT_BOUND), added three
variables in data_ctx.stats (iid, type, sid), modified dumpstats.c and
completely revorked the process_uxst_stats: now it waits for a "\n"
terminated string, splits args and uses them. BTW: It should be quite easy
to add new commands, for example to enable/disable servers, the only problem
I can see is a not very lucky config name (*stats* socket). :|

During the work I also fixed two bug:
- s->flags were not initialized for proto_uxst
- missing comma if throttling not enabled (caused by a stupid change in
     "Implement persistent id for proxies and servers")

Other changes:
- No more magic type valuse, use STATS_TYPE_FE/STATS_TYPE_BE/STATS_TYPE_SV
- Don't memset full s->data_ctx (it was clearing s->data_ctx.stats.{iid/type/sid},
    instead initialize stats.sv & stats.sv_st (stats.px and stats.px_st were already
    initialized)

With all that changes it was extremely easy to write a short perl plugin
for a perl-enabled net-snmp (also included in this patch).

29385 is my PEN (Private Enterprise Number) and I'm willing to donate
the SNMPv2-SMI::enterprises.29385.106.* OIDs for HAProxy if there
is nothing assigned already.

[MEDIUM]: Prevent redispatcher from selecting the same server, version #3

When haproxy decides that session needs to be redispatched it chose a server,
but there is no guarantee for it to be a different one. So, it often
happens that selected server is exactly the same that it was previously, so
a client ends up with a 503 error anyway, especially when one sever has
much bigger weight than others.

Changes from the previous version:
- drop stupid and unnecessary SN_DIRECT changes

- assign_server(): use srvtoavoid to keep the old server and clear s->srv
    so SRV_STATUS_NOSRV guarantees that t->srv == NULL (again)
    and get_server_rr_with_conns has chances to work (previously
    we were passing a NULL here)

- srv_redispatch_connect(): remove t->srv->cum_sess and t->srv->failed_conns
   incrementing as t->srv was guaranteed to be NULL

- add avoididx to get_server_rr_with_conns. I hope I correctly understand this code.

- fix http_flush_cookie_flags() and move it to assign_server_and_queue()
   directly. The code here was supposed to set CK_DOWN and clear CK_VALID,
   but: (TX_CK_VALID | TX_CK_DOWN) == TX_CK_VALID == TX_CK_MASK so:
if ((txn->flags & TX_CK_MASK) == TX_CK_VALID)
txn->flags ^= (TX_CK_VALID | TX_CK_DOWN);
   was really a:
if ((txn->flags & TX_CK_MASK) == TX_CK_VALID)
txn->flags &= TX_CK_VALID

   Now haproxy logs "--DI" after redispatching connection.

- defer srv->redispatches++ and s->be->redispatches++ so there
   are called only if a conenction was redispatched, not only
   supposed to.

- don't increment lbconn if redispatcher selected the same sarver

- don't count unsuccessfully redispatched connections as redispatched
   connections

- don't count redispatched connections as errors, so:

- the number of connections effectively served by a server is:
srv->cum_sess - srv->failed_conns - srv->retries - srv->redispatches
   and
SUM(servers->failed_conns) == be->failed_conns

- requires the "Don't increment server connections too much + fix retries" patch

- needs little more testing and probably some discussion so reverting to the RFC state

Tests #1:
retries 4
redispatch

i) 1 server(s): b (wght=1, down)
  b) sessions=5, lbtot=1, err_conn=1, retr=4, redis=0
  -> request failed

ii) server(s): b (wght=1, down), u (wght=1, down)
  b) sessions=4, lbtot=1, err_conn=0, retr=3, redis=1
  u) sessions=1, lbtot=1, err_conn=1, retr=0, redis=0
  -> request FAILED

iii) 2 server(s): b (wght=1, down), u (wght=1, up)
  b) sessions=4, lbtot=1, err_conn=0, retr=3, redis=1
  u) sessions=1, lbtot=1, err_conn=0, retr=0, redis=0
  -> request OK

iv) 2 server(s): b (wght=100, down), u (wght=1, up)
  b) sessions=4, lbtot=1, err_conn=0, retr=3, redis=1
  u) sessions=1, lbtot=1, err_conn=0, retr=0, redis=0
  -> request OK

v) 1 server(s): b (down for first 4 SYNS)
  b) sessions=5, lbtot=1, err_conn=0, retr=4, redis=0
  -> request OK

Tests #2:
retries 4

i) 1 server(s): b (down)
  b) sessions=5, lbtot=1, err_conn=1, retr=4, redis=0
  -> request FAILED

[BUG] Don't increment server connections too much + fix retries

Commit 98937b875798e10fac671d109355cde29d2a411a while fixing
one bug introduced another one. With "retries 4" and
"option redispatch" haproxy tries to connect 4 times to
one server server and 1 time to a second one. However
logs showed 5 connections to the first server (the
last one was counted twice) and 2 to the second.

This patch also fixes srv->retries and be->retries increments.

Now I get: 3 retries and 1 error in a first server (4 cum_sess)
and 1 error in a second server (1 cum_sess) with:
retries 4
option redispatch

and: 4 retries and 1 error (5 cum_sess) with:
retries 4

So, the number of connections effectively served by a server is:
srv->cum_sess - srv->failed_conns - srv->retries

[MINOR] Implement persistent id for proxies and servers

This patch adds a possibility to set a persistent id for a proxy/server.
Now, even if some proxies/servers are inserted/deleted/moved, iids and
sids can be still used reliable.

Some people add servers with tricky names (BACKEND or FRONTEND for example).
So I also added one more field ('type') to distinguish between a
backend (0), frontend (1) and server (2) without complicated logic:
if name==BACKEND and sid==0 then type is BACKEND else type is SERVER,
etc for a FRONTEND. It also makes possible to have one frontend with more
than one IP (a patch coming soon) with independed stats - for example to
differs between remote and local traffic.

Finally, I added documentation about the CSV format.

This patch depends on '[MEDIUM] Implement "track [<backend>/]<server>"'

[MEDIUM] Implement "track [<backend>/]<server>"

This patch implements ability to set the current state of one server
by tracking another one. It:
- adds two variables: *tracknext, *tracked to struct server
- implements findserver(), similar to findproxy()
- adds "track" keyword accepting both "proxy/server" and "server" (assuming current proxy)
- verifies if both checks and tracking is not enabled at the same time
- changes set_server_down() to notify tracking server
- creates set_server_up(), set_server_disabled(), set_server_enabled() by
moving the code from process_chk() and adding notifications
- changes stats to show a name of tracked server instead of Chk/Dwn/Dwntime(html)
or by adding new variable (csv)

Changes from the previuos version:
- it is possibile to track independently of the declaration order
- one extra comma bug is fixed
- new condition to check if there is no disable-on-404 inconsistency

[BUILD] ensure that makefile understands USE_DLMALLOC=1

USE_DLMALLOC=1 was ignored since last makefile update. It's better
to keep it running for existing setups.

[BUG] do not apply timeout.connect in checks if unset

tv_bound() does not consider infinite timeouts, so we must
check that timeout.connect is set before applying it to the
checks.

[BUG] appsession lookup in URL does not work

We've been trying to use the latest release (1.3.14.2) of haproxy  to do
sticky sessions.  Cookie insertion is not an option for us, although we
would much rather use it, as we are trying to work around a problem where
cookies are unreliable.  The appsession functionality only partially worked
(it wouldn't read the session id out of a query string) until we made the
following code change to the get_srv_from_appsession function in
proto_http.c.

[TESTS] add test-pollers.cfg to easily report pollers in use

[BUG] timeout.check was not pre-set to eternity

If timeout.check was not set, check were using 0 as the timeout, causing
odd behaviours.

[BUG] failed conns were sometimes incremented in the frontend!

[OPTIM] GCC4's builtin_expect() is suboptimal

GCC4 is stupid (unbelievable news!).

When some code uses __builtin_expect(x != 0, 1), it really performs
the check of x != 0 then tests that the result is not zero! This is
a double check when only one was expected. Some performance drops
of 10% in the HTTP parser code have been observed due to this bug.

GCC 3.4 is fine though.

A solution consists in expecting that the tested value is 1. In
this case, it emits the correct code, but it's still not optimal
it seems. Finally the best solution is to ignore likely() and to
pray for the compiler to emit correct code. However, we still have
to fix unlikely() to remove the test there too, and to fix all
code which passed pointers overthere to pass integers instead.

[OPTIM] used unsigned ints for HTTP state and message offsets

State and offsets within http_msg were incorrectly set to signed int.
Turning them into unsigned slightly improved performance while reducing
code size.

[BUILD] backend.c and checks.c did not build without tproxy !

missing #ifdefs. The right patch this time!

Revert "[BUILD] backend.c and checks.c did not build without tproxy !"

This reverts commit 3c3c0122f84d72eae1c4ef4b1826bfdbef7d95e6.
This commit was buggy as it also removed previous tproxy changes !

[BUILD] backend.c and checks.c did not build without tproxy !

missing #ifdefs.

[DOC] add a configuration entry for "server ... redir <prefix>"

[TESTS] add a test case for the server redirection mechanism

[MEDIUM] completely implement the server redirection method

Now when a server has "redir <prefix>" on its config line, any HEAD or GET
request addressing it will lead to a 302 with Location set to "<prefix>"
immediately followed by the relative URI of the incoming request. This makes
it very easy to send redirect to browsers to check remote static servers, as
well as to provide redirection for remote sites when the local one is down.

[MINOR] add configuration support for "redir" server keyword

The servers now support the "redir" keyword, making it possible to
return a 302 with the specified prefix in front of the request instead
of connecting to them. This is generally useful for multi-site load
balancing but may also serve in order to achieve very high traffic
rate.

The keyword has only been added to the config parser and to structures,
it's not used yet.

[DOC] applied small fixes from early readers

[DOC] fix erroneous "useallbackups" option in the doc

[BUILD] update MacOS Makefile to build on newer versions

This update from Dan Zinngrabe enables building on MacOS 10.4 and 10.5.

[DOC] Update a "contrib" file with a hint about a scheme used for formathing subjects

With each new patch I had to search for the e-mail from Willy
describing the schem used for formathing subjects. No more. ;)

[MINOR] report correct section type for unknown keywords.

An unknown keyword was always reported in section "listen" for any
section type (defaults, listen, frontend, backend, ...).

[DOC] minor cleanup of the doc and notice to contributors

[MEDIUM]: rework checks handling

This patch adds two new variables: fastinter and downinter.
When server state is:
- non-transitionally UP -> inter (no change)
- transitionally UP (going down), unchecked or transitionally DOWN (going up) -> fastinter
- down -> downinter

It allows to set something like:
server sr6 127.0.51.61:80 cookie s6 check inter 10000 downinter 20000 fastinter 500 fall 3 weight 40
In the above example haproxy uses 10000ms between checks but as soon as
one check fails fastinter (500ms) is used. If server is down
downinter (20000) is used or fastinter (500ms) if one check pass.
Fastinter is also used when haproxy starts.

New "timeout.check" variable was added, if set haproxy uses it as an additional
read timeout, but only after a connection has been already established. I was
thinking about using "timeout.server" here but most people set this
with an addition reserve but still want checks to kick out laggy servers.
Please also note that in most cases check request is much simpler
and faster to handle than normal requests so this timeout should be smaller.

I also changed the timeout used for check connections establishing.

Changes from the previous version:
- use tv_isset() to check if the timeout is set,
- use min("timeout connect", "inter") but only if "timeout check" is set
as this min alone may be to short for full (connect + read) check,
- debug code (fprintf) commented/removed
- documentation

Compile tested only (sorry!) as I'm currently traveling but changes
are rather small and trivial.

[BUG]: Restore clearing t->logs.bytes

Commit 8b3977ffe36190e45fb974bf813bfbd935ac99a1 removed "t->logs.bytes_in = 0;"
but instead it should change it into "t->logs.bytes_out = 0;" as since
583bc966064e248771e75c253dddec7351afd8a2 counters are incremented not set.

It should be incremented in session_process_counters while sending data to a
client:
bytes = s->rep->total - s->logs.bytes_out;
s->logs.bytes_out = s->rep->total;

However, if we increment (set) s->logs.bytes_out while processing
"logasap", statistics get wrong values added for headers: 0 or even
negative if haproxy adds some headers itself.

To test it, please enable logasap and download one empty file and look at
stats. Without my fix information available on that page are invalid, for
example:

# pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status,weight,act,bck,chkfail,chkdown,lastchg,downtime,qlimit,pid,iid,sid,throttle,lbtot,
www,b,0,0,0,1,,1,24,-92,,0,,0,0,0,,UP,1,1,0,0,0,3121,0,,1,2,1,,1,
www,BACKEND,0,0,0,1,0,1,24,-92,0,0,,0,0,0,0,UP,1,1,0,,0,3121,0,,1,2,0,,1,

[MINOR] fix configuration hint about timeouts

Do not talk about "clitimeout", "contimeout" or "srvtimeout"
anymore.

[MINOR] use s->frt_addr as the server's address in transparent proxy

There's no point trying to check original dest addr with only one
method when doing transparent proxy as in full transparent mode,
the real destination address is required. Let's copy the one from
the frontend.

[BUG] fix truncated responses with sepoll

Due to the way Linux delivers EPOLLIN and EPOLLHUP, a closed connection
received after some server data sometimes results in truncated responses
if the client disconnects before server starts to respond. The reason
is that the EPOLLHUP flag is processed as an indication of end of
transfer while some data may remain in the system's socket buffers.

This problem could only be triggered with sepoll, although nothing should
prevent it from happening with normal epoll. In fact, the work factoring
performed by sepoll increases the risk that this bug appears.

The fix consists in making FD_POLL_HUP and FD_POLL_ERR sticky and that
they are only checked if FD_POLL_IN is not set, meaning that we have
read all pending data.

That way, the problem is definitely fixed and sepoll still remains about
17% faster than epoll since it can take into account all information
returned by the kernel.

[BUILD] code did not build in full debug mode

[BUG] log response byte count, not request

Due to a shameless copy-paste typo, the number of bytes logged was
from the request and not the response. This bug has been present
for a long time.

[DOC] added documentation about HTTP header manipulations

This section has been inserted before the logging section.

[DOC] document all req* and rsp* keywords.

[DOC] all server parameters have been documented

[DOC] added "server", "source" and "stats" keywords

The documentation now lists all keywords except the req* and rsp*. The
"server" keyword has been documented for mandatory parameters. Specific
settings are still waiting to be written in a dedicated section.

[BUG] build failed on CONFIG_HAP_LINUX_TPROXY without CONFIG_HAP_CTTPROXY

changed #ifdef

[MEDIUM] fix server health checks source address selection

The source address selection for health checks did not consider
the new transparent proxy method. Rely on the same unified function
as the other connect() calls.

This patch also fixes a bug by which the proxy's source address was
ignored if cttproxy was used.

[BUG] fix overlapping server flags

Server flags SRV_GOINGDOWN, SRV_WARMINGUP were overlapping
SRV_TPROXY_*.

[BUG] use backend's source and not server's source with tproxy

copy-paste typo.

[MINOR] add transparent proxy support for balabit's Tproxy v4

Balabit's TPROXY version 4 which replaces CTTPROXY provides a similar
API to the previous proxy, but relies on IP_FREEBIND instead of
IP_TRANSPARENT. Let's add it.

[MEDIUM] add non-local bind to connect() on Linux

Using some Linux kernel patches which add the IP_TRANSPARENT
SOL_IP option , it is possible to bind to a non-local address
on without having resort to any sort of NAT, thus causing no
performance degradation.

This is by far faster and cleaner than the previous CTTPROXY
method. The code has been slightly changed in order to remain
compatible with CTTPROXY as a fallback for the new method when
it does not work.

It is not needed anymore to specify the outgoing source address
for connect, it can remain 0.0.0.0.

[MEDIUM] support fully transparent proxy on Linux (USE_LINUX_TPROXY)

Using some Linux kernel patches, it is possible to redirect non-local
traffic to local sockets when IP forwarding is enabled. In order to
enable this option, we introduce the "transparent" option keyword on
the "bind" command line. It will make the socket reachable by remote
sources even if the destination address does not belong to the machine.

[BUG] connect_server: server might not exist when sending error report

In connect_server(), we may send an alert with the server name while
the server might not exist, eg in dispatch mode.

[DOC] added documentation for "option tcplog" to "use_backend"

- options tcplog, tcpsplice and transparent have been documented.
- keywords "srvtimeout", "timeout queue", "timeout server" and
"timeout tarpit" have been documented
- keywords "transparent" and "use_backend" have been documented

Only "server", "source" and "stats *" remain undocumented

[DOC] document options nolinger to ssl-hello-chk

Options nolinger, persist, smtpchk and ssl-hello-chk have been
documented. All keywords and options up to and including option
tcpka are now documented.

[BUG] fix typo in redispatched connection

a copy-paste typo was present in the reconnection code responsible
for respatching. The client's FSM would not be re-evaluated if an
error occurred. It looks harmless but better fix it.

[MEDIUM] add a turn-around state of one second after a connection failure

Several users have complained that when haproxy gets a connection
failure due to an active reject from a server, it immediately
retries, often leading to the same situation being repeated until
the retry counter reaches zero.

Now if a connection error shows up, a turn-around state of 1 second
is applied before retrying. This is performed by faking a connection
timeout in order not to touch much code. However, a cleaner method
would involve an extra state.

[MEDIUM]: Count retries and redispatches also for servers, fix redistribute_pending, extend logs, %d->%u cleanup

This patch extends a little previously added functionality to also
count retries and redispatches for servers. Now it is possible to know
which server causes redispatches as it is not always the same that takes
most retries.

While working with the code I found that redistribute_pending() does not increment
srv->redispatches && be->redispatches. I don't know how to test it but
I think the fix is correct. If not I can withdraw it.

I also extended logs to show how many retries were done and if redispatching
was necessary ('+'). I'm using an additional session flag SN_REDISP to match
redispatched connections. I had to rearrange all defines in session.h to make
more room for it.

The documentation about logs was also fixed a little (sorry, english only),
as current version uses totally different format. BTW: examples are still
outdated, maybe next time...

Finally, I changed %d -> %u for retries/redispatches as those variables
are declared as unsigned.

[BUG] increment server connections for each connect()

It was abnormal to see more connect errors than connect attempts.
This was caused by the fact that the server's connection count was
not incremented for failed connect() attempts.

Now the per-server connections are correctly incremented for each
connect() attempt. This includes the retries too. The number of
connections effectively served by a server will then be :

srv->cum_sess - srv->errors - srv->warnings

[MINOR] tarpit timeout is also allowed in backends

Since the tarpit action may be set in backends too, its timeout
must be configurable there.

[MEDIUM] introduce "timeout http-request" in frontends

In order to offer DoS protection, it may be required to lower the maximum
accepted time to receive a complete HTTP request without affecting the client
timeout. This helps protecting against established connections on which
nothing is sent. The client timeout cannot offer a good protection against
this abuse because it is an inactivity timeout, which means that if the
attacker sends one character every now and then, the timeout will not
trigger. With the HTTP request timeout, no matter what speed the client
types, the request will be aborted if it does not complete in time.

[OPTIM] introduce global parameter "tune.maxaccept"

This new parameter makes it possible to override the default
number of consecutive incoming connections which can be
accepted on a socket. By default it is not limited on single
process mode, and limited to 8 in multi-process mode.

[MINOR] add support for the "backlog" parameter

Add the "backlog" parameter to frontends, to give hints to
the system about the approximate listen backlog desired size.

In order to protect against SYN flood attacks, one solution is
to increase the system's SYN backlog size. Depending on the
system, sometimes it is just tunable via a system parameter,
sometimes it is not adjustable at all, and sometimes the system
relies on hints given by the application at the time of the
listen() syscall. By default, HAProxy passes the frontend's
maxconn value to the listen() syscall. On systems which can
make use of this value, it can sometimes be useful to be able
to specify a different value, hence this backlog parameter.