Willy Tarreau [Tue, 13 Aug 2013 15:51:07 +0000 (17:51 +0200)]
BUG/MEDIUM: unique_id: HTTP request counter must be unique!
The HTTP request counter is incremented non atomically, which means that
many requests can log the same ID. Let's increment it when it is consumed
so that we avoid this case.
This bug was reported by Patrick Hemmer. It's 1.5-specific and does not
need to be backported.
Willy Tarreau [Tue, 13 Aug 2013 15:19:08 +0000 (17:19 +0200)]
MINOR: config: warn when a server with no specific port uses rdp-cookie
Mathew Levett reported an issue which is a bit nasty and hard to track
down. RDP cookies contain both the IP and the port, and haproxy matches
them exactly. So if a server has no port specified (or a remapped port),
it will never match a port specified in a cookie. Better warn the user
when this is detected.
Willy Tarreau [Tue, 13 Aug 2013 14:59:39 +0000 (16:59 +0200)]
MINOR: ssl: use MAXPATHLEN instead of PATH_MAX
Apollon Oikonomopoulos reported a build failure on Hurd where PATH_MAX
is not defined. The only place where it is referenced is ssl_sock.c,
all other places use MAXPATHLEN instead, with a fallback to 128 when
the OS does not define it. So let's switch to MAXPATHLEN as well.
Id assume that the entries with server_id = 2 would be removed but its
removing everything each time."
The cause of the issue is a missing test for skip_entry when deciding
whether to clear the key or not. The test was present when only the
last node is to be removed, so removing only the first node from a
list of two always did the right thing, explaining why it remained
unnoticed in basic unit tests.
The bug was introduced by commit 8fa52f4e which attempted to fix a
previous issue with this feature where only the last node was removed.
This bug is 1.5-specific and does not require any backport.
Godbach [Wed, 7 Aug 2013 01:48:23 +0000 (09:48 +0800)]
BUG/MINOR: use the same check condition for server as other algorithms
Such load balance algorithms as roundrobin, leastconn and first will check the
server after being selected with the following condition:
if (!s->maxconn || (!s->nbpend && s->served < srv_dynamic_maxconn(s)))
But static-rr uses the different one in map_get_server_rr() as below:
if (!srv->maxconn || srv->cur_sess < srv_dynamic_maxconn(srv))
After viewing this difference, it is a better choice for static-rr to use the
same check condition as other algorithms.
This change will only affect static-rr. Though all hash algorithms with type
map-based will use the same server map as static-rr, they call another function
map_get_server_hash() to get server.
Willy Tarreau [Fri, 2 Aug 2013 09:07:32 +0000 (11:07 +0200)]
MINOR: payload: allow the payload sample fetches to retrieve arbitrary lengths
When using req.payload and res.payload to look up for specific content at an
arbitrary location, we're often facing the problem of not knowing the input
buffer length. If the length argument is larger than the buffer length, the
function did not match, and if they're smaller, there is a risk of not getting
the expected content. This is especially true when looking for data in SOAP
requests.
So let's make some provisions for scanning the whole buffer by specifying a
length of 0 bytes. This greatly simplifies the processing of random-sized
input data.
Willy Tarreau [Thu, 1 Aug 2013 19:11:42 +0000 (21:11 +0200)]
MINOR: cli: make it possible to enter multiple values at once with "set table"
The "set table" statement allows to create new entries with their respective
values. Till now it was limited to a single data type per line, requiring as
many "set table" statements as the desired data types to be set. Since this
is only a parser limitation, this patch gets rid of it. It also allows the
creation of a key with no data types (all reset to their default values).
MEDIUM: cli: adjust the method for feeding frequency counters in tables
Since commit 654694e1, it has been possible to feed some data into
stick tables from the CLI. That commit considered that frequency
counters would only have their previous value set, so that they
progressively fade out. But this does not match any real world
use case in fact. The only reason for feeding a freq counter is
to pass some data learned outside. We certainly don't want to see
such data start to vanish immediately, otherwise it will force the
external scripts to loop very frequently to limit the losses.
So let's set the current value instead in order to guarantee that
the data remains stable over the full period, then starts to fade
out between 1* and 2* the period.
MEDIUM: counters: support looking up a key in an alternate table
sc_* sample fetches now take an optional parameter which allows to look
the key in an alternate table. This is convenient to pass multiple
information for the same key at once (eg: have multiple gpc0 for the
same key, or support being fed complementary information from the CLI).
Example :
listen front
bind :8000
tcp-request content track-sc0 src table local-ip
http-response set-header src-id %[sc0_get_gpc0]+%[sc0_get_gpc0(global-ip)]
server dummy 127.0.0.1:8001
backend local-ip
stick-table size 1k type ip store gpc0
backend global-ip
stick-table size 1k type ip store gpc0
MEDIUM: counters: support passing the counter number as a fetch argument
One very annoying issue when trying to extend the sticky counters beyond
the current 3 counters is that it requires a massive copy-paste of fetch
functions (we don't have to copy-paste code anymore), just so that the
fetch names exist.
So let's have an alternate form like "sc_*(num)" to allow passing the
counter number as an argument without having to redefine new fetch names.
The MAX_SESS_STKCTR macro defines the number of usable sticky counters,
which defaults to 3.
MINOR: session: make the number of stick counter entries more configurable
In preparation of more flexibility in the stick counters, make their
number configurable. It still defaults to 3 which is the minimum
accepted value. Changing the value alone is not sufficient to get
more counters, some bitfields still need to be updated and the TCP
actions need to be updated as well, but this update tries to be
easier, which is nice for experimentation purposes.
MEDIUM: counters: factor out smp_fetch_sc*_trackers
smp_fetch_sc0_trackers, smp_fetch_sc1_trackers and smp_fetch_sc2_trackers
were merged into a single function which relies on the fetch name to decide
what to return.
This is also a bug fix for this feature which has never worked till its bogus
introduction by commit "2406db4 MEDIUM: counters: add sc1_trackers/sc2_trackers"
(1.5-dev10).
Instead of returning the value in the sample, it was returned as the fetch
result!
There is no need to backport this fix anyway since it's 1.5-specific and
nobody uses the feature.
MEDIUM: counters: factor out smp_fetch_sc*_bytes_out_rate
smp_fetch_sc0_bytes_out_rate, smp_fetch_sc1_bytes_out_rate, smp_fetch_sc2_bytes_out_rate,
smp_fetch_src_bytes_out_rate and smp_fetch_bytes_out_rate were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_kbytes_out
smp_fetch_sc0_kbytes_out, smp_fetch_sc1_kbytes_out, smp_fetch_sc2_kbytes_out,
smp_fetch_src_kbytes_out and smp_fetch_kbytes_out were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_bytes_in_rate
smp_fetch_sc0_bytes_in_rate, smp_fetch_sc1_bytes_in_rate, smp_fetch_sc2_bytes_in_rate,
smp_fetch_src_bytes_in_rate and smp_fetch_bytes_in_rate were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_kbytes_in
smp_fetch_sc0_kbytes_in, smp_fetch_sc1_kbytes_in, smp_fetch_sc2_kbytes_in,
smp_fetch_src_kbytes_in and smp_fetch_kbytes_in were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_http_err_rate
smp_fetch_sc0_http_err_rate, smp_fetch_sc1_http_err_rate, smp_fetch_sc2_http_err_rate,
smp_fetch_src_http_err_rate and smp_fetch_http_err_rate were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_http_err_cnt
smp_fetch_sc0_http_err_cnt, smp_fetch_sc1_http_err_cnt, smp_fetch_sc2_http_err_cnt,
smp_fetch_src_http_err_cnt and smp_fetch_http_err_cnt were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_http_req_rate
smp_fetch_sc0_http_req_rate, smp_fetch_sc1_http_req_rate, smp_fetch_sc2_http_req_rate,
smp_fetch_src_http_req_rate and smp_fetch_http_req_rate were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_http_req_cnt
smp_fetch_sc0_http_req_cnt, smp_fetch_sc1_http_req_cnt, smp_fetch_sc2_http_req_cnt,
smp_fetch_src_http_req_cnt and smp_fetch_http_req_cnt were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_sess_rate
smp_fetch_sc0_sess_rate, smp_fetch_sc1_sess_rate, smp_fetch_sc2_sess_rate,
smp_fetch_src_sess_rate and smp_fetch_sess_rate were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_sess_cnt
smp_fetch_sc0_sess_cnt, smp_fetch_sc1_sess_cnt, smp_fetch_sc2_sess_cnt,
smp_fetch_src_sess_cnt and smp_fetch_sess_cnt were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_conn_cur
smp_fetch_sc0_conn_cur, smp_fetch_sc1_conn_cur, smp_fetch_sc2_conn_cur,
smp_fetch_src_conn_cur and smp_fetch_conn_cur were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_conn_rate
smp_fetch_sc0_conn_rate, smp_fetch_sc1_conn_rate, smp_fetch_sc2_conn_rate,
smp_fetch_src_conn_rate and smp_fetch_conn_rate were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_conn_cnt
smp_fetch_sc0_conn_cnt, smp_fetch_sc1_conn_cnt, smp_fetch_sc2_conn_cnt,
smp_fetch_src_conn_cnt and smp_fetch_conn_cnt were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_clr_gpc0
smp_fetch_sc0_clr_gpc0, smp_fetch_sc1_clr_gpc0, smp_fetch_sc2_clr_gpc0,
smp_fetch_src_clr_gpc0 and smp_fetch_clr_gpc0 were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_inc_gpc0
smp_fetch_sc0_inc_gpc0, smp_fetch_sc1_inc_gpc0, smp_fetch_sc2_inc_gpc0,
smp_fetch_src_inc_gpc0 and smp_fetch_inc_gpc0 were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_gpc0_rate
smp_fetch_sc0_gpc0, smp_fetch_sc1_gpc0, smp_fetch_sc2_gpc0,
smp_fetch_src_gpc0 and smp_fetch_gpc0 were merged into a single
function which relies on the fetch name to decide what to return.
MEDIUM: counters: factor out smp_fetch_sc*_get_gpc0
smp_fetch_sc0_get_gpc0, smp_fetch_sc1_get_gpc0, smp_fetch_sc2_get_gpc0,
smp_fetch_src_get_gpc0 and smp_fetch_get_gpc0 were merged into a single
function which relies on the fetch name to decide what to return.
MINOR: counters: provide a generic function to retrieve a stkctr for sc* and src.
This function aims at simplifying the prefetching of the table and entry
when using any of the session counters fetches. The principle is that the
src_* variant produces a stkctr that is used instead of the one from the
session. That way we can call the same function from all session counter
fetch functions and always have a single function to support sc[0-9]_/src_.
This function is also called directly from backend.c, so let's stop
building fake args to call it as a sample fetch, and have a lower
layer more generic function instead.
MEDIUM: sample: systematically pass the keyword pointer to the keyword
We're having a lot of duplicate code just because of minor variants between
fetch functions that could be dealt with if the functions had the pointer to
the original keyword, so let's pass it as the last argument. An earlier
version used to pass a pointer to the sample_fetch element, but this is not
the best solution for two reasons :
- fetch functions will solely rely on the keyword string
- some other smp_fetch_* users do not have the pointer to the original
keyword and were forced to pass NULL.
So finally we're passing a pointer to the keyword as a const char *, which
perfectly fits the original purpose.
MINOR: samples: add the http_date([<offset>]) sample converter.
Converts an integer supposed to contain a date since epoch to
a string representing this date in a format suitable for use
in HTTP header fields. If an offset value is specified, then
it is a number of seconds that is added to the date before the
conversion is operated. This is particularly useful to emit
Date header fields, Expires values in responses when combined
with a positive offset, or Last-Modified values when the
offset is negative.
MINOR: sample: add a new "date" fetch to return the current date
Returns the current date as the epoch (number of seconds since 01/01/1970).
If an offset value is specified, then it is a number of seconds that is added
to the current date before returning the value. This is particularly useful
to compute relative dates, as both positive and negative offsets are allowed.
CLEANUP: acl: move the 3 remaining sample fetches to samples.c
There is no more reason for having "always_true", "always_false" and "env"
in acl.c while they're the most basic sample fetch keywords, so let's move
them to sample.c where it's easier to find them.
MINOR: sample: fix sample_process handling of unstable data
sample_process() used to return NULL on changing data, regardless of the
SMP_OPT_FINAL flag. Let's change this so that it is now possible to
include such data in logs or HTTP headers. Also, one unconvenient
thing was that it used to always set the sample flags to zero, making
it incompatible with ACLs which may need to call it multiple times. Only
do this for locally-allocated samples.
MEDIUM: sample: handle comma-delimited converter list
We now support having a comma-delimited converter list, which can start
right after the fetch keyword. The immediate benefit is that it allows
to use converters in log-format expressions, for example :
set-header source-net %[src,ipmask(24)]
The parser is also slightly improved and should be more resilient against
configuration errors. Also, optional arguments in converters were mistakenly
not allowed till now, so this was fixed.
OPTIM: splicing: use splice() for the last block when relevant
Splicing is avoided for small transfers because it's generally cheaper
to perform a couple of recv+send calls than pipe+splice+splice. This
has the consequence that the last chunk of a large transfer may be
transferred using recv+send if it's less than 4 kB. But when the pipe
is already set up, it's better to use splice() to read the pending data,
since they will get merged with the pending ones. This is what now
happens everytime the reader is slower than the writer.
Note that this change alone could have fixed most of the CPU hog bug,
except at the end when only the close was pending.
BUG/MINOR: stream_interface: don't call chk_snd() on polled events
As explained in previous patch, we incorrectly call chk_snd() when
performing a read even if the write event is already subscribed to
poll(). This is counter-productive because we're almost sure to get
an EAGAIN.
A quick test shows that this fix halves the number of failed splice()
calls without adding any extra work on other syscalls.
This could have been tagged as an improvement, but since this behaviour
made the analysis of previous bug more complex, it still qualifies as
a fix.
BUG/MEDIUM: splicing: fix abnormal CPU usage with splicing
Mark Janssen reported an issue in 1.5-dev19 which was introduced
in 1.5-dev12 by commit 96199b10. From time to time, randomly, the
CPU usage spikes to 100% for seconds to minutes.
A deep analysis of the traces provided shows that it happens when
waiting for the response to a second pipelined HTTP request, or
when trying to handle the received shutdown advertised by epoll()
after the last block of data. Each time, splice() was involved with
data pending in the pipe.
The cause of this was that such events could not be taken into account
by splice nor by recv and were left pending :
- the transfer of the last block of data, optionally with a shutdown
was not handled by splice() because of the validation that to_forward
is higher than MIN_SPLICE_FORWARD ;
- the next recv() call was inhibited because of the test on presence
of data in the pipe. This is also what prevented the recv() call
from handling a response to a pipelined request until the client
had ACKed the previous response.
No less than 4 different methods were experimented to fix this, and the
current one was finally chosen. The principle is that if an event is not
caught by splice(), then it MUST be caught by recv(). So we remove the
condition on the pipe's emptiness to perform an recv(), and in order to
prevent recv() from being used in the middle of a transfer, we mark
supposedly full pipes with CO_FL_WAIT_ROOM, which makes sense because
the reason for stopping a splice()-based receive is that the pipe is
supposed to be full.
The net effect is that we don't wake up and sleep in loops during these
transient states. This happened much more often than expected, sometimes
for a few cycles at end of transfers, but rarely long enough to be
noticed, unless a client timed out with data pending in the pipe. The
effect on CPU usage is visible even when transfering 1MB objects in
pipeline, where the CPU usage drops from 10 to 6% on a small machine at
medium bandwidth.
Some further improvements are needed :
- the last chunk of a splice() transfer is never done using splice due
to the test on to_forward. This is wrong and should be performed with
splice if the pipe has not yet been emptied ;
- si_chk_snd() should not be called when the write event is already being
polled, otherwise we're almost certain to get EAGAIN.
Many thanks to Mark for all the traces he cared to provide, they were
essential for understanding this issue which was not reproducible
without.
BUG/MEDIUM: server: set the macro for server's max weight SRV_UWGHT_MAX to SRV_UWGHT_RANGE
The max weight of server is 256 now, but SRV_UWGHT_MAX is still 255. As a result,
FWRR will not work well when server's weight is 256. The description is as below:
There are some macros related to server's weight in include/types/server.h:
#define SRV_UWGHT_RANGE 256
#define SRV_UWGHT_MAX (SRV_UWGHT_RANGE - 1)
#define SRV_EWGHT_MAX (SRV_UWGHT_MAX * BE_WEIGHT_SCALE)
Since weight of server can be reach to 256 and BE_WEIGHT_SCALE equals to 16,
the max eweight of server should be 256*16 = 4096, it will exceed SRV_EWGHT_MAX
which equals to SRV_UWGHT_MAX*BE_WEIGHT_SCALE = 255*16 = 4080. When a server
with weight 256 is insterted into FWRR tree during initialization, the key value
of this server should be SRV_EWGHT_MAX - s->eweight = 4080 - 4096 = -16 which
is closed to UINT_MAX in unsigned type, so the server with highest weight will
be not elected as the first server to process request.
In addition, it is a better choice to compare with SRV_UWGHT_MAX than a magic
number 256 while doing check for the weight. The max number of servers for
round-robin algorithm is also updated.
BUG/MAJOR: http: sample prefetch code was not properly migrated
When ACLs and samples were converged in 1.5-dev18, function
"acl_prefetch_http" was not properly converted after commit 8ed669b1.
It used to return -1 when contents did not match HTTP traffic, which
was considered as a "true" boolean result by the ACL execution code,
possibly causing crashes due to missing data when checking for HTTP
traffic in TCP rules.
Another issue is that when the function returned zero, it did not
set tje SMP_F_MAY_CHANGE flag, so it could randomly exit on partial
requests before waiting for a complete one.
Last issue is that when it returned 1, it did not set smp->data.uint,
so this last one would retain a random value from a past execution.
This could randomly cause some matches to fail as well.
Thanks to Remo Eichenberger for reporting this issue with a detailed
explanation and configuration.
BUG/MEDIUM: http: "option checkcache" fails with the no-cache header
The checkcache option checks for cacheable responses with a set-cookie
header. Since the response processing code was refactored in 1.3.8
(commit a15645d4), the check was broken because the no-cache value
is only checked as no-cache="set-cookie", and not alone.
BUG/MAJOR: http: don't emit the send-name-header when no server is available
Lukas Benes reported that http-send-name-header causes a segfault if no
server is available because we're dereferencing the session's target which
is NULL. The tiniest reproducer looks like this :
BUG: counters: third counter was not stored if others unset
Commit e25c917a introduced a third tracking counter bug forgot
to check it when storing values at the end of the session. The
impact is that if neither the first nor the second one are
changed, none of them are saved.
Lukas Tribus [Sun, 23 Jun 2013 15:37:13 +0000 (17:37 +0200)]
MEDIUM: http: add IPv6 support for "set-tos"
As per RFC3260 #4 and BCP37 #4.2 and #5.2, the IPv6 counterpart of TOS
is "traffic class".
Add support for IPv6 traffic class in "set-tos" by moving the "set-tos"
related code to the new inline function inet_set_tos(), handling IPv4
(IP_TOS), IPv6 (IPV6_TCLASS) and IPv4-mapped sockets (IP_TOS, like
::ffff:127.0.0.1).
Also define - if missing - the IN6_IS_ADDR_V4MAPPED() macro in
include/common/compat.h for compatibility.
Lukas Tribus [Wed, 19 Jun 2013 21:34:41 +0000 (23:34 +0200)]
BUG/MINOR: http: fix "set-tos" not working in certain configurations
s->req->prod->conn->addr.to.ss_family contains only useful data if
conn_get_to_addr() is called early. If thats not the case (nothing in the
configuration needs the destination address like logs, transparent, ...)
then "set-tos" doesn't work.
Fix this by checking s->req->prod->conn->addr.from.ss_family instead.
Also fix a minor doc issue about set-tos in http-response.
Willy Tarreau [Fri, 21 Jun 2013 21:16:39 +0000 (23:16 +0200)]
BUG/MEDIUM: prevent gcc from moving empty keywords lists into BSS
Benoit Dolez reported a failure to start haproxy 1.5-dev19. The
process would immediately report an internal error with missing
fetches from some crap instead of ACL names.
The cause is that some versions of gcc seem to trim static structs
containing a variable array when moving them to BSS, and only keep
the fixed size, which is just a list head for all ACL and sample
fetch keywords. This was confirmed at least with gcc 3.4.6. And we
can't move these structs to const because they contain a list element
which is needed to link all of them together during the parsing.
The bug indeed appeared with 1.5-dev19 because it's the first one
to have some empty ACL keyword lists.
One solution is to impose -fno-zero-initialized-in-bss to everyone
but this is not really nice. Another solution consists in ensuring
the struct is never empty so that it does not move there. The easy
solution consists in having a non-null list head since it's not yet
initialized.
A new "ILH" list head type was thus created for this purpose : create
an Initialized List Head so that gcc cannot move the struct to BSS.
This fixes the issue for this version of gcc and does not create any
burden for the declarations.
Willy Tarreau [Fri, 21 Jun 2013 06:20:19 +0000 (08:20 +0200)]
MEDIUM: session: disable lingering on the server when the client aborts
When abortonclose is used and an error is detected on the client side,
better force an RST to the server. That way we propagate to the server
the same vision we got from the client, and we ensure that we won't keep
TIME_WAITs.
Willy Tarreau [Mon, 17 Jun 2013 13:10:25 +0000 (15:10 +0200)]
[RELEASE] Released version 1.5-dev19
Released version 1.5-dev19 with the following main changes :
- MINOR: stats: remove the autofocus on the scope input field
- BUG/MEDIUM: Fix crt-list file parsing error: filtered name was ignored.
- BUG/MEDIUM: ssl: EDH ciphers are not usable if no DH parameters present in pem file.
- BUG/MEDIUM: shctx: makes the code independent on SSL runtime version.
- MEDIUM: ssl: improve crt-list format to support negation
- BUG: ssl: fix crt-list for clients not supporting SNI
- MINOR: stats: show soft-stopped servers in different color
- BUG/MINOR: config: "source" does not work in defaults section
- BUG: regex: fix pcre compile error when using JIT
- MINOR: ssl: add pattern fetch 'ssl_c_sha1'
- BUG: ssl: send payload gets corrupted if tune.ssl.maxrecord is used
- MINOR: show PCRE version and JIT status in -vv
- BUG/MINOR: jit: don't rely on USE flag to detect support
- DOC: readme: add suggestion to link against static openssl
- DOC: examples: provide simplified ssl configuration
- REORG: tproxy: prepare the transparent proxy defines for accepting other OSes
- MINOR: tproxy: add support for FreeBSD
- MINOR: tproxy: add support for OpenBSD
- DOC: examples: provide an example of transparent proxy configuration for FreeBSD 8
- CLEANUP: fix minor typo in error message.
- CLEANUP: fix missing include <string.h> in proto/listener.h
- CLEANUP: protect checks.h from multiple inclusions
- MINOR: compression: acl "res.comp" and fetch "res.comp_algo"
- BUG/MINOR: http: add-header/set-header did not accept the ACL condition
- BUILD: mention in the Makefile that USE_PCRE_JIT is for libpcre >= 8.32
- BUG/MEDIUM: splicing is broken since 1.5-dev12
- BUG/MAJOR: acl: add implicit arguments to the resolve list
- BUG/MINOR: tcp: fix error reporting for TCP rules
- CLEANUP: peers: remove a bit of spaghetti to prepare for the next bugfix
- MINOR: stick-table: allow to allocate an entry without filling it
- BUG/MAJOR: peers: fix an overflow when syncing strings larger than 16 bytes
- MINOR: session: only call http_send_name_header() when changing the server
- MINOR: tcp: report the erroneous word in tcp-request track*
- BUG/MAJOR: backend: consistent hash can loop forever in certain circumstances
- BUG/MEDIUM: log: fix regression on log-format handling
- MEDIUM: log: report file name, line number, and directive name with log-format errors
- BUG/MINOR: cli: "clear table" did not work anymore without a key
- BUG/MINOR: cli: "clear table xx data.xx" does not work anymore
- BUG/MAJOR: http: compression still has defects on chunked responses
- BUG/MINOR: stats: fix confirmation links on the stats interface
- BUG/MINOR: stats: the status bar does not appear anymore after a change
- BUG/MEDIUM: stats: allocate the stats frontend also on "stats bind-process"
- BUG/MEDIUM: stats: fix a regression when dealing with POST requests
- BUG/MINOR: fix unterminated ACL array in compression
- BUILD: last fix broke non-linux platforms
- MINOR: init: indicate the SSL runtime version on -vv.
- BUG/MEDIUM: compression: the deflate algorithm must use global settings as well
- BUILD: stdbool is not portable (again)
- DOC: readme: add a small reminder about restrictions to respect in the code
- MINOR: ebtree: add new eb_next_dup/eb_prev_dup() functions to visit duplicates
- BUG/MINOR: acl: fix a double free during exit when using PCRE_JIT
- DOC: fix wrong copy-paste in the rspdel example
- MINOR: counters: make it easier to extend the amount of tracked counters
- MEDIUM: counters: add support for tracking a third counter
- MEDIUM: counters: add a new "gpc0_rate" counter in stick-tables
- BUG/MAJOR: http: always ensure response buffer has some room for a response
- MINOR: counters: add fetch/acl sc*_tracked to indicate whether a counter is tracked
- MINOR: defaults: allow REQURI_LEN and CAPTURE_LEN to be redefined
- MINOR: log: add a new flag 'L' for locally processed requests
- MINOR: http: add full-length header fetch methods
- MEDIUM: protocol: implement a "drain" function in protocol layers
- MEDIUM: http: add a new "http-response" ruleset
- MEDIUM: http: add the "set-nice" action to http-request and http-response
- MEDIUM: log: add a log level override value in struct session
- MEDIUM: http: add support for action "set-log-level" in http-request/http-response
- MEDIUM: http: add support for "set-tos" in http-request/http-response
- MEDIUM: http: add the "set-mark" action on http-request/http-response rules
- MEDIUM: tcp: add "tcp-request connection expect-proxy layer4"
- MEDIUM: acl: automatically detect the type of certain fetches
- MEDIUM: acl: remove a lot of useless ACLs that are equivalent to their fetches
- MEDIUM: acl: remove 15 additional useless ACLs that are equivalent to their fetches
- DOC: major reorg of ACL + sample fetch
- CLEANUP: http: remove the bogus urlp_ip ACL match
- MINOR: acl: add the new "env()" fetch method to retrieve an environment variable
- BUG/MINOR: acl: correctly consider boolean fetches when doing casts
- BUG/CRITICAL: fix a possible crash when using negative header occurrences
- DOC: update ROADMAP file
- MEDIUM: counters: use sc0/sc1/sc2 instead of sc1/sc2/sc3
- MEDIUM: stats: add proxy name filtering on the statistic page
Willy Tarreau [Mon, 17 Jun 2013 13:04:07 +0000 (15:04 +0200)]
MEDIUM: counters: use sc0/sc1/sc2 instead of sc1/sc2/sc3
It was a bit inconsistent to have gpc start at 0 and sc start at 1,
so make sc start at zero like gpc. No previous release was issued
with sc3 anyway, so no existing setup should be affected.
Willy Tarreau [Wed, 12 Jun 2013 20:27:44 +0000 (22:27 +0200)]
BUG/CRITICAL: fix a possible crash when using negative header occurrences
When a config makes use of hdr_ip(x-forwarded-for,-1) or any such thing
involving a negative occurrence count, the header is still parsed in the
order it appears, and an array of up to MAX_HDR_HISTORY entries is created.
When more entries are used, the entries simply wrap and continue this way.
A problem happens when the incoming header field count exactly divides
MAX_HDR_HISTORY, because the computation removes the number of requested
occurrences from the count, but does not care about the risk of wrapping
with a negative number. Thus we can dereference the array with a negative
number and randomly crash the process.
The bug is located in http_get_hdr() in haproxy 1.5, and get_ip_from_hdr2()
in haproxy 1.4. It affects configurations making use of one of the following
functions with a negative <value> occurence number :
- hdr_ip(<name>, <value>) (in 1.4)
- hdr_*(<name>, <value>) (in 1.5)
It also affects "source" statements involving "hdr_ip(<name>)" since that
statement implicitly uses -1 for <value> :
- source 0.0.0.0 usesrc hdr_ip(<name>)
A workaround consists in rejecting dangerous requests early using
hdr_cnt(<name>), which is available both in 1.4 and 1.5 :
block if { hdr_cnt(<name>) ge 10 }
This bug has been present since the introduction of the negative offset
count in 1.4.4 via commit bce70882. It has been reported by David Torgerson
who offered some debugging traces showing where the crash happened, thus
making it significantly easier to find the bug!
Willy Tarreau [Wed, 12 Jun 2013 19:50:19 +0000 (21:50 +0200)]
BUG/MINOR: acl: correctly consider boolean fetches when doing casts
Commit 5adeda1 (acl: add option -m to change the pattern matching method)
was not completely correct with regards to boolean fetches. It only used
the sample type to determine if the test had to be performed as a boolean
instead of relying on the match function. Due to this, a test such as the
following would not correctly match as the pattern would be ignored :
acl srv_down srv_is_up(s2) -m int 0
No backport is needed as this was merged first in 1.5-dev18.
Willy Tarreau [Wed, 12 Jun 2013 19:34:28 +0000 (21:34 +0200)]
MINOR: acl: add the new "env()" fetch method to retrieve an environment variable
This is useful in order to take different actions across restarts without
touching the configuration (eg: soft-stop), or to pass some information
such as the local host name to the next hop.
Willy Tarreau [Wed, 12 Jun 2013 19:02:59 +0000 (21:02 +0200)]
CLEANUP: http: remove the bogus urlp_ip ACL match
This one is wrong, never matches and cannot work. It was brought by a blind
copy-paste from the url_* version in 1.5-dev9, but there is no underlying
fetch returning an IP type for this.
Willy Tarreau [Tue, 11 Jun 2013 21:12:07 +0000 (23:12 +0200)]
DOC: major reorg of ACL + sample fetch
The split between ACL and sample fetch was a terrible mess in the doc,
as it caused all entries to be duplicated with most of them not easy to
find, some missing and some wrong.
The new approach consists in describing the sample fetch methods and
indicating the ACLs that are derived from these fetches. The doc is
much smaller (1500 lines added, 2200 removed, net gain = 700 lines)
and much clearer.
The description of the ACL mechanics was revamped to take account of
the latest evolutions and clearly describe the compatibility between
types of fetches and ACL patterns.
The deprecated keywords have been marked as such, though they still
appear in the examples given for various other keywords.
Willy Tarreau [Tue, 11 Jun 2013 19:22:58 +0000 (21:22 +0200)]
MEDIUM: acl: remove a lot of useless ACLs that are equivalent to their fetches
The following 116 ACLs were removed because they're redundant with their
fetch function since last commit which allows the fetch function to be
used instead for types BOOL, INT and IP. Most places are now left with
an empty ACL keyword list that was not removed so that it's easier to
add other ACLs later.
Willy Tarreau [Tue, 11 Jun 2013 19:09:06 +0000 (21:09 +0200)]
MEDIUM: acl: automatically detect the type of certain fetches
Commit bef91e71 added the possibility to automatically use some fetch
functions instead of ACL functions, but for the fetch output type was
never used and setting the match method using -m was always mandatory.
Some fetch types are non-ambiguous and can intuitively be associated
with some ACL types :
SMP_T_BOOL -> bool
SMP_T_UINT/SINT -> int
SMP_T_IPV4/IPV6 -> ip
So let's have the ACL expression parser detect these ones automatically.
Other types are more ambiguous, especially everything related to strings,
as there are many string matching methods available and none of them is
the obvious standard matching method for any string. These ones will still
have to be specified using -m.
This configures the client-facing connection to receive a PROXY protocol
header before any byte is read from the socket. This is equivalent to
having the "accept-proxy" keyword on the "bind" line, except that using
the TCP rule allows the PROXY protocol to be accepted only for certain
IP address ranges using an ACL. This is convenient when multiple layers
of load balancers are passed through by traffic coming from public
hosts.
Willy Tarreau [Tue, 11 Jun 2013 17:34:13 +0000 (19:34 +0200)]
MEDIUM: http: add the "set-mark" action on http-request/http-response rules
"set-mark" is used to set the Netfilter MARK on all packets sent to the
client to the value passed in <mark> on platforms which support it. This
value is an unsigned 32 bit value which can be matched by netfilter and
by the routing table. It can be expressed both in decimal or hexadecimal
format (prefixed by "0x"). This can be useful to force certain packets to
take a different route (for example a cheaper network path for bulk
downloads). This works on Linux kernels 2.6.32 and above and requires
admin privileges.
Willy Tarreau [Tue, 11 Jun 2013 16:51:32 +0000 (18:51 +0200)]
MEDIUM: http: add support for "set-tos" in http-request/http-response
This manipulates the TOS field of the IP header of outgoing packets sent
to the client. This can be used to set a specific DSCP traffic class based
on some request or response information. See RFC2474, 2597, 3260 and 4594
for more information.
Willy Tarreau [Tue, 11 Jun 2013 15:45:46 +0000 (17:45 +0200)]
MEDIUM: http: add support for action "set-log-level" in http-request/http-response
Some users want to disable logging for certain non-important requests such as
stats requests or health-checks coming from another equipment. Other users want
to log with a higher importance (eg: notice) some special traffic (POST requests,
authenticated requests, requests coming from suspicious IPs) or some abnormally
large responses.
This patch responds to all these needs at once by adding a "set-log-level" action
to http-request/http-response. The 8 syslog levels are supported, as well as "silent"
to disable logging.
Willy Tarreau [Tue, 11 Jun 2013 14:06:12 +0000 (16:06 +0200)]
MEDIUM: http: add a new "http-response" ruleset
Some actions were clearly missing to process response headers. This
patch adds a new "http-response" ruleset which provides the following
actions :
- allow : stop evaluating http-response rules
- deny : stop and reject the response with a 502
- add-header : add a header in log-format mode
- set-header : set a header in log-format mode
Willy Tarreau [Mon, 10 Jun 2013 17:56:38 +0000 (19:56 +0200)]
MEDIUM: protocol: implement a "drain" function in protocol layers
Since commit cfd97c6f was merged into 1.5-dev14 (BUG/MEDIUM: checks:
prevent TIME_WAITs from appearing also on timeouts), some valid health
checks sometimes used to show some TCP resets. For example, this HTTP
health check sent to a local server :
19:55:15.742818 IP 127.0.0.1.16568 > 127.0.0.1.8000: S 3355859679:3355859679(0) win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
19:55:15.742841 IP 127.0.0.1.8000 > 127.0.0.1.16568: S 1060952566:1060952566(0) ack 3355859680 win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
19:55:15.742863 IP 127.0.0.1.16568 > 127.0.0.1.8000: . ack 1 win 257
19:55:15.745402 IP 127.0.0.1.16568 > 127.0.0.1.8000: P 1:23(22) ack 1 win 257
19:55:15.745488 IP 127.0.0.1.8000 > 127.0.0.1.16568: FP 1:146(145) ack 23 win 257
19:55:15.747109 IP 127.0.0.1.16568 > 127.0.0.1.8000: R 23:23(0) ack 147 win 257
After some discussion with Chris Huang-Leaver, it appeared clear that
what we want is to only send the RST when we have no other choice, which
means when the server has not closed. So we still keep SYN/SYN-ACK/RST
for pure TCP checks, but don't want to see an RST emitted as above when
the server has already sent the FIN.
The solution against this consists in implementing a "drain" function at
the protocol layer, which, when defined, causes as much as possible of
the input socket buffer to be flushed to make recv() return zero so that
we know that the server's FIN was received and ACKed. On Linux, we can make
use of MSG_TRUNC on TCP sockets, which has the benefit of draining everything
at once without even copying data. On other platforms, we read up to one
buffer of data before the close. If recv() manages to get the final zero,
we don't disable lingering. Same for hard errors. Otherwise we do.
In practice, on HTTP health checks we generally find that the close was
pending and is returned upon first recv() call. The network trace becomes
cleaner :
19:55:23.650621 IP 127.0.0.1.16561 > 127.0.0.1.8000: S 3982804816:3982804816(0) win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
19:55:23.650644 IP 127.0.0.1.8000 > 127.0.0.1.16561: S 4082139313:4082139313(0) ack 3982804817 win 32792 <mss 16396,nop,nop,sackOK,nop,wscale 7>
19:55:23.650666 IP 127.0.0.1.16561 > 127.0.0.1.8000: . ack 1 win 257
19:55:23.651615 IP 127.0.0.1.16561 > 127.0.0.1.8000: P 1:23(22) ack 1 win 257
19:55:23.651696 IP 127.0.0.1.8000 > 127.0.0.1.16561: FP 1:146(145) ack 23 win 257
19:55:23.652628 IP 127.0.0.1.16561 > 127.0.0.1.8000: F 23:23(0) ack 147 win 257
19:55:23.652655 IP 127.0.0.1.8000 > 127.0.0.1.16561: . ack 24 win 257
This change should be backported to 1.4 which is where Chris encountered
this issue. The code is different, so probably the tcp_drain() function
will have to be put in the checks only.
Willy Tarreau [Mon, 10 Jun 2013 16:39:42 +0000 (18:39 +0200)]
MINOR: http: add full-length header fetch methods
The req.hdr and res.hdr fetch methods do not work well on headers which
are allowed to contain commas, such as User-Agent, Date or Expires.
More specifically, full-length matching is impossible if a comma is
present.
This patch introduces 4 new fetch functions which are designed to work
with these full-length headers :
- req.fhdr, req.fhdr_cnt
- res.fhdr, res.fhdr_cnt
These ones do not stop at commas and permit to return full-length header
values.
Willy Tarreau [Mon, 10 Jun 2013 14:42:09 +0000 (16:42 +0200)]
MINOR: log: add a new flag 'L' for locally processed requests
People who use "option dontlog-normal" are bothered with redirects and
stats being logged and reported as errors in the logs ("PR" = proxy
blocked the request).
This patch introduces a new flag 'L' for when a request is locally
processed, that is not considered as an error by the log filters. That
way we know a request was intercepted and processed by haproxy without
logging the line when "option dontlog-normal" is in effect.
Willy Tarreau [Sat, 8 Jun 2013 10:55:46 +0000 (12:55 +0200)]
BUG/MAJOR: http: always ensure response buffer has some room for a response
Since 1.5-dev12 and commit 3bf1b2b8 (MAJOR: channel: stop relying on
BF_FULL to take action), the HTTP parser switched to channel_full()
instead of BF_FULL to decide whether a buffer had enough room to start
parsing a request or response. The problem is that channel_full()
intentionally ignores outgoing data, so a corner case exists where a
large response might still be left in a response buffer with just a
few bytes left (much less than the reserve), enough to accept a second
response past the last data, but not enough to permit the HTTP processor
to add some headers. Since all the processing relies on this space being
available, we can get some random crashes when clients pipeline requests.
The analysis of a core from haproxy configured with 20480 bytes buffers
shows this : with enough "luck", when sending back the response for the
first request, the client is slow, the TCP window is congested, the socket
buffers are full, and haproxy's buffer fills up. We still have 20230 bytes
of response data in a 20480 response buffer. The second request is sent to
the server which returns 214 bytes which fit in the small 250 bytes left
in this buffer. And the buffer arrangement makes it possible to escape all
the controls in http_wait_for_response() :
1/2 = beginning of previous response (18240)
2/2 = end of previous response (1990)
3 = current response (214)
4 = free space (36)
- channel_full() returns false (20230 bytes are going to leave)
- the response headers does not wrap at the end of the buffer
- the remaining linear room after the headers is larger than the
reserve, because it's the previous response which wraps :
=> response is processed
Header rewriting causes it to reach 260 bytes, 10 bytes larger than what
the buffer could hold. So all computations during header addition are
wrong and lead to the corruption we've observed.
All the conditions are very hard to meet (which explains why it took
almost one year for this bug to show up) and are almost impossible to
reproduce on purpose on a test platform. But the bug is clearly there.
This issue was reported by Dinko Korunic who kindly devoted a lot of
time to provide countless traces and cores, and to experiment with
troubleshooting patches to knock the bug down. Thanks Dinko!
No backport is needed, but all 1.5-dev versions between dev12 and dev18
included must be upgraded. A workaround consists in setting option
forceclose to prevent pipelined requests from being processed.
Kevin Hester [Thu, 30 May 2013 22:12:41 +0000 (15:12 -0700)]
BUG: ssl: send payload gets corrupted if tune.ssl.maxrecord is used
We were using "tune.ssl.maxrecord 2000" and discovered an interesting
problem: SSL data sent from the server to the client showed occasional
corruption of the payload data.
The root cause was:
When ssl_max_record is smaller than the requested send amount
the ring buffer wrapping wasn't properly adjusting the
number of bytes to send.
I solved this by selecting the initial size based on the number
of output bytes that can be sent without splitting _before_ checking
against ssl_max_record.
Willy Tarreau [Tue, 28 May 2013 16:32:20 +0000 (18:32 +0200)]
MEDIUM: counters: add support for tracking a third counter
We're often missin a third counter to track base, src and base+src at
the same time. Here we introduce track_sc3 to have this third counter.
It would be wise not to add much more counters because that slightly
increases the session size and processing time though the real issue
is more the declaration of the keywords in the code and in the doc.
Willy Tarreau [Tue, 28 May 2013 15:40:25 +0000 (17:40 +0200)]
MINOR: counters: make it easier to extend the amount of tracked counters
By properly affecting the flags and values, it becomes easier to add
more tracked counters, for example for experimentation. It also slightly
reduces the code and the number of tests. No counters were added with
this patch.
Pieter Baauw [Wed, 8 May 2013 20:49:23 +0000 (22:49 +0200)]
REORG: tproxy: prepare the transparent proxy defines for accepting other OSes
This patch does not change the logic of the code, it only changes the
way OS-specific defines are tested.
At the moment the transparent proxy code heavily depends on Linux-specific
defines. This first patch introduces a new define "CONFIG_HAP_TRANSPARENT"
which is set every time the defines used by transparent proxy are present.
This also means that with an up-to-date libc, it should not be necessary
anymore to force CONFIG_HAP_LINUX_TPROXY during the build, as the flags
will automatically be detected.
The CTTPROXY flags still remain separate because this older API doesn't
work the same way.
A new line has been added in the version output for haproxy -vv to indicate
what transparent proxy support is available.
Willy Tarreau [Wed, 8 May 2013 16:09:54 +0000 (18:09 +0200)]
BUG/MINOR: acl: fix a double free during exit when using PCRE_JIT
When freeing ACL regex, we don't want to perform the free() in regex_free()
as it's already performed in free_pattern(). The double free only happens
when using PCRE_JIT when freeing everything during exit so it's harmless
but exhibits libc errors during a reload/restart.
Willy Tarreau [Tue, 7 May 2013 13:58:28 +0000 (15:58 +0200)]
MINOR: ebtree: add new eb_next_dup/eb_prev_dup() functions to visit duplicates
Sometimes it's very useful to visit duplicates of a same node, but doing
so from the application is not convenient because keys have to be compared,
while all the information is available during the next/prev steps.
Let's introduce a couple of new eb_next_dup/eb_prev_dup functions to visit
only duplicates of the current node and return NULL once it's done. Now we
have all 3 combinations :
- next : returns next node in the tree
- next_dup : returns next dup in the sub-tree
- next_unique : returns next value after skipping dups