Willy Tarreau [Fri, 21 Oct 2016 13:07:45 +0000 (15:07 +0200)]
MINOR: tcp: make set-src/set-src-port and set-dst/set-dst-port commutative
When the tcp/http actions above were introduced in 1.7-dev4, we used to
proceed like this :
- set-src/set-dst would force the port to zero
- set-src-port/set-dst-port would not do anything if the address family is
neither AF_INET nor AF_INET6.
It was a stupid idea of mine to request this behaviour because it ensures
that these functions cannot be used in a wide number of situations. Because
of the first rule, it is necessary to save the source port one way or
another if only the address has to be changed (so you have to use an
variable). Due to the second rule, there's no way to set the source port
on a unix socket without first overwriting the address. And sometimes it's
really not convenient, especially when there's no way to guarantee that all
fields will properly be set.
In order to fix all this, this small change does the following :
- set-src/set-dst always preserve the original port even if the address
family changes. If the previous address family didn't have a port (eg:
AF_UNIX), then the port is set to zero ;
- set-src-port/set-dst-port always preserve the original address. If the
address doesn't have a port, then the family is forced to IPv4 and the
address to "0.0.0.0".
Thanks to this it now becomes possible to perform one action, the other or
both in any order.
MEDIUM: cli: register CLI keywords with cli_register_kw()
To register a new cli keyword, you need to declare a cli_kw_list
structure in your source file:
static struct cli_kw_list cli_kws = {{ },{
{ { "test", "list", NULL }, "test list : do some tests on the cli", test_parsing, NULL },
{ { NULL }, NULL, NULL, NULL, NULL }
}};
And then register it:
cli_register_kw(&cli_kws);
The first field is an array of 5 elements, where you declare the
keywords combination which will match, it must be ended by a NULL
element.
The second field is used as a usage message, it will appear in the help
of the cli, you can set it to NULL if you don't want to show it, it's a
good idea if you want to overwrite some existing keywords.
The two last fields are callbacks.
The first one is used at parsing time, you can use it to parse the
arguments of your keywords and print small messages. The function must
return 1 in case of a failure, otherwise 0:
if (!*args[2]) {
appctx->ctx.cli.msg = "Error: the 3rd argument is mandatory !";
appctx->st0 = STAT_CLI_PRINT;
return 1;
}
chunk_reset(&trash);
chunk_printf(&trash, "arg[3]: %s\n", args[2]);
chunk_init(&out, NULL, 0);
chunk_dup(&out, &trash);
appctx->ctx.cli.err = out.str;
appctx->st0 = STAT_CLI_PRINT_FREE; /* print and free in the default cli_io_handler */
return 0;
}
The last field is the IO handler callback, it can be set to NULL if you
want to use the default cli_io_handler() otherwise you can write your
own. You can use the private pointer in the appctx if you need to store
a context or some data. stats_dump_sess_to_buffer() is a good example of
IO handler, IO handlers often use the appctx->st2 variable for the state
machine. The handler must return 0 in case it have to be recall later
otherwise 1.
MEDIUM: peers: Fix a peer stick-tables synchronization issue.
During the stick-table teaching process which occurs at reloading/restart time,
expiration dates of stick-tables entries were not synchronized between peers.
This patch adds two new stick-table messages to provide such a synchronization feature.
As these new messages are not supported by older haproxy peers protocol versions,
this patch increments peers protol version, from 2.0 to 2.1, to help in detecting/supporting
such older peers protocol implementations so that new versions might still be able
to transparently communicate with a newer one.
[wt: technically speaking it would be nice to have this backported into 1.6
as some people who reload often are affected by this design limitation, but
it's not a totally transparent change that may make certain users feel
reluctant to upgrade older versions. Let's let it cook in 1.7 first and
decide later]
Willy Tarreau [Sat, 1 Oct 2016 07:20:32 +0000 (09:20 +0200)]
BUG/MEDIUM: dns: don't randomly crash on out-of-memory
dns_init_resolvers() tries to emit the current resolver's name in the
error message in case of out-of-memory condition. But it must not do
it when initializing the trash before even having such a resolver
otherwise the user is certain to get a dirty crash instead of the
error message. No backport is needed.
Willy Tarreau [Sat, 1 Oct 2016 07:12:08 +0000 (09:12 +0200)]
BUG/MINOR: stats: report the correct conn_time in backend's html output
An apparent copy-paste error resulted in backend's avg connection time
to report the average queue time instead in the HTML dump. Backport to
1.6 is desired.
BUG/MEDIUM: http/compression: Fix how chunked data are copied during the HTTP body parsing
When the compression is enable on HTTP responses, the chunked data are copied in
a temporary buffer during the HTTP body parsing and then compressed when
everything is forwarded to the client. But the amout of data that can be copied
was not correctly calculated. In many cases, it worked, else on the edge when
the channel buffer was almost full.
[wt: bug introduced by b77c5c26 in 1.7-dev, no backport needed]
Lukas Tribus [Tue, 13 Sep 2016 09:51:15 +0000 (09:51 +0000)]
MINOR: enable IP_BIND_ADDRESS_NO_PORT on backend connections
Enable IP_BIND_ADDRESS_NO_PORT on backend connections when the source
address is specified without port or port ranges. This is supported
since Linux 4.2/libc 2.23.
If the kernel supports it but the libc doesn't, we can define it at
build time:
make [...] DEFINE=-DIP_BIND_ADDRESS_NO_PORT=24
For more informations about this feature, see Linux commit 90c337da
Lukas Tribus [Mon, 12 Sep 2016 21:42:20 +0000 (21:42 +0000)]
MEDIUM: make SO_REUSEPORT configurable
With Linux officially introducing SO_REUSEPORT support in 3.9 and
its mainstream adoption we have seen more people running into strange
SO_REUSEPORT related issues (a process management issue turning into
hard to diagnose problems because the kernel load-balances between the
new and an obsolete haproxy instance).
Also some people simply want the guarantee that the bind fails when
the old process is still bound.
This change makes SO_REUSEPORT configurable, introducing the command
line argument "-dR" and the noreuseport configuration directive.
MINOR: dns: comments in types/dns.h about structures endianness
To avoid issues when porting code to some architecture, we need to know
the endianess the structures are currently used.
This patch simply had a short notice before those structures to report
endianess and ease contributor's job.
MINOR: dns: proper domain name validation when receiving DNS response
The analyse of CNAME resolution and request's domain name was performed
twice:
- when validating the response buffer
- when loading the right IP address from the response
Now DNS response are properly loaded into a DNS response structure, we
do the domain name validation when loading/validating the response in
the DNS strcucture and later processing of this task is now useless.
MINOR: dns: query type change when last record is a CNAME
DNS servers don't return A or AAAA record if the query points to a CNAME
not resolving to the right type.
We know it because the last record of the response is a CNAME. We can
trigger a new query, switching to a new query type, handled by the layer
above.
Baptiste Assmann [Sat, 14 May 2016 09:26:22 +0000 (11:26 +0200)]
MEDIUM: dns: new DNS response parser
New DNS response parser function which turn the DNS response from a
network buffer into a DNS structure, much easier for later analysis
by upper layer.
Memory is pre-allocated at start-up in a chunk dedicated to DNS
response store.
New error code to report a wrong number of queries in a DNS response.
Enrichment of the 'set server <b>/<s> addr' cli directive to allow changing
now a server's port.
The new syntax looks like:
set server <b>/<s> addr [port <port>]
MINOR: new update_server_addr_port() function to change both server's ADDR and service PORT
This function can replace update_server_addr() where the need to change the
server's port as well as the IP address is required.
It performs some validation before performing each type of change.
Baptiste Assmann [Thu, 11 Aug 2016 21:12:18 +0000 (23:12 +0200)]
MINOR: server: introduction of 3 new server flags
Introduction of 3 new server flags to remember if some parameters were set
during configuration parsing.
* SRV_F_CHECKADDR: this server has a check addr configured
* SRV_F_CHECKPORT: this server has a check port configured
* SRV_F_AGENTADDR: this server has a agent addr configured
Baptiste Assmann [Mon, 13 Jun 2016 12:15:41 +0000 (14:15 +0200)]
MAJOR: check: find out which port to use for health check at run time
HAProxy used to deduce port used for health checks when parsing configuration
at startup time.
Because of this way of working, it makes it complicated to change the port at
run time.
The current patch changes this behavior and makes HAProxy to choose the
port used for health checking when preparing the check task itself.
A new type of error is introduced and reported when no port can be found.
There won't be any impact on performance, since the process to find out the
port value is made of a few 'if' statements.
This patch also introduces a new check state CHK_ST_PORT_MISS: this flag is
used to report an error in the case when HAProxy needs to establish a TCP
connection to a server, to perform a health check but no TCP ports can be
found for it.
And last, it also introduces a new stream termination condition:
SF_ERR_CHK_PORT. Purpose of this flag is to report an error in the event when
HAProxy has to run a health check but no port can be found to perform it.
SOL_IPV6 is not defined on OSX, breaking the compile. Also libcrypt is
not available for installation neither in Macports nor as a Brew recipe,
so we're disabling implicit dependancy.
Chad Lavoie [Thu, 26 May 2016 20:42:25 +0000 (16:42 -0400)]
MINOR: cli: allow the semi-colon to be escaped on the CLI
Today I was working on an auto-update script for some ACLs, and found
that I couldn't load ACL entries with a semi-colon in them no matter
how I tried to escape it.
As such, I wrote this patch (this one is for 1.7dev, but it applies to
1.5 the same with just line numbers changed), which seems to allow me
to execute a command such as "add acl /etc/foo.lst foo\;bar" over the
socket. It's worth noting that stats_sock_parse_request() already uses
the backslash to escape spaces in words so it makes sense to use it as
well to escape the semi-colon.
Willy Tarreau [Tue, 30 Aug 2016 12:39:46 +0000 (14:39 +0200)]
BUG/MINOR: payload: fix SSLv2 version parser
A typo resulting from a copy-paste in the original req.ssl_ver code will
make certain SSLv2 hello messages not properly detected. The bug has been
present since the code was added in 1.3.16. In 1.3 and 1.4, this code was
in proto_tcp.c. In 1.5-dev0, it moved to acl.c, then later to payload.c.
This bug was tagged "minor" because SSLv2 is outdated and this encoding
was rarely (if at all) used, the shorter form starting with 0x80 being
more common.
This fix needs to be backported to all currently maintained branches.
In dump_servers_state(), srv_time_since_last_change, bk_f_forced_id, srv_f_forced_id variables
were firstly set to zero and immediately reassigned to another value while never been accessed in between.
Sounds like a useless initiazation. So let's make only the useful allocation.
Willy Tarreau [Mon, 29 Aug 2016 12:46:54 +0000 (14:46 +0200)]
BUG/MAJOR: stream: properly mark the server address as unset on connect retry
Bartosz Koninski reported that recent commit 568743a ("BUG/MEDIUM: stream-int:
completely detach connection on connect error") introduced a nasty side effect
during the connection retry, causing a reconnect attempt to be performed on a
random address, possibly another server from another backend as shown in his
reproducer.
The real reason is in fact that by calling si_release_endpoint() after a
failed connect attempt and not clearing the SN_ADDR_SET flag on the
stream, we indicate that we continue to trust the connection's current
address. So upon next connect attempt, a connection is picked from the
pool and the last target address is reused. It is not very easy to
reproduce it 100% reliably out of production traffic, but it's easier to
see when haproxy is started with -dM to force the pools to be filled with
junk, and where subsequent connections are not even attempted due to their
family being incorrect.
The correct fix consists in clearing the SN_ADDR_SET flag after calling
si_release_endpoint() during a retry. But it outlines a deeper issue which
is in fact that the target address is *stored* in the connection and its
validity is stored in the stream. Until we have a better solution to store
target addresses, it would be better to rely on the connection flags
(CO_FL_ADDR_TO_SET) for this purpose. But it also outlines the fact that
the same issue still exists in Lua sockets and in idle sockets, which
fortunately are not affected by this issue.
Thanks to Bartosz for providing all the elements needed to understand the
problem.
BUILD/MAJOR:updated 51d Trie implementation to incorperate latest update to 51Degrees.c
Trie now uses a dataset structure just like Pattern, so this has been
defined in includes/types/global.h for both Pattern and Trie where it
was just Pattern.
In src/51d.c all functions used by the Trie implementation which need a
dataset as an argument now use the global dataset. The
fiftyoneDegreesDestroy method has now been replaced with
fiftyoneDegreesDataSetFree which is common to Pattern and Trie. In
addition, two extra dataset init status' have been added to the switch
statement in init_51degrees.
Tq is the time between the instant the connection is accepted and a
complete valid request is received. This time includes the handshake
(SSL / Proxy-Protocol), the idle when the browser does preconnect and
the request reception.
This patch decomposes %Tq in 3 measurements names %Th, %Ti, and %TR
which returns respectively the handshake time, the idle time and the
duration of valid request reception. It also adds %Ta which reports
the request's active time, which is the total time without %Th nor %Ti.
It replaces %Tt as the total time, reporting accurate measurements for
HTTP persistent connections.
%Th is avalaible for TCP and HTTP sessions, %Ti, %TR and %Ta are only
avalaible for HTTP connections.
In addition to this, we have new timestamps %tr, %trg and %trl, which
log the date of start of receipt of the request, respectively in the
default format, in GMT time and in local time (by analogy with %t, %T
and %Tl). All of them are obviously only available for HTTP. These values
are more relevant as they more accurately represent the request date
without being skewed by a browser's preconnect nor a keep-alive idle
time.
The HTTP log format and the CLF log format have been modified to
use %tr, %TR, and %Ta respectively instead of %t, %Tq and %Tt. This
way the default log formats now produce the expected output for users
who don't want to manually fiddle with the log-format directive.
=> 6 ms of SSL handshake, 2375 waiting before sending the first char (in
fact the time to type the first line), 261 ms before the end of the request,
no time spent in queue, 1 ms spend connecting to the server, immediate
response, total active time for this request = 262ms. Total time from accept
to close : 2643 ms.
The timing now decomposes like this :
first request 2nd request
|<-------------------------------->|<-------------- ...
t tr t tr ...
---|----|----|----|----|----|----|----|----|--
: Th Ti TR Tw Tc Tr Td : Ti ...
:<---- Tq ---->: :
:<-------------- Tt -------------->:
:<--------- Ta --------->:
Willy Tarreau [Sun, 14 Aug 2016 10:25:21 +0000 (12:25 +0200)]
[RELEASE] Released version 1.7-dev4
Released version 1.7-dev4 with the following main changes :
- MINOR: add list_append_word function
- MEDIUM: init: use list_append_word in haproxy.c
- MEDIUM: init: allow directory as argument of -f
- CLEANUP: config: detect double registration of a config section
- MINOR: log: add the %Td log-format specifier
- MEDIUM: filters: Move HTTP headers filtering in its own callback
- MINOR: filters: Simplify calls to analyzers using 2 new macros
- MEDIUM: filters: Add pre and post analyzer callbacks
- DOC: filters: Update the filters documentation accordingly to recent changes
- BUG/MEDIUM: init: don't use environment locale
- SCRIPTS: teach git-show-backports how to report upstream commits
- SCRIPTS: make git-show-backports capable of limiting its history
- BUG/MAJOR: fix listening IP address storage for frontends
- BUG/MINOR: fix listening IP address storage for frontends (cont)
- DOC: Fix typo so fetch is properly parsed by Cyril's converter
- BUG/MAJOR: http: fix breakage of "reqdeny" causing random crashes
- BUG/MEDIUM: stick-tables: fix breakage in table converters
- MINOR: stick-table: change all stick-table converters' inputs to SMP_T_ANY
- BUG/MEDIUM: dns: unbreak DNS resolver after header fix
- BUILD: fix build on Solaris 11
- BUG/MEDIUM: config: fix multiple declaration of section parsers
- BUG/MEDIUM: stats: show servers state may show an servers from another backend
- BUG/MEDIUM: fix risk of segfault with "show tls-keys"
- MEDIUM: dumpstats: 'show tls-keys' is now able to show secrets
- DOC: update doc about tls-tickets-keys dump
- MEDIUM: tcp: add 'set-src' to 'tcp-request connection'
- MINOR: set the CO_FL_ADDR_FROM_SET flags with 'set-src'
- MEDIUM: tcp/http: add 'set-src-port' action
- MEDIUM: tcp/http: new set-dst/set-dst-port actions
- BUG/MEDIUM: sticktables: segfault in some configuration error cases
- BUILD/MEDIUM: rebuild everything when an include file is changed
- BUILD/MEDIUM: force a full rebuild if some build options change
- BUG/MEDIUM: lua: converters doesn't work
- BUG/MINOR: http: add-header: header name copied twice
- BUG/MEDIUM: http: add-header: buffer overwritten
- BUG/MINOR: ssl: fix potential memory leak in ssl_sock_load_dh_params()
- MINOR: stream: export the function 'smp_create_src_stkctr'
- BUG/MEDIUM: dumpstats: undefined behavior in stats_tlskeys_list()
- MEDIUM: dumpstats: make stats_tlskeys_list() yield-aware during tls-keys dump
- BUG/MINOR: http: url32+src should use the big endian version of url32
- BUG/MINOR: http: url32+src should check cli_conn before using it
- DOC: http: add documentation for url32 and url32+src
- BUG/MINOR: fix http-response set-log-level parsing error
- MINOR: systemd: Use variable for config and pidfile paths
- MINOR: systemd: Perform sanity check on config before reload
- MEDIUM: ssl: support SNI filters with multicerts
- MINOR: ssl: crt-list parsing factor
- BUILD: ssl: fix typo causing a build failure in the multicert patch
- MINOR: listener: add the "accept-netscaler-cip" option to the "bind" keyword
- MINOR: tcp: add "tcp-request connection expect-netscaler-cip layer4"
- BUG/MINOR: init: always ensure that global.rlimit_nofile matches actual limits
- BUG/MINOR: init: ensure that FD limit is raised to the max allowed
- BUG/MEDIUM: external-checks: close all FDs right after the fork()
- BUG/MAJOR: external-checks: use asynchronous signal delivery
- BUG/MINOR: external-checks: do not unblock undesired signals
- CLEANUP: external-check: don't block/unblock SIGCHLD when manipulating the list
- BUG/MEDIUM: filters: Fix data filtering when data are modified
- BUG/MINOR: filters: Fix HTTP parsing when a filter loops on data forwarding
- BUG/MINOR: srv-state: fix incorrect output of state file
- BUG/MINOR: ssl: close ssl key file on error
- BUG/MINOR: http: fix misleading error message for response captures
- BUG/BUILD: don't automatically run "make" on "make install"
- DOC: add missing doc for http-request deny [deny_status <status>]
- CLEANUP: dumpstats: u64 field is an unsigned type.
- BUG/MEDIUM: http: unbreak uri/header/url_param hashing
- BUG/MINOR: Rework slightly commit 9962f8fc to clean code and avoid mistakes
- MINOR: new function my_realloc2 = realloc + free upon failure
- CLEANUP: fixed some usages of realloc leading to memory leak
- Revert "BUG/MINOR: ssl: fix potential memory leak in ssl_sock_load_dh_params()"
- CLEANUP: connection: using internal struct to hold source and dest port.
- DOC: spelling fixes
- BUG/MINOR: ssl: fix potential memory leak in ssl_sock_load_dh_params()
- BUG/MEDIUM: dns: fix alignment issues in the DNS response parser
- BUG/MINOR: Fix endiness issue in DNS header creation code
- BUG/MEDIUM: lua: the function txn_done() from sample fetches can crash
- BUG/MEDIUM: lua: the function txn_done() from action wrapper can crash
- MEDIUM: http: implement http-response track-sc* directive
- BUG/MINOR: peers: Fix peers data decoding issue
- BUG/MINOR: peers: don't count track-sc multiple times on errors
- MINOR: standard: add function "escape_string"
- BUG/MEDIUM: log: use function "escape_string" instead of "escape_chunk"
- MINOR: tcp: Return TCP statistics like RTT and RTT variance
- DOC: lua: remove old functions
- BUG/MEDIUM: lua: somme HTTP manipulation functions are called without valid requests
- DOC: fix json converter example and error message
- BUG/MEDIUM: stream-int: completely detach connection on connect error
- DOC: minor typo fixes to improve HTML parsing by haproxy-dconv
- BUILD: make proto_tcp.c compatible with musl library
- BUG/MAJOR: compression: initialize avail_in/next_in even during flush
- BUG/MEDIUM: samples: make smp_dup() always duplicate the sample
- MINOR: sample: implement smp_is_safe() and smp_make_safe()
- MINOR: sample: provide smp_is_rw() and smp_make_rw()
- BUG/MAJOR: server: the "sni" directive could randomly cause trouble
- BUG/MEDIUM: stick-tables: do not fail on string keys with no allocated size
- BUG/MEDIUM: stick-table: properly convert binary samples to keys
- MINOR: sample: use smp_make_rw() in upper/lower converters
- MINOR: tcp: add dst_is_local and src_is_local
- BUG/MINOR: peers: some updates are pushed twice after a resync.
- BUILD: protocol: fix some build errors on OpenBSD
- BUILD: log: iovec requires to include sys/uio.h on OpenBSD
- BUILD: tcp: do not include netinet/ip.h for IP_TTL
- BUILD: connection: fix build breakage on openbsd due to missing in_systm.h
- BUILD: checks: remove the last strcat and eliminate a warning on OpenBSD
- BUILD: tcp: define SOL_TCP when only IPPROTO_TCP exists
- BUILD: compression: remove a warning when no compression lib is used
- BUILD: poll: remove unused hap_fd_isset() which causes a warning with clang
- MINOR: tcp: add further tcp info fetchers
- BUG/MINOR: peers: empty chunks after a resync.
- BUG/MAJOR: stick-counters: possible crash when using sc_trackers with wrong table
- MINOR: standard.c: ipcmp() function to compare 2 IP addresses stored in 2 struct sockaddr_storage
- MINOR: standard.c: ipcpy() function to copy an IP address from a struct sockaddr_storage into an other one
- MAJOR: listen section: don't use first bind port anymore when no server ports are provided
Baptiste Assmann [Sat, 30 Jan 2016 23:27:17 +0000 (00:27 +0100)]
MINOR: standard.c: ipcpy() function to copy an IP address from a struct sockaddr_storage into an other one
The function ipcpy() simply duplicates the IP address found in one
struct sockaddr_storage into an other struct sockaddr_storage.
It also update the family on the destination structure.
Memory of destination structure must be allocated and cleared by the
caller.
Willy Tarreau [Sun, 14 Aug 2016 10:02:55 +0000 (12:02 +0200)]
BUG/MAJOR: stick-counters: possible crash when using sc_trackers with wrong table
Bryan Talbot reported a very interesting bug. The sc_trackers() sample
fetch seems to have escaped the sanitization that was performed during 1.5
to ensure all dereferences of stkctr_entry() were safe. Here if a tacker
is set on a backend and is then checked against a different backend where
the entry doesn't exist, stkctr_entry() returns NULL and this is dereferenced
to retrieve the ref count. Thanks to Bryan for his detailed bug report featuring
a working config and reproducer.
Emeric Brun [Fri, 12 Aug 2016 09:23:31 +0000 (11:23 +0200)]
BUG/MINOR: peers: empty chunks after a resync.
After pushing the last update related to a resync process, the teacher resets
the re-connection's origin to the saved one (pointer position when he receive
the resync request). But the last acknowledgement overwrites this pointer to
an inconsistent value. In peersv2, it results with empty table chunks
regularly pushed.
The fix consist to move the confirm code to assure that the confirm message is
always sent after the last acknowledgement related to the resync. And to reset
the re-connection's origin to the saved one when a confirm message is received.
This bug affects versions 1.6 and superior.
For older versions (peersv1), this inconsistent state should not generate side
effects because chunks are not used and a next acknowlegement message resets
the pointer in a valid state.
Joe Williams [Wed, 10 Aug 2016 14:06:44 +0000 (07:06 -0700)]
MINOR: tcp: add further tcp info fetchers
Adding on to Thierry's work (http://git.haproxy.org/?p=haproxy.git;h=6310bef5)
I have added a few more fetchers for counters based on the tcp_info struct
maintained by the kernel :
Two fields were not added because they're version-dependant :
fc_rcv_rtt, fc_total_retrans
The fields name depend on the operating system. FreeBSD and NetBSD prefix
all the field names with "__" so we have to rely on a few #ifdef for
portability.
Willy Tarreau [Wed, 10 Aug 2016 19:09:24 +0000 (21:09 +0200)]
BUILD: tcp: define SOL_TCP when only IPPROTO_TCP exists
FreeBSD prefers to use IPPROTO_TCP over SOL_TCP, just like it does
with their *_IP counterparts. It's worth noting that there are a few
inconsistencies between SOL_TCP and IPPROTO_TCP in the code, eg on
TCP_QUICKACK. The two values are the same but it's worth applying
what implementations recommend.
No backport is needed, this was uncovered by the recent tcp_info stuff.
Willy Tarreau [Wed, 10 Aug 2016 17:29:09 +0000 (19:29 +0200)]
BUILD: checks: remove the last strcat and eliminate a warning on OpenBSD
OpenBSD emits warnings on usages of strcpy, strcat and sprintf (and
probably a few others). Here we have a single such warning in all the code
reintroduced by commit 0ba0e4a ("MEDIUM: Support sending email alerts") in
1.6-dev1. Let's get rid of it, the open-coding of strcat is as small as its
usage and the the result is even more efficient.
Willy Tarreau [Wed, 10 Aug 2016 16:57:38 +0000 (18:57 +0200)]
BUILD: connection: fix build breakage on openbsd due to missing in_systm.h
Recent commit 93b227d ("MINOR: listener: add the "accept-netscaler-cip"
option to the "bind" keyword") introduced an include of netinet/ip.h
which requires in_systm.h on OpenBSD. No backport is needed.
Willy Tarreau [Wed, 10 Aug 2016 16:42:17 +0000 (18:42 +0200)]
BUILD: tcp: do not include netinet/ip.h for IP_TTL
On OpenBSD, netinet/ip.h fails unless in_systm.h is included. This
include was added by the silent-drop feature introduced with commit 2d392c2 ("MEDIUM: tcp: add new tcp action "silent-drop"") in 1.6-dev6,
but we don't need it, IP_TTL is defined in netinet/in.h, so let's drop
this useless include.
Willy Tarreau [Tue, 9 Aug 2016 14:46:18 +0000 (16:46 +0200)]
MINOR: tcp: add dst_is_local and src_is_local
It is sometimes needed in application server environments to easily tell
if a source is local to the machine or a remote one, without necessarily
knowing all the local addresses (dhcp, vrrp, etc). Similarly in transparent
proxy configurations it is sometimes desired to tell the difference between
local and remote destination addresses.
This patch adds two new sample fetch functions for this :
dst_is_local : boolean
Returns true if the destination address of the incoming connection is local
to the system, or false if the address doesn't exist on the system, meaning
that it was intercepted in transparent mode. It can be useful to apply
certain rules by default to forwarded traffic and other rules to the traffic
targetting the real address of the machine. For example the stats page could
be delivered only on this address, or SSH access could be locally redirected.
Please note that the check involves a few system calls, so it's better to do
it only once per connection.
src_is_local : boolean
Returns true if the source address of the incoming connection is local to the
system, or false if the address doesn't exist on the system, meaning that it
comes from a remote machine. Note that UNIX addresses are considered local.
It can be useful to apply certain access restrictions based on where the
client comes from (eg: require auth or https for remote machines). Please
note that the check involves a few system calls, so it's better to do it only
once per connection.
Willy Tarreau [Tue, 9 Aug 2016 12:29:38 +0000 (14:29 +0200)]
MINOR: sample: use smp_make_rw() in upper/lower converters
There's no point in always duplicating the sample, just ensure it's
writable, as was done prior to the smp_dup() change. This should be
backported to 1.6 to avoid a performance regression caused by this
change (about 30% more time for upper/lower due to the copy).
Willy Tarreau [Tue, 9 Aug 2016 10:08:41 +0000 (12:08 +0200)]
BUG/MEDIUM: stick-table: properly convert binary samples to keys
The binary sample to stick-table key conversion is wrong. It doesn't
check that the binary sample is writable before padding it. After a
quick audit, it doesn't look like any existing sample fetch function
can trigger this bug. The correct fix consists in calling smp_make_rw()
prior to padding the sample.
Willy Tarreau [Tue, 9 Aug 2016 09:59:12 +0000 (11:59 +0200)]
BUG/MEDIUM: stick-tables: do not fail on string keys with no allocated size
When a stick-table key is derived from a string-based sample, it checks
if it's properly zero-terminated otherwise tries to do so. But the test
doesn't work for two reasons :
- the reported allocated size may be zero while the sample is maked as
not CONST (eg: certain sample fetch functions like smp_fetch_base()
do this), so smp_dup() prior to the recent changes will fail on this.
- the string might have been converted from a binary sample, where the
trailing zero is not appended. If the sample was writable, smp_dup()
would not modify it either and we would fail again here. This may
happen with req.payload or req.body_param for example.
The correct solution consists in calling smp_make_safe() to ensure the
sample is usable as a valid string.
Willy Tarreau [Tue, 9 Aug 2016 09:55:21 +0000 (11:55 +0200)]
BUG/MAJOR: server: the "sni" directive could randomly cause trouble
The "sni" server directive does some bad stuff on many occasions because
it works on a sample of type string and limits len to size-1 by hand. The
problem is that size used to be zero on many occasions before the recent
changes to smp_dup() and that it effectively results in setting len to -1
and writing the zero byte *before* the string (and not terminating the
string).
This patch makes use of the recently introduced smp_make_safe() to address
this issue.
Willy Tarreau [Tue, 9 Aug 2016 09:49:20 +0000 (11:49 +0200)]
MINOR: sample: provide smp_is_rw() and smp_make_rw()
At some places, smp_dup() is inappropriately called to ensure a modification
is possible while in fact we only need to ensure the sample may be modified
in place. Let's provide smp_is_rw() to check for this capability and
smp_make_rw() to perform the smp_dup() when it is not the case.
Note that smp_is_rw() will also try to add the trailing zero on strings when
needed if possible, to avoid a useless duplication.
Willy Tarreau [Tue, 9 Aug 2016 09:37:54 +0000 (11:37 +0200)]
MINOR: sample: implement smp_is_safe() and smp_make_safe()
These functions ensure that the designated sample is "safe for use",
which means that its size is known, its length is correct regarding its
size, and that strings are properly zero-terminated.
smp_is_safe() only checks (and optionally sets the trailing zero when
needed and possible). smp_make_safe() will call smp_dup() after
smp_is_safe() fails.
Willy Tarreau [Mon, 8 Aug 2016 17:21:09 +0000 (19:21 +0200)]
BUG/MEDIUM: samples: make smp_dup() always duplicate the sample
Vedran Furac reported a strange problem where the "base" sample fetch
would not always work for tracking purposes.
In fact, it happens that commit bc8c404 ("MAJOR: stick-tables: use sample
types in place of dedicated types") merged in 1.6 exposed a fundamental
bug related to the way samples use chunks as strings. The problem is that
chunks convey a base pointer, a length and an optional size, which may be
zero when unknown or when the chunk is allocated from a read-only location.
The sole purpose of this size is to know whether or not the chunk may be
appended new data. This size cause some semantics issue in the sample,
which has its own SMP_F_CONST flag to indicate read-only contents.
The problem was emphasized by the commit above because it made use of new
calls to smp_dup() to convert a sample to a table key. And since smp_dup()
would only check the SMP_F_CONST flag, it would happily return read-write
samples indicating size=0.
So some tests were added upon smp_dup() return to ensure that the actual
length is smaller than size, but this in fact made things even worse. For
example, the "sni" server directive does some bad stuff on many occasions
because it limits len to size-1 and effectively sets it to -1 and writes
the zero byte before the beginning of the string!
It is therefore obvious that smp_dup() needs to be modified to take this
nature of the chunks into account. It's not enough but is needed. The core
of the problem comes from the fact that smp_dup() is called for 5 distinct
needs which are not always fulfilled :
1) duplicate a sample to keep a copy of it during some operations
2) ensure that the sample is rewritable for a converter like upper()
3) ensure that the sample is terminated with a \0
4) set a correct size on the sample
5) grow the sample in case it was extracted from a partial chunk
Case 1 is not used for now, so we can ignore it. Case 2 indicates the wish
to modify the sample, so its R/O status must be removed if any, but there's
no implied requirement that the chunk becomes larger. Case 3 is used when
the sample has to be made compatible with libc's str* functions. There's no
need to make it R/W nor to duplicate it if it is already correct. Case 4
can happen when the sample's size is required (eg: before performing some
changes that must fit in the buffer). Case 5 is more or less similar but
will happen when the sample by be grown but we want to ensure we're not
bound by the current small size.
So the proposal is to have different functions for various operations. One
will ensure a sample is safe for use with str* functions. Another one will
ensure it may be rewritten in place. And smp_dup() will have to perform an
inconditional duplication to guarantee at least #5 above, and implicitly
all other ones.
This patch only modifies smp_dup() to make the duplication inconditional. It
is enough to fix both the "base" sample fetch and the "sni" server directive,
and all use cases in general though not always optimally. More patches will
follow to address them more optimally and even better than the current
situation (eg: avoid a dup just to add a \0 when possible).
The bug comes from an ambiguous design, so its roots are old. 1.6 is affected
and a backport is needed. In 1.5, the function already existed but was only
used by two converters modifying the data in place, so the bug has no effect
there.
Willy Tarreau [Mon, 8 Aug 2016 14:41:01 +0000 (16:41 +0200)]
BUG/MAJOR: compression: initialize avail_in/next_in even during flush
For quite some time, a few users have been experiencing random crashes
when compressing with zlib, from versions 1.2.3 to 1.2.8 included.
Upon thourough investigation in zlib's deflate.c, it appeared obvious
that avail_in and next_in are used during the flush operation and need
to be initialized, while admittedly it's not obvious in the documentation.
By simply forcing both values to -1 it's possible to immediately reproduce
the exact crash that these users have been experiencing :
(gdb) bt
#0 0x00007fa73ce10c00 in __memcpy_sse2 () from /lib64/libc.so.6
#1 0x00007fa73e0c5d49 in ?? () from /lib64/libz.so.1
#2 0x00007fa73e0c68e0 in ?? () from /lib64/libz.so.1
#3 0x00007fa73e0c73c7 in deflate () from /lib64/libz.so.1
#4 0x00000000004dca1c in deflate_flush_or_finish (comp_ctx=0x7b6580, out=0x7fa73e5bd010, flag=2) at src/compression.c:808
#5 0x00000000004dcb60 in deflate_flush (comp_ctx=0x7b6580, out=0x7fa73e5bd010) at src/compression.c:835
#6 0x00000000004dbc50 in http_compression_buffer_end (s=0x7c0050, in=0x7c00a8, out=0x78adf0 <tmpbuf.24662>, end=0) at src/compression.c:249
#7 0x000000000048bb5f in http_response_forward_body (s=0x7c0050, res=0x7c00a0, an_bit=1048576) at src/proto_http.c:7173
#8 0x00000000004cce54 in process_stream (t=0x7bffd8) at src/stream.c:1939
#9 0x0000000000427ddf in process_runnable_tasks () at src/task.c:238
#10 0x0000000000419892 in run_poll_loop () at src/haproxy.c:1573
#11 0x000000000041a4a5 in main (argc=4, argv=0x7fffcda38348) at src/haproxy.c:1933
Note that for all reports deflate_flush_or_finish() was always involved.
The crash is very hard to reproduce when using regular traffic because it
requires that the combination of avail_in and next_in are inadequate so
that the memcpy() call reads out of bounds. But this can very likely
happen when the input buffer points to an area reused by another stream
when the flush has been interrupted due to a full output buffer. This
also explains why this report is recent, as dynamic buffer allocation
was introduced in 1.6.
Anyway it's not acceptable to call a function with a randomly set input
buffer. The deflate() function explicitly checks for the case where both
avail_in and next_in are null and doesn't use it in this case during a
flush, so this is the best solution.
Special thanks to Sasha Litvak, James Hartshorn and Paul Bauer for
reporting very useful stack traces which were critical to finding the
root cause of this bug.
This fix must be backported into 1.6 and 1.5, though 1.5 is less likely to
trigger this case given that it keeps its own buffers allocated all along
the session's life.
Willy Tarreau [Sun, 7 Aug 2016 06:12:35 +0000 (08:12 +0200)]
BUG/MEDIUM: stream-int: completely detach connection on connect error
Tim Butler reported a troubling issue affecting all versions since 1.5.
When a connection error occurs and a retry is performed on the same server,
the server connection first goes into the turn-around state (SI_ST_TAR) for
one second. During this time, the client may speak and try to push some
data. The tests in place confirm that the stream interface is in a state
<= SI_ST_EST and that a connection exists, so all ingredients are present
to try to perform a send() to forward data. The send() cannot be performed
since the connection's control layer is marked as not ready, but the
polling flags are changed, and due to the remaining error flag present
on the connection, the polling on the FD is disabled in both directions.
But if this FD was reassigned to another connection in the mean time, it
is this FD which is disabled, and it causes a timeout on another connection.
A configuration allowing to reproduce the issue looks like this :
listen test
bind :8003
server s1 127.0.0.1:8001 # this one should be closed
listen victim
bind :8002
server s1 127.0.0.1:8000 # this one should respond slowly (~50ms)
Two parallel injections should be run with short time-outs (100ms). After
some time, some dead connections will appear in listener "victim" due to
their I/Os being disabled by some of the failed transfers on "test"
instance. These ones will only be flushed on time out. A dead connection
looks like this :
BUG/MEDIUM: lua: somme HTTP manipulation functions are called without valid requests
Somme HTTP manipulation functions are executed without valid and parsed
requests or responses. This causes a segmentation fault when the executed
code tries to change an empty buffer.
MINOR: tcp: Return TCP statistics like RTT and RTT variance
This patch adds 4 new sample fetches which returns the RTT of the
established connexion and the RTT variance. The established connection
can be between the client and HAProxy, and between HAProxy and the
server. This is very useful for statistics. A great use case is the
estimation of the TCP connection time of the client. Note that the
RTT of the server side is not so interesting because we already have
the connect() time.
BUG/MEDIUM: log: use function "escape_string" instead of "escape_chunk"
In function lf_text_len(), we used escape_chunk() to escape special
characters. There could be a problem if len is greater than the real src
string length (zero-terminated), eg. when calling lf_text_len() from
lf_text().
Similar to "escape_chunk", this function tries to prefix all characters
tagged in the <map> with the <escape> character. The specified <string>
contains the input to be escaped.
BUG/MINOR: peers: don't count track-sc multiple times on errors
Ruoshan Huang found that the call to session_inc_http_err_ctr() in the
recent http-response patch was not a good idea because it also increments
counters that are already tracked (eg: http-request track-sc or previous
http-response track-sc). Better open-code the update, it's simple.
This enables tracking of sticky counters from current response. The only
difference from "http-request track-sc" is the <key> sample expression
can only make use of samples in response (eg. res.*, status etc.) and
samples below Layer 6.
BUG/MEDIUM: lua: the function txn_done() from action wrapper can crash
If an action wrapper stops the processing of the transaction
with a txn_done() function, the return code of the action is
"continue". So the continue can implies the processing of other
like adding headers. However, the HTTP content is flushed and
a segfault occurs.
This patchs add a flag indicating that the Lua code want to
stop the processing, ths flags is forwarded to the haproxy core,
and other actions are ignored.
BUG/MEDIUM: lua: the function txn_done() from sample fetches can crash
The function txn_done() ends a transaction. It does not make
sense to call this function from a lua sample-fetch wrapper,
because the role of a sample-fetch is not to terminate a
transaction.
This patch modify the role of the fucntion txn_done() if it
is called from a sample-fetch wrapper, now it just ends the
execution of the Lua code like the done() function.
BUG/MINOR: Fix endiness issue in DNS header creation code
Alexander Lebedev reported that the response bit is set on SPARC when
DNS queries are sent. This has been tracked to the endianess issue, so
this patch makes the code portable.
BUG/MEDIUM: dns: fix alignment issues in the DNS response parser
Alexander Lebedev reported that the DNS parser crashes in 1.6 with a bus
error on Sparc when it receives a response. This is obviously caused by
some alignment issues. The issue can also be reproduced on ARMv5 when
setting /proc/cpu/alignment to 4 (which helps debugging).
Two places cause this crash in turn, the first one is when the IP address
from the packet is compared to the current one, and the second place is
when the address is assigned because an unaligned address is passed to
update_server_addr().
This patch modifies these places to properly use memcpy() and memcmp()
to manipulate the unaligned data.
Nenad Merdanovic found another set of places specific to 1.7 in functions
in_net_ipv4() and in_net_ipv6(), which are used to compare networks. 1.6
has the functions but does not use them. There we perform a temporary copy
to a local variable to fix the problem. The type of the function's argument
is wrong since it's not necessarily aligned, so we change it for a const
void * instead.
This fix must be backported to 1.6. Note that in 1.6 the code is slightly
different, there's no rec[] array, the pointer is used directly from the
buffer.
BUG/MINOR: ssl: fix potential memory leak in ssl_sock_load_dh_params()
Roberto Guimaraes reported that Valgrind complains about a leak
in ssl_get_dh_1024().
This is caused caused by an oversight in ssl_sock_load_dh_params(),
where local_dh_1024 is always replaced by a new DH object even if
it already holds one. This patch simply checks whether local_dh_1024
is NULL before calling ssl_get_dh_1024().
David Carlier [Mon, 4 Jul 2016 21:51:33 +0000 (22:51 +0100)]
CLEANUP: connection: using internal struct to hold source and dest port.
Originally, tcphdr's source and dest from Linux were used to get the
source and port which led to a build issue on BSD oses.
To avoid side problems related to network then we just use an internal
struct as we need only those two fields.
Certain very simple confs randomly segfault upon startup with openssl 1.0.2
with this patch, which seems to indicate a use after free. Better drop it
and let valgrind complain about the potential leak.
Also it's worth noting that the man page for SSL_CTX_set_tmp_dh() makes no
mention about whether or not the element should be freed, and the example
provided does not use it either.
This fix should be backported to 1.6 and 1.5 where the patch was just
included.
CLEANUP: fixed some usages of realloc leading to memory leak
Changed all the cases where the pointer passed to realloc is overwritten
by the pointer returned by realloc. The new function my_realloc2 has
been used except in function register_name. If register_name fails to
add a new variable because of an "out of memory" error, all the existing
variables remain valid. If we had used my_realloc2, the array of variables
would have been freed.
MINOR: new function my_realloc2 = realloc + free upon failure
When realloc fails to allocate memory, the original pointer is not
freed. Sometime people override the original pointer with the pointer
returned by realloc which is NULL in case of failure. This results
in a memory leak because the memory pointed by the original pointer
cannot be freed.
BUG/MINOR: Rework slightly commit 9962f8fc to clean code and avoid mistakes
In commit 9962f8fc (BUG/MEDIUM: http: unbreak uri/header/url_param hashing), we
take care to update 'msg->sov' value when the parser changes to state
HTTP_MSG_DONE. This works when no filter is used. But, if a filter is used and
if it loops on 'http_end' callback, the following block is evaluated two times
consecutively:
Today, in practice, because this happens when all data are parsed and forwarded,
the second test always fails (after the first update, msg->sov is always lower
or equal to 0). But it is useless and error prone. So to avoid misunderstanding
the code has been slightly changed. Now, in all cases, we try to update msg->sov
only once per iteration.
Vedran Furac reported that "balance uri" doesn't work anymore in recent
1.7-dev versions. Dragan Dosen found that the first faulty commit was dbe34eb ("MEDIUM: filters/http: Move body parsing of HTTP messages in
dedicated functions"), merged in 1.7-dev2.
After this patch, the hashing is performed on uninitialized data,
indicating that the buffer is not correctly rewound. In fact, all forms
of content-based hashing are broken since the commit above. Upon code
inspection, it appears that the new functions http_msg_forward_chunked_body()
and http_msg_forward_body() forget to rewind the buffer in the success
case, when the parser changes to state HTTP_MSG_DONE. The rewinding code
was reinserted in both functions and the fix was confirmed by two test,
with and without chunking.
Willy Tarreau [Sun, 26 Jun 2016 17:37:59 +0000 (19:37 +0200)]
DOC: add missing doc for http-request deny [deny_status <status>]
The feature was introduced in 1.6-dev2 by commit 108b1dd ("MEDIUM:
http: configurable http result codes for http-request deny") but the
doc was missing. Thanks to Cyril for noticing.
Willy Tarreau [Fri, 24 Jun 2016 16:30:22 +0000 (18:30 +0200)]
BUG/BUILD: don't automatically run "make" on "make install"
Kay Fuchs reported that the recent changes to automatically rebuild files
on config option changes caused "make install" to rebuild the whole code
with the wrong options. That's caused by the fact that the "install-bin"
target depends on the "haproxy" target, which detects the lack of options
and causes a rebuild with different ones.
This patch makes a simple change, it removes this automatic dependency
which was already wrong since it could cause some files to be built with
different options prior to these changes, and instead emits an error
message indicating that "make" should be run prior to "make install".
The patches were backported into 1.6 so this fix must go there as well.
Willy Tarreau [Wed, 22 Jun 2016 12:51:40 +0000 (14:51 +0200)]
BUG/MINOR: srv-state: fix incorrect output of state file
Eric Webster reported that the state file wouldn't reload in 1.6.5
while it used to work in 1.6.4. The issue is that headers are now
missing from the output when a specific backend is dumped since
commit 4c1544d ("BUG/MEDIUM: stats: show servers state may show an
empty or incomplete result"). This patch fixes this by introducing
a dump state.
BUG/MINOR: filters: Fix HTTP parsing when a filter loops on data forwarding
A filter can choose to loop on data forwarding. When this loop occurs in
HTTP_MSG_ENDING state, http_foward_data callbacks are called twice because of a
goto on the wrong label.
A filter can also choose to loop at the end of a HTTP message, in http_end
callback. Here the goto is good but the label is not at the right place. We must
be sure to upate msg->sov value.
BUG/MEDIUM: filters: Fix data filtering when data are modified
Filters can alter data during the parsing, i.e when http_data or tcp_data
callbacks are called. For now, the update must be done by hand. So we must
handle changes in the channel buffers, especially on the number of input bytes
pending (buf->i).
In addition, a filter can choose to switch channel buffers to do its
updates. So, during data filtering, we must always use the right buffer and not
use variable to reference them.
Without this patch, filters cannot safely alter data during the data parsing.
Willy Tarreau [Tue, 21 Jun 2016 15:34:14 +0000 (17:34 +0200)]
CLEANUP: external-check: don't block/unblock SIGCHLD when manipulating the list
There's no point in blocking/unblocking sigchld when removing entries
from the list since the code is called asynchronously.
Similarly the blocking/unblocking could be removed from the connect_proc_chk()
function but it happens that at high signal rates, fork() takes twice as much
time to execute as it is regularly interrupted by a signal, so in the end this
signal blocking is beneficial there for performance reasons.
Willy Tarreau [Tue, 21 Jun 2016 15:29:46 +0000 (17:29 +0200)]
BUG/MINOR: external-checks: do not unblock undesired signals
The external checks code makes use of block_sigchld() and unblock_sigchld()
to ensure nobody modifies the signals list while they're being manipulated.
It happens that these functions clear the list of blocked signals, so they
can possibly have a side effect if other signals are blocked. For now no
other signal is blocked but it may very well change in the future so rather
correctly use SIG_BLOCK/SIG_UNBLOCK instead of touching unrelated signals.
This fix should be backported to 1.6 for correctness.