Thierry FOURNIER [Fri, 12 Dec 2014 18:41:33 +0000 (19:41 +0100)]
MEDIUM: tcp: add register keyword system.
This patch introduces an action keyword registration system for TCP
rulesets similar to what is available for HTTP rulesets. This system
will be useful with Lua.
Thierry FOURNIER [Mon, 23 Feb 2015 14:11:11 +0000 (15:11 +0100)]
MINOR: converters: give the session pointer as converter argument
Some usages of the converters need to know the attached session. Lua
needs the session to retrieve its running context. This patch adds
the "session" as an argument of the converters prototype.
Thierry FOURNIER [Thu, 26 Feb 2015 09:40:09 +0000 (10:40 +0100)]
MEDIUM: channel: wake up any request analyzer on response activity
This behavior already exists for the "WAIT_HTTP" analyzer; this patch
just extends the system to any analyzer that should be woken up on
response activity.
Thierry FOURNIER [Wed, 14 Jan 2015 10:31:49 +0000 (11:31 +0100)]
MEDIUM: protocol: automatically pick the proto associated to the connection.
When the destination IP is dynamically set, we can't use the "target"
to define the proto. This patch ensures that we always use the protocol
associated with the address family. The proto field was removed from
the server and check structs.
Thierry FOURNIER [Mon, 15 Dec 2014 12:26:01 +0000 (13:26 +0100)]
MAJOR: poll: only rely on wake_expired_tasks() to compute the wait delay
Currently, HAProxy uses the functions "process_runnable_tasks" and
"wake_expired_tasks" to find when the next task expires.
If a task is added with "task_schedule" or another method during
the execution of another task, the expiration of this new task
is not taken into account, and this task may be executed too late.
Currently, HAProxy does not seem to be sensitive to this bug.
This fix moves the call to process_runnable_tasks() before the timeout
calculation and ensures that all wakeups are processed together. Only
wake_expired_tasks() needs to return a timeout now.
Nenad Merdanovic [Fri, 27 Feb 2015 18:56:49 +0000 (19:56 +0100)]
MEDIUM: Add support for configurable TLS ticket keys
Until now, the TLS ticket keys could not be configured and shared
between multiple instances or multiple servers running HAProxy.
As a result, if a request got a TLS ticket from one instance/server
and hit another one afterwards, it had to go through the full SSL
handshake and negotiation.
This patch enables adding a ticket file to the bind line, which will be
used for all SSL contexts created from that bind line. We can use the
same file on all instances or servers to mitigate this issue and have
consistent TLS tickets assigned. Clients will no longer have to negotiate
every time they change the handling process.
Simon Horman [Thu, 26 Feb 2015 02:26:17 +0000 (11:26 +0900)]
BUG/MEDIUM: Do not consider an agent check as failed on L7 error
As failure to connect to the agent check is not sufficient to mark it as
failed it stands to reason that an L7 error shouldn't either.
Without this fix if an L7 error occurs, for example if connectivity to the
agent is lost immediately after establishing a connection to it, then the
agent check will be considered to have failed and thus may end up with zero
health. Once this has occurred if the primary health check also reaches
zero health, which is likely if connectivity to the server is lost, then
the server will be marked as down and not be marked as up again until a
successful agent check occurs regardless of the success of any primary
health checks.
This behaviour is not correct as a failed agent check should never cause a
server to be marked as down or by extension continue to be marked as down.
Willy Tarreau [Mon, 23 Feb 2015 15:07:01 +0000 (16:07 +0100)]
MEDIUM: task: always ensure that the run queue is consistent
As found by Thierry Fournier, if a task manages to kill another one and
if this other task is the next one in the run queue, we can do whatever
including crashing, because the scheduler restarts from the saved next
task. For now, there is no such concept of a task killing another one,
but with Lua it will come.
A solution consists in always performing the lookup of the first task in
the scheduler's loop, but it's expensive and costs around 2% of the
performance.
Another solution consists in keeping a global next run queue node and
ensuring that when this task gets removed, it updates this pointer to
the next one. This simplifies the code a bit and in the end slightly
increases the performance (0.3-0.5%). The mechanism might still
be usable if we later migrate to a multi-threaded scheduler.
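The second approach can be pictured with a small Python model. This is only a sketch under stated assumptions, not HAProxy's scheduler: the RunQueue and Task classes, their method names and the list-based queue are all hypothetical (the real run queue is a tree handled in C). It shows the key idea: the "next" position is shared state that remove() keeps consistent, so a task may kill the one scheduled right after it.

```python
class Task:
    """Hypothetical task: a name plus a callback receiving the run queue."""

    def __init__(self, name, fn=lambda rq: None):
        self.name = name
        self.fn = fn


class RunQueue:
    """Toy run queue that keeps a shared 'next' cursor valid on removal."""

    def __init__(self):
        self.tasks = []      # runnable tasks, in execution order
        self.next_index = 0  # shared cursor: index of the next task to run

    def add(self, task):
        self.tasks.append(task)

    def remove(self, task):
        idx = self.tasks.index(task)
        del self.tasks[idx]
        # Removing an entry located before the cursor shifts the remaining
        # entries left, so the cursor must follow. Removing the entry under
        # the cursor needs no change: its successor slides into that slot.
        if idx < self.next_index:
            self.next_index -= 1

    def run(self):
        executed = []
        self.next_index = 0
        while self.next_index < len(self.tasks):
            task = self.tasks[self.next_index]
            self.next_index += 1  # "next" is recorded before the task runs
            executed.append(task.name)
            task.fn(self)         # the task may remove other tasks
        return executed
```

If task "a" removes task "b" while running, the loop continues with "c" instead of following a stale reference to "b".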
Thierry FOURNIER [Tue, 17 Feb 2015 15:09:00 +0000 (16:09 +0100)]
BUG/MINOR: ARG6 and ARG7 don't fit in a 32 bits word
The patch "MEDIUM: args: increase arg type to 5 bits and limit arg count
to 5" (dbc79d0a) increased the number of types supported, but forgot to
remove the ARG6/ARG7 macros.
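The arithmetic behind this can be checked with a few lines of Python. This is a rough sketch: the 5-bit type width comes from the referenced patch, while the 3-bit count field is an assumption (the smallest field able to hold a count of 5), and the function names are hypothetical.

```python
WORD_BITS = 32


def arg_mask_bits(nb_args, type_bits=5, count_bits=3):
    """Bits needed to pack an argument count plus nb_args type fields.

    count_bits=3 is an assumption made for this sketch.
    """
    return count_bits + nb_args * type_bits


def fits_in_word(nb_args, type_bits=5, count_bits=3):
    """True if the packed argument mask fits in a 32-bit word."""
    return arg_mask_bits(nb_args, type_bits, count_bits) <= WORD_BITS
```

Under these assumptions five arguments need 28 bits, while a sixth would need 33 and a seventh 38.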
Willy Tarreau [Fri, 20 Feb 2015 12:55:29 +0000 (13:55 +0100)]
MINOR: http: add the new sample fetches req.hdr_names and res.hdr_names
These new sample fetches retrieve the list of header names as they appear
in the request or response. This can be used for debugging, for statistics
as well as an aid to better detect the presence of proxies or plugins on
some browsers, which alter the request compared to a regular browser by
adding or reordering headers.
Willy Tarreau [Sat, 14 Feb 2015 13:14:57 +0000 (14:14 +0100)]
BUG/MAJOR: peers: initialize s->buffer_wait when creating the session
Commit bf883e0 ("MAJOR: session: implement a wait-queue for sessions who
need a buffer") introduced in 1.6 forgot to initialize the buffer_wait
list when the session is initiated by an applet for a peer, resulting in
a crash. Thanks to Chris Kopp for reporting the issue.
BUG/MEDIUM: pattern: some entries are not deleted with case insensitive match
ACL or map entries are not deleted with the command "del acl" or "del map"
if the case insensitive flag is set.
This is because case insensitive strings are stored in a list while the
default delete function associated with strings looks in a tree. This patch
adds a check of the case insensitive flag and executes the delete function
for lists if it is set.
Willy Tarreau [Thu, 5 Feb 2015 15:47:07 +0000 (16:47 +0100)]
MINOR: ssl/server: add the "no-ssl-reuse" server option
This option disables SSL session reuse when SSL is used to communicate with
the server. It will force the server to perform a full handshake for every
new connection. It's probably only useful for benchmarking, troubleshooting,
and for paranoid users.
Simon Horman [Fri, 6 Feb 2015 02:11:57 +0000 (11:11 +0900)]
MEDIUM: Allow suppression of email alerts by log level
This patch adds a new option which allows configuration of the maximum
log level of messages for which email alerts will be sent.
The default is alert which is more restrictive than
the current code which sends email alerts for all priorities.
That behaviour may be configured using the new configuration
option to set the maximum level to notice or greater.
Simon Horman [Fri, 6 Feb 2015 02:11:56 +0000 (11:11 +0900)]
MINOR: Remove trailing '.' from email alert messages
This removes the trailing '.' from both the header and the body of email
alerts.
The main motivation for this change is to make the format of email alerts
generated from srv_set_stopped() consistent with those generated from
set_server_check_status().
Willy Tarreau [Tue, 3 Feb 2015 23:45:58 +0000 (00:45 +0100)]
MEDIUM: tcp: implement tcp-ut bind option to set TCP_USER_TIMEOUT
On Linux since 2.6.37, it's possible to set the socket timeout for
pending outgoing data, with an accuracy of 1 millisecond. This is
pretty handy to deal with dead connections to clients and/or servers.
For now we only implement it on the frontend side (bind line) so
that when a client disappears from the net, we're able to quickly
get rid of its connection and possibly release a server connection.
This can be useful with long-lived connections where an application
level timeout is not suited because long pauses are expected (remote
terminals, connection pools, etc).
Thanks to Thijs Houtenbos and John Eckersberg for the suggestion.
Simon Horman [Fri, 30 Jan 2015 02:22:56 +0000 (11:22 +0900)]
MEDIUM: Move proto and addr fields struct check
The motivation for this is to make checks more independent of each
other to allow further reuse of their infrastructure.
For now server->check and server->agent still always use the same values
for the addr and proto fields so this patch should not introduce any
behavioural changes.
Simon Horman [Fri, 30 Jan 2015 02:22:54 +0000 (11:22 +0900)]
MEDIUM: Refactor init_check and move to checks.c
Refactor init_check so that an error string is returned
rather than alerts being printed by it. Also move
init_check to checks.c and provide a prototype to allow
it to be used from multiple C files.
Simon Horman [Fri, 30 Jan 2015 02:22:53 +0000 (11:22 +0900)]
MEDIUM: Remove connect_chk
Remove connect_chk and instead call connect_proc_chk()
and connect_conn_chk(). There no longer seems to be any
value in having a wrapper function here.
Willy Tarreau [Fri, 30 Jan 2015 19:58:58 +0000 (20:58 +0100)]
BUG/MINOR: http: abort request processing on filter failure
Commit c600204 ("BUG/MEDIUM: regex: fix risk of buffer overrun in
exp_replace()") added a control of failure on the response headers,
but forgot to check for the error during request processing. So if
the filters fail to apply, we could keep the request. It might
cause some headers to silently fail to be added for example. Note
that it's tagged MINOR because a standard configuration cannot make
this case happen.
The fix should be backported to 1.5 and 1.4 though.
Cyril Bonté [Thu, 29 Jan 2015 23:07:07 +0000 (00:07 +0100)]
BUG/MINOR: checks: prevent http keep-alive with http-check expect
Sébastien Rohaut reported that string negation in http-check expect didn't
work as expected.
The misbehaviour is caused by responses with HTTP keep-alive. When the
condition is not met, haproxy awaits more data until the buffer is full or the
connection is closed, resulting in a check timeout when "timeout check" is
lower than the keep-alive timeout on the server side.
In order to avoid the issue, when a "http-check expect" is used, haproxy will
ask the server to disable keep-alive by automatically appending a
"Connection: close" header to the request.
Willy Tarreau [Thu, 29 Jan 2015 13:01:34 +0000 (14:01 +0100)]
BUG/MINOR: http: fix incorrect header value offset in replace-hdr/replace-value
The two http-req/http-resp actions "replace-hdr" and "replace-value" were
expecting exactly one space after the colon, which is wrong. It was causing
the first char not to be seen/modified when no space was present, and empty
headers not to be modified either. Instead of using name->len+2, we must use
ctx->val which points to the first character of the value even if there is
no value.
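The difference between the two offsets can be shown with a short Python sketch. It only models the offset computation on a raw header line; the function names are hypothetical, and the second function merely imitates what a pointer to the first value character would yield.

```python
def value_offset_naive(line, name_len):
    """Old behaviour: assume exactly one space follows the colon."""
    return name_len + 2


def value_offset(line, name_len):
    """Skip the colon, then any optional spaces or tabs before the value."""
    pos = name_len + 1
    while pos < len(line) and line[pos] in " \t":
        pos += 1
    return pos
```

With "Host:example" the naive offset lands on "xample" and loses the first character, while the second function returns the start of "example". With an empty header such as "X-Empty:" the naive offset even points past the end of the line.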
Willy Tarreau [Wed, 28 Jan 2015 18:03:21 +0000 (19:03 +0100)]
MEDIUM: init: continue to enforce SYSTEM_MAXCONN with auto settings if set
Commit d025648 ("MAJOR: init: automatically set maxconn and/or maxsslconn
when possible") resulted in a case where if enough memory is available,
a maxconn value larger than SYSTEM_MAXCONN could be computed, resulting
in possibly overflowing other systems resources (eg: kernel socket buffers,
conntrack entries, etc). Let's bound any automatic maxconn to SYSTEM_MAXCONN
if it is defined. Note that the value is set to DEFAULT_MAXCONN since
SYSTEM_MAXCONN forces DEFAULT_MAXCONN, thus it is not an error.
Godbach [Wed, 28 Jan 2015 09:36:16 +0000 (17:36 +0800)]
BUG/MINOR: parse: check the validity of size string in a more strict way
If a stick table is defined as below:
stick-table type ip size 50ka expire 300s
HAProxy will stop parsing the size after "50k" and return that value
directly, although such a size string is not valid. The patch checks
the next character and reports an error if there is one.
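The stricter check can be sketched in Python as follows. This is an illustration only, not the C parser: parse_size is a hypothetical name, and the accepted suffixes (k, m, g as powers of 1024, in either case) are an assumption of this sketch.

```python
_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Parse '<digits>[k|m|g]' and reject anything left after the unit."""
    end = 0
    while end < len(text) and text[end] in "0123456789":
        end += 1
    if end == 0:
        raise ValueError("size %r does not start with a digit" % text)
    value = int(text[:end])
    rest = text[end:]
    unit = rest[:1].lower()
    if unit in _UNITS:
        value *= _UNITS[unit]
        rest = rest[1:]
    # The strict part: any character left after the unit is an error.
    if rest:
        raise ValueError("unexpected character %r in size %r" % (rest[0], text))
    return value
```

Here parse_size("50k") yields 51200, while parse_size("50ka") raises an error instead of silently returning the value parsed so far.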
Willy Tarreau [Tue, 27 Jan 2015 14:12:13 +0000 (15:12 +0100)]
MEDIUM: samples: provide basic arithmetic and bitwise operators
This commit introduces a new category of converters. They are bitwise and
arithmetic operators which support performing basic operations on integers.
Some bitwise operations are supported (and, or, xor, cpl) and some arithmetic
operations are supported (add, sub, mul, div, mod, neg). Some comparators
are provided (odd, even, not, bool) which make it possible to report a match
without having to write an ACL.
The detailed list of new operators as they appear in the doc is :
add(<value>)
Adds <value> to the input value of type unsigned integer, and returns the
result as an unsigned integer.
and(<value>)
Performs a bitwise "AND" between <value> and the input value of type unsigned
integer, and returns the result as an unsigned integer.
bool
Returns a boolean TRUE if the input value of type unsigned integer is
non-null, otherwise returns FALSE. Used in conjunction with and(), it can be
used to report true/false for bit testing on input values (eg: verify the
presence of a flag).
cpl
Takes the input value of type unsigned integer, applies a ones-complement
(flips all bits) and returns the result as an unsigned integer.
div(<value>)
Divides the input value of type unsigned integer by <value>, and returns the
result as an unsigned integer. If <value> is null, the largest unsigned
integer is returned (typically 2^32-1).
even
Returns a boolean TRUE if the input value of type unsigned integer is even
otherwise returns FALSE. It is functionally equivalent to "and(1),not".
mod(<value>)
Divides the input value of type unsigned integer by <value>, and returns the
remainder as an unsigned integer. If <value> is null, then zero is returned.
mul(<value>)
Multiplies the input value of type unsigned integer by <value>, and returns
the product as an unsigned integer. In case of overflow, the higher bits are
lost, leading to seemingly strange values.
neg
Takes the input value of type unsigned integer, computes the opposite value,
and returns the result as an unsigned integer. 0 is identity. This
operator is provided for reversed subtracts : in order to subtract the input
from a constant, simply perform a "neg,add(value)".
not
Returns a boolean FALSE if the input value of type unsigned integer is
non-null, otherwise returns TRUE. Used in conjunction with and(), it can be
used to report true/false for bit testing on input values (eg: verify the
absence of a flag).
odd
Returns a boolean TRUE if the input value of type unsigned integer is odd
otherwise returns FALSE. It is functionally equivalent to "and(1),bool".
or(<value>)
Performs a bitwise "OR" between <value> and the input value of type unsigned
integer, and returns the result as an unsigned integer.
sub(<value>)
Subtracts <value> from the input value of type unsigned integer, and returns
the result as an unsigned integer. Note: in order to subtract the input from
a constant, simply perform a "neg,add(value)".
xor(<value>)
Performs a bitwise "XOR" (exclusive OR) between <value> and the input value
of type unsigned integer, and returns the result as an unsigned integer.
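The semantics listed above can be modelled in a few lines of Python. This is a sketch under one loud assumption: integers are 32 bits wide (the text only says "typically 2^32-1"). The trailing underscores on some names only avoid Python keywords; none of this is HAProxy code.

```python
MASK = 0xFFFFFFFF  # assumption: unsigned integers are 32 bits wide


def add(x, v): return (x + v) & MASK
def sub(x, v): return (x - v) & MASK
def mul(x, v): return (x * v) & MASK       # higher bits are lost on overflow
def div(x, v): return x // v if v else MASK  # null divisor: largest value
def mod(x, v): return x % v if v else 0      # null divisor: zero
def neg(x): return (-x) & MASK             # opposite value, 0 is identity
def cpl(x): return ~x & MASK               # flips all bits
def and_(x, v): return x & v
def or_(x, v): return x | v
def xor(x, v): return x ^ v
def bool_(x): return x != 0
def not_(x): return x == 0
def odd(x): return bool_(and_(x, 1))
def even(x): return not odd(x)
```

For example, add(neg(3), 10) gives 7, which is the reversed subtract "10 - 3" described for the neg operator, and sub(3, 5) wraps around to 0xFFFFFFFE.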
Cyril Bonté [Sat, 24 Jan 2015 23:16:08 +0000 (00:16 +0100)]
MINOR: ssl: load certificates in alphabetical order
As reported by Raphaël Enrici, certificates loaded from a directory are loaded
in an unpredictable order. If no certificate was first loaded from a file, it
can result in different behaviours when haproxy is used in a cluster.
We can also imagine other cases which haven't been encountered yet.
Instead of using readdir(), we can use scandir() and sort files alphabetically.
This will ensure a predictable behaviour.
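The same idea expressed in Python, as a sketch only: list_certificates is a hypothetical helper, and sorted() orders names by code point, which may differ from a locale-aware C sort for non-ASCII names.

```python
import os


def list_certificates(directory):
    """Return the regular files of `directory` in a stable, sorted order."""
    # os.listdir() returns entries in arbitrary order, like readdir().
    names = sorted(os.listdir(directory))
    paths = [os.path.join(directory, name) for name in names]
    return [path for path in paths if os.path.isfile(path)]
```

Every node of a cluster reading the same directory content then gets the same load order, whatever order the filesystem returns.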
This commit implements the following new actions :
- "set-method" rewrites the request method with the result of the
evaluation of format string <fmt>. There should be very few valid reasons
for having to do so as this is more likely to break something than to fix
it.
- "set-path" rewrites the request path with the result of the evaluation of
format string <fmt>. The query string, if any, is left intact. If a
scheme and authority is found before the path, they are left intact as
well. If the request doesn't have a path ("*"), this one is replaced with
the format. This can be used to prepend a directory component in front of
a path for example. See also "set-query" and "set-uri".
Example :
# prepend the host name before the path
http-request set-path /%[hdr(host)]%[path]
- "set-query" rewrites the request's query string which appears after the
first question mark ("?") with the result of the evaluation of format
string <fmt>. The part prior to the question mark is left intact. If the
request doesn't contain a question mark and the new value is not empty,
then one is added at the end of the URI, followed by the new value. If
a question mark was present, it will never be removed even if the value
is empty. This can be used to add or remove parameters from the query
string. See also "set-path" and "set-uri".
Example :
# replace "%3D" with "=" in the query string
http-request set-query %[query,regsub(%3D,=,g)]
- "set-uri" rewrites the request URI with the result of the evaluation of
format string <fmt>. The scheme, authority, path and query string are all
replaced at once. This can be used to rewrite hosts in front of proxies,
or to perform complex modifications to the URI such as moving parts
between the path and the query string. See also "set-path" and
"set-query".
All of them are handled by the same parser and the same exec function,
which is why they're merged all together. For once, instead of adding
even more entries to the huge switch/case, we used the new facility to
register action keywords. A number of the existing ones should probably
move there as well.
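The rewriting rules described for "set-path" and "set-query" can be sketched in Python. This is a simplified model on plain strings: the function names are hypothetical, format string evaluation is left out, and the scheme/authority detection is a deliberately naive assumption (look for "://" and then the next "/").

```python
def set_query(uri, new_query):
    """Replace what follows the first '?', following the rules above."""
    base, qmark, _old = uri.partition("?")
    if qmark:
        return base + "?" + new_query  # an existing '?' is never removed
    if new_query:
        return uri + "?" + new_query   # add one only for a non-empty value
    return uri


def set_path(uri, new_path):
    """Replace the path, keeping scheme, authority and query string."""
    if uri == "*":                     # request without a path
        return new_path
    rest, qmark, query = uri.partition("?")
    prefix = ""
    sep = rest.find("://")
    if sep != -1:
        path_start = rest.find("/", sep + 3)
        if path_start == -1:
            path_start = len(rest)
        prefix = rest[:path_start]     # scheme and authority are left intact
    return prefix + new_path + qmark + query
```

For instance set_query("/a?x=1", "") returns "/a?", keeping the question mark, and set_path("http://example.com/a?b", "/c") returns "http://example.com/c?b".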
Willy Tarreau [Fri, 23 Jan 2015 19:23:17 +0000 (20:23 +0100)]
BUG/MINOR: sample: fix case sensitivity for the regsub converter
Two commits ago in 7eda849 ("MEDIUM: samples: add a regsub converter to
perform regex-based transformations"), I got caught for the second time
with the inverted case sensitivity usage of regex_comp(). So by default
it is case insensitive and passing the "i" flag makes it case sensitive.
I forgot to recheck that case before committing the cleanup. No harm
anyway, nobody had the time to use it.
Simon Horman [Wed, 12 Nov 2014 06:55:54 +0000 (15:55 +0900)]
MEDIUM/BUG: Only explicitly report "DOWN (agent)" if the agent health is zero
Make the check used to explicitly report "DOWN (agent)" slightly
more restrictive such that it only triggers if the agent health is zero.
This avoids the following problem.
1. Backend is started disabled, agent check is enabled
2. Backend is enabled using set server vip/rip state ready
3. Health is marked as down using set server vip/rip health down
At this point the http stats page will report "DOWN (agent)"
but the backend being down has nothing to do with the agent check
This problem appears to have been introduced by cf2924bc2537bb08c
("MEDIUM: stats: report down caused by agent prior to reporting up").
Note that "DOWN (agent)" may also be reported by a more generic conditional
which immediately follows the code changed by this patch.
Reported-by: Mark Brooks <mark@loadbalancer.org>
Signed-off-by: Simon Horman <horms@verge.net.au>
Simon Horman [Wed, 12 Nov 2014 06:55:53 +0000 (15:55 +0900)]
BUG/MEDIUM: Do not set agent health to zero if server is disabled in config
disable starts a server in the disabled state, however setting the health
of an agent implies that the agent is disabled as well as the server.
This is a problem because the state of the agent is not restored if
the state of the server is subsequently updated leading to an
unexpected state.
For example, if a server is started disabled and then the server
state is set to ready then without this change show stat indicates
that the server is "DOWN (agent)" when it is expected that the server
would be UP if its (non-agent) health check passes.
Reported-by: Mark Brooks <mark@loadbalancer.org>
Signed-off-by: Simon Horman <horms@verge.net.au>
Willy Tarreau [Tue, 20 Jan 2015 18:47:06 +0000 (19:47 +0100)]
MEDIUM: samples: add a regsub converter to perform regex-based transformations
We can now replace matching regex parts with a string, a la sed. Note
that there are at least 3 different behaviours for existing sed
implementations when matching 0-length strings. Here is the result
of the following operation on each implementation tested :
echo 'xzxyz' | sed -e 's/x*y*/A/g'
GNU sed 4.2.1 => AzAzA
Perl's sed 5.16.1 => AAzAAzA
Busybox v1.11.2 sed => AzAz
The psed behaviour was adopted because it causes the least exceptions
in the code and seems logical from a certain perspective :
- "x" matches x*y* => add "A" and skip "x"
- "z" matches x*y* => add "A" and keep "z", not part of the match
- "xy" matches x*y* => add "A" and skip "xy"
- "z" matches x*y* => add "A" and keep "z", not part of the match
- "" matches x*y* => add "A" and stop here
Anyway, given the incompatibilities between implementations, it's unlikely
that some processing will rely on this behaviour.
There currently is one big limitation : the configuration parser makes it
impossible to pass commas or closing parentheses (or even closing brackets
in log formats). But that's still quite usable to replace certain characters
or character sequences. It will become more complete once the config parser
is reworked.
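The adopted zero-length match behaviour can be reproduced with a short Python loop. This is a sketch, not the converter's code: regsub_all is a hypothetical name and Python's re module stands in for the regex engine, so pattern syntax details may differ.

```python
import re


def regsub_all(pattern, repl, text):
    """Global substitution following the steps listed above."""
    rx = re.compile(pattern)
    out = []
    pos = 0
    while True:
        match = rx.match(text, pos)  # try to match exactly at `pos`
        if match is None:
            if pos >= len(text):
                break
            out.append(text[pos])    # no match here: copy the character
            pos += 1
            continue
        out.append(repl)
        if match.end() > pos:
            pos = match.end()        # non-empty match: skip what matched
        else:
            if pos >= len(text):
                break                # empty match at the end: stop here
            out.append(text[pos])    # empty match: keep the character
            pos += 1
    return "".join(out)
```

Running regsub_all("x*y*", "A", "xzxyz") walks through the five steps listed above and returns "AAzAAzA".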
Willy Tarreau [Wed, 21 Jan 2015 12:39:42 +0000 (13:39 +0100)]
MEDIUM: regex: add support for passing regex flags to regex_exec_match()
This function (and its sister regex_exec_match2()) abstract the regex
execution but make it impossible to pass flags to the regex engine.
Currently we don't use them but we'll need to support REG_NOTBOL soon
(to indicate that we're not at the beginning of a line). So let's add
support for this flag and update the API accordingly.
Willy Tarreau [Mon, 19 Jan 2015 18:00:58 +0000 (19:00 +0100)]
MINOR: args: implement a new arg type for regex : ARGT_REG
This one will be used when a regex is expected. It is automatically
resolved after the parsing and compiled into a regex. Some optional
flags are supported in the type-specific flags that should be set by
the optional arg checker. One is used during the regex compilation :
ARGF_REG_ICASE to ignore case.
Willy Tarreau [Wed, 21 Jan 2015 14:51:47 +0000 (15:51 +0100)]
MINOR: args: add type-specific flags for each arg in a list
These flags are meant to be used by arg checkers to pass out-of-band
information related to some args. A typical use is to indicate how a
regex is expected to be compiled/matched based on other arguments.
These flags are initialized to zero by default and it is up to the args
checkers to set them if needed.
Willy Tarreau [Mon, 19 Jan 2015 17:54:49 +0000 (18:54 +0100)]
MEDIUM: args: increase arg type to 5 bits and limit arg count to 5
We'll soon need to add new argument types, and we don't use the current
limit of 7 arguments, so let's increase the arg type size to 5 bits and
reduce the arg count to 5 (3 max are used today).
Willy Tarreau [Mon, 19 Jan 2015 17:44:07 +0000 (18:44 +0100)]
MEDIUM: args: use #define to specify the number of bits used by arg types and counts
This is in order to add new types. This patch does not change anything
else. Two remaining (harmless) occurrences of a count of 8 instead of 7
were fixed by this patch : empty_arg_list[] and the for() loop counting
args.
Willy Tarreau [Wed, 21 Jan 2015 19:39:27 +0000 (20:39 +0100)]
BUG/MEDIUM: http: make http-request set-header compute the string before removal
The way http-request/response set-header works is stupid. For a naive
reuse of the del-header code, it removes all occurrences of the header
to be set before computing the new format string. This makes it almost
unusable because it is not possible to append values to an existing
header without first copying them to a dummy header, performing the
copy back and removing the dummy header.
Instead, let's share the same code as add-header and perform the optional
removal after the string is computed. That way it becomes possible to
write things like :
Note that this change is not expected to have any undesirable impact on
existing configs since if they rely on the bogus behaviour, they don't
work as they always retrieve an empty string.
This fix must be backported to 1.5 to stop the spread of ugly configs.
Willy Tarreau [Tue, 20 Jan 2015 18:35:24 +0000 (19:35 +0100)]
MINOR: samples: provide a "crc32" converter
This converter hashes a binary input sample into an unsigned 32-bit quantity
using the CRC32 hash function. Optionally, it is possible to apply a full
avalanche hash function to the output if the optional <avalanche> argument
equals 1. This converter uses the same functions as used by the various hash-
based load balancing algorithms, so it will provide exactly the same results.
It is provided for compatibility with other software which want a CRC32 to be
computed on some input keys, so it follows the most common implementation as
found in Ethernet, Gzip, PNG, etc... It is slower than the other algorithms
but may provide a better or at least less predictable distribution.
Willy Tarreau [Tue, 20 Jan 2015 18:17:09 +0000 (19:17 +0100)]
MINOR: hash: add new function hash_crc32
This function will be used to perform CRC32 computations. This one was
loosely inspired from crc32b found here, and focuses on size and speed
at the same time :
http://www.hackersdelight.org/hdcodetxt/crc.c.txt
Much faster table-based versions exist but are pointless for our usage
here, this hash already sustains gigabit speed which is far faster than
what we'd ever need. Better preserve the CPU's cache instead.
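For reference, a table-less bitwise CRC32 in the spirit of crc32b looks like this in Python. It is a sketch of the general technique, not a transcription of hash_crc32(); the function name is hypothetical. It uses the reflected polynomial 0xEDB88320 shared by Ethernet, Gzip and PNG.

```python
def crc32_bitwise(data):
    """Bit-by-bit CRC32 (reflected polynomial 0xEDB88320), no lookup table."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF
```

The result for b"123456789" is 0xCBF43926, the usual check value for this CRC, and it agrees with zlib.crc32() on the same input.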
Willy Tarreau [Mon, 19 Jan 2015 14:06:26 +0000 (15:06 +0100)]
MINOR: http: add a new fetch "query" to extract the request's query string
This fetch extracts the request's query string, which starts after the first
question mark. If no question mark is present, this fetch returns nothing. If
a question mark is present but nothing follows, it returns an empty string.
This means it's possible to easily know whether a query string is present
using the "found" matching method. This fetch is the complement of "path"
which stops before the question mark.
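The three cases can be modelled in Python as below. This is only an illustration on a plain URI string; fetch_query is a hypothetical name, and None stands for "the fetch returns nothing".

```python
def fetch_query(uri):
    """Return the part after the first '?', or None if there is no '?'."""
    _path, qmark, query = uri.partition("?")
    return query if qmark else None
```

A URI ending with a bare "?" gives an empty string rather than None, which is what makes a "found" style test possible.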
Willy Tarreau [Thu, 15 Jan 2015 20:45:22 +0000 (21:45 +0100)]
MAJOR: init: automatically set maxconn and/or maxsslconn when possible
If a memory size limit is enforced using "-m" on the command line and
one or both of maxconn / maxsslconn are not set, instead of using the
build-time values, haproxy now computes the number of sessions that can
be allocated depending on a number of parameters among which :
- global.maxconn (if set)
- global.maxsslconn (if set)
- maxzlibmem
- tune.ssl.cachesize
- presence of SSL in at least one frontend (bind lines)
- presence of SSL in at least one backend (server lines)
- tune.bufsize
- tune.cookie_len
The purpose is to ensure that haproxy will not run out of memory
when maxing out all parameters. If neither maxconn nor maxsslconn are
used, it will consider that 100% of the sessions involve SSL on sides
where it's supported. That means that it will typically optimize maxconn
for SSL offloading or SSL bridging on all connections. This generally
means that the simple act of enabling SSL in a frontend or in a backend
will significantly reduce the global maxconn but in exchange of that, it
will guarantee that it will not fail.
All metrics may be enforced using #defines to accommodate variations in
SSL libraries or various allocation sizes.
Willy Tarreau [Thu, 15 Jan 2015 20:34:39 +0000 (21:34 +0100)]
MINOR: global: report information about the cost of SSL connections
An SSL connection takes some memory when it exists and during handshakes.
We measured up to 16kB for an established endpoint, and up to 76 extra kB
during a handshake. The SSL layer stores these values into the global
struct during initialization. If other SSL libs are used, it's easy to
change these values. Anyway they'll only be used as gross estimates in
order to guess the max number of SSL conns that can be established when
memory is constrained and the limit is not set.
Willy Tarreau [Thu, 15 Jan 2015 20:32:40 +0000 (21:32 +0100)]
MINOR: global: always export some SSL-specific metrics
We'll need to know the number of SSL connections, their use and their
cost soon. In order to avoid getting tons of ifdefs everywhere, always
export SSL information in the global section. We add two flags to know
whether or not SSL is used in a frontend and in a backend.
Willy Tarreau [Thu, 15 Jan 2015 15:29:53 +0000 (16:29 +0100)]
BUG/MAJOR: log: don't try to emit a log if no logger is set
send_log() calls update_hdr() to build a log header. It may happen
that no logger is defined at all but that we try to send a log anyway
(eg: upon startup). This results in a segfault when building the log
header because logline was never allocated.
This bug was revealed by the recent log-tag changes because the logline
is dereferenced after the call to snprintf(). So in 1.5 on most platforms
it has no impact because snprintf() will ignore NULL, but not necessarily
on all platforms.
Willy Tarreau [Wed, 14 Jan 2015 19:32:59 +0000 (20:32 +0100)]
MINOR: channel: rename bi_erase() to channel_truncate()
It applies to the channel and it doesn't erase outgoing data, only
pending unread data, which is strictly equivalent to what recv()
does with MSG_TRUNC, so that new name is more accurate and intuitive.
Willy Tarreau [Wed, 14 Jan 2015 19:25:34 +0000 (20:25 +0100)]
MINOR: channel: rename bi_avail() to channel_recv_max()
This name more accurately reminds that it applies to a channel and not
to a buffer, and that what is returned may be used as a max number of
bytes to pass to recv().
Willy Tarreau [Wed, 14 Jan 2015 19:16:52 +0000 (20:16 +0100)]
MINOR: channel: rename buffer_reserved() to channel_reserved()
This applies to the channel, not the buffer, so let's fix this name.
Warning, the function's name happens to be the same as the old one
which was mistakenly used during 1.5.
Willy Tarreau [Tue, 13 Jan 2015 19:20:10 +0000 (20:20 +0100)]
MINOR: channel: rename channel_full() to !channel_may_recv()
This function's name was poorly chosen and is confusing to the point of
being suspiciously used at some places. The operations it does always
consider the ability to forward pending input data before receiving new
data. This is not obvious at all, especially at some places where it was
used when consuming outgoing data to know if the buffer has any chance
to ever get the missing data. The code needs to be re-audited with that
in mind. Care must be taken with existing code since the polarity of the
function was switched with the renaming.
channel_reserved is confusingly named. It is used to know whether or
not the rewrite area is left intact for situations where we want to
ensure we can use it before proceeding. Let's rename it to fix this
confusion.
Willy Tarreau [Wed, 14 Jan 2015 15:08:45 +0000 (16:08 +0100)]
BUG/MEDIUM: channel: don't schedule data in transit for leaving until connected
Option http-send-name-header is still hurting. If a POST request has to be
redispatched when this option is used, and the next server's name is larger
than the initial one, and the POST body fills the buffer, it becomes
impossible to rewrite the server's name in the buffer when redispatching.
In 1.4, this is worse, the process may crash because of a negative size
computation for the memmove().
The only solution to fix this is to refrain from eating the reserve before
we're certain that we won't modify the buffer anymore. And the condition for
that is that the connection is established.
This patch introduces "channel_may_send()" which helps to detect whether it's
safe to eat the reserve or not. This condition is used by channel_in_transit()
introduced by recent patches.
This patch series must be backported into 1.5, and a simpler version must be
backported into 1.4 where fixing the bug is much easier since there were no
channels by then. Note that in 1.4 the severity is major.
Willy Tarreau [Tue, 13 Jan 2015 19:09:54 +0000 (20:09 +0100)]
MINOR: channel: add channel_in_transit()
This function returns the amount of bytes in transit in a channel's buffer,
which is the amount of outgoing data plus the amount of incoming data bound
to the forward limit.
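As a rough illustration of the definition above, the computation can be sketched as follows. The struct layouts and the "transit_" names are hypothetical simplifications made for this example, not the real channel and buffer types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real buffer and channel. */
struct transit_buffer {
    unsigned int i;          /* incoming bytes, not yet scheduled */
    unsigned int o;          /* outgoing bytes, already scheduled */
};

struct transit_channel {
    struct transit_buffer *buf;
    unsigned int to_forward; /* bytes still allowed to be forwarded */
};

/* Outgoing data plus the part of incoming data covered by to_forward. */
static unsigned int transit_bytes(const struct transit_channel *chn)
{
    unsigned int fwd;

    assert(chn != NULL && chn->buf != NULL);
    fwd = chn->to_forward;
    if (fwd > chn->buf->i)
        fwd = chn->buf->i;   /* cannot forward more than what is present */
    return chn->buf->o + fwd;
}
```

With 40 outgoing bytes, 100 incoming bytes and to_forward set to 30, this sketch reports 70 bytes in transit.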
Willy Tarreau [Tue, 13 Jan 2015 18:07:23 +0000 (19:07 +0100)]
BUG/MINOR: channel: compare to_forward with buf->i, not buf->size
We know that all incoming data are going to be purged if to_forward
is greater than them, not only if greater than the buffer size. This
bug has no direct impact on this version, but it contributes to some
bugs affecting http-send-name-header since 1.4. This fix will have to
be backported down to 1.4 albeit in a different form.
Willy Tarreau [Thu, 8 Jan 2015 10:34:55 +0000 (11:34 +0100)]
BUG/MEDIUM: channel: fix possible integer overflow on reserved size computation
The buffer_max_len() function is subject to an integer overflow in this
computation :
int ret = global.tune.maxrewrite - chn->to_forward - chn->buf->o;
- chn->to_forward may be up to 2^31 - 1
- chn->buf->o may be up to chn->buf->size
- global.tune.maxrewrite is by definition smaller than chn->buf->size
Thus here we can subtract (2^31 + buf->o) (highly negative) from something
slightly positive, and result in ret being larger than expected.
Fortunately in 1.5 and 1.6, this is only used by bi_avail() which itself
is used by applets which do not set high values for to_forward so this
problem does not happen there. However in 1.4 the equivalent computation
was used to limit the size of a read and can result in a read overflow
when combined with the nasty http-send-name-header feature.
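The arithmetic above can be reproduced with the small sketch below. Since signed overflow is undefined in C, the wrap-around is emulated with unsigned 32-bit arithmetic. The function names are hypothetical, and clamping to zero in the 64-bit variant is only an assumption about what a safe result looks like, not necessarily how the actual fix was written.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates a wrapping 32-bit evaluation of maxrewrite - to_forward - o. */
static int32_t reserve_wrapped(uint32_t maxrewrite, uint32_t to_forward,
                               uint32_t o)
{
    uint32_t r = maxrewrite - to_forward - o;   /* well defined modulo 2^32 */

    /* Portable conversion back to a signed 32-bit value. */
    if (r <= (uint32_t)INT32_MAX)
        return (int32_t)r;
    return (int32_t)((int64_t)r - 4294967296LL);
}

/* Same computation done in 64 bits, with negative results clamped to 0. */
static int32_t reserve_safe(uint32_t maxrewrite, uint32_t to_forward,
                            uint32_t o)
{
    int64_t r;

    assert(maxrewrite <= (uint32_t)INT32_MAX);
    r = (int64_t)maxrewrite - (int64_t)to_forward - (int64_t)o;
    if (r < 0)
        r = 0;
    return (int32_t)r;
}
```

With maxrewrite = 1024, to_forward = 2^31 - 1 and o = 16384, the wrapped variant yields 2147468289 (a large positive value) while the 64-bit variant yields 0.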
Willy Tarreau [Wed, 14 Jan 2015 10:48:58 +0000 (11:48 +0100)]
MINOR: config: extend the default max hostname length to 64 and beyond
Some users reported that the default max hostname length of 32 is too
short in some environments. This patch does two things :
- it relies on the system's max hostname length as found in MAXHOSTNAMELEN
if it is set. This is the most logical thing to do as the system libs
generally present the appropriate value supported by the system. This
value is 64 on Linux and 256 on Solaris, to give a few examples.
- otherwise it defaults to 64
It is still possible to override this value by defining MAX_HOSTNAME_LEN at
build time. After some observation time, this patch may be backported to
1.5 if it does not cause any build issue, as it is harmless and may help
some users.
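The selection order described above (build-time override first, then the system's MAXHOSTNAMELEN, then 64) could look like the following preprocessor sketch. Including <sys/param.h> to obtain MAXHOSTNAMELEN is an assumption that holds on many Unix-like systems, and the "sketch_hostname" buffer is purely illustrative.

```c
#include <assert.h>
#include <sys/param.h>   /* provides MAXHOSTNAMELEN on many Unix-like systems */

/* A value passed at build time (-DMAX_HOSTNAME_LEN=...) always wins. */
#ifndef MAX_HOSTNAME_LEN
#  ifdef MAXHOSTNAMELEN
#    define MAX_HOSTNAME_LEN MAXHOSTNAMELEN   /* trust the system's limit */
#  else
#    define MAX_HOSTNAME_LEN 64               /* fallback default */
#  endif
#endif

static_assert(MAX_HOSTNAME_LEN > 0, "MAX_HOSTNAME_LEN must be positive");

/* Hypothetical storage sized from the resulting limit. */
static char sketch_hostname[MAX_HOSTNAME_LEN];
```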
Willy Tarreau [Wed, 7 Jan 2015 16:23:50 +0000 (17:23 +0100)]
BUG/MEDIUM: http: fix header removal when previous header ends with pure LF
In 1.4-dev7, a header removal mechanism was introduced with commit 68085d8
("[MINOR] http: add http_remove_header2() to remove a header value."). Due
to a typo in the function, the beginning of the headers gets desynchronized
if the header preceding the deleted one ends with an LF/CRLF combination
different from the one of the removed header. The reason is that while
rewinding the pointer, we go back by a number of bytes taking into account
the LF/CRLF status of the removed header instead of the previous one. The
case where it fails is in http-request del-header/set-header where the
multiple occurrences of a header are present and their LF/CRLF ending
differs from the preceding header. The loop then stops because no more
headers are found given that the names and length do not match.
Another point to take into consideration is that removing headers using
a loop of http_find_header2() and this function is inefficient since we
remove values one at a time while it could be simpler and faster to
remove full header lines. This is something that should be addressed
separately.
This fix must be backported to 1.5 and 1.4. Note that http-send-name-header
relies on this function as well so it could be possible that some of the
issues encountered with it in 1.4 come from this bug.
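The core idea, measuring the line terminator of the previous header rather than reusing the one of the removed header, can be sketched as below. This helper is hypothetical and is not the code of http_remove_header2(); it only shows how the number of bytes to rewind depends on the bytes actually preceding the removed line.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns the length of the line terminator ending just before <pos> in
 * <buf>: 2 for CRLF, 1 for a bare LF, 0 if <pos> does not follow one.
 */
static size_t eol_len_before(const char *buf, const char *pos)
{
    assert(buf != NULL && pos >= buf);
    if (pos - buf >= 2 && pos[-1] == '\n' && pos[-2] == '\r')
        return 2;
    if (pos - buf >= 1 && pos[-1] == '\n')
        return 1;
    return 0;
}
```

In a block such as "A: 1\nB: 2\r\nC: 3\r\n", rewinding from the start of "B" must skip one byte even though "B" itself ends with CRLF, which is exactly the mixed case described above.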
Willy Tarreau [Wed, 7 Jan 2015 14:03:42 +0000 (15:03 +0100)]
MINOR: logs: add a new per-proxy "log-tag" directive
This is equivalent to what was done in commit 48936af ("[MINOR] log:
ability to override the syslog tag") but this time instead of doing
this globally, it does it per proxy. The purpose is to be able to use
a separate log tag for various proxies (eg: make it easier to route
log messages depending on the customer).
Cyril Bonté [Sun, 4 Jan 2015 14:17:36 +0000 (15:17 +0100)]
BUG/MEDIUM: backend: correctly detect the domain when use_domain_only is used
balance hdr(<name>) provides an option 'use_domain_only' to match only the
domain part in a header (designed for the Host header).
Olivier Fredj reported that the hashes were not the same for
'subdomain.domain.tld' and 'domain.tld'.
This is because the pointer was rewound one step too far, resulting in a hash
calculated against wrong values :
- '.domai' for 'subdomain.domain.tld'
- ' domai' for 'domain.tld' (beginning with the space in the header line)
Another special case is when no dot can be found in the header : the hash will
be calculated against an empty string.
The patch addresses both cases : 'domain' will be used to compute the hash for
'subdomain.domain.tld', 'domain.tld' and 'domain' (using the whole header value
for the last case).
The fix must be backported to haproxy 1.5 and 1.4.
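The expected extraction can be illustrated with the sketch below, which returns the label located just before the last dot, or the whole value when no dot is present. The function name is hypothetical and the logic is a simplification written for this example (ports and surrounding spaces are ignored); it is not the backend code itself.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns a pointer to the "domain" part of <host> (of length <len>) and
 * stores its length in <out_len>. Nothing is copied.
 */
static const char *domain_only(const char *host, size_t len, size_t *out_len)
{
    const char *last_dot = NULL;
    const char *start;
    const char *p;

    assert(host != NULL && out_len != NULL);

    /* Look for the last dot, scanning backwards from the end. */
    for (p = host + len; p > host; p--) {
        if (p[-1] == '.') {
            last_dot = p - 1;
            break;
        }
    }

    /* No dot at all: use the whole header value. */
    if (last_dot == NULL) {
        *out_len = len;
        return host;
    }

    /* Walk back to the previous dot or to the beginning of the value. */
    start = last_dot;
    while (start > host && start[-1] != '.')
        start--;

    *out_len = (size_t)(last_dot - start);
    return start;
}
```

For 'subdomain.domain.tld', 'domain.tld' and 'domain', this sketch selects the six bytes of 'domain' in all three cases.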
Willy Tarreau [Sun, 28 Dec 2014 11:19:57 +0000 (12:19 +0100)]
CLEANUP: session: remove session_from_task()
Since commit 3dd6a25 ("MINOR: stream-int: retrieve session pointer from
stream-int"), we can get the session from the task, so let's get rid of
this less obvious function.
Cyril Bonté [Sat, 27 Dec 2014 21:28:39 +0000 (22:28 +0100)]
DOC: checks: environment variables used by "external-check command"
Add some documentation about the environment variables available with
"external-check command". Currently, only one of them is dynamically updated
on each check : HAPROXY_SERVER_CURCONN.
Cyril Bonté [Sat, 27 Dec 2014 21:28:38 +0000 (22:28 +0100)]
MINOR: checks: update dynamic environment variables in external checks
Commit 9ede66b0 introduced an environment variable (HAPROXY_SERVER_CURCONN) that
was supposed to be dynamically updated, but it was set only once, during its
initialization.
Most of the code provided in this previous patch has been rewritten in order to
easily update the environment variables without reallocating memory during each
check.
Now, HAPROXY_SERVER_CURCONN will contain the current number of connections on
the server at the time of the check.
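One way to refresh such a variable without reallocating memory is to keep a fixed-size "NAME=value" slot and rewrite it in place before each check. The sketch below only illustrates that idea; the slot size and function name are assumptions and do not reflect the actual implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CURCONN_SLOT_LEN 64   /* hypothetical fixed slot size */

/* Preallocated once, then rewritten in place before each check. */
static char curconn_env[CURCONN_SLOT_LEN];

/* Returns 0 on success, -1 if the formatted string does not fit. */
static int update_curconn_env(int cur_conn)
{
    int n;

    assert(cur_conn >= 0);
    n = snprintf(curconn_env, sizeof(curconn_env),
                 "HAPROXY_SERVER_CURCONN=%d", cur_conn);
    if (n < 0 || (size_t)n >= sizeof(curconn_env))
        return -1;
    return 0;
}
```

The same slot can then be referenced from the environment array handed to the external command, so only its contents change between checks.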
Willy Tarreau [Tue, 9 Dec 2014 18:56:47 +0000 (19:56 +0100)]
OPTIM: stream-int: try to send pending spliced data
This is the equivalent of eb9fd51 ("OPTIM: stream_sock: reduce the amount
of in-flight spliced data") whose purpose is to try to immediately send
spliced data if available.
Willy Tarreau [Mon, 8 Dec 2014 17:14:53 +0000 (18:14 +0100)]
MEDIUM: channel: implement a zero-copy buffer transfer
bi_swpbuf() swaps the buffer passed in argument with the one attached to
the channel, but only if this last one is empty. The idea is to avoid a
copy when buffers can simply be swapped.
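The swap-if-empty idea can be sketched as follows. The types, the function name and the return convention are hypothetical and chosen for this example only; they do not describe the actual prototype of bi_swpbuf().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real types. */
struct swp_buffer {
    size_t len;               /* bytes currently stored */
};

struct swp_channel {
    struct swp_buffer *buf;   /* buffer attached to the channel */
};

/*
 * Exchanges *bufp with the channel's buffer when the latter is empty.
 * Returns 1 if the buffers were swapped, 0 if the caller must fall back
 * to a regular copy.
 */
static int swp_try_swap(struct swp_channel *chn, struct swp_buffer **bufp)
{
    struct swp_buffer *old;

    assert(chn != NULL && chn->buf != NULL && bufp != NULL && *bufp != NULL);
    if (chn->buf->len != 0)
        return 0;             /* channel still holds data, do not swap */

    old = chn->buf;
    chn->buf = *bufp;         /* only pointers move, no byte is copied */
    *bufp = old;
    return 1;
}
```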
Willy Tarreau [Tue, 23 Dec 2014 21:52:37 +0000 (22:52 +0100)]
MINOR: config: implement global setting tune.buffers.limit
This setting is used to limit memory usage without causing the alloc
failures caused by "-m". Unexpectedly, tests have shown a performance
boost of up to about 18% on HTTP traffic when limiting the number of
buffers to about 10% of the amount of concurrent connections.
tune.buffers.limit <number>
Sets a hard limit on the number of buffers which may be allocated per process.
The default value is zero which means unlimited. The minimum non-zero value
will always be greater than "tune.buffers.reserve" and should ideally always
be about twice as large. Forcing this value can be particularly useful to
limit the amount of memory a process may take, while retaining a sane
behaviour. When this limit is reached, sessions which need a buffer wait for
another one to be released by another session. Since buffers are dynamically
allocated and released, the waiting time is very short and not perceptible
provided that limits remain reasonable. In fact sometimes reducing the limit
may even increase performance by increasing the CPU cache's efficiency. Tests
have shown good results on average HTTP traffic with a limit of 1/10 of the
expected global maxconn setting, which also significantly reduces memory
usage. The memory savings come from the fact that a number of connections
will not allocate 2*tune.bufsize. It is best not to touch this value unless
advised to do so by an haproxy core developer.