MEDIUM: lua: only allow actions to yield if not in a final call
Actions may yield but must not do it during the final call from a ruleset
because it indicates there will be no more opportunity to complete or
clean up. This is indicated by ACT_FLAG_FINAL in the action's flags,
which must be passed to hlua_resume().
Thanks to this, an action called from a TCP ruleset is properly woken
up and possibly finished when the client disconnects.
MEDIUM: http: pass ACT_FLAG_FINAL to custom actions
In HTTP it's more difficult to know when to pass the flag or not
because all actions are supposed to be final and there's no inspection
delay. Also, the input channel may very well be closed without this
being an error. So we only set the flag when option abortonclose is
set and the input channel is closed, which is the only case where the
user explicitly wants to forward a close down the chain.
MEDIUM: actions: add new flag ACT_FLAG_FINAL to notify about last call
This new flag indicates to a custom action that it must not yield because
it will not be called anymore. This addresses an issue introduced by commit bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp custom actions"),
which made it possible to yield even after the last call and causes Lua
actions not to be stopped when the session closes. Note that the Lua issue
is not fixed yet at this point. Also only TCP rules were handled, for now
HTTP rules continue to let the action yield since we don't know whether or
not it is a final call.
MEDIUM: actions: pass a new "flags" argument to custom actions
Since commit bc4c1ac ("MEDIUM: http/tcp: permit to resume http and tcp
custom actions"), some actions may yield and be called back when new
information are available. Unfortunately some of them may continue to
yield because they simply don't know that it's the last call from the
rule set. For this reason we'll need to pass a flag to the custom
action to pass such information and possibly other at the same time.
When a thread stops this patch forces a garbage collection which
cleanup and free the memory.
The HAProxy Lua Socket class contains an HAProxy session, so it
uses a big amount of memory, in other this Socket class uses a
filedescriptor and maintain a connection open.
If the class Socket is stored in a global variable, the Socket stay
alive along of the life of the process (except if it is closed by the
other size, or if a timeout is reached). If the class Socket is stored
in a local variable, it must die with this variable.
The socket is closed by a call from the garbage collector. And the
portability and use of a variable is known by the same garbage collector.
so, running the GC just after the end of all Lua code ensure that the
heavy resources like the socket class are freed quickly.
BUG/MEDIUM: lua: properly set the target on the connection
Not having the target set on the connection causes it to be released
at the last moment, and the destination address to randomly be valid
depending on the data found in the memory at this moment. In practice
it works as long as memory poisonning is disabled. The deep reason is
that connect_server() doesn't expect to be called with SF_ADDR_SET and
an existing connection with !reuse. This causes the release of the
connection, its reallocation (!reuse), and taking the address from the
newly allocated connection. This should certainly be improved.
BUG/MEDIUM: lua: better fix for the protocol check
Commit d75cb0f ("BUG/MAJOR: lua: segfault after the channel data is
modified by some Lua action.") introduced a regression causing an
action run from a TCP rule in an HTTP proxy to end in HTTP error
if it terminated cleanly, because it didn't parse the HTTP request!
Relax the test so that it takes into account the opportunity for the
analysers to parse the message.
MINOR: lua: use the proper applet wakeup mechanism
The lua code must the the appropriate wakeup mechanism for cosockets.
It wakes them up from outside the stream and outside the applet, so it
shouldn't use the applet's wakeup callback which is designed only for
use from within the applet itself. For now it didn't cause any trouble
(yet).
BUG/MAJOR: lua: segfault after the channel data is modified by some Lua action.
When an action or a fetch modify the channel data, the http request parser
pointer become inconsistent. This patch detects the modification and call
again the parser.
MEDIUM: lua: use the function lua_rawset in place of lua_settable
The function lua_settable uses the metatable for acessing data,
so in the C part, we want to index value in the real table and
not throught the meta-methods. It's much faster this way.
When we call the function smp_prefetch_http(), if the txn is not initialized,
it doesn't work. This patch fix this. Now, smp_prefecth_http() permits to use
http with any proxy mode.
BUG/MEDIUM: stream-int: avoid double-call to applet->release
While the SI_ST_DIS state is set *after* doing the close on a connection,
it was set *before* calling release on an applet. Applets have no internal
flags contrary to connections, so they have no way to detect they were
already released. Because of this it happened that applets were closed
twice, once via si_applet_release() and once via si_release_endpoint() at
the end of a transaction. The CLI applet could perform a double free in
this case, though the situation to cause it is quite hard because it
requires that the applet is stuck on output in states that produce very
few data.
In order to solve this, we now assign the SI_ST_DIS state *after* calling
->release, and we refrain from doing so if the state is already assigned.
This makes applets work much more like connections and definitely avoids
this double release.
In the future it might be worth making applets have their own flags like
connections to carry their own state regardless of the stream interface's
state, especially when dealing with connection reuse.
No backport is needed since this issue was caused by the rearchitecture
in 1.6.
MINOR: cli: do not call the release handler on internal error.
It's dangerous to call this on internal state error, because it risks
to perform a double-free. This can only happen when a state is not
handled. Note that the switch/case currently doesn't offer any option
for missed states since they're all declared. Better fix this anyway.
The fix was tested by commenting out some entries in the switch/case.
Due to the code inherited from the early CLI mode with the non-interactive
code, the SHUTR status was only considered while waiting for a request,
which prevents the connection from properly being closed during a dump,
and the connection used to remain established. This issue didn't happen
in 1.5 because while this part was missed, the resynchronization performed
in process_session() would detect the situation and handle the cleanup.
BUG/MINOR: stats: do not call cli_release_handler 3 times
If an error happens during a dump on the CLI, an explicit call to
cli_release_handler() is performed. This is not needed anymore since
we introduced ->release() in the applet which is called upon error.
Let's remove this confusing call which can even be risky in some
situations.
BUG/MEDIUM: applet: fix reporting of broken write situation
If an applet tries to write to a closed connection, it hangs forever.
This results in some "get map" commands on the CLI to leave orphaned
connections alive.
Now the applet wakeup function detects that the applet still wants to
write while the channel is closed for reads, which is the equivalent
to the common "broken pipe" situation. In this case, an error is
reported on the stream interface, just as it happens with connections
trying to perform a send() in a similar situation.
With this fix the stats socket is properly released.
MINOR: stream-int: rename si_applet_done() to si_applet_wake_cb()
This function is a callback made only for calls from the applet handler.
Rename it to remove confusion. It's currently called from the Lua code
but that's not correct, we should call the notify and update functions
instead otherwise it will not enable the applet again.
This one is not needed anymore as what it used to do is either
completely covered by the new stream_int_notify() function, or undesired
and inherited from the past as a side effect of introducing the
connections.
This update is theorically never called since it's assigned only when
nothing is connected to the stream interface. However a test has been
added to si_update() to stay safe if some foreign code decides to call
si_update() in unsafe situations.
MEDIUM: stream-int: use the same stream notification function for applets and conns
The code to report completion after a connection update or an applet update
was almost the same since applets stole it from the connection. But the
differences made them hard to maintain and prevented the creation of new
functions doing only one part of the work.
This patch replaces the common code from the si_conn_wake_cb() and
si_applet_wake_cb() with a single call to stream_int_notify() which only
notifies the stream (si+channels+task) from the outside.
MINOR: stream-int: implement the stream_int_notify() function
stream_int_notify() was taken from the common part between si_conn_wake_cb()
and si_applet_done(). It is designed to report activity to a stream from
outside its handler. It'll generally be used by lower layers to report I/O
completion but may also be used by remote streams if the buffer processing
is shared.
MEDIUM: stream-int: clean up the conditions to enable reading in si_conn_wake_cb
The condition to release the SI_FL_WAIT_ROOM flag was abnormally
complicated because it was inherited from 6 years ago before we used
to check for the buffer's emptiness. The CF_READ_PARTIAL flag had to be
removed, and the complex test was replaced with a simpler one checking
if *some* data were moved out or not.
The reason behind this change is to have a condition compatible with
both connections and applets, as applets currently don't work very
well in this area. Specifically, some optimizations on the applet
side cause them not to release the flag above until the buffer is
empty, which may prevent applets from taking together (eg: peers
over large haproxy buffers and small kernel buffers).
MEDIUM: stream-int: call stream_int_update() from si_update()
Now the call to stream_int_update() is moved to si_update(), which
is exclusively called from the stream, so that the socket layer may
be updated without updating the stream layer. This will later permit
to call it individually from other places (other tasks or applets for
example).
MEDIUM: stream-int: factor out the stream update functions
Now that we have a generic stream_int_update() function, we can
replace the equivalent part in stream_int_update_conn() and
stream_int_update_applet() to avoid code duplication.
There is no functional change, as the code is the same but split
in two functions for each call.
MINOR: stream-int: implement a new stream_int_update() function
This function is designed to be called from within the stream handler to
update the channels' expiration timers and the stream interface's flags
based on the channels' flags. It needs to be called only once after the
channels' flags have settled down, and before they are cleared, though it
doesn't harm to call it as often as desired (it just slightly hurts
performance). It must not be called from outside of the stream handler,
as what it does will be used to compute the stream task's expiration.
The code was taken directly from stream_int_update_applet() and
stream_int_update_conn() which had exactly the same one except for
applet-specific or connection-specific status update.
MEDIUM: stream-int: split stream_int_update_conn() into si- and conn-specific parts
The purpose is to separate the connection-specific parts so that the
stream-int specific one can be factored out. There's no functional
change here, only code displacement.
BUG/MAJOR: applet: use a separate run queue to maintain list integrity
If an applet wakes up and causes the next one to sleep, the active list
is corrupted and cannot be scanned anymore, as the process then loops
over the next element.
In order to avoid this problem, we move the active applet list to a run
queue and reinit the active list. Only the first element of this queue
is checked, and if the element is not removed, it is removed and requeued
into the active list.
Since we're using a distinct list, if an applet wants to requeue another
applet into the active list, it properly gets added to the active list
and not to the run queue.
This stops the infinite loop issue that could be caused with Lua applets,
and in any future configuration where two applets could be attached
together.
MINOR: applet: rename applet_runq to applet_active_queue
This is not a real run queue and we're facing ugly bugs because
if this : if a an applet removes another applet from the queue,
typically the next one after itself, the list iterator loops
forever because the list's backup pointer is not valid anymore.
Before creating a run queue, let's rename this list.
The pattern match "found" fails to parse on binary type samples. The
reason is that it presents itself as an integer type which bin cannot
be cast into. We must always accept this match since it validates
anything on input regardless of the type. Let's just relax the parser
to accept it.
BUG/MEDIUM: payload: make req.payload and payload_lv aware of dynamic buffers
Due to a check between offset+len and buf->size, an empty buffer returns
"will never match". Check against tune.bufsize instead.
(cherry picked from commit 43e4039fd5d208fd9d32157d20de93d3ddf9bc0d)
This private configuration pointer is used for storing some configuration
data associated the keyword, So many keywords can use the same parse
function, and this one can use a discriminator.
BUG/MEDIUM: peers: same table updates re-pushed after a re-connect
Some updates are pushed using an incremental update message after a
re-connection whereas the origin is forgotten by the peer.
These updates are never correctly acknowledged. So they are regularly
re-pushed after an idle timeout and a re-connect.
The fix consists to use an absolute update message in some cases.
BUG/MEDIUM: peers: some table updates are randomly not pushed.
If an entry is still not present in the update tree, we could miss to schedule
for a push depending of an un-initialized value (upd.key remains un-initialized
for new sessions or isn't re-initalized for reused ones).
In the same way, if an entry is present in the tree, but its update's tick
is far in the past (> 2^31). We could consider it's still scheduled even if
it is not the case.
The fix consist to force the re-scheduling of an update if it was not present in
the updates tree or if the update is not in the scheduling window of every peers.
PiBANL reported that HAProxy's DNS resolver can't "connect" its socker
on FreeBSD.
Remi Gacogne reported that we should use the function 'get_addr_len' to
get the addr structure size instead of sizeof.
BUG/MEDIUM: stick-tables: fix double-decrement of tracked entries
Mailing list participant "mlist" reported negative conn_cur values in
stick tables as the result of "tcp-request connection track-sc". The
reason is that after the stick entry it copied from the session to the
stream, both the session and the stream grab a reference to the entry
and when the stream ends, it decrements one reference and one connection,
then the same is done for the session.
In fact this problem was already encountered slightly differently in the
past and addressed by Thierry using the patch below as it was believed by
then to be only a refcount issue since it was the observable symptom :
827752e "BUG/MEDIUM: stick-tables: refcount error after copying SC..."
In reality the problem is that the stream must touch neither the refcount
nor the connection count for entries it inherits from the session. While
we have no way to tell whether a track entry was inherited from the session
(since they're simply memcpy'd), it is possible to prevent the stream from
touching an entry that already exists in the session because that's a
guarantee that it was inherited from it.
Note that it may be a temporary fix. Maybe in the future when a session
gives birth to multiple streams we'll face a situation where a session may
be updated to add more tracked entries after instanciating some streams.
The correct long-term fix is to mark some tracked entries as shared or
private (or RO/RW). That will allow the session to track more entries
even after the same trackers are being used by early streams.
No backport is needed, this is only caused by the session/stream split in 1.6.
DOC: update coding-style to reference checkpatch.pl
Running the Linux kernel's checkpatch.pl is actually quite efficient
at spotting style issues and even sometimes bugs. The doc now suggests
how to use it to avoid the warnings that are specific to Linux's stricter
rules.
It properly reports errors like the following ones that were found on
real submissions so it should improve the situation for everyone :
ERROR: "foo * bar" should be "foo *bar"
+static char * tcpcheck_get_step_comment(struct check *, int);
ERROR: do not use assignment in if condition
+ if ((comment = tcpcheck_get_step_comment(check, step)))
WARNING: trailing semicolon indicates no statements, indent implies otherwise
+ if (elem->data && elem->free);
+ elem->free(elem->data);
ERROR: do not initialise statics to 0 or NULL
+static struct lru64_head *ssl_ctx_lru_tree = NULL;
ERROR: space required after that ',' (ctx:VxV)
+ !X509_gmtime_adj(X509_get_notAfter(newcrt),(long)60*60*24*365))
^
WARNING: space prohibited between function name and open parenthesis '('
+ else if (EVP_PKEY_type (capkey->type) == EVP_PKEY_RSA)
ERROR: trailing statements should be on next line
+ if (cacert) X509_free(cacert);
ERROR: space prohibited after that open parenthesis '('
+ !( (srv_op_state == SRV_ST_STOPPED)
BUG/MAJOR: peers: fix a crash when stopping peers on unbound processes
Pradeep Jindal reported and troubleshooted a bug causing haproxy to die
during startup on all processes not making use of a peers section. It
only happens with nbproc > 1 when peers are declared. Technically it's
when the peers task is stopped on processes that don't use it that the
crash occurred (a task_free() called on a NULL task pointer).
This only affects peers v2 in the dev branch, no backport is needed.
James Rosewell [Fri, 18 Sep 2015 17:28:52 +0000 (18:28 +0100)]
MAJOR: 51d: Upgraded to support 51Degrees V3.2 and new features
Trie device detection doesn't benefit from caching compared to Pattern.
As such the LRU cache has been removed from the Trie method.
A new fetch method has been added named 51d.all which uses all the
available HTTP headers for device device detection. The previous 51d
conv method has been changed to 51d.single where one HTTP header,
typically User-Agent, is used for detection. This method is marginally
faster but less accurate.
Three new properties are available with the Pattern method called
Method, Difference and Rank which provide insight into the validity of
the results returned.
A pool of worksets is used to avoid needing to create a new workset for
every request. The workset pool is thread safe ready to support a future
multi threaded version of HAProxy.
James Rosewell [Fri, 18 Sep 2015 16:24:29 +0000 (17:24 +0100)]
BUILD: Changed 51Degrees option to support V3.2
Added support for city hash method, turned off multi threading support
and included maths library. Removed reference to compression library
which was never needed.
DOC: add the documentation about internal circular lists
This file was recovered from the first project where it was born 12 years
ago, but it's still convenient to understand how our circular lists work,
so let's add it.
MINOR: server: startup slowstart task when using seamless reload of HAProxy
This patch uses the start up of the health check task to also start
the warmup task when required.
This is executed only once: when HAProxy has just started up and can
be started only if the load-server-state-from-file feature is enabled
and the server was in the warmup state before a reload occurs.
Baptiste Assmann [Wed, 19 Aug 2015 14:44:03 +0000 (16:44 +0200)]
MINOR: config: new backend directives: load-server-state-from-file and server-state-file-name
This directive gives HAProxy the ability to use the either the global
server-state-file directive or a local one using server-state-file-name to
load server states.
The state can be saved right before the reload by the init script, using
the "show servers state" command on the stats socket redirecting output into
a file.
Baptiste Assmann [Sun, 23 Aug 2015 08:06:39 +0000 (10:06 +0200)]
DOC: new global directive: server-state-file
Documentation related to a new global directive.
Purpose of this directive is to store a file path into the global
structure of HAProxy. The file pointed by the path may be used by
HAProxy to retrieve server state from the previous running process
after a reload occured.
Baptiste Assmann [Sun, 23 Aug 2015 07:22:25 +0000 (09:22 +0200)]
MINOR: config: new global directive server-state-base
This new global directive can be used to provide a base directory where
all the server state files could be loaded.
If a server state file name starts with a slash '/', then this directive
must not be applied.
MINOR: cli: new stats socket command: show servers state
new command 'show servers state' which dumps all variable parameters
of a server during an HAProxy process life.
Purpose is to dump current server state at current run time in order to
read them right after the reload.
The format of the output is versionned and we support version 1 for now.
Introduces a few new macros used by server state save and application accros reloads:
- currently used state server file format version
- currently used state server file header fields
- MIN and MAX value for version number
- maximum number of fields that could be found in a server-state file
- an arbitrary state-file max line length
BUG/MAJOR: can't enable a server through the stat socket
When a server is disabled in the configuration using the "disabled"
keyword, a single flag is positionned: SRV_ADMF_CMAINT (use to be
SRV_ADMF_FMAINT)..
That said, when providing the first version of this code, we also
changed the SRV_ADMF_MAINT mask to match any of the possible MAINT
cases: SRV_ADMF_FMAINT, SRV_ADMF_IMAINT, SRV_ADMF_CMAINT
Since SRV_ADMF_CMAINT is never (and is not supposed to be) altered at
run time, once a server has this flag set up, it can never ever be
enabled again using the stats socket.
In order to fix this, we should:
- consider SRV_ADMF_CMAINT as a simple flag to report the state in the
old configuration file (will be used after a reload to deduce the
state of the server in a new running process)
- enabling both SRV_ADMF_CMAINT and SRV_ADMF_FMAINT when the keyword
"disabled" is in use in the configuration
- update the mask SRV_ADMF_MAINT as it was before, to only match
SRV_ADMF_FMAINT and SRV_ADMF_IMAINT.
The following patch perform the changes above.
It allows fixing the regression without breaking the way the up coming
feature (seamless server state accross reloads) is going to work.
This couple of function executes securely some Lua calls outside of
the lua runtime environment. Each Lua call can return a longjmp
if it encounter a memory error.
Lua documentation extract:
If an error happens outside any protected environment, Lua calls
a panic function (see lua_atpanic) and then calls abort, thus
exiting the host application. Your panic function can avoid this
exit by never returning (e.g., doing a long jump to your own
recovery point outside Lua).
The panic function runs as if it were a message handler (see
§2.3); in particular, the error message is at the top of the
stack. However, there is no guarantee about stack space. To push
anything on the stack, the panic function must first check the
available space (see §4.2).
We must check all the Lua entry point. This includes:
- The include/proto/hlua.h exported functions
- the task wrapper function
- The action wrapper function
- The converters wrapper function
- The sample-fetch wrapper functions
It is tolerated that the initilisation function returns an abort.
Before each Lua abort, an error message is writed on stderr.
The macro SET_SAFE_LJMP initialise the longjmp. The Macro
RESET_SAFE_LJMP reset the longjmp. These function must be macro
because they must be exists in the program stack when the longjmp
is called
Released version 1.6-dev5 with the following main changes :
- MINOR: dns: dns_resolution structure update: time_t to unsigned int
- BUG/MEDIUM: dns: DNS resolution doesn't start
- BUG/MAJOR: dns: dns client resolution infinite loop
- MINOR: dns: coding style update
- MINOR: dns: new bitmasks to use against DNS flags
- MINOR: dns: dns_nameserver structure update: new counter for truncated response
- MINOR: dns: New DNS response analysis code: DNS_RESP_TRUNCATED
- MEDIUM: dns: handling of truncated response
- MINOR: DNS client query type failover management
- MINOR: dns: no expected DNS record type found
- MINOR: dns: new flag to report that no IP can be found in a DNS response packet
- BUG/MINOR: DNS request retry counter used for retry only
- DOC: DNS documentation updated
- MEDIUM: actions: remove ACTION_STOP
- BUG/MEDIUM: lua: outgoing connection was broken since 1.6-dev2 (bis)
- BUG/MINOR: lua: last log character truncated.
- CLEANUP: typo: bad indent
- CLEANUP: actions: missplaced includes
- MINOR: build: missing header
- CLEANUP: lua: Merge log functions
- BUG/MAJOR: http: don't manipulate the server connection if it's killed
- BUG/MINOR: http: remove stupid HTTP_METH_NONE entry
- BUG/MAJOR: http: don't call http_send_name_header() after an error
- MEDIUM: tools: make str2sa_range() optionally return the FQDN
- BUG/MINOR: tools: make str2sa_range() report unresolvable addresses
- BUG/MEDIUM: dns: use the correct server hostname when resolving
I cannot build the current dev's master HEAD (ec3c37d) because of this error:
> In file included from include/proto/proto_http.h:26:0,
> from src/stick_table.c:26:
> include/types/action.h:102:20: error: field â\80\98reâ\80\99 has incomplete type
> struct my_regex re; /* used by replace-header and replace-value */
> ^
> Makefile:771: recipe for target 'src/stick_table.o' failed
> make: *** [src/stick_table.o] Error 1
The struct act_rule defined in action.h includes a full struct my_regex
without #include-ing regex.h. Both gcc 5.2.0 and clang 3.6.2 do not allow this.
More information regarding DNS resolution:
- behavior in case of errors
- behavior when multiple name servers are configured in a resolvers
section
- when a retry is performed
- when a query type change is performed
- make it clear that DNS resolution requires health checking enabled
on the server
BUG/MINOR: DNS request retry counter used for retry only
There are two types of retries when performing a DNS resolution:
1. retry because of a timeout
2. retry of the full sequence of requests (query types failover)
Before this patch, the 'resolution->try' counter was incremented
after each send of a DNS request, which does not cover the 2 cases
above.
This patch fix this behavior.