Cyril Bonté [Mon, 11 Oct 2010 22:14:36 +0000 (00:14 +0200)]
[MEDIUM] stats: add an admin level
The stats web interface must be read-only by default to prevent security
holes. As it is now allowed to enable/disable servers, a new keyword
"stats admin" is introduced to activate this admin level, conditioned by ACLs.
(cherry picked from commit 5334bab92ca7debe36df69983c19c21b6dc63f78)
Cyril Bonté [Mon, 11 Oct 2010 22:14:35 +0000 (00:14 +0200)]
[MEDIUM] enable/disable servers from the stats web interface
Based on a patch provided by Judd Montgomery, it is now possible to
enable/disable servers from the stats web interface. This allows to select
several servers in a backend and apply the action to them at the same time.
Currently, there are 2 known limitations :
- The POST data are limited to one packet
(don't alter too many servers at a time).
- Expect: 100-continue is not supported.
(cherry picked from commit 7693948766cb5647ac03b48e782cfee2b1f14491)
Willy Tarreau [Sun, 17 Oct 2010 15:16:42 +0000 (17:16 +0200)]
[BUG] checks: don't log backend down for all zero-weight servers
In a down backend, when a zero-weight server is lost, a new
"backend down" message was emitted and the down transition of that
backend was wrongly increased. This change ensures that we don't
count that transition again.
Willy Tarreau [Thu, 7 Oct 2010 19:00:29 +0000 (21:00 +0200)]
[MEDIUM] cookie: set the date in the cookie if needed
If a maxidle or maxlife parameter is set on the persistence cookie in
insert mode and the client did not provide a recent enough cookie,
then we emit a new cookie with a new last_seen date and the same
first_seen (if maxlife is set). Recent enough here designates a
cookie that would be rounded to the same date. That way, we can
refresh a cookie when required without doing it in all responses.
If the request did not contain such parameters, they are set anyway.
This means that a monitoring request that is forced to a server will
get an expiration date anyway, but this should not be a problem given
that the client is able to set its cookie in this case. This also
permits to force an expiration date on visitors who previously did
not have one.
If a request comes with a dated cookie while no date check is performed,
then a new cookie is emitted with no date, so that we don't risk dropping
the user too fast due to a very old date when we re-enable the date check.
All requests that were targetting the correct server and which had their
expiration date added/updated/removed in the response cookie are logged
with the 'U' ("updated") flag instead of the 'I' ("inserted"). So very
often we'll see "VU" instead of "VN".
(cherry picked from commit 8b3c6ecab6d37be5f3655bc3a2d2c0f9f37325eb)
Willy Tarreau [Thu, 7 Oct 2010 18:06:11 +0000 (20:06 +0200)]
[MEDIUM] cookie: check for maxidle and maxlife for incoming dated cookies
If a cookie comes in with a first or last date, and they are configured on
the backend, they're checked. If a date is expired or too far in the future,
then the cookie is ignored and the specific reason appears in the cookie
field of the logs.
(cherry picked from commit faa3019107eabe6b3ab76ffec9754f2f31aa24c6)
Willy Tarreau [Thu, 7 Oct 2010 17:27:29 +0000 (19:27 +0200)]
[MINOR] add encode/decode function for 30-bit integers from/to base64
These functions only require 5 chars to encode 30 bits, and don't expect
any padding. They will be used to encode dates in cookies.
(cherry picked from commit a7e2b5fc4612994c7b13bcb103a4a2c3ecd6438a)
Willy Tarreau [Thu, 7 Oct 2010 13:54:11 +0000 (15:54 +0200)]
[MEDIUM] cookie: reassign set-cookie status flags to store more states
The set-cookie status flags were not very handy and limited. Reorder
them to save some room for additional values and add the "U" flags
(for Updated expiration date) that will be used with expirable cookies
in insert mode.
(cherry picked from commit 5bab52f821bb0fa99fc48ad1b400769e66196ece)
Willy Tarreau [Wed, 6 Oct 2010 17:38:55 +0000 (19:38 +0200)]
[MINOR] http: make some room in the transaction flags to extend cookies
We'll need one more bit to store and report the request cookie's status.
Doing this required moving a few bits around. However, now in 1.4 all bits
are used, there's no room left.
Willy Tarreau [Wed, 6 Oct 2010 17:25:55 +0000 (19:25 +0200)]
[MEDIUM] cookie: support client cookies with some contents appended to their value
In all cookie persistence modes but prefix, we now support cookies whose
value is suffixed with some contents after a vertical bar ('|'). This will
be used to pass an optional expiration date. So as of now we only consider
the part of the cookie value which is used before the vertical bar.
(cherry picked from commit a4486bf4e5b03b5a980d03fef799f6407b2c992d)
Willy Tarreau [Wed, 6 Oct 2010 14:59:56 +0000 (16:59 +0200)]
[MINOR] cookie: add options "maxidle" and "maxlife"
Add two new arguments to the "cookie" keyword, to be able to
fix a max idle and max life on them. Right now only the parameter
parsing is implemented.
(cherry picked from commit 9ad5dec4c3bb8f29129f292cb22d3fc495fcc98a)
Willy Tarreau [Mon, 4 Oct 2010 18:39:20 +0000 (20:39 +0200)]
[MINOR] global: add "tune.chksize" to change the default check buffer size
HTTP content-based health checks will be involved in searching text in pages.
Some pages may not fit in the default buffer (16kB) and sometimes it might be
desired to have larger buffers in order to find patterns. Running checks on
smaller URIs is always preferred of course.
(cherry picked from commit 043f44aeb835f3d0b57626c4276581a73600b6b1)
Willy Tarreau [Tue, 16 Mar 2010 17:46:54 +0000 (18:46 +0100)]
[MEDIUM] checks: add support for HTTP contents lookup
This patch adds the "http-check expect [r]{string,status}" statements
which enable health checks based on whether the response status or body
to an HTTP request contains a string or matches a regex.
This probably is one of the oldest patches that remained unmerged. Over
the time, several people have contributed to it, among which FinalBSD
(first and second implementations), Nick Chalk (port to 1.4), Anze
Skerlavaj (tests and fixes), Cyril Bonté (general fixes), and of course
myself for the final fixes and doc during integration.
Some people already use an old version of this patch which has several
issues, among which the inability to search for a plain string that is
not at the beginning of the data, and the inability to look for response
contents that are provided in a second and subsequent recv() calls. But
since some configs are already deployed, it was quite important to ensure
a 100% compatible behaviour on the working cases.
Thus, that patch fixes the issues while maintaining config compatibility
with already deployed versions.
[MINOR] checks: add support for LDAPv3 health checks
This patch provides a new "option ldap-check" statement to enable
server health checks based on LDAPv3 bind requests.
(cherry picked from commit b76b44c6fed8a7ba6f0f565dd72a9cb77aaeca7c)
[MEDIUM] make it possible to combine http-pretend-keepalived with httpclose
Some configs may involve httpclose in a frontend and http-pretend-keepalive
in a backend. httpclose used to take priority over keepalive, thus voiding
its effect. This change ensures that when both are combined, keepalive is
still announced to the server while close is announced to the client.
(cherry picked from commit 2be7ec90fa9caf66294f446423bbab2d00db9004)
[BUG] stream_sock: try to flush any extra pending request data after a POST
Some broken browsers still happen to send a CRLF after a POST. Those which
send a CRLF in a second packet have it queued into the system's buffers,
which causes an RST to be emitted by some systems upon close of the response
(eg: Linux). The client may then receive the RST without the last response
segments, resulting in a truncated response.
This change leaves request polling enabled on a POST so that we can flush
any late data from the request buffers.
A more complete workaround would consist in reading from the request for a
long time, until we get confirmation that the close has been ACKed. This
is much more complex and should only be studied for newer versions.
(cherry picked from commit 12e316af4f0245fde12dbc224ebe33c8fea806b2)
[BUG] ebtree: string_equal_bits() could return garbage on identical strings
(from ebtree 6.0.2)
When inserting duplicates on x86/x86_64, the assembler optimization
does not support equal strings that both end up with a zero, and
can return garbage in the bit number, possibly causing a segfault
for its users. The only case where this can happen appears to be
in ebst_insert().
(cherry picked from commit 006152c62ae56d151188626e6074a79be3928858)
[BUG] stream_sock: cleanly disable the listener in case of resource shortage
Jozsef R.Nagy reported a reliability issue on FreeBSD. Sometimes an error
would be emitted, reporting the inability to switch a socket to non-blocking
mode and the listener would definitely not accept anything. Cyril Bonté
narrowed this bug down to the call to EV_FD_CLR(l->fd, DIR_RD).
He was right because this call is wrong. It only disables input events on
the listening socket, without setting the listener to the LI_LISTEN state,
so any subsequent call to enable_listener() from maintain_proxies() is
ignored ! The correct fix consists in calling disable_listener() instead.
It is discutable whether we should keep such error path or just ignore the
event. The goal in earlier versions was to temporarily disable new activity
in order to let the system recover while releasing resources.
[MEDIUM] buffers: rework the functions to exchange between SI and buffers
There was no consistency between all the functions used to exchange data
between a buffer and a stream interface. Also, the functions used to send
data to a buffer did not consider the possibility that the buffer was
shutdown for read.
Now the functions are called buffer_{put,get}_{char,block,chunk,string}.
The old buffer_feed* functions have been left available for existing code
but marked deprecated.
[BUG] stream_interface: only call si->release when both dirs are closed
si->release() was called each time we closed one direction of a stream
interface, while it should only have been called when both sides are
closed. This bug is specific to 1.5 and only affects embedded tasks.
[BUG] deinit: unbind listeners before freeing them
In deinit(), it is possible that we first free the listeners, then
unbind them all. Right now this situation can't happen because the
only way to call deinit() is to pass via a soft-stop which will
already unbind all protocols. But later this might become a problem.
Willy Tarreau [Tue, 31 Aug 2010 20:39:35 +0000 (22:39 +0200)]
[MEDIUM] http: fix space handling in the response cookie parser
This patch addresses exactly the same issues as the previous one, but
for responses this time. It also introduces implicit support for the
Set-Cookie2 header, for which there's almost nothing specific to do
since it is a clean header. This one allows multiple cookies in a
same header, by respecting the HTTP messaging semantics.
The new parser has been tested with insertion, rewrite, passive,
removal, prefixing and captures, and it looks OK. It's still able
to rewrite (or delete) multiple cookies at once. Just as with the
request parser, it tries hard to fix formating of the cookies it
displaces.
This patch too should be backported to 1.4 and possibly to 1.3.
Willy Tarreau [Tue, 31 Aug 2010 14:45:02 +0000 (16:45 +0200)]
[MEDIUM] http: fix space handling in the request cookie parser
The request cookie parser did not allow spaces to appear in cookie
values nor around the equal sign. The various RFCs on the subject
say different things, some suggesting that a space is allowed after
the equal sign and being worded in a way that lets one believe it
is allowed before too. Some spaces may appear inside values and be
part of the values. The quotes allow delimiters to be embedded in
values. The spaces before and after attributes should be trimmed.
The new parser addresses all those points and has been carefully tested.
It fixes misplaced spaces around equal signs before processing the cookies
or forwarding them. It also tries its best to perform clean removals by
always keeping the delimiter after the value being removed and leaving one
space after it.
The variable inside the parser have been renamed to make the code a lot
more understandable, and one multi-function pointer has been eliminated.
Since this patch fixes real possible issues, it should be backported to 1.4
and possibly 1.3, since one (single) case of wrong spaces has been reported
in 1.3.
The code handling the Set-Cookie has not been touched yet.
Willy Tarreau [Tue, 31 Aug 2010 20:54:15 +0000 (22:54 +0200)]
[DOC] fix description of cookie "insert" and "indirect" modes
The doc was wrong as the insert mode by default does not insert in
direct requests, and by default transmits the cookies to the server.
This was right in the old doc and it has not changed since the
beginning.
Willy Tarreau [Tue, 31 Aug 2010 13:39:26 +0000 (15:39 +0200)]
[MINOR] support a global jobs counter
This counter is incremented for each incoming connection and each active
listener, and is used to prevent haproxy from stopping upon SIGUSR1. It
will thus be possible for some tasks in increment this counter in order
to prevent haproxy from dying until they have completed their job.
Willy Tarreau [Mon, 30 Aug 2010 09:06:34 +0000 (11:06 +0200)]
[BUG] http: don't consider commas as a header delimitor within quotes
The header parser has a bug which causes commas to be matched within
quotes while it was not expected. The way the code was written could
make one think it was OK. The resulting effect is that the following
config would use the second IP address instead of the third when facing
this request :
source 0.0.0.0 usesrc hdr_ip(X-Forwarded-For,2)
GET / HTTP/1.0
X-Forwarded-for: "127.0.0.1, 127.0.0.2", 127.0.0.3
Willy Tarreau [Sat, 28 Aug 2010 17:21:00 +0000 (19:21 +0200)]
[RELEASE] Released version 1.5-dev2
Released version 1.5-dev2 with the following main changes :
- [MINOR] startup: release unused structs after forking
- [MINOR] startup: don't wait for nothing when no old pid remains
- [CLEANUP] reference product branch 1.5
- [MEDIUM] signals: add support for registering functions and tasks
- [MEDIUM] signals: support redistribution of signal zero when stopping
- [BUG] http: don't set auto_close if more data are expected
Willy Tarreau [Sat, 28 Aug 2010 16:57:20 +0000 (18:57 +0200)]
[BUG] http: don't set auto_close if more data are expected
Fix 4fe41902789d188ee4c23b14a7cdbf075463b158 was a bit too strong. It
has caused some chunked-encoded responses to be truncated when a recv()
call could return multiple chunks followed by a close. The reason is
that when a chunk is parsed, only its contents are scheduled to be
forwarded. Thus, the reader sees auto_close+shutr and sets shutw_now.
The sender in turn sends the last scheduled data and does shutw().
Another nasty effect is that it has reduced the keep-alive rate. If
a response did not completely fit into the buffer, then the auto_close
bit was left on and the sender would close upon completion.
The fix consists in not making use of auto_close when chunked encoding
is used nor when keep-alive is used, which makes sense. However it is
maintained on error processing.
Thanks to Cyril Bonté for reporting the issue early.
Willy Tarreau [Fri, 27 Aug 2010 16:26:11 +0000 (18:26 +0200)]
[MEDIUM] signals: support redistribution of signal zero when stopping
Signal zero is never delivered by the system. However having a signal to
which functions and tasks can subscribe to be notified of a stopping event
is useful. So this patch does two things :
1) allow signal zero to be delivered from any function of signal handler
2) make soft_stop() deliver this signal so that tasks can be notified of
a stopping condition.
Willy Tarreau [Fri, 27 Aug 2010 15:56:48 +0000 (17:56 +0200)]
[MEDIUM] signals: add support for registering functions and tasks
The two new functions below make it possible to register any number
of functions or tasks to a system signal. They will be called in the
registration order when the signal is received.
struct sig_handler *signal_register_fct(int sig, void (*fct)(struct sig_handler *), int arg);
struct sig_handler *signal_register_task(int sig, struct task *task, int reason);
Willy Tarreau [Wed, 25 Aug 2010 10:58:59 +0000 (12:58 +0200)]
[MINOR] startup: don't wait for nothing when no old pid remains
In case of binding failure during startup, we wait for some time sending
signals to old pids so that they release the ports we need. But if there
aren't any old pids anymore, it's useless to wait, we prefer to fail fast.
Along with this change, we now have the number of old pids really found
in the nb_oldpids variable.
Willy Tarreau [Wed, 25 Aug 2010 08:56:53 +0000 (10:56 +0200)]
[RELEASE] Released version 1.5-dev1
Released version 1.5-dev1 with the following main changes :
- [BUG] stats: session rate limit gets garbaged in the stats
- [DOC] mention 'option http-server-close' effect in Tq section
- [DOC] summarize and highlight persistent connections behaviour
- [DOC] add configuration samples
- [BUG] http: dispatch and http_proxy modes were broken for a long time
- [BUG] http: the transaction must be initialized even in TCP mode
- [BUG] tcp: dropped connections must be counted as "denied" not "failed"
- [BUG] consistent hash: balance on all servers, not only 2 !
- [CONTRIB] halog: report per-server status codes, errors and response times
- [BUG] http: the transaction must be initialized even in TCP mode (part 2)
- [BUG] client: always ensure to zero rep->analysers
- [BUG] session: clear BF_READ_ATTACHED before next I/O
- [BUG] http: automatically close response if req is aborted
- [BUG] proxy: connection rate limiting was eating lots of CPU
- [BUG] http: report correct flags in case of client aborts during body
- [TESTS] refine non-regression tests and add 4 new tests
- [BUG] debug: wrong pointer was used to report a status line
- [BUG] debug: correctly report truncated messages
- [DOC] document the "dispatch" keyword
- [BUG] stick_table: fix possible memory leak in case of connection error
- [CLEANUP] acl: use 'L6' instead of 'L4' in ACL flags relying on contents
- [MINOR] accept: count the incoming connection earlier
- [CLEANUP] tcp: move some non tcp-specific layer6 processing out of proto_tcp
- [CLEANUP] client: move some ACLs away to their respective locations
- [CLEANUP] rename client -> frontend
- [MEDIUM] separate protocol-level accept() from the frontend's
- [MINOR] proxy: add a list to hold future layer 4 rules
- [MEDIUM] config: parse tcp layer4 rules (tcp-request accept/reject)
- [MEDIUM] tcp: check for pure layer4 rules immediately after accept()
- [OPTIM] frontend: tell the compiler that errors are unlikely to occur
- [MEDIUM] frontend: check for LI_O_TCP_RULES in the listener
- [MINOR] frontend: only check for monitor-net rules if LI_O_CHK_MONNET is set
- [CLEANUP] buffer->cto is not used anymore
- [MEDIUM] session: finish session establishment sequence in with I/O handlers
- [MEDIUM] session: initialize server-side timeouts after connect()
- [MEDIUM] backend: initialize the server stream_interface upon connect()
- [MAJOR] frontend: don't initialize the server-side stream_int anymore
- [MEDIUM] session: move the conn_retries attribute to the stream interface
- [MEDIUM] session: don't assign conn_retries upon accept() anymore
- [MINOR] frontend: rely on the frontend and not the backend for INDEPSTR
- [MAJOR] frontend: reorder the session initialization upon accept
- [MINOR] proxy: add an accept() callback for the application layer
- [MAJOR] frontend: split accept() into frontend_accept() and session_accept()
- [MEDIUM] stats: rely on the standard session_accept() function
- [MINOR] buffer: refine the flags that may wake an analyser up.
- [MINOR] stream_sock: don't dereference a non-existing frontend
- [MINOR] session: differenciate between accepted connections and received connections
- [MEDIUM] frontend: count the incoming connection earlier
- [MINOR] frontend: count denied TCP requests separately
- [CLEANUP] stick_table: add/clarify some comments
- [BUILD] memory: add a few missing parenthesis to the pool management macros
- [MINOR] stick_table: add support for variable-sized data
- [CLEANUP] stick_table: rename some stksess struct members to avoid confusion
- [CLEANUP] stick_table: move pattern to key functions to stick_table.c
- [MEDIUM] stick_table: add room for extra data types
- [MINOR] stick_table: add support for "conn_cum" data type.
- [MEDIUM] stick_table: don't overwrite data when storing an entry
- [MINOR] config: initialize stick tables after all the parsing
- [MINOR] stick_table: provide functions to return stksess data from a type
- [MEDIUM] stick_table: move the server ID to a generic data type
- [MINOR] stick_table: enable it for frontends too
- [MINOR] stick_table: export the stick_table_key
- [MINOR] tcp: add per-source connection rate limiting
- [MEDIUM] stick_table: separate storage and update of session entries
- [MEDIUM] stick-tables: add a reference counter to each entry
- [MINOR] session: add a pointer to the tracked counters for the source
- [CLEANUP] proto_tcp: make the config parser a little bit more flexible
- [BUG] config: report the correct proxy type in tcp-request errors
- [MINOR] config: provide a function to quote args in a more friendly way
- [BUG] stick_table: the fix for the memory leak caused a regression
- [MEDIUM] backend: support servers on 0.0.0.0
- [BUG] stick-table: correctly refresh expiration timers
- [MEDIUM] stream-interface: add a ->release callback
- [MINOR] proxy: add a "parent" member to the structure
- [MEDIUM] session: make it possible to call an I/O handler on both SI
- [MINOR] tools: add a fast div64_32 function
- [MINOR] freq_ctr: add new types and functions for periods different from 1s
- [MINOR] errors: provide new status codes for config parsing functions
- [BUG] http: denied requests must not be counted as denied resps in listeners
- [MINOR] tools: add a get_std_op() function to parse operators
- [MEDIUM] acl: make use of get_std_op() to parse intger ranges
- [MAJOR] stream_sock: better wakeup conditions on read()
- [BUG] session: analysers must be checked when SI state changes
- [MINOR] http: reset analysers to listener's, not frontend's
- [MEDIUM] session: support "tcp-request content" rules in backends
- [BUILD] always match official tags when doing git-tar
- [MAJOR] stream_interface: fix the wakeup conditions for embedded iohandlers
- [MEDIUM] buffer: make buffer_feed* support writing non-contiguous chunks
- [MINOR] tcp: src_count acl does not have a permanent result
- [MAJOR] session: add track-counters to track counters related to the session
- [MINOR] stick-table: provide a table lookup function
- [MINOR] stick-table: use suffix "_cnt" for cumulated counts
- [MEDIUM] session: move counter ACL fetches from proto_tcp
- [MEDIUM] session: add concurrent connections counter
- [MEDIUM] session: add data in and out volume counters
- [MINOR] session: add the trk_conn_cnt ACL keyword to track connection counts
- [MEDIUM] session-counters: automatically update tracked connection count
- [MINOR] session: add the trk_conn_cur ACL keyword to track concurrent connection
- [MINOR] session: add trk_kbytes_* ACL keywords to track data size
- [MEDIUM] session: add a counter on the cumulated number of sessions
- [MINOR] config: support a comma-separated list of store data types in stick-table
- [MEDIUM] stick-tables: add support for arguments to data_types
- [MEDIUM] stick-tables: add stored data argument type checking
- [MEDIUM] session counters: add conn_rate and sess_rate counters
- [MEDIUM] session counters: add bytes_in_rate and bytes_out_rate counters
- [MINOR] stktable: add a stktable_update_key() function
- [MINOR] session-counters: add a general purpose counter (gpc0)
- [MEDIUM] session-counters: add HTTP req/err tracking
- [MEDIUM] stats: add "show table [<name>]" to dump a stick-table
- [MEDIUM] stats: add "clear table <name> key <value>" to clear table entries
- [CLEANUP] stick-table: declare stktable_data_types as extern
- [MEDIUM] stick-table: make use of generic types for stored data
- [MINOR] stats: correctly report errors on "show table" and "clear table"
- [MEDIUM] stats: add the ability to dump table entries matching criteria
- [DOC] configuration: document all the new tracked counters
- [DOC] stats: document "show table" and "clear table"
- [MAJOR] session-counters: split FE and BE track counters
- [MEDIUM] tcp: accept the "track-counters" in "tcp-request content" rules
- [MEDIUM] session counters: automatically remove expired entries.
- [MEDIUM] config: replace 'tcp-request <action>' with "tcp-request connection"
- [MEDIUM] session-counters: make it possible to count connections from frontend
- [MINOR] session-counters: use "track-sc{1,2}" instead of "track-{fe,be}-counters"
- [MEDIUM] session-counters: correctly unbind the counters tracked by the backend
- [CLEANUP] stats: use stksess_kill() to remove table entries
- [DOC] update the references to session counters and to tcp-request connection
- [DOC] cleanup: split a few long lines
- [MEDIUM] http: forward client's close when abortonclose is set
- [BUG] queue: don't dequeue proxy-global requests on disabled servers
- [BUG] stats: global stats timeout may be specified before stats socket.
- [BUG] conf: add tcp-request content rules to the correct list
Willy Tarreau [Tue, 17 Aug 2010 19:48:17 +0000 (21:48 +0200)]
[BUG] stats: global stats timeout may be specified before stats socket.
If the global stats timeout statement was found before the stats socket
(or without), the parser would crash because the stats frontend was not
initialized. Now we have an allocation function which solves the issue.
[MEDIUM] http: forward client's close when abortonclose is set
While it's usually desired to wait for a server response even
when the client closes its request channel, it can be problematic
with long polling requests. In order to let the server decide what
to do in such a case, if option abortonclose is set, we simply
forward the shutdown to the server. That way, it can decide to
take the appropriate action. Most servers will still process the
request, while some will probably want to abort.
Obviously, this only works as long as the client has not sent
another pipelined request over the same connection.
Willy Tarreau [Fri, 6 Aug 2010 18:11:05 +0000 (20:11 +0200)]
[MEDIUM] session-counters: correctly unbind the counters tracked by the backend
In case of HTTP keepalive processing, we want to release the counters tracked
by the backend. Till now only the second set of counters was released, while
it could have been assigned by the frontend, or the backend could also have
assigned the first set. Now we reuse to unused bits of the session flags to
mark which stick counters were assigned by the backend and to release them as
appropriate.
Willy Tarreau [Fri, 6 Aug 2010 17:06:56 +0000 (19:06 +0200)]
[MINOR] session-counters: use "track-sc{1,2}" instead of "track-{fe,be}-counters"
The assumption that there was a 1:1 relation between tracked counters and
the frontend/backend role was wrong. It is perfectly possible to track the
track-fe-counters from the backend and the track-be-counters from the
frontend. Thus, in order to reduce confusion, let's remove this useless
{fe,be} reference and simply use {1,2} instead. The keywords have also been
renamed in order to limit confusion. The ACL rule action now becomes
"track-sc{1,2}". The ACLs are now "sc{1,2}_*" instead of "trk{fe,be}_*".
That means that we can reasonably document "sc1" and "sc2" (sticky counters
1 and 2) as sort of patterns that are available during the whole session's
life and use them just like any other pattern.
Willy Tarreau [Fri, 6 Aug 2010 13:08:45 +0000 (15:08 +0200)]
[MEDIUM] config: replace 'tcp-request <action>' with "tcp-request connection"
It began to be problematic to have "tcp-request" followed by an
immediate action, as sometimes it was a keyword indicating a hook
or setting ("content" or "inspect-delay") and sometimes it was an
action.
Now the prefix for connection-level tcp-requests is "tcp-request connection"
and the ones processing contents remain "tcp-request contents".
This has allowed a nice simplification of the config parser and to
clean up the doc a bit. Also now it's a bit more clear why tcp-request
connection are not allowed in backends.
Willy Tarreau [Tue, 3 Aug 2010 17:34:32 +0000 (19:34 +0200)]
[MEDIUM] tcp: accept the "track-counters" in "tcp-request content" rules
Doing so allows us to track counters from backends or depending on contents.
For instance, it now becomes possible to decide to track a connection based
on a Host header if enough time is granted to parse the HTTP request. It is
also possible to just track frontend counters in the frontend and unconditionally
track backend counters in the backend without having to write complex rules.
The first track-fe-counters rule executed is used to track counters for
the frontend, and the first track-be-counters rule executed is used to track
counters for the backend. Nothing prevents a frontend from setting a track-be
rule nor a backend from setting a track-fe rule. In fact these rules are
arbitrarily split between FE and BE with no dependencies.
Willy Tarreau [Tue, 3 Aug 2010 14:29:52 +0000 (16:29 +0200)]
[MAJOR] session-counters: split FE and BE track counters
Having a single tracking pointer for both frontend and backend counters
does not work. Instead let's have one for each. The keyword has changed
to "track-be-counters" and "track-fe-counters", and the ACL "trk_*"
changed to "trkfe_*" and "trkbe_*".
[MEDIUM] stats: add the ability to dump table entries matching criteria
It is now possible to dump some select table entries based on criteria
which apply to the stored data. This is enabled by appending the following
options to the end of the "show table" statement :
data.<data_type> {eq|ne|lt|gt|le|ge} <value>
For intance :
show table http_proxy data.conn_rate gt 5
show table http_proxy data.gpc0 ne 0
The compare applies to the integer value as it would be displayed, and
operates on signed long long integers.
[MEDIUM] stick-table: make use of generic types for stored data
It's a bit cumbersome to have to know all possible storable types
from the stats interface. Instead, let's have generic types for
all data, which will facilitate their manipulation.
This feature will be required at some point, when the stick tables are
used to enforce security measures. For instance, some visitors may be
incorrectly flagged as abusers and would ask the site admins to remove
their entry from the table.
Willy Tarreau [Sun, 20 Jun 2010 10:47:25 +0000 (12:47 +0200)]
[MINOR] session-counters: add a general purpose counter (gpc0)
This counter may be used to track anything. Two sets of ACLs are available
to manage it, one gets its value, and the other one increments its value
and returns it. In the second case, the entry is created if it did not
exist.
Thus it is possible for example to mark a source as being an abuser and
to keep it marked as long as it does not wait for the entry to expire :
# The rules below use gpc0 to track abusers, and reject them if
# a source has been marked as such. The track-counters statement
# automatically refreshes the entry which will not expire until a
# 1-minute silence is respected from the source. The second rule
# evaluates the second part if the first one is true, so GPC0 will
# be increased once the conn_rate is above 100/5s.
stick-table type ip size 200k expire 1m store conn_rate(5s),gpc0
tcp-request track-counters src
tcp-request reject if { trk_get_gpc0 gt 0 }
tcp-request reject if { trk_conn_rate gt 100 } { trk_inc_gpc0 gt 0}
Alternatively, it is possible to let the entry expire even in presence of
traffic by swapping the check for gpc0 and the track-counters statement :
stick-table type ip size 200k expire 1m store conn_rate(5s),gpc0
tcp-request reject if { src_get_gpc0 gt 0 }
tcp-request track-counters src
tcp-request reject if { trk_conn_rate gt 100 } { trk_inc_gpc0 gt 0}
It is also possible not to track counters at all, but entry lookups will
then be performed more often :
stick-table type ip size 200k expire 1m store conn_rate(5s),gpc0
tcp-request reject if { src_get_gpc0 gt 0 }
tcp-request reject if { src_conn_rate gt 100 } { src_inc_gpc0 gt 0}
The '0' at the end of the counter name is there because if we find that more
counters may be useful, other ones will be added.
Willy Tarreau [Sun, 20 Jun 2010 10:27:21 +0000 (12:27 +0200)]
[MINOR] stktable: add a stktable_update_key() function
This function looks up a key, updates its expiration date, or creates
it if it was not found. acl_fetch_src_updt_conn_cnt() was updated to
make use of it.
Willy Tarreau [Sun, 20 Jun 2010 09:56:30 +0000 (11:56 +0200)]
[MEDIUM] session counters: add bytes_in_rate and bytes_out_rate counters
These counters maintain incoming and outgoing byte rates in a stick-table,
over a period which is defined in the configuration (2 ms to 24 days).
They can be used to detect service abuse and enforce a certain bandwidth
limits per source address for instance, and block if the rate is passed
over. Since 32-bit counters are used to compute the rates, it is important
not to use too long periods so that we don't have to deal with rates above
4 GB per period.
Example :
# block if more than 5 Megs retrieved in 30 seconds from a source.
stick-table type ip size 200k expire 1m store bytes_out_rate(30s)
tcp-request track-counters src
tcp-request reject if { trk_bytes_out_rate gt 5000000 }
# cause a 15 seconds pause to requests from sources in excess of 2 megs/30s
tcp-request inspect-delay 15s
tcp-request content accept if { trk_bytes_out_rate gt 2000000 } WAIT_END
Willy Tarreau [Sun, 20 Jun 2010 09:19:22 +0000 (11:19 +0200)]
[MEDIUM] session counters: add conn_rate and sess_rate counters
These counters maintain incoming connection rates and session rates
in a stick-table, over a period which is defined in the configuration
(2 ms to 24 days). They can be used to detect service abuse and
enforce a certain accept rate per source address for instance, and
block if the rate is passed over.
Example :
# block if more than 50 requests per 5 seconds from a source.
stick-table type ip size 200k expire 1m store conn_rate(5s),sess_rate(5s)
tcp-request track-counters src
tcp-request reject if { trk_conn_rate gt 50 }
# cause a 3 seconds pause to requests from sources in excess of 20 requests/5s
tcp-request inspect-delay 3s
tcp-request content accept if { trk_sess_rate gt 20 } WAIT_END
Willy Tarreau [Sun, 20 Jun 2010 08:41:54 +0000 (10:41 +0200)]
[MEDIUM] stick-tables: add stored data argument type checking
We're now able to return errors based on the validity of an argument
passed to a stick-table store data type. We also support ARG_T_DELAY
to pass delays to stored data types (eg: for rate counters).
Willy Tarreau [Sun, 20 Jun 2010 07:11:39 +0000 (09:11 +0200)]
[MEDIUM] stick-tables: add support for arguments to data_types
Some data types will require arguments (eg: period for a rate counter).
This patch adds support for such arguments between parenthesis in the
"store" directive of the stick-table statement. Right now only integers
are supported.
When a session tracks a counter, automatically increase the cumulated
connection count. This makes src_updt_conn_cnt() almost useless. In
fact it might still be used to update different tables.
Willy Tarreau [Fri, 18 Jun 2010 17:53:25 +0000 (19:53 +0200)]
[MINOR] session: add the trk_conn_cnt ACL keyword to track connection counts
Most of the time we'll want to check the connection count of the
criterion we're currently tracking. So instead of duplicating the
src* tests, let's add trk_conn_cnt to report the total number of
connections from the stick table entry currently being tracked.
A nice part of the code was factored, and we should do the same
for the other criteria.
Willy Tarreau [Fri, 18 Jun 2010 16:33:32 +0000 (18:33 +0200)]
[MEDIUM] session: add data in and out volume counters
The new "bytes_in_cnt" and "bytes_out_cnt" session counters have been
added. They're automatically updated when session counters are updated.
They can be matched with the "src_kbytes_in" and "src_kbytes_out" ACLs
which apply to the volume per source address. This can be used to deny
access to service abusers.
The new "conn_cur" session counter has been added. It is automatically
updated upon "track XXX" directives, and the entry is touched at the
moment we increment the value so that we don't consider further counter
updates as real updates, otherwise we would end up updating upon completion,
which may not be desired. Probably that some other event counters (eg: HTTP
requests) will have to be updated upon each event though.
This counter can be matched against current session's source address using
the "src_conn_cur" ACL.
Willy Tarreau [Fri, 18 Jun 2010 15:46:06 +0000 (17:46 +0200)]
[MEDIUM] session: move counter ACL fetches from proto_tcp
It was not normal to have counter fetches in proto_tcp.c. The only
reason was that the key based on the source address was fetched there,
but now we have split the key extraction and data processing, we must
move that to a more appropriate place. Session seems OK since the
counters are all manipulated from here.
Also, since we're precisely counting number of connections with these
ACLs, we rename them src_conn_cnt and src_updt_conn_cnt. This is not
a problem right now since no version was emitted with these keywords.
Willy Tarreau [Fri, 18 Jun 2010 18:16:39 +0000 (20:16 +0200)]
[MINOR] stick-table: use suffix "_cnt" for cumulated counts
The "_cnt" suffix is already used by ACLs to count various data,
so it makes sense to use the same one in "conn_cnt" instead of
"conn_cum" to count cumulated connections.
This is not a problem because no version was emitted with those
keywords.
Thus we'll try to stick to the following rules :
xxxx_cnt : cumulated event count for criterion xxxx
xxxx_cur : current number of concurrent entries for criterion xxxx
xxxx_rate: event rate for criterion xxxx
Willy Tarreau [Mon, 14 Jun 2010 19:04:55 +0000 (21:04 +0200)]
[MAJOR] session: add track-counters to track counters related to the session
This patch adds the ability to set a pointer in the session to an
entry in a stick table which holds various counters related to a
specific pattern.
Right now the syntax matches the target syntax and only the "src"
pattern can be specified, to track counters related to the session's
IPv4 source address. There is a special function to extract it and
convert it to a key. But the goal is to be able to later support as
many patterns as for the stick rules, and get rid of the specific
function.
The "track-counters" directive may only be set in a "tcp-request"
statement right now. Only the first one applies. Probably that later
we'll support multi-criteria tracking for a single session and that
we'll have to name tracking pointers.
No counter is updated right now, only the refcount is. Some subsequent
patches will have to bring that feature.
Willy Tarreau [Tue, 15 Jun 2010 15:57:36 +0000 (17:57 +0200)]
[MINOR] tcp: src_count acl does not have a permanent result
This ACL's count can change along the session's life because it depends
on other sessions' activity. Switch it to volatile since any session
could appear while evaluating the ACLs.
Willy Tarreau [Tue, 10 Aug 2010 13:28:21 +0000 (15:28 +0200)]
[MEDIUM] buffer: make buffer_feed* support writing non-contiguous chunks
The buffer_feed* functions that are used to send data to buffers did only
support sending contiguous chunks while they're relying on memcpy(). This
patch improves on this by making them able to write in two chunks if needed.
Thus, the buffer_almost_full() function has been improved to really consider
the remaining space and not just what can be written at once.
Willy Tarreau [Mon, 9 Aug 2010 14:24:56 +0000 (16:24 +0200)]
[MAJOR] stream_interface: fix the wakeup conditions for embedded iohandlers
Now we stop relying on BF_READ_DONTWAIT, which is unrelated to the
wakeups, and only consider activity to decide whether to wake the task
up instead of considering the other side's activity. It is worth noting
that the local stream interface's flags were not updated consecutively
to a call to chk_snd(), which could possibly result in hung tasks from
time to time. This fix will avoid possible loops and uncaught events.
Willy Tarreau [Tue, 3 Aug 2010 12:02:05 +0000 (14:02 +0200)]
[MEDIUM] session: support "tcp-request content" rules in backends
Sometimes it's necessary to be able to perform some "layer 6" analysis
in the backend. TCP request rules were not available till now, although
documented in the diagram. Enable them in backend now.
Willy Tarreau [Tue, 3 Aug 2010 09:52:10 +0000 (11:52 +0200)]
[MINOR] http: reset analysers to listener's, not frontend's
When resetting a session's request analysers, we must take them from the
listener, not from the frontend. At the moment there is no difference
but this might change.
[BUG] session: analysers must be checked when SI state changes
Since the BF_READ_ATTACHED bug was fixed, a new issue surfaced. When
a connection closes on the return path in tunnel mode while the request
input is already closed, the request analyser which is waiting for a
state change never gets woken up so it never closes the request output.
This causes stuck sessions to remain indefinitely.
One way to reliably reproduce the issue is the following (note that the
client expects a keep-alive but not the server) :
The reason for the issue is that we don't wake the analysers up on
stream interface state changes. So the least intrusive and most reliable
thing to do is to consider stream interface state changes to call the
analysers.
We just need to remember what state each series of analysers have seen
and check for the differences. In practice, that works.
A later improvement later could consist in being able to let analysers
state what they're interested to monitor :
- left SI's state
- right SI's state
- request buffer flags
- response buffer flags
That could help having only one set of analysers and call them once
status changes.
[MAJOR] stream_sock: better wakeup conditions on read()
After a read, there was a condition to mandatorily wake the task
up if the BF_READ_DONTWAIT flag was set. This was wrong because
the wakeup condition in this case can be deduced from the other
ones. Another condition was put on the other side not being in
SI_ST_EST state. It is not appropriate to do this because it
causes a useless wakeup at the beginning of every first request
in case of speculative polling, due to the fact that we don't
read anything and that the other side is still in SI_ST_INI.
Also, the wakeup was performed whenever to_forward was null,
which causes an unexpected wakeup upon the first read for the
same reason. However, those two conditions are valid if and
only if at least one read was performed.
Also, the BF_SHUTR flag was tested as part of the wakeup condition,
while this one can only be set if BF_READ_NULL is set too. So let's
simplify this ambiguous test by removing the BF_SHUTR part from the
condition to only process events.
Last, the BF_READ_DONTWAIT flag was unconditionally cleared,
while sometimes there would have been no I/O. Now we only clear
it once the I/O operation has been performed, which maintains
its validity until the I/O occurs.
Finally, those fixes saved approximately 16% of the per-session
wakeups and 20% of the epoll_ctl() calls, which translates into
slightly less under high load due to the request often being ready
when the read() occurs. A performance increase between 2 and 5% is
expected depending on the workload.
It does not seem necessary to backport this change to 1.4, eventhough
it fixes some performance issues. It may later be backported if
required to fix something else because the risk of regression seems
very low due to the fact that we're more in line with the documented
semantics.