Willy Tarreau [Fri, 6 Aug 2010 18:11:05 +0000 (20:11 +0200)]
[MEDIUM] session-counters: correctly unbind the counters tracked by the backend
In case of HTTP keepalive processing, we want to release the counters tracked
by the backend. Till now only the second set of counters was released, while
it could have been assigned by the frontend, or the backend could also have
assigned the first set. Now we reuse two unused bits of the session flags to
mark which stick counters were assigned by the backend and to release them as
appropriate.
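
The flag-based bookkeeping described above can be sketched as follows. This is only an illustration: the flag names, the structure layout and the helper names are hypothetical and are not taken from the HAProxy sources. The idea is that each counter set gets one bit saying "the backend bound this one", so that keep-alive processing releases exactly those sets and leaves the frontend's alone.

```c
#include <stddef.h>

/* Hypothetical flag values: one bit per counter set bound by the backend. */
#define SKETCH_BE_ASSIGNED_SC1 0x1u
#define SKETCH_BE_ASSIGNED_SC2 0x2u

struct sketch_session {
    unsigned int flags;   /* holds the SKETCH_BE_ASSIGNED_* bits */
    void *sc[2];          /* tracked table entries, NULL when unused */
};

/* Bind counter set <idx> (0 or 1) and remember whether the backend did it.
 * An already bound set is left untouched (assumption of this sketch). */
static void sketch_track(struct sketch_session *s, int idx, void *entry,
                         int from_backend)
{
    if (s->sc[idx])
        return;
    s->sc[idx] = entry;
    if (from_backend)
        s->flags |= (SKETCH_BE_ASSIGNED_SC1 << idx);
}

/* On keep-alive, release only the sets that the backend bound. */
static void sketch_release_backend_counters(struct sketch_session *s)
{
    for (int idx = 0; idx < 2; idx++) {
        unsigned int bit = SKETCH_BE_ASSIGNED_SC1 << idx;

        if (s->flags & bit) {
            s->sc[idx] = NULL;
            s->flags &= ~bit;
        }
    }
}
```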
Willy Tarreau [Fri, 6 Aug 2010 17:06:56 +0000 (19:06 +0200)]
[MINOR] session-counters: use "track-sc{1,2}" instead of "track-{fe,be}-counters"
The assumption that there was a 1:1 relation between tracked counters and
the frontend/backend role was wrong. It is perfectly possible to track the
track-fe-counters from the backend and the track-be-counters from the
frontend. Thus, in order to reduce confusion, let's remove this useless
{fe,be} reference and simply use {1,2} instead. The keywords have also been
renamed in order to limit confusion. The ACL rule action now becomes
"track-sc{1,2}". The ACLs are now "sc{1,2}_*" instead of "trk{fe,be}_*".
That means that we can reasonably document "sc1" and "sc2" (sticky counters
1 and 2) as sort of patterns that are available during the whole session's
life and use them just like any other pattern.
Willy Tarreau [Fri, 6 Aug 2010 13:08:45 +0000 (15:08 +0200)]
[MEDIUM] config: replace 'tcp-request <action>' with "tcp-request connection"
It began to be problematic to have "tcp-request" followed by an
immediate action, as sometimes it was a keyword indicating a hook
or setting ("content" or "inspect-delay") and sometimes it was an
action.
Now the prefix for connection-level tcp-requests is "tcp-request connection"
and the ones processing contents remain "tcp-request content".
This has allowed a nice simplification of the config parser and to
clean up the doc a bit. It is also now clearer why tcp-request
connection rules are not allowed in backends.
Willy Tarreau [Tue, 3 Aug 2010 17:34:32 +0000 (19:34 +0200)]
[MEDIUM] tcp: accept the "track-counters" in "tcp-request content" rules
Doing so allows us to track counters from backends or depending on contents.
For instance, it now becomes possible to decide to track a connection based
on a Host header if enough time is granted to parse the HTTP request. It is
also possible to just track frontend counters in the frontend and unconditionally
track backend counters in the backend without having to write complex rules.
The first track-fe-counters rule executed is used to track counters for
the frontend, and the first track-be-counters rule executed is used to track
counters for the backend. Nothing prevents a frontend from setting a track-be
rule nor a backend from setting a track-fe rule. In fact these rules are
arbitrarily split between FE and BE with no dependencies.
Willy Tarreau [Tue, 3 Aug 2010 14:29:52 +0000 (16:29 +0200)]
[MAJOR] session-counters: split FE and BE track counters
Having a single tracking pointer for both frontend and backend counters
does not work. Instead let's have one for each. The keyword has changed
to "track-be-counters" and "track-fe-counters", and the ACL "trk_*"
changed to "trkfe_*" and "trkbe_*".
[MEDIUM] stats: add the ability to dump table entries matching criteria
It is now possible to dump some select table entries based on criteria
which apply to the stored data. This is enabled by appending the following
options to the end of the "show table" statement :
    data.<data_type> {eq|ne|lt|gt|le|ge} <value>
For instance :
    show table http_proxy data.conn_rate gt 5
    show table http_proxy data.gpc0 ne 0
The compare applies to the integer value as it would be displayed, and
operates on signed long long integers.
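
A minimal sketch of such a comparison is shown below, assuming the operator is received as one of the six keywords listed above. The function name and the "-1 for unknown operator" convention are hypothetical, chosen for the illustration only.

```c
#include <string.h>

/* Returns 1 if "<value> <op> <ref>" holds, 0 if it does not, and -1 when
 * <op> is not one of eq, ne, lt, gt, le, ge. Both operands are signed
 * long long integers, as stated in the commit message. */
static int sketch_data_match(long long value, const char *op, long long ref)
{
    if (strcmp(op, "eq") == 0) return value == ref;
    if (strcmp(op, "ne") == 0) return value != ref;
    if (strcmp(op, "lt") == 0) return value <  ref;
    if (strcmp(op, "gt") == 0) return value >  ref;
    if (strcmp(op, "le") == 0) return value <= ref;
    if (strcmp(op, "ge") == 0) return value >= ref;
    return -1;
}
```

With this helper, the filter "data.conn_rate gt 5" would keep an entry whose displayed conn_rate is 7 and drop one whose conn_rate is 5.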
[MEDIUM] stick-table: make use of generic types for stored data
It's a bit cumbersome to have to know all possible storable types
from the stats interface. Instead, let's have generic types for
all data, which will facilitate their manipulation.
This feature will be required at some point, when the stick tables are
used to enforce security measures. For instance, some visitors may be
incorrectly flagged as abusers and would ask the site admins to remove
their entry from the table.
Willy Tarreau [Sun, 20 Jun 2010 10:47:25 +0000 (12:47 +0200)]
[MINOR] session-counters: add a general purpose counter (gpc0)
This counter may be used to track anything. Two sets of ACLs are available
to manage it, one gets its value, and the other one increments its value
and returns it. In the second case, the entry is created if it did not
exist.
Thus it is possible for example to mark a source as being an abuser and
to keep it marked as long as it does not wait for the entry to expire :
    # The rules below use gpc0 to track abusers, and reject them if
    # a source has been marked as such. The track-counters statement
    # automatically refreshes the entry which will not expire until a
    # 1-minute silence is respected from the source. The second rule
    # evaluates the second part if the first one is true, so GPC0 will
    # be increased once the conn_rate is above 100/5s.
    stick-table type ip size 200k expire 1m store conn_rate(5s),gpc0
    tcp-request track-counters src
    tcp-request reject if { trk_get_gpc0 gt 0 }
    tcp-request reject if { trk_conn_rate gt 100 } { trk_inc_gpc0 gt 0 }
Alternatively, it is possible to let the entry expire even in presence of
traffic by swapping the check for gpc0 and the track-counters statement :
    stick-table type ip size 200k expire 1m store conn_rate(5s),gpc0
    tcp-request reject if { src_get_gpc0 gt 0 }
    tcp-request track-counters src
    tcp-request reject if { trk_conn_rate gt 100 } { trk_inc_gpc0 gt 0 }
It is also possible not to track counters at all, but entry lookups will
then be performed more often :
    stick-table type ip size 200k expire 1m store conn_rate(5s),gpc0
    tcp-request reject if { src_get_gpc0 gt 0 }
    tcp-request reject if { src_conn_rate gt 100 } { src_inc_gpc0 gt 0 }
The '0' at the end of the counter name is there because if we find that more
counters may be useful, other ones will be added.
Willy Tarreau [Sun, 20 Jun 2010 10:27:21 +0000 (12:27 +0200)]
[MINOR] stktable: add a stktable_update_key() function
This function looks up a key, updates its expiration date, or creates
it if it was not found. acl_fetch_src_updt_conn_cnt() was updated to
make use of it.
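
The lookup-refresh-or-create behaviour can be sketched as below. This is not the real stktable_update_key(): the table is a tiny fixed array instead of a tree, and every type and name here is hypothetical. It only illustrates the three outcomes (entry found and refreshed, entry created, table full).

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_TABLE_SIZE 4

struct sketch_entry {
    uint32_t key;
    long expire;   /* expiration date, in arbitrary time units */
    int used;
};

struct sketch_table {
    struct sketch_entry e[SKETCH_TABLE_SIZE];
    long expire_delay;   /* lifetime granted on each update */
};

/* Look <key> up. If found, push its expiration date back; otherwise create
 * it in the first free slot. Returns NULL when the table is full. */
static struct sketch_entry *sketch_update_key(struct sketch_table *t,
                                              uint32_t key, long now)
{
    struct sketch_entry *free_slot = NULL;

    for (size_t i = 0; i < SKETCH_TABLE_SIZE; i++) {
        if (t->e[i].used && t->e[i].key == key) {
            t->e[i].expire = now + t->expire_delay;
            return &t->e[i];
        }
        if (!t->e[i].used && !free_slot)
            free_slot = &t->e[i];
    }
    if (!free_slot)
        return NULL;
    free_slot->used = 1;
    free_slot->key = key;
    free_slot->expire = now + t->expire_delay;
    return free_slot;
}
```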
Willy Tarreau [Sun, 20 Jun 2010 09:56:30 +0000 (11:56 +0200)]
[MEDIUM] session counters: add bytes_in_rate and bytes_out_rate counters
These counters maintain incoming and outgoing byte rates in a stick-table,
over a period which is defined in the configuration (2 ms to 24 days).
They can be used to detect service abuse and enforce certain bandwidth
limits per source address for instance, and block if the rate is
exceeded. Since 32-bit counters are used to compute the rates, it is important
not to use too long periods so that we don't have to deal with rates above
4 GB per period.
Example :
    # block if more than 5 Megs retrieved in 30 seconds from a source.
    stick-table type ip size 200k expire 1m store bytes_out_rate(30s)
    tcp-request track-counters src
    tcp-request reject if { trk_bytes_out_rate gt 5000000 }
    # cause a 15 seconds pause to requests from sources in excess of 2 megs/30s
    tcp-request inspect-delay 15s
    tcp-request content accept if { trk_bytes_out_rate gt 2000000 } WAIT_END
Willy Tarreau [Sun, 20 Jun 2010 09:19:22 +0000 (11:19 +0200)]
[MEDIUM] session counters: add conn_rate and sess_rate counters
These counters maintain incoming connection rates and session rates
in a stick-table, over a period which is defined in the configuration
(2 ms to 24 days). They can be used to detect service abuse and
enforce a certain accept rate per source address for instance, and
block if the rate is exceeded.
Example :
    # block if more than 50 requests per 5 seconds from a source.
    stick-table type ip size 200k expire 1m store conn_rate(5s),sess_rate(5s)
    tcp-request track-counters src
    tcp-request reject if { trk_conn_rate gt 50 }
    # cause a 3 seconds pause to requests from sources in excess of 20 requests/5s
    tcp-request inspect-delay 3s
    tcp-request content accept if { trk_sess_rate gt 20 } WAIT_END
Willy Tarreau [Sun, 20 Jun 2010 08:41:54 +0000 (10:41 +0200)]
[MEDIUM] stick-tables: add stored data argument type checking
We're now able to return errors based on the validity of an argument
passed to a stick-table store data type. We also support ARG_T_DELAY
to pass delays to stored data types (eg: for rate counters).
Willy Tarreau [Sun, 20 Jun 2010 07:11:39 +0000 (09:11 +0200)]
[MEDIUM] stick-tables: add support for arguments to data_types
Some data types will require arguments (eg: period for a rate counter).
This patch adds support for such arguments between parentheses in the
"store" directive of the stick-table statement. Right now only integers
are supported.
When a session tracks a counter, automatically increase the cumulated
connection count. This makes src_updt_conn_cnt() almost useless. In
fact it might still be used to update different tables.
Willy Tarreau [Fri, 18 Jun 2010 17:53:25 +0000 (19:53 +0200)]
[MINOR] session: add the trk_conn_cnt ACL keyword to track connection counts
Most of the time we'll want to check the connection count of the
criterion we're currently tracking. So instead of duplicating the
src* tests, let's add trk_conn_cnt to report the total number of
connections from the stick table entry currently being tracked.
A nice part of the code was factored, and we should do the same
for the other criteria.
Willy Tarreau [Fri, 18 Jun 2010 16:33:32 +0000 (18:33 +0200)]
[MEDIUM] session: add data in and out volume counters
The new "bytes_in_cnt" and "bytes_out_cnt" session counters have been
added. They're automatically updated when session counters are updated.
They can be matched with the "src_kbytes_in" and "src_kbytes_out" ACLs
which apply to the volume per source address. This can be used to deny
access to service abusers.
The new "conn_cur" session counter has been added. It is automatically
updated upon "track XXX" directives, and the entry is touched at the
moment we increment the value so that we don't consider further counter
updates as real updates, otherwise we would end up updating upon completion,
which may not be desired. Some other event counters (eg: HTTP requests)
will probably have to be updated upon each event though.
This counter can be matched against current session's source address using
the "src_conn_cur" ACL.
Willy Tarreau [Fri, 18 Jun 2010 15:46:06 +0000 (17:46 +0200)]
[MEDIUM] session: move counter ACL fetches from proto_tcp
It was not normal to have counter fetches in proto_tcp.c. The only
reason was that the key based on the source address was fetched there,
but now that we have split the key extraction and data processing, we must
move that to a more appropriate place. Session seems OK since the
counters are all manipulated from here.
Also, since we're precisely counting number of connections with these
ACLs, we rename them src_conn_cnt and src_updt_conn_cnt. This is not
a problem right now since no version was emitted with these keywords.
Willy Tarreau [Fri, 18 Jun 2010 18:16:39 +0000 (20:16 +0200)]
[MINOR] stick-table: use suffix "_cnt" for cumulated counts
The "_cnt" suffix is already used by ACLs to count various data,
so it makes sense to use the same one in "conn_cnt" instead of
"conn_cum" to count cumulated connections.
This is not a problem because no version was emitted with those
keywords.
Thus we'll try to stick to the following rules :
    xxxx_cnt  : cumulated event count for criterion xxxx
    xxxx_cur  : current number of concurrent entries for criterion xxxx
    xxxx_rate : event rate for criterion xxxx
Willy Tarreau [Mon, 14 Jun 2010 19:04:55 +0000 (21:04 +0200)]
[MAJOR] session: add track-counters to track counters related to the session
This patch adds the ability to set a pointer in the session to an
entry in a stick table which holds various counters related to a
specific pattern.
Right now the syntax matches the target syntax and only the "src"
pattern can be specified, to track counters related to the session's
IPv4 source address. There is a special function to extract it and
convert it to a key. But the goal is to be able to later support as
many patterns as for the stick rules, and get rid of the specific
function.
The "track-counters" directive may only be set in a "tcp-request"
statement right now. Only the first one applies. Later we will probably
support multi-criteria tracking for a single session, and will then
have to name tracking pointers.
No counter is updated right now, only the refcount is. Some subsequent
patches will have to bring that feature.
Willy Tarreau [Tue, 15 Jun 2010 15:57:36 +0000 (17:57 +0200)]
[MINOR] tcp: src_count acl does not have a permanent result
This ACL's count can change along the session's life because it depends
on other sessions' activity. Switch it to volatile since any session
could appear while evaluating the ACLs.
Willy Tarreau [Tue, 10 Aug 2010 13:28:21 +0000 (15:28 +0200)]
[MEDIUM] buffer: make buffer_feed* support writing non-contiguous chunks
The buffer_feed* functions that are used to send data to buffers only
support sending contiguous chunks while they're relying on memcpy(). This
patch improves on this by making them able to write in two chunks if needed.
Thus, the buffer_almost_full() function has been improved to really consider
the remaining space and not just what can be written at once.
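
The two-chunk write can be illustrated on a generic ring buffer. The sketch below is not the buffer_feed* code and does not use HAProxy's buffer structure; the type, field and function names are hypothetical, and the capacity is assumed to be non-zero. It shows the two points made above: the room check counts all free bytes, contiguous or not, and the copy is split in two memcpy() calls when the data wraps past the end of the storage area.

```c
#include <stddef.h>
#include <string.h>

struct sketch_ring {
    char  *data;   /* storage area */
    size_t size;   /* total capacity, assumed > 0 */
    size_t head;   /* index of the oldest byte */
    size_t len;    /* number of bytes currently stored */
};

/* Total free space, regardless of whether it is contiguous. */
static size_t sketch_ring_room(const struct sketch_ring *r)
{
    return r->size - r->len;
}

/* Copy all of <src> or nothing. Returns 0 on success, -1 if it does not fit. */
static int sketch_ring_feed(struct sketch_ring *r, const char *src, size_t n)
{
    size_t tail, first;

    if (n > sketch_ring_room(r))
        return -1;

    tail  = (r->head + r->len) % r->size;   /* first free byte */
    first = r->size - tail;                 /* contiguous room up to the end */
    if (first > n)
        first = n;

    memcpy(r->data + tail, src, first);           /* first chunk */
    memcpy(r->data, src + first, n - first);      /* wrapped chunk, maybe empty */
    r->len += n;
    return 0;
}
```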
Willy Tarreau [Mon, 9 Aug 2010 14:24:56 +0000 (16:24 +0200)]
[MAJOR] stream_interface: fix the wakeup conditions for embedded iohandlers
Now we stop relying on BF_READ_DONTWAIT, which is unrelated to the
wakeups, and only consider activity to decide whether to wake the task
up instead of considering the other side's activity. It is worth noting
that the local stream interface's flags were not updated consecutively
to a call to chk_snd(), which could possibly result in hung tasks from
time to time. This fix will avoid possible loops and uncaught events.
Willy Tarreau [Tue, 3 Aug 2010 12:02:05 +0000 (14:02 +0200)]
[MEDIUM] session: support "tcp-request content" rules in backends
Sometimes it's necessary to be able to perform some "layer 6" analysis
in the backend. TCP request rules were not available till now, although
documented in the diagram. Enable them in backend now.
Willy Tarreau [Tue, 3 Aug 2010 09:52:10 +0000 (11:52 +0200)]
[MINOR] http: reset analysers to listener's, not frontend's
When resetting a session's request analysers, we must take them from the
listener, not from the frontend. At the moment there is no difference
but this might change.
[BUG] session: analysers must be checked when SI state changes
Since the BF_READ_ATTACHED bug was fixed, a new issue surfaced. When
a connection closes on the return path in tunnel mode while the request
input is already closed, the request analyser which is waiting for a
state change never gets woken up so it never closes the request output.
This causes stuck sessions to remain indefinitely.
One way to reliably reproduce the issue is the following (note that the
client expects a keep-alive but not the server) :
The reason for the issue is that we don't wake the analysers up on
stream interface state changes. So the least intrusive and most reliable
thing to do is to consider stream interface state changes to call the
analysers.
We just need to remember what state each series of analysers have seen
and check for the differences. In practice, that works.
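
The "remember what was seen and compare" idea can be sketched as follows. The structure and function names are hypothetical, and stream interface states are reduced to plain integers for the illustration.

```c
/* Last stream interface states seen by one series of analysers. */
struct sketch_seen {
    int si0_state;
    int si1_state;
};

/* Returns 1 and records the new states when either side changed since the
 * previous call, meaning the analysers should be called again. Returns 0
 * when nothing changed. */
static int sketch_si_changed(struct sketch_seen *seen,
                             int si0_state, int si1_state)
{
    if (seen->si0_state == si0_state && seen->si1_state == si1_state)
        return 0;
    seen->si0_state = si0_state;
    seen->si1_state = si1_state;
    return 1;
}
```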
A later improvement could consist in letting analysers
state what they are interested in monitoring :
  - left SI's state
  - right SI's state
  - request buffer flags
  - response buffer flags
That could help having only one set of analysers and call them once
status changes.
[MAJOR] stream_sock: better wakeup conditions on read()
After a read, there was a condition to mandatorily wake the task
up if the BF_READ_DONTWAIT flag was set. This was wrong because
the wakeup condition in this case can be deduced from the other
ones. Another condition was put on the other side not being in
SI_ST_EST state. It is not appropriate to do this because it
causes a useless wakeup at the beginning of every first request
in case of speculative polling, due to the fact that we don't
read anything and that the other side is still in SI_ST_INI.
Also, the wakeup was performed whenever to_forward was null,
which causes an unexpected wakeup upon the first read for the
same reason. However, those two conditions are valid if and
only if at least one read was performed.
Also, the BF_SHUTR flag was tested as part of the wakeup condition,
while this one can only be set if BF_READ_NULL is set too. So let's
simplify this ambiguous test by removing the BF_SHUTR part from the
condition to only process events.
Last, the BF_READ_DONTWAIT flag was unconditionally cleared,
while sometimes there would have been no I/O. Now we only clear
it once the I/O operation has been performed, which maintains
its validity until the I/O occurs.
Finally, those fixes saved approximately 16% of the per-session
wakeups and 20% of the epoll_ctl() calls, which translates into
slightly less under high load due to the request often being ready
when the read() occurs. A performance increase between 2 and 5% is
expected depending on the workload.
It does not seem necessary to backport this change to 1.4, even though
it fixes some performance issues. It may later be backported if
required to fix something else because the risk of regression seems
very low due to the fact that we're more in line with the documented
semantics.
Willy Tarreau [Sun, 20 Jun 2010 08:26:51 +0000 (10:26 +0200)]
[MINOR] errors: provide new status codes for config parsing functions
Some config parsing functions need to return composite status codes
when they rely on other functions. Let's provide a few such codes
for general use and extend them later.
Willy Tarreau [Sun, 20 Jun 2010 05:15:43 +0000 (07:15 +0200)]
[MINOR] freq_ctr: add new types and functions for periods different from 1s
Some freq counters will have to work on periods different from 1 second.
The original freq counters rely on the period to be exactly one second.
The new ones (freq_ctr_period) let the user define the period in ticks,
and all computations are operated over that period. When reading a value,
it indicates the amount of events over that period too.
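
One way to build such a counter is sketched below. The commit message only states that the period is given in ticks and that a read reports the events over that period; the two-slot layout and the linear weighting of the previous period are assumptions of this sketch, and all names are hypothetical.

```c
#include <stdint.h>

struct sketch_freq_ctr_period {
    uint32_t curr_tick;   /* start of the current period, in ticks */
    uint32_t curr_ctr;    /* events counted in the current period */
    uint32_t prev_ctr;    /* events counted in the previous period */
};

/* Move to a new period if <now> is past the end of the current one. */
static void sketch_rotate(struct sketch_freq_ctr_period *c,
                          uint32_t period, uint32_t now)
{
    if (now - c->curr_tick < period)
        return;
    c->prev_ctr = c->curr_ctr;
    c->curr_ctr = 0;
    c->curr_tick += period;
    if (now - c->curr_tick >= period) {
        /* more than one full period went by without any event */
        c->prev_ctr = 0;
        c->curr_tick = now;
    }
}

/* Account for <inc> events at date <now>. */
static void sketch_add(struct sketch_freq_ctr_period *c,
                       uint32_t period, uint32_t now, uint32_t inc)
{
    sketch_rotate(c, period, now);
    c->curr_ctr += inc;
}

/* Estimate the number of events over the last <period> ticks: all of the
 * current period plus the share of the previous one still in the window. */
static uint32_t sketch_read(const struct sketch_freq_ctr_period *c,
                            uint32_t period, uint32_t now)
{
    struct sketch_freq_ctr_period tmp = *c;
    uint32_t remain;

    sketch_rotate(&tmp, period, now);
    remain = period - (now - tmp.curr_tick);
    return tmp.curr_ctr +
           (uint32_t)(((uint64_t)tmp.prev_ctr * remain) / period);
}
```

Since remain never exceeds period, the 64-bit product divided by the 32-bit period always fits in 32 bits.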
Willy Tarreau [Sun, 20 Jun 2010 05:12:37 +0000 (07:12 +0200)]
[MINOR] tools: add a fast div64_32 function
We'll need to divide 64 bits by 32 bits with new frequency counters.
Gcc does not know when it can safely do that, but the way we build
our operations lets us be sure. So let's provide an optimised version
for that purpose.
[MEDIUM] session: make it possible to call an I/O handler on both SI
This will be used when an I/O handler running in a stream interface
needs to establish a connection somewhere. We want the session
processor to evaluate both I/O handlers, depending on which side has
one. Doing so also requires that stream_int_update_embedded() wakes
the session up only when the other side is established or has closed,
for instance in order to handle connection errors without looping
indefinitely during the connection setup time.
The session processor still relies on BF_READ_ATTACHED being set,
though we must do whatever is required to remove this dependency.
[MINOR] proxy: add a "parent" member to the structure
This member will be used later when frontends are created on the
fly by some tasks. It will also be usable later if we need to
support multiple config instances for example.
[MEDIUM] stream-interface: add a ->release callback
When a connection is closed on a stream interface, some iohandlers
will need to be informed in order to release some resources. This
normally happens upon a shutr+shutw. It is the equivalent of the
fd_delete() call which is done for real sockets, except that this
time we release internal resources.
It can also be used with real sockets because it does not cost
anything else and might one day be useful.
Till now when a server was configured with address 0.0.0.0, the
connection was forwarded to this address which generally is intercepted
by the system as a local address, so this was completely useless.
One sometimes useful feature for outgoing transparent proxies is to
be able to forward the connection to the same address the client
requested. This patch fixes the meaning of 0.0.0.0 precisely to
ensure that the connection will be forwarded to the initial client's
destination address.
Patrick Mezard [Sat, 12 Jun 2010 15:02:47 +0000 (17:02 +0200)]
[DOC] add configuration samples
configuration.txt is thorough and accurate but lacked sample configurations
clarifying both the syntax and the relations between global, defaults,
frontend, backend and listen sections. Besides, almost all examples to be found
in haproxy-en.txt or online tutorials make use of the 'listen' syntax while
'frontend/backend' is really the one to know about.
(cherry picked from commit 01ac10ad189b11c563eeb835733fba58e6c5271d)
Willy Tarreau [Fri, 18 Jun 2010 07:57:45 +0000 (09:57 +0200)]
[BUG] stick_table: the fix for the memory leak caused a regression
(cherry picked from commit 61ba936e6858dfcf9964d25870726621d8188fb9)
[ note: the bug was finally not present in 1.5-dev but at least we
have to reset store_count to be compatible with 1.4 ]
Commit d6e9e3b5e320b957e6c491bd92d91afad30ba638 caused recently created
entries to be removed as soon as they were created, breaking stickiness.
It is not clear whether a use-after-free was possible or not in this case.
Willy Tarreau [Mon, 14 Jun 2010 17:09:21 +0000 (19:09 +0200)]
[MINOR] config: provide a function to quote args in a more friendly way
The quote_arg() function can be used to quote an argument or indicate
"end of line" if it's null or empty. It should be useful to more precisely
report location of problems in the configuration.
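
A helper with this behaviour could look like the sketch below. It is not the actual quote_arg() implementation: the single-quote style, the static buffer and its size are assumptions made for the illustration.

```c
#include <stdio.h>

/* Returns a printable form of <arg>: the argument between quotes, or the
 * string "end of line" when it is NULL or empty. The result may point to a
 * static buffer, so it must be used before the next call. */
static const char *sketch_quote_arg(const char *arg)
{
    static char buf[256];

    if (!arg || !*arg)
        return "end of line";
    snprintf(buf, sizeof(buf), "'%s'", arg);
    return buf;
}
```

An error message such as "unexpected %s after keyword" can then be fed with sketch_quote_arg(args[1]) and reads correctly whether or not an argument was present.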
Willy Tarreau [Sun, 6 Jun 2010 15:58:34 +0000 (17:58 +0200)]
[MEDIUM] stick_table: separate storage and update of session entries
When an entry already exists, we just need to update its expiration
timer. Let's have a dedicated function for that instead of spreading
open code everywhere.
This change also ensures that an update of an existing sticky session
really leads to an update of its expiration timer, which was apparently
not the case till now. This point needs to be checked in 1.4.
This change makes use of the stick-tables to keep track of any source
address activity. Two ACLs make it possible to check the count of an
entry or update it and act accordingly. The typical usage will be to
reject a TCP request upon match of an excess value.
Willy Tarreau [Sun, 6 Jun 2010 13:38:59 +0000 (15:38 +0200)]
[MEDIUM] stick_table: don't overwrite data when storing an entry
Till now sticky sessions only held server IDs. Now there are other
data types so it is not acceptable anymore to overwrite the server ID
when writing something. The server ID must then only be written from
the caller when appropriate. Doing this has also led to separate
lookup and storage.
Willy Tarreau [Sun, 6 Jun 2010 12:30:13 +0000 (14:30 +0200)]
[MINOR] stick_table: add support for "conn_cum" data type.
This one can be parsed on the "stick-table" line after the "store"
keyword. It will hold the number of connections matching the entry,
for use with ACLs or anything else.
Willy Tarreau [Sun, 6 Jun 2010 11:34:54 +0000 (13:34 +0200)]
[MEDIUM] stick_table: add room for extra data types
The stick_tables will now be able to store extra data for the same key.
A limited set of extra data types will be defined and for each of them
an offset in the sticky session will be assigned at startup time. All
of this information will be stored in the stick table.
The extra data types will have to be specified after the new "store"
keyword of the "stick-table" directive, which will reserve some space
for them.
Willy Tarreau [Sun, 6 Jun 2010 11:22:23 +0000 (13:22 +0200)]
[CLEANUP] stick_table: move pattern to key functions to stick_table.c
pattern.c depended on stick_table while in fact it should be the opposite.
So we move from pattern.c everything related to stick_tables and invert the
dependency. That way the code becomes more logical and intuitive.
Willy Tarreau [Sun, 6 Jun 2010 10:57:10 +0000 (12:57 +0200)]
[CLEANUP] stick_table: rename some stksess struct members to avoid confusion
The name 'exps' and 'keys' in struct stksess was confusing because it was
the same name as in the table which holds all of them, while they only hold
one node each. Remove the trailing 's' to more clearly identify who's who.
Willy Tarreau [Sun, 6 Jun 2010 10:11:37 +0000 (12:11 +0200)]
[MINOR] stick_table: add support for variable-sized data
Right now we're only able to store a server ID in a sticky session.
The goal is to be able to store anything whose size is known at startup
time. For this, we store the extra data before the stksess pointer,
using a negative offset. It will then be easy to cumulate multiple
data provided they each have their own offset.
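
The negative-offset layout can be sketched as follows. The structure below is a stand-in, not the real struct stksess, and the helper names and the alignment padding are assumptions of this sketch. The extra data area is allocated in front of the entry, and each data type is reached by subtracting its offset from the entry pointer.

```c
#include <stddef.h>
#include <stdlib.h>

struct sketch_stksess {
    long expire;   /* placeholder for the real entry fields */
};

/* Round the data area up so that the entry itself stays aligned. */
static size_t sketch_round_up(size_t n)
{
    size_t a = _Alignof(max_align_t);

    return (n + a - 1) / a * a;
}

/* Allocate an entry preceded by <data_size> bytes of zeroed extra data. */
static struct sketch_stksess *sketch_alloc(size_t data_size)
{
    size_t pad = sketch_round_up(data_size);
    char *area = calloc(1, pad + sizeof(struct sketch_stksess));

    return area ? (struct sketch_stksess *)(area + pad) : NULL;
}

/* The data stored at <offset> lives <offset> bytes before the entry. */
static void *sketch_data_ptr(struct sketch_stksess *ts, size_t offset)
{
    return (char *)ts - offset;
}

/* Release the whole area, starting from where calloc() returned it. */
static void sketch_free(struct sketch_stksess *ts, size_t data_size)
{
    if (ts)
        free((char *)ts - sketch_round_up(data_size));
}
```

With two int-sized data types, the first one would be assigned offset sizeof(int) and the second one 2 * sizeof(int), so that both sit side by side just before the entry.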
It's very disturbing to see the "denied req" counter increase without
any other session counter moving. In fact, we can't count a rejected
TCP connection as "denied req" as we have not yet instantiated any
session at all. Let's use a new counter for that.
Willy Tarreau [Sat, 5 Jun 2010 08:49:41 +0000 (10:49 +0200)]
[MEDIUM] frontend: count the incoming connection earlier
The frontend's connection was accounted for once the session was
instantiated. This was problematic because the early ACLs weren't
able to correctly account for the number of concurrent connections.
Now we count the connection once it is assigned to the frontend.
It also brings the nice advantage of being more symmetrical, because
the stream_sock's accept() does not have to account for that anymore,
only the session's accept() does.
Willy Tarreau [Fri, 4 Jun 2010 18:59:39 +0000 (20:59 +0200)]
[MINOR] session: differenciate between accepted connections and received connections
Now we're able to reject connections very early, so we need to use a
different counter for the connections that are received and the ones
that are accepted and converted into sessions, so that the rate limits
can still apply to the accepted ones. The session rate must still be
used to compute the rate limit, so that we can reject undesired traffic
without affecting the rate.
Willy Tarreau [Fri, 4 Jun 2010 10:25:31 +0000 (12:25 +0200)]
[MINOR] buffer: refine the flags that may wake an analyser up.
Analysers don't care (and must not care) about a few flags such as
BF_AUTO_CLOSE or BF_AUTO_CONNECT, so those flags should not be listed
in the BF_MASK_STATIC bitmask.
We should also recheck if some buffer flags should be ignored or not
in process_session() when deciding if we must loop again or not.
Willy Tarreau [Tue, 1 Jun 2010 15:45:26 +0000 (17:45 +0200)]
[MAJOR] frontend: split accept() into frontend_accept() and session_accept()
A new function session_accept() is now called from the lower layer to
instantiate a new session. Once the session is instantiated, the upper
layer's frontend_accept() is called. This one can be service-dependent.
That way, we have a 3-phase accept() sequence :
  1) protocol-specific, session-less accept(), which is pointed to by
     the listener. It defaults to the generic stream_sock_accept().
  2) session_accept() which relies on a frontend but not necessarily
     for use in a proxy (eg: stats or any future service).
  3) frontend_accept() which performs the accept for the service
     offered by the frontend. It defaults to frontend_accept() which
     is really what is used by a proxy.
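
The three phases above can be sketched as a chain of function pointers. None of the types or functions below are the real HAProxy ones: they are hypothetical stand-ins, and the session is stored inside the listener only to keep the example short.

```c
#include <stddef.h>

struct sketch_session;

struct sketch_frontend {
    /* phase 3: accept for the service offered by the frontend */
    int (*accept)(struct sketch_session *s);
};

struct sketch_session {
    int fd;
    struct sketch_frontend *fe;
    int ready;
};

struct sketch_listener {
    /* phase 1: protocol-specific, session-less accept */
    int (*accept)(struct sketch_listener *l, int fd);
    struct sketch_frontend *fe;
    struct sketch_session last;   /* real code would allocate a session */
};

/* Phase 3 default: what a proxy would do once the session exists. */
static int sketch_frontend_accept(struct sketch_session *s)
{
    s->ready = 1;
    return 0;
}

/* Phase 2: build the session, then hand it over to the frontend. */
static int sketch_session_accept(struct sketch_listener *l, int fd)
{
    struct sketch_session *s = &l->last;

    s->fd = fd;
    s->fe = l->fe;
    s->ready = 0;
    return s->fe->accept(s);
}

/* Phase 1 default: nothing protocol-specific to do in this sketch, so the
 * new connection goes straight to phase 2. */
static int sketch_stream_sock_accept(struct sketch_listener *l, int fd)
{
    return sketch_session_accept(l, fd);
}
```

A service other than a proxy would only replace the frontend's accept pointer and keep phases 1 and 2 as they are.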
The TCP/HTTP proxies have been moved to this mode so that we can now rely on
frontend_accept() for any type of session initialization relying on a frontend.
The next step will be to convert the stats to use the same system.
Willy Tarreau [Tue, 1 Jun 2010 15:12:40 +0000 (17:12 +0200)]
[MAJOR] frontend: reorder the session initialization upon accept
This will be needed for the last factoring step which adds support
for application-level accept(). The tcp/http accept() code has now
been isolated and will have to move to a separate function.
Willy Tarreau [Tue, 1 Jun 2010 08:56:34 +0000 (10:56 +0200)]
[MINOR] frontend: rely on the frontend and not the backend for INDEPSTR
Till now, the frontend relied on the backend's options for INDEPSTR,
while at the time of accept, the frontend and backend are the same.
So we now use the frontend's pointer instead of the backend and we
don't have any dependency on the backend anymore in the frontend's
accept code.
Willy Tarreau [Tue, 1 Jun 2010 08:36:43 +0000 (10:36 +0200)]
[MEDIUM] session: don't assign conn_retries upon accept() anymore
The conn_retries attribute is now assigned when switching from SI_ST_INI
to SI_ST_REQ. This eliminates one of the last dependencies on the backend
in the frontend's accept() function.
Willy Tarreau [Tue, 1 Jun 2010 07:51:00 +0000 (09:51 +0200)]
[MEDIUM] session: move the conn_retries attribute to the stream interface
The conn_retries still lies in the session and its initialization depends
on the backend when it may not yet be known. Let's first move it to the
stream interface.
Willy Tarreau [Mon, 31 May 2010 17:17:12 +0000 (19:17 +0200)]
[MAJOR] frontend: don't initialize the server-side stream_int anymore
The frontend has no reason to initialize the server-side stream_interface.
It's a leftover from old times which now makes no sense due to the fact
that we don't know in the frontend whether the other side will be a socket,
a task or anything else. Removing this part is possible due to previous
patches which perform the initialization at the proper place. We'll still
have to be able to register an I/O handler for situations where everything
is known only to the frontend (eg: unix stats socket), before merging the
various instantiations of this accept() function.
Willy Tarreau [Mon, 31 May 2010 15:44:19 +0000 (17:44 +0200)]
[MEDIUM] backend: initialize the server stream_interface upon connect()
It's not normal to initialize the server-side stream interface from the
accept() function, because it may change later. Thus, we introduce a new
stream_sock_prepare_interface() function which is called just before the
connect() and which sets all of the stream_interface's callbacks to the
default ones used for real sockets. The ->connect function is also set
at the same instant so that we can easily add new server-side protocols
soon.
Willy Tarreau [Mon, 31 May 2010 10:31:35 +0000 (12:31 +0200)]
[MEDIUM] session: initialize server-side timeouts after connect()
It was particularly embarrassing that the server timeout was assigned
to buffers during an accept() just to be potentially changed later in
case of a use_backend rule. The frontend side has nothing to do with
server timeouts.
Now we initialize them right after the connect() succeeds. Later this
should change for a unique stream-interface timeout setting only.
Willy Tarreau [Mon, 31 May 2010 09:57:51 +0000 (11:57 +0200)]
[MEDIUM] session: finish session establishment sequence in with I/O handlers
Calling sess_establish() upon a successful connect() was essential, but
it was not clearly stated whether it was necessary for an access to an
I/O handler or not. While it would be desired, having it automatically
add the response analyzers is quite a problem, and it breaks HTTP stats.
The solution is thus not to call it for now and to perform the few response
initializations as needed.
For the long term, we need to find a way to specify the analyzers to install
during a stream_int_register_handler() if any.
Willy Tarreau [Mon, 31 May 2010 09:27:58 +0000 (11:27 +0200)]
[CLEANUP] buffer->cto is not used anymore
The connection timeout stored in the buffer has not been used since the
stream interfaces were introduced. Let's get rid of it as it's one of the
things that complicate factoring of the accept() functions.
Willy Tarreau [Mon, 31 May 2010 08:56:17 +0000 (10:56 +0200)]
[MINOR] frontend: only check for monitor-net rules if LI_O_CHK_MONNET is set
We can disable the monitor-net rules on a listener if this flag is not
set in the listener's options. This will be useful when we don't want
to check that fe->addr is set or not for non-TCP frontends.
Willy Tarreau [Mon, 31 May 2010 08:30:33 +0000 (10:30 +0200)]
[MEDIUM] frontend: check for LI_O_TCP_RULES in the listener
The new LI_O_TCP_RULES listener option indicates that some TCP rules
must be checked upon accept on this listener. It is now checked by
the frontend and the L4 rules are evaluated only in this case. The
flag is only set when at least one tcp-req rule is present in the
frontend.
The L4 rules check function has now been moved to proto_tcp.c where
it ought to be.
Willy Tarreau [Sun, 23 May 2010 20:59:00 +0000 (22:59 +0200)]
[MEDIUM] tcp: check for pure layer4 rules immediately after accept()
The tcp inspection rules were fast but were only processed after a
schedule had occurred and all resources were allocated. When defending
against DDoS, it's important to be able to apply some protection at the
earliest possible instant.
Thus we introduce a new set of rules : tcp-request rules which act
on pure layer4 information (no content). They are evaluated even
before the buffers are allocated for the session, saving as much
time as possible. That way it becomes possible to check an incoming
connection's source IP address against a list of authorized/blocked
networks, and immediately drop the connection.
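
The kind of test such a rule performs can be sketched as below, for IPv4 only. This is not the ACL matching code: the structure and function names are hypothetical, and addresses are assumed to be already converted to host byte order.

```c
#include <stddef.h>
#include <stdint.h>

struct sketch_net {
    uint32_t addr;   /* network address, host byte order */
    uint32_t mask;   /* network mask, host byte order */
};

/* Returns 1 if <src> belongs to one of the <count> networks, else 0. Only
 * the address is needed, so this can run before any buffer is allocated. */
static int sketch_src_in_nets(uint32_t src, const struct sketch_net *nets,
                              size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if ((src & nets[i].mask) == (nets[i].addr & nets[i].mask))
            return 1;
    }
    return 0;
}
```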
The rules are checked even before we perform any socket-specific
operation, so that we can optimize the reject case, which will be the
problematic one during a DDoS. The second stream interface and s->txn
are also now initialized after the rules are parsed for the same
reason. All these optimisations have made it possible to reach up to
212000 connections/s with a real rule rejecting based on the source IP
address.