MAJOR: listeners: use dual-linked lists to chain listeners with frontends
Navigating through listeners was very inconvenient and error-prone. Not to
mention that listeners were linked in reverse order and reversed afterwards.
In order to definitely get rid of these issues, we now do the following :
- frontends have a dual-linked list of bind_conf
- frontends have a dual-linked list of listeners
- bind_conf have a dual-linked list of listeners
- listeners have a pointer to their bind_conf
This way we can now navigate from anywhere to anywhere and always find the
proper bind_conf for a given listener, as well as find the list of listeners
for a current bind_conf.
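The relationships listed above can be sketched as follows. This is only a
minimal illustration, assuming a simple circular doubly-linked list; all the
struct layouts, field names (by_fe, by_bind, ...) and helpers are hypothetical
and are not the actual haproxy definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly-linked list (hypothetical helper, not haproxy's). */
struct list {
    struct list *n; /* next element */
    struct list *p; /* previous element */
};

static void list_init(struct list *l) { l->n = l->p = l; }

/* Appends <el> at the tail of <head>, so insertion order is preserved. */
static void list_append(struct list *head, struct list *el)
{
    el->p = head->p;
    el->n = head;
    head->p->n = el;
    head->p = el;
}

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct frontend {
    struct list bind_confs;      /* all bind_conf of this frontend */
    struct list listeners;       /* all listeners of this frontend */
};

struct bind_conf {
    struct list by_fe;           /* chained in frontend.bind_confs */
    struct list listeners;       /* listeners created from this bind line */
};

struct listener {
    struct list by_fe;           /* chained in frontend.listeners */
    struct list by_bind;         /* chained in bind_conf.listeners */
    struct bind_conf *bind_conf; /* back-pointer to the owning bind_conf */
};

static void attach_bind_conf(struct frontend *fe, struct bind_conf *bc)
{
    assert(fe && bc);
    list_init(&bc->listeners);
    list_append(&fe->bind_confs, &bc->by_fe);
}

static void attach_listener(struct frontend *fe, struct bind_conf *bc,
                            struct listener *l)
{
    assert(fe && bc && l);
    l->bind_conf = bc;
    list_append(&fe->listeners, &l->by_fe);
    list_append(&bc->listeners, &l->by_bind);
}

/* Counts the elements chained on <head>. */
static int list_count(const struct list *head)
{
    int n = 0;

    for (const struct list *it = head->n; it != head; it = it->n)
        n++;
    return n;
}
```

Since elements are appended at the tail, no reversal pass is needed after
parsing, and a listener reaches its bind_conf with a single pointer
dereference.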
MINOR: config: set the bind_conf entry on listeners created from a "listen" line.
Otherwise we would risk a segfault when checking the config's validity
(eg: when looking for conflicts on ID assignments).
Note that the same issue exists with peers_fe and the global stats_fe. All
listeners should be reviewed and simplified to use a compatible declaration
mode.
MEDIUM: config: enumerate full list of registered "bind" keywords upon error
When an unknown "bind" keyword is detected, dump the list of all
registered keywords. Unsupported default alternatives are also reported
as "not supported".
MEDIUM: config: move all unix-specific bind keywords to proto_uxst.c
The "mode", "uid", "gid", "user" and "group" bind options were moved to
proto_uxst as they are unix-specific.
Note that previous versions had a bug here: only the last listener was
updated with the specified settings. However, it almost never happens
that bind lines contain multiple UNIX socket paths so this is not that
much of a problem anyway.
Registering new SSL bind keywords was not particularly handy as it required
many #ifdef in cfgparse.c. Now the code has moved to ssl_sock.c which calls
a register function for all the keywords.
Error reporting was also improved by this move, because the called functions
build an error message using memprintf(), which can span multiple lines if
needed, and each of these errors will be displayed indented in the context of
the bind line being processed. This is important when dealing with certificate
directories which can report multiple errors.
MEDIUM: config: move the "bind" TCP parameters to proto_tcp
Now proto_tcp.c is responsible for the 4 settings it handles :
- defer-accept
- interface
- mss
- transparent
These ones do not need to be handled in cfgparse anymore. If support for a
setting is disabled by a missing build option, then cfgparse correctly
reports :
[ALERT] 255/232700 (2701) : parsing [echo.cfg:114] : 'bind' : 'transparent' option is not implemented in this version (check build options).
MEDIUM: listener: add a minimal framework to register "bind" keyword options
With the arrival of SSL, the "bind" keyword has received even more options,
all of which are processed in cfgparse in a cumbersome way. So it's time to
let modules register their own bind options. This is done very similarly to
the ACLs, with one small difference: we distinguish between an unknown option
and a known but unimplemented option.
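The idea of such a registration framework can be sketched as below. This is
only an illustration under assumptions: the names (bind_register_keyword,
bind_lookup_keyword, enum kw_status) and the fixed-size table are hypothetical
and do not reflect the real haproxy API. A keyword registered with a NULL
parser stands for "known but not compiled in".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Possible results of a keyword lookup (hypothetical names). */
enum kw_status { KW_UNKNOWN, KW_NOT_IMPLEMENTED, KW_OK };

typedef int (*bind_parse_fn)(const char *arg);

struct bind_kw {
    const char *name;
    bind_parse_fn parse; /* NULL: keyword is known but support is not built */
};

#define MAX_BIND_KW 32
static struct bind_kw bind_kws[MAX_BIND_KW];
static size_t nb_bind_kws;

/* Called by each module to declare a keyword. Returns 0 on success, or -1
 * when the table is full.
 */
static int bind_register_keyword(const char *name, bind_parse_fn parse)
{
    assert(name != NULL);
    if (nb_bind_kws >= MAX_BIND_KW)
        return -1;
    bind_kws[nb_bind_kws].name = name;
    bind_kws[nb_bind_kws].parse = parse;
    nb_bind_kws++;
    return 0;
}

/* Looks <name> up and reports whether it is unknown, known but not
 * implemented, or usable (in which case <*out> receives the parser).
 */
static enum kw_status bind_lookup_keyword(const char *name, bind_parse_fn *out)
{
    for (size_t i = 0; i < nb_bind_kws; i++) {
        if (strcmp(bind_kws[i].name, name) != 0)
            continue;
        if (!bind_kws[i].parse)
            return KW_NOT_IMPLEMENTED;
        *out = bind_kws[i].parse;
        return KW_OK;
    }
    return KW_UNKNOWN;
}

/* Example parser: accepts any non-empty argument. */
static int parse_mss(const char *arg)
{
    return (arg && *arg) ? 0 : -1;
}
```

With this split, the config parser can report "unknown keyword" and "not
implemented in this version" as two distinct errors without knowing anything
about the modules themselves.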
Some settings need to be merged per-bind config line and are not necessarily
SSL-specific. It becomes quite inconvenient to have this ssl_conf SSL-specific,
so let's replace it with something more generic.
MINOR: config: add a function to indent error messages
Bind parsers may return multiple errors, so let's make use of a new function
to re-indent multi-line error messages so that they're all reported in their
context.
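A re-indentation helper of this kind could look like the sketch below. The
function name indent_lines() and its signature are assumptions made for the
illustration, not the function actually added by this commit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of <msg> in which every non-empty line start
 * is preceded by <prefix>. The caller must free() the result. Returns NULL on
 * allocation failure.
 */
static char *indent_lines(const char *msg, const char *prefix)
{
    size_t plen, lines = 1;
    char *out, *w;
    int at_bol = 1;

    assert(msg && prefix);
    plen = strlen(prefix);

    /* one prefix for the first line, plus one per newline followed by text */
    for (const char *p = msg; *p; p++)
        if (*p == '\n' && p[1] != '\0')
            lines++;

    out = malloc(strlen(msg) + lines * plen + 1);
    if (!out)
        return NULL;

    w = out;
    for (const char *p = msg; *p; p++) {
        if (at_bol) {
            memcpy(w, prefix, plen);
            w += plen;
            at_bol = 0;
        }
        *w++ = *p;
        if (*p == '\n')
            at_bol = 1;
    }
    *w = '\0';
    return out;
}
```

A bind parser returning two errors separated by a newline would then have both
of them displayed with the same indentation under the bind line being
processed.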
BUG/MAJOR: ssl: missing tests in ACL fetch functions
Baptiste Assmann observed a crash of 1.5-dev12 occurring when the ssl_sni
fetch was used with no SNI on the input connection and without a prior
has_sni check. A code review revealed several issues :
1) it was possible to call the has_sni and ssl_sni fetch functions with
a NULL data_ctx if the handshake fails or if the connection is aborted
during the handshake.
2) when no SNI is present, strlen() was called with a NULL parameter in
smp_fetch_ssl_sni().
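The two missing tests can be illustrated with the sketch below. The structures
are simplified stand-ins (conn_sketch, sample_sketch) and the sni field
replaces the real server name lookup, so this is not the actual
smp_fetch_ssl_sni() code, only the shape of the checks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins for the real connection and sample structures. */
struct conn_sketch {
    void *data_ctx;   /* NULL if the handshake failed or was aborted */
    const char *sni;  /* stands in for the server name lookup; NULL if none */
};

struct sample_sketch {
    const char *str;
    size_t len;
};

/* Returns 1 and fills <smp> when an SNI is available, otherwise 0. */
static int fetch_sni_sketch(const struct conn_sketch *conn,
                            struct sample_sketch *smp)
{
    assert(smp != NULL);

    /* issue 1: the data context may be missing */
    if (!conn || !conn->data_ctx)
        return 0;

    /* issue 2: never pass a NULL name to strlen() */
    if (!conn->sni)
        return 0;

    smp->str = conn->sni;
    smp->len = strlen(conn->sni);
    return 1;
}
```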
Released version 1.5-dev12 with the following main changes :
- CONTRIB: halog: sort URLs by avg bytes_read or total bytes_read
- MEDIUM: ssl: add support for prefer-server-ciphers option
- MINOR: IPv6 support for transparent proxy
- MINOR: protocol: add SSL context to listeners if USE_OPENSSL is defined
- MINOR: server: add SSL context to servers if USE_OPENSSL is defined
- MEDIUM: connection: add a new handshake flag for SSL (CO_FL_SSL_WAIT_HS).
- MEDIUM: ssl: add new files ssl_sock.[ch] to provide the SSL data layer
- MEDIUM: config: add the 'ssl' keyword on 'bind' lines
- MEDIUM: config: add support for the 'ssl' option on 'server' lines
- MEDIUM: ssl: protect against client-initiated renegociation
- BUILD: add optional support for SSL via the USE_OPENSSL flag
- MEDIUM: ssl: add shared memory session cache implementation.
- MEDIUM: ssl: replace OpenSSL's session cache with the shared cache
- MINOR: ssl add global setting tune.sslcachesize to set SSL session cache size.
- MEDIUM: ssl: add support for SNI and wildcard certificates
- DOC: Typos cleanup
- DOC: fix name for "option independant-streams"
- DOC: specify the default value for maxconn in the context of a proxy
- BUG/MINOR: to_log erased with unique-id-format
- LICENSE: add licence exception for OpenSSL
- BUG/MAJOR: cookie prefix doesn't support cookie-less servers
- BUILD: add an AIX 5.2 (and later) target.
- MEDIUM: fd/si: move peeraddr from struct fdinfo to struct connection
- MINOR: halog: use the more recent dual-mode fgets2 implementation
- BUG/MEDIUM: ebtree: ebmb_insert() must not call cmp_bits on full-length matches
- CLEANUP: halog: make clean should also remove .o files
- OPTIM: halog: make use of memchr() on platforms which provide a fast one
- OPTIM: halog: improve cold-cache behaviour when loading a file
- BUG/MINOR: ACL implicit arguments must be created with unresolved flag
- MINOR: replace acl_fetch_{path,url}* with smp_fetch_*
- MEDIUM: pattern: add the "base" sample fetch method
- OPTIM: i386: make use of kernel-mode-linux when available
- BUG/MINOR: tarpit: fix condition to return the HTTP 500 message
- BUG/MINOR: polling: some events were not set in various pollers
- MINOR: http: add the urlp_val ACL match
- BUG: stktable: tcp_src_to_stktable_key() must return NULL on invalid families
- MINOR: stats/cli: add plans to support more stick-table actions
- MEDIUM: stats/cli: add support for "set table key" to enter values
- REORG/MEDIUM: fd: remove FD_STCLOSE from struct fdtab
- REORG/MEDIUM: fd: remove checks for FD_STERROR in ev_sepoll
- REORG/MEDIUM: fd: get rid of FD_STLISTEN
- REORG/MINOR: connection: move declaration to its own include file
- REORG/MINOR: checks: put a struct connection into the server
- MINOR: connection: add flags to the connection struct
- MAJOR: get rid of fdtab[].state and use connection->flags instead
- MINOR: fd: add a new I/O handler to fdtab
- MEDIUM: polling: prepare to call the iocb() function when defined.
- MEDIUM: checks: make use of fdtab->iocb instead of cb[]
- MEDIUM: protocols: use the generic I/O callback for accept callbacks
- MINOR: connection: add a handler for fd-based connections
- MAJOR: connection: replace direct I/O callbacks with the connection callback
- MINOR: fd: make fdtab->owner a connection and not a stream_interface anymore
- MEDIUM: connection: remove the FD_POLL_* flags only once
- MEDIUM: connection: extract the send_proxy callback from proto_tcp
- MAJOR: tcp: remove the specific I/O callbacks for TCP connection probes
- CLEANUP: remove the now unused fdtab direct I/O callbacks
- MAJOR: remove the stream interface and task management code from sock_*
- MEDIUM: stream_interface: pass connection instead of fd in sock_ops
- MEDIUM: stream_interface: centralize the SI_FL_ERR management
- MAJOR: connection: add a new CO_FL_CONNECTED flag
- MINOR: rearrange tcp_connect_probe() and fix wrong return codes
- MAJOR: connection: call data layer handshakes from the handler
- MEDIUM: fd: remove the EV_FD_COND_* primitives
- MINOR: sock_raw: move calls to si_data_close upper
- REORG: connection: replace si_data_close() with conn_data_close()
- MEDIUM: sock_raw: introduce a read0 callback that is different from shutr
- MAJOR: stream_int: use a common stream_int_shut*() functions regardless of the data layer
- MAJOR: fd: replace all EV_FD_* macros with new fd_*_* inline calls
- MEDIUM: fd: add fd_poll_{recv,send} for use when explicit polling is required
- MEDIUM: connection: add definitions for dual polling mechanisms
- MEDIUM: connection: make use of the new polling functions
- MAJOR: make use of conn_{data|sock}_{poll|stop|want}* in connection handlers
- MEDIUM: checks: don't use FD_WAIT_* anymore
- MINOR: fd: get rid of FD_WAIT_*
- MEDIUM: stream_interface: offer a generic function for connection updates
- MEDIUM: stream-interface: offer a generic chk_rcv function for connections
- MEDIUM: stream-interface: add a snd_buf() callback to sock_ops
- MEDIUM: stream-interface: provide a generic stream_int_chk_snd_conn() function
- MEDIUM: stream-interface: provide a generic si_conn_send_cb callback
- MEDIUM: stream-interface: provide a generic stream_sock_read0() function
- REORG/MAJOR: use "struct channel" instead of "struct buffer"
- REORG/MAJOR: extract "struct buffer" from "struct channel"
- MINOR: connection: provide conn_{data|sock}_{read0|shutw} functions
- REORG: sock_raw: rename the files raw_sock*
- MAJOR: raw_sock: extract raw_sock_to_buf() from raw_sock_read()
- MAJOR: raw_sock: temporarily disable splicing
- MINOR: stream-interface: add an rcv_buf callback to sock_ops
- REORG: stream-interface: move sock_raw_read() to si_conn_recv_cb()
- MAJOR: connection: split the send call into connection and stream interface
- MAJOR: stream-interface: restore splicing mechanism
- MAJOR: stream-interface: make conn_notify_si() more robust
- MEDIUM: proxy-proto: don't use buffer flags in conn_si_send_proxy()
- MAJOR: stream-interface: don't commit polling changes in every callback
- MAJOR: stream-interface: fix splice not to call chk_snd by itself
- MEDIUM: stream-interface: don't remove WAIT_DATA when a handshake is in progress
- CLEANUP: connection: split sock_ops into data_ops, app_cp and si_ops
- REORG: buffers: split buffers into chunk,buffer,channel
- MAJOR: channel: remove the BF_OUT_EMPTY flag
- REORG: buffer: move buffer_flush, b_adv and b_rew to buffer.h
- MINOR: channel: rename bi_full to channel_full as it checks the whole channel
- MINOR: buffer: provide a new buffer_full() function
- MAJOR: channel: stop relying on BF_FULL to take action
- MAJOR: channel: remove the BF_FULL flag
- REORG: channel: move buffer_{replace,insert_line}* to buffer.{c,h}
- CLEANUP: channel: usr CF_/CHN_ prefixes instead of BF_/BUF_
- CLEANUP: channel: use "channel" instead of "buffer" in function names
- REORG: connection: move the target pointer from si to connection
- MAJOR: connection: move the addr field from the stream_interface
- MEDIUM: stream_interface: remove CAP_SPLTCP/CAP_SPLICE flags
- MEDIUM: proto_tcp: remove any dependence on stream_interface
- MINOR: tcp: replace tcp_src_to_stktable_key with addr_to_stktable_key
- MEDIUM: connection: add an ->init function to data layer
- MAJOR: session: introduce embryonic sessions
- MAJOR: connection: make the PROXY decoder a handshake handler
- CLEANUP: frontend: remove the old proxy protocol decoder
- MAJOR: connection: rearrange the polling flags.
- MEDIUM: connection: only call tcp_connect_probe when nothing was attempted yet
- MEDIUM: connection: complete the polling cleanups
- MEDIUM: connection: avoid calling handshakes when polling is required
- MAJOR: stream_interface: continue to update data polling flags during handshakes
- CLEANUP: fd: remove fdtab->flags
- CLEANUP: fdtab: flatten the struct and merge the spec struct with the rest
- CLEANUP: includes: fix includes for a number of users of fd.h
- MINOR: ssl: disable TCP quick-ack by default on SSL listeners
- MEDIUM: config: add a "ciphers" keyword to set SSL cipher suites
- MEDIUM: config: add "nosslv3" and "notlsv1" on bind and server lines
- BUG: ssl: mark the connection as waiting for an SSL connection during the handshake
- BUILD: http: rename error_message http_error_message to fix conflicts on RHEL
- BUILD: ssl: fix shctx build on RHEL with futex
- BUILD: include sys/socket.h to fix build failure on FreeBSD
- BUILD: fix build error without SSL (ssl_cert)
- BUILD: ssl: use MAP_ANON instead of MAP_ANONYMOUS
- BUG/MEDIUM: workaround an eglibc bug which truncates the pidfiles when nbproc > 1
- MEDIUM: config: support per-listener backlog and maxconn
- MINOR: session: do not send an HTTP/500 error on SSL sockets
- MEDIUM: config: implement maxsslconn in the global section
- BUG: tcp: close socket fd upon connect error
- MEDIUM: connection: improve error handling around the data layer
- MINOR: config: make the tasks "nice" value configurable on "bind" lines.
- BUILD: shut a gcc warning introduced by commit 269ab31
- MEDIUM: config: centralize handling of SSL config per bind line
- BUILD: makefile: report USE_OPENSSL status in build options
- BUILD: report openssl build settings in haproxy -vv
- MEDIUM: ssl: add sample fetches for is_ssl, ssl_has_sni, ssl_sni_*
- DOC: add a special acknowledgement for the stud project
- DOC: add missing SSL options for servers and listeners
- BUILD: automatically add -lcrypto for SSL
- DOC: add some info about openssl build in the README
DOC: add a special acknowledgement for the stud project
Really, the quality of their code deserves it, it would have been much
harder to figure out how to get all the things right at once without looking
there from time to time!
BUILD: report openssl build settings in haproxy -vv
Since it's common enough to discover that some config options are not
supported due to some openssl version or build options, we report the
relevant ones in "haproxy -vv".
MEDIUM: ssl: add support for SNI and wildcard certificates
A side effect of this change is that the "ssl" keyword on "bind" lines is now
just a boolean and that "crt" is needed to designate certificate files or
directories.
Note that much refcounting was needed to have the free() work correctly due to
the number of cert aliases which can make a context be shared by multiple names.
CONTRIB: halog: sort URLs by avg bytes_read or total bytes_read
The patch attached to this mail adds the ability to sort URLs by
average bytes read and total bytes read in the HALog tool.
In most cases, bytes read is also the object size.
The purpose of this patch is to know which URLs consume the most
bandwidth, on average or in total.
It may be interesting as well to know the standard deviation (écart
type in French) for some counters (like bytes_read).
MEDIUM: config: centralize handling of SSL config per bind line
SSL config holds many parameters which are per bind line and not per
listener. Let's use a per-bind line config instead of having it
replicated for each listener.
At the moment we only do this for the SSL part but this should probably
evolve to handle more of the configuration and maybe even the state per
bind line.
MINOR: config: make the tasks "nice" value configurable on "bind" lines.
This is very convenient to reduce SSL processing priority compared to
other traffic. This applies to CPU usage only, but has a direct impact
on latency under congestion.
MEDIUM: connection: improve error handling around the data layer
Better avoid calling the data functions upon error or handshake than
having to put conditions everywhere, which are too easy to forget (one
check for CO_FL_ERROR was missing, but this was harmless).
MEDIUM: config: implement maxsslconn in the global section
SSL connections take a huge amount of memory, and unfortunately openssl
does not check malloc() returns and easily segfaults when too many
connections are used.
The only solution against this is to provide a global maxsslconn setting
to reject SSL connections above the limit in order to avoid reaching
unsafe limits.
MEDIUM: config: support per-listener backlog and maxconn
With SSL, connections are much more expensive, so it is important to be
able to limit concurrent connections per listener in order to limit the
memory usage.
BUG/MEDIUM: workaround an eglibc bug which truncates the pidfiles when nbproc > 1
Thomas Heil reported that when using nbproc > 1, his pidfiles were
regularly truncated. The issue could be tracked down to the presence
of a call to lseek(pidfile, 0, SEEK_SET) just before the close() call
in the children, resulting in the file being truncated by the children
while the parent was feeding it. This unexpected lseek() is transparently
performed by fclose().
Since there is no way to have the file automatically closed during the
fork, the only solution is to bypass the libc and use open/write/close
instead of fprintf() and fclose().
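The approach can be sketched as follows, assuming the parent keeps a raw file
descriptor and appends one line per forked child. The helper names
pidfile_open() and pidfile_add() are made up for the illustration. Since no
stdio stream is involved, a child that inherits the descriptor simply calls
close() on it, which does not touch the shared file offset.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Opens (and truncates) the pidfile. Returns the fd, or -1 on error. */
static int pidfile_open(const char *path)
{
    return open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
}

/* Appends "<pid>\n" to the pidfile using write() only. Returns 0 on success,
 * -1 on error or short write.
 */
static int pidfile_add(int fd, pid_t pid)
{
    char buf[32];
    int len;

    assert(fd >= 0);
    len = snprintf(buf, sizeof(buf), "%ld\n", (long)pid);
    if (len < 0 || (size_t)len >= sizeof(buf))
        return -1;
    return write(fd, buf, (size_t)len) == (ssize_t)len ? 0 : -1;
}
```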
FreeBSD uses MAP_ANON, Linux uses MAP_ANONYMOUS but generally also
defines MAP_ANON as an alias of it. Just checked on other
OSes and AIX defines both. So better use MAP_ANON which seems to be
more commonly defined.
David BERARD [Tue, 4 Sep 2012 13:15:13 +0000 (15:15 +0200)]
MEDIUM: ssl: add support for prefer-server-ciphers option
I wrote a small patch to add the SSL_OP_CIPHER_SERVER_PREFERENCE OpenSSL option
to frontend, if the 'prefer-server-ciphers' keyword is set.
Example :
bind 10.11.12.13 ssl /etc/haproxy/ssl/cert.pem ciphers RC4:HIGH:!aNULL:!MD5 prefer-server-ciphers
This option mitigates the effect of the BEAST attack (as I understand it), and
is equivalent to :
- Apache HTTPd SSLHonorCipherOrder option.
- Nginx ssl_prefer_server_ciphers option.
On RHEL/CentOS, linux/futex.h uses a u32 type which is never declared
anywhere. Let's set it with a #define in order to fix the issue without
causing conflicts with possible typedefs on other platforms.
BUG: ssl: mark the connection as waiting for an SSL connection during the handshake
The WAIT_L6_CONN was designed especially to ensure that the connection
was not marked ready before the SSL layer was OK, but we forgot to set
the flag, resulting in a rejected handshake when ssl was combined with
accept-proxy because accept-proxy would validate the connection alone
and the SSL handshake would then believe in a client-initiated reneg
and kill it.
MEDIUM: config: add "nosslv3" and "notlsv1" on bind and server lines
This is aimed at disabling SSLv3 and TLSv1 respectively. SSLv2 is always
disabled. This can be used in some situations where one version looks more
suitable than the other.
This SSL session cache was developed at Exceliance and is the same that
was proposed for stunnel and stud. It makes use of a shared memory area
between the processes so that sessions can be handled by any process. It
is only useful when haproxy runs with nbproc > 1, but it does not hurt
performance at all with nbproc = 1. The aim is to totally replace OpenSSL's
internal cache.
The cache is optimized for Linux >= 2.6 and specifically for x86 platforms.
On Linux/x86, it makes use of futexes for inter-process locking, with some
x86 assembly for the locked instructions. On other architectures, GCC
builtins are used instead, which are available starting from gcc 4.1.
On other operating systems, the locks fall back to pthread mutexes so
libpthread is automatically linked. It is not recommended since pthreads
are much slower than futexes. The lib is only linked if SSL is enabled.
MINOR: ssl: disable TCP quick-ack by default on SSL listeners
Since the SSL handshake involves an immediate reply from the server
to the client, there's no point responding with a quick-ack before
sending the data, so disable quick-ack by default, just as it is done
for HTTP.
This shows a 2-2.5% transaction rate increase on a dual-core atom.
Emeric Brun [Fri, 18 May 2012 13:48:30 +0000 (15:48 +0200)]
BUILD: add optional support for SSL via the USE_OPENSSL flag
When this flag is set, the SSL data layer is enabled.
At the moment, only the GNU makefile was touched, the other ones
make the option handling a bit tricky.
MEDIUM: ssl: protect against client-initiated renegociation
CVE-2009-3555 suggests that client-initiated renegotiation should be
prevented in the middle of data. The workaround here consists in having
the SSL layer notify our callback about a handshake occurring, which in
turn causes the connection to be marked in the error state if it was
already considered established (which means if a previous handshake was
completed). The result is that the connection with the client is immediately
aborted and any pending data are dropped.
Emeric Brun [Fri, 18 May 2012 14:02:00 +0000 (16:02 +0200)]
MEDIUM: config: add support for the 'ssl' option on 'server' lines
This option currently takes no argument and simply turns SSL on for all
connections going to the server. It is likely that more options will
be needed in the future.
Emeric Brun [Fri, 18 May 2012 13:47:34 +0000 (15:47 +0200)]
MEDIUM: ssl: add new files ssl_sock.[ch] to provide the SSL data layer
This data layer supports socket-to-buffer and buffer-to-socket operations.
No sock-to-pipe nor pipe-to-sock functions are provided, since splicing does
not provide any benefit with data transformation. At best it could save a
memcpy() and avoid keeping a buffer allocated but that does not seem very
useful.
An init function and a close function are provided because the SSL context
needs to be allocated/freed.
A data-layer shutw() function is also provided because upon successful
shutdown, we want to store the SSL context in the cache in order to reuse
it for future connections and avoid a new key generation.
The handshake function is directly called from the connection handler.
At this point it is not certain whether this will remain this way or
if a new ->handshake callback will be added to the data layer so that
the connection handler doesn't care about SSL.
The sock-to-buf and buf-to-sock functions are all capable of enabling
the SSL handshake at any time. This also implies polling in the opposite
direction to what was expected. The upper layers must take that into
account (it is OK right now with the stream interface).
CLEANUP: includes: fix includes for a number of users of fd.h
It appears that fd.h includes a number of unneeded files and was
included from standard.h, and as such served as an intermediary
to provide almost everything to everyone.
By removing its useless includes, a long dependency chain broke
but could easily be fixed.
CLEANUP: fdtab: flatten the struct and merge the spec struct with the rest
The "spec" sub-struct was using 8 bytes for only 5 needed. There is no
reason to keep it as a struct, it doesn't bring any value. By flattening
it, we can merge the single byte with the next single byte, resulting in
an immediate saving of 4 bytes (20%). Interestingly, tests have shown a
steady performance gain of 0.6% after this change, which can possibly be
attributed to a more cache-line friendly struct.
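The effect of flattening a small sub-struct can be illustrated with the sketch
below. The field names and types are assumptions chosen to reproduce the
"5 bytes needed, 8 bytes used" situation, they are not the real fdtab layout.
Exact sizes depend on the ABI, so the code only computes the difference.

```c
#include <assert.h>
#include <stddef.h>

/* Before: the sub-struct is padded on its own, then the next byte is padded
 * again.
 */
struct fdtab_nested {
    int (*iocb)(int fd);
    void *owner;
    struct {
        unsigned int s1;  /* 4 bytes */
        unsigned char e;  /* 1 byte, typically followed by 3 padding bytes */
    } spec;
    unsigned char ev;     /* 1 byte, typically followed by padding again */
};

/* After: both single bytes share the same padded slot. */
struct fdtab_flat {
    int (*iocb)(int fd);
    void *owner;
    unsigned int spec_s1;
    unsigned char spec_e;
    unsigned char ev;
};

/* Returns the number of bytes saved per entry on the current ABI. */
static size_t fdtab_bytes_saved(void)
{
    assert(sizeof(struct fdtab_flat) <= sizeof(struct fdtab_nested));
    return sizeof(struct fdtab_nested) - sizeof(struct fdtab_flat);
}
```

On common 32-bit ABIs this sketch goes from 20 to 16 bytes, which matches the
4 bytes (20%) mentioned above. On common 64-bit ABIs pointer alignment makes
the saving larger.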
These flags were added for TCP_CORK. They were only set at various places
but never checked by any user since TCP_CORK was replaced with MSG_MORE.
Simply get rid of this now.
MAJOR: stream_interface: continue to update data polling flags during handshakes
Since data and socket polling flags were split, it became possible to update
data flags even during handshakes. In fact this is very important otherwise
it is not possible to poll for writes if some data are to be forwarded during
a handshake (eg: data received during an SSL connect).
MEDIUM: connection: avoid calling handshakes when polling is required
If a data handler suddenly switches to a handshake mode and detects the
need for polling in either direction, we don't want to loop again through
the handshake handlers because we know we won't be able to do anything.
Similarly, we don't want to call again the data handlers after a loop
through the handshake handlers if polling is required.
No performance change was observed, it might only be observed during
high rate SSL renegotiation.
I/O handlers now all use __conn_{sock,data}_{stop,poll,want}_* instead
of returning dummy flags. The code has become slightly simpler because
some tricks such as the MIN_RET_FOR_READ_LOOP are not needed anymore,
and the data handlers which switch to a handshake handler do not need
to disable themselves anymore.
MEDIUM: connection: only call tcp_connect_probe when nothing was attempted yet
It was observed that after a failed send() on EAGAIN, a second connect()
would still be attempted in tcp_connect_probe() because there was no way
to know that a send() had failed.
By checking the WANT_WR status flag, we know if a previous write attempt
failed on EAGAIN, so we don't try to connect again if we know this has
already failed.
With this simple change, the second connect() has disappeared.
Polling flags were set for data and sock layer, but while this does make
sense for the ENA flag, it does not for the POL flag which translates the
detection of an EAGAIN condition. So now we remove the {DATA,SOCK}_POL*
flags and instead introduce two new layer-independent flags (WANT_RD and
WANT_WR). These flags are only set when an EAGAIN is encountered so that
polling can be enabled.
In order for these flags to have any meaning they are not persistent and
have to be cleared by the connection handler before calling the I/O and
data callbacks. For this reason, change detection has been slightly
improved. Instead of comparing the WANT_* flags with CURR_*_POL, we only
check if the ENA status changes, or if the polling appears, since we don't
want to detect the useless poll to ena transition. Tests show that this
has eliminated one useless call to __fd_clr().
Finally the conn_set_polling() function, which was becoming complex and
required complex operations from the caller, was split in two and merged into
its only two callers (conn_update_data_polling and conn_update_sock_polling).
The two functions are now much smaller due to the less complex conditions.
Note that it would be possible to re-merge them and only pass a mask but
this does not seem very useful.
Willy Tarreau [Fri, 31 Aug 2012 15:43:29 +0000 (17:43 +0200)]
MAJOR: connection: make the PROXY decoder a handshake handler
The PROXY protocol is now decoded in the connection before other
handshakes. This means that it may be extracted from a TCP stream
before SSL is decoded from this stream.
Willy Tarreau [Fri, 31 Aug 2012 14:01:23 +0000 (16:01 +0200)]
MAJOR: session: introduce embryonic sessions
When an incoming connection request is accepted, a connection
structure is needed to store its state. However we don't want to
fully initialize a session until the data layer is about to be
ready.
As long as the connection is physically stored into the session,
it's not easy to split both allocations.
As such, we only initialize the minimum requirements of a session,
which results in what we call an embryonic session. Then once the
data layer is ready, we can complete the session's initialization.
Doing so avoids buffers allocation and ensures that a session only
sees ready connections.
The frontend's client timeout is used as the handshake timeout. It
is likely that another timeout will be used in the future.
Willy Tarreau [Fri, 31 Aug 2012 11:54:11 +0000 (13:54 +0200)]
MEDIUM: connection: add an ->init function to data layer
SSL needs to initialize the data layer before proceeding with data. At
the moment, this data layer is automatically initialized from itself,
which will not be possible once we extract connection from sessions
since we'll only create the data layer once the handshake is finished.
So let's have the application layer initialize the data layer before
using it.
Willy Tarreau [Thu, 30 Aug 2012 20:59:48 +0000 (22:59 +0200)]
MINOR: tcp: replace tcp_src_to_stktable_key with addr_to_stktable_key
Make it more obvious that this function does not depend on any knowledge
of the session. This is important to plan for TCP rules that can run on
connection without any initialized session yet.
Willy Tarreau [Thu, 30 Aug 2012 20:23:13 +0000 (22:23 +0200)]
MEDIUM: proto_tcp: remove any dependence on stream_interface
The last uses of the stream interfaces were in tcp_connect_server() and
could easily and more appropriately be moved to its callers, si_connect()
and connect_server(), making a lot more sense.
Now the function should theoretically be usable for health checks.
It also appears more obvious that the file is split into two distinct
parts :
- the protocol layer used at the connection level
- the tcp analysers executing tcp-* rules and their samples/acls.
These ones are implicitly handled by the connection's data layer, no need
to rely on them anymore and reaching them maintains undesired dependences
on stream-interface.
Willy Tarreau [Thu, 30 Aug 2012 19:11:38 +0000 (21:11 +0200)]
MAJOR: connection: move the addr field from the stream_interface
We need to have the source and destination addresses in the connection.
They were lying in the stream interface so let's move them. The flags
SI_FL_FROM_SET and SI_FL_TO_SET have been moved as well.
It's worth noting that tcp_connect_server() almost does not use the
stream interface anymore except for a few flags.
It has been identified that once we detach the connection from the SI,
it will probably be needed to keep a copy of the server-side addresses
in the SI just for logging purposes. This has not been implemented right
now though.
Some functions provided by channel.[ch] have kept their "buffer" name because
they are really designed to act on the buffer according to some information
gathered from the channel. They have been moved together to the same place in
the file for better readability but they were not changed at all.
The "buffer" memory pool was also renamed "channel".
Willy Tarreau [Mon, 27 Aug 2012 20:08:00 +0000 (22:08 +0200)]
REORG: channel: move buffer_{replace,insert_line}* to buffer.{c,h}
These functions do not depend on the channel flags anymore thus they're
much better suited to be used on plain buffers. Move them from channel
to buffer.
Willy Tarreau [Mon, 27 Aug 2012 18:53:34 +0000 (20:53 +0200)]
MAJOR: channel: remove the BF_FULL flag
This is similar to the recent removal of BF_OUT_EMPTY. This flag was very
problematic because it relies on permanently changing information such as the
to_forward value, so it had to be updated upon every change to the buffers.
Previous patch already got rid of its users.
One part of the change is sensitive : the flag was also part of BF_MASK_STATIC,
which is used by process_session() to rescan all analysers in case the flag's
status changes. At first glance, none of the analysers seems to change its
mind based on this flag when it is subject to change, so it seems fine not to
add variation checks here. Otherwise it's possible that checking the buffer's
input and output is more reliable than checking the flag's replacement.
Willy Tarreau [Mon, 27 Aug 2012 18:46:07 +0000 (20:46 +0200)]
MAJOR: channel: stop relying on BF_FULL to take action
This flag is quite complex to get right and updating it everywhere is a
major pain, especially since the buffer/channel split. This is the first
step of getting rid of it. Instead now it's dynamically computed whenever
needed.
Willy Tarreau [Fri, 24 Aug 2012 20:40:29 +0000 (22:40 +0200)]
MAJOR: channel: remove the BF_OUT_EMPTY flag
This flag was very problematic because it was composite in that both changes
to the pipe or to the buffer had to cause this flag to be updated, which is
not always simple (eg: there may not even be a channel attached to a buffer
at all).
There were not that many users of this flag, mostly setters. So the flag got
replaced with a macro which reports whether the channel is empty or not, by
checking both the pipe and the buffer.
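The replacement check can be sketched as below. The structures are reduced to
the two counters that matter and are hypothetical, and an inline function is
used here where the commit mentions a macro.

```c
#include <assert.h>
#include <stddef.h>

/* Reduced stand-ins for the real pipe, buffer and channel structures. */
struct pipe_sketch {
    unsigned int data;         /* bytes currently held in the pipe */
};

struct buffer_sketch {
    unsigned int o;            /* bytes scheduled for output */
};

struct channel_sketch {
    struct buffer_sketch buf;
    struct pipe_sketch *pipe;  /* NULL when no pipe is attached */
};

/* Returns non-zero when nothing remains to be sent, neither in the buffer
 * nor in the pipe. The state is computed on demand instead of being cached
 * in a flag.
 */
static inline int channel_is_empty_sketch(const struct channel_sketch *c)
{
    assert(c != NULL);
    return !(c->buf.o | (c->pipe ? c->pipe->data : 0));
}
```

Computing the state on demand removes the need to update a flag each time the
pipe or the buffer changes.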
One part of the change is sensitive : the flag was also part of BF_MASK_STATIC,
which is used by process_session() to rescan all analysers in case the flag's
status changes. At first glance, none of the analysers seems to change its
mind based on this flag when it is subject to change, so it seems fine not to
add variation checks here. Otherwise it's possible that checking the buffer's
output size is more useful than checking the flag's replacement.
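The replacement of the flag by a macro that checks both the pipe and the
buffer can be pictured roughly as follows. This is only a sketch: the type
names, field names and the macro name are hypothetical and were chosen for
the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; the real channel and pipe structures differ. */
struct sketch_pipe {
    size_t data;              /* bytes currently held in the pipe */
};

struct sketch_chan {
    struct sketch_pipe *pipe; /* NULL when no pipe is attached */
    size_t buf_out;           /* bytes pending in the buffer's output part */
};

/* Empty means: nothing left to send in the buffer nor in the pipe. */
#define sketch_channel_is_empty(c) \
    ((c)->buf_out == 0 && (!(c)->pipe || (c)->pipe->data == 0))

/* Usage example covering the three situations. */
static void sketch_usage(void)
{
    struct sketch_pipe p = { 0 };
    struct sketch_chan c = { NULL, 0 };

    assert(sketch_channel_is_empty(&c));   /* nothing anywhere */
    c.buf_out = 3;
    assert(!sketch_channel_is_empty(&c));  /* buffer still holds output */
    c.buf_out = 0;
    c.pipe = &p;
    p.data = 5;
    assert(!sketch_channel_is_empty(&c));  /* pipe still holds data */
}
```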
Willy Tarreau [Fri, 24 Aug 2012 16:12:41 +0000 (18:12 +0200)]
CLEANUP: connection: split sock_ops into data_ops, app_cp and si_ops
Some parts of the sock_ops structure were only used by the stream
interface and have been moved into si_ops. Some of them were callbacks
to the stream interface from the connection and have been moved into
app_cp as they're the application seen from the connection (later,
health-checks will need to use them). The rest has moved to data_ops.
Normally at this point the connection could live without knowing about
stream interfaces at all.
Willy Tarreau [Fri, 24 Aug 2012 10:53:56 +0000 (12:53 +0200)]
MAJOR: stream-interface: fix splice not to call chk_snd by itself
In recent splice fixes we made splice call chk_snd, but this was due
to inappropriate checks in conn_notify_si() which prevented the chk_snd()
call from being performed. Now that this has been fixed, remove this
duplicate code.
Willy Tarreau [Fri, 24 Aug 2012 10:52:22 +0000 (12:52 +0200)]
MAJOR: stream-interface: don't commit polling changes in every callback
It's more efficient to centralize polling changes, which is already done
in the connection handler. So now all I/O callbacks just change flags and
rely on the connection handler for the commit. The special case of the
send loop is handled by the chk_snd() function which does an update at
the end.
Willy Tarreau [Fri, 24 Aug 2012 10:14:49 +0000 (12:14 +0200)]
MEDIUM: proxy-proto: don't use buffer flags in conn_si_send_proxy()
These ones should only be handled by the stream interface at the end
of the handshake now. Similarly, several pieces of information are now taken
at the connection level rather than at the data level (eg: shutdown).
Fast polling updates have been used instead of slow ones since the
function is only called by the connection handler.
Willy Tarreau [Fri, 24 Aug 2012 10:12:53 +0000 (12:12 +0200)]
MAJOR: stream-interface: make conn_notify_si() more robust
This function was relying on the result of file descriptor polling
which is inappropriate as it may be subject to race conditions during
handshakes. Make it more robust by relying solely on buffer activity.
The splicing is now provided by the data-layer rcv_pipe/snd_pipe functions
which in turn are called by the stream interface's recv and send callbacks.
The presence of the rcv_pipe/snd_pipe functions is used to attest support
for splicing at the data layer. It looks like the stream-interface's
SI_FL_CAP_SPLICE flag does not make sense anymore as it's used as a proxy
for the pointers above.
It also appears that we call chk_snd() from the recv callback and then
try to call it again in update_conn(). It is very likely that this last
function will progressively slip into the recv/send callbacks in order
to avoid duplicate check code.
The code works right now with and without splicing. Only raw_sock provides
support for it and it is automatically selected when the various splice
options are set. However it looks like splice-auto doesn't enable it, which
possibly means that the streamer detection code does not work anymore, or
that it's only called at a time when it's too late to enable splicing (in
process_session).
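The idea that the presence of the rcv_pipe/snd_pipe functions attests support
for splicing can be sketched with a table of function pointers. Only the four
callback names come from the messages in this log; the structure names, the
signatures and the two example tables are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_conn;

/* Hypothetical data-layer operations table. */
struct sketch_data_ops {
    int (*rcv_buf)(struct sketch_conn *c);
    int (*snd_buf)(struct sketch_conn *c);
    int (*rcv_pipe)(struct sketch_conn *c); /* NULL: no splicing on receive */
    int (*snd_pipe)(struct sketch_conn *c); /* NULL: no splicing on send */
};

struct sketch_conn {
    const struct sketch_data_ops *data;
};

/* Dummy callback standing in for a real implementation. */
static int sketch_noop(struct sketch_conn *c)
{
    (void)c;
    return 0;
}

/* A layer that provides the pipe functions, and one that does not. */
static const struct sketch_data_ops sketch_raw_ops = {
    sketch_noop, sketch_noop, sketch_noop, sketch_noop
};
static const struct sketch_data_ops sketch_nopipe_ops = {
    sketch_noop, sketch_noop, NULL, NULL
};

/* Splicing is usable only when both pipe callbacks are present, so no
 * separate capability flag has to be maintained.
 */
static int sketch_can_splice(const struct sketch_conn *c)
{
    assert(c && c->data);
    return c->data->rcv_pipe != NULL && c->data->snd_pipe != NULL;
}
```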
Willy Tarreau [Tue, 21 Aug 2012 16:22:06 +0000 (18:22 +0200)]
MAJOR: connection: split the send call into connection and stream interface
Similar to what was done on the receive path, the data layer now provides
only an snd_buf() callback that is iterated over by the stream interface's
si_conn_send_loop() function.
The data layer now has no knowledge about channels nor stream interfaces.
The splice() code still needs to be ported as it currently is disabled.
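The split between a data-layer snd_buf() callback and a stream-interface loop
iterating over it can be sketched as below. The signature given to snd_buf(),
the structures and the in-memory "sink" are assumptions for the example; the
real si_conn_send_loop() is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Pending output, as seen by the caller of the loop (hypothetical). */
struct sketch_buf {
    const char *data;
    size_t len;
};

/* Hypothetical connection: the data layer only sees raw bytes and knows
 * nothing about channels or stream interfaces.
 */
struct sketch_conn {
    /* returns the number of bytes consumed, 0 when it cannot progress */
    size_t (*snd_buf)(struct sketch_conn *c, const char *p, size_t n);
    char sink[64];       /* stands in for the socket */
    size_t sunk;         /* bytes written to the sink so far */
    size_t max_per_call; /* simulates partial writes */
};

/* Example data-layer sender doing bounded, partial writes into the sink. */
static size_t sketch_mem_snd_buf(struct sketch_conn *c, const char *p, size_t n)
{
    size_t room = sizeof(c->sink) - c->sunk;

    if (n > c->max_per_call)
        n = c->max_per_call;
    if (n > room)
        n = room;
    memcpy(c->sink + c->sunk, p, n);
    c->sunk += n;
    return n;
}

/* Caller-side loop: keeps calling snd_buf() until everything is sent or
 * the data layer reports that it cannot make progress.
 */
static size_t sketch_send_loop(struct sketch_conn *c, struct sketch_buf *b)
{
    size_t total = 0;

    assert(c && c->snd_buf && b);
    while (b->len) {
        size_t n = c->snd_buf(c, b->data, b->len);

        if (!n)
            break;
        b->data += n;
        b->len -= n;
        total += n;
    }
    return total;
}
```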
Willy Tarreau [Mon, 20 Aug 2012 19:41:06 +0000 (21:41 +0200)]
REORG: stream-interface: move sock_raw_read() to si_conn_recv_cb()
The recv function is now generic and is usable to iterate any connection-to-buf
reading function from a stream interface. So let's move it to stream-interface.
Willy Tarreau [Mon, 20 Aug 2012 15:30:32 +0000 (17:30 +0200)]
MAJOR: raw_sock: extract raw_sock_to_buf() from raw_sock_read()
This is the start of the stream connection iterator which calls the
data-layer reader. This still looks a bit tricky but is OK. Splicing
is not handled at all at the moment.
Willy Tarreau [Mon, 20 Aug 2012 15:01:35 +0000 (17:01 +0200)]
REORG: sock_raw: rename the files raw_sock*
The "raw_sock" prefix will be more convenient for naming functions as
it will be prefixed with the data layer and suffixed with the data
direction. So let's rename the files now to avoid any further confusion.
The #include directive was also removed from a number of files which do
not need it anymore.