From d28022b89415371dd7954656f33a13add283ea11 Mon Sep 17 00:00:00 2001 From: Willy Tarreau Date: Mon, 27 Jul 2009 20:08:06 +0200 Subject: [PATCH] [BUG] fix random pauses on last segment of a series During a direct data transfer from the server to the client, if the system did not have enough buffers anymore, haproxy would not enable write polling again if it could write at least one data chunk. Under normal conditions, this would remain undetected because the remaining data would be pushed by next data chunks. However, when this happens on the last chunk of a session, or the last in a series in an interactive bidirectional TCP transfer, haproxy would only start sending again when the read timeout was reached on the side it stopped writing, causing long pauses on some protocols such as SQL. This bug was reported by an Exceliance customer who generously offered to help us by sending large amounts of traces and running various tests on production systems. It is quite hard to trigger it but it becomes easier with a ping-pong TCP service which transfers random data sizes, with a modified version of send() able to send packets smaller than the average transfer size. A cleaner fix would imply only updating the write timeout when data transfers are *attempted*, not succeeded, but that requires more sensible code changes without fixing the result. It is a candidate for a later patch though. (cherry picked from commit c54aef3180c00abc5abe33136f4cfbaa1328bdb1) --- src/stream_sock.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/src/stream_sock.c b/src/stream_sock.c index a6603a309f..bd8d271c11 100644 --- a/src/stream_sock.c +++ b/src/stream_sock.c @@ -986,6 +986,11 @@ void stream_sock_chk_snd(struct stream_interface *si) return; retval = stream_sock_write_loop(si, ob); + /* here, we have : + * retval < 0 if an error was encountered during write. + * retval = 0 if we can't write anymore without polling + * retval = 1 if we're invited to come back when desired + */ if (retval < 0) { /* Write error on the file descriptor. We mark the FD as STERROR so * that we don't use it anymore and we notify the task. @@ -997,7 +1002,11 @@ void stream_sock_chk_snd(struct stream_interface *si) goto out_wakeup; } - if (retval > 0 || (ob->send_max == 0 && !ob->pipe)) { + /* OK, so now we know that retval >= 0 means that some data might have + * been sent, and that we may have to poll first. We have to do that + * too if the buffer is not empty. + */ + if (ob->send_max == 0 && !ob->pipe) { /* the connection is established but we can't write. Either the * buffer is empty, or we just refrain from sending because the * send_max limit was reached. Maybe we just wrote the last @@ -1015,9 +1024,13 @@ void stream_sock_chk_snd(struct stream_interface *si) ob->wex = TICK_ETERNITY; } else { - /* (re)start writing. */ - si->flags &= ~SI_FL_WAIT_DATA; + /* Otherwise there are remaining data to be sent in the buffer, + * which means we have to poll before doing so. + */ EV_FD_COND_S(si->fd, DIR_WR); + si->flags &= ~SI_FL_WAIT_DATA; + if (!tick_isset(ob->wex)) + ob->wex = tick_add_ifset(now_ms, ob->wto); } if (likely(ob->flags & BF_WRITE_ACTIVITY)) { -- 2.47.3