From: Willy Tarreau Date: Fri, 17 Nov 2023 09:56:33 +0000 (+0100) Subject: BUG/MEDIUM: mux-fcgi: fail earlier on malloc in takeover() X-Git-Tag: v2.9-dev10~27 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=d069825c5f59a22ec14577cfbd3bce4af0968ee7;p=thirdparty%2Fhaproxy.git BUG/MEDIUM: mux-fcgi: fail earlier on malloc in takeover() This is the equivalent of the previous "BUG/MEDIUM: mux-h1: fail earlier on malloc in takeover()". Connection takeover was implemented for fcgi in 2.2 by commit a41bb0b6c ("MEDIUM: mux_fcgi: Implement the takeover() method."). It does have one corner case related to memory allocation failure: in case the task or tasklet allocation fails, the connection gets released synchronously. Unfortunately the situation is bad there, because the lower layers are already switched to the new thread while the tasklet is either NULL or still the old one, and calling fcgi_release() will also result in touching the thread-local list of buffer waiters, calling unsubscribe(), There are even code paths where the thread will try to grab the lock of its own idle conns list, believing the connection is there while it has no useful effect. However, if the owner thread was doing the same at the same moment, and ended up trying to pick from the current thread (which could happen if picking a connection for a different name), the two could even deadlock. No tests were made to try to reproduce the problem, but the description above is sufficient to see that nothing can guarantee against it. This patch takes a simple but radically different approach. Instead of starting to migrate the connection before risking to face allocation failures, it first pre-allocates a new task and tasklet, then assigns them to the connection if the migration succeeds, otherwise it just frees them. This way it's no longer needed to manipulate the connection until it's fully migrated, and as a bonus this means the connection will continue to exist and the use-after-free condition is solved at the same time. This should be backported to 2.2. Thanks to Fred for the initial analysis of the problem! --- diff --git a/src/mux_fcgi.c b/src/mux_fcgi.c index f14de49754..d41318bb3f 100644 --- a/src/mux_fcgi.c +++ b/src/mux_fcgi.c @@ -4142,9 +4142,19 @@ static int fcgi_takeover(struct connection *conn, int orig_tid) { struct fcgi_conn *fcgi = conn->ctx; struct task *task; + struct task *new_task; + struct tasklet *new_tasklet; + + /* Pre-allocate tasks so that we don't have to roll back after the xprt + * has been migrated. + */ + new_task = task_new_here(); + new_tasklet = tasklet_new(); + if (!new_task || !new_tasklet) + goto fail; if (fd_takeover(conn->handle.fd, conn) != 0) - return -1; + goto fail; if (conn->xprt->takeover && conn->xprt->takeover(conn, conn->xprt_ctx, orig_tid) != 0) { /* We failed to takeover the xprt, even if the connection may @@ -4154,44 +4164,49 @@ static int fcgi_takeover(struct connection *conn, int orig_tid) */ conn->flags |= CO_FL_ERROR; tasklet_wakeup_on(fcgi->wait_event.tasklet, orig_tid); - return -1; + goto fail; } if (fcgi->wait_event.events) fcgi->conn->xprt->unsubscribe(fcgi->conn, fcgi->conn->xprt_ctx, fcgi->wait_event.events, &fcgi->wait_event); - /* To let the tasklet know it should free itself, and do nothing else, - * set its context to NULL; - */ - fcgi->wait_event.tasklet->context = NULL; - tasklet_wakeup_on(fcgi->wait_event.tasklet, orig_tid); task = fcgi->task; if (task) { + /* only assign a task if there was already one, otherwise + * the preallocated new task will be released. + */ task->context = NULL; fcgi->task = NULL; __ha_barrier_store(); task_kill(task); - fcgi->task = task_new_here(); - if (!fcgi->task) { - fcgi_release(fcgi); - return -1; - } + fcgi->task = new_task; + new_task = NULL; fcgi->task->process = fcgi_timeout_task; fcgi->task->context = fcgi; } - fcgi->wait_event.tasklet = tasklet_new(); - if (!fcgi->wait_event.tasklet) { - fcgi_release(fcgi); - return -1; - } + + /* To let the tasklet know it should free itself, and do nothing else, + * set its context to NULL; + */ + fcgi->wait_event.tasklet->context = NULL; + tasklet_wakeup_on(fcgi->wait_event.tasklet, orig_tid); + + fcgi->wait_event.tasklet = new_tasklet; fcgi->wait_event.tasklet->process = fcgi_io_cb; fcgi->wait_event.tasklet->context = fcgi; fcgi->conn->xprt->subscribe(fcgi->conn, fcgi->conn->xprt_ctx, SUB_RETRY_RECV, &fcgi->wait_event); + if (new_task) + __task_free(new_task); return 0; + fail: + if (new_task) + __task_free(new_task); + tasklet_free(new_tasklet); + return -1; } /****************************************/