From: Willy Tarreau Date: Fri, 17 Nov 2023 09:56:33 +0000 (+0100) Subject: BUG/MEDIUM: mux-h2: fail earlier on malloc in takeover() X-Git-Tag: v2.9-dev10~29 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=4f02e3da67aece3476e78f43f1cea45685edc48a;p=thirdparty%2Fhaproxy.git BUG/MEDIUM: mux-h2: fail earlier on malloc in takeover() Connection takeover was implemented for H2 in 2.2 by commit cd4159f03 ("MEDIUM: mux_h2: Implement the takeover() method."). It does have one corner case related to memory allocation failure: in case the task or tasklet allocation fails, the connection gets released synchronously. Unfortunately the situation is bad there, because the lower layers are already switched to the new thread while the tasklet is either NULL or still the old one, and calling h2_release() will also result in h2_process() and h2_process_demux() that may process any possibly pending frames. Even the session remains the old one on the old thread, so that some sess_log() that are called when facing certain demux errors will be associated with the previous thread, possibly accessing a number of elements belonging to another thread. There are even code paths where the thread will try to grab the lock of its own idle conns list, believing the connection is there while it has no useful effect. However, if the owner thread was doing the same at the same moment, and ended up trying to pick from the current thread (which could happen if picking a connection for a different name), the two could even deadlock. The risk is extremely low, but Fred managed to reproduce use-after-free errors in conn_backend_get() after a takeover() failed by playing with -dMfail, indicating that h2_release() had been successfully called. In practise it's sufficient to have h2 on the server side with reuse-always and to inject lots of request on it with -dMfail. This patch takes a simple but radically different approach. Instead of starting to migrate the connection before risking to face allocation failures, it first pre-allocates a new task and tasklet, then assigns them to the connection if the migration succeeds, otherwise it just frees them. This way it's no longer needed to manipulate the connection until it's fully migrated, and as a bonus this means the connection will continue to exist and the use-after-free condition is solved at the same time. This should be backported to 2.2. Thanks to Fred for the initial analysis of the problem! --- diff --git a/src/mux_h2.c b/src/mux_h2.c index 3883e15f6e..886a269824 100644 --- a/src/mux_h2.c +++ b/src/mux_h2.c @@ -7261,9 +7261,19 @@ static int h2_takeover(struct connection *conn, int orig_tid) { struct h2c *h2c = conn->ctx; struct task *task; + struct task *new_task; + struct tasklet *new_tasklet; + + /* Pre-allocate tasks so that we don't have to roll back after the xprt + * has been migrated. + */ + new_task = task_new_here(); + new_tasklet = tasklet_new(); + if (!new_task || !new_tasklet) + goto fail; if (fd_takeover(conn->handle.fd, conn) != 0) - return -1; + goto fail; if (conn->xprt->takeover && conn->xprt->takeover(conn, conn->xprt_ctx, orig_tid) != 0) { /* We failed to takeover the xprt, even if the connection may @@ -7273,44 +7283,49 @@ static int h2_takeover(struct connection *conn, int orig_tid) */ conn->flags |= CO_FL_ERROR; tasklet_wakeup_on(h2c->wait_event.tasklet, orig_tid); - return -1; + goto fail; } if (h2c->wait_event.events) h2c->conn->xprt->unsubscribe(h2c->conn, h2c->conn->xprt_ctx, h2c->wait_event.events, &h2c->wait_event); - /* To let the tasklet know it should free itself, and do nothing else, - * set its context to NULL. - */ - h2c->wait_event.tasklet->context = NULL; - tasklet_wakeup_on(h2c->wait_event.tasklet, orig_tid); task = h2c->task; if (task) { + /* only assign a task if there was already one, otherwise + * the preallocated new task will be released. + */ task->context = NULL; h2c->task = NULL; __ha_barrier_store(); task_kill(task); - h2c->task = task_new_here(); - if (!h2c->task) { - h2_release(h2c); - return -1; - } + h2c->task = new_task; + new_task = NULL; h2c->task->process = h2_timeout_task; h2c->task->context = h2c; } - h2c->wait_event.tasklet = tasklet_new(); - if (!h2c->wait_event.tasklet) { - h2_release(h2c); - return -1; - } + + /* To let the tasklet know it should free itself, and do nothing else, + * set its context to NULL. + */ + h2c->wait_event.tasklet->context = NULL; + tasklet_wakeup_on(h2c->wait_event.tasklet, orig_tid); + + h2c->wait_event.tasklet = new_tasklet; h2c->wait_event.tasklet->process = h2_io_cb; h2c->wait_event.tasklet->context = h2c; h2c->conn->xprt->subscribe(h2c->conn, h2c->conn->xprt_ctx, SUB_RETRY_RECV, &h2c->wait_event); + if (new_task) + __task_free(new_task); return 0; + fail: + if (new_task) + __task_free(new_task); + tasklet_free(new_tasklet); + return -1; } /*******************************************************/