From 34c78b8610a9befdcc139d34a7c98365f018026d Mon Sep 17 00:00:00 2001
From: Caleb Sander Mateos
Date: Tue, 2 Dec 2025 13:57:44 -0700
Subject: [PATCH] io_uring/io-wq: always retry worker create on ERESTART*

If a task has a pending signal when create_io_thread() is called,
copy_process() will return -ERESTARTNOINTR. io_should_retry_thread()
will request a retry of create_io_thread() up to WORKER_INIT_LIMIT = 3
times. If all retries fail, the io_uring request will fail with
ECANCELED.

Commit 3918315c5dc ("io-wq: backoff when retrying worker creation")
added a linear backoff to allow the thread to handle its signal before
the retry. However, a thread receiving frequent signals may get unlucky
and have a signal pending at every retry. Since the userspace task
doesn't control when it receives signals, there's no easy way for it to
prevent the create_io_thread() failure due to pending signals. The task
may also lack the information necessary to regenerate the canceled SQE.

So always retry the create_io_thread() on the ERESTART* errors,
analogous to what a fork() syscall would do. EAGAIN can occur due to
various persistent conditions such as exceeding RLIMIT_NPROC, so respect
the WORKER_INIT_LIMIT retry limit for EAGAIN errors.

Signed-off-by: Caleb Sander Mateos
Signed-off-by: Jens Axboe
---
 io_uring/io-wq.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index 1d03b2fc4b259..cd13d8aac3d26 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -805,11 +805,12 @@ static inline bool io_should_retry_thread(struct io_worker *worker, long err)
 	 */
 	if (fatal_signal_pending(current))
 		return false;
-	if (worker->init_retries++ >= WORKER_INIT_LIMIT)
-		return false;
 
+	worker->init_retries++;
 	switch (err) {
 	case -EAGAIN:
+		return worker->init_retries <= WORKER_INIT_LIMIT;
+	/* Analogous to a fork() syscall, always retry on a restartable error */
 	case -ERESTARTSYS:
 	case -ERESTARTNOINTR:
 	case -ERESTARTNOHAND:
-- 
2.47.3