fs/eventpoll: fix endless busy loop after timeout has expired
author Max Kellermann <max.kellermann@ionos.com>
Tue, 29 Apr 2025 18:58:27 +0000 (20:58 +0200)
committer Christian Brauner <brauner@kernel.org>
Fri, 2 May 2025 12:21:26 +0000 (14:21 +0200)
After commit 0a65bc27bd64 ("eventpoll: Set epoll timeout if it's in
the future"), the following program would immediately enter a busy
loop in the kernel:

```
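#include <sys/epoll.h>
#include <time.h>

int main() {
  int e = epoll_create1(0);
  struct epoll_event event = {.events = EPOLLIN};
  /* Watch fd 0 (stdin) so the epoll set is not empty. */
  epoll_ctl(e, EPOLL_CTL_ADD, 0, &event);
  /* Non-zero timeout of 1 ns; it has usually expired by the time ep_poll() runs. */
  const struct timespec timeout = {.tv_nsec = 1};
  epoll_pwait2(e, &event, 1, &timeout, 0);
}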
```

This happens because the given (non-zero) timeout of 1 nanosecond
usually expires before ep_poll() is entered and then
ep_schedule_timeout() returns false, but `timed_out` is never set
because the code line that sets it is skipped.  This quickly turns
into a soft lockup, RCU stalls and deadlocks, inflicting severe
headaches on the whole system.

When the timeout has expired, we don't need to schedule a hrtimer, but
we should set the `timed_out` variable.  Therefore, I suggest moving
the ep_schedule_timeout() check into the `timed_out` expression
instead of skipping it.

brauner: Note that there was an earlier fix by Joe Damato in response to
my bug report in [1].

Fixes: 0a65bc27bd64 ("eventpoll: Set epoll timeout if it's in the future")
Cc: Joe Damato <jdamato@fastly.com>
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
Link: https://lore.kernel.org/20250429153419.94723-1-jdamato@fastly.com [1]
Link: https://lore.kernel.org/20250429185827.3564438-1-max.kellermann@ionos.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 4bc264b854c436380c04c5df5491fd0b37aa9148..d4dbffdedd08e36a661e559b857a25349561ae62 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -2111,9 +2111,10 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 
                write_unlock_irq(&ep->lock);
 
-               if (!eavail && ep_schedule_timeout(to))
-                       timed_out = !schedule_hrtimeout_range(to, slack,
-                                                             HRTIMER_MODE_ABS);
+               if (!eavail)
+                       timed_out = !ep_schedule_timeout(to) ||
+                               !schedule_hrtimeout_range(to, slack,
+                                                         HRTIMER_MODE_ABS);
                __set_current_state(TASK_RUNNING);
 
                /*