From: Christian Brauner
Date: Thu, 23 Apr 2026 09:56:09 +0000 (+0200)
Subject: eventpoll: fix ep_remove struct eventpoll / struct file UAF
X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=a6dc643c69311677c574a0f17a3f4d66a5f3744b;p=thirdparty%2Fkernel%2Fstable.git

eventpoll: fix ep_remove struct eventpoll / struct file UAF

ep_remove() (via ep_remove_file()) cleared file->f_ep under
file->f_lock but then kept using @file inside the critical section
(is_file_epoll(), hlist_del_rcu() through the head, spin_unlock).
A concurrent __fput() taking the eventpoll_release() fastpath in
that window observed the transient NULL, skipped
eventpoll_release_file() and ran to f_op->release / file_free().

For the epoll-watches-epoll case, f_op->release is
ep_eventpoll_release() -> ep_clear_and_put() -> ep_free(), which
kfree()s the watched struct eventpoll. Its embedded ->refs
hlist_head is exactly where epi->fllink.pprev points, so the
subsequent hlist_del_rcu()'s "*pprev = next" scribbles into freed
kmalloc-192 memory.

In addition, struct file is SLAB_TYPESAFE_BY_RCU, so the slot
backing @file could be recycled by alloc_empty_file() --
reinitializing f_lock and f_ep -- while ep_remove() is still
nominally inside that lock. The upshot is an attacker-controllable
kmem_cache_free() against the wrong slab cache.

Pin @file via epi_fget() at the top of ep_remove() and gate the
critical section on the pin succeeding. With the pin held @file
cannot reach refcount zero, which holds __fput() off and
transitively keeps the watched struct eventpoll alive across the
hlist_del_rcu() and the f_lock use, closing both UAFs.

If the pin fails @file has already reached refcount zero and its
__fput() is in flight. Because we bailed before clearing f_ep,
that path takes the eventpoll_release() slow path into
eventpoll_release_file() and blocks on ep->mtx until the waiter
side's ep_clear_and_put() drops it. The bailed epi's share of
ep->refcount stays intact, so the trailing
ep_refcount_dec_and_test() in ep_clear_and_put() cannot free the
eventpoll out from under eventpoll_release_file(); the orphaned
epi is then cleaned up there.

A successful pin also proves we are not racing
eventpoll_release_file() on this epi, so drop the now-redundant
re-check of epi->dying under f_lock. The cheap lockless
READ_ONCE(epi->dying) fast-path bailout stays.

Fixes: 58c9b016e128 ("epoll: use refcount to reduce ep_mutex contention")
Reported-by: Jaeyoung Chung
Link: https://patch.msgid.link/20260423-work-epoll-uaf-v1-6-2470f9eec0f5@kernel.org
Signed-off-by: Christian Brauner (Amutable)
---

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index 5ee4398a6cb8..0f785c0a1544 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -912,22 +912,26 @@ static bool ep_remove_epi(struct eventpoll *ep, struct epitem *epi)
  */
 static void ep_remove(struct eventpoll *ep, struct epitem *epi)
 {
-	struct file *file = epi->ffd.file;
+	struct file *file __free(fput) = NULL;
 
 	lockdep_assert_irqs_enabled();
 	lockdep_assert_held(&ep->mtx);
 
 	ep_unregister_pollwait(ep, epi);
 
-	/* sync with eventpoll_release_file() */
+	/* cheap sync with eventpoll_release_file() */
 	if (unlikely(READ_ONCE(epi->dying)))
 		return;
 
-	spin_lock(&file->f_lock);
-	if (epi->dying) {
-		spin_unlock(&file->f_lock);
+	/*
+	 * If we manage to grab a reference it means we're not in
+	 * eventpoll_release_file() and aren't going to be.
+	 */
+	file = epi_fget(epi);
+	if (!file)
 		return;
-	}
+
+	spin_lock(&file->f_lock);
 	ep_remove_file(ep, epi, file);
 
 	if (ep_remove_epi(ep, epi))