eventpoll: hoist CTL_ADD scratch state into struct ep_ctl_ctx

Three globals were shared between the loop-check and path-check code
paths: tfile_check_list (chain of epitems_head to walk afterwards),
path_count[] (per-depth wakeup-path tally) and inserting_into
(cycle-detection sentinel). All three are scratch state used only
during a single EPOLL_CTL_ADD full_check, yet they sit at file
scope and rely on epnested_mutex for exclusion.

The area has had three bugs in the last year -- CVE-2025-38349,
f2e467a48287 ("eventpoll: Fix semi-unbounded recursion"), and
fdcfce93073d ("eventpoll: Fix integer overflow in
ep_loop_check_proc()") -- all rooted in the shared-mutable-global
pattern being hard to reason about.

Collect the three into a stack-allocated struct ep_ctl_ctx:

	struct ep_ctl_ctx {
		struct eventpoll *inserting_into;
		struct epitems_head *tfile_check_list;
		int path_count[PATH_ARR_SIZE];
	};

do_epoll_ctl() zero-initializes one on its stack and plumbs it
through ep_ctl_lock() / ep_ctl_unlock() / ep_insert() /
ep_register_epitem() / list_file() / ep_loop_check() /
ep_loop_check_proc() / reverse_path_check() /
reverse_path_check_proc() / path_count_inc() / path_count_init() /
clear_tfile_check_list(). Non-nested inserts leave the ctx zeroed
and skip the machinery entirely.
With the scratch state in ctx:
- tfile_check_list no longer has an EP_UNACTIVE_PTR sentinel --
NULL is the obvious "empty" value and the zero-init handles it
for free;
- path_count[] is no longer an array global that could be touched
in unexpected orderings;
- inserting_into is scoped to the exact call that set it.
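
As a rough userspace illustration of the first two points (not the
code in this patch), the sketch below keeps the check list and the
per-depth tally in a zero-initialized ctx. The helper names
ctx_list_push(), ctx_list_clear() and ctx_path_count_inc(), the
PATH_ARR_SIZE value and the path_limits[] numbers are assumptions
made for the example.

```c
#include <stddef.h>

#define PATH_ARR_SIZE 5	/* assumed value, for the sketch only */

struct eventpoll;	/* opaque here */

struct epitems_head {
	struct epitems_head *next;
};

struct ep_ctl_ctx {
	struct eventpoll *inserting_into;
	struct epitems_head *tfile_check_list;
	int path_count[PATH_ARR_SIZE];
};

/* Illustrative per-depth limits; not taken from the patch. */
static const int path_limits[PATH_ARR_SIZE] = { 1000, 500, 100, 50, 10 };

/* Push head onto the ctx-local list. An empty list is simply NULL. */
static void ctx_list_push(struct ep_ctl_ctx *ctx, struct epitems_head *head)
{
	head->next = ctx->tfile_check_list;
	ctx->tfile_check_list = head;
}

/* Unlink every entry; returns how many entries were on the list. */
static int ctx_list_clear(struct ep_ctl_ctx *ctx)
{
	int n = 0;

	while (ctx->tfile_check_list) {
		struct epitems_head *head = ctx->tfile_check_list;

		ctx->tfile_check_list = head->next;
		head->next = NULL;
		n++;
	}
	return n;
}

/* Tally one wakeup path at depth; -1 once that depth's limit is exceeded. */
static int ctx_path_count_inc(struct ep_ctl_ctx *ctx, int depth)
{
	if (depth < 0 || depth >= PATH_ARR_SIZE)
		return -1;
	if (++ctx->path_count[depth] > path_limits[depth])
		return -1;
	return 0;
}
```

A caller would declare "struct ep_ctl_ctx ctx = { 0 };" on its stack
and pass &ctx down. Two such contexts never see each other's tally,
which is the property the file-scope globals could only provide via
the mutex.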

loop_check_gen stays as a file-scope monotonic counter, because the
stamp left on ep->gen by a completed walk must not equal the stamp
of a future walk -- something a stack-local value cannot guarantee
across calls. It remains protected by epnested_mutex for the bump
and is read locklessly for the "do we need a full check" trigger in
ep_ctl_lock().
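
The stamp argument can be modelled in a few lines of userspace C.
This is a sketch only: walk_begin() and visit() are hypothetical
names, struct eventpoll is reduced to its gen field, locking is
omitted, and bumping the counter at the start of a walk is an
assumption of the model rather than a statement about where the
kernel bumps it.

```c
#include <stdbool.h>
#include <stdint.h>

struct eventpoll {
	uint64_t gen;	/* stamp left behind by the last walk that visited us */
};

/* File scope on purpose: the value has to outlive any single call. */
static uint64_t loop_check_gen;

/* Start a walk with a stamp that no earlier walk has used. */
static uint64_t walk_begin(void)
{
	return ++loop_check_gen;
}

/* True if ep was already visited with this stamp; stamps it otherwise. */
static bool visit(struct eventpoll *ep, uint64_t gen)
{
	if (ep->gen == gen)
		return true;
	ep->gen = gen;
	return false;
}
```

If each call started its own counter at the same value instead, the
first visit() of a later walk would find the stale stamp from an
earlier walk and wrongly report the instance as already visited.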

Every bail-out that existed before (the ELOOP on cycle, the path
limit check, the unbounded-recursion cap, the +1 overflow guard) is
preserved verbatim; only the data they operate on moved from file
scope to the stack ctx.
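
For readers unfamiliar with these checks, here is a simplified
userspace model of a depth-capped cycle check driven by
ctx->inserting_into. It is not the kernel's ep_loop_check_proc():
the child[] array, MAX_CHILDREN, the EP_MAX_NESTS value and the
model_* names are assumptions, and the generation stamp and the
upward-depth accounting are left out.

```c
#include <errno.h>
#include <stddef.h>

#define EP_MAX_NESTS 4	/* assumed nesting limit */
#define MAX_CHILDREN 2	/* hypothetical, sketch only */

struct eventpoll {
	/* nested epoll instances this one watches, NULL-terminated */
	struct eventpoll *child[MAX_CHILDREN];
};

struct ep_ctl_ctx {
	struct eventpoll *inserting_into;
};

/* Depth of the deepest chain below ep, saturating at EP_MAX_NESTS + 1. */
static int model_loop_check_proc(struct ep_ctl_ctx *ctx,
				 struct eventpoll *ep, int depth)
{
	int result = 0;

	for (size_t i = 0; i < MAX_CHILDREN && ep->child[i]; i++) {
		struct eventpoll *next = ep->child[i];
		int sub;

		if (next == ctx->inserting_into || depth > EP_MAX_NESTS) {
			/* cycle found, or recursion cap reached */
			sub = EP_MAX_NESTS + 1;
		} else {
			sub = model_loop_check_proc(ctx, next, depth + 1);
			if (sub <= EP_MAX_NESTS)
				sub++;	/* never step past the saturated value */
		}
		if (sub > result)
			result = sub;
		if (result > EP_MAX_NESTS)
			break;
	}
	return result;
}

/* Would adding "to" to the interest list of "into" loop or nest too deep? */
static int model_loop_check(struct ep_ctl_ctx *ctx,
			    struct eventpoll *into, struct eventpoll *to)
{
	ctx->inserting_into = into;
	if (model_loop_check_proc(ctx, to, 0) > EP_MAX_NESTS)
		return -ELOOP;
	return 0;
}
```

Because the sentinel lives in the ctx argument, the walk reads
exactly the value its own caller set, and the saturated return value
keeps the +1 step bounded.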

No functional change.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
Link: https://patch.msgid.link/20260424-work-epoll-rework-v1-17-249ed00a20f3@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>