git.ipfire.org Git - thirdparty/linux.git/commitdiff
perf/core: Use POLLHUP for pinned events in error
author Namhyung Kim <namhyung@kernel.org>
Mon, 17 Mar 2025 06:17:45 +0000 (23:17 -0700)
committer Ingo Molnar <mingo@kernel.org>
Mon, 17 Mar 2025 07:31:03 +0000 (08:31 +0100)
Pinned performance events can enter an error state when they fail to be
scheduled in the context due to a failed constraint or some other conflict
or condition.

In the error state these events no longer generate samples and are
silently ignored until they are recovered by PERF_EVENT_IOC_ENABLE,
or until the condition changes so that they can be scheduled in.

Tooling should be allowed to know about the state change, but
currently there's no mechanism to notify tooling when events enter
an error state.

One way to do this is to issue a POLLHUP event to poll(2).
Reading an event in an error state returns 0 (EOF), which matches
the behavior of POLLHUP according to the man page.

Tooling should remove the fd of the event from pollfd after getting
POLLHUP, otherwise it'll be returned repeatedly.
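
As a rough userspace illustration of the advice above (not part of this
patch), a tool could stop polling a descriptor once it reports POLLHUP.
The helper name drop_hung_up_fds() is hypothetical. The sketch relies only
on the documented poll(2) behavior that entries with a negative fd are
ignored, so it should apply to perf event fds as well as any other
descriptor:

```c
#include <poll.h>
#include <stddef.h>

/*
 * Hypothetical helper, not taken from the kernel tree or perf tooling.
 *
 * After poll() returns, stop watching every descriptor that reported
 * POLLHUP. poll(2) ignores entries whose fd is negative, so setting
 * fd to -1 drops the entry without reshuffling the array.
 *
 * Returns the number of entries that were dropped.
 */
int drop_hung_up_fds(struct pollfd *fds, nfds_t nfds)
{
	int dropped = 0;

	for (nfds_t i = 0; i < nfds; i++) {
		if (fds[i].fd >= 0 && (fds[i].revents & POLLHUP)) {
			fds[i].fd = -1;      /* ignored by later poll() calls */
			fds[i].revents = 0;
			dropped++;
		}
	}
	return dropped;
}
```

A caller would invoke this after each poll() wakeup. Whether to close the
descriptor or try PERF_EVENT_IOC_ENABLE before giving up is left to the
tool.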

[ mingo: Clarified the changelog ]

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250317061745.1777584-1-namhyung@kernel.org
kernel/events/core.c

index 2533fc32d890eacd656d5708570be725fa4376d5..ace1bcc1e05f9b517ce08f27e572c579925ba684 100644 (file)
@@ -3984,6 +3984,11 @@ static int merge_sched_in(struct perf_event *event, void *data)
                if (event->attr.pinned) {
                        perf_cgroup_event_disable(event, ctx);
                        perf_event_set_state(event, PERF_EVENT_STATE_ERROR);
+
+                       if (*perf_event_fasync(event))
+                               event->pending_kill = POLL_HUP;
+
+                       perf_event_wakeup(event);
                } else {
                        struct perf_cpu_pmu_context *cpc = this_cpc(event->pmu_ctx->pmu);
 
@@ -5925,6 +5930,10 @@ static __poll_t perf_poll(struct file *file, poll_table *wait)
        if (is_event_hup(event))
                return events;
 
+       if (unlikely(READ_ONCE(event->state) == PERF_EVENT_STATE_ERROR &&
+                    event->attr.pinned))
+               return events;
+
        /*
         * Pin the event->rb by taking event->mmap_mutex; otherwise
         * perf_event_set_output() can swizzle our rb and make us miss wakeups.