Based on a bisect, it appears that commit
7ee988770326 ("timers:
Implement the hierarchical pull model") has somehow inadvertently
broken BPF selftest test_tcpnotify_user. The error that is being
generated by this test is as follows:
FAILED: Wrong stats Expected 10 calls, got 8
It looks like the change allows timer functions to be run on CPUs
different from the one they are armed on. The test had pinned itself
to CPU 0, and in the past the retransmit attempts also occurred on CPU
0. The test had set the max_entries attribute for
BPF_MAP_TYPE_PERF_EVENT_ARRAY to 2 and was calling
bpf_perf_event_output() with BPF_F_CURRENT_CPU, so the entry was
likely to be in range. With the change to allow timers to run on other
CPUs, the current CPU tasked with performing the retransmit might be
bumped and in turn fall out of range, as the event will be filtered
out via __bpf_perf_event_output() using:
if (unlikely(index >= array->map.max_entries))
return -E2BIG;
A possible change would be to explicitly set the max_entries attribute
for perf_event_map in test_tcpnotify_kern.c to a value that's at least
as large as the number of CPUs. As it turns out however, if the field
is left unset, then the libbpf will determine the number of CPUs available
on the underlying system and update the max_entries attribute accordingly
in map_set_def_max_entries().
A further problem with the test is that it has a thread that continues
running up until the program exits. The main thread cleans up some
LIBBPF data structures, while the other thread continues to use them,
which inevitably will trigger a SIGSEGV. This can be dealt with by
telling the thread to run for as long as necessary and doing a
pthread_join on it before exiting the program.
Finally, I don't think binding the process to CPU 0 is meaningful for
this test any more, so get rid of that.
Fixes: 435f90a338ae ("selftests/bpf: add a test case for sock_ops perf-event notification")
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/aJ8kHhwgATmA3rLf@google.com
struct {
__uint(type, BPF_MAP_TYPE_PERF_EVENT_ARRAY);
- __uint(max_entries, 2);
__type(key, int);
__type(value, __u32);
} perf_event_map SEC(".maps");
#include <bpf/libbpf.h>
#include <sys/ioctl.h>
#include <linux/rtnetlink.h>
-#include <signal.h>
#include <linux/perf_event.h>
-#include <linux/err.h>
-#include "bpf_util.h"
#include "cgroup_helpers.h"
#include "test_tcpnotify.h"
-#include "trace_helpers.h"
#include "testing_helpers.h"
#define SOCKET_BUFFER_SIZE (getpagesize() < 8192L ? getpagesize() : 8192L)
pthread_t tid;
+static bool exit_thread;
+
int rx_callbacks;
static void dummyfn(void *ctx, int cpu, void *data, __u32 size)
{
int err;
- while (1) {
+ while (!exit_thread) {
err = perf_buffer__poll(pb, 100);
if (err < 0 && err != -EINTR) {
printf("failed perf_buffer__poll: %d\n", err);
int error = EXIT_FAILURE;
struct bpf_object *obj;
char test_script[80];
- cpu_set_t cpuset;
__u32 key = 0;
libbpf_set_strict_mode(LIBBPF_STRICT_ALL);
- CPU_ZERO(&cpuset);
- CPU_SET(0, &cpuset);
- pthread_setaffinity_np(pthread_self(), sizeof(cpu_set_t), &cpuset);
-
cg_fd = cgroup_setup_and_join(cg_path);
if (cg_fd < 0)
goto err;
sleep(10);
+ exit_thread = true;
+ int ret = pthread_join(tid, NULL);
+ if (ret) {
+ printf("FAILED: pthread_join\n");
+ goto err;
+ }
+
if (verify_result(&g)) {
printf("FAILED: Wrong stats Expected %d calls, got %d\n",
g.ncalls, rx_callbacks);