When building the selftest with arm64/clang20, the following test failed:
...
ubtest_multispec_usdt:PASS:usdt_100_called 0 nsec
subtest_multispec_usdt:PASS:usdt_100_sum 0 nsec
subtest_multispec_usdt:FAIL:usdt_300_bad_attach unexpected pointer: 0xaaaad82a2a80
#471/2 usdt/multispec:FAIL
#471 usdt:FAIL
But arm64/gcc11 built kernel selftests succeeded. Further debug found arm64/clang
generated code has much less argument pattern after dedup, but gcc generated
code has a lot more.
Check usdt probes with usdt.test.o on arm64 platform:
with gcc11 build binary:
stapsdt 0x0000002e NT_STAPSDT (SystemTap probe descriptors)
Provider: test
Name: usdt_300
Location: 0x00000000000054f8, Base: 0x0000000000000000, Semaphore: 0x0000000000000008
Arguments: -4@[sp]
stapsdt 0x00000031 NT_STAPSDT (SystemTap probe descriptors)
Provider: test
Name: usdt_300
Location: 0x0000000000005510, Base: 0x0000000000000000, Semaphore: 0x0000000000000008
Arguments: -4@[sp, 4]
...
stapsdt 0x00000032 NT_STAPSDT (SystemTap probe descriptors)
Provider: test
Name: usdt_300
Location: 0x0000000000005660, Base: 0x0000000000000000, Semaphore: 0x0000000000000008
Arguments: -4@[sp, 60]
...
stapsdt 0x00000034 NT_STAPSDT (SystemTap probe descriptors)
Provider: test
Name: usdt_300
Location: 0x00000000000070e8, Base: 0x0000000000000000, Semaphore: 0x0000000000000008
Arguments: -4@[sp, 1192]
stapsdt 0x00000034 NT_STAPSDT (SystemTap probe descriptors)
Provider: test
Name: usdt_300
Location: 0x0000000000007100, Base: 0x0000000000000000, Semaphore: 0x0000000000000008
Arguments: -4@[sp, 1196]
...
stapsdt 0x00000032 NT_STAPSDT (SystemTap probe descriptors)
Provider: test
Name: usdt_300
Location: 0x0000000000009ec4, Base: 0x0000000000000000, Semaphore: 0x0000000000000008
Arguments: -4@[sp, 60]
with clang20 build binary:
stapsdt 0x0000002e NT_STAPSDT (SystemTap probe descriptors)
Provider: test
Name: usdt_300
Location: 0x00000000000009a0, Base: 0x0000000000000000, Semaphore: 0x0000000000000008
Arguments: -4@[x9]
stapsdt 0x0000002e NT_STAPSDT (SystemTap probe descriptors)
Provider: test
Name: usdt_300
Location: 0x00000000000009b8, Base: 0x0000000000000000, Semaphore: 0x0000000000000008
Arguments: -4@[x9]
...
stapsdt 0x0000002e NT_STAPSDT (SystemTap probe descriptors)
Provider: test
Name: usdt_300
Location: 0x0000000000002590, Base: 0x0000000000000000, Semaphore: 0x0000000000000008
Arguments: -4@[x9]
stapsdt 0x0000002e NT_STAPSDT (SystemTap probe descriptors)
Provider: test
Name: usdt_300
Location: 0x00000000000025a8, Base: 0x0000000000000000, Semaphore: 0x0000000000000008
Arguments: -4@[x8]
...
stapsdt 0x0000002f NT_STAPSDT (SystemTap probe descriptors)
Provider: test
Name: usdt_300
Location: 0x0000000000007fdc, Base: 0x0000000000000000, Semaphore: 0x0000000000000008
Arguments: -4@[x10]
There are total 300 locations for usdt_300. For gcc11 built binary, there are
300 spec's. But for clang20 built binary, there are 3 spec's. The default
BPF_USDT_MAX_SPEC_CNT is 256, so bpf_program__attach_usdt() will fail for gcc
but it will succeed with clang.
To fix the problem, do not do bpf_program__attach_usdt() for usdt_300
with arm64/clang setup.
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250624211802.2198821-1-yonghong.song@linux.dev
*/
trigger_300_usdts();
- /* we'll reuse usdt_100 BPF program for usdt_300 test */
bpf_link__destroy(skel->links.usdt_100);
+
+ bss->usdt_100_called = 0;
+ bss->usdt_100_sum = 0;
+
+ /* If built with arm64/clang, there will be much less number of specs
+ * for usdt_300 call sites.
+ */
+#if !defined(__aarch64__) || !defined(__clang__)
+ /* we'll reuse usdt_100 BPF program for usdt_300 test */
skel->links.usdt_100 = bpf_program__attach_usdt(skel->progs.usdt_100, -1, "/proc/self/exe",
"test", "usdt_300", NULL);
err = -errno;
/* let's check that there are no "dangling" BPF programs attached due
* to partial success of the above test:usdt_300 attachment
*/
- bss->usdt_100_called = 0;
- bss->usdt_100_sum = 0;
-
f300(777); /* this is 301st instance of usdt_300 */
ASSERT_EQ(bss->usdt_100_called, 0, "usdt_301_called");
ASSERT_EQ(bss->usdt_100_sum, 0, "usdt_301_sum");
+#endif
/* This time we have USDT with 400 inlined invocations, but arg specs
* should be the same across all sites, so libbpf will only need to