Satisfied with how the output looks at
- fixed some stray whitespace
- suppress (clearly marked) repeated 'address out of range' after 10th
time for each given pid
-- it's just the same uninformative error repeated, which clutters
output considerably
- keep the 'perf_event_attr configuration' message at all levels
-- aesthetics arguable, but showing sample_freq is useful
- clog/cerr division is ok
- mainfile can also be null
- if we deliberately limit maxframes<=2, the heuristic
flagging fewer than 2 frames as a 'lost' sample is inapplicable
- opt_maxframes relates to the user deliberately choosing to
override GprofUnwindSampleConsumer preference with --maxframes
(which I consider useful sometimes for testing):
if --maxframes is not provided, then use the default of 1.
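A minimal sketch of the per-pid suppression described above, assuming a
small fixed-size table keyed by pid. The names (report_out_of_range,
MAX_REPEATS, TABLE_SIZE) and the message wording are hypothetical and
not taken from stackprof.cxx:

```c
#include <stdio.h>
#include <sys/types.h>

#define MAX_REPEATS 10   /* suppress after the 10th report for a pid */
#define TABLE_SIZE 256   /* hypothetical capacity of the counter table */

struct pid_count { pid_t pid; unsigned count; int used; };
static struct pid_count table[TABLE_SIZE];

/* Find or create the counter slot for PID (linear probing).
   Returns NULL if the table is full.  */
static struct pid_count *
lookup (pid_t pid)
{
  size_t start = (size_t) pid % TABLE_SIZE;
  for (size_t i = 0; i < TABLE_SIZE; i++)
    {
      struct pid_count *e = &table[(start + i) % TABLE_SIZE];
      if (!e->used)
        {
          e->used = 1;
          e->pid = pid;
          e->count = 0;
          return e;
        }
      if (e->pid == pid)
        return e;
    }
  return NULL;
}

/* Print the error for PID unless it has already been printed
   MAX_REPEATS times.  Returns 1 if printed, 0 if suppressed.  */
int
report_out_of_range (pid_t pid, FILE *out)
{
  struct pid_count *e = lookup (pid);
  if (e != NULL && ++e->count > MAX_REPEATS)
    return 0;
  fprintf (out, "pid %d: address out of range\n", (int) pid);
  /* Clearly mark the point where suppression starts.  */
  if (e != NULL && e->count == MAX_REPEATS)
    fprintf (out, "pid %d: suppressing further 'address out of range'\n",
             (int) pid);
  return 1;
}
```

If the table fills up, the sketch fails open and keeps printing, on the
assumption that a cluttered log is better than a silently dropped error.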
Frank Ch. Eigler [Thu, 12 Feb 2026 18:25:19 +0000 (13:25 -0500)]
stackprof testsuite++
Now including run-stackprof-system* tests, which only run as root, and
can require several minutes to run (with gprof downloading/processing
large debuginfo files if you happen to be running large gui programs
while this test runs).
HIST_SPLIT_EVEN is a sketch (WIP / testing in progress) of how to
cover the space with histograms of equal size.
Even if we solve the ld-linux.so stray untranslated addresses, there
may be an argument to save space by splitting histograms when
collecting profiledb data in bulk. Good to have this code for the
record.
Serhei Makarov [Mon, 9 Feb 2026 17:14:12 +0000 (12:14 -0500)]
stackprof.cxx minor fixes: tweak the maxframes logic
Since this is now an intended performance optimization (unwind exactly
N frames) rather than a rough safety tripwire, make sure the number of
frames returned is exact.
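A rough sketch of an exact frame limit, assuming a per-frame callback
that tells the unwinder whether to continue. CB_OK, CB_ABORT, struct
frame_sink and frame_callback are hypothetical stand-ins, not the
actual stackprof.cxx code:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the unwinder's callback return codes.  */
enum { CB_OK = 0, CB_ABORT = 1 };

struct frame_sink
{
  uint64_t *pcs;       /* caller-provided buffer of max_frames entries */
  size_t max_frames;   /* exact number of frames wanted */
  size_t n_frames;     /* frames stored so far */
};

/* Called once per unwound frame.  Stores at most max_frames PCs and
   asks the unwinder to stop as soon as the limit is reached.  */
int
frame_callback (uint64_t pc, void *arg)
{
  struct frame_sink *sink = arg;
  if (sink->n_frames >= sink->max_frames)
    return CB_ABORT;   /* also covers max_frames == 0 */
  sink->pcs[sink->n_frames++] = pc;
  return sink->n_frames == sink->max_frames ? CB_ABORT : CB_OK;
}
```

Checking the limit both before and after storing keeps the count exact
even if the unwinder calls back once more after an abort request.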
Serhei Makarov [Thu, 29 Jan 2026 17:27:50 +0000 (12:27 -0500)]
cherrypick PR33854 fix from main branch
The only difference from the draft fix already in the branch is the
freeing of sample_arg on attach failure.
== original commit:
PR33854: fix regression in dwflst_perf_sample_getframes
In commit 3ce0d5ed, I missed the fact that
dwflst_perf_sample_getframes needs to handle the case of an unattached
Dwfl, when dwfl->process->ebl is not yet available to translate the
registers. Thus, it can't be a straightforward wrapper of
dwfl_sample_getframes, but should instead handle the attaching logic
identically to that function.
Also fix a leakage of sample_arg in dwflst_sample_getframes that was
happening on attach failure.
* libdwfl_stacktrace (dwflst_sample_getframes): Fix a leak of
sample_arg on attach failure.
* libdwfl_stacktrace (dwflst_perf_sample_getframes): Implement
attaching the Dwfl identically to dwflst_sample_getframes.
Avoid leaking sample_arg on attach failure.
Frank Ch. Eigler [Fri, 30 Jan 2026 03:09:21 +0000 (22:09 -0500)]
rework gmon.out header/content
- matched against gprof and gmon2profdata parsers
- not yet properly tested against shared libraries; will likely need some address translation (see the #if-0'd code)
libdwfl_stacktrace + libebl: dwflst_sample_getframes non-perf api
This patch adds a generic dwflst_sample_getframes() API that does not
depend on perf_events concepts, in particular the
linux-kernel-specific enum defining the perf_regs_mask register order.
This involves reworking the register-handling backend to use
regs_mapping arrays rather than perf_regs_mask, and provide a way to
translate perf_regs_mask to regs_mapping.
A regs_mapping array, for each item in a provided regs[] array,
specifies its position in the full register file expected by the DWARF
functionality.
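To illustrate the regs_mapping idea, here is a minimal sketch that
scatters a compact regs[] array into a full register file. The
function name scatter_sample_regs and its signature are hypothetical;
the real logic lives behind ebl_set_initial_registers_sample and may
differ:

```c
#include <stddef.h>
#include <stdint.h>

/* Place each regs[i] at position regs_mapping[i] of DWARF_REGS, which
   has room for N_DWARF_REGS entries.  Returns 0 on success, or -1 if
   a mapping entry points outside the register file.  */
int
scatter_sample_regs (const uint64_t *regs, const int *regs_mapping,
                     size_t n_regs,
                     uint64_t *dwarf_regs, size_t n_dwarf_regs)
{
  for (size_t i = 0; i < n_regs; i++)
    {
      int pos = regs_mapping[i];
      if (pos < 0 || (size_t) pos >= n_dwarf_regs)
        return -1;
      dwarf_regs[pos] = regs[i];
    }
  return 0;
}
```

For example, with regs = { sp, pc } and regs_mapping = { 7, 16 }, the
two values land in slots 7 and 16 of the register file and every other
slot is left untouched.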
Changes for v3:
- Added dwflst_sample_getframes to libdw.map.
Changes for v2:
- Addressed Aaron Merey's review comments.
- Removed dwflst_sample_frame.c dependency on perf_events abi constants.
* libdwfl_stacktrace/Makefile.am: Rename dwflst_sample_frame.c from
dwflst_perf_frame.c.
* libdwfl_stacktrace/libdwfl_stacktrace.h (dwflst_sample_getframes):
New function providing unwinding functionality with a regs_mapping
array rather than a linux-kernel-dependent perf_regs_mask.
* libdw/libdw.map (ELFUTILS_0.193_EXPERIMENTAL): Add dwflst_sample_getframes.
* libdwfl_stacktrace/dwflst_sample_frame.c: Renamed from
dwflst_perf_frame.c. Remove linux/perf_event.h dependency.
(struct sample_info): Rename from perf_sample_info, include
regs_mapping field, replace abi with elfclass field.
(sample_next_thread): Renamed struct sample_info.
(sample_getthread): Renamed struct sample_info.
(copy_word): Use elfclass instead of perf abi field.
(elf_memory_read): Renamed struct sample_info, use elfclass.
(sample_memory_read): Renamed struct sample_info, use elfclass.
(sample_set_initial_registers): Renamed struct sample_info,
pass regs_mapping to ebl_set_initial_registers_sample.
(dwflst_sample_getframes): New function.
(dwflst_perf_sample_getframes): Reimplement in terms of
dwflst_sample_getframes and ebl_sample_perf_regs_mapping.
* libebl/ebl-hooks.h (set_initial_registers_sample): Now
takes regs_mapping instead of regs_mask.
(sample_base_addr): Removed.
(sample_pc): Removed.
(sample_sp_pc): New function combining the removed functions for
efficiency.
(sample_perf_regs_mapping): New function translating
perf_regs_mask to regs_mapping array.
* libebl/eblinitreg_sample.c (ebl_sample_base_addr): Removed.
(ebl_sample_pc): Removed.
(ebl_sample_sp_pc): New function.
(ebl_set_initial_registers_sample): Take regs_mapping, provide
a default implementation for contiguous dwarf_regs array.
(ebl_sample_perf_regs_mapping): New function.
* libebl/eblclosebackend.c (ebl_closebackend):
Free cached_regs_mapping.
* libebl/libebl.h (ebl_set_initial_registers_sample): Now takes
regs_mapping instead of regs_mask.
(ebl_sample_base_addr): Removed.
(ebl_sample_pc): Removed.
(ebl_sample_sp_pc): New function.
(ebl_sample_perf_regs_mapping): New function.
* libebl/libeblP.h (struct ebl): Add caching fields to remove the
need to repeat a sample_perf_regs_mapping() computation for
every frame when the perf_regs_mask is consistent.
* backends/Makefile.am: Remove no-longer-needed linux-perf-regs.c.
* backends/i386_init.c (i386_init): Renamed sample_* functions,
added cached_regs_mapping and related fields/functions.
* backends/i386_initreg_sample.c (i386_sample_base_addr): Removed.
(i386_sample_pc): Removed.
(i386_sample_sp_pc): New function combining the removed functions.
(i386_set_initial_registers_sample): Removed.
(i386_sample_perf_regs_mapping): New function translating
perf_regs_mask to regs_mapping array.
* backends/linux-perf-regs.c: Removed as perf_sample_find_reg is no
longer needed.
* backends/x86_64_init.c (x86_64_init): Renamed sample_* functions,
added cached_regs_mapping and related fields/functions.
* backends/x86_64_initreg_sample.c (x86_64_sample_base_addr): Removed.
(x86_64_sample_pc): Removed.
(x86_64_sample_sp_pc): New function combining the removed functions.
(x86_64_set_initial_registers_sample): Removed.
(x86_64_sample_perf_regs_mapping): New function translating
perf_regs_mask to regs_mapping array.
* backends/x86_initreg_sample.c (x86_set_initial_registers_sample):
Removed.
(x86_sample_sp_pc): New function.
(x86_sample_perf_regs_mapping): New function translating
perf_regs_mask to regs_mapping array.
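The perf_regs_mask to regs_mapping translation, together with the
caching mentioned for struct ebl, could look roughly like the sketch
below. It assumes that sampled registers arrive in increasing bit
order of the mask. The four-entry perf_to_dwarf table holds
illustrative values only, and struct mapping_cache and
perf_regs_mapping are hypothetical names, not the libebl
implementation:

```c
#include <stddef.h>
#include <stdint.h>

#define N_PERF_REGS 4

/* Hypothetical table: perf register index -> position in the DWARF
   register file.  The values are illustrative only.  */
static const int perf_to_dwarf[N_PERF_REGS] = { 0, 3, 2, 1 };

struct mapping_cache
{
  int valid;                  /* nonzero once mapping[] is filled */
  uint64_t mask;              /* mask that mapping[] was built for */
  size_t n;                   /* number of entries in mapping[] */
  int mapping[N_PERF_REGS];
};

/* Return the number of registers selected by MASK and point
   *MAPPING_OUT at their DWARF positions.  The mapping is recomputed
   only when MASK differs from the cached one.  */
size_t
perf_regs_mapping (struct mapping_cache *cache, uint64_t mask,
                   const int **mapping_out)
{
  if (!cache->valid || cache->mask != mask)
    {
      size_t n = 0;
      for (int bit = 0; bit < N_PERF_REGS; bit++)
        if (mask & (UINT64_C (1) << bit))
          cache->mapping[n++] = perf_to_dwarf[bit];
      cache->n = n;
      cache->mask = mask;
      cache->valid = 1;
    }
  *mapping_out = cache->mapping;
  return cache->n;
}
```

Since a profiling session normally uses one mask throughout, a
single-entry cache like this avoids repeating the bit scan for every
sample.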
libdwfl_stacktrace: fix non-Linux build dep on PERF_SAMPLE_REGS_ABI
Reported on a GNU Hurd build of elfutils. This is a quick fix pending
my more complex patch to reduce dependency on linux perf concepts for
the libdwfl_stacktrace code.
* libdwfl_stacktrace/dwflst_perf_frame.c (perf_sample_regs_abi):
Define this Linux enum on non-Linux platforms.
This gives an idea of how much more flexibility the munging needs in
order to work across a variety of systems. I am also concerned that
elfutils certainly runs on non-Linux, non-glibc systems too; is it
worth a brief test on musl and BSD?
Missing a few pieces, but worth sharing as an RFC. My idea is to
ensure better test coverage for eu-stack and then
eu-stacktrace+libdwfl_stacktrace by running against a live process
with known content, stopped at a known location, and aggressively
scrubbing output that's known to vary from testrun to testrun.
This is a very basic preview of how that might look. If the approach
is sound, I hope to make it more sophisticated/reliable.
Unanswered questions:
- Scrub more data (e.g. libc symvers) from a program with better-known
content? Scrub stack frame numbers to account for a case where extra
frames appear / are missing at the bottom of the stack?
- Something better than sed for the scrubbing?
- An equivalent eu-stacktrace test will require privileged perf_events
access for profiling data and is therefore likely to be skipped by
default. How feasible is it to enable it on the buildbots, though?
* tests/run-stack-live-test.sh: New test with wild and fuzzy
testrun_compare variant that scrubs inherently unpredictable parts of
the data. Needs to scrub even more.
* tests/Makefile.am (TESTS): Add run-stack-live-test.sh.
Mark Wielaard [Tue, 6 May 2025 09:50:12 +0000 (11:50 +0200)]
tests: Create random test_dir name
The testsuite relies on there being no files in the test directory
after the test finishes. A test will fail if the test dir cannot be
removed. But the test dir name isn't really random; it uses the pid of
the shell script that executes the test. On some of the buildbots that
execute a lot of tests, the pid number can wrap around so that the pid
of a previous test is reused. To prevent that from happening, generate
a real random number (8 bytes) using od /dev/urandom and xargs (to
trim away spaces left by od).
* tests/test-subr.sh: Define test_name and random_number and use
those to define test_dir.