git.ipfire.org Git - thirdparty/linux.git/log
2 weeks ago perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask
Ian Rogers [Wed, 3 Dec 2025 21:47:02 +0000 (13:47 -0800)] 
perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask

If the perf_cpu_map is empty or is just the any CPU value, then early
return. Don't process the "any" CPU when creating the bitmap.

Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
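
Illustrative sketch for the cpu_map__snprint_mask commit above (not part
of the patch, and in Python rather than perf's C). It models the early
return and the skipping of the "any" CPU while building the bitmap. The
-1 sentinel, the function name cpu_mask_hex and the empty-string return
value are assumptions made for this example only.

```python
ANY_CPU = -1  # assumption: sentinel standing in for the "any" CPU


def cpu_mask_hex(cpus):
    """Return a hex bitmask string for a list of CPU numbers."""
    # Never let the "any" CPU contribute a bit to the mask.
    real_cpus = [cpu for cpu in cpus if cpu != ANY_CPU]
    if not real_cpus:
        # Early return: the map is empty or holds only the "any" CPU.
        return ""
    mask = 0
    for cpu in real_cpus:
        mask |= 1 << cpu
    return format(mask, "x")
```

Calling cpu_mask_hex([-1, 0, 2]) sets bits 0 and 2 only, so the "any"
entry never shifts by a negative amount.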
2 weeks ago libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map
Ian Rogers [Wed, 3 Dec 2025 21:47:01 +0000 (13:47 -0800)] 
libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map

Passing an empty map to perf_cpu_map__max triggered a SEGV. Explicitly
test for the empty map.

Reported-by: Ingo Molnar <mingo@kernel.org>
Closes: https://lore.kernel.org/linux-perf-users/aSwt7yzFjVJCEmVp@gmail.com/
Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf stat: Allow no events to open if this is a "--null" run
Ian Rogers [Wed, 3 Dec 2025 21:47:00 +0000 (13:47 -0800)] 
perf stat: Allow no events to open if this is a "--null" run

It is intended that a "--null" run doesn't open any events.

Fixes: 2cc7aa995ce9 ("perf stat: Refactor retry/skip/fatal error handling")
Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf test kvm: Add some basic perf kvm test coverage
Ian Rogers [Sat, 22 Nov 2025 08:19:29 +0000 (00:19 -0800)] 
perf test kvm: Add some basic perf kvm test coverage

Set up qemu with KVM, then run kvm stat and some host
recording/reporting/build-id tests.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tests evlist: Add basic evlist test
Ian Rogers [Sat, 22 Nov 2025 08:19:28 +0000 (00:19 -0800)] 
perf tests evlist: Add basic evlist test

Add test that evlist reports expected events from perf record.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tests script dlfilter: Add a dlfilter test
Ian Rogers [Sat, 22 Nov 2025 08:19:27 +0000 (00:19 -0800)] 
perf tests script dlfilter: Add a dlfilter test

Compile a simple dlfilter and make sure it removes samples from
everything other than a test_loop.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tests kallsyms: Add basic kallsyms test
Ian Rogers [Sat, 22 Nov 2025 08:19:26 +0000 (00:19 -0800)] 
perf tests kallsyms: Add basic kallsyms test

Add a test that kallsyms finds a well-known symbol and fails for
another.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tests timechart: Add a perf timechart test
Ian Rogers [Sat, 22 Nov 2025 08:19:25 +0000 (00:19 -0800)] 
perf tests timechart: Add a perf timechart test

Basic coverage for `perf timechart` doing a record and then a basic
sanity test of the generated SVG file.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tests top: Add basic perf top coverage test
Ian Rogers [Sat, 22 Nov 2025 08:19:24 +0000 (00:19 -0800)] 
perf tests top: Add basic perf top coverage test

The test starts a background thloop workload and monitors it using
cpu-clock, ensuring test_loop appears in the output.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tests buildid: Add purge and remove testing
Ian Rogers [Sat, 22 Nov 2025 08:19:23 +0000 (00:19 -0800)] 
perf tests buildid: Add purge and remove testing

Add testing for the purge and remove commands. Use the noploop
workload rather than just a return to avoid missing samples in the
workload in perf record. Tidy up the cleanup code to clean up when
signals happen.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tests c2c: Add a basic c2c
Ian Rogers [Sat, 22 Nov 2025 08:19:22 +0000 (00:19 -0800)] 
perf tests c2c: Add a basic c2c

Add basic c2c record and report testing to gain some coverage.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf c2c: Clean up some defensive gets and make asan clean
Ian Rogers [Sat, 22 Nov 2025 08:19:21 +0000 (00:19 -0800)] 
perf c2c: Clean up some defensive gets and make asan clean

To deal with histogram code that had missing gets, the c2c code had
some defensive gets. Those other issues were cleaned up by the
reference count checker, so clean up the defensive gets for the c2c
command here.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf jitdump: Fix missed dso__put
Ian Rogers [Sat, 22 Nov 2025 08:19:20 +0000 (00:19 -0800)] 
perf jitdump: Fix missed dso__put

Reference count checking caught a missing dso__put following a
machine__findnew_dso_id.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf mem-events: Don't leak online CPU map
Ian Rogers [Sat, 22 Nov 2025 08:19:19 +0000 (00:19 -0800)] 
perf mem-events: Don't leak online CPU map

Reference count checking found the online CPU map was being gotten but
not put. Add in the missing put.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
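
Illustrative sketch for the get/put fixes in this series (not part of
any patch). It is a minimal Python model of reference counting showing
how a get without a matching put leaves the count raised. The class
RefCounted, the function use_online_map and the forget_put flag are
hypothetical names for this example; perf's real objects are C structs
checked by its reference count checker.

```python
class RefCounted:
    """Toy reference-counted object; 'live' counts objects not yet freed."""

    live = 0

    def __init__(self):
        self.refcnt = 1  # the creator holds the first reference
        RefCounted.live += 1

    def get(self):
        self.refcnt += 1
        return self

    def put(self):
        assert self.refcnt > 0, "put without a matching get"
        self.refcnt -= 1
        if self.refcnt == 0:
            RefCounted.live -= 1  # last reference dropped: object is freed


def use_online_map(online, forget_put=False):
    """Take a reference, use it, and (normally) drop it again."""
    ref = online.get()
    try:
        return ref.refcnt  # stand-in for real work done with the map
    finally:
        if not forget_put:
            ref.put()  # the missing put is what the fix adds
```

With forget_put=True the count stays one higher than it should after
the call, which is the kind of imbalance a leak checker reports.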
2 weeks ago perf hist: In init, ensure mem_info is put on error paths
Ian Rogers [Sat, 22 Nov 2025 08:19:18 +0000 (00:19 -0800)] 
perf hist: In init, ensure mem_info is put on error paths

Rather than exiting the internal map_symbols directly, put the mem-info.
The put does this and also lowers the reference count on the mem-info
itself; otherwise the mem-info is leaked.

Fixes: 56e144fe98260a0f ("perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf probe-event: Ensure probe event nsinfo is always cleared
Ian Rogers [Sat, 22 Nov 2025 08:19:17 +0000 (00:19 -0800)] 
perf probe-event: Ensure probe event nsinfo is always cleared

Move nsinfo__zput from cleanup_perf_probe_events to
clear_perf_probe_event so it is always executed. Clean up
clear_perf_probe_events to not call nsinfo__zput and use the pev
variable to avoid repeated array accesses.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf symbol: Add missed dso__put
Ian Rogers [Sat, 22 Nov 2025 08:19:16 +0000 (00:19 -0800)] 
perf symbol: Add missed dso__put

Add missing dso__put for the dso created in maps__split_kallsyms.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf symbol-elf: Add missing puts on error path
Ian Rogers [Sat, 22 Nov 2025 08:19:15 +0000 (00:19 -0800)] 
perf symbol-elf: Add missing puts on error path

In dso__process_kernel_symbol if inserting a map fails, probably
ENOMEM, then the reference count puts were missing on the dso and map.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf timechart: Add record support for output perf.data path
Ian Rogers [Sat, 22 Nov 2025 08:19:14 +0000 (00:19 -0800)] 
perf timechart: Add record support for output perf.data path

The '-o' option exists for the SVG creation but not for `perf
timechart record`. Add it to better allow testing.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf kvm: Fix debug assertion
Ian Rogers [Sat, 22 Nov 2025 08:19:13 +0000 (00:19 -0800)] 
perf kvm: Fix debug assertion

There are 2 slots left for kvm_add_default_arch_event, fix the
assertion so that debug builds don't fail the assert and to agree with
the comment.

Fixes: 45ff39f6e70aa55d0 ("perf tools kvm: Fix the potential out of range memory access issue")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf vendor events intel: Update sierraforest events from 1.12 to 1.13
Ian Rogers [Tue, 2 Dec 2025 16:53:40 +0000 (08:53 -0800)] 
perf vendor events intel: Update sierraforest events from 1.12 to 1.13

The updated events were published in:
https://github.com/intel/perfmon/commit/445e38f5128592f8b5c38da30267fff025e37613

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf vendor events intel: Update pantherlake events from 1.00 to 1.02
Ian Rogers [Tue, 2 Dec 2025 16:53:39 +0000 (08:53 -0800)] 
perf vendor events intel: Update pantherlake events from 1.00 to 1.02

The updated events were published in:
https://github.com/intel/perfmon/commit/6edacf434dffa046435de2f6a182c00df3cf4edc

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf vendor events intel: Update meteorlake events from 1.17 to 1.18
Ian Rogers [Tue, 2 Dec 2025 16:53:38 +0000 (08:53 -0800)] 
perf vendor events intel: Update meteorlake events from 1.17 to 1.18

The updated events were published in:
https://github.com/intel/perfmon/commit/348f33fae477f281812c32e1c07812b7e35614dd

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf vendor events intel: Update lunarlake events from 1.18 to 1.19
Ian Rogers [Tue, 2 Dec 2025 16:53:37 +0000 (08:53 -0800)] 
perf vendor events intel: Update lunarlake events from 1.18 to 1.19

The updated events were published in:
https://github.com/intel/perfmon/commit/09a0c74b23b5d20adf1f97e5022856568d05494c

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf vendor events intel: Update icelakex events from 1.28 to 1.30
Ian Rogers [Tue, 2 Dec 2025 16:53:36 +0000 (08:53 -0800)] 
perf vendor events intel: Update icelakex events from 1.28 to 1.30

The updated events were published in:
https://github.com/intel/perfmon/commit/dc6ffee20c74bfd21d7a7e338345578d4b7ca9ca

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf vendor events intel: Update graniterapids events from 1.15 to 1.16
Ian Rogers [Tue, 2 Dec 2025 16:53:35 +0000 (08:53 -0800)] 
perf vendor events intel: Update graniterapids events from 1.15 to 1.16

The updated events were published in:
https://github.com/intel/perfmon/commit/b4acc3fd520eb098db41083010b65b75ae906c96

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf vendor events intel: Update cascadelakex metric units
Ian Rogers [Tue, 2 Dec 2025 16:53:34 +0000 (08:53 -0800)] 
perf vendor events intel: Update cascadelakex metric units

The updated metrics were published in:
https://github.com/intel/perfmon/pull/348/commits/2dce436130ddfb8b442fc373d103f970de26cb78

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf vendor events intel: Update arrowlake events from 1.13 to 1.14
Ian Rogers [Tue, 2 Dec 2025 16:53:33 +0000 (08:53 -0800)] 
perf vendor events intel: Update arrowlake events from 1.13 to 1.14

The updated events were published in:
https://github.com/intel/perfmon/commit/588dd77675039e1aaacee27a414cbcf3625c58a3

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf vendor events intel: Update alderlake events from 1.34 to 1.35
Ian Rogers [Tue, 2 Dec 2025 16:53:32 +0000 (08:53 -0800)] 
perf vendor events intel: Update alderlake events from 1.34 to 1.35

The updated events were published in:
https://github.com/intel/perfmon/commit/c74f1cefa94d224cb3338507961b59d8a2a1c4e9

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf arm_spe: Add CPU variants supporting common data source packet
Leo Yan [Thu, 13 Nov 2025 10:57:39 +0000 (10:57 +0000)] 
perf arm_spe: Add CPU variants supporting common data source packet

Add the following CPU variants to the list for data source decoding:

  - Cortex-A715 [1]
  - Cortex-A78C [2]
  - Cortex-X1 [3]
  - Cortex-X4 [4]
  - Neoverse V3 [5]

[1] https://developer.arm.com/documentation/101590/0103/Statistical-Profiling-Extension-Support/Statistical-Profiling-Extension-data-source-packet
[2] https://developer.arm.com/documentation/102226/0002/Debug-descriptions/Statistical-Profiling-Extension/implementation-defined-features-of-SPE
[3] https://developer.arm.com/documentation/101433/0102/Debug-descriptions/Statistical-Profiling-Extension/implementation-defined-features-of-SPE
[4] https://developer.arm.com/documentation/102484/0003/Statistical-Profiling-Extension-support/Statistical-Profiling-Extension-data-source-packet
[5] https://developer.arm.com/documentation/107734/0002/Statistical-Profiling-Extension-support/Statistical-Profiling-Extension-data-source-packet

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf auxtrace: Include sys/types.h for pid_t
Arnaldo Carvalho de Melo [Wed, 3 Dec 2025 14:50:03 +0000 (11:50 -0300)] 
perf auxtrace: Include sys/types.h for pid_t

In 754187ad73b73bcb ("perf build: Remove NO_AUXTRACE build option")
sys/types.h was removed, which broke the build in all Alpine Linux
releases, as musl libc has pid_t defined via sys/types.h, add it back.

Fixes: 754187ad73b73bcb ("perf build: Remove NO_AUXTRACE build option")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf test: Add kallsyms split test
Namhyung Kim [Tue, 2 Dec 2025 23:57:18 +0000 (15:57 -0800)] 
perf test: Add kallsyms split test

Create a fake root directory for /proc/{version,modules,kallsyms} in
/tmp for testing.  The kallsyms has a bad symbol in the module, which
causes the main map to be split.  The test ensures there are only two
maps, the kernel and the module, and that it finds the initial map after
the module without creating split maps like [kernel].0 and so on.

  $ perf test -vv "split kallsyms"
   69: split kallsyms:
  --- start ---
  test child forked, pid 1016196
  try to create fake root directory
  create kernel maps from the fake root directory
  maps__set_modules_path_dir: cannot open /tmp/perf-test.Zrv6Sy/lib/modules/X.Y.Z dir
  Problems setting modules path maps, continuing anyway...
  Failed to open /tmp/perf-test.Zrv6Sy/proc/kcore. Note /proc/kcore requires CAP_SYS_RAWIO capability to access.
  Using /tmp/perf-test.Zrv6Sy/proc/kallsyms for symbols
  kernel map loaded - check symbol and map
  ---- end(0) ----
   69: split kallsyms                                                  : Ok

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tools: Use machine->root_dir to find /proc/kallsyms
Namhyung Kim [Tue, 2 Dec 2025 23:57:17 +0000 (15:57 -0800)] 
perf tools: Use machine->root_dir to find /proc/kallsyms

This is for test functions to find the kallsyms correctly.  It can find
the machine from the kernel maps and use its root_dir.  This is helpful
for setting up a fake /proc directory for testing.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tools: Fallback to initial kernel map properly
Namhyung Kim [Tue, 2 Dec 2025 23:57:16 +0000 (15:57 -0800)] 
perf tools: Fallback to initial kernel map properly

maps__split_kallsyms() assumes a new kernel map when it finds a symbol
without a module after any module and the initial kernel map already has
some symbols.  It expects modules to be outside the kernel map, so
modules should not have symbols in the kernel map.

For example, the following memory map shows symbols and maps.  Any
symbols in the module 1 area will go to module 1.  The main kernel
map starts at 0xffffffffbc200000.  But if any symbol between the symbols
in that area belongs to a module, the symbols after 0xffffffffbd008000
will generate new kernel maps like [kernel].1.

   kernel address   |                     |
                    |                     |
 0xffffffffc0000000 |---------------------|
                    |     (symbols)       |
                    |        ...          |   <---  [kernel].N
 0xffffffffbc400000 |---------------------|
                    |     (symbols)       |
                    |      module 2       |   <---  bad?
 0xffffffffbc380000 |---------------------|
                    |        ...          |
                    |     (symbols)       |
                    |  [kernel.kallsyms]  |   <---  initial map
 0xffffffffbc200000 |---------------------|
                    |                     |
                    |                     |
 0xffffffffabcde000 |---------------------|
                    |     (symbols)       |
                    |      module 1       |
 0xffffffffabcd0000 |---------------------|

This is very fragile when the module has a symbol that falls into the
main kernel map for some reason.  My system has a livepatch module with
such symbols.  And it created a lot of new kernel maps after those
symbols.  But the symbol may have broken addresses and the later symbols
can still be found in the initial kernel map.

Let's check the symbol address in the initial map and use it if found.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
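
Illustrative sketch for the fallback commit above (not part of the
patch). It is a simplified Python model of the decision described in
the message: when a kernel symbol follows module symbols, check whether
its address lies in the initial kernel map before creating a new
[kernel].N map. The function assign_maps, its tuple format, the symbol
names and the address range used in the usage note are assumptions for
this example; the real logic lives in maps__split_kallsyms() in C and
handles more cases.

```python
KERNEL = "[kernel.kallsyms]"  # name of the initial kernel map


def assign_maps(symbols, start, end, fallback=True):
    """Group (addr, name, module) tuples, sorted by address, into maps.

    start/end bound the initial kernel map; fallback=False models the
    behaviour before the fix.
    """
    maps = {KERNEL: []}
    current = KERNEL
    prev_was_module = False
    splits = 0
    for addr, name, module in symbols:
        if module is not None:
            current = module
        elif prev_was_module:
            in_initial = start <= addr < end
            if not maps[KERNEL] or (fallback and in_initial):
                current = KERNEL  # reuse the initial kernel map
            else:
                current = f"[kernel].{splits}"  # split off a new map
                splits += 1
        prev_was_module = module is not None
        maps.setdefault(current, []).append(name)
    return maps
```

Fed a list where a "module 2" symbol sits inside the kernel range, the
model keeps later kernel symbols in [kernel.kallsyms] with
fallback=True and creates [kernel].0 with fallback=False.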
2 weeks ago perf tools: Fix split kallsyms DSO counting
Namhyung Kim [Tue, 2 Dec 2025 23:57:15 +0000 (15:57 -0800)] 
perf tools: Fix split kallsyms DSO counting

It's counted twice as it's increased after calling maps__insert().  I
guess we want to increase it only after it's added properly.

Reviewed-by: Ian Rogers <irogers@google.com>
Fixes: 2e538c4a1847291cf ("perf tools: Improve kernel/modules symbol lookup")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tools: Mark split kallsyms DSOs as loaded
Namhyung Kim [Tue, 2 Dec 2025 23:57:14 +0000 (15:57 -0800)] 
perf tools: Mark split kallsyms DSOs as loaded

maps__split_kallsyms() will split symbols to module DSOs if they come
from a module.  It also handles some unusual kernel symbols after modules
by creating new kernel maps like "[kernel].0".

But these are pseudo DSOs that exist only to hold those unexpected
symbols.  They should not be considered unloaded kernel DSOs.  Otherwise
the dso__load() for them will end up calling dso__load_kallsyms() and
then maps__split_kallsyms() again and again.

Reviewed-by: Ian Rogers <irogers@google.com>
Fixes: 2e538c4a1847291cf ("perf tools: Improve kernel/modules symbol lookup")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tools: Flush remaining samples w/o deferred callchains
Namhyung Kim [Thu, 20 Nov 2025 23:48:04 +0000 (15:48 -0800)] 
perf tools: Flush remaining samples w/o deferred callchains

It's possible that some kernel samples don't have matching deferred
callchain records when the profiling session ended before the
threads came back to userspace.  Let's flush the samples before
finishing the session.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
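
Illustrative sketch for the flush commit above (not part of the patch).
It is a small Python model of delivering, at the end of a session, the
samples that are still waiting for a deferred callchain. The queue
layout, the function name flush_pending and the deliver callback are
assumptions for this example only.

```python
from collections import deque


def flush_pending(pending, deliver):
    """Deliver, in arrival order, samples that never got a user callchain.

    Returns the number of samples flushed; 'pending' is left empty.
    """
    flushed = 0
    while pending:
        deliver(pending.popleft())  # hand the sample over unmerged
        flushed += 1
    return flushed
```

Calling this once before the session is torn down means no held-back
sample is silently dropped.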
2 weeks ago perf tools: Merge deferred user callchains
Namhyung Kim [Thu, 20 Nov 2025 23:48:03 +0000 (15:48 -0800)] 
perf tools: Merge deferred user callchains

Save samples with deferred callchains in a separate list and deliver
them after merging the user callchains.  If users don't want to merge
they can set tool->merge_deferred_callchains to false to prevent the
behavior.

Compared with the previous result, perf script now shows the merged
callchains.

  $ perf script
  ...
  pwd    2312   121.163435:     249113 cpu/cycles/P:
          ffffffff845b78d8 __build_id_parse.isra.0+0x218 ([kernel.kallsyms])
          ffffffff83bb5bf6 perf_event_mmap+0x2e6 ([kernel.kallsyms])
          ffffffff83c31959 mprotect_fixup+0x1e9 ([kernel.kallsyms])
          ffffffff83c31dc5 do_mprotect_pkey+0x2b5 ([kernel.kallsyms])
          ffffffff83c3206f __x64_sys_mprotect+0x1f ([kernel.kallsyms])
          ffffffff845e6692 do_syscall_64+0x62 ([kernel.kallsyms])
          ffffffff8360012f entry_SYSCALL_64_after_hwframe+0x76 ([kernel.kallsyms])
              7f18fe337fa7 mprotect+0x7 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
              7f18fe330e0f _dl_sysdep_start+0x7f (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
              7f18fe331448 _dl_start_user+0x0 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
  ...

The old output can be obtained using the --no-merge-callchain option.
Also, perf report can get the user callchain entry at the end.

  $ perf report --no-children --stdio -q -S __build_id_parse.isra.0
  # symbol: __build_id_parse.isra.0
       8.40%  pwd      [kernel.kallsyms]
              |
              ---__build_id_parse.isra.0
                 perf_event_mmap
                 mprotect_fixup
                 do_mprotect_pkey
                 __x64_sys_mprotect
                 do_syscall_64
                 entry_SYSCALL_64_after_hwframe
                 mprotect
                 _dl_sysdep_start
                 _dl_start_user

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
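
Illustrative sketch for the merge commit above (not part of the patch).
It is a minimal Python model of holding samples that carry a cookie and
appending the user frames once the matching deferred record arrives.
The dict-based event format and the function merge_deferred are
assumptions for this example; matching here is by cookie only, and the
real tool may key on more than that.

```python
def merge_deferred(events, merge=True):
    """Return events with deferred user callchains merged into samples."""
    pending, out = [], []
    for ev in events:
        if not merge:
            out.append(ev)  # pass everything through untouched
        elif ev["kind"] == "sample" and ev.get("cookie") is not None:
            pending.append(ev)  # hold until the user callchain arrives
        elif ev["kind"] == "deferred":
            matched = [s for s in pending if s["cookie"] == ev["cookie"]]
            pending = [s for s in pending if s["cookie"] != ev["cookie"]]
            for s in matched:
                out.append({
                    "kind": "sample",
                    "cookie": None,
                    # kernel frames first, then the deferred user frames
                    "callchain": s["callchain"] + ev["callchain"],
                })
        else:
            out.append(ev)
    # Samples that never matched are returned unmerged at the end.
    return out + pending
```

Passing merge=False returns the events unchanged, which mirrors the
idea of setting tool->merge_deferred_callchains to false.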
2 weeks ago perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED
Namhyung Kim [Thu, 20 Nov 2025 23:48:02 +0000 (15:48 -0800)] 
perf script: Display PERF_RECORD_CALLCHAIN_DEFERRED

Handle the deferred callchains in the script output.

  $ perf script
  ...
  pwd    2312   121.163435:     249113 cpu/cycles/P:
          ffffffff845b78d8 __build_id_parse.isra.0+0x218 ([kernel.kallsyms])
          ffffffff83bb5bf6 perf_event_mmap+0x2e6 ([kernel.kallsyms])
          ffffffff83c31959 mprotect_fixup+0x1e9 ([kernel.kallsyms])
          ffffffff83c31dc5 do_mprotect_pkey+0x2b5 ([kernel.kallsyms])
          ffffffff83c3206f __x64_sys_mprotect+0x1f ([kernel.kallsyms])
          ffffffff845e6692 do_syscall_64+0x62 ([kernel.kallsyms])
          ffffffff8360012f entry_SYSCALL_64_after_hwframe+0x76 ([kernel.kallsyms])
                 b00000006 (cookie) ([unknown])

  pwd    2312   121.163447: DEFERRED CALLCHAIN [cookie: b00000006]
              7f18fe337fa7 mprotect+0x7 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
              7f18fe330e0f _dl_sysdep_start+0x7f (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
              7f18fe331448 _dl_start_user+0x0 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf record: Add --call-graph fp,defer option for deferred callchains
Namhyung Kim [Thu, 20 Nov 2025 23:48:01 +0000 (15:48 -0800)] 
perf record: Add --call-graph fp,defer option for deferred callchains

Add a new callchain record mode option for deferred callchains.  For now
it only works with FP (frame-pointer) mode.

And add the missing feature detection logic to clear the flag on old
kernels.

  $ perf record --call-graph fp,defer -vv true
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             0 (PERF_TYPE_HARDWARE)
    size                             136
    config                           0 (PERF_COUNT_HW_CPU_CYCLES)
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|CALLCHAIN|PERIOD
    read_format                      ID|LOST
    disabled                         1
    inherit                          1
    mmap                             1
    comm                             1
    freq                             1
    enable_on_exec                   1
    task                             1
    sample_id_all                    1
    mmap2                            1
    comm_exec                        1
    ksymbol                          1
    bpf_event                        1
    defer_callchain                  1
    defer_output                     1
  ------------------------------------------------------------
  sys_perf_event_open: pid 162755  cpu 0  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -22
  switching off deferred callchain support

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tools: Minimal DEFERRED_CALLCHAIN support
Namhyung Kim [Thu, 20 Nov 2025 23:48:00 +0000 (15:48 -0800)] 
perf tools: Minimal DEFERRED_CALLCHAIN support

Add a new event type for deferred callchains and a new callback for the
struct perf_tool.  For now it doesn't actually handle the deferred
callchains; it just marks the sample if it has
PERF_CONTEXT_USER_DEFERRED in the callchain array.

At least, perf report can dump the raw data with this change.  Actually
this requires the next commit to enable attr.defer_callchain, but if you
already have a data file, it'll show the following result.

  $ perf report -D
  ...
  0x2158@perf.data [0x40]: event: 22
  .
  . ... raw event: size 64 bytes
  .  0000:  16 00 00 00 02 00 40 00 06 00 00 00 0b 00 00 00  ......@.........
  .  0010:  03 00 00 00 00 00 00 00 a7 7f 33 fe 18 7f 00 00  ..........3.....
  .  0020:  0f 0e 33 fe 18 7f 00 00 48 14 33 fe 18 7f 00 00  ..3.....H.3.....
  .  0030:  08 09 00 00 08 09 00 00 e6 7a e7 35 1c 00 00 00  .........z.5....

  121163447014 0x2158 [0x40]: PERF_RECORD_CALLCHAIN_DEFERRED(IP, 0x2): 2312/2312: 0xb00000006
  ... FP chain: nr:3
  .....  0: 00007f18fe337fa7
  .....  1: 00007f18fe330e0f
  .....  2: 00007f18fe331448
  : unhandled!

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
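
Illustrative sketch for the raw dump above (not part of the patch). It
decodes the 64 bytes shown by perf report -D using Python's struct
module. The field layout (8-byte header, u64 cookie, u64 nr, nr u64
addresses, then pid/tid and time) is inferred from this particular dump
and should be treated as an assumption; the trailing fields depend on
the sample_type in use.

```python
import struct

# The 64 raw bytes exactly as printed in the dump above.
RAW = bytes.fromhex(
    "16 00 00 00 02 00 40 00 06 00 00 00 0b 00 00 00 "
    "03 00 00 00 00 00 00 00 a7 7f 33 fe 18 7f 00 00 "
    "0f 0e 33 fe 18 7f 00 00 48 14 33 fe 18 7f 00 00 "
    "08 09 00 00 08 09 00 00 e6 7a e7 35 1c 00 00 00"
)


def decode_deferred(raw):
    """Decode one little-endian deferred-callchain record (assumed layout)."""
    ev_type, misc, size = struct.unpack_from("<IHH", raw, 0)  # event header
    cookie, nr = struct.unpack_from("<QQ", raw, 8)
    ips = struct.unpack_from(f"<{nr}Q", raw, 24)  # user callchain entries
    pid, tid, time = struct.unpack_from("<IIQ", raw, 24 + 8 * nr)
    return {
        "type": ev_type, "misc": misc, "size": size,
        "cookie": cookie, "ips": list(ips),
        "pid": pid, "tid": tid, "time": time,
    }
```

The decoded values line up with the summary line in the dump: event
type 22, misc 0x2, pid/tid 2312, cookie 0xb00000006, three addresses,
and timestamp 121163447014.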
2 weeks ago tools headers UAPI: Sync linux/perf_event.h for deferred callchains
Namhyung Kim [Thu, 20 Nov 2025 23:47:59 +0000 (15:47 -0800)] 
tools headers UAPI: Sync linux/perf_event.h for deferred callchains

It needs to sync with the kernel to support user space changes for the
deferred callchains.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf jevents: Skip optional metrics in metric group list
Ian Rogers [Tue, 2 Dec 2025 17:50:07 +0000 (09:50 -0800)] 
perf jevents: Skip optional metrics in metric group list

For metric groups, skip metrics in the list that are None. This allows
functions to better optionally return metrics.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
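
Illustrative sketch for the optional-metrics commit above (not part of
the patch). It shows the idea of dropping None entries when a metric
group is built, so helper functions can return None when a metric does
not apply. The Metric and MetricGroup classes here are minimal
stand-ins written for this example, and optional_metric is a
hypothetical helper; they are not the jevents implementations.

```python
class Metric:
    """Stand-in for a metric; only carries a name."""

    def __init__(self, name):
        self.name = name


class MetricGroup:
    """Stand-in for a metric group that ignores None entries."""

    def __init__(self, name, metrics):
        self.name = name
        # Skip metrics that an optional helper declined to produce.
        self.metrics = [m for m in metrics if m is not None]


def optional_metric(supported):
    """Hypothetical helper: return a metric only when it is supported."""
    return Metric("example_metric") if supported else None
```

A caller can then write MetricGroup("g", [Metric("a"),
optional_metric(False)]) without filtering the list first.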
2 weeks ago perf jevents: Drop duplicate pending metrics
Ian Rogers [Tue, 2 Dec 2025 17:50:06 +0000 (09:50 -0800)] 
perf jevents: Drop duplicate pending metrics

Drop adding a pending metric if there is an existing one. Ensure the
PMUs differ for hybrid systems.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf jevents: Move json encoding to its own functions
Ian Rogers [Tue, 2 Dec 2025 17:50:05 +0000 (09:50 -0800)] 
perf jevents: Move json encoding to its own functions

Have dedicated encode functions rather than having them embedded in
MetricGroup. This is to provide some uniformity in the Metric ToXXX
routines.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf jevents: Add threshold expressions to Metric
Ian Rogers [Tue, 2 Dec 2025 17:50:04 +0000 (09:50 -0800)] 
perf jevents: Add threshold expressions to Metric

Allow threshold expressions for metrics to be generated.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf jevents: Term list fix in event parsing
Ian Rogers [Tue, 2 Dec 2025 17:50:03 +0000 (09:50 -0800)] 
perf jevents: Term list fix in event parsing

Fix events seemingly broken apart at a comma.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf jevents: Support parsing negative exponents
Ian Rogers [Tue, 2 Dec 2025 17:50:02 +0000 (09:50 -0800)] 
perf jevents: Support parsing negative exponents

Support negative exponents when parsing from a json metric string by
making the numbers after the 'e' optional in the 'Event' insertion fix
up.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
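
Illustrative sketch for the negative-exponent commit above (not part of
the patch). It is a simplified Python model of the fix-up: a first pass
wraps anything that looks like an event name in Event(r"..."), which
also catches the 'e' of a number such as 1e-3, and a second pass undoes
that for scientific notation. Both regular expressions and the function
to_python_expr are simplified assumptions for this example, not the
patterns used by jevents.

```python
import re

# Simplified stand-in for the pass that wraps event names in Event(r"...").
# A name starts with a letter and runs until an operator, space, paren or comma.
EVENT_RE = re.compile(r'([a-zA-Z][^-+/* (),]*)')

# Undo the wrapping for scientific notation. "[0-9]*" lets the digits after
# 'e' be absent, which happens when a '-' sign ends the event-name match.
EXPONENT_FIXUP_RE = re.compile(r'([0-9]+)Event\(r"(e[0-9]*)"\)')


def to_python_expr(metric):
    """Wrap event names, then restore numbers like 2e3 and 1e-3."""
    py = EVENT_RE.sub(r'Event(r"\1")', metric)
    return EXPONENT_FIXUP_RE.sub(r'\1\2', py)
```

With a pattern that required at least one digit after 'e', the
intermediate text 1Event(r"e")-3 would be left as is; allowing zero
digits restores 1e-3.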
2 weeks ago perf jevents: Allow metric groups not to be named
Ian Rogers [Tue, 2 Dec 2025 17:50:01 +0000 (09:50 -0800)] 
perf jevents: Allow metric groups not to be named

It can be convenient to have unnamed metric groups for the sake of
organizing other metrics and metric groups. An unspecified name
shouldn't contribute to the MetricGroup json value, so don't record
it.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf jevents: Add descriptions to metricgroup abstraction
Ian Rogers [Tue, 2 Dec 2025 17:50:00 +0000 (09:50 -0800)] 
perf jevents: Add descriptions to metricgroup abstraction

Add a function to recursively generate metric group descriptions.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf jevents: Update metric constraint support
Ian Rogers [Tue, 2 Dec 2025 17:49:59 +0000 (09:49 -0800)] 
perf jevents: Update metric constraint support

Previous metric constraints were binary, either none or don't group
when the NMI watchdog is present. Update to match the definitions in
'enum metric_event_groups' in pmu-events.h.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf jevents: Allow multiple metricgroups.json files
Ian Rogers [Tue, 2 Dec 2025 17:49:58 +0000 (09:49 -0800)] 
perf jevents: Allow multiple metricgroups.json files

Allow multiple metricgroups.json files by handling any file ending
with metricgroups.json as a metricgroups file.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
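
Illustrative sketch for the metricgroups.json commit above (not part of
the patch). It shows the suffix test described in the message: any file
whose name ends with metricgroups.json is treated as a metricgroups
file. The function name is_metricgroups_file and the file names in the
usage note are hypothetical.

```python
def is_metricgroups_file(filename):
    """Return True for any file name ending with 'metricgroups.json'."""
    return filename.endswith("metricgroups.json")
```

Under this rule both metricgroups.json and a name such as
extra-metricgroups.json qualify, while metrics.json does not.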
2 weeks ago perf ilist: Be tolerant of reading a metric on the wrong CPU
Ian Rogers [Tue, 2 Dec 2025 17:49:57 +0000 (09:49 -0800)] 
perf ilist: Be tolerant of reading a metric on the wrong CPU

This happens on hybrid machine metrics. Be tolerant and don't cause
the ilist application to crash with an exception.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf python: Correct copying of metric_leader in an evsel
Ian Rogers [Tue, 2 Dec 2025 17:49:56 +0000 (09:49 -0800)] 
perf python: Correct copying of metric_leader in an evsel

Ensure the metric_leader is copied and set up correctly. In
compute_metric determine the correct metric_leader event to match the
requested CPU. Fixes the handling of metrics particularly on hybrid
machines.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf test: Add python JIT dump test
Namhyung Kim [Tue, 25 Nov 2025 08:07:47 +0000 (00:07 -0800)] 
perf test: Add python JIT dump test

Add a test case for the python interpreter like below so that we can
make sure it won't break again.  To validate the effect of build-ID
generation, it adds and removes the JIT'ed DSOs to/from the build-ID
cache for the test.

  $ perf test -vv jitdump
   84: python profiling with jitdump:
  --- start ---
  test child forked, pid 214316
  Run python with -Xperf_jit
  [ perf record: Woken up 5 times to write data ]
  [ perf record: Captured and wrote 1.180 MB /tmp/__perf_test.perf.data.XbqZNm (140 samples) ]
  Generate JIT-ed DSOs using perf inject
  Add JIT-ed DSOs to the build-ID cache
  Check the symbol containing the script name
  Found 108 matching lines
  Remove JIT-ed DSOs from the build-ID cache
  ---- end(0) ----
   84: python profiling with jitdump                                   : Ok

Cc: Pablo Galindo <pablogsal@gmail.com>
Link: https://docs.python.org/3/howto/perf_profiling.html#how-to-work-without-frame-pointers
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf jitdump: Add sym/str-tables to build-ID generation
Namhyung Kim [Tue, 25 Nov 2025 08:07:46 +0000 (00:07 -0800)] 
perf jitdump: Add sym/str-tables to build-ID generation

It was reported that python backtrace with JIT dump was broken after the
change to the built-in SHA-1 implementation.  It seems python generates
the same JIT code for each function.  They become separate DSOs but the
contents are the same.  The only difference is the symbol name.

But this caused a problem: every JIT'ed DSO has the same build-ID, which
confuses perf.  It resulted in no python symbols (from JIT) in the
output.

Looking back at the original code before the conversion, it used the
load_addr as well as the code section to distinguish each DSO.  But it'd
be better to use contents of symtab and strtab instead as it aligns with
some linker behaviors.

This patch adds a buffer to save all the contents in a single place for
SHA-1 calculation.  Probably we need to add sha1_update() or similar to
update the existing hash value with different contents and use it here.
But it's out of scope for this change and I'd like something that can be
backported to the stable trees easily.
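
The idea can be illustrated with a small sketch.  The function names
and the byte layout (code, then symtab, then strtab) are assumptions
made for illustration, not the genelf implementation.  It shows that
hashing the code alone cannot tell two DSOs apart, and that a single
concatenated buffer gives the same digest an incremental update would.

```python
import hashlib


def jit_build_id(code: bytes, symtab: bytes, strtab: bytes) -> str:
    """Hash code plus symbol/string tables gathered in one buffer."""
    buf = code + symtab + strtab
    return hashlib.sha1(buf).hexdigest()


def jit_build_id_incremental(code: bytes, symtab: bytes, strtab: bytes) -> str:
    """Same digest, computed by updating the hash piece by piece."""
    h = hashlib.sha1()
    for part in (code, symtab, strtab):
        h.update(part)
    return h.hexdigest()
```

Two DSOs with identical code but different symbol names then get
different IDs, because the string table differs.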

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Pablo Galindo <pablogsal@gmail.com>
Cc: Fangrui Song <maskray@sourceware.org>
Link: https://github.com/python/cpython/issues/139544
Fixes: e3f612c1d8f3945b ("perf genelf: Remove libcrypto dependency and use built-in sha1()")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf test: Fix hybrid testing of event fallback test
Ian Rogers [Mon, 1 Dec 2025 23:11:36 +0000 (15:11 -0800)] 
perf test: Fix hybrid testing of event fallback test

The mem-loads-aux event exists on hybrid systems but the "cpu" PMU
does not. This causes an event parsing error which erroneously makes
the test look like it is failing. Avoid naming the PMU to avoid
this. Rather than cleaning up perf.data in the directory the test is
run, explicitly send the 'perf record' output to /dev/null and avoid
any cleanup scripts.

Fixes: fc9c17b22352 ("perf test: Add a perf event fallback test")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tools: Remove a trailing newline in the event terms
Namhyung Kim [Tue, 2 Dec 2025 23:01:31 +0000 (15:01 -0800)] 
perf tools: Remove a trailing newline in the event terms

So that it can show the correct encoding info in the JSON output.

  $ perf list -j hw
  [
  {
          "Unit": "cpu",
          "Topic": "legacy hardware",
          "EventName": "branch-instructions",
          "EventType": "Kernel PMU event",
          "BriefDescription": "Retired branch instructions [This event is an alias of branches]",
          "Encoding": "cpu/event=0xc4/"
  },
  ...
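
Why the newline matters for JSON can be shown with a short sketch.
`format_encoding` is a hypothetical helper, not a perf function; it
only demonstrates that a term string still carrying its trailing
newline ends up as an escaped `\n` inside the "Encoding" value.

```python
import json


def format_encoding(pmu: str, terms: str) -> str:
    """Build a pmu/terms/ encoding string, dropping trailing newlines."""
    return f"{pmu}/{terms.rstrip(chr(10))}/"


# Without the strip, the newline is serialized as a backslash-n escape.
unstripped = json.dumps({"Encoding": "cpu/event=0xc4\n/"})
stripped = json.dumps({"Encoding": format_encoding("cpu", "event=0xc4\n")})
```

Here `stripped` holds the clean `cpu/event=0xc4/` value, while
`unstripped` contains the escape sequence.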

Reviewed-by: Ian Rogers <irogers@google.com>
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago perf trace: Skip internal syscall arguments
Namhyung Kim [Thu, 27 Nov 2025 04:44:18 +0000 (20:44 -0800)] 
perf trace: Skip internal syscall arguments

Recent changes in the linux-next kernel will add a new field to syscall
events that holds the contents of userspace buffers, like below.

  # cat /sys/kernel/tracing/events/syscalls/sys_enter_write/format
  name: sys_enter_write
  ID: 758
  format:
          field:unsigned short common_type;       offset:0;       size:2; signed:0;
          field:unsigned char common_flags;       offset:2;       size:1; signed:0;
          field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
          field:int common_pid;   offset:4;       size:4; signed:1;

          field:int __syscall_nr; offset:8;       size:4; signed:1;
          field:unsigned int fd;  offset:16;      size:8; signed:0;
          field:const char * buf; offset:24;      size:8; signed:0;
          field:size_t count;     offset:32;      size:8; signed:0;
          field:__data_loc char[] __buf_val;      offset:40;      size:4; signed:0;

  print fmt: "fd: 0x%08lx, buf: 0x%08lx (%s), count: 0x%08lx", ((unsigned long)(REC->fd)),
             ((unsigned long)(REC->buf)), __print_dynamic_array(__buf_val, 1),
             ((unsigned long)(REC->count))

We have a different way to handle those arguments, and this change
confuses perf trace and makes some tests fail.  Fix it by skipping
the new fields that have the "__data_loc char[]" type.
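
The filtering can be sketched as follows.  This is a simplified parser
for the format text shown above, with hypothetical helper names; perf
trace itself does this in C on the parsed tracepoint fields.

```python
from typing import List, Tuple

DATA_LOC_STRING = "__data_loc char[]"


def parse_fields(format_text: str) -> List[Tuple[str, str]]:
    """Return (type, name) pairs for every 'field:' line of a format file."""
    fields = []
    for line in format_text.splitlines():
        line = line.strip()
        if not line.startswith("field:"):
            continue
        # "field:const char * buf; offset:24; ..." -> "const char * buf"
        decl = line[len("field:"):].split(";", 1)[0].strip()
        ctype, name = decl.rsplit(None, 1)
        fields.append((ctype, name))
    return fields


def visible_fields(format_text: str) -> List[str]:
    """Names of fields that are not of the '__data_loc char[]' type."""
    return [name for ctype, name in parse_fields(format_text)
            if ctype != DATA_LOC_STRING]
```

For the sys_enter_write format above, `__buf_val` is dropped while
`fd`, `buf` and `count` are kept.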

Maybe we can switch to this instead of the BPF augmentation later.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Howard Chu <howardchu95@gmail.com>
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago perf tools: Don't read build-ids from non-regular files
James Clark [Mon, 24 Nov 2025 10:59:08 +0000 (10:59 +0000)] 
perf tools: Don't read build-ids from non-regular files

Simplify the build ID reading code by removing the non-blocking option.
Having to pass the correct option to this function was fragile and a
mistake would result in a hang, see the linked fix. Furthermore,
compressed files are always opened blocking anyway, ignoring the
non-blocking option.

We also don't expect to read build IDs from non-regular files. The only
hits to this function that are non-regular are devices that won't be elf
files with build IDs, for example "/dev/dri/renderD129".

Now instead of opening these as non-blocking and failing to read, we
skip them. Even if something like a pipe or character device did have a
build ID, I don't think it would have worked because you need to call
read() in a loop, check for -EAGAIN and handle timeouts to make
non-blocking reads work.
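
The skip can be illustrated with a stat-based check.  This is a sketch
of the idea only; `should_read_build_id` is a hypothetical name and the
perf code performs the equivalent test in C.

```python
import os
import stat


def should_read_build_id(path: str) -> bool:
    """Only regular files are candidates for build-ID reading."""
    try:
        st = os.stat(path)
    except OSError:
        # Missing or inaccessible paths are skipped as well.
        return False
    return stat.S_ISREG(st.st_mode)
```

Devices, pipes and directories are rejected before any open() is
attempted, so no blocking or non-blocking mode has to be chosen.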

Link: https://lore.kernel.org/linux-perf-users/20251022-james-perf-fix-dso-block-v1-1-c4faab150546@linaro.org/
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago perf vendor events riscv: add T-HEAD C920V2 JSON support
Inochi Amaoto [Tue, 14 Oct 2025 01:48:29 +0000 (09:48 +0800)] 
perf vendor events riscv: add T-HEAD C920V2 JSON support

T-HEAD C920 has a V2 iteration, which supports Sscompmf. The V2
iteration supports the same perf events as V1.

Reuse T-HEAD c900-legacy JSON file for T-HEAD C920V2.

Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Acked-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago perf pmu: fix duplicate conditional statement
Anubhav Shelat [Tue, 25 Nov 2025 11:41:18 +0000 (11:41 +0000)] 
perf pmu: fix duplicate conditional statement

Remove duplicate check for PERF_PMU_TYPE_DRM_END in perf_pmu__kind.

Fixes: f0feb21e0a10 ("perf pmu: Add PMU kind to simplify differentiating")
Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Closes: https://lore.kernel.org/linux-perf-users/CA+G8Dh+wLx+FvjjoEkypqvXhbzWEQVpykovzrsHi2_eQjHkzQA@mail.gmail.com/
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago perf docs: arm-spe: Document new SPE filtering features
James Clark [Tue, 11 Nov 2025 11:37:59 +0000 (11:37 +0000)] 
perf docs: arm-spe: Document new SPE filtering features

FEAT_SPE_EFT and FEAT_SPE_FDS etc have new user facing format attributes
so document them. Also document existing 'event_filter' bits that were
missing from the doc and the fact that latency values are stored in the
weight field.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago perf tools: Add support for perf_event_attr::config4
James Clark [Tue, 11 Nov 2025 11:37:58 +0000 (11:37 +0000)] 
perf tools: Add support for perf_event_attr::config4

perf_event_attr has gained a new field, config4, so add support for it
by extending the existing configN support.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago tools headers UAPI: Sync linux/perf_event.h with the kernel sources
James Clark [Tue, 11 Nov 2025 11:37:57 +0000 (11:37 +0000)] 
tools headers UAPI: Sync linux/perf_event.h with the kernel sources

To pick up the config4 changes.

Tested-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf: replace strcpy() with strncpy() in util/jitdump.c
Hrishikesh Suresh [Thu, 20 Nov 2025 04:16:10 +0000 (23:16 -0500)] 
perf: replace strcpy() with strncpy() in util/jitdump.c

Usage of strcpy() can lead to buffer overflows. Therefore, it has been
replaced with strncpy(). The output file path is provided as a parameter
and might be restricted by command-line by default. But this defensive
patch will prevent any potential overflow, making the code more robust
against future changes in input handling.

Testing:
- ran perf test from tools/perf and did not observe any regression with
  the earlier code

Signed-off-by: Hrishikesh Suresh <hrishikesh123s@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf list: Support filtering in JSON output
Namhyung Kim [Thu, 20 Nov 2025 00:47:26 +0000 (16:47 -0800)] 
perf list: Support filtering in JSON output

Like regular output mode, it should honor command line arguments to
limit to a certain type of PMUs or events.

  $ perf list -j hw
  [
  {
          "Unit": "cpu",
          "Topic": "legacy hardware",
          "EventName": "branch-instructions",
          "EventType": "Kernel PMU event",
          "BriefDescription": "Retired branch instructions [This event is an alias of branches]",
          "Encoding": "cpu/event=0xc4\n/"
  },
  {
          "Unit": "cpu",
          "Topic": "legacy hardware",
          "EventName": "branch-misses",
          "EventType": "Kernel PMU event",
          "BriefDescription": "Mispredicted branch instructions",
          "Encoding": "cpu/event=0xc5\n/"
  },
  ...

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf list: Share print state with JSON output
Namhyung Kim [Thu, 20 Nov 2025 00:47:25 +0000 (16:47 -0800)] 
perf list: Share print state with JSON output

The JSON print state has only one different field (need_sep).  Let's
add the default print state to the json state and use it.  Then we can
use the 'ps' variable to update the state properly.

This is a preparation for the next commit.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf list: Print matching PMU events for --unit
Namhyung Kim [Thu, 20 Nov 2025 00:47:24 +0000 (16:47 -0800)] 
perf list: Print matching PMU events for --unit

When --unit option is used, pmu_glob is set to the argument.  It should
match with event PMU and display the matching ones only.  But it also
shows raw events and metrics after that.

  $ perf list --unit tool
  List of pre-defined events (to be used in -e or -M):

  tool:
    core_wide
         [1 if not SMT,if SMT are events being gathered on all SMT threads 1 otherwise 0. Unit: tool]
    duration_time
         [Wall clock interval time in nanoseconds. Unit: tool]
    has_pmem
         [1 if persistent memory installed otherwise 0. Unit: tool]
    num_cores
         [Number of cores. A core consists of 1 or more thread,with each thread being associated with a logical Linux CPU. Unit: tool]
    num_cpus
         [Number of logical Linux CPUs. There may be multiple such CPUs on a core. Unit: tool]
    ...
    rNNN                                               [Raw event descriptor]
    cpu/event=0..255,pc,edge,.../modifier              [Raw event descriptor]
         [(see 'man perf-list' or 'man perf-record' on how to encode it)]
    breakpoint//modifier                               [Raw event descriptor]
    cstate_core/event=0..0xffffffffffffffff/modifier   [Raw event descriptor]
    cstate_pkg/event=0..0xffffffffffffffff/modifier    [Raw event descriptor]
    drm_i915//modifier                                 [Raw event descriptor]
    hwmon_acpitz//modifier                             [Raw event descriptor]
    hwmon_ac//modifier                                 [Raw event descriptor]
    hwmon_bat0//modifier                               [Raw event descriptor]
    hwmon_coretemp//modifier                           [Raw event descriptor]
    ...

  Metric Groups:

  Backend: [Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet]
    tma_core_bound
         [This metric represents fraction of slots where Core non-memory issues were of a bottleneck]
    tma_info_core_ilp
         [Instruction-Level-Parallelism (average number of uops executed when there is execution) per thread (logical-processor)]
    tma_info_memory_l2mpki
         [L2 cache true misses per kilo instruction for retired demand loads]
    ...

This change makes it print the tool PMU events only.
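
The intended behaviour can be sketched as a glob filter over
(PMU, event) pairs.  `filter_by_unit` and the sample event list are
illustrative assumptions, and Python's fnmatch stands in for perf's own
glob matching.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Optional, Tuple


def filter_by_unit(events: Iterable[Tuple[str, str]],
                   pmu_glob: Optional[str]) -> List[str]:
    """Keep event names whose PMU matches pmu_glob; keep all if no glob is set."""
    return [name for pmu, name in events
            if pmu_glob is None or fnmatchcase(pmu, pmu_glob)]
```

With a glob of "tool", only the tool PMU events remain and nothing
from other PMUs is listed.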

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf test all metrics: Fully ignore Default metric failures
Ian Rogers [Wed, 19 Nov 2025 19:30:47 +0000 (11:30 -0800)] 
perf test all metrics: Fully ignore Default metric failures

Determine if a metric is default from `perf list --raw-dump $m` eg:
```
$ perf list --raw-dump l1_prefetch_miss_rate
Default4 l1_prefetch_miss_rate
```
If a metric has "not supported" or "no supported events" then ignore
these failures for default metrics. Tidy up the skip/fail messages in
the output to make them easier to spot/read.
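
One way to make that decision from the raw-dump text is sketched
below.  The exact check used by the test script is not shown here, so
`is_default_metric` and its matching rule (any token starting with
"Default" next to the metric name) are assumptions.

```python
def is_default_metric(raw_dump: str, metric: str) -> bool:
    """Guess whether `perf list --raw-dump <metric>` output marks a Default metric."""
    tokens = raw_dump.split()
    return metric in tokens and any(t.startswith("Default") for t in tokens)
```

For the output above, "Default4 l1_prefetch_miss_rate" is classified
as a default metric.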

```
$ perf test -vv "all metrics"
...
Testing llc_miss_rate
[Ignored llc_miss_rate] failed but as a Default metric this can be expected
Error: No supported events found. The LLC-loads event is not supported.
...
```

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Closes: https://lore.kernel.org/linux-perf-users/20251119104751.51960-1-tmricht@linux.ibm.com/
Reported-by: Namhyung Kim <namhyung@kernel.org>
Reported-by: James Clark <james.clark@linaro.org>
Closes: https://lore.kernel.org/lkml/aRi9xnwdLh3Dir9f@google.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf evsel: Skip store_evsel_ids for non-perf-event PMUs
Ian Rogers [Fri, 14 Nov 2025 22:05:47 +0000 (14:05 -0800)] 
perf evsel: Skip store_evsel_ids for non-perf-event PMUs

The IDs are associated with perf events and not applicable to non-perf
event PMUs. The failure to generate the ids was causing perf stat
record to fail.

```
$ perf stat record -a sleep 1

 Performance counter stats for 'system wide':

            47,941      context-switches                 #      nan cs/sec  cs_per_second
              0.00 msec cpu-clock                        #      0.0 CPUs  CPUs_utilized
             3,261      cpu-migrations                   #      nan migrations/sec  migrations_per_second
               516      page-faults                      #      nan faults/sec  page_faults_per_second
         7,525,483      cpu_core/branch-misses/          #      2.3 %  branch_miss_rate
       322,069,004      cpu_core/branches/               #      nan M/sec  branch_frequency
     1,895,684,291      cpu_core/cpu-cycles/             #      nan GHz  cycles_frequency
     2,789,777,426      cpu_core/instructions/           #      1.5 instructions  insn_per_cycle
         7,074,765      cpu_atom/branch-misses/          #      3.2 %  branch_miss_rate         (49.89%)
       224,225,412      cpu_atom/branches/               #      nan M/sec  branch_frequency     (50.29%)
     2,061,679,981      cpu_atom/cpu-cycles/             #      nan GHz  cycles_frequency       (50.33%)
     2,011,242,533      cpu_atom/instructions/           #      1.0 instructions  insn_per_cycle  (50.33%)
             TopdownL1 (cpu_core)                        #      9.0 %  tma_bad_speculation
                                                         #     28.3 %  tma_frontend_bound
                                                         #     35.2 %  tma_backend_bound
                                                         #     27.5 %  tma_retiring
             TopdownL1 (cpu_atom)                        #     36.8 %  tma_backend_bound        (59.65%)
                                                         #     22.8 %  tma_frontend_bound       (59.60%)
                                                         #     11.6 %  tma_bad_speculation
                                                         #     28.8 %  tma_retiring             (59.59%)

       1.006777519 seconds time elapsed

$ perf stat report

 Performance counter stats for 'perf':

     1,013,376,154      duration_time
     <not counted>      duration_time
     <not counted>      duration_time
     <not counted>      duration_time
     <not counted>      duration_time
     <not counted>      duration_time
            47,941      context-switches
              0.00 msec cpu-clock
             3,261      cpu-migrations
               516      page-faults
         7,525,483      cpu_core/branch-misses/
       322,069,814      cpu_core/branches/
       322,069,004      cpu_core/branches/
     1,895,684,291      cpu_core/cpu-cycles/
     1,895,679,209      cpu_core/cpu-cycles/
     2,789,777,426      cpu_core/instructions/
     <not counted>      cpu_core/cpu-cycles/
     <not counted>      cpu_core/stalled-cycles-frontend/
     <not counted>      cpu_core/cpu-cycles/
     <not counted>      cpu_core/stalled-cycles-backend/
     <not counted>      cpu_core/stalled-cycles-backend/
     <not counted>      cpu_core/instructions/
     <not counted>      cpu_core/stalled-cycles-frontend/
         7,074,765      cpu_atom/branch-misses/                                                 (49.89%)
       221,679,088      cpu_atom/branches/                                                      (49.89%)
       224,225,412      cpu_atom/branches/                                                      (50.29%)
     2,061,679,981      cpu_atom/cpu-cycles/                                                    (50.33%)
     2,016,259,567      cpu_atom/cpu-cycles/                                                    (50.33%)
     2,011,242,533      cpu_atom/instructions/                                                  (50.33%)
     <not counted>      cpu_atom/cpu-cycles/
     <not counted>      cpu_atom/stalled-cycles-frontend/
     <not counted>      cpu_atom/cpu-cycles/
     <not counted>      cpu_atom/stalled-cycles-backend/
     <not counted>      cpu_atom/stalled-cycles-backend/
     <not counted>      cpu_atom/instructions/
     <not counted>      cpu_atom/stalled-cycles-frontend/
        17,145,113      cpu_core/INT_MISC.UOP_DROPPING/
    10,594,226,100      cpu_core/TOPDOWN.SLOTS/
     2,919,021,401      cpu_core/topdown-retiring/
       943,101,838      cpu_core/topdown-bad-spec/
     3,031,152,533      cpu_core/topdown-fe-bound/
     3,739,756,791      cpu_core/topdown-be-bound/
     1,909,501,648      cpu_atom/CPU_CLK_UNHALTED.CORE/                                         (60.04%)
     3,516,608,359      cpu_atom/TOPDOWN_BE_BOUND.ALL/                                          (59.65%)
     2,179,403,876      cpu_atom/TOPDOWN_FE_BOUND.ALL/                                          (59.60%)
     2,745,732,458      cpu_atom/TOPDOWN_RETIRING.ALL/                                          (59.59%)

       1.006777519 seconds time elapsed

Some events weren't counted. Try disabling the NMI watchdog:
        echo 0 > /proc/sys/kernel/nmi_watchdog
        perf stat ...
        echo 1 > /proc/sys/kernel/nmi_watchdog
```

Reported-by: James Clark <james.clark@linaro.org>
Closes: https://lore.kernel.org/lkml/ca0f0cd3-7335-48f9-8737-2f70a75b019a@linaro.org/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf pmu: Add PMU kind to simplify differentiating
Ian Rogers [Fri, 14 Nov 2025 22:05:46 +0000 (14:05 -0800)] 
perf pmu: Add PMU kind to simplify differentiating

Rather than perf_pmu__is_xxx calls, add a notion of kind so that a
single call can be used.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf header: Switch "cpu" for find_core_pmu in caps feature writing
Ian Rogers [Fri, 14 Nov 2025 22:05:45 +0000 (14:05 -0800)] 
perf header: Switch "cpu" for find_core_pmu in caps feature writing

Writing currently fails on non-x86 and hybrid CPUs. Switch to the more
regular find_core_pmu that is normally used in this case. Tested on a
hybrid alderlake system.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf test maps: Additional maps__fixup_overlap_and_insert tests
Ian Rogers [Wed, 19 Nov 2025 05:05:55 +0000 (21:05 -0800)] 
perf test maps: Additional maps__fixup_overlap_and_insert tests

Add additional tests to the maps test covering
maps__fixup_overlap_and_insert. Change the test suite to hold more
than just 1 test.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf maps: Avoid RC_CHK use after free
Ian Rogers [Wed, 19 Nov 2025 05:05:54 +0000 (21:05 -0800)] 
perf maps: Avoid RC_CHK use after free

The case of __maps__fixup_overlap_and_insert where the "new" map
covers existing mappings can create a use-after-free with reference
count checking enabled. The issue is that "pos" holds a map pointer
from maps_by_address that is put from maps_by_address but then used to
look for a map in maps_by_name (the compared map is now a
use-after-free). The issue stems from using maps__remove which redoes
some of the searches already done by __maps__fixup_overlap_and_insert,
so optimize the code (by avoiding repeated searches) and avoid the
use-after-free by inlining the appropriate removal code.

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202511141407.f9edcfa6-lkp@intel.com
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf stat: Read tool events last
Ian Rogers [Tue, 18 Nov 2025 21:13:24 +0000 (13:13 -0800)] 
perf stat: Read tool events last

When reading a metric like memory bandwidth on multiple sockets, the
additional sockets will be on CPUs > 0. Because of the affinity
reading, the counters are read on CPU 0 along with the time, then the
later sockets are read. This can lead to the later sockets having a
bandwidth larger than is possible for the period of time. To avoid
this move the reading of tool events to occur after all other events
are read.
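
The reordering amounts to a stable partition, sketched below.  The
`Event` type, the `read_order` helper and the PMU names in the usage
note are hypothetical; perf stat implements the ordering in its C read
loop.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    pmu: str
    name: str


def read_order(events: List[Event]) -> List[Event]:
    """Move tool events to the end, keeping the relative order otherwise."""
    # sorted() is stable and False sorts before True, so non-tool events
    # keep their order and tool events follow in their original order.
    return sorted(events, key=lambda e: e.pmu == "tool")
```

A list such as duration_time, socket 0 counter, socket 1 counter is
then read as socket 0, socket 1, duration_time, so the time value
covers the reads of every socket.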

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Synthesize memory samples for SIMD operations
Leo Yan [Wed, 12 Nov 2025 18:24:43 +0000 (18:24 +0000)] 
perf arm_spe: Synthesize memory samples for SIMD operations

Synthesize memory samples for SIMD operations (including Advanced SIMD,
SVE, and SME). To provide complete information, also generate data
source entries for SIMD operations.

Since memory operations are not limited to load and store, set
PERF_MEM_OP_STORE if the operation does not fall into these cases.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Expose SIMD information in other operations
Leo Yan [Wed, 12 Nov 2025 18:24:42 +0000 (18:24 +0000)] 
perf arm_spe: Expose SIMD information in other operations

The other operations contain SME data processing, ASE (Advanced SIMD)
and floating-point operations. Expose this info in the records.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Report GCS in record
Leo Yan [Wed, 12 Nov 2025 18:24:41 +0000 (18:24 +0000)] 
perf arm_spe: Report GCS in record

Report GCS related info in records.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Report memset and memcpy in records
Leo Yan [Wed, 12 Nov 2025 18:24:40 +0000 (18:24 +0000)] 
perf arm_spe: Report memset and memcpy in records

Expose memset and memcpy related info in records.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Report associated info for SVE / SME operations
Leo Yan [Wed, 12 Nov 2025 18:24:39 +0000 (18:24 +0000)] 
perf arm_spe: Report associated info for SVE / SME operations

SVE / SME operations can be predicated or gather load / scatter store.
Save the relevant info into the record.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Report extended memory operations in records
Leo Yan [Wed, 12 Nov 2025 18:24:38 +0000 (18:24 +0000)] 
perf arm_spe: Report extended memory operations in records

Extended memory operations include atomic (AT), acquire/release (AR),
and exclusive (EXCL) operations. Save the relevant information
in the records.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Report MTE allocation tag in record
Leo Yan [Wed, 12 Nov 2025 18:24:37 +0000 (18:24 +0000)] 
perf arm_spe: Report MTE allocation tag in record

Save MTE tag info in memory record.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Report register access in record
Leo Yan [Wed, 12 Nov 2025 18:24:36 +0000 (18:24 +0000)] 
perf arm_spe: Report register access in record

Record register access info for load / store operations.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Introduce data processing macro for SVE operations
Leo Yan [Wed, 12 Nov 2025 18:24:35 +0000 (18:24 +0000)] 
perf arm_spe: Introduce data processing macro for SVE operations

Introduce the ARM_SPE_OP_DP (data processing) macro as associated
information for SVE operations. For SVE register access, only
ARM_SPE_OP_SVE is set; for SVE data processing, both ARM_SPE_OP_SVE and
ARM_SPE_OP_DP are set together.
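
The flag combination can be modelled as below.  The bit positions and
the `classify` helper are assumptions for illustration; the real macros
live in the arm-spe decoder headers.

```python
from enum import IntFlag


class SpeOp(IntFlag):
    # Hypothetical bit positions, chosen only for this sketch.
    SVE = 1 << 0
    DP = 1 << 1


def classify(op: SpeOp) -> str:
    """Distinguish SVE register access from SVE data processing."""
    if not op & SpeOp.SVE:
        return "not SVE"
    return "SVE data processing" if op & SpeOp.DP else "SVE register access"
```

Here `SpeOp.SVE` alone denotes register access, while
`SpeOp.SVE | SpeOp.DP` denotes data processing.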

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Consolidate operation types
Leo Yan [Wed, 12 Nov 2025 18:24:34 +0000 (18:24 +0000)] 
perf arm_spe: Consolidate operation types

Consolidate operation types as follows:

(a) Extract the second-level types into separate enums.
(b) The second-level types for memory and SIMD operations are classified
    by modules. E.g., an operation may relate to general register,
    SIMD/FP, SVE, etc.
(c) The associated information tells details. E.g., an operation is
    load or store, whether it is atomic operation, etc.

Start the enum items for the second-level types from 8 to accommodate
more entries within a 32-bit integer.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Remove unused operation types
Leo Yan [Wed, 12 Nov 2025 18:24:33 +0000 (18:24 +0000)] 
perf arm_spe: Remove unused operation types

Remove unused SVE operation types. These operations will be reintroduced
in subsequent refactoring, but with a different format.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Decode SME data processing packet
Leo Yan [Wed, 12 Nov 2025 18:24:32 +0000 (18:24 +0000)] 
perf arm_spe: Decode SME data processing packet

For SME data processing, decode its Effective vector length or Tile Size
(ETS), and print whether it is a floating-point operation.

After:

  .  00000000:  49 00                                           SME-OTHER ETS 1024 FP
  .  00000002:  b2 18 3c d7 83 00 80 ff ff                      VA 0xffff800083d73c18
  .  0000000b:  9a 00 00                                        LAT 0 XLAT
  .  0000000e:  43 00                                           DATA-SOURCE 0

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Decode ASE and FP fields in other operation
Leo Yan [Wed, 12 Nov 2025 18:24:31 +0000 (18:24 +0000)] 
perf arm_spe: Decode ASE and FP fields in other operation

Add a check for the other operation, which prevents incorrect
classification. Parse the ASE and FP fields.

After:

  .  0000002f:  48 06                                           OTHER ASE FP INSN-OTHER
  .  00000031:  b2 08 80 48 01 08 00 ff ff                      VA 0xffff000801488008
  .  0000003a:  9a 00 00                                        LAT 0 XLAT
  .  0000003d:  42 16                                           EV RETIRED L1D-ACCESS TLB-ACCESS

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Rename SPE_OP_PKT_IS_OTHER_SVE_OP macro
Leo Yan [Wed, 12 Nov 2025 18:24:30 +0000 (18:24 +0000)] 
perf arm_spe: Rename SPE_OP_PKT_IS_OTHER_SVE_OP macro

Rename the macro to SPE_OP_PKT_OTHER_SUBCLASS_SVE to unify naming.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Decode GCS operation
Leo Yan [Wed, 12 Nov 2025 18:24:29 +0000 (18:24 +0000)] 
perf arm_spe: Decode GCS operation

Decode a load or store from a GCS operation and the associated "common"
field.

After:

  .  00000000:  49 44                                           LD GCS COMM
  .  00000002:  b2 18 3c d7 83 00 80 ff ff                      VA 0xffff800083d73c18
  .  0000000b:  9a 00 00                                        LAT 0 XLAT
  .  0000000e:  43 00                                           DATA-SOURCE 0

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Unify operation naming
Leo Yan [Wed, 12 Nov 2025 18:24:28 +0000 (18:24 +0000)] 
perf arm_spe: Unify operation naming

Rename the extended subclass and the SVE/SME register access subclass, so
that the naming is consistent across all subclasses.

Add a log string "SVE-SME-REG" for the SVE/SME register access, which is
easier to parse.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Fix memset subclass in operation
Leo Yan [Wed, 12 Nov 2025 18:24:27 +0000 (18:24 +0000)] 
perf arm_spe: Fix memset subclass in operation

The operation subclass is extracted from bits [7..1] of the payload.
Since bit [0] is not parsed, there is no chance to match the memset type
(0x25). As a result, the memset payload is never parsed successfully.

Instead of extracting a unified bit field, change to extract the
specific bits for each operation subclass.
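
The problem can be illustrated with a minimal sketch. It assumes the old
code masked bits [7..1] in place (without shifting) and compared the
result against the subclass value; the macro and function names below
are hypothetical, and the full 8-bit mask used for the per-subclass
check is an assumption rather than the mask the patch actually uses.
Only the value 0x25 and the bit range [7..1] come from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value from the commit message; the macro name is hypothetical. */
#define OP_SUBCLASS_MEMSET 0x25u

/* Old style: one shared field, bits [7..1], masked in place. */
static bool is_memset_shared_field(uint8_t payload)
{
	uint8_t field = payload & 0xfeu;    /* bit [0] is always cleared */

	/* 0x25 has bit [0] set, so this comparison can never succeed. */
	return field == OP_SUBCLASS_MEMSET;
}

/* New style: each subclass checks the bits it needs. The 0xff mask
 * here is an assumption chosen for illustration. */
static bool is_memset_own_mask(uint8_t payload)
{
	return (payload & 0xffu) == OP_SUBCLASS_MEMSET;
}
```

Under these assumptions, is_memset_shared_field() returns false for
every possible payload byte, while is_memset_own_mask() matches exactly
the payload 0x25.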

Fixes: 34fb60400e32 ("perf arm-spe: Add raw decoding for SPEv1.3 MTE and MOPS load/store")
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf tool_pmu: More accurately set the cpus for tool events
Ian Rogers [Thu, 13 Nov 2025 18:05:13 +0000 (10:05 -0800)] 
perf tool_pmu: More accurately set the cpus for tool events

The user and system time events can record on different CPUs, but for
all other events a single CPU map of just CPU 0 makes sense. In
parse-events detect a tool PMU and then pass the perf_event_attr so
that the tool_pmu can return CPUs specific for the event. This avoids
a CPU map of all online CPUs being used for events like
duration_time, which in turn keeps the evlist CPUs from containing CPUs
for which duration_time just gives 0. Minimizing the evlist CPUs can
remove unnecessary sched_setaffinity syscalls that delay metric
calculations.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf stat: Reduce scope of walltime_nsecs_stats
Ian Rogers [Thu, 13 Nov 2025 18:05:12 +0000 (10:05 -0800)] 
perf stat: Reduce scope of walltime_nsecs_stats

walltime_nsecs_stats is no longer used for counter values, so move it
into stat_config, where it controls certain things like noise
measurement.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf stat: Reduce scope of ru_stats
Ian Rogers [Thu, 13 Nov 2025 18:05:11 +0000 (10:05 -0800)] 
perf stat: Reduce scope of ru_stats

The ru_stats are used to capture user and system time stats when a
process exits. These are then applied to user and system time tool
events if their reads fail due to the process terminating. Reduce the
scope now the metric code no longer reads these values.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf stat-shadow: Read tool events directly
Ian Rogers [Thu, 13 Nov 2025 18:05:10 +0000 (10:05 -0800)] 
perf stat-shadow: Read tool events directly

When reading time values for metrics, don't use the globals updated in
builtin-stat; just read the events as regular events. The only
exception is time events, where nanoseconds need converting to
seconds because metrics assume time values are in seconds.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf tool_pmu: Use old_count when computing count values for time events
Ian Rogers [Thu, 13 Nov 2025 18:05:09 +0000 (10:05 -0800)] 
perf tool_pmu: Use old_count when computing count values for time events

When running in interval mode every third count of a time event isn't
showing properly:
```
$ perf stat -e duration_time -a -I 1000
     1.001082862      1,002,290,425      duration_time
     2.004264262      1,003,183,516      duration_time
     3.007381401      <not counted>      duration_time
     4.011160141      1,003,705,631      duration_time
     5.014515385      1,003,290,110      duration_time
     6.018539680      <not counted>      duration_time
     7.022065321      1,003,591,720      duration_time
```
The regression came in with a different fix, found through bisection,
commit 68cb1567439f ("perf tool_pmu: Fix aggregation on
duration_time"). The issue is caused by the enabled and running time
of the event matching the old_count's and creating a delta of 0, which
is indicative of an error.

Fixes: 68cb1567439f ("perf tool_pmu: Fix aggregation on duration_time")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf pmu: perf_cpu_map__new_int to avoid parsing a string
Ian Rogers [Thu, 13 Nov 2025 18:05:08 +0000 (10:05 -0800)] 
perf pmu: perf_cpu_map__new_int to avoid parsing a string

Prefer perf_cpu_map__new_int(0) to perf_cpu_map__new("0") as it avoids
string parsing.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago libperf cpumap: Reduce allocations and sorting in intersect
Ian Rogers [Thu, 13 Nov 2025 18:05:07 +0000 (10:05 -0800)] 
libperf cpumap: Reduce allocations and sorting in intersect

On hybrid platforms the CPU maps are often disjoint. Rather than copy
CPUs and trim, compute the number of common CPUs; if there are none, exit
early, otherwise copy them in sorted order. This avoids memory allocation
in the disjoint case and avoids a second malloc and useless sort in the
previous trim cases.
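
The count-then-copy idea can be sketched on plain sorted int arrays. This
is not the libperf implementation: the function names are hypothetical,
and the sketch assumes both inputs are already sorted and free of
duplicates.

```c
#include <stddef.h>
#include <stdlib.h>

/* Count values present in both sorted, duplicate-free arrays. */
static size_t count_common(const int *a, size_t na,
			   const int *b, size_t nb)
{
	size_t i = 0, j = 0, n = 0;

	while (i < na && j < nb) {
		if (a[i] < b[j]) {
			i++;
		} else if (a[i] > b[j]) {
			j++;
		} else {
			n++;
			i++;
			j++;
		}
	}
	return n;
}

/*
 * Return a malloc'd array holding the common values in sorted order and
 * store its length in *nout. Returns NULL with *nout == 0 when the
 * inputs are disjoint (no allocation is made) or when malloc fails.
 */
static int *intersect_sorted(const int *a, size_t na,
			     const int *b, size_t nb, size_t *nout)
{
	size_t n = count_common(a, na, b, nb);
	size_t i = 0, j = 0, k = 0;
	int *out;

	*nout = 0;
	if (n == 0)
		return NULL;	/* disjoint: exit before allocating */

	out = malloc(n * sizeof(*out));	/* single, exactly sized allocation */
	if (!out)
		return NULL;

	/* Second merge pass emits matches already in sorted order. */
	while (i < na && j < nb) {
		if (a[i] < b[j]) {
			i++;
		} else if (a[i] > b[j]) {
			j++;
		} else {
			out[k++] = a[i];
			i++;
			j++;
		}
	}
	*nout = k;
	return out;
}
```

Because both passes walk the inputs in order, the result needs no
sorting or trimming afterwards, and the caller frees the returned array
with free().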

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>