git.ipfire.org Git - thirdparty/kernel/linux.git/log
2 weeks ago perf python: Correct copying of metric_leader in an evsel
Ian Rogers [Tue, 2 Dec 2025 17:49:56 +0000 (09:49 -0800)] 
perf python: Correct copying of metric_leader in an evsel

Ensure the metric_leader is copied and set up correctly. In
compute_metric determine the correct metric_leader event to match the
requested CPU. Fixes the handling of metrics particularly on hybrid
machines.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf test: Add python JIT dump test
Namhyung Kim [Tue, 25 Nov 2025 08:07:47 +0000 (00:07 -0800)] 
perf test: Add python JIT dump test

Add a test case for the python interpreter like below so that we can
make sure it won't break again.  To validate the effect of build-ID
generation, it adds and removes the JIT'ed DSOs to/from the build-ID
cache for the test.

  $ perf test -vv jitdump
   84: python profiling with jitdump:
  --- start ---
  test child forked, pid 214316
  Run python with -Xperf_jit
  [ perf record: Woken up 5 times to write data ]
  [ perf record: Captured and wrote 1.180 MB /tmp/__perf_test.perf.data.XbqZNm (140 samples) ]
  Generate JIT-ed DSOs using perf inject
  Add JIT-ed DSOs to the build-ID cache
  Check the symbol containing the script name
  Found 108 matching lines
  Remove JIT-ed DSOs from the build-ID cache
  ---- end(0) ----
   84: python profiling with jitdump                                   : Ok

Cc: Pablo Galindo <pablogsal@gmail.com>
Link: https://docs.python.org/3/howto/perf_profiling.html#how-to-work-without-frame-pointers
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf jitdump: Add sym/str-tables to build-ID generation
Namhyung Kim [Tue, 25 Nov 2025 08:07:46 +0000 (00:07 -0800)] 
perf jitdump: Add sym/str-tables to build-ID generation

It was reported that python backtraces with JIT dump were broken after
the change to the built-in SHA-1 implementation.  It seems python
generates the same JIT code for each function.  They become separate
DSOs but the contents are the same.  The only difference is in the
symbol name.

As a result, every JIT'ed DSO gets the same build-ID, which confuses
perf.  That resulted in no python symbols (from JIT) in the output.

Looking back at the original code before the conversion, it used the
load_addr as well as the code section to distinguish each DSO.  But it'd
be better to use contents of symtab and strtab instead as it aligns with
some linker behaviors.

This patch adds a buffer to save all the contents in a single place for
SHA-1 calculation.  Probably we need to add sha1_update() or similar to
update the existing hash value with different contents and use it here.
But it's out of scope for this change and I'd like something that can be
backported to the stable trees easily.
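
A minimal sketch of the single-buffer approach described above, not the
actual patch.  toy_digest() is a stand-in (FNV-1a) for the real SHA-1 call,
and build_id_from_sections() and its parameters are hypothetical names used
only for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in digest (64-bit FNV-1a); the patch itself uses SHA-1. */
static uint64_t toy_digest(const unsigned char *data, size_t len)
{
	uint64_t h = 0xcbf29ce484222325ULL;
	size_t i;

	for (i = 0; i < len; i++) {
		h ^= data[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/*
 * Hypothetical helper: gather code, symtab and strtab into one buffer
 * and hash it in a single pass.  Returns 0 on success, -1 if the
 * buffer cannot be allocated.
 */
static int build_id_from_sections(const void *code, size_t code_len,
				  const void *symtab, size_t symtab_len,
				  const void *strtab, size_t strtab_len,
				  uint64_t *id)
{
	size_t total = code_len + symtab_len + strtab_len;
	unsigned char *buf = malloc(total ? total : 1);

	if (!buf)
		return -1;

	memcpy(buf, code, code_len);
	memcpy(buf + code_len, symtab, symtab_len);
	memcpy(buf + code_len + symtab_len, strtab, strtab_len);

	*id = toy_digest(buf, total);
	free(buf);
	return 0;
}
```

With identical code bytes, two DSOs whose string tables differ (because the
symbol names differ) now hash to different IDs.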

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Pablo Galindo <pablogsal@gmail.com>
Cc: Fangrui Song <maskray@sourceware.org>
Link: https://github.com/python/cpython/issues/139544
Fixes: e3f612c1d8f3945b ("perf genelf: Remove libcrypto dependency and use built-in sha1()")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf test: Fix hybrid testing of event fallback test
Ian Rogers [Mon, 1 Dec 2025 23:11:36 +0000 (15:11 -0800)] 
perf test: Fix hybrid testing of event fallback test

The mem-loads-aux event exists on hybrid systems but the "cpu" PMU
does not. This causes an event parsing error which erroneously makes
the test look like it is failing. Avoid naming the PMU to avoid
this. Rather than cleaning up perf.data in the directory the test is
run, explicitly send the 'perf record' output to /dev/null and avoid
any cleanup scripts.

Fixes: fc9c17b22352 ("perf test: Add a perf event fallback test")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2 weeks ago perf tools: Remove a trailing newline in the event terms
Namhyung Kim [Tue, 2 Dec 2025 23:01:31 +0000 (15:01 -0800)] 
perf tools: Remove a trailing newline in the event terms

So that it can show the correct encoding info in the JSON output.

  $ perf list -j hw
  [
  {
          "Unit": "cpu",
          "Topic": "legacy hardware",
          "EventName": "branch-instructions",
          "EventType": "Kernel PMU event",
          "BriefDescription": "Retired branch instructions [This event is an alias of branches]",
          "Encoding": "cpu/event=0xc4/"
  },
  ...
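
The kind of trimming involved can be sketched as follows.  This is only an
illustration under assumptions: chomp_newline() is a hypothetical helper,
not a function from the perf sources.

```c
#include <string.h>

/* Hypothetical helper: strip one trailing '\n' (if any) in place. */
static void chomp_newline(char *s)
{
	size_t len = strlen(s);

	if (len > 0 && s[len - 1] == '\n')
		s[len - 1] = '\0';
}
```

Applied to "event=0xc4\n", this leaves "event=0xc4"; a string without a
trailing newline is left untouched.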

Reviewed-by: Ian Rogers <irogers@google.com>
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago perf trace: Skip internal syscall arguments
Namhyung Kim [Thu, 27 Nov 2025 04:44:18 +0000 (20:44 -0800)] 
perf trace: Skip internal syscall arguments

Recent changes in the linux-next kernel add a new field to syscall
tracepoints that carries the contents of userspace buffers, as shown below.

  # cat /sys/kernel/tracing/events/syscalls/sys_enter_write/format
  name: sys_enter_write
  ID: 758
  format:
          field:unsigned short common_type;       offset:0;       size:2; signed:0;
          field:unsigned char common_flags;       offset:2;       size:1; signed:0;
          field:unsigned char common_preempt_count;       offset:3;       size:1; signed:0;
          field:int common_pid;   offset:4;       size:4; signed:1;

          field:int __syscall_nr; offset:8;       size:4; signed:1;
          field:unsigned int fd;  offset:16;      size:8; signed:0;
          field:const char * buf; offset:24;      size:8; signed:0;
          field:size_t count;     offset:32;      size:8; signed:0;
          field:__data_loc char[] __buf_val;      offset:40;      size:4; signed:0;

  print fmt: "fd: 0x%08lx, buf: 0x%08lx (%s), count: 0x%08lx", ((unsigned long)(REC->fd)),
             ((unsigned long)(REC->buf)), __print_dynamic_array(__buf_val, 1),
             ((unsigned long)(REC->count))

perf trace handles those arguments in a different way, so this change
confuses it and makes some tests fail.  Fix it by skipping the new
fields that have the "__data_loc char[]" type.
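
A rough sketch of such a type check, assuming the field's type string is
available as plain text in the form shown in the format file above.
is_data_loc_string_field() is a hypothetical name, not the function used by
the patch.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical predicate: true if a tracepoint field's type string
 * starts with "__data_loc char[]", i.e. the field should be skipped.
 */
static bool is_data_loc_string_field(const char *type)
{
	static const char prefix[] = "__data_loc char[]";

	return type && strncmp(type, prefix, sizeof(prefix) - 1) == 0;
}
```

In the sys_enter_write format above, only __buf_val matches; fd, buf and
count keep being handled as regular arguments.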

Maybe we can switch to this instead of the BPF augmentation later.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Howard Chu <howardchu95@gmail.com>
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago perf tools: Don't read build-ids from non-regular files
James Clark [Mon, 24 Nov 2025 10:59:08 +0000 (10:59 +0000)] 
perf tools: Don't read build-ids from non-regular files

Simplify the build ID reading code by removing the non-blocking option.
Having to pass the correct option to this function was fragile and a
mistake would result in a hang, see the linked fix. Furthermore,
compressed files are always opened blocking anyway, ignoring the
non-blocking option.

We also don't expect to read build IDs from non-regular files. The only
hits to this function that are non-regular are devices that won't be ELF
files with build IDs, for example "/dev/dri/renderD129".

Now instead of opening these as non-blocking and failing to read, we
skip them. Even if something like a pipe or character device did have a
build ID, I don't think it would have worked because you need to call
read() in a loop, check for -EAGAIN and handle timeouts to make
non-blocking reads work.
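
A minimal sketch of a regular-file check using stat(2) and S_ISREG(), under
the assumption that skipping is decided per path.  is_regular_file() is a
hypothetical helper; the patch may perform the check differently.

```c
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical helper: true only if path names a regular file. */
static bool is_regular_file(const char *path)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return false;	/* cannot stat: treat it as "skip" */

	return S_ISREG(st.st_mode);
}
```

A character device such as /dev/null fails this check, so a caller would
skip it without ever opening it.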

Link: https://lore.kernel.org/linux-perf-users/20251022-james-perf-fix-dso-block-v1-1-c4faab150546@linaro.org/
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago perf vendor events riscv: add T-HEAD C920V2 JSON support
Inochi Amaoto [Tue, 14 Oct 2025 01:48:29 +0000 (09:48 +0800)] 
perf vendor events riscv: add T-HEAD C920V2 JSON support

T-HEAD C920 has a V2 iteration, which supports Sscofpmf. The V2
iteration supports the same perf events as V1.

Reuse T-HEAD c900-legacy JSON file for T-HEAD C920V2.

Signed-off-by: Inochi Amaoto <inochiama@gmail.com>
Acked-by: Paul Walmsley <pjw@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago perf pmu: fix duplicate conditional statement
Anubhav Shelat [Tue, 25 Nov 2025 11:41:18 +0000 (11:41 +0000)] 
perf pmu: fix duplicate conditional statement

Remove duplicate check for PERF_PMU_TYPE_DRM_END in perf_pmu__kind.

Fixes: f0feb21e0a10 ("perf pmu: Add PMU kind to simplify differentiating")
Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Closes: https://lore.kernel.org/linux-perf-users/CA+G8Dh+wLx+FvjjoEkypqvXhbzWEQVpykovzrsHi2_eQjHkzQA@mail.gmail.com/
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago perf docs: arm-spe: Document new SPE filtering features
James Clark [Tue, 11 Nov 2025 11:37:59 +0000 (11:37 +0000)] 
perf docs: arm-spe: Document new SPE filtering features

FEAT_SPE_EFT, FEAT_SPE_FDS, etc. have new user-facing format attributes,
so document them. Also document existing 'event_filter' bits that were
missing from the doc and the fact that latency values are stored in the
weight field.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago perf tools: Add support for perf_event_attr::config4
James Clark [Tue, 11 Nov 2025 11:37:58 +0000 (11:37 +0000)] 
perf tools: Add support for perf_event_attr::config4

perf_event_attr has gained a new field, config4, so add support for it
by extending the existing configN support.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
3 weeks ago tools headers UAPI: Sync linux/perf_event.h with the kernel sources
James Clark [Tue, 11 Nov 2025 11:37:57 +0000 (11:37 +0000)] 
tools headers UAPI: Sync linux/perf_event.h with the kernel sources

To pick up the config4 changes.

Tested-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf: replace strcpy() with strncpy() in util/jitdump.c
Hrishikesh Suresh [Thu, 20 Nov 2025 04:16:10 +0000 (23:16 -0500)] 
perf: replace strcpy() with strncpy() in util/jitdump.c

Usage of strcpy() can lead to buffer overflows. Therefore, it has been
replaced with strncpy(). The output file path is provided as a parameter
and might be restricted by command-line by default. But this defensive
patch will prevent any potential overflow, making the code more robust
against future changes in input handling.
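
A sketch of a bounded copy in this style, not the code from the patch.
copy_bounded() is a hypothetical helper.  Because strncpy() does not
NUL-terminate the destination when the source fills it, the sketch
terminates the buffer explicitly.

```c
#include <string.h>

/*
 * Hypothetical helper: copy src into dst (capacity dst_size bytes),
 * truncating if needed and always NUL-terminating the result.
 */
static void copy_bounded(char *dst, size_t dst_size, const char *src)
{
	if (dst_size == 0)
		return;

	strncpy(dst, src, dst_size - 1);
	dst[dst_size - 1] = '\0';
}
```

With an 8-byte destination, a longer path is cut to its first 7 characters
plus the terminator instead of overflowing the buffer.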

Testing:
- ran perf test from tools/perf and did not observe any regression with
  the earlier code

Signed-off-by: Hrishikesh Suresh <hrishikesh123s@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf list: Support filtering in JSON output
Namhyung Kim [Thu, 20 Nov 2025 00:47:26 +0000 (16:47 -0800)] 
perf list: Support filtering in JSON output

Like regular output mode, it should honor command line arguments to
limit to a certain type of PMUs or events.

  $ perf list -j hw
  [
  {
          "Unit": "cpu",
          "Topic": "legacy hardware",
          "EventName": "branch-instructions",
          "EventType": "Kernel PMU event",
          "BriefDescription": "Retired branch instructions [This event is an alias of branches]",
          "Encoding": "cpu/event=0xc4\n/"
  },
  {
          "Unit": "cpu",
          "Topic": "legacy hardware",
          "EventName": "branch-misses",
          "EventType": "Kernel PMU event",
          "BriefDescription": "Mispredicted branch instructions",
          "Encoding": "cpu/event=0xc5\n/"
  },
  ...

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf list: Share print state with JSON output
Namhyung Kim [Thu, 20 Nov 2025 00:47:25 +0000 (16:47 -0800)] 
perf list: Share print state with JSON output

The JSON print state has only one different field (need_sep).  Let's
add the default print state to the json state and use it.  Then we can
use the 'ps' variable to update the state properly.

This is a preparation for the next commit.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf list: Print matching PMU events for --unit
Namhyung Kim [Thu, 20 Nov 2025 00:47:24 +0000 (16:47 -0800)] 
perf list: Print matching PMU events for --unit

When --unit option is used, pmu_glob is set to the argument.  It should
match with event PMU and display the matching ones only.  But it also
shows raw events and metrics after that.

  $ perf list --unit tool
  List of pre-defined events (to be used in -e or -M):

  tool:
    core_wide
         [1 if not SMT,if SMT are events being gathered on all SMT threads 1 otherwise 0. Unit: tool]
    duration_time
         [Wall clock interval time in nanoseconds. Unit: tool]
    has_pmem
         [1 if persistent memory installed otherwise 0. Unit: tool]
    num_cores
         [Number of cores. A core consists of 1 or more thread,with each thread being associated with a logical Linux CPU. Unit: tool]
    num_cpus
         [Number of logical Linux CPUs. There may be multiple such CPUs on a core. Unit: tool]
    ...
    rNNN                                               [Raw event descriptor]
    cpu/event=0..255,pc,edge,.../modifier              [Raw event descriptor]
         [(see 'man perf-list' or 'man perf-record' on how to encode it)]
    breakpoint//modifier                               [Raw event descriptor]
    cstate_core/event=0..0xffffffffffffffff/modifier   [Raw event descriptor]
    cstate_pkg/event=0..0xffffffffffffffff/modifier    [Raw event descriptor]
    drm_i915//modifier                                 [Raw event descriptor]
    hwmon_acpitz//modifier                             [Raw event descriptor]
    hwmon_ac//modifier                                 [Raw event descriptor]
    hwmon_bat0//modifier                               [Raw event descriptor]
    hwmon_coretemp//modifier                           [Raw event descriptor]
    ...

  Metric Groups:

  Backend: [Grouping from Top-down Microarchitecture Analysis Metrics spreadsheet]
    tma_core_bound
         [This metric represents fraction of slots where Core non-memory issues were of a bottleneck]
    tma_info_core_ilp
         [Instruction-Level-Parallelism (average number of uops executed when there is execution) per thread (logical-processor)]
    tma_info_memory_l2mpki
         [L2 cache true misses per kilo instruction for retired demand loads]
    ...

This change makes it print the tool PMU events only.
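
The matching rule can be sketched as below.  This is an illustration under
assumptions: it uses POSIX fnmatch() as a stand-in for whatever glob matcher
perf uses internally, and pmu_matches_unit() is a hypothetical name.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: decide whether events of a PMU should be listed
 * for a given --unit glob.  A NULL glob means --unit was not given.
 */
static bool pmu_matches_unit(const char *pmu_glob, const char *pmu_name)
{
	if (!pmu_glob)
		return true;	/* no filter: everything matches */

	return fnmatch(pmu_glob, pmu_name, 0) == 0;
}
```

With the glob "tool", only the "tool" PMU matches, so a caller applying the
same test to raw event descriptors would skip "cpu", "breakpoint" and the
rest.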

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf test all metrics: Fully ignore Default metric failures
Ian Rogers [Wed, 19 Nov 2025 19:30:47 +0000 (11:30 -0800)] 
perf test all metrics: Fully ignore Default metric failures

Determine if a metric is default from `perf list --raw-dump $m`, e.g.:
```
$ perf list --raw-dump l1_prefetch_miss_rate
Default4 l1_prefetch_miss_rate
```
If a metric has "not supported" or "no supported events" then ignore
these failures for default metrics. Tidy up the skip/fail messages in
the output to make them easier to spot/read.

```
$ perf test -vv "all metrics"
...
Testing llc_miss_rate
[Ignored llc_miss_rate] failed but as a Default metric this can be expected
Error: No supported events found. The LLC-loads event is not supported.
...
```

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Closes: https://lore.kernel.org/linux-perf-users/20251119104751.51960-1-tmricht@linux.ibm.com/
Reported-by: Namhyung Kim <namhyung@kernel.org>
Reported-by: James Clark <james.clark@linaro.org>
Closes: https://lore.kernel.org/lkml/aRi9xnwdLh3Dir9f@google.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf evsel: Skip store_evsel_ids for non-perf-event PMUs
Ian Rogers [Fri, 14 Nov 2025 22:05:47 +0000 (14:05 -0800)] 
perf evsel: Skip store_evsel_ids for non-perf-event PMUs

The IDs are associated with perf events and not applicable to non-perf
event PMUs. The failure to generate the ids was causing perf stat
record to fail.

```
$ perf stat record -a sleep 1

 Performance counter stats for 'system wide':

            47,941      context-switches                 #      nan cs/sec  cs_per_second
              0.00 msec cpu-clock                        #      0.0 CPUs  CPUs_utilized
             3,261      cpu-migrations                   #      nan migrations/sec  migrations_per_second
               516      page-faults                      #      nan faults/sec  page_faults_per_second
         7,525,483      cpu_core/branch-misses/          #      2.3 %  branch_miss_rate
       322,069,004      cpu_core/branches/               #      nan M/sec  branch_frequency
     1,895,684,291      cpu_core/cpu-cycles/             #      nan GHz  cycles_frequency
     2,789,777,426      cpu_core/instructions/           #      1.5 instructions  insn_per_cycle
         7,074,765      cpu_atom/branch-misses/          #      3.2 %  branch_miss_rate         (49.89%)
       224,225,412      cpu_atom/branches/               #      nan M/sec  branch_frequency     (50.29%)
     2,061,679,981      cpu_atom/cpu-cycles/             #      nan GHz  cycles_frequency       (50.33%)
     2,011,242,533      cpu_atom/instructions/           #      1.0 instructions  insn_per_cycle  (50.33%)
             TopdownL1 (cpu_core)                        #      9.0 %  tma_bad_speculation
                                                         #     28.3 %  tma_frontend_bound
                                                         #     35.2 %  tma_backend_bound
                                                         #     27.5 %  tma_retiring
             TopdownL1 (cpu_atom)                        #     36.8 %  tma_backend_bound        (59.65%)
                                                         #     22.8 %  tma_frontend_bound       (59.60%)
                                                         #     11.6 %  tma_bad_speculation
                                                         #     28.8 %  tma_retiring             (59.59%)

       1.006777519 seconds time elapsed

$ perf stat report

 Performance counter stats for 'perf':

     1,013,376,154      duration_time
     <not counted>      duration_time
     <not counted>      duration_time
     <not counted>      duration_time
     <not counted>      duration_time
     <not counted>      duration_time
            47,941      context-switches
              0.00 msec cpu-clock
             3,261      cpu-migrations
               516      page-faults
         7,525,483      cpu_core/branch-misses/
       322,069,814      cpu_core/branches/
       322,069,004      cpu_core/branches/
     1,895,684,291      cpu_core/cpu-cycles/
     1,895,679,209      cpu_core/cpu-cycles/
     2,789,777,426      cpu_core/instructions/
     <not counted>      cpu_core/cpu-cycles/
     <not counted>      cpu_core/stalled-cycles-frontend/
     <not counted>      cpu_core/cpu-cycles/
     <not counted>      cpu_core/stalled-cycles-backend/
     <not counted>      cpu_core/stalled-cycles-backend/
     <not counted>      cpu_core/instructions/
     <not counted>      cpu_core/stalled-cycles-frontend/
         7,074,765      cpu_atom/branch-misses/                                                 (49.89%)
       221,679,088      cpu_atom/branches/                                                      (49.89%)
       224,225,412      cpu_atom/branches/                                                      (50.29%)
     2,061,679,981      cpu_atom/cpu-cycles/                                                    (50.33%)
     2,016,259,567      cpu_atom/cpu-cycles/                                                    (50.33%)
     2,011,242,533      cpu_atom/instructions/                                                  (50.33%)
     <not counted>      cpu_atom/cpu-cycles/
     <not counted>      cpu_atom/stalled-cycles-frontend/
     <not counted>      cpu_atom/cpu-cycles/
     <not counted>      cpu_atom/stalled-cycles-backend/
     <not counted>      cpu_atom/stalled-cycles-backend/
     <not counted>      cpu_atom/instructions/
     <not counted>      cpu_atom/stalled-cycles-frontend/
        17,145,113      cpu_core/INT_MISC.UOP_DROPPING/
    10,594,226,100      cpu_core/TOPDOWN.SLOTS/
     2,919,021,401      cpu_core/topdown-retiring/
       943,101,838      cpu_core/topdown-bad-spec/
     3,031,152,533      cpu_core/topdown-fe-bound/
     3,739,756,791      cpu_core/topdown-be-bound/
     1,909,501,648      cpu_atom/CPU_CLK_UNHALTED.CORE/                                         (60.04%)
     3,516,608,359      cpu_atom/TOPDOWN_BE_BOUND.ALL/                                          (59.65%)
     2,179,403,876      cpu_atom/TOPDOWN_FE_BOUND.ALL/                                          (59.60%)
     2,745,732,458      cpu_atom/TOPDOWN_RETIRING.ALL/                                          (59.59%)

       1.006777519 seconds time elapsed

Some events weren't counted. Try disabling the NMI watchdog:
        echo 0 > /proc/sys/kernel/nmi_watchdog
        perf stat ...
        echo 1 > /proc/sys/kernel/nmi_watchdog
```

Reported-by: James Clark <james.clark@linaro.org>
Closes: https://lore.kernel.org/lkml/ca0f0cd3-7335-48f9-8737-2f70a75b019a@linaro.org/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf pmu: Add PMU kind to simplify differentiating
Ian Rogers [Fri, 14 Nov 2025 22:05:46 +0000 (14:05 -0800)] 
perf pmu: Add PMU kind to simplify differentiating

Rather than perf_pmu__is_xxx calls, add a notion of kind so that a
single call can be used.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf header: Switch "cpu" for find_core_pmu in caps feature writing
Ian Rogers [Fri, 14 Nov 2025 22:05:45 +0000 (14:05 -0800)] 
perf header: Switch "cpu" for find_core_pmu in caps feature writing

Writing currently fails on non-x86 and hybrid CPUs. Switch to the more
regular find_core_pmu that is normally used in this case. Tested on a
hybrid Alder Lake system.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf test maps: Additional maps__fixup_overlap_and_insert tests
Ian Rogers [Wed, 19 Nov 2025 05:05:55 +0000 (21:05 -0800)] 
perf test maps: Additional maps__fixup_overlap_and_insert tests

Add additional tests to the maps test suite covering
maps__fixup_overlap_and_insert. Change the test suite to be for more
than just 1 test.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf maps: Avoid RC_CHK use after free
Ian Rogers [Wed, 19 Nov 2025 05:05:54 +0000 (21:05 -0800)] 
perf maps: Avoid RC_CHK use after free

The case of __maps__fixup_overlap_and_insert where the "new" map
covers existing mappings can create a use-after-free with reference
count checking enabled. The issue is that "pos" holds a map pointer
from maps_by_address that is put from maps_by_address but then used to
look for a map in maps_by_name (the compared map is now a
use-after-free). The issue stems from using maps__remove which redoes
some of the searches already done by __maps__fixup_overlap_and_insert,
so optimize the code (by avoiding repeated searches) and avoid the
use-after-free by inlining the appropriate removal code.

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202511141407.f9edcfa6-lkp@intel.com
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf stat: Read tool events last
Ian Rogers [Tue, 18 Nov 2025 21:13:24 +0000 (13:13 -0800)] 
perf stat: Read tool events last

When reading a metric like memory bandwidth on multiple sockets, the
additional sockets will be on CPUs > 0. Because of the affinity
reading, the counters are read on CPU 0 along with the time, then the
later sockets are read. This can lead to the later sockets having a
bandwidth larger than is possible for the period of time. To avoid
this move the reading of tool events to occur after all other events
are read.
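
The ordering can be sketched with a two-pass loop.  This is a simplified
illustration, not perf's reading code: struct counter and read_counters()
are hypothetical, and the "read" is reduced to recording the order in which
each counter would be read.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified counter; not perf's struct evsel. */
struct counter {
	const char *name;
	bool is_tool;		/* e.g. duration_time */
	int read_order;		/* filled in by read_counters() */
};

/* Read every non-tool counter first, then the tool counters. */
static void read_counters(struct counter *counters, size_t n)
{
	int order = 0;
	int pass;
	size_t i;

	for (pass = 0; pass < 2; pass++) {
		bool want_tool = (pass == 1);

		for (i = 0; i < n; i++) {
			if (counters[i].is_tool == want_tool)
				counters[i].read_order = order++;
		}
	}
}
```

Reading the time source after the per-socket counters means the measured
interval covers every counter read, so a derived bandwidth is not computed
over too short a period.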

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Synthesize memory samples for SIMD operations
Leo Yan [Wed, 12 Nov 2025 18:24:43 +0000 (18:24 +0000)] 
perf arm_spe: Synthesize memory samples for SIMD operations

Synthesize memory samples for SIMD operations (including Advanced SIMD,
SVE, and SME). To provide complete information, also generate data
source entries for SIMD operations.

Since memory operations are not limited to load and store, set
PERF_MEM_OP_NA if the operation does not fall into these cases.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Expose SIMD information in other operations
Leo Yan [Wed, 12 Nov 2025 18:24:42 +0000 (18:24 +0000)] 
perf arm_spe: Expose SIMD information in other operations

The other operations contain SME data processing, ASE (Advanced SIMD)
and floating-point operations. Expose this information in the records.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Report GCS in record
Leo Yan [Wed, 12 Nov 2025 18:24:41 +0000 (18:24 +0000)] 
perf arm_spe: Report GCS in record

Report GCS related info in records.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Report memset and memcpy in records
Leo Yan [Wed, 12 Nov 2025 18:24:40 +0000 (18:24 +0000)] 
perf arm_spe: Report memset and memcpy in records

Expose memset and memcpy related info in records.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Report associated info for SVE / SME operations
Leo Yan [Wed, 12 Nov 2025 18:24:39 +0000 (18:24 +0000)] 
perf arm_spe: Report associated info for SVE / SME operations

SVE / SME operations can be predicated, or gather load / scatter store.
Save the relevant info into the record.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Report extended memory operations in records
Leo Yan [Wed, 12 Nov 2025 18:24:38 +0000 (18:24 +0000)] 
perf arm_spe: Report extended memory operations in records

Extended memory operations include atomic (AT), acquire/release (AR),
and exclusive (EXCL) operations. Save the relevant information
in the records.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Report MTE allocation tag in record
Leo Yan [Wed, 12 Nov 2025 18:24:37 +0000 (18:24 +0000)] 
perf arm_spe: Report MTE allocation tag in record

Save MTE tag info in memory record.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Report register access in record
Leo Yan [Wed, 12 Nov 2025 18:24:36 +0000 (18:24 +0000)] 
perf arm_spe: Report register access in record

Record register access info for load / store operations.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Introduce data processing macro for SVE operations
Leo Yan [Wed, 12 Nov 2025 18:24:35 +0000 (18:24 +0000)] 
perf arm_spe: Introduce data processing macro for SVE operations

Introduce the ARM_SPE_OP_DP (data processing) macro as associated
information for SVE operations. For SVE register access, only
ARM_SPE_OP_SVE is set; for SVE data processing, both ARM_SPE_OP_SVE and
ARM_SPE_OP_DP are set together.
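
The flag combination can be sketched as follows.  The macro names come from
the description above, but the bit positions are illustrative assumptions
and the two predicates are hypothetical helpers, not part of perf.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions are assumptions for illustration, not perf's values. */
#define ARM_SPE_OP_SVE	(1u << 0)
#define ARM_SPE_OP_DP	(1u << 1)

/* SVE data processing: both ARM_SPE_OP_SVE and ARM_SPE_OP_DP are set. */
static bool is_sve_data_processing(uint32_t op)
{
	uint32_t mask = ARM_SPE_OP_SVE | ARM_SPE_OP_DP;

	return (op & mask) == mask;
}

/* SVE register access: ARM_SPE_OP_SVE is set without ARM_SPE_OP_DP. */
static bool is_sve_register_access(uint32_t op)
{
	return (op & ARM_SPE_OP_SVE) && !(op & ARM_SPE_OP_DP);
}
```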

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Consolidate operation types
Leo Yan [Wed, 12 Nov 2025 18:24:34 +0000 (18:24 +0000)] 
perf arm_spe: Consolidate operation types

Consolidate operation types as follows:

(a) Extract the second-level types into separate enums.
(b) The second-level types for memory and SIMD operations are classified
    by modules. E.g., an operation may relate to general register,
    SIMD/FP, SVE, etc.
(c) The associated information tells details. E.g., an operation is
    load or store, whether it is atomic operation, etc.

Start the enum items for the second-level types from 8 to accommodate
more entries within a 32-bit integer.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Remove unused operation types
Leo Yan [Wed, 12 Nov 2025 18:24:33 +0000 (18:24 +0000)] 
perf arm_spe: Remove unused operation types

Remove unused SVE operation types. These operations will be reintroduced
in subsequent refactoring, but with a different format.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Decode SME data processing packet
Leo Yan [Wed, 12 Nov 2025 18:24:32 +0000 (18:24 +0000)] 
perf arm_spe: Decode SME data processing packet

For SME data processing, decode its Effective vector length or Tile Size
(ETS), and print whether it is a floating-point operation.

After:

  .  00000000:  49 00                                           SME-OTHER ETS 1024 FP
  .  00000002:  b2 18 3c d7 83 00 80 ff ff                      VA 0xffff800083d73c18
  .  0000000b:  9a 00 00                                        LAT 0 XLAT
  .  0000000e:  43 00                                           DATA-SOURCE 0

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks ago perf arm_spe: Decode ASE and FP fields in other operation
Leo Yan [Wed, 12 Nov 2025 18:24:31 +0000 (18:24 +0000)] 
perf arm_spe: Decode ASE and FP fields in other operation

Add a check for the other operation class, which prevents incorrect
classification. Parse the ASE and FP fields.

After:

  .  0000002f:  48 06                                           OTHER ASE FP INSN-OTHER
  .  00000031:  b2 08 80 48 01 08 00 ff ff                      VA 0xffff000801488008
  .  0000003a:  9a 00 00                                        LAT 0 XLAT
  .  0000003d:  42 16                                           EV RETIRED L1D-ACCESS TLB-ACCESS

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf arm_spe: Rename SPE_OP_PKT_IS_OTHER_SVE_OP macro
Leo Yan [Wed, 12 Nov 2025 18:24:30 +0000 (18:24 +0000)] 
perf arm_spe: Rename SPE_OP_PKT_IS_OTHER_SVE_OP macro

Rename the macro to SPE_OP_PKT_OTHER_SUBCLASS_SVE to unify naming.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf arm_spe: Decode GCS operation
Leo Yan [Wed, 12 Nov 2025 18:24:29 +0000 (18:24 +0000)] 
perf arm_spe: Decode GCS operation

Decode a load or store from a GCS operation and the associated "common"
field.

After:

  .  00000000:  49 44                                           LD GCS COMM
  .  00000002:  b2 18 3c d7 83 00 80 ff ff                      VA 0xffff800083d73c18
  .  0000000b:  9a 00 00                                        LAT 0 XLAT
  .  0000000e:  43 00                                           DATA-SOURCE 0

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf arm_spe: Unify operation naming
Leo Yan [Wed, 12 Nov 2025 18:24:28 +0000 (18:24 +0000)] 
perf arm_spe: Unify operation naming

Rename the extended subclass and the SVE/SME register access subclass, so
that the naming is consistent across all subclasses.

Add a log "SVE-SME-REG" for the SVE/SME register access, as this is easier
to parse.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf arm_spe: Fix memset subclass in operation
Leo Yan [Wed, 12 Nov 2025 18:24:27 +0000 (18:24 +0000)] 
perf arm_spe: Fix memset subclass in operation

The operation subclass is extracted from bits [7..1] of the payload.
Since bit [0] is not parsed, there is no chance to match the memset type
(0x25). As a result, the memset payload is never parsed successfully.

Instead of extracting a unified bit field, change to extract the
specific bits for each operation subclass.
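
A minimal sketch of why the unified field could never match, assuming the
old mask covered bits [7..1] (0xfe) and the memset type is 0x25 as stated
above. The names below are illustrative rather than the decoder's macros,
and the full-byte mask in the fixed check is an assumption:

```python
UNIFIED_MASK = 0xFE   # bits [7..1]; bit [0] is dropped
MEMSET_TYPE = 0x25    # encoding from the text above; bit [0] is set


def old_is_memset(payload: int) -> bool:
    # Old scheme: one shared bit field compared against every subclass.
    # The masked value always has bit [0] clear, so it never equals 0x25.
    return (payload & UNIFIED_MASK) == MEMSET_TYPE


def new_is_memset(payload: int, mask: int = 0xFF) -> bool:
    # New scheme: each subclass extracts its own bits.
    # The 0xFF mask is assumed here purely for illustration.
    return (payload & mask) == MEMSET_TYPE
```

With the shared mask no payload value can be classified as memset, while a
subclass-specific mask lets 0x25 match.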

Fixes: 34fb60400e32 ("perf arm-spe: Add raw decoding for SPEv1.3 MTE and MOPS load/store")
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf tool_pmu: More accurately set the cpus for tool events
Ian Rogers [Thu, 13 Nov 2025 18:05:13 +0000 (10:05 -0800)] 
perf tool_pmu: More accurately set the cpus for tool events

The user and system time events can record on different CPUs, but for
all other events a single CPU map of just CPU 0 makes sense. In
parse-events detect a tool PMU and then pass the perf_event_attr so
that the tool_pmu can return CPUs specific for the event. This avoids
a CPU map of all online CPUs being used for events like
duration_time, which in turn keeps the evlist CPUs from containing CPUs
for which duration_time just gives 0. Minimizing the evlist CPUs can
remove unnecessary sched_setaffinity syscalls that delay metric
calculations.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf stat: Reduce scope of walltime_nsecs_stats
Ian Rogers [Thu, 13 Nov 2025 18:05:12 +0000 (10:05 -0800)] 
perf stat: Reduce scope of walltime_nsecs_stats

walltime_nsecs_stats is no longer used for counter values, so move it
into stat_config, where it controls certain things like noise
measurement.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf stat: Reduce scope of ru_stats
Ian Rogers [Thu, 13 Nov 2025 18:05:11 +0000 (10:05 -0800)] 
perf stat: Reduce scope of ru_stats

The ru_stats are used to capture user and system time stats when a
process exits. These are then applied to user and system time tool
events if their reads fail due to the process terminating. Reduce the
scope now the metric code no longer reads these values.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf stat-shadow: Read tool events directly
Ian Rogers [Thu, 13 Nov 2025 18:05:10 +0000 (10:05 -0800)] 
perf stat-shadow: Read tool events directly

When reading time values for metrics don't use the globals updated in
builtin-stat, just read the events as regular events. The only
exception is for time events where nanoseconds need converting to
seconds as metrics assume time metrics are in seconds.
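
A minimal sketch of the conversion described above, assuming the time
events are the tool events duration_time, user_time and system_time. The
function name is hypothetical and this is not the stat-shadow code itself:

```python
NSEC_PER_SEC = 1_000_000_000

# Assumed set of time events whose counts are in nanoseconds.
TIME_EVENTS = {"duration_time", "user_time", "system_time"}


def metric_input(event_name: str, count: int) -> float:
    """Return the value a metric expression would see for an event."""
    if event_name in TIME_EVENTS:
        # Metrics assume time values are in seconds.
        return count / NSEC_PER_SEC
    # Every other event is read as a regular count.
    return float(count)
```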

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf tool_pmu: Use old_count when computing count values for time events
Ian Rogers [Thu, 13 Nov 2025 18:05:09 +0000 (10:05 -0800)] 
perf tool_pmu: Use old_count when computing count values for time events

When running in interval mode every third count of a time event isn't
showing properly:
```
$ perf stat -e duration_time -a -I 1000
     1.001082862      1,002,290,425      duration_time
     2.004264262      1,003,183,516      duration_time
     3.007381401      <not counted>      duration_time
     4.011160141      1,003,705,631      duration_time
     5.014515385      1,003,290,110      duration_time
     6.018539680      <not counted>      duration_time
     7.022065321      1,003,591,720      duration_time
```
The regression came in with a different fix, found through bisection,
commit 68cb1567439f ("perf tool_pmu: Fix aggregation on
duration_time"). The issue is caused by the enabled and running time
of the event matching the old_count's and creating a delta of 0, which
is indicative of an error.
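
A minimal sketch of the failure mode, assuming interval output is derived
from per-field deltas and that a zero enabled or running delta is rendered
as "<not counted>". The type and function names are hypothetical:

```python
from typing import NamedTuple


class Counts(NamedTuple):
    val: int  # counter value
    ena: int  # time enabled
    run: int  # time running


def interval_delta(new: Counts, old: Counts) -> Counts:
    # Interval mode reports the change since the previous read.
    return Counts(new.val - old.val, new.ena - old.ena, new.run - old.run)


def render(delta: Counts) -> str:
    # Assumption: a zero enabled/running delta is treated as an error.
    if delta.ena == 0 or delta.run == 0:
        return "<not counted>"
    return f"{delta.val:,}"
```

If a read happens to carry the same enabled and running times as the old
count, the delta is 0 and the interval line is rendered as not counted even
though the value itself advanced.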

Fixes: 68cb1567439f ("perf tool_pmu: Fix aggregation on duration_time")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf pmu: perf_cpu_map__new_int to avoid parsing a string
Ian Rogers [Thu, 13 Nov 2025 18:05:08 +0000 (10:05 -0800)] 
perf pmu: perf_cpu_map__new_int to avoid parsing a string

Prefer perf_cpu_map__new_int(0) to perf_cpu_map__new("0") as it avoids
string parsing.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agolibperf cpumap: Reduce allocations and sorting in intersect
Ian Rogers [Thu, 13 Nov 2025 18:05:07 +0000 (10:05 -0800)] 
libperf cpumap: Reduce allocations and sorting in intersect

On hybrid platforms the CPU maps are often disjoint. Rather than copy
CPUs and trim, compute the number of common CPUs, if none early exit,
otherwise copy in a sorted order. This avoids memory allocation in
the disjoint case and avoids a second malloc and useless sort in the
previous trim cases.
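
The approach can be sketched as a two-pass merge, assuming both inputs are
sorted and free of duplicates. This is an illustration in Python with a
hypothetical function name, not the libperf implementation:

```python
from typing import List, Optional


def intersect_sorted(a: List[int], b: List[int]) -> Optional[List[int]]:
    """Intersect two sorted, duplicate-free CPU lists."""
    # Pass 1: count the common CPUs without allocating anything.
    i = j = common = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:
            common += 1
            i += 1
            j += 1
    if common == 0:
        return None  # disjoint maps: exit early, nothing allocated

    # Pass 2: one allocation of the exact size, filled in sorted order.
    out = [0] * common
    i = j = k = 0
    while k < common:
        if a[i] < b[j]:
            i += 1
        elif a[i] > b[j]:
            j += 1
        else:
            out[k] = a[i]
            k += 1
            i += 1
            j += 1
    return out
```

Because the merge walks both inputs in order, the result needs no sort and
no trimming afterwards.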

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf stat: Display metric-only for 0 counters
Ian Rogers [Wed, 12 Nov 2025 19:53:11 +0000 (11:53 -0800)] 
perf stat: Display metric-only for 0 counters

0 counters may occur in hypervisor settings but metric-only output is
always expected. This resolves an issue in the "perf stat STD output
linter" test.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf test: Don't fail if user rdpmc returns 0 when disabled
Ian Rogers [Wed, 12 Nov 2025 19:53:10 +0000 (11:53 -0800)] 
perf test: Don't fail if user rdpmc returns 0 when disabled

In certain hypervisor setups the value 0 may be returned but this is
only erroneous if the user rdpmc isn't disabled.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf parse-events: Add debug logging to perf_event
Ian Rogers [Wed, 12 Nov 2025 19:53:09 +0000 (11:53 -0800)] 
perf parse-events: Add debug logging to perf_event

If verbose is enabled and parse_event is called, typically by tests,
log failures.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf test: Be tolerant of missing json metric none value
Ian Rogers [Wed, 12 Nov 2025 19:53:08 +0000 (11:53 -0800)] 
perf test: Be tolerant of missing json metric none value

print_metric_only_json and print_metric_end in stat-display.c may
create a metric value of "none" which fails validation as isfloat. Add
a helper to properly validate metric numeric values.
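
A minimal sketch of such a helper, assuming "none" is the only non-numeric
value that should be accepted. The function names are hypothetical and may
differ from the ones in the test script:

```python
def is_float(text: str) -> bool:
    """Return True if text parses as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def is_metric_value(text: str) -> bool:
    """Accept a float or the literal "none" placeholder."""
    return text.strip() == "none" or is_float(text)
```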

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
4 weeks agoperf sample: Fix the wrong format specifier
liujing [Mon, 22 Sep 2025 09:50:57 +0000 (17:50 +0800)] 
perf sample: Fix the wrong format specifier

In the file tools/perf/util/cs-etm.c, queue_nr is of type unsigned
int and should be printed with %u.

Signed-off-by: liujing <liujing@cmss.chinamobile.com>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf script: Fix build by removing unused evsel_script()
James Clark [Fri, 14 Nov 2025 14:06:18 +0000 (14:06 +0000)] 
perf script: Fix build by removing unused evsel_script()

The evsel_script() function is unused since the linked commit. Fix the
build by removing it.

Fixes the following compilation error:

  builtin-script.c:347:36: error: unused function 'evsel_script' [-Werror,-Wunused-function]
  static inline struct evsel_script *evsel_script(struct evsel *evsel)
                                     ^

Fixes: 3622990efaab ("perf script: Change metric format to use json metrics")
Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf vendor metrics s390: Avoid has_event(INSTRUCTIONS)
Ian Rogers [Wed, 12 Nov 2025 16:24:39 +0000 (08:24 -0800)] 
perf vendor metrics s390: Avoid has_event(INSTRUCTIONS)

The instructions event is now provided in json meaning the has_event
test always succeeds. Switch to using non-legacy event names in the
affected metrics.

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Closes: https://lore.kernel.org/linux-perf-users/3e80f453-f015-4f4f-93d3-8df6bb6b3c95@linux.ibm.com/
Fixes: 0012e0fa221b ("perf jevents: Add legacy-hardware and legacy-cache json")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf auxtrace: Remove errno.h from auxtrace.h and fix transitive dependencies
Ian Rogers [Mon, 10 Nov 2025 01:31:52 +0000 (17:31 -0800)] 
perf auxtrace: Remove errno.h from auxtrace.h and fix transitive dependencies

errno.h isn't used in auxtrace.h so remove it and fix build failures
caused by transitive dependencies through auxtrace.h on errno.h.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf build: Remove NO_AUXTRACE build option
Ian Rogers [Mon, 10 Nov 2025 01:31:51 +0000 (17:31 -0800)] 
perf build: Remove NO_AUXTRACE build option

The NO_AUXTRACE build option was used when the __get_cpuid feature
test failed or if it was provided on the command line. The option no
longer avoids a dependency on a library and so having the option is
just adding complexity to the code base. Remove the option
CONFIG_AUXTRACE from Build files and HAVE_AUXTRACE_SUPPORT by assuming
it is always defined.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agotool build: Remove __get_cpuid feature test
Ian Rogers [Mon, 10 Nov 2025 01:31:50 +0000 (17:31 -0800)] 
tool build: Remove __get_cpuid feature test

This feature test is no longer used, so remove it.

The function tested by the feature test is used in:
tools/power/x86/x86_energy_perf_policy/x86_energy_perf_policy.c
however, the Makefile just assumes the presence of the function and
doesn't perform a build feature test for it.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf build: Don't add NO_AUXTRACE if missing feature-get_cpuid
Ian Rogers [Mon, 10 Nov 2025 01:31:49 +0000 (17:31 -0800)] 
perf build: Don't add NO_AUXTRACE if missing feature-get_cpuid

The intel-pt code dependent on __get_cpuid is no longer present so
remove the feature test in the Makefile.config.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf intel-pt: Use the perf provided "cpuid.h"
Ian Rogers [Mon, 10 Nov 2025 01:31:48 +0000 (17:31 -0800)] 
perf intel-pt: Use the perf provided "cpuid.h"

Rather than having a feature test and include of <cpuid.h> for the
__get_cpuid function, use the cpuid function provided by
tools/perf/arch/x86/util/cpuid.h.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf test: Add a perf event fallback test
Zide Chen [Wed, 12 Nov 2025 16:48:23 +0000 (08:48 -0800)] 
perf test: Add a perf event fallback test

This adds test cases to verify the precise ip fallback logic:

- If the system supports precise ip, for an event given with the maximum
  precision level, it should be able to decrease precise_ip to find a
  supported level.
- The same fallback behavior should also work in more complex scenarios,
  such as event groups or when PEBS is involved

Additional fallback tests, such as those covering missing feature cases,
can be added in the future.
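
The fallback being tested can be sketched as a simple downward search,
assuming precision levels are small integers and that opening the event
either succeeds or fails for a given level. The try_open callback is
hypothetical and stands in for the real event open:

```python
from typing import Callable, Optional


def open_with_fallback(requested: int,
                       try_open: Callable[[int], bool]) -> Optional[int]:
    """Return the highest supported precise_ip level at or below requested."""
    for level in range(requested, -1, -1):
        if try_open(level):
            return level  # first level that opens successfully
    return None  # no level could be opened
```

For example, if only levels 0 and 1 are supported, a request for level 3
falls back to level 1.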

Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf stat: Align metric output without events
Namhyung Kim [Thu, 6 Nov 2025 07:28:34 +0000 (23:28 -0800)] 
perf stat: Align metric output without events

One of my concerns with the perf stat output was the alignment of the
metrics and shadow stats.  I think it calculated the basic output
length using COUNTS_LEN and EVNAME_LEN but missed adding the unit
length like "msec" and the 2 surrounding spaces.  I'm not sure why
it's not printed below though.

But anyway, now it shows correctly aligned metric output.

  $ perf stat true

   Performance counter stats for 'true':

             859,772      task-clock                       #    0.395 CPUs utilized
                   0      context-switches                 #    0.000 /sec
                   0      cpu-migrations                   #    0.000 /sec
                  56      page-faults                      #   65.134 K/sec
           1,075,022      instructions                     #    0.86  insn per cycle
           1,255,911      cycles                           #    1.461 GHz
             220,573      branches                         #  256.548 M/sec
               7,381      branch-misses                    #    3.35% of all branches
                          TopdownL1                        #     19.2 %  tma_retiring
                                                           #     28.6 %  tma_backend_bound
                                                           #      9.5 %  tma_bad_speculation
                                                           #     42.6 %  tma_frontend_bound

         0.002174871 seconds time elapsed                  ^
                                                           |
         0.002154000 seconds user                          |
         0.000000000 seconds sys                          here

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf tool_pmu: Make core_wide and target_cpu json events
Ian Rogers [Tue, 11 Nov 2025 21:22:06 +0000 (13:22 -0800)] 
perf tool_pmu: Make core_wide and target_cpu json events

For the sake of better documentation, add core_wide and target_cpu to
the tool.json. When the values of system_wide and
user_requested_cpu_list are unknown, use the values from the global
stat_config.

Example output showing how '-a' modifies the values in `perf stat`:
```
$ perf stat -e core_wide,target_cpu true

 Performance counter stats for 'true':

                 0      core_wide
                 0      target_cpu

       0.000993787 seconds time elapsed

       0.001128000 seconds user
       0.000000000 seconds sys

$ perf stat -e core_wide,target_cpu -a true

 Performance counter stats for 'system wide':

                 1      core_wide
                 1      target_cpu

       0.002271723 seconds time elapsed

$ perf list
...
tool:
  core_wide
       [1 if not SMT,if SMT are events being gathered on all SMT threads 1 otherwise 0. Unit: tool]
...
  target_cpu
       [1 if CPUs being analyzed,0 if threads/processes. Unit: tool]
...
```

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf test stat csv: Update test expectations and events
Ian Rogers [Tue, 11 Nov 2025 21:22:05 +0000 (13:22 -0800)] 
perf test stat csv: Update test expectations and events

Explicitly use a metric rather than implicitly expecting '-e
instructions,cycles' to produce a metric. Use a metric with software
events to make it more compatible.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf test stat: Update test expectations and events
Ian Rogers [Tue, 11 Nov 2025 21:22:04 +0000 (13:22 -0800)] 
perf test stat: Update test expectations and events

test_stat_record_report and test_stat_record_script used default
output which triggers a bug when sending metrics. As this isn't
relevant to the test switch to using named software events.

Update the match in test_hybrid as the cycles event is now cpu-cycles
to work around potential ARM issues.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf test stat: Update shadow test to use metrics
Ian Rogers [Tue, 11 Nov 2025 21:22:03 +0000 (13:22 -0800)] 
perf test stat: Update shadow test to use metrics

Previously '-e cycles,instructions' would implicitly create an IPC
metric. This now has to be explicit with '-M insn_per_cycle'.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf test metrics: Update all metrics for possibly failing default metrics
Ian Rogers [Tue, 11 Nov 2025 21:22:02 +0000 (13:22 -0800)] 
perf test metrics: Update all metrics for possibly failing default metrics

Default metrics may use unsupported events and be ignored. These
metrics shouldn't cause metric testing to fail.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf test stat: Update std_output testing metric expectations
Ian Rogers [Tue, 11 Nov 2025 21:22:01 +0000 (13:22 -0800)] 
perf test stat: Update std_output testing metric expectations

Make the expectations match json metrics rather than the previous hard
coded ones.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf test stat: Ignore failures in Default[234] metricgroups
Ian Rogers [Tue, 11 Nov 2025 21:22:00 +0000 (13:22 -0800)] 
perf test stat: Ignore failures in Default[234] metricgroups

The Default[234] metric groups may contain unsupported legacy
events. Allow those metric groups to fail.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf test stat+json: Improve metric-only testing
Ian Rogers [Tue, 11 Nov 2025 21:21:59 +0000 (13:21 -0800)] 
perf test stat+json: Improve metric-only testing

When testing metric-only, pass a metric to perf rather than expecting
a hard coded metric value to be generated.

Remove keys that were really metric-only units. Instead, don't expect
metric-only output to have a matching json key, as it encodes metrics
as {"metric_name", "metric_value"}.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf stat: Remove "unit" workarounds for metric-only
Ian Rogers [Tue, 11 Nov 2025 21:21:58 +0000 (13:21 -0800)] 
perf stat: Remove "unit" workarounds for metric-only

Remove code that tested the "unit" as in KB/sec for certain hard coded
metric values and did workarounds.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf stat: Sort default events/metrics
Ian Rogers [Tue, 11 Nov 2025 21:21:57 +0000 (13:21 -0800)] 
perf stat: Sort default events/metrics

To improve the readability of default events/metrics, sort the evsels
after the Default metric groups have been parsed.

Before:
```
$ perf stat -a sleep 1
 Performance counter stats for 'system wide':

            22,087      context-switches                 #      nan cs/sec  cs_per_second
             TopdownL1 (cpu_core)                 #     10.3 %  tma_bad_speculation
                                                  #     25.8 %  tma_frontend_bound
                                                  #     34.5 %  tma_backend_bound
                                                  #     29.3 %  tma_retiring
             7,829      page-faults                      #      nan faults/sec  page_faults_per_second
       880,144,270      cpu_atom/cpu-cycles/             #      nan GHz  cycles_frequency       (50.10%)
     1,693,081,235      cpu_core/cpu-cycles/             #      nan GHz  cycles_frequency
             TopdownL1 (cpu_atom)                 #     20.5 %  tma_bad_speculation
                                                  #     13.8 %  tma_retiring             (50.26%)
                                                  #     34.6 %  tma_frontend_bound       (50.23%)
        89,326,916      cpu_atom/branches/               #      nan M/sec  branch_frequency     (60.19%)
       538,123,088      cpu_core/branches/               #      nan M/sec  branch_frequency
             1,368      cpu-migrations                   #      nan migrations/sec  migrations_per_second
                                                  #     31.1 %  tma_backend_bound        (60.19%)
              0.00 msec cpu-clock                        #      0.0 CPUs  CPUs_utilized
       485,744,856      cpu_atom/instructions/           #      0.6 instructions  insn_per_cycle  (59.87%)
     3,093,112,283      cpu_core/instructions/           #      1.8 instructions  insn_per_cycle
         4,939,427      cpu_atom/branch-misses/          #      5.0 %  branch_miss_rate         (49.77%)
         7,632,248      cpu_core/branch-misses/          #      1.4 %  branch_miss_rate

       1.005084693 seconds time elapsed
```
After:
```
$ perf stat -a sleep 1
 Performance counter stats for 'system wide':

            22,165      context-switches                 #      nan cs/sec  cs_per_second
              0.00 msec cpu-clock                        #      0.0 CPUs  CPUs_utilized
             2,260      cpu-migrations                   #      nan migrations/sec  migrations_per_second
            20,476      page-faults                      #      nan faults/sec  page_faults_per_second
        17,052,357      cpu_core/branch-misses/          #      1.5 %  branch_miss_rate
     1,120,090,590      cpu_core/branches/               #      nan M/sec  branch_frequency
     3,402,892,275      cpu_core/cpu-cycles/             #      nan GHz  cycles_frequency
     6,129,236,701      cpu_core/instructions/           #      1.8 instructions  insn_per_cycle
         6,159,523      cpu_atom/branch-misses/          #      3.1 %  branch_miss_rate         (49.86%)
       222,158,812      cpu_atom/branches/               #      nan M/sec  branch_frequency     (50.25%)
     1,547,610,244      cpu_atom/cpu-cycles/             #      nan GHz  cycles_frequency       (50.40%)
     1,304,901,260      cpu_atom/instructions/           #      0.8 instructions  insn_per_cycle  (50.41%)
             TopdownL1 (cpu_core)                 #     13.7 %  tma_bad_speculation
                                                  #     23.5 %  tma_frontend_bound
                                                  #     33.3 %  tma_backend_bound
                                                  #     29.6 %  tma_retiring
             TopdownL1 (cpu_atom)                 #     32.1 %  tma_backend_bound        (59.65%)
                                                  #     30.1 %  tma_frontend_bound       (59.51%)
                                                  #     22.3 %  tma_bad_speculation
                                                  #     15.5 %  tma_retiring             (59.53%)

       1.008405429 seconds time elapsed
```

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf stat: Fix default metricgroup display on hybrid
Ian Rogers [Tue, 11 Nov 2025 21:21:56 +0000 (13:21 -0800)] 
perf stat: Fix default metricgroup display on hybrid

The logic to skip output of a default metric line was firing on
Alderlake and not displaying 'TopdownL1 (cpu_atom)'. Remove the
need_full_name check as it is equivalent to the different PMU test in
the cases we care about, merge the 'if's and flip the evsel of the PMU
test. The 'if' is now basically saying, if the output matches the last
printed output then skip the output.
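
A minimal sketch of the merged condition, assuming each metric row carries
its metric group and PMU name and that the header is formed from both. The
function name and column width are illustrative only:

```python
from typing import List, Tuple


def render_groups(rows: List[Tuple[str, str, str]]) -> List[str]:
    """rows are (metric group, PMU, metric text) tuples in print order."""
    last_header = None
    lines = []
    for group, pmu, text in rows:
        header = f"{group} ({pmu})"
        if header == last_header:
            # Same as the last printed header: skip it, print the metric only.
            lines.append(f"{'':<24}# {text}")
        else:
            lines.append(f"{header:<24}# {text}")
            last_header = header
    return lines
```

A change of PMU changes the header, so 'TopdownL1 (cpu_atom)' is printed
once when the cpu_atom rows start.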

Before:
```
             TopdownL1 (cpu_core)                 #     11.3 %  tma_bad_speculation
                                                  #     24.3 %  tma_frontend_bound
             TopdownL1 (cpu_core)                 #     33.9 %  tma_backend_bound
                                                  #     30.6 %  tma_retiring
                                                  #     42.2 %  tma_backend_bound
                                                  #     25.0 %  tma_frontend_bound       (49.81%)
                                                  #     12.8 %  tma_bad_speculation
                                                  #     20.0 %  tma_retiring             (59.46%)
```
After:
```
             TopdownL1 (cpu_core)                 #      8.3 %  tma_bad_speculation
                                                  #     43.7 %  tma_frontend_bound
                                                  #     30.7 %  tma_backend_bound
                                                  #     17.2 %  tma_retiring
             TopdownL1 (cpu_atom)                 #     31.9 %  tma_backend_bound
                                                  #     37.6 %  tma_frontend_bound       (49.66%)
                                                  #     18.0 %  tma_bad_speculation
                                                  #     12.6 %  tma_retiring             (59.58%)
```

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf stat: Remove hard coded shadow metrics
Ian Rogers [Tue, 11 Nov 2025 21:21:55 +0000 (13:21 -0800)] 
perf stat: Remove hard coded shadow metrics

Now that the metrics are encoded in common json the hard coded
printing means the metrics are shown twice. Remove the hard coded
version.

This means that when specifying events, and those events correspond to
a hard coded metric, the metric will no longer be displayed. The
metric will be displayed if the metric is requested. The ad hoc
printing in the previous approach was often found frustrating; the
new approach avoids this.

The default perf stat output on an alderlake now looks like:
```
$ perf stat -a -- sleep 1

 Performance counter stats for 'system wide':

            19,697      context-switches                 #      nan cs/sec  cs_per_second
             TopdownL1 (cpu_core)                 #     10.7 %  tma_bad_speculation
                                                  #     24.9 %  tma_frontend_bound
             TopdownL1 (cpu_core)                 #     34.3 %  tma_backend_bound
                                                  #     30.1 %  tma_retiring
             6,593      page-faults                      #      nan faults/sec  page_faults_per_second
       729,065,658      cpu_atom/cpu-cycles/             #      nan GHz  cycles_frequency       (49.79%)
     1,605,131,101      cpu_core/cpu-cycles/             #      nan GHz  cycles_frequency
                                                  #     19.7 %  tma_bad_speculation
                                                  #     14.2 %  tma_retiring             (50.14%)
                                                  #     37.3 %  tma_frontend_bound       (50.31%)
        87,302,268      cpu_atom/branches/               #      nan M/sec  branch_frequency     (60.27%)
       512,046,956      cpu_core/branches/               #      nan M/sec  branch_frequency
             1,111      cpu-migrations                   #      nan migrations/sec  migrations_per_second
                                                  #     28.8 %  tma_backend_bound        (60.26%)
              0.00 msec cpu-clock                        #      0.0 CPUs  CPUs_utilized
       392,509,323      cpu_atom/instructions/           #      0.6 instructions  insn_per_cycle  (60.19%)
     2,990,369,310      cpu_core/instructions/           #      1.9 instructions  insn_per_cycle
         3,493,478      cpu_atom/branch-misses/          #      5.9 %  branch_miss_rate         (49.69%)
         7,297,531      cpu_core/branch-misses/          #      1.4 %  branch_miss_rate

       1.006621701 seconds time elapsed
```

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf script: Change metric format to use json metrics
Ian Rogers [Tue, 11 Nov 2025 21:21:54 +0000 (13:21 -0800)] 
perf script: Change metric format to use json metrics

The metric format option isn't properly supported. This change
improves that by making the sample events update the counts of an
evsel, where the shadow metric code expects to read the values.  To
support printing metrics, metrics need to be found. This is done on
the first attempt to print a metric. Every metric is parsed and then
the evsels in the metric's evlist compared to those in perf script
using the perf_event_attr type and config. If the metric matches then
it is added for printing. As an event in the perf script's evlist may
have >1 metric id, or different leader for aggregation, the first
metric matched will be displayed in those cases.
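
The matching step can be sketched as follows, assuming an event is
identified by its perf_event_attr (type, config) pair and that a metric
matches when all of its events are present. The data layout and function
name are hypothetical:

```python
from typing import Dict, List, Tuple

EventKey = Tuple[int, int]  # (perf_event_attr type, config)


def match_metrics(script_events: List[EventKey],
                  metrics: Dict[str, List[EventKey]]) -> Dict[EventKey, str]:
    """Map each event in the script's evlist to the first matching metric."""
    available = set(script_events)
    chosen: Dict[EventKey, str] = {}
    for name, needed in metrics.items():
        # A metric matches only if every event it needs was recorded.
        if all(ev in available for ev in needed):
            for ev in needed:
                # An event may belong to more than one metric; keep the first.
                chosen.setdefault(ev, name)
    return chosen
```

Here (0, 0) and (0, 1) would correspond to the hardware cpu-cycles and
instructions events respectively.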

An example use is:
```
$ perf record -a -e '{instructions,cpu-cycles}:S' -a -- sleep 1
$ perf script -F period,metric
...
     867817
         metric:    0.30  insn per cycle
     125394
         metric:    0.04  insn per cycle
     313516
         metric:    0.11  insn per cycle
         metric:    1.00  insn per cycle
```

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf stat: Add detail -d,-dd,-ddd metrics
Ian Rogers [Tue, 11 Nov 2025 21:21:53 +0000 (13:21 -0800)] 
perf stat: Add detail -d,-dd,-ddd metrics

Add metrics for the stat-shadow -d, -dd and -ddd events and hard coded
metrics. Remove the events as these now come from the metrics.

Following this change a detailed perf stat output looks like:
```
$ perf stat -a -ddd -- sleep 1
 Performance counter stats for 'system wide':

            21,089      context-switches                 #      nan cs/sec  cs_per_second
             TopdownL1 (cpu_core)                 #     14.1 %  tma_bad_speculation
                                                  #     27.3 %  tma_frontend_bound       (30.56%)
             TopdownL1 (cpu_core)                 #     31.5 %  tma_backend_bound
                                                  #     27.2 %  tma_retiring             (30.56%)
             6,302      page-faults                      #      nan faults/sec  page_faults_per_second
       928,495,163      cpu_atom/cpu-cycles/
                                                  #      nan GHz  cycles_frequency       (28.41%)
     1,841,409,834      cpu_core/cpu-cycles/
                                                  #      nan GHz  cycles_frequency       (38.51%)
                                                  #     14.5 %  tma_bad_speculation
                                                  #     16.0 %  tma_retiring             (28.41%)
                                                  #     36.8 %  tma_frontend_bound       (35.57%)
       100,859,118      cpu_atom/branches/               #      nan M/sec  branch_frequency     (42.73%)
       572,657,734      cpu_core/branches/               #      nan M/sec  branch_frequency     (54.43%)
             1,527      cpu-migrations                   #      nan migrations/sec  migrations_per_second
                                                  #     32.7 %  tma_backend_bound        (42.73%)
              0.00 msec cpu-clock                        #    0.000 CPUs utilized
                                                  #      0.0 CPUs  CPUs_utilized
       498,668,509      cpu_atom/instructions/           #    0.57  insn per cycle
                                                  #      0.6 instructions  insn_per_cycle  (42.97%)
     3,281,762,225      cpu_core/instructions/           #    1.84  insn per cycle
                                                  #      1.8 instructions  insn_per_cycle  (62.20%)
         4,919,511      cpu_atom/branch-misses/          #    5.43% of all branches
                                                  #      5.4 %  branch_miss_rate         (35.80%)
         7,431,776      cpu_core/branch-misses/          #    1.39% of all branches
                                                  #      1.4 %  branch_miss_rate         (62.20%)
         2,517,007      cpu_atom/LLC-loads/              #      0.1 %  llc_miss_rate            (28.62%)
         3,931,318      cpu_core/LLC-loads/              #     40.4 %  llc_miss_rate            (45.98%)
        14,918,674      cpu_core/L1-dcache-load-misses/  #    2.25% of all L1-dcache accesses
                                                  #      nan %  l1d_miss_rate            (37.80%)
        27,067,264      cpu_atom/L1-icache-load-misses/  #   15.92% of all L1-icache accesses
                                                  #     15.9 %  l1i_miss_rate            (21.47%)
       116,848,994      cpu_atom/dTLB-loads/             #      0.8 %  dtlb_miss_rate           (21.47%)
       764,870,407      cpu_core/dTLB-loads/             #      0.1 %  dtlb_miss_rate           (15.12%)

       1.006181526 seconds time elapsed
```

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf jevents: Add metric DefaultShowEvents
Ian Rogers [Tue, 11 Nov 2025 21:21:52 +0000 (13:21 -0800)] 
perf jevents: Add metric DefaultShowEvents

Some Default group metrics require their events to be shown for
consistency with perf's previous behavior. Add a flag to indicate when
this is the case and use it in stat-display.

As events now come from the Default metrics, remove the default hardware
and software events from perf stat.

Following this change the default perf stat output on an alderlake looks like:
```
$ perf stat -a -- sleep 1

 Performance counter stats for 'system wide':

            20,550      context-switches                 #      nan cs/sec  cs_per_second
             TopdownL1 (cpu_core)                 #      9.0 %  tma_bad_speculation
                                                  #     28.1 %  tma_frontend_bound
             TopdownL1 (cpu_core)                 #     29.2 %  tma_backend_bound
                                                  #     33.7 %  tma_retiring
             6,685      page-faults                      #      nan faults/sec  page_faults_per_second
       790,091,064      cpu_atom/cpu-cycles/
                                                  #      nan GHz  cycles_frequency       (49.83%)
     2,563,918,366      cpu_core/cpu-cycles/
                                                  #      nan GHz  cycles_frequency
                                                  #     12.3 %  tma_bad_speculation
                                                  #     14.5 %  tma_retiring             (50.20%)
                                                  #     33.8 %  tma_frontend_bound       (50.24%)
        76,390,322      cpu_atom/branches/               #      nan M/sec  branch_frequency     (60.20%)
     1,015,173,047      cpu_core/branches/               #      nan M/sec  branch_frequency
             1,325      cpu-migrations                   #      nan migrations/sec  migrations_per_second
                                                  #     39.3 %  tma_backend_bound        (60.17%)
              0.00 msec cpu-clock                        #    0.000 CPUs utilized
                                                  #      0.0 CPUs  CPUs_utilized
       554,347,072      cpu_atom/instructions/           #    0.64  insn per cycle
                                                  #      0.6 instructions  insn_per_cycle  (60.14%)
     5,228,931,991      cpu_core/instructions/           #    2.04  insn per cycle
                                                  #      2.0 instructions  insn_per_cycle
         4,308,874      cpu_atom/branch-misses/          #    5.65% of all branches
                                                  #      5.6 %  branch_miss_rate         (49.76%)
         9,890,606      cpu_core/branch-misses/          #    0.97% of all branches
                                                  #      1.0 %  branch_miss_rate

       1.005477803 seconds time elapsed
```

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf jevents: Add set of common metrics based on default ones
Ian Rogers [Tue, 11 Nov 2025 21:21:51 +0000 (13:21 -0800)] 
perf jevents: Add set of common metrics based on default ones

Add support for getting a common set of metrics from a default
table. It simplifies the generation to add json metrics at the same
time. The metrics added are CPUs_utilized, cs_per_second,
migrations_per_second, page_faults_per_second, insn_per_cycle,
stalled_cycles_per_instruction, frontend_cycles_idle,
backend_cycles_idle, cycles_frequency, branch_frequency and
branch_miss_rate based on the shadow metric definitions.

Following this change the default perf stat output on an alderlake
looks like:
```
$ perf stat -a -- sleep 2

 Performance counter stats for 'system wide':

              0.00 msec cpu-clock                        #    0.000 CPUs utilized
            77,739      context-switches
            15,033      cpu-migrations
           321,313      page-faults
    14,355,634,225      cpu_atom/instructions/           #    1.40  insn per cycle              (35.37%)
   134,561,560,583      cpu_core/instructions/           #    3.44  insn per cycle              (57.85%)
    10,263,836,145      cpu_atom/cycles/                                                        (35.42%)
    39,138,632,894      cpu_core/cycles/                                                        (57.60%)
     2,989,658,777      cpu_atom/branches/                                                      (42.60%)
    32,170,570,388      cpu_core/branches/                                                      (57.39%)
        29,789,870      cpu_atom/branch-misses/          #    1.00% of all branches             (42.69%)
       165,991,152      cpu_core/branch-misses/          #    0.52% of all branches             (57.19%)
                       (software)                 #      nan cs/sec  cs_per_second
             TopdownL1 (cpu_core)                 #     11.9 %  tma_bad_speculation
                                                  #     19.6 %  tma_frontend_bound       (63.97%)
             TopdownL1 (cpu_core)                 #     18.8 %  tma_backend_bound
                                                  #     49.7 %  tma_retiring             (63.97%)
                       (software)                 #      nan faults/sec  page_faults_per_second
                                                  #      nan GHz  cycles_frequency       (42.88%)
                                                  #      nan GHz  cycles_frequency       (69.88%)
             TopdownL1 (cpu_atom)                 #     11.7 %  tma_bad_speculation
                                                  #     29.9 %  tma_retiring             (50.07%)
             TopdownL1 (cpu_atom)                 #     31.3 %  tma_frontend_bound       (43.09%)
                       (cpu_atom)                 #      nan M/sec  branch_frequency     (43.09%)
                                                  #      nan M/sec  branch_frequency     (70.07%)
                                                  #      nan migrations/sec  migrations_per_second
             TopdownL1 (cpu_atom)                 #     27.1 %  tma_backend_bound        (43.08%)
                       (software)                 #      0.0 CPUs  CPUs_utilized
                                                  #      1.4 instructions  insn_per_cycle  (43.04%)
                                                  #      3.5 instructions  insn_per_cycle  (69.99%)
                                                  #      1.0 %  branch_miss_rate         (35.46%)
                                                  #      0.5 %  branch_miss_rate         (65.02%)

       2.005626564 seconds time elapsed
```
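
As a worked example, two of the metrics named above can be recomputed from
the raw counts in this output. The sketch below is Python, not the json
metric expressions themselves. The formulas (instructions / cycles and
branch-misses / branches) are the conventional definitions and are assumed
here, and returning NaN on a zero denominator is an assumption about how the
`nan` values next to the 0.00 msec cpu-clock arise.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


# Raw counts copied from the perf stat output above.
counts = {
    "cpu_atom": {
        "instructions": 14_355_634_225,
        "cycles": 10_263_836_145,
        "branches": 2_989_658_777,
        "branch-misses": 29_789_870,
    },
    "cpu_core": {
        "instructions": 134_561_560_583,
        "cycles": 39_138_632_894,
        "branches": 32_170_570_388,
        "branch-misses": 165_991_152,
    },
}


def insn_per_cycle(c):
    """Instructions retired per cycle."""
    return ratio(c["instructions"], c["cycles"])


def branch_miss_rate(c):
    """Branch misses as a percentage of all branches."""
    return 100.0 * ratio(c["branch-misses"], c["branches"])


for pmu, c in counts.items():
    print(f"{pmu}: {insn_per_cycle(c):.2f} insn per cycle, "
          f"{branch_miss_rate(c):.2f}% of all branches")
```

The results agree with the shadow metric columns in the output (1.40 and 3.44
insn per cycle, 1.00% and 0.52% of all branches).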

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf expr: Add #target_cpu literal
Ian Rogers [Tue, 11 Nov 2025 21:21:50 +0000 (13:21 -0800)] 
perf expr: Add #target_cpu literal

For CPU nanoseconds a lot of the stat-shadow metrics use either
task-clock or cpu-clock, the latter being used when
target__has_cpu. Add a #target_cpu literal so that json metrics can
perform the same test.
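
A minimal sketch of the selection this literal enables, written in Python
rather than in the metric expression language. The `cpu_nanoseconds` helper
and the `literals` dictionary are hypothetical; only the choice between
cpu-clock and task-clock comes from the description above.

```python
def cpu_nanoseconds(counts, target_has_cpu):
    """Pick the clock event the way the stat-shadow code is described to."""
    # Hypothetical literal table: #target_cpu is 1 when a CPU target is used.
    literals = {"#target_cpu": 1 if target_has_cpu else 0}
    clock = "cpu-clock" if literals["#target_cpu"] else "task-clock"
    return counts[clock]


counts = {"cpu-clock": 8_000_000_000, "task-clock": 1_000_000_000}
print(cpu_nanoseconds(counts, target_has_cpu=True))
print(cpu_nanoseconds(counts, target_has_cpu=False))
```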

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf metricgroup: Add care to picking the evsel for displaying a metric
Ian Rogers [Tue, 11 Nov 2025 21:21:49 +0000 (13:21 -0800)] 
perf metricgroup: Add care to picking the evsel for displaying a metric

Rather than using the first evsel in the matched events, try to find
the least shared non-tool evsel. The aim is to pick the first evsel
that typifies the metric within the list of metrics.

This addresses an issue where Default metric group metrics may lose
their counter value due to how the stat displaying hides counters for
default event/metric output.

For a metricgroup like TopdownL1 on an Intel Alderlake, the change is as
follows. Before, there are 4 events with metrics:
```
$ perf stat -M topdownL1 -a sleep 1

 Performance counter stats for 'system wide':

     7,782,334,296      cpu_core/TOPDOWN.SLOTS/          #     10.4 %  tma_bad_speculation
                                                  #     19.7 %  tma_frontend_bound
     2,668,927,977      cpu_core/topdown-retiring/       #     35.7 %  tma_backend_bound
                                                  #     34.1 %  tma_retiring
       803,623,987      cpu_core/topdown-bad-spec/
       167,514,386      cpu_core/topdown-heavy-ops/
     1,555,265,776      cpu_core/topdown-fe-bound/
     2,792,733,013      cpu_core/topdown-be-bound/
       279,769,310      cpu_atom/TOPDOWN_RETIRING.ALL/   #     12.2 %  tma_retiring
                                                  #     15.1 %  tma_bad_speculation
       457,917,232      cpu_atom/CPU_CLK_UNHALTED.CORE/  #     38.4 %  tma_backend_bound
                                                  #     34.2 %  tma_frontend_bound
       783,519,226      cpu_atom/TOPDOWN_FE_BOUND.ALL/
        10,790,192      cpu_core/INT_MISC.UOP_DROPPING/
       879,845,633      cpu_atom/TOPDOWN_BE_BOUND.ALL/
```

After, there are 6 events with metrics:
```
$ perf stat -M topdownL1 -a sleep 1

 Performance counter stats for 'system wide':

     2,377,551,258      cpu_core/TOPDOWN.SLOTS/          #      7.9 %  tma_bad_speculation
                                                  #     36.4 %  tma_frontend_bound
       480,791,142      cpu_core/topdown-retiring/       #     35.5 %  tma_backend_bound
       186,323,991      cpu_core/topdown-bad-spec/
        65,070,590      cpu_core/topdown-heavy-ops/      #     20.1 %  tma_retiring
       871,733,444      cpu_core/topdown-fe-bound/
       848,286,598      cpu_core/topdown-be-bound/
       260,936,456      cpu_atom/TOPDOWN_RETIRING.ALL/   #     12.4 %  tma_retiring
                                                  #     17.6 %  tma_bad_speculation
       419,576,513      cpu_atom/CPU_CLK_UNHALTED.CORE/
       797,132,597      cpu_atom/TOPDOWN_FE_BOUND.ALL/   #     38.0 %  tma_frontend_bound
         3,055,447      cpu_core/INT_MISC.UOP_DROPPING/
       671,014,164      cpu_atom/TOPDOWN_BE_BOUND.ALL/   #     32.0 %  tma_backend_bound
```
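
The selection rule ("least shared non-tool evsel, first one on ties") can be
sketched as below. This is a Python illustration under assumptions: the
`pick_display_evsel` helper and the event lists per metric are hypothetical
and are not taken from the real TopdownL1 definitions.

```python
from collections import Counter


def pick_display_evsel(metric_events, all_metric_events,
                       tool_events=frozenset({"duration_time"})):
    """Return the event a metric should be displayed next to."""
    # Count in how many metrics each event appears.
    usage = Counter(ev for events in all_metric_events for ev in set(events))
    candidates = [ev for ev in metric_events if ev not in tool_events]
    if not candidates:
        return metric_events[0]  # fall back to the first event
    # min() returns the first minimal element, so ties keep list order.
    return min(candidates, key=lambda ev: usage[ev])


# Hypothetical metric -> events table for illustration only.
metrics = {
    "tma_bad_speculation": ["slots", "bad-spec"],
    "tma_retiring": ["slots", "retiring"],
    "elapsed_slots": ["duration_time", "slots"],
}
all_events = list(metrics.values())
for name, events in metrics.items():
    print(name, "->", pick_display_evsel(events, all_events))
```

Because `slots` is shared by every metric, the metrics move to their more
specific events, which is the effect visible in the "after" output.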

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf tools: Fix missing feature check for inherit + SAMPLE_READ
Namhyung Kim [Tue, 11 Nov 2025 07:59:44 +0000 (23:59 -0800)] 
perf tools: Fix missing feature check for inherit + SAMPLE_READ

It should also have PERF_SAMPLE_TID to enable inherit and PERF_SAMPLE_READ
on recent kernels.  Not having _TID makes the feature check wrongly detect
the inherit and _READ support.

It was reported that the following command failed due to the error in
the missing feature check on Intel SPR machines.

  $ perf record -e '{cpu/mem-loads-aux/S,cpu/mem-loads,ldlat=3/PS}' -- ls
  Error:
  Failure to open event 'cpu/mem-loads,ldlat=3/PS' on PMU 'cpu' which will be removed.
  Invalid event (cpu/mem-loads,ldlat=3/PS) in per-thread mode, enable system wide with '-a'.
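
A rough model of the probe, in Python. The two bit values match the
PERF_SAMPLE_* enum in the perf_event UAPI header. The
`recent_kernel_accepts` function is a hypothetical stand-in for the kernel's
check as described above, not the kernel code.

```python
PERF_SAMPLE_TID = 1 << 1
PERF_SAMPLE_READ = 1 << 4


def probe_sample_type():
    """sample_type to use when probing for inherit + SAMPLE_READ support."""
    return PERF_SAMPLE_READ | PERF_SAMPLE_TID


def recent_kernel_accepts(inherit, sample_type):
    """Hypothetical model: inherit with _READ is only accepted with _TID."""
    if inherit and (sample_type & PERF_SAMPLE_READ):
        return bool(sample_type & PERF_SAMPLE_TID)
    return True


# Without _TID the probe fails even though the kernel has the feature.
print(recent_kernel_accepts(True, PERF_SAMPLE_READ))
print(recent_kernel_accepts(True, probe_sample_type()))
```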

Reviewed-by: Ian Rogers <irogers@google.com>
Fixes: 3b193a57baf15c468 ("perf tools: Detect missing kernel features properly")
Reported-and-tested-by: Chen, Zide <zide.chen@intel.com>
Closes: https://lore.kernel.org/lkml/20251022220802.1335131-1-zide.chen@intel.com/
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf symbol: Remove unneeded semicolon
Chen Ni [Tue, 11 Nov 2025 05:44:15 +0000 (13:44 +0800)] 
perf symbol: Remove unneeded semicolon

Remove unnecessary semicolons reported by Coccinelle/coccicheck and the
semantic patch at scripts/coccinelle/misc/semicolon.cocci.

Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf test: Add test that command line period overrides sysfs/json values
Ian Rogers [Sun, 9 Nov 2025 00:59:59 +0000 (16:59 -0800)] 
perf test: Add test that command line period overrides sysfs/json values

The behavior of weak terms is subtle, so add a test to ensure they aren't
accidentally broken. The test finds an event with a weak 'period' and
then overrides it. If no such event is present, the test is skipped.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
5 weeks agoperf pmu: Make pmu_alias_terms weak again
Ian Rogers [Sun, 9 Nov 2025 00:59:58 +0000 (16:59 -0800)] 
perf pmu: Make pmu_alias_terms weak again

The terms for a json event should be weak so they don't override
command line options.

Before:
```
$ perf record -vv -c 1000 -e uops_issued.any -o /dev/null true 2>&1
|grep "{ sample_period, sample_freq }"
 { sample_period, sample_freq }   200003
 { sample_period, sample_freq }   2000003
 { sample_period, sample_freq }   1000
```

After:
```
$ perf record -vv -c 1000 -e uops_issued.any -o /dev/null true 2>&1
|grep "{ sample_period, sample_freq }"
 { sample_period, sample_freq }   1000
 { sample_period, sample_freq }   1000
 { sample_period, sample_freq }   1000
```
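
The precedence shown by the before/after output can be sketched as follows.
This is Python under assumptions: `Term` and `resolve_period` are
hypothetical names, and the rule that a non-weak term given on the event
itself beats the command line is an assumption added for completeness.

```python
from dataclasses import dataclass


@dataclass
class Term:
    name: str
    value: int
    weak: bool = False  # weak terms come from json/sysfs event definitions


def resolve_period(terms, cmdline_period=None):
    """Return the sample period that should end up in the event attribute."""
    strong = [t for t in terms if t.name == "period" and not t.weak]
    weak = [t for t in terms if t.name == "period" and t.weak]
    if strong:
        return strong[-1].value  # assumed: explicit event term wins
    if cmdline_period is not None:
        return cmdline_period  # e.g. -c 1000 overrides weak terms
    if weak:
        return weak[-1].value  # json default applies only as a fallback
    return None


json_terms = [Term("period", 200003, weak=True)]
print(resolve_period(json_terms, cmdline_period=1000))
print(resolve_period(json_terms))
```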

Fixes: 84bae3af20d0 ("perf pmu: Don't eagerly parse event terms")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
6 weeks agoperf tool: Add a delegate_tool that just delegates actions to another tool
Ian Rogers [Fri, 7 Nov 2025 17:07:12 +0000 (09:07 -0800)] 
perf tool: Add a delegate_tool that just delegates actions to another tool

Add the ability to compose perf_tools by having one perform
an action and then call a delegate. Currently the perf_tools have
if-then-elses setting the callback and then if-then-elses within the
callback. Understanding the behavior is complex as it is in two places
and logic for numerous operations, within things like perf inject, is
interwoven. By chaining perf_tools together based on command line
options this kind of code can be avoided.
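
The chaining idea can be illustrated with a small Python sketch. All class
and function names here (`Tool`, `DelegateTool`, `DropKernelSamples`,
`Collector`, `build_tool`) are hypothetical and only mirror the pattern, not
the perf_tool structure.

```python
class Tool:
    """Base tool: every callback is a no-op returning success."""

    def sample(self, event):
        return 0


class DelegateTool(Tool):
    """Forwards every callback to another tool unless overridden."""

    def __init__(self, delegate):
        self.delegate = delegate

    def sample(self, event):
        return self.delegate.sample(event)


class DropKernelSamples(DelegateTool):
    """Performs one action (filtering), then hands over to the delegate."""

    def sample(self, event):
        if event.get("kernel"):
            return 0
        return super().sample(event)


class Collector(Tool):
    def __init__(self):
        self.seen = []

    def sample(self, event):
        self.seen.append(event["ip"])
        return 0


def build_tool(sink, drop_kernel):
    """Chain tools based on options instead of branching inside callbacks."""
    tool = sink
    if drop_kernel:
        tool = DropKernelSamples(tool)
    return tool


sink = Collector()
tool = build_tool(sink, drop_kernel=True)
tool.sample({"ip": 1, "kernel": True})
tool.sample({"ip": 2, "kernel": False})
print(sink.seen)
```

Each option adds one wrapper when the chain is built, so no callback needs
its own if-then-else on command line state.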

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
6 weeks agoperf tool: Add the perf_tool argument to all callbacks
Ian Rogers [Fri, 7 Nov 2025 17:07:11 +0000 (09:07 -0800)] 
perf tool: Add the perf_tool argument to all callbacks

Getting context for what a tool is doing, such as the perf_inject
instance, by using container_of on the tool is a common pattern in the
code. This isn't possible in the event_op2, event_op3 and event_op4
callbacks as the tool isn't passed. Add the argument and then fix function
signatures to match. As tools may be reading a tool from somewhere
else, change that code to use the passed-in tool.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
6 weeks agoperf vendor events arm64:: Add i.MX94 DDR Performance Monitor metrics
Xu Yang [Thu, 21 Aug 2025 11:01:51 +0000 (19:01 +0800)] 
perf vendor events arm64:: Add i.MX94 DDR Performance Monitor metrics

Add JSON metrics for i.MX94 DDR Performance Monitor.

Reviewed-by: Peng Fan <peng.fan@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: Xu Yang <xu.yang_2@nxp.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
6 weeks agoperf stat: Add ScaleUnit to {cpu,task}-clock JSON description
Namhyung Kim [Thu, 6 Nov 2025 21:53:50 +0000 (13:53 -0800)] 
perf stat: Add ScaleUnit to {cpu,task}-clock JSON description

This changes the output of the event like below.  In fact, that's the
output it used to have before the JSON conversion.

Before:
  $ perf stat -e task-clock true

   Performance counter stats for 'true':

             313,848      task-clock                       #    0.290 CPUs utilized

         0.001081223 seconds time elapsed

         0.001122000 seconds user
         0.000000000 seconds sys

After:
  $ perf stat -e task-clock true

   Performance counter stats for 'true':

                0.36 msec task-clock                       #    0.297 CPUs utilized

         0.001225435 seconds time elapsed

         0.001268000 seconds user
         0.000000000 seconds sys
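
The effect of a ScaleUnit can be sketched in Python. The string "1e-6msec"
is an assumed value chosen to match the nanosecond-to-millisecond conversion
shown above, and `apply_scale_unit` is a hypothetical helper, not perf's
parser.

```python
import re


def apply_scale_unit(raw_count, scale_unit):
    """Split a '<scale><unit>' string and format the scaled count."""
    match = re.fullmatch(r"([0-9.]+(?:[eE][-+]?[0-9]+)?)(.*)", scale_unit)
    if match is None:
        raise ValueError(f"Invalid ScaleUnit: {scale_unit!r}")
    scale, unit = float(match.group(1)), match.group(2)
    return f"{raw_count * scale:,.2f} {unit}"


# The raw nanosecond count from the "Before" output.
print(apply_scale_unit(313_848, "1e-6msec"))
```

The "After" output above shows 0.36 msec rather than this value because it
comes from a separate run.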

Reviewed-by: Ian Rogers <irogers@google.com>
Fixes: 9957d8c801fe0cb90 ("perf jevents: Add common software event json")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
6 weeks agoperf record: Make sure to update build-ID cache
Namhyung Kim [Thu, 6 Nov 2025 19:00:23 +0000 (11:00 -0800)] 
perf record: Make sure to update build-ID cache

A recent change enabling --buildid-mmap by default brought an issue
with build-ID handling.  With build-IDs in MMAP2 records, we don't need
to save the build-ID table in the header of a perf data file.

But the actual file contents still need to be cached in the debug
directory for annotation etc.  Split the build-ID header processing and
caching, and make sure perf record saves hit DSOs in the build-ID cache
by moving perf_session__cache_build_ids() to the end of
record__finish_output().

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
6 weeks agoperf jevents: Make all tables static
Ian Rogers [Fri, 24 Oct 2025 17:58:41 +0000 (10:58 -0700)] 
perf jevents: Make all tables static

The tables created by jevents.py are only used within the pmu-events.c
file. Change the declarations of those global variables to be static
to encapsulate this.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
6 weeks agoperf metricgroup: When copy metrics copy default information
Ian Rogers [Fri, 24 Oct 2025 17:58:39 +0000 (10:58 -0700)] 
perf metricgroup: When copy metrics copy default information

When copying metrics into a group, also copy the default information from
the original metrics.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
6 weeks agoperf metricgroup: Missed free on error path
Ian Rogers [Fri, 24 Oct 2025 17:58:38 +0000 (10:58 -0700)] 
perf metricgroup: Missed free on error path

If an out-of-memory error occurs, the expr also needs freeing.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
6 weeks agoperf metricgroup: Update comment on location of metric_event list
Ian Rogers [Fri, 24 Oct 2025 17:58:37 +0000 (10:58 -0700)] 
perf metricgroup: Update comment on location of metric_event list

Update comment as the stat_config no longer holds all metrics.

Signed-off-by: Ian Rogers <irogers@google.com>
Fixes: faebee18d720 ("perf stat: Move metric list from config to evlist")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
6 weeks agoperf evsel: Remove unused metric_events variable
Ian Rogers [Fri, 24 Oct 2025 17:58:36 +0000 (10:58 -0700)] 
perf evsel: Remove unused metric_events variable

The metric_events exist in the metric_expr list and so this variable
has been unused for a while.

Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
7 weeks agoperf tools: Cache counter names for raw samples on s390
Ian Rogers [Fri, 31 Oct 2025 19:42:16 +0000 (12:42 -0700)] 
perf tools: Cache counter names for raw samples on s390

Searching all event names is slower now that legacy names are
included. Add a cache to avoid long iterative searches. Note, the
cache isn't cleaned up and is as such a memory leak, however, globally
reachable leaks like this aren't treated as leaks by leak sanitizer.
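
A minimal sketch of such a cache, in Python. The table contents and the
names `find_name_slow`, `counter_name` and `stats` are hypothetical. As in
the description, the cache is a global that is never emptied.

```python
stats = {"scans": 0}  # counts how many slow table scans were needed
_name_cache = {}      # global, never cleaned up, as described above


def find_name_slow(table, counter_nr):
    """Iterative search over every (number, name) pair."""
    stats["scans"] += 1
    for nr, name in table:
        if nr == counter_nr:
            return name
    return None


def counter_name(table, counter_nr):
    """Look a counter name up once, then serve repeats from the cache."""
    if counter_nr not in _name_cache:
        _name_cache[counter_nr] = find_name_slow(table, counter_nr)
    return _name_cache[counter_nr]


table = [(0, "CPU_CYCLES"), (1, "INSTRUCTIONS")]  # illustrative entries
names = [counter_name(table, 1) for _ in range(3)]
print(names, stats["scans"])
```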

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Closes: https://lore.kernel.org/linux-perf-users/09943f4f-516c-4b93-877c-e4a64ed61d38@linux.ibm.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
7 weeks agoperf trace: Increase syscall handler map size to 1024
Namhyung Kim [Mon, 19 May 2025 23:25:39 +0000 (16:25 -0700)] 
perf trace: Increase syscall handler map size to 1024

The syscalls_sys_{enter,exit} map in augmented_raw_syscalls.bpf.c has
max entries of 512.  Usually syscall numbers are smaller than this but
x86 has x32 ABI where syscalls start from 512.

That makes trace__init_syscalls_bpf_prog_array_maps() fail in the middle
of the loop when it accesses those keys.  As the loop iteration is not
ordered by syscall numbers anymore, the failure can affect non-x32
syscalls.

Let's increase the map size to 1024 so that it can handle those ABIs
too.  While most systems won't need this, increasing the size will be
safer for potential future changes.
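
The failure mode can be modeled with a short Python sketch. The
`init_prog_array` function is a hypothetical simplification of the
initialization loop, and the syscall numbers are only examples.

```python
def init_prog_array(max_entries, syscall_ids):
    """Install one handler per syscall id, stopping at the first bad key."""
    installed = {}
    for nr in syscall_ids:
        if nr >= max_entries:
            # A fixed-size map rejects keys outside [0, max_entries).
            return installed, False
        installed[nr] = f"handler_{nr}"
    return installed, True


# Iteration is not ordered by number, so an x32 id (>= 512) can come early.
ids = [0, 1, 512, 2]
print(init_prog_array(512, ids))
print(init_prog_array(1024, ids))
```

With 512 entries the loop stops at key 512 and syscall 2 never gets a
handler, although it is not an x32 syscall. With 1024 entries all four are
installed.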

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
7 weeks agoperf vendor events AmpereOneX: Fix spelling typo in the metrics file
Chu Guangqing [Fri, 31 Oct 2025 02:58:10 +0000 (10:58 +0800)] 
perf vendor events AmpereOneX: Fix spelling typo in the metrics file

The json file incorrectly used "acceses" instead of "accesses".

Signed-off-by: Chu Guangqing <chuguangqing@inspur.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
7 weeks agoperf vendor events arm64: Fix typo in Ampere eMag json file
Chu Guangqing [Fri, 31 Oct 2025 03:17:29 +0000 (11:17 +0800)] 
perf vendor events arm64: Fix typo in Ampere eMag json file

Correct instruction spelling errors.

Signed-off-by: Chu Guangqing <chuguangqing@inspur.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
7 weeks agoperf record: skip synthesize event when open evsel failed
Shuai Xue [Thu, 23 Oct 2025 01:50:43 +0000 (09:50 +0800)] 
perf record: skip synthesize event when open evsel failed

When using perf record with the `--overwrite` option, a segmentation fault
occurs if an event fails to open. For example:

  perf record -e cycles-ct -F 1000 -a --overwrite
  Error:
  cycles-ct:H: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
  perf: Segmentation fault
      #0 0x6466b6 in dump_stack debug.c:366
      #1 0x646729 in sighandler_dump_stack debug.c:378
      #2 0x453fd1 in sigsegv_handler builtin-record.c:722
      #3 0x7f8454e65090 in __restore_rt libc-2.32.so[54090]
      #4 0x6c5671 in __perf_event__synthesize_id_index synthetic-events.c:1862
      #5 0x6c5ac0 in perf_event__synthesize_id_index synthetic-events.c:1943
      #6 0x458090 in record__synthesize builtin-record.c:2075
      #7 0x45a85a in __cmd_record builtin-record.c:2888
      #8 0x45deb6 in cmd_record builtin-record.c:4374
      #9 0x4e5e33 in run_builtin perf.c:349
      #10 0x4e60bf in handle_internal_command perf.c:401
      #11 0x4e6215 in run_argv perf.c:448
      #12 0x4e653a in main perf.c:555
      #13 0x7f8454e4fa72 in __libc_start_main libc-2.32.so[3ea72]
      #14 0x43a3ee in _start ??:0

The --overwrite option implies --tail-synthesize, which collects non-sample
events reflecting the system status when recording finishes. However, when
evsel opening fails (e.g., unsupported event 'cycles-ct'), session->evlist
is not initialized and remains NULL. The code unconditionally calls
record__synthesize() in the error path, which iterates through the NULL
evlist pointer and causes a segfault.

To fix it, move the record__synthesize() call inside the error check block, so
it's only called when there was no error during recording, ensuring that evlist
is properly initialized.
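
The control-flow change can be sketched as below. This is a Python
illustration only. `finish_recording` and its parameters are hypothetical
and do not reproduce the structure of builtin-record.c.

```python
def finish_recording(evlist, status, tail_synthesize, synthesize):
    """Run tail synthesis only when recording finished without error."""
    if status == 0:
        # evlist is known to be initialized on the success path.
        if tail_synthesize:
            synthesize(evlist)
    return status


calls = []
# Failed open: evlist was never set up, so synthesis must be skipped.
finish_recording(None, -1, True, calls.append)
# Successful run: synthesis happens once with the real evlist.
finish_recording(["cycles"], 0, True, calls.append)
print(calls)
```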

Fixes: 4ea648aec019 ("perf record: Add --tail-synthesize option")
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
7 weeks agoperf lock contention: Load kernel map before lookup
Namhyung Kim [Thu, 30 Oct 2025 04:01:39 +0000 (21:01 -0700)] 
perf lock contention: Load kernel map before lookup

On some machines, perf lock contention had trouble finding kernel
symbols.  I think it's because kernel modules and kallsyms are messed
up during load and split.

Basically we want to make sure the kernel map is loaded and the code has
it in the lock_contention_read().  But recently we added more lookups in
the lock_contention_prepare() which is called before _read().

Also the kernel map (kallsyms) may not be the first one in the group
like on ARM.  Let's use machine__kernel_map() rather than just loading
the first map.

Reviewed-by: Ian Rogers <irogers@google.com>
Fixes: 688d2e8de231c54e ("perf lock contention: Add -l/--lock-addr option")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
7 weeks agoperf test workload: Add thread count argument to thloop
Ian Rogers [Tue, 28 Oct 2025 15:38:20 +0000 (08:38 -0700)] 
perf test workload: Add thread count argument to thloop

Allow the number of threads for the thloop workload to be increased
beyond the normal 2. Add error checking to the parsed time and thread
count values.
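
The argument handling described here might look like the following Python
sketch. The argument order (time first, then thread count), the default of
1 second and the helper names are assumptions; only the default of 2 threads
comes from the description.

```python
def parse_positive_int(text, what):
    """Parse a strictly positive integer or raise ValueError."""
    try:
        value = int(text)
    except ValueError:
        raise ValueError(f"Invalid {what}: {text!r}") from None
    if value <= 0:
        raise ValueError(f"Invalid {what}: {text!r}")
    return value


def parse_thloop_args(argv, default_seconds=1, default_threads=2):
    """Return (seconds, threads) from optional positional arguments."""
    seconds = parse_positive_int(argv[0], "time") if len(argv) > 0 else default_seconds
    threads = parse_positive_int(argv[1], "thread count") if len(argv) > 1 else default_threads
    return seconds, threads


print(parse_thloop_args([]))
print(parse_thloop_args(["3", "4"]))
```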

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>