git.ipfire.org Git - thirdparty/kernel/linux.git/log

perf list: Skip ABI PMUs when printing pmu values

Avoid printing tracepoint, legacy and software events when listing for
the pmu option. Add the PMU type to the print_event callbacks to ease
detection.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250725185202.68671-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf list: Remove tracepoint printing code

Now that the tp_pmu can iterate and describe events remove the custom
tracepoint printing logic, this avoids perf list showing the
tracepoint events twice.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250725185202.68671-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf tp_pmu: Add event APIs

Add event APIs for the tracepoint PMU allowing things like perf list
to function using it. For perf list add the tracepoint format in the
long description (shown with -v).

  $ sudo perf list -v tracepoint

  List of pre-defined events (to be used in -e or -M):

    alarmtimer:alarmtimer_cancel                       [Tracepoint event]
         [name: alarmtimer_cancel
          ID: 416
          format:
          field:unsigned short common_type; offset:0; size:2; signed:0;
          field:unsigned char common_flags; offset:2; size:1; signed:0;
          field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
          field:int common_pid; offset:4; size:4; signed:1;
          field:void * alarm; offset:8; size:8; signed:0;
          field:unsigned char alarm_type; offset:16; size:1; signed:0;
          field:s64 expires; offset:24; size:8; signed:1;
          field:s64 now; offset:32; size:8; signed:1;
          print fmt: "alarmtimer:%p type:%s expires:%llu now:%llu",REC->alarm,__print_flags((1 << REC->alarm_type)," | ",{ 1 << 0,
          "REALTIME" },{ 1 << 1,"BOOTTIME" },{ 1 << 3,"REALTIME Freezer" },{ 1 << 4,"BOOTTIME Freezer" }),REC->expires,REC->now
          . Unit: tracepoint]
    alarmtimer:alarmtimer_fired                        [Tracepoint event]
         [name: alarmtimer_fired
          ID: 418
          ...

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250725185202.68671-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf tp_pmu: Factor existing tracepoint logic to new file

Start the creation of a tracepoint PMU abstraction. Tracepoint events
don't follow the regular sysfs perf conventions. Eventually the new
PMU abstraction will bridge the gap so tracepoint events look more
like regular perf ones.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250725185202.68671-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf parse-events: Remove non-json software events

Remove the hard coded encodings from parse-events. This has the
consequence that software events are matched using the sysfs/json
priority, will be case insensitive and will be wildcarded across PMUs.
As there were software and hardware types in the parsing code, the
removal means software vs hardware logic can be removed and hardware
assumed.

Now the perf json provides detailed descriptions of software events,
remove the previous listing support that didn't contain event
descriptions. When globbing is required for the "sw" option in perf
list, use string PMU globbing as was done previously for the tool PMU.

The output of `perf list sw` command changed like this.

Before:
  List of pre-defined events (to be used in -e or -M):

    alignment-faults                                   [Software event]
    bpf-output                                         [Software event]
    cgroup-switches                                    [Software event]
    context-switches OR cs                             [Software event]
    cpu-clock                                          [Software event]
    cpu-migrations OR migrations                       [Software event]
    dummy                                              [Software event]
    emulation-faults                                   [Software event]
    major-faults                                       [Software event]
    minor-faults                                       [Software event]
    page-faults OR faults                              [Software event]
    task-clock                                         [Software event]

After:
  List of pre-defined events (to be used in -e or -M):

  software:
    alignment-faults
         [Number of kernel handled memory alignment faults. Unit: software]
    bpf-output
         [An event used by BPF programs to write to the perf ring buffer. Unit: software]
    cgroup-switches
         [Number of context switches to a task in a different cgroup. Unit: software]
    context-switches
         [Number of context switches [This event is an alias of cs]. Unit: software]
    cpu-clock
         [Per-CPU high-resolution timer based event. Unit: software]
    cpu-migrations
         [Number of times a process has migrated to a new CPU [This event is an alias of migrations]. Unit: software]
    cs
         [Number of context switches [This event is an alias of context-switches]. Unit: software]
    dummy
         [A placeholder event that doesn't count anything. Unit: software]
    ...

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250725185202.68671-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf jevents: Add common software event json

Add json for software events so that in perf list the events can have
a description. Common json exists for the tool PMU but it has no
sysfs equivalent. Modify the map_for_pmu code to return the common map
(rather than an architecture specific one) when a PMU with a common
name is being looked for, this allows the events to be found.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250725185202.68671-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf tools: Remove libtraceevent in .gitignore

The libtraceevent has been removed from the source tree, and .gitignore
needs to be updated as well.

Fixes: 4171925aa9f3f7bf ("tools lib traceevent: Remove libtraceevent")
Signed-off-by: Chen Pei <cp0613@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250726111532.8031-1-cp0613@linux.alibaba.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf test: Fix comment ordering

The previous commit that introduced this test overlooked a behavior of
"perf test list", causing it to print "SPDX-License-Identifier: GPL-2.0"
as a description for that test. This reorders the comments to fix that
issue.

Fixes: edf2cadf01e8 ("perf test: add test for BPF metadata collection")
Signed-off-by: Blake Jones <blakejones@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250726004023.3466563-1-blakejones@google.com
[ update the commit message a little bit ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sort: Use perf_env to set arch sort keys and header

Previously arch_support_sort_key and arch_perf_header_entry used a
weak symbol to compile as appropriate for x86 and powerpc. A
limitation to this is that the handling of a data file could vary in
cross-platform development. Change to using the perf_env of the
current session to determine the architecture kind and set the sort
key and header entries as appropriate.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-23-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf test: Move PERF_SAMPLE_WEIGHT_STRUCT parsing to common test

test__x86_sample_parsing is identical to test__sample_parsing except
it explicitly tested PERF_SAMPLE_WEIGHT_STRUCT. Now the parsing code
is common move the PERF_SAMPLE_WEIGHT_STRUCT to the common sample
parsing test and remove the x86 version.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-22-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sample: Remove arch notion of sample parsing

By definition arch sample parsing and synthesis will inhibit certain
kinds of cross-platform record then analysis (report, script,
etc.). Remove arch_perf_parse_sample_weight and
arch_perf_synthesize_sample_weight replacing with a common
implementation. Combine perf_sample p_stage_cyc and retire_lat as
weight3 to capture the differing uses regardless of compiled for
architecture.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-21-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf env: Remove global perf_env

The global perf_env was used for the host, but if a perf_env wasn't
easy to come by it was used in a lot of places where potentially
recorded and host data could be confused. Remove the global variable
as now the majority of accesses retrieve the perf_env for the host
from the session.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-20-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf trace: Avoid global perf_env with evsel__env

There is no session in perf trace unless in replay mode, so in host
mode no session can be associated with the evlist. If the evsel__env
call fails resort to the host_env that's part of the trace. Remove
errno_to_name as it becomes a called once 1-line function once the
argument is turned into a perf_env, just call perf_env__arch_strerrno
directly.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-19-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf auxtrace: Pass perf_env from session through to mmap read

auxtrace_mmap__read and auxtrace_mmap__read_snapshot end up calling
`evsel__env(NULL)` which returns the global perf_env variable for the
host. Their only call is in perf record. Rather than use the global
variable pass through the perf_env for `perf record`.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-18-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf machine: Explicitly pass in host perf_env

When creating a machine for the host explicitly pass in a scoped
perf_env. This removes a use of the global perf_env.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-17-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf bench synthesize: Avoid use of global perf_env

The benchmark doesn't use a data file and so the header perf_env isn't
used. Stack allocate a host perf_env for use to avoid the use of the
global perf_env.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-16-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf top: Make perf_env locally scoped

The use of the global host perf_env variable is potentially
inconsistent within the code. Switch perf top to using a locally
scoped variable that is generally accessed through the session.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-15-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf session: Add host_env argument to perf_session__new

When creating a perf_session the host perf_env may or may not want to
be used. For example, `perf top` uses a host perf_env while `perf
inject` does not. Add a host_env argument to perf_session__new so that
sessions requiring a host perf_env can pass it in. Currently if none
is specified the global perf_env variable is used, but this will
change in later patches.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-14-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf test: Avoid use perf_env

The perf_env global variable holds the host perf_env data but its use
is hit and miss. Switch to using local perf_env variables and ensure
scoped perf_env__init and perf_env__exit. This loses command line
setting of the perf_env, but this doesn't matter for tests. So the
perf_env is fully initialized, clear it with memset in perf_env__init.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-13-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf header: Clean up use of perf_env

Always use the perf_env from the feat_fd's perf_header. Cache the
value on entry to a function in `env` and use `env->` consistently in
the code. Ensure the header is initialized for use in
perf_session__do_write_header.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf evlist: Change env variable to session

The session holds a perf_env pointer env. In UI code container_of is
used to turn the env to a session, but this assumes the session
header's env is in use. Rather than a dubious container_of, hold the
session in the evlist and derive the env from the session with
evsel__env, perf_session__env, etc.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf session: Add accessor for session->header.env

The perf_env from the header in the session is frequently accessed,
add an accessor function rather than access directly. Cache the value
to avoid repeated calls. No behavioral change.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf record: Make --buildid-mmap the default

Support for build IDs in mmap2 perf events has been present since
Linux v5.12:
https://lore.kernel.org/lkml/20210219194619.1780437-1-acme@kernel.org/
Build ID mmap events don't avoid the need to inject build IDs for DSO
touched by samples as the build ID cache is populated by perf
record. They can avoid some cases of symbol mis-resolution caused by
the file system changing from when a sample occurred and when the DSO
is sought.

Unlike the --buildid-mmap option, this chnage doesn't disable the
build ID cache but it does disable the processing of samples looking
for DSOs to inject build IDs for. To disable the build ID cache the -B
(--no-buildid) option should be used.

Making this option the default was raised on the list in:
https://lore.kernel.org/linux-perf-users/CAP-5=fXP7jN_QrGUcd55_QH5J-Y-FCaJ6=NaHVtyx0oyNh8_-Q@mail.gmail.com/

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf jitdump: Directly mark the jitdump DSO

The DSO being generated was being accessed through a thread's maps,
this is unnecessary as the dso can just be directly found. This avoids
problems with passing a NULL evsel which may be inspected to determine
properties of a callchain when using the buildid DSO marking code.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf dso: Move build_id to dso_id

The dso_id previously contained the major, minor, inode and inode
generation information from a mmap2 event - the inode generation would
be zero when reading from /proc/pid/maps. The build_id was in the
dso. With build ID mmap2 events these fields wouldn't be initialized
which would largely mean the special empty case where any dso would
match for equality. This isn't desirable as if a dso is replaced we
want the comparison to yield a difference.

To support detecting the difference between DSOs based on build_id,
move the build_id out of the DSO and into the dso_id. The dso_id is
also stored in the DSO so nothing is lost. Capture in the dso_id what
parts have been initialized and rename dso_id__inject to
dso_id__improve_id so that it is clear the dso_id is being improved
upon with additional information. With the build_id in the dso_id, use
memcmp to compare for equality.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf build-id: Ensure struct build_id is empty before use

If a build ID is read then not all code paths may ensure it is empty
before use. Initialize the build_id to be zero-ed unless there is
clear initialization such as a call to build_id__init.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf build-id: Mark DSO in sample callchains

Previously only the sample IP's map DSO would be marked hit for the
purposes of populating the build ID cache. Walk the call chain to mark
all IPs and DSOs.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf build-id: Change sprintf functions to snprintf

Pass in a size argument rather than implying all build id strings must
be SBUILD_ID_SIZE.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-4-irogers@google.com
[ fixed some build errors ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf build-id: Truncate to avoid overflowing the build_id data

Warning when the build_id data would be overflowed would lead to
memory corruption, switch to truncation.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf build-id: Reduce size of "size" variable

Later clean up of the dso_id to include a build_id will suffer from
alignment and size issues. The size can only hold up to a value of
BUILD_ID_SIZE (20) and the mmap2 event uses a byte for the value.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250724163302.596743-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf metricgroups: Add NO_THRESHOLD_AND_NMI constraint

Thresholds can increase the number of counters a metric needs. The NMI
watchdog can take away a counter (hopefully the buddy watchdog will
become the default and this will no longer be true). Add a new
constraint for the case that a metric and its thresholds would fit in
counters but only if the NMI watchdog isn't enabled. Either the
threshold or the NMI watchdog should be disabled to make the metric
fit. Wire this up into the metric__group_events logic.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250719030517.1990983-16-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf parse-events: Fix missing slots for Intel topdown metric events

Topdown metric events require grouping with a slots event. In perf
metrics this is currently achieved by metrics adding an unnecessary
"0 * tma_info_thread_slots". New TMA metrics trigger optimizations of
the metric expression that removes the event and breaks the metric due
to the missing but required event. Add a pass immediately before
sorting and fixing parsed events, that insert a slots event if one is
missing. Update test expectations to match this.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250719030517.1990983-15-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf topdown: Use attribute to see an event is a topdown metic or slots

The string comparisons were overly broad and could fire for the
incorrect PMU and events. Switch to using the config in the attribute
then add a perf test to confirm the attribute config values match
those of parsed events of that name and don't match others. This
exposed matches for slots events that shouldn't have matched as the
slots fixed counter event, such as topdown.slots_p.

Fixes: fbc798316bef ("perf x86/topdown: Refine helper arch_is_topdown_metrics()")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250719030517.1990983-14-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf parse-events: Support user CPUs mixed with threads/processes

Counting events system-wide with a specified CPU prior to this change
worked:
```
$ perf stat -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' -a sleep 1

  Performance counter stats for 'system wide':

     59,393,419,099      msr/tsc/
     33,927,965,927      msr/tsc,cpu=cpu_core/
     25,465,608,044      msr/tsc,cpu=cpu_atom/
```

However, when counting with process the counts became system wide:
```
$ perf stat -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10
10.1: Basic parsing test                                            : Ok
10.2: Parsing without PMU name                                      : Ok
10.3: Parsing with PMU name                                         : Ok

Performance counter stats for 'perf test -F 10':

        59,233,549      msr/tsc/
        59,227,556      msr/tsc,cpu=cpu_core/
        59,224,053      msr/tsc,cpu=cpu_atom/
```

Make the handling of CPU maps with event parsing clearer. When an
event is parsed creating an evsel the cpus should be either the PMU's
cpumask or user specified CPUs.

Update perf_evlist__propagate_maps so that it doesn't clobber the user
specified CPUs. Try to make the behavior clearer, firstly fix up
missing cpumasks. Next, perform sanity checks and adjustments from the
global evlist CPU requests and for the PMU including simplifying to
the "any CPU"(-1) value. Finally remove the event if the cpumask is
empty.

So that events are opened with a CPU and a thread change stat's
create_perf_stat_counter to give both.

With the change things are fixed:
```
$ perf stat --no-scale -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10
10.1: Basic parsing test                                            : Ok
10.2: Parsing without PMU name                                      : Ok
10.3: Parsing with PMU name                                         : Ok

Performance counter stats for 'perf test -F 10':

        63,704,975      msr/tsc/
        47,060,704      msr/tsc,cpu=cpu_core/                        (4.62%)
        16,640,591      msr/tsc,cpu=cpu_atom/                        (2.18%)
```

However, note the "--no-scale" option is used. This is necessary as
the running time for the event on the counter isn't the same as the
enabled time because the thread doesn't necessarily run on the CPUs
specified for the counter. All counter values are scaled with:

  scaled_value = value * time_enabled / time_running

and so without --no-scale the scaled_value becomes very large. This
problem already exists on hybrid systems for the same reason. Here are
2 runs of the same code with an instructions event that counts the
same on both types of core, there is no real multiplexing happening on
the event:

```
$ perf stat -e instructions perf test -F 10
...
Performance counter stats for 'perf test -F 10':

        87,896,447      cpu_atom/instructions/                       (14.37%)
        98,171,964      cpu_core/instructions/                       (85.63%)
...
$ perf stat --no-scale -e instructions perf test -F 10
...
Performance counter stats for 'perf test -F 10':

        13,069,890      cpu_atom/instructions/                       (19.32%)
        83,460,274      cpu_core/instructions/                       (80.68%)
...
```
The scaling has inflated per-PMU instruction counts and the overall
count by 2x.

To fix this the kernel needs changing when a task+CPU event (or just
task event on hybrid) is scheduled out. A fix could be that the state
isn't inactive but off for such events, so that time_enabled counts
don't accumulate on them.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250719030517.1990983-13-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf evsel: Add evsel__open_per_cpu_and_thread

Add evsel__open_per_cpu_and_thread that combines the operation of
evsel__open_per_cpu and evsel__open_per_thread so that an event
without the "any" cpumask can be opened with its cpumask and with
threads it specifies. Change the implementation of evsel__open_per_cpu
and evsel__open_per_thread to use evsel__open_per_cpu_and_thread to
make the implementation of those functions clearer.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250719030517.1990983-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf parse-events: Minor __add_event refactoring

Rename cpu_list to user_cpus. If a PMU isn't given, find it early from
the perf_event_attr. Make the pmu_cpus more explicitly a copy from the
PMU (except when user_cpus are given). Derive the cpus from pmu_cpus
and user_cpus as appropriate. Handle strdup errors on name and
metric_id.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250719030517.1990983-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf pmus: Factor perf_pmus__find_by_attr out of evsel__find_pmu

Allow a PMU to be found by a perf_event_attr, useful when creating
evsels.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250719030517.1990983-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf evsel: Use libperf perf_evsel__exit

Avoid the duplicated code and better enable perf_evsel to change.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250719030517.1990983-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

libperf evsel: Factor perf_evsel__exit out of perf_evsel__delete

This allows the perf_evsel__exit to be called when the struct
perf_evsel is embedded inside another struct, such as struct evsel in
perf.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250719030517.1990983-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

libperf evsel: Rename own_cpus to pmu_cpus

own_cpus is generally the cpumask from the PMU. Rename to pmu_cpus to
try to make this clearer. Variable rename with no other changes.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250719030517.1990983-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf tool_pmu: Allow num_cpus(_online) to be specific to a cpumask

For hybrid metrics it is useful to know the number of p-core or e-core
CPUs. If a cpumask is specified for the num_cpus or num_cpus_online
tool events, compute the value relative to the given mask rather than
for the full system.

```
$ sudo /tmp/perf/perf stat -e 'tool/num_cpus/,tool/num_cpus,cpu=cpu_core/,
  tool/num_cpus,cpu=cpu_atom/,tool/num_cpus_online/,tool/num_cpus_online,
  cpu=cpu_core/,tool/num_cpus_online,cpu=cpu_atom/' true

Performance counter stats for 'true':

                28      tool/num_cpus/
                16      tool/num_cpus,cpu=cpu_core/
                12      tool/num_cpus,cpu=cpu_atom/
                28      tool/num_cpus_online/
                16      tool/num_cpus_online,cpu=cpu_core/
                12      tool/num_cpus_online,cpu=cpu_atom/

       0.000767205 seconds time elapsed

       0.000938000 seconds user
       0.000000000 seconds sys
```

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250719030517.1990983-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf parse-events: Allow the cpu term to be a PMU or CPU range

On hybrid systems, events like msr/tsc/ will aggregate counts across
all CPUs. Often metrics only want a value like msr/tsc/ for the cores
on which the metric is being computed. Listing each CPU with terms
cpu=0,cpu=1.. is laborious and would need to be encoded for all
variations of a CPU model.

Allow the cpumask from a PMU to be an argument to the cpu term. For
example in the following the cpumask of the cstate_pkg PMU selects the
CPUs to count msr/tsc/ counter upon:
```
$ cat /sys/bus/event_source/devices/cstate_pkg/cpumask
0
$ perf stat -A -e 'msr/tsc,cpu=cstate_pkg/' -a sleep 0.1

Performance counter stats for 'system wide':

CPU0          252,621,253      msr/tsc,cpu=cstate_pkg/

       0.101184092 seconds time elapsed
```

As the cpu term is now also allowed to be a string, allow it to encode
a range of CPUs (a list can't be supported as ',' is already a special
token).

The "event qualifiers" section of the `perf list` man page is updated
to detail the additional behavior.  The man page formatting is tidied
up in this section, as it was incorrectly appearing within the
"parameterized events" section.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250719030517.1990983-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf stat: Don't size aggregation ids from user_requested_cpus

As evsels may have additional CPU terms, the user_requested_cpus may
not reflect all the CPUs requested. Use evlist->all_cpus to size the
array as that reflects all the CPUs potentially needed by the evlist.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250719030517.1990983-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf stat: Avoid buffer overflow to the aggregation map

CPUs may be created and passed to perf_stat__get_aggr (via
config->aggr_get_id), such as in the stat display
should_skip_zero_counter. There may be no such aggr_id, for example,
if running with a thread. Add a missing bound check and just create
IDs for these cases.

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250719030517.1990983-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf parse-events: Warn if a cpu term is unsupported by a CPU

Factor requested CPU warning out of evlist and into evsel. At the end
of adding an event, perform the warning check. To avoid repeatedly
testing if the cpu_list is empty, add a local variable.

```
$ perf stat -e cpu_atom/cycles,cpu=1/ -a true
WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cpu_atom/cycles/'

Performance counter stats for 'system wide':

<not supported> cpu_atom/cycles/

0.000781511 seconds time elapsed
```

Reviewed-by: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250719030517.1990983-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf pfm: Don't force loading of all PMUs

Force loading all PMUs adds significant cost because DRM and other
PMUs are loaded, it should also not be required if the pmus__
functions are used.

Tested by run perf test, in particular the pfm related tests. Also
`perf list` is identical before and after.

Before:
  $ time ./perf test pfm
   54: Test libpfm4 support                                            :
   54.1: test of individual --pfm-events                               : Ok
   54.2: test groups of --pfm-events                                   : Ok
  103: perf all libpfm4 events test                                    : Ok

  real 0m8.933s
  user 0m1.824s
  sys 0m7.122s

After:
  $ time ./perf test pfm
   54: Test libpfm4 support                                            :
   54.1: test of individual --pfm-events                               : Ok
   54.2: test groups of --pfm-events                                   : Ok
  103: perf all libpfm4 events test                                    : Ok

  real 0m5.259s
  user 0m1.793s
  sys 0m3.570s

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250722013449.146233-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf stat: Remove duplicated include in stat-shadow.c

The header files rblist.h is included twice in stat-shadow.c,
so one inclusion of each can be removed.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=22933
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20250723070418.2195172-1-yang.lee@linux.alibaba.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf ui scripts: Switch FILENAME_MAX to NAME_MAX

FILENAME_MAX is the same as PATH_MAX (4kb) in glibc rather than
NAME_MAX's 255. Switch to using NAME_MAX and ensure the '\0' is
accounted for in the path's buffer size.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250717150855.1032526-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf pmu: Switch FILENAME_MAX to NAME_MAX

FILENAME_MAX is the same as PATH_MAX (4kb) in glibc rather than
NAME_MAX's 255. Switch to using NAME_MAX and ensure the '\0' is
accounted for in the path's buffer size.

Fixes: 754baf426e09 ("perf pmu: Change aliases from list to hashmap")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250717150855.1032526-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

tools subcmd: Tighten the filename size in check_if_command_finished

FILENAME_MAX is often PATH_MAX (4kb), far more than needed for the
/proc path. Make the buffer size sufficient for the maximum integer
plus "/proc/" and "/status" with a '\0' terminator.

Fixes: 5ce42b5de461 ("tools subcmd: Add non-waitpid check_if_command_finished()")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250717150855.1032526-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf: ftrace: add graph tracer options args/retval/retval-hex/retaddr

This change adds support for new funcgraph tracer options funcgraph-args,
funcgraph-retval, funcgraph-retval-hex and funcgraph-retaddr.

The new added options are:
  - args       : Show function arguments.
  - retval     : Show function return value.
  - retval-hex : Show function return value in hexadecimal format.
  - retaddr    : Show function return address.

# ./perf ftrace -G vfs_write --graph-opts retval,retaddr
# tracer: function_graph
#
# CPU  DURATION                  FUNCTION CALLS
# |     |   |                     |   |   |   |
5)               |  mutex_unlock() { /* <-rb_simple_write+0xda/0x150 */
5)   0.188 us    |    local_clock(); /* <-lock_release+0x2ad/0x440 ret=0x3bf2a3cf90e */
5)               |    rt_mutex_slowunlock() { /* <-rb_simple_write+0xda/0x150 */
5)               |      _raw_spin_lock_irqsave() { /* <-rt_mutex_slowunlock+0x4f/0x200 */
5)   0.123 us    |        preempt_count_add(); /* <-_raw_spin_lock_irqsave+0x23/0x90 ret=0x0 */
5)   0.128 us    |        local_clock(); /* <-__lock_acquire.isra.0+0x17a/0x740 ret=0x3bf2a3cfc8b */
5)   0.086 us    |        do_raw_spin_trylock(); /* <-_raw_spin_lock_irqsave+0x4a/0x90 ret=0x1 */
5)   0.845 us    |      } /* _raw_spin_lock_irqsave ret=0x292 */
5)               |      _raw_spin_unlock_irqrestore() { /* <-rt_mutex_slowunlock+0x191/0x200 */
5)   0.097 us    |        local_clock(); /* <-lock_release+0x2ad/0x440 ret=0x3bf2a3cff1f */
5)   0.086 us    |        do_raw_spin_unlock(); /* <-_raw_spin_unlock_irqrestore+0x23/0x60 ret=0x1 */
5)   0.104 us    |        preempt_count_sub(); /* <-_raw_spin_unlock_irqrestore+0x35/0x60 ret=0x0 */
5)   0.726 us    |      } /* _raw_spin_unlock_irqrestore ret=0x80000000 */
5)   1.881 us    |    } /* rt_mutex_slowunlock ret=0x0 */
5)   2.931 us    |  } /* mutex_unlock ret=0x0 */

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250613114048.132336-1-changbin.du@huawei.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf build: Always disable stack protection for BPF skeleton objects

When the clang toolchain has stack protection enabled, the bpf
skeletons build fails with:

error: A call to built-in function '__stack_chk_fail' is not supported.

Since stack-protector makes no sense for the BPF bits, just unconditionally
disable it.

See also similar case at 878625e1c7a10dfbb1fdaaaae2c4d2a58fbce627

Signed-off-by: Federico Pellegrin <fede@evolware.org>
Link: https://lore.kernel.org/r/20250718041224.12389-1-fede@evolware.org
[ rearrange long lines ]
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sched timehist: decode process names of processes in zombie state

Previously when running perf trace timehist --state, when recording
processes in the zombie state the process name would not be decoded
properly and appears with just the PID:

1140057.412177 [0006]  Mutter Input Th[3139/3104]          0.956      0.019      0.041      S
1140057.412222 [0012]  :1248612[1248612]                   0.000      0.000      0.332      Z
1140057.412275 [0004]  <idle>                              0.052      0.052      0.953      I
1140057.412284 [0008]  <idle>                              0.070      0.070      0.932      I
1140057.412333 [0004]  KMS thread[3126/3104]               0.953      0.112      0.058      S

Now some extra processing has been added to decode the process name:

1140057.412177 [0006]  Mutter Input Th[3139/3104]          0.956      0.019      0.041      S
1140057.412222 [0012]  sleep[1248612]                      0.000      0.000      0.332      Z
1140057.412275 [0004]  <idle>                              0.052      0.052      0.953      I
1140057.412284 [0008]  <idle>                              0.070      0.070      0.932      I
1140057.412333 [0004]  KMS thread[3126/3104]               0.953      0.112      0.058      S

Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
Link: https://lore.kernel.org/r/20250716203914.45772-2-ashelat@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf flamegraph: Fix minor pylint/type hint issues

Switch to assuming python3. Fix minor pylint issues on line length,
repeated compares, not using f-strings and variable case. Add type
hints and check with mypy.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250716004635.31161-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf ftrace latency: Add -e option to measure time between two events

In addition to the function latency, it can measure events latencies.
Some kernel tracepoints are paired and it's menningful to measure how
long it takes between the two events.  The latency is tracked for the
same thread.

Currently it only uses BPF to do the work but it can be lifted later.
Instead of having separate a BPF program for each tracepoint, it only
uses generic 'event_begin' and 'event_end' programs to attach to any
(raw) tracepoints.

  $ sudo perf ftrace latency -a -b --hide-empty \
    -e i915_request_wait_begin,i915_request_wait_end -- sleep 1
  #   DURATION     |      COUNT | GRAPH                                |
     256 -  512 us |          4 | ######                               |
       2 -    4 ms |          2 | ###                                  |
       4 -    8 ms |         12 | ###################                  |
       8 -   16 ms |         10 | ################                     |

  # statistics  (in usec)
    total time:               194915
      avg time:                 6961
      max time:                12855
      min time:                  373
         count:                   28

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250714052143.342851-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf python: Set index error for invalid thread/cpu map items

Returning NULL for out of bound CPU or thread map items causes
internal errors. Fix by correctly setting the error to be an index
error.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-14-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf python: Improve leader copying from evlist

The struct pyrf_evlist embeds the evlist requiring the copying from
things like parsed events. The copying logic handles the leader being
the event itself, but if the leader group event is a different in the
list it will cause an evsel to point to the evsel in the list that was
copied from which is bad. Fix this by adding another pass over the
evlist rewriting leaders, simplified by the introductin of two evlist
helpers.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-13-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf python: Correct pyrf_evsel__read for tool PMUs

Tool PMUs assume that stat's process_counter_values is being used to
read the counters. Specifically they hold onto old values in
evsel->prev_raw_counts and give the cumulative count based off of this
value. Update pyrf_evsel__read to allocate counts and prev_raw_counts,
use evsel__read_counter rather than perf_evsel__read so tool PMUs are
read from not just perf_event_open events, make the returned
pyrf_counts_values contain the delta value rather than the cumulative
value.

Fixes: 739621f65702 ("perf python: Add evsel read method")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf python: Fix thread check in pyrf_evsel__read

The CPU index is incorrectly checked rather than the thread index.

Fixes: 739621f65702 ("perf python: Add evsel read method")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf python: In str(evsel) use the evsel__pmu_name helper

The evsel__pmu_name helper will internally use evsel__find_pmu that
handles legacy events, extended types, etc. in determining a PMU and
will provide a better value than just trying to access the PMU's name
directly as the PMU may not have been computed.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf jevents: If the long_desc and desc are identical then drop the long_desc

If the short and long descriptions are the same then save space and
don't store both of them. When storing the desc in the perf_pmu_alias,
don't duplicate the desc into the long_desc.

By avoiding storing the duplicate the size of the events string in the
binary on x86 is reduced by 29,840 bytes.

Fix tests that expect a duplicated description.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf expr: Accumulate rather than replace in the context counts

Metrics will fill in the context to have mappings from an event to a
count. When counts are added they replace existing mappings which
generally shouldn't exist with aggregation. Switch to accumulating to
better support cases where perf stat's aggregation isn't used and we
may see a counter more than once.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf stat: Move metric list from config to evlist

The rblist of metric_event that then have a list of associated
metric_expr is moved out of the stat_config and into the evlist. This
is done as part of refactoring things for python, having the state
split in two places complicates that implementation. The evlist is
doing the harder work of enabling and disabling events, the metrics
are needed to compute a value and it doesn't seem unreasonable to hang
them from the evlist.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf metricgroup: Factor out for-each function and move out printing

Factor metricgroup__for_each_metric into its own function handling
regular and sys metrics. Make the metric adding and printing code use
it, move the printing code into print-events files.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf pmu: Tolerate failure to read the type for wellknown PMUs

If sysfs isn't mounted then we may fail to read a PMU's type. In this
situation resort to lookup of wellknown types. Only applies to
software, tracepoint and breakpoint PMUs.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf spark: Fix includes and add SPDX

scnprintf is declared in linux/kernel.h, directly depend upon it.
Add missing SPDX comments.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf parse-events: Minor tidy up of event_type helper

Add missing breakpoint and raw types. Avoid a switch, just use a
lookup array. Switch the type to unsigned to avoid checking negative
values.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf hwmon_pmu: Avoid shortening hwmon PMU name

Long names like ucsi_source_psy_USBC000:001 when prefixed with hwmon_
exceed the buffer size and the last digit is lost. This causes
confusion with similar names like ucsi_source_psy_USBC000:002. Extend
the buffer size to avoid this.

Fixes: 53cc0b351ec9 ("perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250710235126.1086011-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf tests bp_account: Fix leaked file descriptor

Since the commit e9846f5ead26 ("perf test: In forked mode add check that
fds aren't leaked"), the test "Breakpoint accounting" reports the error:

  # perf test -vvv "Breakpoint accounting"
  20: Breakpoint accounting:
  --- start ---
  test child forked, pid 373
  failed opening event 0
  failed opening event 0
  watchpoints count 4, breakpoints count 6, has_ioctl 1, share 0
  wp 0 created
  wp 1 created
  wp 2 created
  wp 3 created
  wp 0 modified to bp
  wp max created
  ---- end(0) ----
  Leak of file descriptor 7 that opened: 'anon_inode:[perf_event]'

A watchpoint's file descriptor was not properly released. This patch
fixes the leak.

Fixes: 032db28e5fa3 ("perf tests: Add breakpoint accounting/modify test")
Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250711-perf_fix_breakpoint_accounting-v1-1-b314393023f9@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf list: Remove trailing A in PAI crypto event 4210

According to the z16 and z17 Principle of Operation documents
SA22-7832-13 and SA22-7832-14 the event 4210 is named
   PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_AES_256
without a trailing 'A'. Adjust the json definition files
for this event and remove the trailing 'A' character.
   PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_AES_256A

Also remove a black ' ' between the dash '-' and the number:
   xxx-AES- 192 ----> xxx-AES-192

Suggested-by: Ingo Franzki <ifranzki@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Link: https://lore.kernel.org/r/20250709072452.1595257-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update TigerLake events

Update events from v1.17 to v1.18.

Bring in the event updates v1.18:
https://github.com/intel/perfmon/commit/943fea37d0d54232605f12abf72a812ac314cd1d

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-16-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update SkylakeX events

Update events from v1.36 to v1.37.

Bring in the event updates v1.37:
https://github.com/intel/perfmon/commit/6ee8e4cadda8b6954bd84236e20fab95e345578f

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-15-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update SierraForest events

Update events from v1.09 to v1.11.

Bring in the event updates v1.11:
https://github.com/intel/perfmon/commit/6b824df1dba3948146281c8ba2a8c3e7bf7f7c51
https://github.com/intel/perfmon/commit/4b0346fbee2b04dd34526522250116aee525c922

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-14-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update SapphireRapids events

Update events from v1.25 to v1.28.

Bring in the event updates v1.28:
https://github.com/intel/perfmon/commit/990bfdff270adf08d408534d6d66ba47ec6adb34
https://github.com/intel/perfmon/commit/b7b4d7f18cf9a893438777a571abc7ecc087368b

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-13-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Add PantherLake events

Bring in the events at v1.00:
https://github.com/intel/perfmon/commit/d90a6737d0e4e6fbea4a5951e829615fd8317c24

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update MeteorLake events

Update events from v1.13 to v1.14.

Bring in the event updates v1.14:
https://github.com/intel/perfmon/commit/6c53969b8d1a83afe6ae90149c8dd4ee416027ef

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update LunarLake events

Update events from v1.11 to v1.14.

Bring in the event updates v1.14:
https://github.com/intel/perfmon/commit/95634fec10542c0c466eb2c6d9a81e0c24fb1123
https://github.com/intel/perfmon/commit/84a49938387ac592af0a622273e4e8e4997e987d

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update IcelakeX events

Update events from v1.27 to v1.28.

Bring in the event updates v1.28:
https://github.com/intel/perfmon/commit/c52728a46cf37ba271c09b1eb7093cfc82dfbf29

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update GraniteRapids events

Update events from v1.08 to v1.10.

Bring in the event updates v1.10
https://github.com/intel/perfmon/commit/96259a932e2ce5f70ed7d347ca92fdeb78f83aa5
https://github.com/intel/perfmon/commit/19e315c8d2e0b44e170a6e60de44c9359062a6aa

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update GrandRidge events

Update events from v1.07 to v1.09.

Bring in the event updates v1.09:
https://github.com/intel/perfmon/commit/8c74d09c8544421256a79f4f21e548ad756f5b7f
https://github.com/intel/perfmon/commit/18c7d2a75e45eacf5553f900ae2097a1290f5bed

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update EmeraldRapids events

Update events from v1.11 to v1.14.

Bring in the event updates v1.14:
https://github.com/intel/perfmon/commit/6f6e4c8c906992b450cb2014d0501a9ec1cda0d0
https://github.com/intel/perfmon/commit/e363f82276c129aec60402a1d64efbbd41af844e

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update CascadelakeX events

Update events from v1.23 to v1.25.

Bring in the event updates v1.25:
https://github.com/intel/perfmon/commit/86f146e15626b0fd3b032cab4538cafaaf2d0635
https://github.com/intel/perfmon/commit/fef03ffc333ae44d1e9d695b4e67e5bbb4429729

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update Arrowlake events

Update events from v1.08 to v1.09.

Bring in the event updates v1.09:
https://github.com/intel/perfmon/commit/cf3be6daf0a751ad270b67890dfdb2261dfc75da

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update AlderlakeN events

Update events from v1.29 to v1.31.

Bring in the event updates v1.31:
https://github.com/intel/perfmon/commit/5a1269c8af70e32a548e74e1fda736189c398ddc
https://github.com/intel/perfmon/commit/76c6d2c348c067e9ae1b616b35ee982da6d873b4

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf vendor events: Update Alderlake events

Update events from v1.29 to v1.31.

Bring in the event updates v1.31:
https://github.com/intel/perfmon/commit/5a1269c8af70e32a548e74e1fda736189c398ddc
https://github.com/intel/perfmon/commit/76c6d2c348c067e9ae1b616b35ee982da6d873b4

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250630163101.1920170-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf test: Add more test cases to sched test

  $ sudo ./perf test -vv 92
   92: perf sched tests:
  --- start ---
  test child forked, pid 1360101
  Sched record
  pid 1360105's current affinity list: 0-3
  pid 1360105's new affinity list: 0
  pid 1360107's current affinity list: 0-3
  pid 1360107's new affinity list: 0
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 4.330 MB /tmp/__perf_test_sched.perf.data.b3319 (12246 samples) ]
  Sched latency
  Sched script
  Sched map
  Sched timehist
  Samples of sched_switch event do not have callchains.
  ---- end(0) ----
   92: perf sched tests                                                : Ok

Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250703014942.1369397-9-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sched: Fix memory leaks in 'perf sched latency'

The work_atoms should be freed after use. Add free_work_atoms() to
make sure to release all. It should use list_splice_init() when merging
atoms to prevent accessing invalid pointers.

Fixes: b1ffe8f3e0c96f552 ("perf sched: Finish latency => atom rename and misc cleanups")
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250703014942.1369397-8-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sched: Use RC_CHK_EQUAL() to compare pointers

So that it can check two pointers to the same object properly when
REFCNT_CHECKING is on.

Fixes: 78c32f4cb12f9430 ("libperf rc_check: Add RC_CHK_EQUAL")
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250703014942.1369397-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sched: Fix memory leaks for evsel->priv in timehist

It uses evsel->priv to save per-cpu timing information. It should be
freed when the evsel is released.

Add the priv destructor for evsel same as thread to handle that.

Fixes: 49394a2a24c78ce0 ("perf sched timehist: Introduce timehist command")
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250703014942.1369397-6-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sched: Fix thread leaks in 'perf sched timehist'

Add missing thread__put() after machine__findnew_thread() or
timehist_get_thread(). Also idle threads' last_thread should be
refcounted properly.

Fixes: 699b5b920db04a6f ("perf sched timehist: Save callchain when entering idle")
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250703014942.1369397-5-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sched: Fix memory leaks in 'perf sched map'

It maintains per-cpu pointers for the current thread but it doesn't
release the refcounts.

Fixes: 5e895278697c014e ("perf sched: Move curr_thread initialization to perf_sched__map()")
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250703014942.1369397-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sched: Free thread->priv using priv_destructor

In many perf sched subcommand saves priv data structure in the thread
but it forgot to free them. As it's an opaque type with 'void *', it
needs to register that knows how to free the data. In this case, just
regular 'free()' is fine.

Fixes: 04cb4fc4d40a5bf1 ("perf thread: Allow tools to register a thread->priv destructor")
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250703014942.1369397-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf sched: Make sure it frees the usage string

The parse_options_subcommand() allocates the usage string based on the
given subcommands. So it should reach the end of the function to free
the string to prevent memory leaks.

Fixes: 1a5efc9e13f357ab ("libsubcmd: Don't free the usage string")
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250703014942.1369397-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf tests make: Add NO_LIBDW=1 to minimal and add standalone test

Missing testing coverage of NO_LIBDW=1 and add NO_LIBDW=1 to the
minimal test configuration.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250703053622.3141424-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf header: Fix pipe mode header dumping

The pipe mode header dumping was accidentally removed when tracing of
header feature events in pipe mode was added.

Minor spelling tweak to header test failure message.

Fixes: 61051f9a8452 ("perf header: In pipe mode dump features without --header/-I")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250703042000.2740640-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf test: In forked mode add check that fds aren't leaked

When a test is forked no file descriptors should be open, however,
parent ones may have been inherited - in particular those of the pipes
of other forked child test processes. Add a loop to clean-up/close
those file descriptors prior to running the test. At the end of the
test assert that no additional file descriptors are present as this
would indicate a file descriptor leak.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624190326.2038704-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf dso: With ref count checking, avoid dso_data holding dso live

With the dso_data embedded in a dso there is a reference counted
pointer to the dso rather than using container_of with reference count
checking. This data can hold the dso live meaning that no dso__put
ever deletes it. Add a check for this case and close the dso_data when
it happens. There isn't an infinite loop as the dso_data clears the
file descriptor prior to putting on the dso.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624190326.2038704-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf hwmon_pmu: Hold path rather than fd

Hold the path to the hwmon_pmu rather than the file descriptor. The
file descriptor is somewhat problematic in that it reflects the
directory state when opened, something that may vary in testing. Using
a path simplifies testing and to some extent cleanup as the hwmon_pmu
is owned by the pmus list and intentionally global and leaked when
perf terminates, the file descriptor being left open looks like a
leak.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624190326.2038704-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf test code-reading: Avoid a leak of cpus and threads

The perf_evlist__set_maps does the necessary gets on the arguments
passed, so the reference count bumping isn't necessary and creates a
memory leak.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624190326.2038704-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>

perf dso: Add missed dso__put to dso__load_kcore

The kcore loading creates a set of list nodes that have reference
counted references to maps of the kcore. The list node freeing in the
success path wasn't releasing the maps, add the missing puts. It is
unclear why this leak was being missed by leak sanitizer.

Fixes: 83720209961f ("perf map: Move map list node into symbol")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250624190326.2038704-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>