Age | Commit message (Collapse) | Author |
|
Use the demangle-rust-v0 APIs to see if symbol is Rust mangled and
demangle if so.
The API requires a pre-allocated output buffer, some estimation and
retrying are added for this.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Bill Wendling <morbo@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Link: https://lore.kernel.org/r/20250430004128.474388-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Imported at commit 80e40f57d99f ("add comment about finding latest
version of code") from:
https://github.com/rust-lang/rustc-demangle/blob/main/crates/native-c/src/demangle.c
https://github.com/rust-lang/rustc-demangle/blob/main/crates/native-c/include/demangle.h
There is discussion of this issue motivating the import in:
https://github.com/rust-lang/rust/issues/60705
https://lore.kernel.org/lkml/20250129193037.573431-1-irogers@google.com/
The SPDX lines reflect the dual license Apache-2 or MIT in:
https://github.com/rust-lang/rustc-demangle/blob/main/README.md
Following Migual Ojeda's suggestion comments were added on copyright and
keeping the code in sync with upstream.
The files are renamed as perf supports multiple demanglers and so
demangle as a name would be overloaded.
The work here was done by Ariel Ben-Yehuda <ariel.byd@gmail.com> and I
am merely importing it as discussed in the rust-lang issue.
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Bill Wendling <morbo@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Link: https://lore.kernel.org/r/20250430004128.474388-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There is a spelling mistake ina pr_debug message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250507082421.188848-1-colin.i.king@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When it finds a matching PMU for a legacy event, it should look for
core PMUs. The raw events also refers to core events so it should be
handled similarly.
On x86, PERF_TYPE_RAW should match with the existing cpu PMU. But on
ARM, there's no PMU with the matching type so it'll pick the first core
PMU for it.
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250507215939.54399-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is to slow down lock acquistion (on contention locks) deliberately.
A possible use case is to estimate impact on application performance by
optimization of kernel locking behavior. By delaying the lock it can
simulate the worse condition as a control group, and then compare with
the current behavior as a optimized condition.
The syntax is 'time@function' and the time can have unit suffix like
"us" and "ms". For example, I ran a simple test like below.
$ sudo perf lock con -abl -L tasklist_lock -- \
sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
contended total wait max wait avg wait address symbol
92 1.18 ms 199.54 us 12.79 us ffffffff8a806080 tasklist_lock (rwlock)
The contention count was 92 and the average wait time was around 10 us.
But if I add 100 usec of delay to the tasklist_lock,
$ sudo perf lock con -abl -L tasklist_lock -J 100us@tasklist_lock -- \
sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
contended total wait max wait avg wait address symbol
190 15.67 ms 230.10 us 82.46 us ffffffff8a806080 tasklist_lock (rwlock)
The contention count increased and the average wait time was up closed
to 100 usec. If I increase the delay even more,
$ sudo perf lock con -abl -L tasklist_lock -J 1ms@tasklist_lock -- \
sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
contended total wait max wait avg wait address symbol
1002 2.80 s 3.01 ms 2.80 ms ffffffff8a806080 tasklist_lock (rwlock)
Now every sleep process had contention and the wait time was more than 1
msec. This is on my 4 CPU laptop so I guess one CPU has the lock while
other 3 are waiting for it mostly.
For simplicity, it only supports global locks for now.
Committer testing:
root@number:~# grep -m1 'model name' /proc/cpuinfo
model name : AMD Ryzen 9 9950X3D 16-Core Processor
root@number:~# perf lock con -abl -L tasklist_lock -- sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
contended total wait max wait avg wait address symbol
142 453.85 us 25.39 us 3.20 us ffffffffae808080 tasklist_lock (rwlock)
root@number:~# perf lock con -abl -L tasklist_lock -J 100us@tasklist_lock -- sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
contended total wait max wait avg wait address symbol
1040 2.39 s 3.11 ms 2.30 ms ffffffffae808080 tasklist_lock (rwlock)
root@number:~# perf lock con -abl -L tasklist_lock -J 1ms@tasklist_lock -- sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
contended total wait max wait avg wait address symbol
1025 24.72 s 31.01 ms 24.12 ms ffffffffae808080 tasklist_lock (rwlock)
root@number:~#
Suggested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250509171950.183591-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There was a copy-paste mistake in the installation commands.
Also, we need to install stderr-whitelist.txt file, which contains
allowed messages that are printed on stderr and should not cause test
fail.
Fixes: 097fe67df1aa9cc7 ("perf testsuite: Install perf-report tests in the 'make install-tests -C tools/perf' target")
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250113182605.130719-6-vmolnaro@redhat.com
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I've found some leaks from 'perf trace -a'.
It seems there are more leaks but this is what I can find for now.
Fixes: 082ab9a18e532864 ("perf trace: Filter out 'sshd' in the tracer ancestry in syswide tracing")
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250403054213.7021-1-namhyung@kernel.org
[ split from a larget patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I've found some leaks from 'perf trace -a'.
It seems there are more leaks but this is what I can find for now.
Fixes: 70351029b55677eb ("perf thread: Add support for reading the e_machine type for a thread")
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250403054213.7021-1-namhyung@kernel.org
[ split from a larget patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add debug verbose output to show how evsels were reordered by
parse_events__sort_events_and_fix_groups(). For example:
```
$ perf record -v -e '{instructions,cycles}' true
Using CPUID GenuineIntel-6-B7-1
WARNING: events were regrouped to match PMUs
evlist after sorting/fixing: '{cpu_atom/instructions/,cpu_atom/cycles/},{cpu_core/instructions/,cpu_core/cycles/}'
```
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Make groups visible in output:
Before:
{cycles,instructions} ->
cpu_atom/cycles/,cpu_atom/instructions/,cpu_core/cycles/,cpu_core/instructions/
After:
{cycles,instructions} ->
{cpu_atom/cycles/,cpu_atom/instructions/},{cpu_core/cycles/,cpu_core/instructions/}
Committer testing:
Before:
root@number:~# perf record -e '{cycles,instructions,cache-misses}' /tmp/bla
Failed to collect 'cycles,instructions,cache-misses' for the '/tmp/bla' workload: Permission denied
root@number:~#
After:
root@number:~# perf record -e '{cycles,instructions,cache-misses}' /tmp/bla
Failed to collect '{cycles,instructions,cache-misses}' for the '/tmp/bla' workload: Permission denied
root@number:~#
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Switch output to using a strbuf so the storage can be resized.
Add a maximum size argument to avoid too much output that may happen for
uncore events.
Rename as scnprintf is no longer used.
Committer testing:
With the patch applied:
root@number:~# perf probe -x ~/bin/perf evlist__format_evsels
Added new event:
probe_perf:evlist_format_evsels (on evlist__format_evsels in /home/acme/bin/perf)
You can now use it in all perf tools, such as:
perf record -e probe_perf:evlist_format_evsels -aR sleep 1
root@number:~# perf probe -l
probe_perf:evlist_format_evsels (on evlist__format_evsels@util/evlist.c in /home/acme/bin/perf)
root@number:~# perf trace -e probe_perf:*/max-stack=10/ perf record -e cycles,instructions,cache-misses /tmp/bla
Failed to collect 'cycles,instructions,cache-misses' for the '/tmp/bla' workload: Permission denied
0.000 perf/3893011 probe_perf:evlist_format_evsels(__probe_ip: 6183397)
evlist__format_evsels (/home/acme/bin/perf)
__cmd_record (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
run_argv (/home/acme/bin/perf)
main (/home/acme/bin/perf)
__libc_start_call_main (/usr/lib64/libc.so.6)
__libc_start_main@@GLIBC_2.34 (/usr/lib64/libc.so.6)
_start (/home/acme/bin/perf)
root@number:~#
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
print_mixed_hw_group_error will print a warning when a group of events
uses different PMUs.
This isn't possible to happen as parse_events__sort_events_and_fix_groups()
will break groups when this happens, adding the warning at the start
of perf of:
WARNING: events were regrouped to match PMUs
As the previous mixed group warning can never happen, remove the
associated code.
Committer testing:
Before/after:
acme@five:~$ perf stat -e '{cpu_core/cycles/,cpu_atom/cycles/}' sleep 1
WARNING: events were regrouped to match PMUs
Performance counter stats for 'sleep 1':
424,895 cpu_atom/cycles/u
<not counted> cpu_core/cycles/u (0.00%)
1.011862314 seconds time elapsed
0.000000000 seconds user
0.003166000 seconds sys
acme@five:~$
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Prior to this patch evlist__has_hybrid would return false if the
processor wasn't hybrid or the evlist didn't contain any core
events. If the only PMU used by events was cpu_core then it would
true even though there are no cpu_atom events. For example:
```
$ perf stat --cputype=cpu_core -e '{cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles}' true
Performance counter stats for 'true':
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
0.001981900 seconds time elapsed
0.002311000 seconds user
0.000000000 seconds sys
```
This patch changes evlist__has_hybrid to return true only if the
evlist contains events from >1 core PMU. This means the NMI watchdog
warning is shown for the case above.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add missing thread__put() of the found parent thread in
thread__e_machine().
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250401202715.3493567-1-irogers@google.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The files.max is the maximum valid fd in the files array and so
freeing the values needs to be inclusive of the max value.
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250401202715.3493567-1-irogers@google.com
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up fixes from the latest perf-tools pull request from Namhyung
and get perf-tools-next in line with thinngs in other areas it uses,
like tools/lib/bpf, etc.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Since we added --off-cpu-thresh, add tests for when a sample's off-cpu
time is above the threshold, and when it's below the threshold.
Note that the basic test performed in test_offcpu_basic() collects a
direct sample now, since sleep 1 has duration of 1000ms, higher than the
default value of --off-cpu-thresh of 500ms, resulting in a direct
sample.
An example:
$ sudo perf test offcpu
124: perf record offcpu profiling tests : Ok
$
Committer testing:
root@number:~# perf test offcpu
126: perf record offcpu profiling tests : Ok
root@number:~# perf test -v offcpu
126: perf record offcpu profiling tests : Ok
root@number:~# perf test -vv offcpu
126: perf record offcpu profiling tests:
--- start ---
test child forked, pid 1410791
Checking off-cpu privilege
Basic off-cpu test
Basic off-cpu test [Success]
Child task off-cpu test
Child task off-cpu test [Success]
Threshold test (above threshold)
Threshold test (above threshold) [Success]
Threshold test (below threshold)
Threshold test (below threshold) [Success]
---- end(0) ----
126: perf record offcpu profiling tests : Ok
root@number:~#
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250501022809.449767-11-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Specify the threshold for dumping offcpu samples with --off-cpu-thresh,
the unit is milliseconds. Default value is 500ms.
Example:
perf record --off-cpu --off-cpu-thresh 824
The example above collects direct off-cpu samples where the off-cpu time
is longer than 824ms.
Committer testing:
After commenting out the end off-cpu dump to have just the ones that are
added right after the task is scheduled back, and using a threshould of
1000ms, we see some periods (the 5th column, just before "offcpu-time"
in the 'perf script' output) that are over 1000.000.000 nanoseconds:
root@number:~# perf record --off-cpu --off-cpu-thresh 10000
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 3.902 MB perf.data (34335 samples) ]
root@number:~# perf script
<SNIP>
Isolated Web Co 59932 [028] 63839.594437: 1000049427 offcpu-time:
7fe63c7976c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
7fe63c78c04c __futex_abstimed_wait_common+0x7c (/usr/lib64/libc.so.6)
7fe63c78e928 pthread_cond_timedwait@@GLIBC_2.3.2+0x178 (/usr/lib64/libc.so.6)
5599974a9fe7 mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&)+0xe7 (/usr/lib64/fir>
100000000 [unknown] ([unknown])
swapper 0 [025] 63839.594459: 195724 cycles:P: ffffffffac328270 read_tsc+0x0 ([kernel.kallsyms])
Isolated Web Co 59932 [010] 63839.594466: 1000055278 offcpu-time:
7fe63c7976c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
7fe63c78ba24 __syscall_cancel+0x14 (/usr/lib64/libc.so.6)
7fe63c804c4e __poll+0x1e (/usr/lib64/libc.so.6)
7fe633b0d1b8 PollWrapper(_GPollFD*, unsigned int, int) [clone .lto_priv.0]+0xf8 (/usr/lib64/firefox/libxul.so)
10000002c [unknown] ([unknown])
swapper 0 [027] 63839.594475: 134433 cycles:P: ffffffffad4c45d9 irqentry_enter+0x19 ([kernel.kallsyms])
swapper 0 [028] 63839.594499: 215838 cycles:P: ffffffffac39199a switch_mm_irqs_off+0x10a ([kernel.kallsyms])
MediaPD~oder #1 1407676 [027] 63839.594514: 134433 cycles:P: 7f982ef5e69f dct_IV(int*, int, int*)+0x24f (/usr/lib64/libfdk-aac.so.2.0.0)
swapper 0 [024] 63839.594524: 267411 cycles:P: ffffffffad4c6ee6 poll_idle+0x56 ([kernel.kallsyms])
MediaSu~sor #75 1093827 [026] 63839.594555: 332652 cycles:P: 55be753ad030 moz_xmalloc+0x200 (/usr/lib64/firefox/firefox)
swapper 0 [027] 63839.594616: 160548 cycles:P: ffffffffad144840 menu_select+0x570 ([kernel.kallsyms])
Isolated Web Co 14019 [027] 63839.595120: 1000050178 offcpu-time:
7fc9537cc6c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
7fc9537c104c __futex_abstimed_wait_common+0x7c (/usr/lib64/libc.so.6)
7fc9537c3928 pthread_cond_timedwait@@GLIBC_2.3.2+0x178 (/usr/lib64/libc.so.6)
7fc95372a3c8 pt_TimedWait+0xb8 (/usr/lib64/libnspr4.so)
7fc95372a8d8 PR_WaitCondVar+0x68 (/usr/lib64/libnspr4.so)
7fc94afb1f7c WatchdogMain(void*)+0xac (/usr/lib64/firefox/libxul.so)
7fc947498660 [unknown] ([unknown])
7fc9535fce88 [unknown] ([unknown])
7fc94b620e60 WatchdogManager::~WatchdogManager()+0x0 (/usr/lib64/firefox/libxul.so)
fff8548387f8b48 [unknown] ([unknown])
swapper 0 [003] 63839.595712: 212948 cycles:P: ffffffffacd5b865 acpi_os_read_port+0x55 ([kernel.kallsyms])
<SNIP>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-2-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-10-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
BPF's stack trace map
Dump the remaining PERF_SAMPLE_ data, as if it is dumping a direct
sample.
Put the stack trace, tid, off-cpu time and cgroup id into the raw_data
section, just like a direct off-cpu sample coming from BPF's
bpf_perf_event_output().
This ensures that evsel__parse_sample() correctly parses both direct
samples and accumulated samples.
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-10-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-9-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
No PERF_SAMPLE_CALLCHAIN in sample_type, but 'perf script' needs to
display a callchain, have to specify manually.
Also, prefer displaying a callchain:
gvfs-afc-volume 2267 [001] 3829232.955656: 1001115340 offcpu-time:
77f05292603f __pselect+0xbf (/usr/lib/x86_64-linux-gnu/libc.so.6)
77f052a1801c [unknown] (/usr/lib/x86_64-linux-gnu/libusbmuxd-2.0.so.6.0.0)
77f052a18d45 [unknown] (/usr/lib/x86_64-linux-gnu/libusbmuxd-2.0.so.6.0.0)
77f05289ca94 start_thread+0x384 (/usr/lib/x86_64-linux-gnu/libc.so.6)
77f052929c3c clone3+0x2c (/usr/lib/x86_64-linux-gnu/libc.so.6)
to a raw binary BPF output:
BPF output: 0000: dd 08 00 00 db 08 00 00 <DD>...<DB>...
0008: cc ce ab 3b 00 00 00 00 <CC>Ϋ;....
0010: 06 00 00 00 00 00 00 00 ........
0018: 00 fe ff ff ff ff ff ff .<FE><FF><FF><FF><FF><FF><FF>
0020: 3f 60 92 52 f0 77 00 00 ?`.R<F0>w..
0028: 1c 80 a1 52 f0 77 00 00 ..<A1>R<F0>w..
0030: 45 8d a1 52 f0 77 00 00 E.<A1>R<F0>w..
0038: 94 ca 89 52 f0 77 00 00 .<CA>.R<F0>w..
0040: 3c 9c 92 52 f0 77 00 00 <..R<F0>w..
0048: 00 00 00 00 00 00 00 00 ........
0050: 00 00 00 00 ....
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-9-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-8-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There is a check in evsel.c that does this:
if (evsel__is_offcpu_event(evsel))
evsel->core.attr.sample_type &= OFFCPU_SAMPLE_TYPES;
This along with:
#define OFFCPU_SAMPLE_TYPES (PERF_SAMPLE_IDENTIFIER | PERF_SAMPLE_IP | \
PERF_SAMPLE_TID | PERF_SAMPLE_TIME | \
PERF_SAMPLE_ID | PERF_SAMPLE_CPU | \
PERF_SAMPLE_PERIOD | PERF_SAMPLE_CALLCHAIN | \
PERF_SAMPLE_CGROUP)
will tell perf_event to collect callchain.
We don't need the callchain from perf_event when collecting off-cpu
samples, because it's prev's callchain, not next's callchain.
(perf_event) (task_storage) (needed)
prev next
| |
---sched_switch---->
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-8-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-7-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Use the data in bpf-output samples, to assemble off-cpu samples.
In evsel__is_offcpu_event(), check if sample_type is PERF_SAMPLE_RAW to
support off-cpu sample data created by an older version of perf.
Testing compatibility on off-cpu samples collected by perf before this patch series:
See below, the sample_type still uses PERF_SAMPLE_CALLCHAIN
$ perf script --header -i ./perf.data.ptn | grep "event : name = offcpu-time"
# event : name = offcpu-time, , id = { 237917, 237918, 237919, 237920 }, type = 1 (software), size = 136, config = 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CALLCHAIN|CPU|PERIOD|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, sample_id_all = 1
The output is correct.
$ perf script -i ./perf.data.ptn | grep offcpu-time
gmain 2173 [000] 18446744069.414584: 100102015 offcpu-time:
NetworkManager 901 [000] 18446744069.414584: 5603579 offcpu-time:
Web Content 1183550 [000] 18446744069.414584: 46278 offcpu-time:
gnome-control-c 2200559 [000] 18446744069.414584: 11998247014 offcpu-time:
<SNIP>
$
And after this patch series:
$ perf script --header -i ./perf.data.off-cpu-v9 | grep "event : name = offcpu-time"
# event : name = offcpu-time, , id = { 237959, 237960, 237961, 237962 }, type = 1 (software), size = 136, config = 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, sample_id_all = 1
$ ./perf script -i ./perf.data.off-cpu-v9 | grep offcpu-time
gnome-shell 1875 [001] 4789616.361225: 100097057 offcpu-time:
gnome-shell 1875 [001] 4789616.461419: 100107463 offcpu-time:
firefox 2206821 [002] 4789616.475690: 255257245 offcpu-time:
$
Committer testing:
The command to record those samples:
root@number:~# perf record --off-cpu -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.092 MB perf.data (1552 samples) ]
root@number:~#
Then, before this patch series, the sample_type for the "offcpu-time" event is:
root@number:~# perf evlist -v | grep offcpu-time
offcpu-time: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, freq: 1, sample_id_all: 1
root@number:~#
And after it, after recording it again:
root@number:~# perf record --off-cpu -a sleep 1 ; perf evlist -v | grep offcpu-time
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.151 MB perf.data (2843 samples) ]
offcpu-time: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, sample_id_all: 1
root@number:~#
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-7-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-6-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Collect tid, period, callchain, and cgroup id and dump them when off-cpu
time threshold is reached.
We don't collect the off-cpu time twice (the delta), it's either in
direct samples, or accumulated samples that are dumped at the end of
perf.data.
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-6-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-5-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Set the perf_event map in BPF for dumping off-cpu samples, and set the
offcpu_thresh to specify the threshold.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-5-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-4-howardchu95@gmail.com
[ Added some missing iteration variables to off_cpu_config() and fixed up
a manually edited patch hunk line boundary line ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Parse the off-cpu event using parse_event(), as bpf-output.
Call evlist__enable_evsel() on off-cpu event. This fixes the inability
to collect direct off-cpu samples on a workload, as reported by Arnaldo
Carvalho de Melo <acme@redhat.com>.
The reason being, workload sets enable_on_exec instead of calling
evlist__enable(), but off-cpu event does not attach to an executable and
execve won't be called, so the fds from perf_event_open() are not
enabled.
no-inherit should be set to 1, here's the reason:
We update the BPF perf_event map for direct off-cpu sample dumping (in
following patches), it executes as follows:
bpf_map_update_value()
bpf_fd_array_map_update_elem()
perf_event_fd_array_get_ptr()
perf_event_read_local()
In perf_event_read_local(), there is:
int perf_event_read_local(struct perf_event *event, u64 *value,
u64 *enabled, u64 *running)
{
...
/*
* It must not be an event with inherit set, we cannot read
* all child counters from atomic context.
*/
if (event->attr.inherit) {
ret = -EOPNOTSUPP;
goto out;
}
Which means no-inherit has to be true for updating the BPF perf_event
map.
Moreover, for bpf-output events, we primarily want a system-wide event
instead of a per-task event.
The reason is that in BPF's bpf_perf_event_output(), BPF uses the CPU
index to retrieve the perf_event file descriptor it outputs to.
Making a bpf-output event system-wide naturally satisfies this
requirement by mapping CPU appropriately.
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-4-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-3-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Expose evsel__is_offcpu_event() so it can be used in off_cpu_config(),
evsel__parse_sample() and 'perf script'.
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-3-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-2-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Running the "perf script task-analyzer tests" with address sanitizer
showed a double free:
```
FAIL: "test_csv_extended_times" Error message: "Failed to find required string:'Out-Out;'."
=================================================================
==19190==ERROR: AddressSanitizer: attempting double-free on 0x50b000017b10 in thread T0:
#0 0x55da9601c78a in free (perf+0x26078a) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a)
#1 0x55da96640c63 in filename__read_build_id tools/perf/util/symbol-minimal.c:221:2
0x50b000017b10 is located 0 bytes inside of 112-byte region [0x50b000017b10,0x50b000017b80)
freed by thread T0 here:
#0 0x55da9601ce40 in realloc (perf+0x260e40) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a)
#1 0x55da96640ad6 in filename__read_build_id tools/perf/util/symbol-minimal.c:204:10
previously allocated by thread T0 here:
#0 0x55da9601ca23 in malloc (perf+0x260a23) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a)
#1 0x55da966407e7 in filename__read_build_id tools/perf/util/symbol-minimal.c:181:9
SUMMARY: AddressSanitizer: double-free (perf+0x26078a) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a) in free
==19190==ABORTING
FAIL: "invocation of perf script report task-analyzer --csv-summary csvsummary --summary-extended command failed" Error message: ""
FAIL: "test_csvsummary_extended" Error message: "Failed to find required string:'Out-Out;'."
---- end(-1) ----
132: perf script task-analyzer tests : FAILED!
```
The buf_size if always set to phdr->p_filesz, but that may be 0
causing a free and realloc to return NULL. This is treated in
filename__read_build_id like a failure and the buffer is freed again.
To avoid this problem only grow buf, meaning the buf_size will never
be 0. This also reduces the number of memory (re)allocations.
Fixes: b691f64360ecec49 ("perf symbols: Implement poor man's ELF parser")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250501070003.22251-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is a breakdown of perf_mem_data_src.mem_dtlb values. It assumes
PMU drivers would set PERF_MEM_TLB_HIT bit with an appropriate level.
And having PERF_MEM_TLB_MISS means that it failed to find one in any
levels of TLB. For now, it doesn't use PERF_MEM_TLB_{WK,OS} bits.
Also it seems Intel machines don't distinguish L1 or L2 precisely. So I
added ANY_HIT (printed as "L?-Hit") to handle the case.
$ perf mem report -F overhead,dtlb,dso --stdio
...
# --- D-TLB ----
# Overhead L?-Hit Miss Shared Object
# ........ .............. .................
#
67.03% 99.5% 0.5% [unknown]
31.23% 99.2% 0.8% [kernel.kallsyms]
1.08% 97.8% 2.2% [i915]
0.36% 100.0% 0.0% [JIT] tid 6853
0.12% 100.0% 0.0% [drm]
0.05% 100.0% 0.0% [drm_kms_helper]
0.05% 100.0% 0.0% [ext4]
0.02% 100.0% 0.0% [aesni_intel]
0.02% 100.0% 0.0% [crc32c_intel]
0.02% 100.0% 0.0% [dm_crypt]
...
Committer testing:
# perf report --header | grep cpudesc
# cpudesc : AMD Ryzen 9 9950X3D 16-Core Processor
# perf mem report -F overhead,dtlb,dso --stdio | head -20
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 2K of event 'cycles:P'
# Total weight : 2637
# Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc
#
# ---------- D-TLB -----------
# Overhead L1-Hit L2-Hit Miss Other Shared Object
# ........ ............................ .................................
#
77.47% 18.4% 0.1% 0.6% 80.9% [kernel.kallsyms]
5.61% 36.5% 0.7% 1.4% 61.5% libxul.so
2.77% 39.7% 0.0% 12.3% 47.9% libc.so.6
2.01% 34.0% 1.9% 1.9% 62.3% libglib-2.0.so.0.8400.1
1.93% 31.4% 2.0% 2.0% 64.7% [amdgpu]
1.63% 48.8% 0.0% 0.0% 51.2% [JIT] tid 60168
1.14% 3.3% 0.0% 0.0% 96.7% [vdso]
#
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-12-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is a breakdown of perf_mem_data_src.mem_snoop values. For now, it
doesn't use mem_snoopx values like FWD and PEER.
$ perf mem report -F overhead,snoop,comm --stdio
...
# ---------- Snoop -----------
# Overhead Hit HitM Miss Other Command
# ........ ............................ ...............
#
34.24% 0.6% 0.0% 0.0% 99.4% gnome-shell
12.02% 1.0% 0.0% 0.0% 99.0% chrome
9.32% 1.0% 0.0% 0.3% 98.7% Isolated Web Co
6.85% 1.0% 0.3% 0.0% 98.6% swapper
6.30% 0.8% 0.8% 0.0% 98.5% Xorg
3.02% 2.4% 0.0% 0.0% 97.6% VizCompositorTh
2.35% 0.0% 0.0% 0.0% 100.0% firefox-esr
2.04% 0.0% 0.0% 0.0% 100.0% JS Helper
1.51% 3.2% 0.0% 0.0% 96.8% threaded-ml
1.44% 0.0% 0.0% 0.0% 100.0% AudioIP~allback
...
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-11-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is a breakdown of perf_mem_data_src.mem_lvl_num. But it's also
divided into two parts because the combination is bigger than 8.
Since there are many entries for different cache levels, 'cache' field
focuses on them. I generalized buffers like LFB, MAB and MHB to L1-buf
and L2-buf.
The rest goes to 'memory' field which can be RAM, CXL, PMEM, IO, etc.
$ perf mem report -F cache,mem,dso --stdio
...
#
# -------------- Cache -------------- --- Memory ---
# L1 L2 L3 L1-buf Other RAM Other Shared Object
# ................................... .............. ....................................
#
53.9% 3.6% 16.2% 21.6% 4.8% 4.8% 95.2% [kernel.kallsyms]
64.7% 1.7% 3.5% 17.4% 12.8% 12.8% 87.2% chrome (deleted)
78.3% 2.8% 0.0% 1.0% 17.9% 17.9% 82.1% libc.so.6
39.6% 1.5% 0.0% 5.7% 53.2% 53.2% 46.8% libxul.so
26.2% 0.0% 0.0% 0.0% 73.8% 73.8% 26.2% [unknown]
85.5% 0.0% 0.0% 14.5% 0.0% 0.0% 100.0% libspa-audioconvert.so
66.3% 4.4% 0.0% 29.4% 0.0% 0.0% 100.0% libglib-2.0.so.0.8200.1 (deleted)
1.9% 0.0% 0.0% 0.0% 98.1% 98.1% 1.9% libmutter-cogl-15.so.0.0.0 (deleted)
10.6% 0.0% 0.0% 89.4% 0.0% 0.0% 100.0% libpulsecommon-16.1.so
0.0% 0.0% 0.0% 100.0% 0.0% 0.0% 100.0% libfreeblpriv3.so (deleted)
...
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Some mem_stat types don't use all 8 columns. And there are cases only
samples in certain kinds of mem_stat types are available only. For that
case hide columns which has no samples.
The new output for the previous data would be:
$ perf mem report -F overhead,op,comm --stdio
...
# ------ Mem Op -------
# Overhead Load Store Other Command
# ........ ..................... ...............
#
44.85% 21.1% 30.7% 48.3% swapper
26.82% 98.8% 0.3% 0.9% netsli-prober
7.19% 51.7% 13.7% 34.6% perf
5.81% 89.7% 2.2% 8.1% qemu-system-ppc
4.77% 100.0% 0.0% 0.0% notifications_c
1.77% 95.9% 1.2% 3.0% MemoryReleaser
0.77% 71.6% 4.1% 24.3% DefaultEventMan
0.19% 66.7% 22.2% 11.1% gnome-shell
...
On Intel machines, the event is only for loads or stores so it'll have
only one column:
# Mem Op
# Overhead Load Command
# ........ ....... ...............
#
20.55% 100.0% swapper
17.13% 100.0% chrome
9.02% 100.0% data-loop.0
6.26% 100.0% pipewire-pulse
5.63% 100.0% threaded-ml
5.47% 100.0% GraphRunner
5.37% 100.0% AudioIP~allback
5.30% 100.0% Chrome_ChildIOT
3.17% 100.0% Isolated Web Co
...
Committer testing:
# grep "model name" -m1 /proc/cpuinfo
model name : AMD Ryzen 9 9950X3D 16-Core Processo
# perf mem report -F overhead,op,comm --stdio
# Total Lost Samples: 0
#
# Samples: 2K of event 'cycles:P'
# Total weight : 2637
# Sort order : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc
#
# ------ Mem Op -------
# Overhead Load Store Other Command
# ........ ..................... ...............
#
61.02% 14.4% 25.5% 60.1% swapper
5.61% 26.4% 13.5% 60.1% Isolated Web Co
5.50% 21.4% 29.7% 49.0% perf
4.74% 27.2% 15.2% 57.6% gnome-shell
4.63% 33.6% 11.5% 54.9% mdns_service
4.29% 28.3% 12.4% 59.3% ptyxis
2.16% 24.6% 19.3% 56.1% DOM Worker
0.99% 23.1% 34.6% 42.3% firefox
0.72% 26.3% 15.8% 57.9% IPC I/O Parent
0.61% 12.5% 12.5% 75.0% kworker/u130:20
0.61% 37.5% 18.8% 43.8% podman
0.57% 33.3% 6.7% 60.0% Timer
0.53% 14.3% 7.1% 78.6% KMS thread
0.49% 30.8% 7.7% 61.5% kworker/u130:3-
0.46% 41.7% 33.3% 25.0% IPDL Background
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is an actual example of the he_mem_stat based sample breakdown. It
uses 'mem_op' field of union perf_mem_data_src which means memory
operations.
It'd have basically 'load' or 'store' which can be useful if PMU doesn't
have separate events for them like IBS or SPE. In addition, there's an
entry in case load and store happen at the same time. Also adds entries
for prefetching and execution.
$ perf mem report -F +op -s comm --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 4K of event 'ibs_op//'
# Total weight : 9559
# Sort order : comm
#
# --------------------- Mem Op ----------------------
# Overhead Samples Load Store Ld+St Pfetch Exec Other N/A N/A Command
# ........ ....... ................................................... ...............
#
44.85% 4077 21.1% 30.7% 0.0% 0.0% 0.0% 48.3% 0.0% 0.0% swapper
26.82% 45 98.8% 0.3% 0.0% 0.0% 0.0% 0.9% 0.0% 0.0% netsli-prober
7.19% 442 51.7% 13.7% 0.0% 0.0% 0.0% 34.6% 0.0% 0.0% perf
5.81% 75 89.7% 2.2% 0.0% 0.0% 0.0% 8.1% 0.0% 0.0% qemu-system-ppc
4.77% 1 100.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% 0.0% notifications_c
1.77% 10 95.9% 1.2% 0.0% 0.0% 0.0% 3.0% 0.0% 0.0% MemoryReleaser
0.77% 32 71.6% 4.1% 0.0% 0.0% 0.0% 24.3% 0.0% 0.0% DefaultEventMan
0.19% 10 66.7% 22.2% 0.0% 0.0% 0.0% 11.1% 0.0% 0.0% gnome-shell
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is a preparation for later changes to support mem_stat output. The
new fields will need two lines for the header - the first line will show
type of mem stat and the second line will show the name of each item
which is returned by mem_stat_name().
Each element in the mem_stat array will be printed in percentage for the
hist_entry and their sum would be 100%.
Add new output field dimension only for SORT_MODE__MEM using mem_stat.
To handle possible name conflict with existing sort keys, move the order
of checking output field dimensions after the sort dimensions when it
looks for sort keys.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a logic to account he->mem_stat based on mem_stat_type in hists.
Each mem_stat entry will have different meaning based on the type so the
index in the array is calculated at runtime using the corresponding
value in the sample.data_src.
Still hists has no mem_stat_types yet so this code won't work for now.
Later hists->mem_stat_types will be allocated based on what users want
in the output actually.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The 'struct he_mem_stat' is to save detailed information about memory
instruction. It'll be used to show breakdown of various data from
PERF_SAMPLE_DATA_SRC. Note that this structure is generic and the
contents will be different depending on actual data it'll use later.
The information about the actual data will be saved in 'struct hists'
and its length is in nr_mem_stats. This commit just adds ground works
and does nothing since hists->nr_mem_stats is 0 for now.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This is a preparation to support multi-line headers in 'perf mem report'.
Normal sort keys and output fields that don't have contents for multi-
line will print the header string at the last line only.
As we don't use multi-line headers normally, it should not have any
changes in the output.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There's no way to enable PERF_SAMPLE_DATA_SRC without PERF_SAMPLE_ADDR
which brings a lot of overhead due to the number of MMAP[2] records.
Let's add a new option to enable this information separately.
Committer testing:
# perf record -a --sample-mem-info
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.815 MB perf.data (2637 samples) ]
#
# perf evlist -v
cycles:P: type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER|DATA_SRC, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 2, sample_id_all: 1
dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER|DATA_SRC, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
#
# perf report -D |& grep -w PERF_RECORD_SAMPLE -A3 -m1
0 44675164447282 0x1a7590 [0x40]: PERF_RECORD_SAMPLE(IP, 0x4001): 107299/107299: 0xffffffffac4a5e11 period: 144 addr: 0
. data_src: 0x229080142
... thread: perf:107299
...... dso: /lib/modules/6.15.0-rc4+/build/vmlinux
#
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When it removes an output format for cancelled children or latency, it
should delete itself from the sort list as well. Otherwise assertion
in fmt_free() will fire.
$ perf report -H --stdio
perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed.
Aborted (core dumped)
Also convert to perf_hpp__column_unregister() for the same open codes.
Committer notes:
Before this patch:
# perf test hierarchy
83: perf report --hierarchy : FAILED!
# perf test -v hierarchy
--- start ---
test child forked, pid 102242
perf report --hierarchy
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB /tmp/perf-test-report.HX0N85TlPq/perf-report-hierarchy-perf.data (6 samples) ]
perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed.
/home/acme/libexec/perf-core/tests/shell/perf-report-hierarchy.sh: line 34: 102250 Aborted (core dumped) perf report --hierarchy > /dev/null
--- Cleaning up ---
---- end(-1) ----
83: perf report --hierarchy : FAILED!
#
After:
# perf test hierarchy
83: perf report --hierarchy : Ok
#
Fixes: dbd11b6bdab12f60 ("perf hist: Remove formats in hierarchy when cancel children")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250430180321.736939-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Super simple test to check that at least we're not segfaulting when
trying to use 'perf report --hierarchy', more subtests should be added
to make sure the output is the expected one.
This is being merged right before a fix for that that this test detects:
# perf test hierarchy
83: perf report --hierarchy : FAILED!
# perf test -v hierarchy
--- start ---
test child forked, pid 102242
perf report --hierarchy
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB /tmp/perf-test-report.HX0N85TlPq/perf-report-hierarchy-perf.data (6 samples) ]
perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed.
/home/acme/libexec/perf-core/tests/shell/perf-report-hierarchy.sh: line 34: 102250 Aborted (core dumped) perf report --hierarchy > /dev/null
--- Cleaning up ---
---- end(-1) ----
83: perf report --hierarchy : FAILED!
#
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/lkml/20250430180321.736939-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
IBS Fetch and IBS Op PMUs has various constraints on supported sample
periods. Add perf unit tests to test those.
Running it in parallel with other tests causes intermittent failures.
Mark it exclusive to force it to run sequentially. Sample output on a
Zen5 machine:
Without kernel fixes:
$ sudo ./perf test -vv 112
112: AMD IBS sample period:
--- start ---
test child forked, pid 8774
Using CPUID AuthenticAMD-26-2-1
IBS config tests:
-----------------
Fetch PMU tests:
0xffff : Ok (nr samples: 1078)
0x1000 : Ok (nr samples: 17030)
0xff : Ok (nr samples: 41068)
0x1 : Ok (nr samples: 40543)
0x0 : Ok
0x10000 : Ok
Op PMU tests:
0x0 : Ok
0x1 : Fail
0x8 : Fail
0x9 : Ok (nr samples: 40543)
0xf : Ok (nr samples: 40543)
0x1000 : Ok (nr samples: 18736)
0xffff : Ok (nr samples: 1168)
0x10000 : Ok
0x100000 : Fail (nr samples: 14)
0xf00000 : Fail (nr samples: 1)
0xf0ffff : Fail (nr samples: 1)
0x1f0ffff : Fail (nr samples: 1)
0x7f0ffff : Fail (nr samples: 0)
0x8f0ffff : Ok
0x17f0ffff : Ok
IBS sample period constraint tests:
-----------------------------------
Fetch PMU test:
freq 0, sample_freq 0: Ok
freq 0, sample_freq 1: Fail
freq 0, sample_freq 15: Fail
freq 0, sample_freq 16: Ok (nr samples: 1604)
freq 0, sample_freq 17: Ok (nr samples: 1604)
freq 0, sample_freq 143: Ok (nr samples: 1604)
freq 0, sample_freq 144: Ok (nr samples: 1604)
freq 0, sample_freq 145: Ok (nr samples: 1604)
freq 0, sample_freq 1234: Ok (nr samples: 1566)
freq 0, sample_freq 4103: Ok (nr samples: 1119)
freq 0, sample_freq 65520: Ok (nr samples: 2264)
freq 0, sample_freq 65535: Ok (nr samples: 2263)
freq 0, sample_freq 65552: Ok (nr samples: 1166)
freq 0, sample_freq 8388607: Ok (nr samples: 268)
freq 0, sample_freq 268435455: Ok (nr samples: 8)
freq 1, sample_freq 0: Ok
freq 1, sample_freq 1: Ok (nr samples: 4)
freq 1, sample_freq 15: Ok (nr samples: 4)
freq 1, sample_freq 16: Ok (nr samples: 4)
freq 1, sample_freq 17: Ok (nr samples: 4)
freq 1, sample_freq 143: Ok (nr samples: 5)
freq 1, sample_freq 144: Ok (nr samples: 5)
freq 1, sample_freq 145: Ok (nr samples: 5)
freq 1, sample_freq 1234: Ok (nr samples: 7)
freq 1, sample_freq 4103: Ok (nr samples: 35)
freq 1, sample_freq 65520: Ok (nr samples: 642)
freq 1, sample_freq 65535: Ok (nr samples: 636)
freq 1, sample_freq 65552: Ok (nr samples: 651)
freq 1, sample_freq 8388607: Ok
Op PMU test:
freq 0, sample_freq 0: Ok
freq 0, sample_freq 1: Fail
freq 0, sample_freq 15: Fail
freq 0, sample_freq 16: Fail
freq 0, sample_freq 17: Fail
freq 0, sample_freq 143: Fail
freq 0, sample_freq 144: Ok (nr samples: 1604)
freq 0, sample_freq 145: Ok (nr samples: 1604)
freq 0, sample_freq 1234: Ok (nr samples: 1604)
freq 0, sample_freq 4103: Ok (nr samples: 1604)
freq 0, sample_freq 65520: Ok (nr samples: 2227)
freq 0, sample_freq 65535: Ok (nr samples: 2296)
freq 0, sample_freq 65552: Ok (nr samples: 2213)
freq 0, sample_freq 8388607: Ok (nr samples: 250)
freq 0, sample_freq 268435455: Ok (nr samples: 8)
freq 1, sample_freq 0: Ok
freq 1, sample_freq 1: Fail (nr samples: 4)
freq 1, sample_freq 15: Fail (nr samples: 4)
freq 1, sample_freq 16: Fail (nr samples: 4)
freq 1, sample_freq 17: Fail (nr samples: 4)
freq 1, sample_freq 143: Fail (nr samples: 5)
freq 1, sample_freq 144: Fail (nr samples: 5)
freq 1, sample_freq 145: Fail (nr samples: 5)
freq 1, sample_freq 1234: Fail (nr samples: 8)
freq 1, sample_freq 4103: Fail (nr samples: 33)
freq 1, sample_freq 65520: Fail (nr samples: 546)
freq 1, sample_freq 65535: Fail (nr samples: 544)
freq 1, sample_freq 65552: Fail (nr samples: 555)
freq 1, sample_freq 8388607: Ok
IBS ioctl() tests:
------------------
Fetch PMU tests
ioctl(period = 0x0 ): Ok
ioctl(period = 0x1 ): Fail
ioctl(period = 0xf ): Fail
ioctl(period = 0x10 ): Ok
ioctl(period = 0x11 ): Fail
ioctl(period = 0x1f ): Fail
ioctl(period = 0x20 ): Ok
ioctl(period = 0x80 ): Ok
ioctl(period = 0x8f ): Fail
ioctl(period = 0x90 ): Ok
ioctl(period = 0x91 ): Fail
ioctl(period = 0x100 ): Ok
ioctl(period = 0xfff0 ): Ok
ioctl(period = 0xffff ): Fail
ioctl(period = 0x10000 ): Ok
ioctl(period = 0x1fff0 ): Ok
ioctl(period = 0x1fff5 ): Fail
ioctl(freq = 0x0 ): Ok
ioctl(freq = 0x1 ): Ok
ioctl(freq = 0xf ): Ok
ioctl(freq = 0x10 ): Ok
ioctl(freq = 0x11 ): Ok
ioctl(freq = 0x1f ): Ok
ioctl(freq = 0x20 ): Ok
ioctl(freq = 0x80 ): Ok
ioctl(freq = 0x8f ): Ok
ioctl(freq = 0x90 ): Ok
ioctl(freq = 0x91 ): Ok
ioctl(freq = 0x100 ): Ok
Op PMU tests
ioctl(period = 0x0 ): Ok
ioctl(period = 0x1 ): Fail
ioctl(period = 0xf ): Fail
ioctl(period = 0x10 ): Fail
ioctl(period = 0x11 ): Fail
ioctl(period = 0x1f ): Fail
ioctl(period = 0x20 ): Fail
ioctl(period = 0x80 ): Fail
ioctl(period = 0x8f ): Fail
ioctl(period = 0x90 ): Ok
ioctl(period = 0x91 ): Fail
ioctl(period = 0x100 ): Ok
ioctl(period = 0xfff0 ): Ok
ioctl(period = 0xffff ): Fail
ioctl(period = 0x10000 ): Ok
ioctl(period = 0x1fff0 ): Ok
ioctl(period = 0x1fff5 ): Fail
ioctl(freq = 0x0 ): Ok
ioctl(freq = 0x1 ): Ok
ioctl(freq = 0xf ): Ok
ioctl(freq = 0x10 ): Ok
ioctl(freq = 0x11 ): Ok
ioctl(freq = 0x1f ): Ok
ioctl(freq = 0x20 ): Ok
ioctl(freq = 0x80 ): Ok
ioctl(freq = 0x8f ): Ok
ioctl(freq = 0x90 ): Ok
ioctl(freq = 0x91 ): Ok
ioctl(freq = 0x100 ): Ok
IBS freq (negative) tests:
--------------------------
freq 1, sample_freq 200000: Fail
IBS L3MissOnly test: (takes a while)
--------------------
Fetch L3MissOnly: Fail (nr_samples: 1213)
Op L3MissOnly: Ok (nr_samples: 1193)
---- end(-1) ----
112: AMD IBS sample period : FAILED!
With kernel fixes:
$ sudo ./perf test -vv 112
112: AMD IBS sample period:
--- start ---
test child forked, pid 6939
Using CPUID AuthenticAMD-26-2-1
IBS config tests:
-----------------
Fetch PMU tests:
0xffff : Ok (nr samples: 969)
0x1000 : Ok (nr samples: 15540)
0xff : Ok (nr samples: 40555)
0x1 : Ok (nr samples: 40543)
0x0 : Ok
0x10000 : Ok
Op PMU tests:
0x0 : Ok
0x1 : Ok
0x8 : Ok
0x9 : Ok (nr samples: 40543)
0xf : Ok (nr samples: 40543)
0x1000 : Ok (nr samples: 19156)
0xffff : Ok (nr samples: 1169)
0x10000 : Ok
0x100000 : Ok (nr samples: 1151)
0xf00000 : Ok (nr samples: 76)
0xf0ffff : Ok (nr samples: 73)
0x1f0ffff : Ok (nr samples: 33)
0x7f0ffff : Ok (nr samples: 10)
0x8f0ffff : Ok
0x17f0ffff : Ok
IBS sample period constraint tests:
-----------------------------------
Fetch PMU test:
freq 0, sample_freq 0: Ok
freq 0, sample_freq 1: Ok
freq 0, sample_freq 15: Ok
freq 0, sample_freq 16: Ok (nr samples: 1203)
freq 0, sample_freq 17: Ok (nr samples: 1604)
freq 0, sample_freq 143: Ok (nr samples: 1604)
freq 0, sample_freq 144: Ok (nr samples: 1604)
freq 0, sample_freq 145: Ok (nr samples: 1604)
freq 0, sample_freq 1234: Ok (nr samples: 1604)
freq 0, sample_freq 4103: Ok (nr samples: 1343)
freq 0, sample_freq 65520: Ok (nr samples: 2254)
freq 0, sample_freq 65535: Ok (nr samples: 2136)
freq 0, sample_freq 65552: Ok (nr samples: 1158)
freq 0, sample_freq 8388607: Ok (nr samples: 257)
freq 0, sample_freq 268435455: Ok (nr samples: 8)
freq 1, sample_freq 0: Ok
freq 1, sample_freq 1: Ok (nr samples: 4)
freq 1, sample_freq 15: Ok (nr samples: 4)
freq 1, sample_freq 16: Ok (nr samples: 4)
freq 1, sample_freq 17: Ok (nr samples: 4)
freq 1, sample_freq 143: Ok (nr samples: 5)
freq 1, sample_freq 144: Ok (nr samples: 5)
freq 1, sample_freq 145: Ok (nr samples: 5)
freq 1, sample_freq 1234: Ok (nr samples: 8)
freq 1, sample_freq 4103: Ok (nr samples: 34)
freq 1, sample_freq 65520: Ok (nr samples: 458)
freq 1, sample_freq 65535: Ok (nr samples: 628)
freq 1, sample_freq 65552: Ok (nr samples: 396)
freq 1, sample_freq 8388607: Ok
Op PMU test:
freq 0, sample_freq 0: Ok
freq 0, sample_freq 1: Ok
freq 0, sample_freq 15: Ok
freq 0, sample_freq 16: Ok
freq 0, sample_freq 17: Ok
freq 0, sample_freq 143: Ok
freq 0, sample_freq 144: Ok (nr samples: 1604)
freq 0, sample_freq 145: Ok (nr samples: 1604)
freq 0, sample_freq 1234: Ok (nr samples: 1604)
freq 0, sample_freq 4103: Ok (nr samples: 1604)
freq 0, sample_freq 65520: Ok (nr samples: 2250)
freq 0, sample_freq 65535: Ok (nr samples: 2158)
freq 0, sample_freq 65552: Ok (nr samples: 2296)
freq 0, sample_freq 8388607: Ok (nr samples: 243)
freq 0, sample_freq 268435455: Ok (nr samples: 6)
freq 1, sample_freq 0: Ok
freq 1, sample_freq 1: Ok (nr samples: 4)
freq 1, sample_freq 15: Ok (nr samples: 4)
freq 1, sample_freq 16: Ok (nr samples: 4)
freq 1, sample_freq 17: Ok (nr samples: 4)
freq 1, sample_freq 143: Ok (nr samples: 4)
freq 1, sample_freq 144: Ok (nr samples: 5)
freq 1, sample_freq 145: Ok (nr samples: 4)
freq 1, sample_freq 1234: Ok (nr samples: 6)
freq 1, sample_freq 4103: Ok (nr samples: 27)
freq 1, sample_freq 65520: Ok (nr samples: 542)
freq 1, sample_freq 65535: Ok (nr samples: 550)
freq 1, sample_freq 65552: Ok (nr samples: 552)
freq 1, sample_freq 8388607: Ok
IBS ioctl() tests:
------------------
Fetch PMU tests
ioctl(period = 0x0 ): Ok
ioctl(period = 0x1 ): Ok
ioctl(period = 0xf ): Ok
ioctl(period = 0x10 ): Ok
ioctl(period = 0x11 ): Ok
ioctl(period = 0x1f ): Ok
ioctl(period = 0x20 ): Ok
ioctl(period = 0x80 ): Ok
ioctl(period = 0x8f ): Ok
ioctl(period = 0x90 ): Ok
ioctl(period = 0x91 ): Ok
ioctl(period = 0x100 ): Ok
ioctl(period = 0xfff0 ): Ok
ioctl(period = 0xffff ): Ok
ioctl(period = 0x10000 ): Ok
ioctl(period = 0x1fff0 ): Ok
ioctl(period = 0x1fff5 ): Ok
ioctl(freq = 0x0 ): Ok
ioctl(freq = 0x1 ): Ok
ioctl(freq = 0xf ): Ok
ioctl(freq = 0x10 ): Ok
ioctl(freq = 0x11 ): Ok
ioctl(freq = 0x1f ): Ok
ioctl(freq = 0x20 ): Ok
ioctl(freq = 0x80 ): Ok
ioctl(freq = 0x8f ): Ok
ioctl(freq = 0x90 ): Ok
ioctl(freq = 0x91 ): Ok
ioctl(freq = 0x100 ): Ok
Op PMU tests
ioctl(period = 0x0 ): Ok
ioctl(period = 0x1 ): Ok
ioctl(period = 0xf ): Ok
ioctl(period = 0x10 ): Ok
ioctl(period = 0x11 ): Ok
ioctl(period = 0x1f ): Ok
ioctl(period = 0x20 ): Ok
ioctl(period = 0x80 ): Ok
ioctl(period = 0x8f ): Ok
ioctl(period = 0x90 ): Ok
ioctl(period = 0x91 ): Ok
ioctl(period = 0x100 ): Ok
ioctl(period = 0xfff0 ): Ok
ioctl(period = 0xffff ): Ok
ioctl(period = 0x10000 ): Ok
ioctl(period = 0x1fff0 ): Ok
ioctl(period = 0x1fff5 ): Ok
ioctl(freq = 0x0 ): Ok
ioctl(freq = 0x1 ): Ok
ioctl(freq = 0xf ): Ok
ioctl(freq = 0x10 ): Ok
ioctl(freq = 0x11 ): Ok
ioctl(freq = 0x1f ): Ok
ioctl(freq = 0x20 ): Ok
ioctl(freq = 0x80 ): Ok
ioctl(freq = 0x8f ): Ok
ioctl(freq = 0x90 ): Ok
ioctl(freq = 0x91 ): Ok
ioctl(freq = 0x100 ): Ok
IBS freq (negative) tests:
--------------------------
freq 1, sample_freq 200000: Ok
IBS L3MissOnly test: (takes a while)
--------------------
Fetch L3MissOnly: Ok (nr_samples: 1301)
Op L3MissOnly: Ok (nr_samples: 1590)
---- end(0) ----
112: AMD IBS sample period : Ok
Committer notes:
Avoid using PAGE_SIZE as that define is also in sys/user.h
Make it a variable not to call sysconf() multiple times.
Also cast func to void * when passing it as the first arg to memcpy to
avoid this with some versions of clang:
arch/x86/tests/amd-ibs-period.c:81:3: error: no matching function for call to 'memcpy'
memcpy(func, insn1, sizeof(insn1));
^~~~~~
/usr/include/string.h:27:7: note: candidate function not viable: no known conversion from 'int (*)(void)' to 'void *' for 1st argument
void *memcpy (void *__restrict, const void *__restrict, size_t);
^
/usr/include/fortify/string.h:40:27: note: candidate function not viable: no known conversion from 'int (*)(void)' to 'void *const' for 1st argument
_FORTIFY_FN(memcpy) void *memcpy(void * _FORTIFY_POS0 __od,
^
arch/x86/tests/amd-ibs-period.c:87:3: error: no matching function for call to 'memcpy'
This one, for instance:
Alpine clang version 19.1.4
Target: x86_64-alpine-linux-musl
Thread model: posix
InstalledDir: /usr/lib/llvm19/bin
Configuration file: /etc/clang19/x86_64-alpine-linux-musl.cfg
System configuration file directory: /etc/clang19
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-5-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf mem/c2c' uses IBS Op PMU on AMD platforms.
IBS Op PMU on Zen5 uarch has added support for Load Latency filtering.
Implement 'perf mem/c2c' --ldlat using IBS Op Load Latency filtering
capability.
Some subtle differences between AMD and other arch:
o --ldlat is disabled by default on AMD
o Supported values are 128 to 2048.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-4-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
IBS Op PMU on Zen5 reports DTLB and page size information differently
compared to prior generation.
IBS_OP_DATA3 Zen3/4 Zen5
----------------------------------------------------------------
19 IbsDcL2TlbHit1G Reserved
----------------------------------------------------------------
6 IbsDcL2tlbHit2M Reserved
----------------------------------------------------------------
5 IbsDcL1TlbHit1G PageSize:
4 IbsDcL1TlbHit2M 0 - 4K
1 - 2M
2 - 1G
3 - Reserved
Valid only if
IbsDcPhyAddrValid = 1
----------------------------------------------------------------
3 IbsDcL2TlbMiss IbsDcL2TlbMiss
Valid only if
IbsDcPhyAddrValid = 1
----------------------------------------------------------------
2 IbsDcL1tlbMiss IbsDcL1tlbMiss
Valid only if
IbsDcPhyAddrValid = 1
----------------------------------------------------------------
Kernel expose this change as "dtlb_pgsize" capability in PMU sysfs.
Change IBS register raw-dump logic according to new bit definitions.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-3-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
IBS OP PMU on Zen5 supports Load Latency filtering. Decode and dump Load
Latency filtering related bits into perf script raw dump.
Also add oneliner example in the perf-amd-ibs man page.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I started seeing this in recent Fedora 42 kernels:
# uname -a
Linux number 6.14.3-300.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Apr 20 16:08:39 UTC 2025 x86_64 GNU/Linux
#
# perf test vmlinux
1: vmlinux symtab matches kallsyms : FAILED!
#
Where we have Rust enabled:
# grep CONFIG_RUST /boot/config-6.14.3-300.fc42.x86_64
CONFIG_RUSTC_VERSION=108600
CONFIG_RUST_IS_AVAILABLE=y
CONFIG_RUSTC_LLVM_VERSION=200101
CONFIG_RUSTC_HAS_COERCE_POINTEE=y
CONFIG_RUST=y
CONFIG_RUSTC_VERSION_TEXT="rustc 1.86.0 (05f9846f8 2025-03-31) (Fedora 1.86.0-1.fc42)"
CONFIG_RUST_FW_LOADER_ABSTRACTIONS=y
CONFIG_RUST_PHYLIB_ABSTRACTIONS=y
# CONFIG_RUST_DEBUG_ASSERTIONS is not set
CONFIG_RUST_OVERFLOW_CHECKS=y
# CONFIG_RUST_BUILD_ASSERT_ALLOW is not set
#
Looking at the reason for the failure:
# perf test -v vmlinux |& grep ^ERR
ERR : 0xffffffff99efc7d0: __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ not on kallsyms
ERR : 0xffffffff99efc7e0: _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ not on kallsyms
#
But:
# grep -w u /proc/kallsyms
ffffffff99efc7d0 u __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
ffffffff99efc7e0 u _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
#
The test checks that "vmlinux symtab matches kallsyms", so it finds those two
symbols in vmlinux:
# pahole --running_kernel_vmlinux
/usr/lib/debug/lib/modules/6.14.3-300.fc42.x86_64/vmlinux
#
# readelf -sW /usr/lib/debug/lib/modules/6.14.3-300.fc42.x86_64/vmlinux | grep -Ew '(__pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_|_RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_)'
81844: ffffffff81efc7e0 524 FUNC LOCAL DEFAULT 1 _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
144259: ffffffff81efc7d0 16 FUNC LOCAL DEFAULT 1 __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
#
It is there.
From the nm documentation we can see that:
"U" The symbol is undefined.
"u" The symbol is a unique global symbol. This is a GNU extension to the
standard set of ELF symbol bindings. For such a symbol the dynamic
linker will make sure that in the entire process there is just one
symbol with this name and type in use.
So lets consider 'u' symbols in /proc/kallsyms when loading it to cover this case.
Fedora:40 shows this as a 'l' symbol, so consider that as well.
With this patch 'perf test 1' is happy again:
# perf test vmlinux
1: vmlinux symtab matches kallsyms : Ok
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/aBE_n0PGl3g6h-cS@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When libperf is built alone in-source, $(OUTPUT) isn't set. This causes
the generated uapi path to resolve to '/../arch' which results in a
permissions error:
mkdir: cannot create directory '/../arch': Permission denied
Fix it by removing the preceding '/..' which means that it gets
generated either in the tools/lib/perf part of the tree or the OUTPUT
folder. Some other rules that rely on OUTPUT further refine this
conditionally depending on whether it's an in-source or out-of-source
build, but I don't think we need the extra complexity here. And this
rule is slightly different to others because the header is needed by
both libperf and Perf. This is further complicated by the fact that Perf
always passes O=... to libperf even for in source builds, meaning that
OUTPUT isn't set consistently between projects.
Because we're no longer going one level up to try to generate the file
in the tools/ folder, Perf's include rule needs to descend into libperf.
Also fix the clean rule while we're here.
Reported-by: Thorsten Leemhuis <linux@leemhuis.info>
Closes: https://lore.kernel.org/linux-perf-users/7703f88e-ccb7-4c98-9da4-8aad224e780f@leemhuis.info/
Fixes: bfb713ea53c7 ("perf tools: Fix arm64 build by generating unistd_64.h")
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/20250429-james-perf-fix-libperf-in-source-build-v1-1-a1a827ac15e5@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
|
|
In some cases when calling function add_probe_vfs_getname, line number
can't be detected by 'perf probe -L getname_flags':
78 atomic_set(&result->refcnt, 1);
// one of the following lines should have line number
// but sometimes it does not because of optimization
result->uptr = filename;
result->aname = NULL;
81 audit_getname(result);
To prevent false failures, skip the affected tests if no suitable line
numbers can be detected.
Signed-off-by: Jakub Brnak <jbrnak@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/20250324144523.597557-1-jbrnak@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The struct zone is embedded in struct pglist_data which can be allocated
for each NUMA node early in the boot process. As it's not a slab object
nor a global lock, this was not symbolized.
Since the zone->lock is often contended, it'd be nice if we can
symbolize it. On NUMA systems, node_data array will have pointers for
struct pglist_data. By following the pointer, it can calculate the
address of each zone and its lock using BTF. On UMA, it can just use
contig_page_data and its zones.
The following example shows the zone lock contention at the end.
$ sudo ./perf lock con -abl -E 5 -- ./perf bench sched messaging
# Running 'sched/messaging' benchmark:
# 20 sender and receiver processes per group
# 10 groups == 400 processes run
Total time: 0.038 [sec]
contended total wait max wait avg wait address symbol
5167 18.17 ms 10.27 us 3.52 us ffff953340052d00 &kmem_cache_node (spinlock)
38 11.75 ms 465.49 us 309.13 us ffff95334060c480 &sock_inode_cache (spinlock)
3916 10.13 ms 10.43 us 2.59 us ffff953342aecb40 &kmem_cache_node (spinlock)
2963 10.02 ms 13.75 us 3.38 us ffff9533d2344098 &kmalloc-rnd-08-2k (spinlock)
216 5.05 ms 99.49 us 23.39 us ffff9542bf7d65d0 zone_lock (spinlock)
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20250401063055.7431-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
$ sudo ./perf test -vv 'trace summary'
109: perf trace summary:
--- start ---
test child forked, pid 3501572
testing: perf trace -s -- true
testing: perf trace -S -- true
testing: perf trace -s --summary-mode=thread -- true
testing: perf trace -S --summary-mode=total -- true
testing: perf trace -as --summary-mode=thread --no-bpf-summary -- true
testing: perf trace -as --summary-mode=total --no-bpf-summary -- true
testing: perf trace -as --summary-mode=thread --bpf-summary -- true
testing: perf trace -as --summary-mode=total --bpf-summary -- true
testing: perf trace -aS --summary-mode=total --bpf-summary -- true
---- end(0) ----
109: perf trace summary : Ok
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20250326044001.3503432-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When -s/--summary option is used, it doesn't need (augmented) arguments
of syscalls. Let's skip the augmentation and load another small BPF
program to collect the statistics in the kernel instead of copying the
data to the ring-buffer to calculate the stats in userspace. This will
be much more light-weight than the existing approach and remove any lost
events.
Let's add a new option --bpf-summary to control this behavior. I cannot
make it default because there's no way to get e_machine in the BPF which
is needed for detecting different ABIs like 32-bit compat mode.
No functional changes intended except for no more LOST events. :)
$ sudo ./perf trace -as --summary-mode=total --bpf-summary sleep 1
Summary of events:
total, 6194 events
syscall calls errors total min avg max stddev
(msec) (msec) (msec) (msec) (%)
--------------- -------- ------ -------- --------- --------- --------- ------
epoll_wait 561 0 4530.843 0.000 8.076 520.941 18.75%
futex 693 45 4317.231 0.000 6.230 500.077 21.98%
poll 300 0 1040.109 0.000 3.467 120.928 17.02%
clock_nanosleep 1 0 1000.172 1000.172 1000.172 1000.172 0.00%
ppoll 360 0 872.386 0.001 2.423 253.275 41.91%
epoll_pwait 14 0 384.349 0.001 27.453 380.002 98.79%
pselect6 14 0 108.130 7.198 7.724 8.206 0.85%
nanosleep 39 0 43.378 0.069 1.112 10.084 44.23%
...
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250326044001.3503432-1-namhyung@kernel.org
[ Added fixup sent from Namhyung in response to my report to make it also dependent on CONFIG_TRACE ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
BriefDescription
If BriefDescription and PublicDescription are the same, only
BriefDescription is needed. It will be used for both long and short
format outputs.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250418070812.3771441-3-hejunhao3@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|