summaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)Author
2022-05-23perf header: Add ability to keep feature sectionsAdrian Hunter
Many feature sections should not be re-written during perf inject. In preparation to support that, add callbacks that a tool can use to copy a feature section from elsewhere. perf inject will use this facility to copy features sections from the input file. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220520132404.25853-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23perf stat: Make use of index clearer with perf_countsIan Rogers
Try to disambiguate further when perf_counts is being accessed it is with a cpu map index rather than a CPU. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Dave Marchevsky <davemarchevsky@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lv Ruyi <lv.ruyi@zte.com.cn> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20220519032005.1273691-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23perf bpf_counter: Tidy use of CPU map indexIan Rogers
BPF counters are typically running across all CPUs and so the CPU map index and CPU number are the same. There may be cases with offline CPUs where this isn't the case and so ensure the cpu map index for perf_counts is going to be a valid index by explicitly iterating over the CPU map. This also makes it clearer that users of perf_counts are using an index. Collapse some multiple uses of perf_counts into single uses. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Dave Marchevsky <davemarchevsky@fb.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Lv Ruyi <lv.ruyi@zte.com.cn> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20220519032005.1273691-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23perf mem: Add stats for store operation with no available memory levelLeo Yan
Sometimes we don't know memory store operations happen on exactly which memory (or cache) level, the memory level flag is set to PERF_MEM_LVL_NA in this case; a practical example is Arm SPE AUX trace sets this flag for all store operations due to absent info for cache level. This patch is to add a new item "st_na" in structure c2c_stats to add statistics for store operations with no available cache level. Signed-off-by: Leo Yan <leo.yan@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adam Li <adamli@amperemail.onmicrosoft.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Alyssa Ross <hi@alyssa.is> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Joe Mario <jmario@redhat.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Like Xu <likexu@tencent.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220518055729.1869566-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-23Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo
To get the rest of 5.18. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-21perf session: Fix Intel LBR callstack entries and nr print messageChengdong Li
When generating callstack information from branch_stack(Intel LBR), the actual number of callstack entry should be bigger than the number of branch_stack, for example: branch_stack records: B() -> C() A() -> B() converted callstack records should be: C() B() A() though, the number of callstack equals to the number of branch stack plus 1. This patch fixes above issue in branch_stack__printf(). For example, # echo 'scale=2000; 4*a(1)' > cmd # perf record --call-graph lbr bc -l < cmd Before applying this patch, `perf script -D` output: 1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0 ... LBR call chain: nr:8 ..... 0: fffffffffffffe00 ..... 1: 000000000040a410 ..... 2: 000000000040573c ..... 3: 0000000000408650 ..... 4: 00000000004022f2 ..... 5: 00000000004015f5 ..... 6: 00007f5ed6dcb553 ..... 7: 0000000000401698 ... FP chain: nr:2 ..... 0: fffffffffffffe00 ..... 1: 000000000040a6d8 ... branch callstack: nr:6 # which is not consistent with LBR records. ..... 0: 000000000040a410 ..... 1: 0000000000408650 # ditto ..... 2: 00000000004022f2 ..... 3: 00000000004015f5 ..... 4: 00007f5ed6dcb553 ..... 5: 0000000000401698 ... thread: bc:17990 ...... dso: /usr/bin/bc bc 17990 1220022.677386: 894172 cycles: 40a410 [unknown] (/usr/bin/bc) 40573c [unknown] (/usr/bin/bc) 408650 [unknown] (/usr/bin/bc) 4022f2 [unknown] (/usr/bin/bc) 4015f5 [unknown] (/usr/bin/bc) 7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so) 401698 [unknown] (/usr/bin/bc) After applied: 1220022677386876 0x2a40 [0xd8]: PERF_RECORD_SAMPLE(IP, 0x4002): 17990/17990: 0x40a6d6 period: 894172 addr: 0 ... LBR call chain: nr:8 ..... 0: fffffffffffffe00 ..... 1: 000000000040a410 ..... 2: 000000000040573c ..... 3: 0000000000408650 ..... 4: 00000000004022f2 ..... 5: 00000000004015f5 ..... 6: 00007f5ed6dcb553 ..... 7: 0000000000401698 ... FP chain: nr:2 ..... 0: fffffffffffffe00 ..... 1: 000000000040a6d8 ... branch callstack: nr:7 ..... 0: 000000000040a410 ..... 1: 000000000040573c ..... 2: 0000000000408650 ..... 3: 00000000004022f2 ..... 4: 00000000004015f5 ..... 5: 00007f5ed6dcb553 ..... 6: 0000000000401698 ... thread: bc:17990 ...... dso: /usr/bin/bc bc 17990 1220022.677386: 894172 cycles: 40a410 [unknown] (/usr/bin/bc) 40573c [unknown] (/usr/bin/bc) 408650 [unknown] (/usr/bin/bc) 4022f2 [unknown] (/usr/bin/bc) 4015f5 [unknown] (/usr/bin/bc) 7f5ed6dcb553 __libc_start_main+0xf3 (/usr/lib64/libc-2.17.so) 401698 [unknown] (/usr/bin/bc) Change from v1: - refined code style according to Jiri's review comments. Signed-off-by: Chengdong Li <chengdongli@tencent.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: likexu@tencent.com Link: https://lore.kernel.org/r/20220517015726.96131-1-chengdongli@tencent.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20perf stat: Always keep perf metrics topdown events in a groupKan Liang
If any member in a group has a different cpu mask than the other members, the current perf stat disables group. when the perf metrics topdown events are part of the group, the below <not supported> error will be triggered. $ perf stat -e "{slots,topdown-retiring,uncore_imc_free_running_0/dclk/}" -a sleep 1 WARNING: grouped events cpus do not match, disabling group: anon group { slots, topdown-retiring, uncore_imc_free_running_0/dclk/ } Performance counter stats for 'system wide': 141,465,174 slots <not supported> topdown-retiring 1,605,330,334 uncore_imc_free_running_0/dclk/ The perf metrics topdown events must always be grouped with a slots event as leader. Factor out evsel__remove_from_group() to only remove the regular events from the group. Remove evsel__must_be_in_group(), since no one use it anymore. With the patch, the topdown events aren't broken from the group for the splitting. $ perf stat -e "{slots,topdown-retiring,uncore_imc_free_running_0/dclk/}" -a sleep 1 WARNING: grouped events cpus do not match, disabling group: anon group { slots, topdown-retiring, uncore_imc_free_running_0/dclk/ } Performance counter stats for 'system wide': 346,110,588 slots 124,608,256 topdown-retiring 1,606,869,976 uncore_imc_free_running_0/dclk/ 1.003877592 seconds time elapsed Fixes: a9a1790247bdcf3b ("perf stat: Ensure group is defined on top of the same cpu mask") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220518143900.1493980-3-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20perf stat: Fix and validate CPU map inputs in synthetic PERF_RECORD_STAT eventsIan Rogers
Stat events can come from disk and so need a degree of validation. They contain a CPU which needs looking up via CPU map to access a counter. Add the CPU to index translation, alongside validity checking. Discussion thread: https://lore.kernel.org/linux-perf-users/CAP-5=fWQR=sCuiSMktvUtcbOLidEpUJLCybVF6=BRvORcDOq+g@mail.gmail.com/ Fixes: 7ac0089d138f80dc ("perf evsel: Pass cpu not cpu map index to synthesize") Reported-by: Michael Petlan <mpetlan@redhat.com> Suggested-by: Michael Petlan <mpetlan@redhat.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Dave Marchevsky <davemarchevsky@fb.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Lv Ruyi <lv.ruyi@zte.com.cn> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: netdev@vger.kernel.org Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Yonghong Song <yhs@fb.com> Link: http://lore.kernel.org/lkml/20220519032005.1273691-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-20perf build: Fix check for btf__load_from_kernel_by_id() in libbpfArnaldo Carvalho de Melo
Avi Kivity reported a problem where the __weak btf__load_from_kernel_by_id() in tools/perf/util/bpf-event.c was being used and it called btf__get_from_id() in tools/lib/bpf/btf.c that in turn called back to btf__load_from_kernel_by_id(), resulting in an endless loop. Fix this by adding a feature test to check if btf__load_from_kernel_by_id() is available when building perf with LIBBPF_DYNAMIC=1, and if not then provide the fallback to the old btf__get_from_id(), that doesn't call back to btf__load_from_kernel_by_id() since at that time it didn't exist at all. Tested on Fedora 35 where we have libbpf-devel 0.4.0 with LIBBPF_DYNAMIC where we don't have btf__load_from_kernel_by_id() and thus its feature test fail, not defining HAVE_LIBBPF_BTF__LOAD_FROM_KERNEL_BY_ID: $ cat /tmp/build/perf-urgent/feature/test-libbpf-btf__load_from_kernel_by_id.make.output test-libbpf-btf__load_from_kernel_by_id.c: In function ‘main’: test-libbpf-btf__load_from_kernel_by_id.c:6:16: error: implicit declaration of function ‘btf__load_from_kernel_by_id’ [-Werror=implicit-function-declaration] 6 | return btf__load_from_kernel_by_id(20151128, NULL); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors $ $ nm /tmp/build/perf-urgent/perf | grep btf__load_from_kernel_by_id 00000000005ba180 T btf__load_from_kernel_by_id $ $ objdump --disassemble=btf__load_from_kernel_by_id -S /tmp/build/perf-urgent/perf /tmp/build/perf-urgent/perf: file format elf64-x86-64 <SNIP> 00000000005ba180 <btf__load_from_kernel_by_id>: #include "record.h" #include "util/synthetic-events.h" #ifndef HAVE_LIBBPF_BTF__LOAD_FROM_KERNEL_BY_ID struct btf *btf__load_from_kernel_by_id(__u32 id) { 5ba180: 55 push %rbp 5ba181: 48 89 e5 mov %rsp,%rbp 5ba184: 48 83 ec 10 sub $0x10,%rsp 5ba188: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax 5ba18f: 00 00 5ba191: 48 89 45 f8 mov %rax,-0x8(%rbp) 5ba195: 31 c0 xor %eax,%eax struct btf *btf; #pragma GCC diagnostic push #pragma GCC diagnostic ignored "-Wdeprecated-declarations" int err = btf__get_from_id(id, &btf); 5ba197: 48 8d 75 f0 lea -0x10(%rbp),%rsi 5ba19b: e8 a0 57 e5 ff call 40f940 <btf__get_from_id@plt> 5ba1a0: 89 c2 mov %eax,%edx #pragma GCC diagnostic pop return err ? ERR_PTR(err) : btf; 5ba1a2: 48 98 cltq 5ba1a4: 85 d2 test %edx,%edx 5ba1a6: 48 0f 44 45 f0 cmove -0x10(%rbp),%rax } <SNIP> Fixes: 218e7b775d368f38 ("perf bpf: Provide a weak btf__load_from_kernel_by_id() for older libbpf versions") Reported-by: Avi Kivity <avi@scylladb.com> Link: https://lore.kernel.org/linux-perf-users/f0add43b-3de5-20c5-22c4-70aff4af959f@scylladb.com Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/linux-perf-users/YobjjFOblY4Xvwo7@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-17perf evlist: Keep topdown counters in weak groupIan Rogers
On Intel Icelake, topdown events must always be grouped with a slots event as leader. When a metric is parsed a weak group is formed and retried if perf_event_open fails. The retried events aren't grouped breaking the slots leader requirement. This change modifies the weak group "reset" behavior so that topdown events aren't broken from the group for the retry. $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1 Performance counter stats for 'system wide': 47,867,188,483 slots (92.27%) <not supported> topdown-bad-spec <not supported> topdown-be-bound <not supported> topdown-fe-bound <not supported> topdown-retiring 2,173,346,937 branch-instructions (92.27%) 10,540,253 branch-misses # 0.48% of all branches (92.29%) 96,291,140 bus-cycles (92.29%) 6,214,202 cache-misses # 20.120 % of all cache refs (92.29%) 30,886,082 cache-references (76.91%) 11,773,726,641 cpu-cycles (84.62%) 11,807,585,307 instructions # 1.00 insn per cycle (92.31%) 0 mem-loads (92.32%) 2,212,928,573 mem-stores (84.69%) 10,024,403,118 ref-cycles (92.35%) 16,232,978 baclears.any (92.35%) 23,832,633 ARITH.DIVIDER_ACTIVE (84.59%) 0.981070734 seconds time elapsed After: $ perf stat -e '{slots,topdown-bad-spec,topdown-be-bound,topdown-fe-bound,topdown-retiring,branch-instructions,branch-misses,bus-cycles,cache-misses,cache-references,cpu-cycles,instructions,mem-loads,mem-stores,ref-cycles,baclears.any,ARITH.DIVIDER_ACTIVE}:W' -a sleep 1 Performance counter stats for 'system wide': 31040189283 slots (92.27%) 8997514811 topdown-bad-spec # 28.2% bad speculation (92.27%) 10997536028 topdown-be-bound # 34.5% backend bound (92.27%) 4778060526 topdown-fe-bound # 15.0% frontend bound (92.27%) 7086628768 topdown-retiring # 22.2% retiring (92.27%) 1417611942 branch-instructions (92.26%) 5285529 branch-misses # 0.37% of all branches (92.28%) 62922469 bus-cycles (92.29%) 1440708 cache-misses # 8.292 % of all cache refs (92.30%) 17374098 cache-references (76.94%) 8040889520 cpu-cycles (84.63%) 7709992319 instructions # 0.96 insn per cycle (92.32%) 0 mem-loads (92.32%) 1515669558 mem-stores (84.68%) 6542411177 ref-cycles (92.35%) 4154149 baclears.any (92.35%) 20556152 ARITH.DIVIDER_ACTIVE (84.59%) 1.010799593 seconds time elapsed Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220517052724.283874-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-17perf intel-pt: Add support for emulated ptwriteAdrian Hunter
ptwrite is an Intel x86 instruction that writes arbitrary values into an Intel PT trace. It is not supported on all hardware, so provide an alternative that makes use of TNT packets to convey the payload data. TNT packets encode Taken/Not-taken conditional branch information, so taking branches based on the payload value will encode the value into the TNT packet. Refer to the changes to the documentation file perf-intel-pt.txt in this patch for an example. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220509152400.376613-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-13perf tools: Remove unused machines__find_host()Adrian Hunter
machines__find_host() does not exist. Remove declaration. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220513084459.6581-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10perf auxtrace: Record whether an auxtrace mmap is neededAdrian Hunter
Add a flag needs_auxtrace_mmap to record whether an auxtrace mmap is needed, in preparation for correctly determining whether or not an auxtrace mmap is needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10libperf evlist: Add evsel as a parameter to ->idx()Adrian Hunter
Add evsel as a parameter to ->idx() in preparation for correctly determining whether an auxtrace mmap is needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10libperf evlist: Remove ->idx() per_cpu parameterAdrian Hunter
Remove ->idx() per_cpu parameter because it isn't needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10perf auxtrace: Do not mix up mmap idxAdrian Hunter
The idx is with respect to evlist not evsel. That hasn't mattered because they are the same at present. Prepare for that not being the case, which it won't be when sideband tracking events are allowed on all CPUs even when auxtrace is limited to selected CPUs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10perf auxtrace: Move evlist__enable_event_idx() to auxtrace.cAdrian Hunter
evlist__enable_event_idx() is used only by auxtrace. Move it to auxtrace.c in preparation for making it even more auxtrace specific. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10perf evlist: Use libperf functions in evlist__enable_event_idx()Adrian Hunter
evlist__enable_event_idx() is used only for auxtrace events which are never system_wide. Simplify by using libperf enable event functions. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-09perf metrics: Don't add all tool events for sharingIan Rogers
Tool events are added to the set of events for parsing so that having a tool event in a metric doesn't inhibit event sharing of events between metrics. All tool events were added but this meant unused tool events would be counted. Reduce this set of tool events to just those present in the overall metric list. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220507053410.3798748-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-09perf metrics: Support all tool eventsIan Rogers
Previously duration_time was hard coded, which was ok until commit b03b89b350034f22 ("perf stat: Add user_time and system_time events") added additional tool events. Do for all tool events what was previously done just for duration_time. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220507053410.3798748-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-09perf evsel: Add tool event helpersIan Rogers
Convert to and from a string. Fix evsel__tool_name() as array is off-by-1. Support more than just duration_time as a metric-id. Fixes: 75eafc970bd9d36d ("perf list: Print all available tool events") Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220507053410.3798748-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-09perf evsel: Constify a few arraysIan Rogers
Remove public definition of evsel__tool_names(). Not used outside util/evsel.c. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220507053410.3798748-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-09Revert "perf stat: Support metrics with hybrid events"Ian Rogers
This reverts commit 60344f1a9a597f2e0efcd57df5dad0b42da15e21. Hybrid metrics place a PMU at the end of the parse string. This is also where tool events are placed. The behavior of the parse string isn't clear and so revert the change for now. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Stephane Eranian <eranian@google.com> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220507053410.3798748-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-05perf cpumap: Switch to using perf_cpu_map APIIan Rogers
Switch some raw accesses to the cpu map to using the library API. This can help with reference count checking. Some BPF cases switch from index to CPU for consistency, this shouldn't matter as the CPU map is full. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Link: http://lore.kernel.org/lkml/20220503041757.2365696-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-03perf stat: Avoid printing cpus with no countersIan Rogers
perf_evlist's user_requested_cpus can contain CPUs not present in any evsel's cpus, for example uncore counters. Avoid printing the prefix and trailing \n until the first valid counter is encountered. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Song Liu <songliubraving@fb.com> Cc: Stephane Eranian <eranian@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Link: http://lore.kernel.org/lkml/20220503041757.2365696-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-30perf tools: Add missing headers needed by util/data.hYang Jihong
'struct perf_data' in util/data.h uses the "u64" data type, which is defined in "linux/types.h". If we only include util/data.h, the following compilation error occurs: util/data.h:38:3: error: unknown type name ‘u64’ u64 version; ^~~ Solution: include "linux/types.h." to add the needed type definitions. Fixes: 258031c017c353e8 ("perf header: Add DIR_FORMAT feature to describe directory data") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220429090539.212448-1-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-30Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo
To pick up fixes from perf/urgent. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf symbol: Remove arch__symbols__fixup_end()Namhyung Kim
Now the generic code can handle kallsyms fixup properly so no need to keep the arch-functions anymore. Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220416004048.1514900-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf symbol: Update symbols__fixup_end()Namhyung Kim
Now arch-specific functions all do the same thing. When it fixes the symbol address it needs to check the boundary between the kernel image and modules. For the last symbol in the previous region, it cannot know the exact size as it's discarded already. Thus it just uses a small page size (4096) and rounds it up like the last symbol. Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220416004048.1514900-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf symbol: Pass is_kallsyms to symbols__fixup_end()Namhyung Kim
The symbol fixup is necessary for symbols in kallsyms since they don't have size info. So we use the next symbol's address to calculate the size. Now it's also used for user binaries because sometimes they miss size for hand-written asm functions. There's a arch-specific function to handle kallsyms differently but currently it cannot distinguish kallsyms from others. Pass this information explicitly to handle it properly. Note that those arch functions will be moved to the generic function so I didn't added it to the arch-functions. Fixes: 3cf6a32f3f2a4594 ("perf symbols: Fix symbol size calculation condition") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: linux-s390@vger.kernel.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220416004048.1514900-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf arm-spe: Fix SPE events with phys addressesTimothy Hayes
This patch corrects a bug whereby SPE collection is invoked with pa_enable=1 but synthesized events fail to show physical addresses. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20220421165205.117662-3-timothy.hayes@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf arm-spe: Fix addresses of synthesized SPE eventsTimothy Hayes
This patch corrects a bug whereby synthesized events from SPE samples are missing virtual addresses. Fixes: 54f7815efef7fad9 ("perf arm-spe: Fill address info for samples") Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Timothy Hayes <timothy.hayes@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: bpf@vger.kernel.org Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: John Garry <john.garry@huawei.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: netdev@vger.kernel.org Cc: Song Liu <songliubraving@fb.com> Cc: Will Deacon <will@kernel.org> Cc: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/20220421165205.117662-2-timothy.hayes@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28perf intel-pt: Fix timeless decoding with perf.data directoryAdrian Hunter
Intel PT does not capture data in separate directories, so do not use separate directory processing because it doesn't work for timeless decoding. It also looks like it doesn't support one_mmap handling. Example: Before: # perf record --kcore -a -e intel_pt/tsc=0/k sleep 0.1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.799 MB perf.data ] # perf script --itrace=bep | head # After: # perf script --itrace=bep | head perf 21073 [000] psb: psb offs: 0 ffffffffaa68faf4 native_write_msr+0x4 ([kernel.kallsyms]) perf 21073 [000] cbr: cbr: 45 freq: 4505 MHz (161%) ffffffffaa68faf4 native_write_msr+0x4 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: 0 [unknown] ([unknown]) => ffffffffaa68faf6 native_write_msr+0x6 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa68faf8 native_write_msr+0x8 ([kernel.kallsyms]) => ffffffffaa61aab0 pt_config_start+0x60 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa61aabd pt_config_start+0x6d ([kernel.kallsyms]) => ffffffffaa61b8ad pt_event_start+0x27d ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa61b8bb pt_event_start+0x28b ([kernel.kallsyms]) => ffffffffaa61ba60 pt_event_add+0x40 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa61ba76 pt_event_add+0x56 ([kernel.kallsyms]) => ffffffffaa880e86 event_sched_in+0xc6 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa880e9b event_sched_in+0xdb ([kernel.kallsyms]) => ffffffffaa880ea5 event_sched_in+0xe5 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa880eba event_sched_in+0xfa ([kernel.kallsyms]) => ffffffffaa880f96 event_sched_in+0x1d6 ([kernel.kallsyms]) perf 21073 [000] 1 branches:k: ffffffffaa880fc8 event_sched_in+0x208 ([kernel.kallsyms]) => ffffffffaa880ec0 event_sched_in+0x100 ([kernel.kallsyms]) Fixes: bb6be405c4a2a5 ("perf session: Load data directory files for analysis") Cc: stable@vger.kernel.org Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20220428093109.274641-1-adrian.hunter@intel.com Cc: Ian Rogers <irogers@google.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: linux-kernel@vger.kernel.org
2022-04-24Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo
To pick up fixes, such as the llvm one for ubuntu:22.04. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-24perf stat: Support hybrid --topdown optionZhengjun Xing
Since for cpu_core or cpu_atom, they have different topdown events groups. For cpu_core, --topdown equals to: "{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/, cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/, cpu_core/topdown-heavy-ops/,cpu_core/topdown-br-mispredict/, cpu_core/topdown-fetch-lat/,cpu_core/topdown-mem-bound/}" For cpu_atom, --topdown equals to: "{cpu_atom/topdown-retiring/,cpu_atom/topdown-bad-spec/, cpu_atom/topdown-fe-bound/,cpu_atom/topdown-be-bound/}" To simplify the implementation, on hybrid, --topdown is used together with --cputype. If without --cputype, it uses cpu_core topdown events by default. # ./perf stat --topdown -a sleep 1 WARNING: default to use cpu_core topdown events Performance counter stats for 'system wide': retiring bad speculation frontend bound backend bound heavy operations light operations branch mispredict machine clears fetch latency fetch bandwidth memory bound Core bound 4.1% 0.0% 5.1% 90.8% 2.3% 1.8% 0.0% 0.0% 4.2% 0.9% 9.9% 81.0% 1.002624229 seconds time elapsed # ./perf stat --topdown -a --cputype atom sleep 1 Performance counter stats for 'system wide': retiring bad speculation frontend bound backend bound 13.5% 0.1% 31.2% 55.2% 1.002366987 seconds time elapsed Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220422065635.767648-3-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf clang: Fix header include for LLVM >= 14Guilherme Amadio
The header TargetRegistry.h has moved in LLVM/clang 14. Committer notes: The problem as noticed when building in ubuntu:22.04: 90 98.61 ubuntu:22.04 : FAIL gcc version 11.2.0 (Ubuntu 11.2.0-19ubuntu1) util/c++/clang.cpp:23:10: fatal error: llvm/Support/TargetRegistry.h: No such file or directory 23 | #include "llvm/Support/TargetRegistry.h" | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ compilation terminated. Fixed after applying this patch. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Signed-off-by: Guilherme Amadio <amadio@gentoo.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://twitter.com/GuilhermeAmadio/status/1514970524232921088 Link: http://lore.kernel.org/lkml/Ylp0M/VYgHOxtcnF@gentoo.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf stat: Merge event counts from all hybrid PMUsZhengjun Xing
For hybrid events, by default stat aggregates and reports the event counts per pmu. # ./perf stat -e cycles -a sleep 1 Performance counter stats for 'system wide': 14,066,877,268 cpu_core/cycles/ 6,814,443,147 cpu_atom/cycles/ 1.002760625 seconds time elapsed Sometimes, it's also useful to aggregate event counts from all PMUs. Create a new option '--hybrid-merge' to enable that behavior and report the counts without PMUs. # ./perf stat -e cycles -a --hybrid-merge sleep 1 Performance counter stats for 'system wide': 20,732,982,512 cycles 1.002776793 seconds time elapsed Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220422065635.767648-2-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf stat: Support metrics with hybrid eventsZhengjun Xing
One metric such as 'Kernel_Utilization' may be from different PMUs and consists of different events. For core, Kernel_Utilization = cpu_clk_unhalted.thread:k / cpu_clk_unhalted.thread For atom, Kernel_Utilization = cpu_clk_unhalted.core:k / cpu_clk_unhalted.core The metric group string for core is: '{cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread:k/k,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W' It's internally expanded to: '{cpu_clk_unhalted.thread_p/metric-id=cpu_clk_unhalted.thread_p:k/k,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W#cpu_core' The metric group string for atom is: '{cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core:k/k,cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core/}:W' It's internally expanded to: '{cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core:k/k,cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core/}:W#cpu_atom' That means the group "{cpu_clk_unhalted.thread:k,cpu_clk_unhalted.thread}:W" is from cpu_core PMU and the group "{cpu_clk_unhalted.core:k,cpu_clk_unhalted.core}" is from cpu_atom PMU. And then next, check if the events in the group are valid on that PMU. If one event is not valid on that PMU, the associated group would be removed internally. In this example, cpu_clk_unhalted.thread is valid on cpu_core and cpu_clk_unhalted.core is valid on cpu_atom. So the checks for these two groups are passed. Before: # ./perf stat -M Kernel_Utilization -a sleep 1 WARNING: events in group from different hybrid PMUs! WARNING: grouped events cpus do not match, disabling group: anon group { CPU_CLK_UNHALTED.THREAD_P:k, CPU_CLK_UNHALTED.THREAD_P:k, CPU_CLK_UNHALTED.THREAD, CPU_CLK_UNHALTED.THREAD } Performance counter stats for 'system wide': 17,639,501 cpu_atom/CPU_CLK_UNHALTED.CORE/ # 1.00 Kernel_Utilization 17,578,757 cpu_atom/CPU_CLK_UNHALTED.CORE:k/ 1,005,350,226 ns duration_time 43,012,352 cpu_core/CPU_CLK_UNHALTED.THREAD_P:k/ # 0.99 Kernel_Utilization 17,608,010 cpu_atom/CPU_CLK_UNHALTED.THREAD_P:k/ 43,608,755 cpu_core/CPU_CLK_UNHALTED.THREAD/ 17,630,838 cpu_atom/CPU_CLK_UNHALTED.THREAD/ 1,005,350,226 ns duration_time 1.005350226 seconds time elapsed After: # ./perf stat -M Kernel_Utilization -a sleep 1 Performance counter stats for 'system wide': 17,981,895 CPU_CLK_UNHALTED.CORE [cpu_atom] # 1.00 Kernel_Utilization 17,925,405 CPU_CLK_UNHALTED.CORE:k [cpu_atom] 1,004,811,366 ns duration_time 41,246,425 CPU_CLK_UNHALTED.THREAD_P:k [cpu_core] # 0.99 Kernel_Utilization 41,819,129 CPU_CLK_UNHALTED.THREAD [cpu_core] 1,004,811,366 ns duration_time 1.004811366 seconds time elapsed Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220422065635.767648-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22perf tools: Move libbpf init in libbpf_init functionJiri Olsa
Move the libbpf init code into a single function, so that we have a single place doing that. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20220422100025.1469207-4-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-20perf list: Print all available tool eventsFlorian Fischer
Introduce names for the new tool events 'user_time' and 'system_time'. $ perf list ... duration_time [Tool event] user_time [Tool event] system_time [Tool event] ... Committer testing: Before: $ perf list | grep Tool duration_time [Tool event] $ After: $ perf list | grep Tool duration_time [Tool event] user_time [Tool event] system_time [Tool event] $ Signed-off-by: Florian Fischer <florian.fischer@muhq.space> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: http://lore.kernel.org/lkml/20220420174244.1741958-2-florian.fischer@muhq.space Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-20perf stat: Add user_time and system_time eventsFlorian Fischer
It bothered me that during benchmarking using 'perf stat' (to collect for example CPU cache events) I could not simultaneously retrieve the times spend in user or kernel mode in a machine readable format. When running 'perf stat' the output for humans contains the times reported by rusage and wait4. $ perf stat -e cache-misses:u -- true Performance counter stats for 'true': 4,206 cache-misses:u 0.001113619 seconds time elapsed 0.001175000 seconds user 0.000000000 seconds sys But 'perf stat's machine-readable format does not provide this information. $ perf stat -x, -e cache-misses:u -- true 4282,,cache-misses:u,492859,100.00,, I found no way to retrieve this information using the available events while using machine-readable output. This patch adds two new tool internal events 'user_time' and 'system_time', similarly to the already present 'duration_time' event. Both events use the already collected rusage information obtained by wait4 and tracked in the global ru_stats. Examples presenting cache-misses and rusage information in both human and machine-readable form: $ perf stat -e duration_time,user_time,system_time,cache-misses -- grep -q -r duration_time . Performance counter stats for 'grep -q -r duration_time .': 67,422,542 ns duration_time:u 50,517,000 ns user_time:u 16,839,000 ns system_time:u 30,937 cache-misses:u 0.067422542 seconds time elapsed 0.050517000 seconds user 0.016839000 seconds sys $ perf stat -x, -e duration_time,user_time,system_time,cache-misses -- grep -q -r duration_time . 72134524,ns,duration_time:u,72134524,100.00,, 65225000,ns,user_time:u,65225000,100.00,, 6865000,ns,system_time:u,6865000,100.00,, 38705,,cache-misses:u,71189328,100.00,, Signed-off-by: Florian Fischer <florian.fischer@muhq.space> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220420102354.468173-3-florian.fischer@muhq.space Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-20perf stat: Introduce stats for the user and system rusage timesFlorian Fischer
This is preparation for exporting rusage values as tool events. Add new global stats tracking the values obtained via rusage. For now only ru_utime and ru_stime are part of the tracked stats. Both are stored as nanoseconds to be consistent with 'duration_time', although the finest resolution the struct timeval data in rusage provides are microseconds. Signed-off-by: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Link: https://lore.kernel.org/r/20220420102354.468173-2-florian.fischer@muhq.space Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-20perf tools: Print warning when HAVE_DEBUGINFOD_SUPPORT is not set and user ↵Martin Liška
tries to use debuginfod support When one requests debuginfod, either via --debuginfod option, or with a perf-config value, complain when perf is not built with it. Signed-off-by: Martin Liška <mliska@suse.cz> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/35bae747-3951-dc3d-a66b-abf4cebcd9cb@suse.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18perf mem: Print memory operation typeLeo Yan
The memory operation types are not only for load and store, for easier reviewing the memory operation type, this patch prints out it. Before: ls 14753 [011] 3678.072400: 1 l1d-miss: 88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 l1d-access: 88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 tlb-access: 88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 memory: 88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) After: ls 14753 [011] 3678.072400: 1 l1d-miss: 88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 l1d-access: 88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 tlb-access: 88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) ls 14753 [011] 3678.072400: 1 memory: 88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms]) Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220417124524.901148-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-14perf bench: Fix numa testcase to check if CPU used to bind task is onlineAthira Rajeev
Perf numa bench test fails with error: Testcase: ./perf bench numa mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk Failure snippet: <<>> Running 'numa/mem' benchmark: # Running main, "perf bench numa numa-mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk" perf: bench/numa.c:333: bind_to_cpumask: Assertion `!(ret)' failed. <<>> The Testcases uses CPU's 0 and 8. In function "parse_setup_cpu_list", There is check to see if cpu number is greater than max cpu's possible in the system ie via "if (bind_cpu_0 >= g->p.nr_cpus || bind_cpu_1 >= g->p.nr_cpus) {". But it could happen that system has say 48 CPU's, but only number of online CPU's is 0-7. Other CPU's are offlined. Since "g->p.nr_cpus" is 48, so function will go ahead and set bit for CPU 8 also in cpumask ( td->bind_cpumask). bind_to_cpumask function is called to set affinity using sched_setaffinity and the cpumask. Since the CPU8 is not present, set affinity will fail here with EINVAL. Fix this issue by adding a check to make sure that, CPU's provided in the input argument values are online before proceeding further and skip the test. For this, include new helper function "is_cpu_online" in "tools/perf/util/header.c". Since "BIT(x)" definition will get included from header.h, remove that from bench/numa.c Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20220412164059.42654-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-13perf stat: Fix error check return value of hashmap__new(), must use IS_ERR()Lv Ruyi
hashmap__new() returns ERR_PTR(-ENOMEM) when it fails, so we should use IS_ERR() to check it in error handling path. Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220413093302.2538128-1-lv.ruyi@zte.com.cn Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-13perf tools: Fix misleading add event PMU debug messageAdrian Hunter
Fix incorrect debug message: Attempting to add event pmu 'intel_pt' with '' that may result in non-fatal errors which always appears with perf record -vv and intel_pt e.g. perf record -vv -e intel_pt//u uname The message is incorrect because there will never be non-fatal errors. Suppress the message if the PMU is 'selectable' i.e. meant to be selected directly as an event. Fixes: 4ac22b484d4c79e8 ("perf parse-events: Make add PMU verbose output clearer") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lore.kernel.org/lkml/20220411061758.2458417-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-11perf test: Shell - Limit to only run executable scripts in testsCarsten Haitzler
'perf test''s shell runner will just run everything in the tests directory (as long as it's not another directory or does not begin with a dot), but sometimes you find files in there that are not shell scripts - perf.data output for example if you do some testing and then the next time you run perf test it tries to run these. Check the files are executable so they are actually intended to be test scripts and not just some "random junk" files there. Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com> Reviewed-by: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Link: http://lore.kernel.org/lkml/20220309122859.31487-1-carsten.haitzler@foss.arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-11perf scripting python: Expose symbol offset and source informationEelco Chaudron
This change adds the symbol offset to the data exported for each call-chain entry. This can not be calculated from the script and only the ip value, and no related mmap information. In addition, also export the source file and line information, if available, to avoid an external lookup if this information is needed. Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/164554263724.752731.14651017093796049736.stgit@wsfd-netdev64.ntdv.lab.eng.bos.redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-11perf jitdump: Add riscv64 supportEric Lin
This patch enables perf jitdump for riscv64 and was tested with V8 on qemu rv64. Qemu rv64: $ perf record -e cpu-clock -c 1000 -g -k mono ./d8_rv64 --perf-prof --no-write-protect-code-memory test.js $ perf inject -j -i perf.data -o perf.data.jitted $ perf report -i perf.data.jitted Output: To display the perf.data header info, please use --header/--header-only options. Total Lost Samples: 0 Samples: 87K of event 'cpu-clock' Event count (approx.): 87974000 Children Self Command Shared Object Symbol .... 0.28% 0.06% d8_rv64 d8_rv64 [.] _ZN2v88internal6WasmJs7InstallEPNS0_7IsolateEb 0.28% 0.00% d8_rv64 d8_rv64 [.] _ZN2v88internal10ParserBaseINS0_6ParserEE22ParseLogicalExpressionEv 0.28% 0.03% d8_rv64 jitted-112-76.so [.] Builtin:InterpreterEntryTrampoline 0.12% 0.00% d8_rv64 d8_rv64 [.] _ZN2v88internal19ContextDeserializer11DeserializeEPNS0_7IsolateENS0_6HandleINS0_13JSGlobalProxyEEENS_33DeserializeInternalFieldsCallbackE 0.12% 0.01% d8_rv64 jitted-112-651.so [.] Builtin:CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit .... Signed-off-by: Eric Lin <eric.lin@sifive.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: greentime.hu@sifive.com Cc: linux-riscv@lists.infradead.org Link: http://lore.kernel.org/lkml/20220406142606.18464-2-eric.lin@sifive.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>