summaryrefslogtreecommitdiff
path: root/tools/perf
AgeCommit message (Collapse)Author
2025-02-08perf test: Skip syscall enum test if no landlock syscallNamhyung Kim
[ Upstream commit 72d81e10628be6a948463259cbb6d3b670b20054 ] The perf trace enum augmentation test specifically targets landlock_ add_rule syscall but IIUC it's an optional and can be opt-out by a kernel config. Currently trace_landlock() runs `perf test -w landlock` before the actual testing to check the availability but it's not enough since the workload always returns 0. Instead it could check if perf trace output has 'landlock' string. Fixes: d66763fed30f0bd8c ("perf test trace_btf_enum: Add regression test for the BTF augmentation of enums in 'perf trace'") Reviewed-by: Howard Chu <howardchu95@gmail.com> Link: https://lore.kernel.org/r/20250128170629.1251574-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf trace: Fix runtime error of index out of boundsHoward Chu
[ Upstream commit c7b87ce0dd10b64b68a0b22cb83bbd556e28fe81 ] libtraceevent parses and returns an array of argument fields, sometimes larger than RAW_SYSCALL_ARGS_NUM (6) because it includes "__syscall_nr", idx will traverse to index 6 (7th element) whereas sc->fmt->arg holds 6 elements max, creating an out-of-bounds access. This runtime error is found by UBsan. The error message: $ sudo UBSAN_OPTIONS=print_stacktrace=1 ./perf trace -a --max-events=1 builtin-trace.c:1966:35: runtime error: index 6 out of bounds for type 'syscall_arg_fmt [6]' #0 0x5c04956be5fe in syscall__alloc_arg_fmts /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:1966 #1 0x5c04956c0510 in trace__read_syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2110 #2 0x5c04956c372b in trace__syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2436 #3 0x5c04956d2f39 in trace__init_syscalls_bpf_prog_array_maps /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:3897 #4 0x5c04956d6d25 in trace__run /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:4335 #5 0x5c04956e112e in cmd_trace /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:5502 #6 0x5c04956eda7d in run_builtin /home/howard/hw/linux-perf/tools/perf/perf.c:351 #7 0x5c04956ee0a8 in handle_internal_command /home/howard/hw/linux-perf/tools/perf/perf.c:404 #8 0x5c04956ee37f in run_argv /home/howard/hw/linux-perf/tools/perf/perf.c:448 #9 0x5c04956ee8e9 in main /home/howard/hw/linux-perf/tools/perf/perf.c:556 #10 0x79eb3622a3b7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 #11 0x79eb3622a47a in __libc_start_main_impl ../csu/libc-start.c:360 #12 0x5c04955422d4 in _start (/home/howard/hw/linux-perf/tools/perf/perf+0x4e02d4) (BuildId: 5b6cab2d59e96a4341741765ad6914a4d784dbc6) 0.000 ( 0.014 ms): Chrome_ChildIO/117244 write(fd: 238, buf: !, count: 1) = 1 Fixes: 5e58fcfaf4c6 ("perf trace: Allow allocating sc->arg_fmt even without the syscall tracepoint") Signed-off-by: Howard Chu <howardchu95@gmail.com> Link: https://lore.kernel.org/r/20250122025519.361873-1-howardchu95@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf annotate: Use an array for the disassembler preferenceIan Rogers
[ Upstream commit bde4ccfd5ab5361490514fc4af7497989cfbee17 ] Prior to this change a string was used which could cause issues with an unrecognized disassembler in symbol__disassembler. Change to initializing an array of perf_disassembler enum values. If a value already exists then adding it a second time is ignored to avoid array out of bounds problems present in the previous code, it also allows a statically sized array and removes memory allocation needs. Errors in the disassembler string are reported when the config is parsed during perf annotate or perf top start up. If the array is uninitialized after processing the config file the default llvm, capstone then objdump values are added but without a need to parse a string. Fixes: a6e8a58de629 ("perf disasm: Allow configuring what disassemblers to use") Closes: https://lore.kernel.org/lkml/CAP-5=fUdfCyxmEiTpzS2uumUp3-SyQOseX2xZo81-dQtWXj6vA@mail.gmail.com/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250124043856.1177264-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf trace: Fix BPF loading failure (-E2BIG)Howard Chu
[ Upstream commit 013eb043f37bd87c4d60d51034401a5a6d105bcf ] As reported by Namhyung Kim and acknowledged by Qiao Zhao (link: https://lore.kernel.org/linux-perf-users/20241206001436.1947528-1-namhyung@kernel.org/), on certain machines, perf trace failed to load the BPF program into the kernel. The verifier runs perf trace's BPF program for up to 1 million instructions, returning an E2BIG error, whereas the perf trace BPF program should be much less complex than that. This patch aims to fix the issue described above. The E2BIG problem from clang-15 to clang-16 is cause by this line: } else if (size < 0 && size >= -6) { /* buffer */ Specifically this check: size < 0. seems like clang generates a cool optimization to this sign check that breaks things. Making 'size' s64, and use } else if ((int)size < 0 && size >= -6) { /* buffer */ Solves the problem. This is some Hogwarts magic. And the unbounded access of clang-12 and clang-14 (clang-13 works this time) is fixed by making variable 'aug_size' s64. As for this: -if (aug_size > TRACE_AUG_MAX_BUF) - aug_size = TRACE_AUG_MAX_BUF; +aug_size = args->args[index] > TRACE_AUG_MAX_BUF ? TRACE_AUG_MAX_BUF : args->args[index]; This makes the BPF skel generated by clang-18 work. Yes, new clangs introduce problems too. Sorry, I only know that it works, but I don't know how it works. I'm not an expert in the BPF verifier. I really hope this is not a kernel version issue, as that would make the test case (kernel_nr) * (clang_nr), a true horror story. I will test it on more kernel versions in the future. Fixes: 395d38419f18: ("perf trace augmented_raw_syscalls: Add more check s to pass the verifier") Reported-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241213023047.541218-1-howardchu95@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf lock: Fix parse_lock_type which only retrieve one lock flagChun-Tse Shao
[ Upstream commit 1be9264158ef4818393e5d8144887a1a5d3cc480 ] `parse_lock_type` can only add the first lock flag in `lock_type_table` given input `str`. For example, for `Y rwlock`, it only adds `rwlock:R` into this perf session. Another example is for `-Y mutex`, it only adds the mutex without `LCB_F_SPIN` flag. The patch fixes this issue, makes sure both `rwlock:R` and `rwlock:W` will be added with `-Y rwlock`, and so on. Testing: $ ./perf lock con -ab -Y mutex,rwlock -- perf bench sched pipe # Running 'sched/pipe' benchmark: # Executed 1000000 pipe operations between two processes Total time: 9.313 [sec] 9.313976 usecs/op 107365 ops/sec contended total wait max wait avg wait type caller 176 1.65 ms 19.43 us 9.38 us mutex pipe_read+0x57 34 180.14 us 10.93 us 5.30 us mutex pipe_write+0x50 7 77.48 us 16.09 us 11.07 us mutex do_epoll_wait+0x24d 7 74.70 us 13.50 us 10.67 us mutex do_epoll_wait+0x24d 3 35.97 us 14.44 us 11.99 us rwlock:W ep_done_scan+0x2d 3 35.00 us 12.23 us 11.66 us rwlock:W do_epoll_wait+0x255 2 15.88 us 11.96 us 7.94 us rwlock:W do_epoll_wait+0x47c 1 15.23 us 15.23 us 15.23 us rwlock:W do_epoll_wait+0x4d0 1 14.26 us 14.26 us 14.26 us rwlock:W ep_done_scan+0x2d 2 14.00 us 7.99 us 7.00 us mutex pipe_read+0x282 1 12.29 us 12.29 us 12.29 us rwlock:R ep_poll_callback+0x35 1 12.02 us 12.02 us 12.02 us rwlock:W do_epoll_ctl+0xb65 1 10.25 us 10.25 us 10.25 us rwlock:R ep_poll_callback+0x35 1 7.86 us 7.86 us 7.86 us mutex do_epoll_ctl+0x6c1 1 5.04 us 5.04 us 5.04 us mutex do_epoll_ctl+0x3d4 [namhyung: Add a comment and rename to 'mutex:spin' for consistency Fixes: d783ea8f62c4 ("perf lock contention: Simplify parse_lock_type()") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Cc: nick.forrington@arm.com Link: https://lore.kernel.org/r/20250116235838.2769691-1-ctshao@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf inject: Fix use without initialization of local variablesIan Rogers
[ Upstream commit 8e246a1b2a75e187c7d22c9aec4299057f87d19e ] Local variables were missing initialization and command line processing didn't provide default values. Fixes: 64eed019f3fce124 ("perf inject: Lazy build-id mmap2 event insertion") Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241211060831.806539-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf test stat: Avoid hybrid assumption when virtualizedIan Rogers
[ Upstream commit f9c506fb69bdcfb9d7138281378129ff037f2aa1 ] The cycles event will fallback to task-clock in the hybrid test when running virtualized. Change the test to not fail for this. Fixes: 65d11821910bd910 ("perf test: Add a test for default perf stat command") Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241212173354.9860-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf report: Fix misleading help message about --demangleJiachen Zhang
[ Upstream commit ac0ac75189a4d6a29a2765a7adbb62bc6cc650c7 ] The wrong help message may mislead users. This commit fixes it. Fixes: 328ccdace8855289 ("perf report: Add --no-demangle option") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Jiachen Zhang <me@jcix.top> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250109152220.1869581-1-me@jcix.top Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf MANIFEST: Add arch/*/include/uapi/asm/bpf_perf_event.h to the perf tarballArnaldo Carvalho de Melo
[ Upstream commit 74c033b6aa650ea7280221a9e57b7318a120978c ] Needed to build tools/lib/bpf/ on various arches other than x86_64, notably arm64 when using the perf tarballs generated by: $ make help | grep perf- perf-tar-src-pkg - Build the perf source tarball with no compression perf-targz-src-pkg - Build the perf source tarball with gzip compression perf-tarbz2-src-pkg - Build the perf source tarball with bz2 compression perf-tarxz-src-pkg - Build the perf source tarball with xz compression perf-tarzst-src-pkg - Build the perf source tarball with zst compression $ Building with BPF support was opt-in in perf for a long time, and testing it via the tarball main kernel Makefile targets in an architecture other than x86_64 was an odd case. I had noticed this at some point earlier this year while cross building perf to some arches, including arm64, but it fell thru the cracks, see the Link tag below. Fix it now by adding those arch/*/include/uapi/asm/bpf_perf_event.h files to the MANIFEST file used in building the perf source tarball. Tested with: perfbuilder@number:~$ time dm debian:experimental-x-arm64 1 21.60 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 14.1.0-5) 14.1.0 flex 2.6.4 BUILD_TARBALL_HEAD=d31a974f6edc576f84c35be9526fec549a3b3520 $ $ git log --oneline -1 d31a974f6edc576f84c35be9526fec549a3b3520 d31a974f6edc576f (HEAD -> perf-tools-next) perf MANIFEST: Add arch/*/include/uapi/asm/bpf_perf_event.h to the perf tarball $ That was previously failing: perfbuilder@number:~$ grep debian:experimental-x-arm64 dm.log.old/summary 19 4.80 debian:experimental-x-arm64 : FAIL gcc version 14.1.0 (Debian 14.1.0-5) $ perfbuilder@number:~$ grep -B6 'Error 1' dm.log.old/debian:experimental-x-arm64 In file included from /git/perf-6.12.0-rc6/tools/include/uapi/linux/bpf_perf_event.h:11, from libbpf.c:36: /git/perf-6.12.0-rc6/tools/include/uapi/asm/bpf_perf_event.h:2:10: fatal error: ../../arch/arm64/include/uapi/asm/bpf_perf_event.h: No such file or directory 2 | #include "../../arch/arm64/include/uapi/asm/bpf_perf_event.h" | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ compilation terminated. make[4]: *** [/git/perf-6.12.0-rc6/tools/build/Makefile.build:105: /tmp/build/perf/libbpf/staticobjs/libbpf.o] Error 1 perfbuilder@number:~$ Closes: https://lore.kernel.org/all/Z0UNRCRYKunbDYxP@hyperscale.parallels Fixes: 9eea8fafe33eb708 ("libbpf: fix __arg_ctx type enforcement for perf_event programs") Reported-by: Michel Lind <michel@michel-slm.name> Tested-by: Michel Lind <michel@michel-slm.name> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: 317c11923cf676437456e44a7f408d4ce589a9c0.camel@michel-slm.name Link: https://lore.kernel.org/bpf/ZfyEgoG3JFiOs2Fs@x1/ Link: https://lore.kernel.org/r/Z0Yy5u42Q1hWoEzz@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf symbol: Prefer non-label symbols with same addressNamhyung Kim
[ Upstream commit 8c2eafbbfd782d6ad270ca2de21b529ac57de0f4 ] When there are more than one symbols at the same address, it needs to choose which one is better. In choose_best_symbol() it didn't check the type of symbols. It's possible to have labels in other symbols and in that case, it would be better to pick the actual symbol over the labels. To minimize the possible impact on other symbols, I only check NOTYPE symbols specifically. $ readelf -sW vmlinux | grep -e __do_softirq -e __softirqentry_text_start 105089: ffffffff82000000 814 FUNC GLOBAL DEFAULT 1 __do_softirq 111954: ffffffff82000000 0 NOTYPE GLOBAL DEFAULT 1 __softirqentry_text_start The commit 77b004f4c5c3c90b tried to do the same by not giving the size to the label symbols but it seems there's some label-only symbols in asm code. Let's restore the original code and choose the right symbol using type of the symbols. Fixes: 77b004f4c5c3c90b ("perf symbol: Do not fixup end address of labels") Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Link: http://lore.kernel.org/lkml/Z3b-DqBMnNb4ucEm@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf namespaces: Fixup the nsinfo__in_pidns() return type, its boolArnaldo Carvalho de Melo
[ Upstream commit 64a7617efd5ae1d57a75e464d7134eec947c3fe3 ] When adding support for refconunt checking a cut'n'paste made this function, that is just an accessor to a bool member of 'struct nsinfo', return a pid_t, when that member is a boolean, fix it. Fixes: bcaf0a97858de7ab ("perf namespaces: Add functions to access nsinfo") Reported-by: Francesco Nigro <fnigro@redhat.com> Reported-by: Ilan Green <igreen@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Clark Williams <williams@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Link: https://lore.kernel.org/r/20241206204828.507527-6-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf namespaces: Introduce nsinfo__set_in_pidns()Arnaldo Carvalho de Melo
[ Upstream commit 9c6a585d257f6845731f4e36b45fe42b5c3162f5 ] When we're processing a perf.data file we will, for every thread in that file do a machine__findnew_thread(machine, pid, tid) that when that pid is seen for the first time will create a 'struct thread' representing it. That in turn will call nsinfo__new() -> nsinfo__init() and there it will assume we're running live, which is wrong and will need to be addressed in a followup patch. The nsinfo__new() assumes that if we can't access that thread it has already finished and will ignore the -1 return from nsinfo__init(), just taking notes to avoid trying to enter in that namespace, since it isn't there anymore, a race. When doing this from 'perf inject', tho, we can fill in parts of that nsinfo from what we get from the PERF_RECORD_MMAP2 (pid, tid) and in the jitdump file name, that has the form of jit-<PID>.dump. So if the pid in the jitdump file name is not the one in the PERF_RECORD_MMAP2, we can assume that its the pid of the process _inside_ the namespace, and that perf was runing outside that namespace. This will be done in the following patch. Reported-by: Francesco Nigro <fnigro@redhat.com> Reported-by: Ilan Green <igreen@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Clark Williams <williams@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io> Link: https://lore.kernel.org/r/20241206204828.507527-4-acme@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Stable-dep-of: 64a7617efd5a ("perf namespaces: Fixup the nsinfo__in_pidns() return type, its bool") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf machine: Don't ignore _etext when not a text symbolChristophe Leroy
[ Upstream commit 7a93786c306296f15e728b1dbd949a319e4e3d19 ] Depending on how vmlinux.lds is written, _etext might be the very first data symbol instead of the very last text symbol. Don't require it to be a text symbol, accept any symbol type. Comitter notes: See the first Link for further discussion, but it all boils down to this: --- # grep -e _stext -e _etext -e _edata /proc/kallsyms c0000000 T _stext c08b8000 D _etext So there is no _edata and _etext is not text $ ppc-linux-objdump -x vmlinux | grep -e _stext -e _etext -e _edata c0000000 g .head.text 00000000 _stext c08b8000 g .rodata 00000000 _etext c1378000 g .sbss 00000000 _edata --- Fixes: ed9adb2035b5be58 ("perf machine: Read also the end of the kernel") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/r/b3ee1994d95257cb7f2de037c5030ba7d1bed404.1736327613.git.christophe.leroy@csgroup.eu Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf maps: Fix display of kernel symbolsChristophe Leroy
[ Upstream commit dae29277fddaaf6670d17dfcbb916a2ca29c912f ] Since commit 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses"), perf doesn't display anymore kernel symbols on powerpc, allthough it still detects them as kernel addresses. # Overhead Command Shared Object Symbol # ........ .......... ............. ...................................... # 80.49% Coeur main [unknown] [k] 0xc005f0f8 3.91% Coeur main gau [.] engine_loop.constprop.0.isra.0 1.72% Coeur main [unknown] [k] 0xc005f11c 1.09% Coeur main [unknown] [k] 0xc01f82c8 0.44% Coeur main libc.so.6 [.] epoll_wait 0.38% Coeur main [unknown] [k] 0xc0011718 0.36% Coeur main [unknown] [k] 0xc01f45c0 This is because function maps__find_next_entry() now returns current entry instead of next entry, leading to kernel map end address getting mis-configured with its own start address instead of the start address of the following map. Fix it by really taking the next entry, also make sure that entry follows current one by making sure entries are sorted. Fixes: 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses") Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/2ea4501209d5363bac71a6757fe91c0747558a42.1736329923.git.christophe.leroy@csgroup.eu Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf top: Don't complain about lack of vmlinux when not resolving some ↵Arnaldo Carvalho de Melo
kernel samples [ Upstream commit 058b38ccd2af9e5c95590b018e8425fa148d7aca ] Recently we got a case where a kernel sample wasn't being resolved due to a bug that was not setting the end address on kernel functions implemented in assembly (see Link: tag), and then those were not being found by machine__resolve() -> map__find_symbol(). So we ended up with: # perf top --stdio PerfTop: 0 irqs/s kernel: 0% exact: 0% lost: 0/0 drop: 0/0 [cycles/P] ----------------------------------------------------------------------- Warning: A vmlinux file was not found. Kernel samples will not be resolved. ^Z [1]+ Stopped perf top --stdio # But then resolving all other kernel symbols. So just fixup the logic to only print that warning when there are no symbols in the kernel map. Fixes: d88205db9caa0e9d ("perf dso: Add dso__has_symbols() method") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/lkml/Z3buKhcCsZi3_aGb@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf stat: Fix trailing comma when there is no metric unitJames Clark
[ Upstream commit 967364894e61b15819a0c11231512ecd5a46b503 ] Now that printing metric-value and metric-unit is optional, print_running_json() shouldn't add the comma in case it becomes trailing. Replace all manual JSON comma stuff with a json_out() function that uses the existing os->first tracking and auto inserts a comma if it's needed. Update the test to handle that two of the fields can be missing. This fixes the following test failure on Cortex A57 where the branch misses metric is missing a required event: $ perf test -vvv "json output" 106: perf stat JSON output linter: --- start --- test child forked, pid 665682 Checking json output: no args Test failed for input: {"counter-value" : "3112.000000", "unit" : "", "event" : "armv8_pmuv3_1/branch-misses/", "event-runtime" : 20699340, "pcnt-running" : 100.00, } ... json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 12 column 144 (char 2109) ---- end(-1) ---- 106: perf stat JSON output linter : FAILED! Fixes: e1cc918b6cfd1206 ("perf stat: Drop metric-unit if unit is NULL") Signed-off-by: James Clark <james.clark@linaro.org> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241112160048.951213-2-james.clark@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf expr: Initialize is_test value in expr__ctx_new()Levi Yun
[ Upstream commit 1d18ebcfd302a2005b83ae5f13df223894d19902 ] When expr_parse_ctx is allocated by expr_ctx_new(), expr_scanner_ctx->is_test isn't initialize, so it has garbage value. this can affects the result of expr__parse() return when it parses non-exist event literal according to garbage value. Use calloc instead of malloc in expr_ctx_new() to fix this. Fixes: 3340a08354ac286e ("perf pmu-events: Fix testing with JEVENTS_ARCH=all") Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Levi Yun <yeoreum.yun@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241108143424.819126-1-yeoreum.yun@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf bpf: Fix two memory leakages when calling perf_env__insert_bpf_prog_info()Zhongqiu Han
[ Upstream commit 03edb7020bb920f1935c3f30acad0bb27fdb99af ] If perf_env__insert_bpf_prog_info() returns false due to a duplicate bpf prog info node insertion, the temporary info_node and info_linear memory will leak. Add a check to ensure the memory is freed if the function returns false. Fixes: d56354dc49091e33 ("perf tools: Save bpf_prog_info and BTF of new BPF programs") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241205084500.823660-4-quic_zhonhan@quicinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf header: Fix one memory leakage in process_bpf_prog_info()Zhongqiu Han
[ Upstream commit a7da6c7030e1aec32f0a41c7b4fa70ec96042019 ] Function __perf_env__insert_bpf_prog_info() will return without inserting bpf prog info node into perf env again due to a duplicate bpf prog info node insertion, causing the temporary info_linear and info_node memory to leak. Modify the return type of this function to bool and add a check to ensure the memory is freed if the function returns false. Fixes: 606f972b1361f477 ("perf bpf: Save bpf_prog_info information as headers to perf.data") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241205084500.823660-3-quic_zhonhan@quicinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08perf header: Fix one memory leakage in process_bpf_btf()Zhongqiu Han
[ Upstream commit 875d22980a062521beed7b5df71fb13a1af15d83 ] If __perf_env__insert_btf() returns false due to a duplicate btf node insertion, the temporary node will leak. Add a check to ensure the memory is freed if the function returns false. Fixes: a70a1123174ab592 ("perf bpf: Save BTF information as headers to perf.data") Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20241205084500.823660-2-quic_zhonhan@quicinc.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-02-08tools features: Don't check for libunwind devel files by defaultArnaldo Carvalho de Melo
[ Upstream commit 176c9d1e6a06f2fa62c1b9743369ab35c724d2c4 ] Since 13e17c9ff49119aa ("perf build: Make libunwind opt-in rather than opt-out"), so we shouldn't by default be testing for its availability at build time in tools/build/features/test-all.c. That test was designed to test the features we expect to be the most common ones in most builds, so if we test build just that file, then we assume the features there are present and will not test one by one. Removing it from test-all.c gets rid of the first impediment for test-all.c to build successfully: $ cat /tmp/build/perf-tools-next/feature/test-all.make.output In file included from test-all.c:62: test-libunwind.c:2:10: fatal error: libunwind.h: No such file or directory 2 | #include <libunwind.h> | ^~~~~~~~~~~~~ compilation terminated. $ We then get to: $ cat /tmp/build/perf-tools-next/feature/test-all.make.output /usr/bin/ld: cannot find -lunwind-x86_64: No such file or directory /usr/bin/ld: cannot find -lunwind: No such file or directory collect2: error: ld returned 1 exit status $ So make all the logic related to setting CFLAGS, LDFLAGS, etc for libunwind to be conditional on NO_LIBWUNWIND=1, which is now the default, now we get a faster build: $ cat /tmp/build/perf-tools-next/feature/test-all.make.output $ ldd /tmp/build/perf-tools-next/feature/test-all.bin linux-vdso.so.1 (0x00007fef04cde000) libdw.so.1 => /lib64/libdw.so.1 (0x00007fef04a49000) libpython3.12.so.1.0 => /lib64/libpython3.12.so.1.0 (0x00007fef04478000) libm.so.6 => /lib64/libm.so.6 (0x00007fef04394000) libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007fef0436c000) libtracefs.so.1 => /lib64/libtracefs.so.1 (0x00007fef04345000) libcrypto.so.3 => /lib64/libcrypto.so.3 (0x00007fef03e95000) libz.so.1 => /lib64/libz.so.1 (0x00007fef03e72000) libelf.so.1 => /lib64/libelf.so.1 (0x00007fef03e56000) libnuma.so.1 => /lib64/libnuma.so.1 (0x00007fef03e48000) libslang.so.2 => /lib64/libslang.so.2 (0x00007fef03b65000) libperl.so.5.38 => /lib64/libperl.so.5.38 (0x00007fef037c6000) libc.so.6 => /lib64/libc.so.6 (0x00007fef035d5000) liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fef035a0000) libzstd.so.1 => /lib64/libzstd.so.1 (0x00007fef034e1000) libbz2.so.1 => /lib64/libbz2.so.1 (0x00007fef034cd000) /lib64/ld-linux-x86-64.so.2 (0x00007fef04ce0000) libcrypt.so.2 => /lib64/libcrypt.so.2 (0x00007fef03495000) $ Fixes: 13e17c9ff49119aa ("perf build: Make libunwind opt-in rather than opt-out") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Z09zTztD8X8qIWCX@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-11perf probe: Fix uninitialized variableJames Clark
Since the linked fixes: commit, err is returned uninitialized due to the removal of "return 0". Initialize err to fix it. This fixes the following intermittent test failure on release builds: $ perf test "testsuite_probe" ... -- [ FAIL ] -- perf_probe :: test_invalid_options :: mutually exclusive options :: -L foo -V bar (output regexp parsing) Regexp not found: \"Error: switch .+ cannot be used with switch .+\" ... Fixes: 080e47b2a237 ("perf probe: Introduce quotation marks support") Tested-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241211085525.519458-2-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-11perf test expr: Fix system_tsc_freq for only x86Ian Rogers
The refactoring of tool PMU events to have a PMU then adding the expr literals to the tool PMU made it so that the literal system_tsc_freq was only supported on x86. Update the test expectations to match - namely the parsing is x86 specific and only yields a non-zero value on Intel. Fixes: 609aa2667f67 ("perf tool_pmu: Switch to standard pmu functions and json descriptions") Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Closes: https://lore.kernel.org/linux-perf-users/20241022140156.98854-1-atrajeev@linux.vnet.ibm.com/ Co-developed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: akanksha@linux.ibm.com Cc: hbathini@linux.ibm.com Cc: kjain@linux.ibm.com Cc: maddy@linux.ibm.com Cc: disgoel@linux.vnet.ibm.com Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20241205022305.158202-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-09perf test hwmon_pmu: Fix event file locationIan Rogers
The temp directory is made and a known fake hwmon PMU created within it. Prior to this fix the events were being incorrectly written to the temp directory rather than the fake PMU directory. This didn't impact the test as the directory fd matched the wrong location, but it doesn't mirror what a hwmon PMU would actually look like. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20241206042306.1055913-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-09perf hwmon_pmu: Use openat rather than dup to refresh directoryIan Rogers
The hwmon PMU test will make a temp directory, open the directory with O_DIRECTORY then fill it with contents. As the open is before the filling the contents the later fdopendir may reflect the initial empty state, meaning no events are seen. Change to re-open the directory, rather than dup the fd, so the latest contents are seen. Minor tweaks/additions to debug messages. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Link: https://lore.kernel.org/r/20241206042306.1055913-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-09perf ftrace: Fix undefined behavior in cmp_profile_data()Kuan-Wei Chiu
The comparison function cmp_profile_data() violates the C standard's requirements for qsort() comparison functions, which mandate symmetry and transitivity: * Symmetry: If x < y, then y > x. * Transitivity: If x < y and y < z, then x < z. When v1 and v2 are equal, the function incorrectly returns 1, breaking symmetry and transitivity. This causes undefined behavior, which can lead to memory corruption in certain versions of glibc [1]. Fix the issue by returning 0 when v1 and v2 are equal, ensuring compliance with the C standard and preventing undefined behavior. Link: https://www.qualys.com/2024/01/30/qsort.txt [1] Fixes: 0f223813edd0 ("perf ftrace: Add 'profile' command") Fixes: 74ae366c37b7 ("perf ftrace profile: Add -s/--sort option") Cc: stable@vger.kernel.org Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: jserv@ccns.ncku.edu.tw Cc: chuang@cs.nycu.edu.tw Link: https://lore.kernel.org/r/20241209134226.1939163-1-visitorckw@gmail.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-05perf tools: Fix precise_ip fallback logicNamhyung Kim
Sometimes it returns other than EOPNOTSUPP for invalid precise_ip so it cannot check the error code. Let's move the fallback after the missing feature checks so that it can handle EINVAL as well. This also aligns well with the existing behavior which blindly turns off the precise_ip but we check the missing features correctly now. Fixes: af954f76eea56453 ("perf tools: Check fallback error and order") Reported-by: kernel test robot <oliver.sang@intel.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Closes: https://lore.kernel.org/oe-lkp/202411301431.799e5531-lkp@intel.com Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/Z1DV0lN8qHSysX7f@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04perf tools: Fix build error on generated/fs_at_flags_array.cNamhyung Kim
It should only have generic flags in the array but the recent header sync brought a new flags to fcntl.h and caused a build error. Let's update the shell script to exclude flags specific to name_to_handle_at(). CC trace/beauty/fs_at_flags.o In file included from trace/beauty/fs_at_flags.c:21: tools/perf/trace/beauty/generated/fs_at_flags_array.c:13:30: error: initialized field overwritten [-Werror=override-init] 13 | [ilog2(0x002) + 1] = "HANDLE_CONNECTABLE", | ^~~~~~~~~~~~~~~~~~~~ tools/perf/trace/beauty/generated/fs_at_flags_array.c:13:30: note: (near initialization for ‘fs_at_flags[2]’) Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20241203035349.1901262-12-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04tools headers: Sync uapi/linux/prctl.h with the kernel sourcesNamhyung Kim
To pick up the changes in this cset: 09d6775f503b393d riscv: Add support for userspace pointer masking 91e102e79740ae43 prctl: arch-agnostic prctl for shadow stack This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/prctl.h include/uapi/linux/prctl.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Mark Brown <broonie@kernel.org> Cc: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20241203035349.1901262-11-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04tools headers: Sync uapi/linux/mount.h with the kernel sourcesNamhyung Kim
To pick up the changes in this cset: aefff51e1c2986e1 statmount: retrieve security mount options 2f4d4503e9e5ab76 statmount: add flag to retrieve unescaped options 44010543fc8bedad fs: add the ability for statmount() to report the sb_source ed9d95f691c29748 fs: add the ability for statmount() to report the fs_subtype This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/mount.h include/uapi/linux/mount.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20241203035349.1901262-10-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04tools headers: Sync uapi/linux/fcntl.h with the kernel sourcesNamhyung Kim
To pick up the changes in this cset: c374196b2b9f4b80 ("fs: name_to_handle_at() support for "explicit connectable" file handles") 95f567f81e43a1bc ("fs: Simplify getattr interface function checking AT_GETATTR_NOSEC flag") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/perf/trace/beauty/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Alexander Aring <alex.aring@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20241203035349.1901262-9-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04tools headers: Sync *xattrat syscall changes with the kernel sourcesNamhyung Kim
To pick up the changes in this cset: 6140be90ec70c39f ("fs/xattr: add *at family syscalls") This addresses these perf build warnings: Warning: Kernel ABI header differences: diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h diff -u tools/perf/arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_32.tbl diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl The arm64 changes are not included as it requires more changes in the tools. It'll be worked for the later cycle. Please see tools/include/uapi/README for further details. Reviewed-by: James Clark <james.clark@linaro.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Christian Brauner <brauner@kernel.org> CC: x86@kernel.org CC: linux-mips@vger.kernel.org CC: linuxppc-dev@lists.ozlabs.org CC: linux-s390@vger.kernel.org Link: https://lore.kernel.org/r/20241203035349.1901262-7-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-03perf machine: Initialize machine->env to address a segfaultArnaldo Carvalho de Melo
Its used from trace__run(), for the 'perf trace' live mode, i.e. its strace-like, non-perf.data file processing mode, the most common one. The trace__run() function will set trace->host using machine__new_host() that is supposed to give a machine instance representing the running machine, and since we'll use perf_env__arch_strerrno() to get the right errno -> string table, we need to use machine->env, so initialize it in machine__new_host(). Before the patch: (gdb) run trace --errno-summary -a sleep 1 <SNIP> Summary of events: gvfs-afc-volume (3187), 2 events, 0.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ pselect6 1 0 0.000 0.000 0.000 0.000 0.00% GUsbEventThread (3519), 2 events, 0.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ poll 1 0 0.000 0.000 0.000 0.000 0.00% <SNIP> Program received signal SIGSEGV, Segmentation fault. 0x00000000005caba0 in perf_env__arch_strerrno (env=0x0, err=110) at util/env.c:478 478 if (env->arch_strerrno == NULL) (gdb) bt #0 0x00000000005caba0 in perf_env__arch_strerrno (env=0x0, err=110) at util/env.c:478 #1 0x00000000004b75d2 in thread__dump_stats (ttrace=0x14f58f0, trace=0x7fffffffa5b0, fp=0x7ffff6ff74e0 <_IO_2_1_stderr_>) at builtin-trace.c:4673 #2 0x00000000004b78bf in trace__fprintf_thread (fp=0x7ffff6ff74e0 <_IO_2_1_stderr_>, thread=0x10fa0b0, trace=0x7fffffffa5b0) at builtin-trace.c:4708 #3 0x00000000004b7ad9 in trace__fprintf_thread_summary (trace=0x7fffffffa5b0, fp=0x7ffff6ff74e0 <_IO_2_1_stderr_>) at builtin-trace.c:4747 #4 0x00000000004b656e in trace__run (trace=0x7fffffffa5b0, argc=2, argv=0x7fffffffde60) at builtin-trace.c:4456 #5 0x00000000004ba43e in cmd_trace (argc=2, argv=0x7fffffffde60) at builtin-trace.c:5487 #6 0x00000000004c0414 in run_builtin (p=0xec3068 <commands+648>, argc=5, argv=0x7fffffffde60) at perf.c:351 #7 0x00000000004c06bb in handle_internal_command (argc=5, argv=0x7fffffffde60) at perf.c:404 #8 0x00000000004c0814 in run_argv (argcp=0x7fffffffdc4c, argv=0x7fffffffdc40) at perf.c:448 #9 0x00000000004c0b5d in main (argc=5, argv=0x7fffffffde60) at perf.c:560 (gdb) After: root@number:~# perf trace -a --errno-summary sleep 1 <SNIP> pw-data-loop (2685), 1410 events, 16.0% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ epoll_wait 188 0 983.428 0.000 5.231 15.595 8.68% ioctl 94 0 0.811 0.004 0.009 0.016 2.82% read 188 0 0.322 0.001 0.002 0.006 5.15% write 141 0 0.280 0.001 0.002 0.018 8.39% timerfd_settime 94 0 0.138 0.001 0.001 0.007 6.47% gnome-control-c (179406), 1848 events, 20.9% syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ poll 222 0 959.577 0.000 4.322 21.414 11.40% recvmsg 150 0 0.539 0.001 0.004 0.013 5.12% write 300 0 0.442 0.001 0.001 0.007 3.29% read 150 0 0.183 0.001 0.001 0.009 5.53% getpid 102 0 0.101 0.000 0.001 0.008 7.82% root@number:~# Fixes: 54373b5d53c1f6aa ("perf env: Introduce perf_env__arch_strerrno()") Reported-by: Veronika Molnarova <vmolnaro@redhat.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Veronika Molnarova <vmolnaro@redhat.com> Acked-by: Michael Petlan <mpetlan@redhat.com> Tested-by: Michael Petlan <mpetlan@redhat.com> Link: https://lore.kernel.org/r/Z0XffUgNSv_9OjOi@x1 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-02perf test: Don't signal all processes on system when interrupting testsJames Clark
This signal handler loops over all tests on ctrl-C, but it's active while the test list is being constructed. process.pid is 0, then -1, then finally set to the child pid on fork. If the Ctrl-C is received during this point a kill(-1, SIGINT) can be sent which affects all processes. Make sure the child has forked first before forwarding the signal. This can be reproduced with ctrl-C immediately after launching perf test which terminates the ssh connection. Fixes: 553d5efeb341 ("perf test: Add a signal handler to kill forked child processes") Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20241129151948.3199732-1-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-02perf tools: Fix build-id event recordingNamhyung Kim
The build-id events written at the end of the record session are broken due to unexpected data. The write_buildid() writes the fixed length event first and then variable length filename. But a recent change made it write more data in the padding area accidentally. So readers of the event see zero-filled data for the next entry and treat it incorrectly. This resulted in wrong kernel symbols because the kernel DSO loaded a random vmlinux image in the path as it didn't have a valid build-id. Fixes: ae39ba16554e ("perf inject: Fix build ID injection") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/Z0aRFFW9xMh3mqKB@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-26Merge tag 'perf-tools-for-v6.13-2024-11-24' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Namhyung Kim: "perf record: - Enable leader sampling for inherited task events. It was supported only for system-wide events but the kernel started to support such a setup since v6.12. This is to reduce the number of PMU interrupts. The samples of the leader event will contain counts of other events and no samples will be generated for the other member events. $ perf record -e '{cycles,instructions}:S' ${MYPROG} perf report: - Fix --branch-history option to display more branch-related information like prediction, abort and cycles which is available on Intel machines. $ perf record -bg -- perf test -w brstack $ perf report --branch-history ... # # Overhead Source:Line Symbol Shared Object Predicted Abort Cycles IPC [IPC Coverage] # ........ ........................ .............. .................... ......... ..... ...... .................... # 8.17% copy_page_64.S:19 [k] copy_page [kernel.kallsyms] 50.0% 0 5 - - | ---xas_load xarray.h:171 | |--5.68%--xas_load xarray.c:245 (cycles:1) | xas_load xarray.c:242 | xas_load xarray.h:1260 (cycles:1) | xas_descend xarray.c:146 | xas_load xarray.c:244 (cycles:2) | xas_load xarray.c:245 | xas_descend xarray.c:218 (cycles:10) ... perf stat: - Add HWMON PMU support. The HWMON provides various system information like CPU/GPU temperature, fan speed and so on. Expose them as PMU events so that users can see the values using perf stat commands. $ perf stat -e temp_cpu,fan1 true Performance counter stats for 'true': 60.00 'C temp_cpu 0 rpm fan1 0.000745382 seconds time elapsed 0.000883000 seconds user 0.000000000 seconds sys - Display metric threshold in JSON output. Some metrics define thresholds to classify value ranges. It used to be in a different color but it won't work for JSON. Add "metric-threshold" field to the JSON that can be one of "good", "less good", "nearly bad" and "bad". # perf stat -a -M TopdownL1 -j true {"counter-value" : "18693525.000000", "unit" : "", "event" : "TOPDOWN.SLOTS", "event-runtime" : 5552708, "pcnt-running" : 100.00, "metric-value" : "43.226002", "metric-unit" : "% tma_backend_bound", "metric-threshold" : "bad"} {"metric-value" : "29.212267", "metric-unit" : "% tma_frontend_bound", "metric-threshold" : "bad"} {"metric-value" : "7.138972", "metric-unit" : "% tma_bad_speculation", "metric-threshold" : "good"} {"metric-value" : "20.422759", "metric-unit" : "% tma_retiring", "metric-threshold" : "good"} {"counter-value" : "3817732.000000", "unit" : "", "event" : "topdown-retiring", "event-runtime" : 5552708, "pcnt-running" : 100.00, } {"counter-value" : "5472824.000000", "unit" : "", "event" : "topdown-fe-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, } {"counter-value" : "7984780.000000", "unit" : "", "event" : "topdown-be-bound", "event-runtime" : 5552708, "pcnt-running" : 100.00, } {"counter-value" : "1418181.000000", "unit" : "", "event" : "topdown-bad-spec", "event-runtime" : 5552708, "pcnt-running" : 100.00, } ... perf sched: - Add -P/--pre-migrations option for 'timehist' sub-command to track time a task waited on a run-queue before migrating to a different CPU. $ perf sched timehist -P time cpu task name wait time sch delay run time pre-mig time [tid/pid] (msec) (msec) (msec) (msec) --------------- ------ ------------------------------ --------- --------- --------- --------- 585940.535527 [0000] perf[584885] 0.000 0.000 0.000 0.000 585940.535535 [0000] migration/0[20] 0.000 0.002 0.008 0.000 585940.535559 [0001] perf[584885] 0.000 0.000 0.000 0.000 585940.535563 [0001] migration/1[25] 0.000 0.001 0.004 0.000 585940.535678 [0002] perf[584885] 0.000 0.000 0.000 0.000 585940.535686 [0002] migration/2[31] 0.000 0.002 0.008 0.000 585940.535905 [0001] <idle> 0.000 0.000 0.342 0.000 585940.535938 [0003] perf[584885] 0.000 0.000 0.000 0.000 585940.537048 [0001] sleep[584886] 0.000 0.019 1.142 0.001 585940.537749 [0002] <idle> 0.000 0.000 2.062 0.000 ... Build: - Make libunwind opt-in (LIBUNWIND=1) rather than opt-out. The perf tools are generally built with libelf and libdw which has unwinder functionality. The libunwind support predates it and no need to have duplicate unwinders by default. - Rename NO_DWARF=1 build option to NO_LIBDW=1 in order to clarify it's using libdw for handling DWARF information. Internals: - Do not set exclude_guest bit in the perf_event_attr by default. This was causing a trouble in AMD IBS PMU as it doesn't support the bit. The bit will be set when it's needed later by the fallback logic. Also update the missing feature detection logic to make sure not clear supported bits unnecessarily. - Run perf test in parallel by default and mark flaky tests "exclusive" to run them serially at the end. Some test numbers are changed but the test can complete in less than half the time. JSON vendor events: - Add AMD Zen 5 events and metrics. - Add i.MX91 and i.MX95 DDR metrics - Fix HiSilicon HIP08 Topdown metric name. - Support compat events on PowerPC" * tag 'perf-tools-for-v6.13-2024-11-24' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (232 commits) perf tests: Fix hwmon parsing with PMU name test perf hwmon_pmu: Ensure hwmon key union is zeroed before use perf tests hwmon_pmu: Remove double evlist__delete() perf/test: fix perf ftrace test on s390 perf bpf-filter: Return -ENOMEM directly when pfi allocation fails perf test: Correct hwmon test PMU detection perf: Remove unused del_perf_probe_events() perf pmu: Move pmu_metrics_table__find and remove ARM override perf jevents: Add map_for_cpu() perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str perf header: Avoid transitive PMU includes perf arm64 header: Use cpu argument in get_cpuid perf header: Refactor get_cpuid to take a CPU for ARM perf header: Move is_cpu_online to numa bench perf jevents: fix breakage when do perf stat on system metric perf test: Add missing __exit calls in tool/hwmon tests perf tests: Make leader sampling test work without branch event perf util: Remove kernel version deadcode perf test shell trace_exit_race: Use --no-comm to avoid cases where COMM isn't resolved perf test shell trace_exit_race: Show what went wrong in verbose mode ...
2024-11-22perf tests: Fix hwmon parsing with PMU name testIan Rogers
Incorrectly the hwmon with PMU name test didn't pass "true". Fix and address issue with hwmon_pmu__config_terms needing to load events - a load bearing assert fired. Also fix missing list deletion when putting the hwmon test PMU and lower some debug warnings to make the hwmon PMU less spammy in verbose mode. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241121000955.536930-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-22perf hwmon_pmu: Ensure hwmon key union is zeroed before useIan Rogers
Non-zero values led to mismatches in testing. This was reproducible with -fsanitize=undefined. Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org> Closes: https://lore.kernel.org/lkml/Zzdtj0PEWEX3ATwL@x1/ Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241119230033.115369-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-22perf tests hwmon_pmu: Remove double evlist__delete()Arnaldo Carvalho de Melo
In the error path when failing to parse events the evlist is being deleted twice, keep the one after the out label. Fixes: 531ee0fd4836994f ("perf test: Add hwmon "PMU" test") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/ZzzoJNNcJJVnPCCe@x1 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-22perf/test: fix perf ftrace test on s390Thomas Richter
On s390 the perf test case ftrace sometimes fails as follows: # ./perf test ftrace 79: perf ftrace tests : FAILED! # The failure depends on the kernel .config file. Some configurations always work fine, some do not. The ftrace profile test mostly fails, because the ring buffer was not large enough, and some lines (especially the interesting ones with nanosleep in it) where dropped. To achieve success for all tested kernel configurations, enlarge the buffer to store the traces completely without wrapping. The default buffer size is too small for all kernel configurations. Set the buffer size of for the ftrace profile test to 16 MB. Output after: # ./perf test ftrace 79: perf ftrace tests : Ok # Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: agordeev@linux.ibm.com Cc: gor@linux.ibm.com Cc: hca@linux.ibm.com Cc: sumanthk@linux.ibm.com Link: https://lore.kernel.org/r/20241119064856.641446-1-tmricht@linux.ibm.com Suggested-by: Sven Schnelle <svens@linux.ibm.com> Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-22perf bpf-filter: Return -ENOMEM directly when pfi allocation failsHao Ge
Directly return -ENOMEM when pfi allocation fails, instead of performing other operations on pfi. Fixes: 0fe2b18ddc40 ("perf bpf-filter: Support multiple events properly") Signed-off-by: Hao Ge <gehao@kylinos.cn> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: hao.ge@linux.dev Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20241113030537.26732-1-hao.ge@linux.dev Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-22perf test: Correct hwmon test PMU detectionIan Rogers
Use name to avoid potential other hwmon PMUs. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20241118052638.754981-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-16perf: Remove unused del_perf_probe_events()Dr. David Alan Gilbert
del_perf_probe_events() last use was removed by commit 3d6dfae889174340 ("perf parse-events: Remove BPF event support") Remove it. It was the last user of probe_file__del_events(), so remove it as well. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241022002940.302946-1-linux@treblig.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf pmu: Move pmu_metrics_table__find and remove ARM overrideIan Rogers
Move pmu_metrics_table__find() to the jevents.py generated pmu-events.c and remove indirection override for ARM. The movement removes perf_pmu__find_metrics_table that exists to enable the ARM override. The ARM override isn't necessary as just the CPUID, not PMU, is used in the metric table lookup. On non-ARM the CPU argument is just ignored for the CPUID, for ARM -1 is passed so that the CPUID for the first logical CPU is read. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf jevents: Add map_for_cpu()Ian Rogers
The PMU is no longer part of the map finding process and for metrics doesn't make sense as they lack a PMU. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Pass a perf_cpu rather than a PMU to get_cpuid_strIan Rogers
On ARM the cpuid is dependent on the core type of the CPU in question. The PMU was passed for the sake of the CPU map but this means in places a temporary PMU is created just to pass a CPU value. Just pass the CPU and fix up the callers. As there are no longer PMU users in header.h, shuffle forward declarations earlier to work around build failures. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Avoid transitive PMU includesIan Rogers
Currently satisfied via header.h. Note, pmu.h includes parse-events.h. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf arm64 header: Use cpu argument in get_cpuidIan Rogers
Use the cpu to read the MIDR file requested. If the "any" value (-1) is passed that keep the behavior of returning the first MIDR file that can be read. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Refactor get_cpuid to take a CPU for ARMIan Rogers
ARM BIG.little has no notion of a constant CPUID for both core types. To reflect this reality, change the get_cpuid function to also pass in a possibly unused logical cpu. If the dummy value (-1) is passed in then ARM can, as currently happens, select the first logical CPU's "CPUID". The changes to ARM getcpuid happen in a follow up change. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16perf header: Move is_cpu_online to numa benchIan Rogers
The helper function is only used in the NUMA benchmark as typically online CPUs are determined through perf_cpu_map__new_online_cpus(). Reduce the scope of the function for now. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Xu Yang <xu.yang_2@nxp.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Zong-You Xie <ben717@andestech.com> Cc: Benjamin Gray <bgray@linux.ibm.com> Cc: Bibo Mao <maobibo@loongson.cn> Cc: Clément Le Goffic <clement.legoffic@foss.st.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Will Deacon <will@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20241107162035.52206-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>