| Age | Commit message | Author | |
|---|---|---|---|
| 2024-04-26 | perf scripts python: Add a script to run instances of 'perf script' in parallel | Adrian Hunter | |
| Add a Python script to run a perf script command multiple times in parallel, using perf script options --cpu and --time so that each job processes a different chunk of the data. Extend perf script tests to test also the new script. The script supports the use of normal 'perf script' options like --dlfilter and --script, so that the benefit of running parallel jobs naturally extends to them also. In addition, a command can be provided (refer --pipe-to option) to pipe standard output to a custom command. Refer to the script's own help text at the end of the patch for more details. The script is useful for Intel PT traces, that can be efficiently decoded by 'perf script' when split by CPU and/or time ranges. Running jobs in parallel can decrease the overall decoding time. Committer testing: Ian reported that shellcheck found some issues, I installed it as there are no warnings about it not being available, but when available it fails the build with: TEST /tmp/build/perf-tools-next/tests/shell/script.sh.shellcheck_log CC /tmp/build/perf-tools-next/util/header.o In tests/shell/script.sh line 20: rm -rf "${temp_dir}/"* ^-------------^ SC2115 (warning): Use "${var:?}" to ensure this never expands to /* . In tests/shell/script.sh line 83: output1_dir="${temp_dir}/output1" ^---------^ SC2034 (warning): output1_dir appears unused. Verify use (or export if used externally). In tests/shell/script.sh line 84: output2_dir="${temp_dir}/output2" ^---------^ SC2034 (warning): output2_dir appears unused. Verify use (or export if used externally). In tests/shell/script.sh line 86: python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}" ^-----------^ SC2154 (warning): output_dir is referenced but not assigned (did you mean 'output1_dir'?). For more information: https://www.shellcheck.net/wiki/SC2034 -- output1_dir appears unused. Verif... https://www.shellcheck.net/wiki/SC2115 -- Use "${var:?}" to ensure this nev... 
https://www.shellcheck.net/wiki/SC2154 -- output_dir is referenced but not ... Did these fixes: - rm -rf "${temp_dir}/"* + rm -rf "${temp_dir:?}/"* And: @@ -83,8 +83,8 @@ test_parallel_perf() output1_dir="${temp_dir}/output1" output2_dir="${temp_dir}/output2" perf record -o "${perf_data}" --sample-cpu uname - python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}" - python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose --per-cpu -- perf script -i "${perf_data}" + python3 "${pp}" -o "${output1_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}" + python3 "${pp}" -o "${output2_dir}" --jobs 4 --verbose --per-cpu -- perf script -i "${perf_data}" After that: root@number:~# perf test -vv "perf script tests" 97: perf script tests: --- start --- test child forked, pid 4084139 DB test [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.032 MB /tmp/perf-test-script.T4MJDr0L6J/perf.data (7 samples) ] <SNIP> DB test [Success] parallel-perf test Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.034 MB /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data (7 samples) ] Starting: perf script --time=,91898.301878499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --time=91898.301878500,91898.301905999 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --time=91898.301906000,91898.301933499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --time=91898.301933500, -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --time=91898.301878500,91898.301905999 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --time=91898.301906000,91898.301933499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 4 jobs: 2 completed, 2 running Finished: perf script --time=,91898.301878499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --time=91898.301933500, -i 
/tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 4 jobs: 4 completed, 0 running All jobs finished successfully parallel-perf.py done Starting: perf script --cpu=0 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=1 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=2 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=3 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=0 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=1 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=2 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=3 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 4 completed, 0 running Starting: perf script --cpu=4 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=5 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=6 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=7 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=4 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=5 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=6 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=7 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 8 completed, 0 running Starting: perf script --cpu=8 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=9 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=10 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=11 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=8 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=9 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=10 -i 
/tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=11 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 12 completed, 0 running Starting: perf script --cpu=12 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=13 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=14 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=15 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=12 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=13 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=14 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=15 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 16 completed, 0 running Starting: perf script --cpu=16 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=17 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=18 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=19 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=16 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=17 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=18 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=19 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 20 completed, 0 running Starting: perf script --cpu=20 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=21 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=22 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=23 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=20 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=21 -i 
/tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=22 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=23 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 24 completed, 0 running Starting: perf script --cpu=24 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=25 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=26 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=27 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=25 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=26 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=27 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 27 completed, 1 running Finished: perf script --cpu=24 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 28 completed, 0 running All jobs finished successfully parallel-perf.py done parallel-perf test [Success] --- Cleaning up --- ---- end(0) ---- 97: perf script tests : Ok root@number:~# Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240423133248.10206-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-12-20 | perf scripts python arm-cs-trace-disasm.py: Do not ignore disam first sample | Ruidong Tian | |
| arm-cs-trace-disasm does not disassemble the first branch sample. For example, in the following output the instructions between 0x0000ffffae878750 and 0x0000ffffae878754 are lost: ARM CoreSight Trace Data Assembler Dump Event type: branches:uH Sample = { cpu: 0000 addr: 0x0000ffffae878750 phys_addr: 0x0000000000000000 ip: 0x0000000000000000 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 } Event type: branches:uH Sample = { cpu: 0000 addr: 0x0000000000000000 phys_addr: 0x0000000000000000 ip: 0x0000ffffae878754 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 } Initialize cpu_data earlier to fix it: ARM CoreSight Trace Data Assembler Dump Event type: branches:uH Sample = { cpu: 0000 addr: 0x0000000000000000 phys_addr: 0x0000000000000000 ip: 0x0000ffffae878754 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 } 0000000000028740 <ioctl>: (base address is 0x0000ffffae850000) 28750: b13ffc1f cmn x0, #4095 28754: 54000042 b.hs 0x2875c <ioctl+0x1c> test 4003489/4003489 [0000] 26765.151766034 __GI___ioctl+0x14 /usr/lib64/libc-2.32.so Event type: branches:uH Sample = { cpu: 0000 addr: 0x0000ffffa67535ac phys_addr: 0x0000000000000000 ip: 0x0000000000000000 pid: 4003489 tid: 4003489 period: 1 time: 26765151766034 } Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Grant <al.grant@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Tor Jeremiassen <tor@ti.com> Link: https://lore.kernel.org/r/20231214123304.34087-4-tianruidong@linux.alibaba.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-12-20 | perf scripts python arm-cs-trace-disasm.py: Set start vm addr of exectable file to 0 | Ruidong Tian | |
| For an executable ELF file, whose e_type is ET_EXEC, the dso start address is an absolute address rather than an offset. Just set vm_start to zero when the dso start is 0x400000, which means it is an executable file. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Ruidong Tian <tianruidong@linux.alibaba.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Al Grant <al.grant@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Tor Jeremiassen <tor@ti.com> Link: https://lore.kernel.org/r/20231214123304.34087-3-tianruidong@linux.alibaba.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-11-23 | perf tools: Address python 3.6 DeprecationWarning for string scapes | Benjamin Gray | |
| Python 3.6 introduced a DeprecationWarning for invalid escape sequences. This is upgraded to a SyntaxWarning in Python 3.12, and will eventually be a syntax error. Fix these now to get ahead of it before it's an error. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Hartley Sweeten <hsweeten@visionengravers.com> Cc: Ian Abbott <abbotti@mev.co.uk> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mykola Lysenko <mykolal@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Todd E Brandt <todd.e.brandt@linux.intel.com> Cc: Tom Rix <trix@redhat.com> Cc: linux-doc@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230912060801.95533-6-bgray@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-08-24 | perf scripts python gecko: Launch the profiler UI on the default browser with the appropriate URL | Anup Sharma | |
| All required libraries are imported, and none of them are external dependencies; this was verified in a virtual environment. Modified the usage information and added the combined command. Modified the main() function to read the --save-only command-line option and set the output_file variable accordingly. Modified the trace_end() function to check the output_file variable: if it is set, the profiler data is saved to a local file in Gecko Profile format; otherwise profiler.firefox.com is opened in the default browser. Included trace_begin() to initialize the Firefox Profiler and launch the default browser to display profiler.firefox.com. Added a new function launchFirefox() to start a local server and launch the profiler UI in the default browser with the appropriate URL. Created the "CORSRequestHandler" class to enable Cross-Origin Resource Sharing. Summary: This integration now includes an exciting feature to conveniently host the Gecko Profile data on a local server and open it directly in the default web browser. This means that users can now visualize and analyze the profiler results with a single click. The addition of the --save-only command-line option allows users to save the profiler output to a local file in Gecko Profile format, but the real highlight is the ability to launch a local server, making the data accessible to Firefox Profiler via a web browser. In addition, all data are hosted locally, eliminating any concerns about data privacy rules and regulations. 
Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/ZNOS0vo58DnVLpD8@yoga Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-08-24 | perf scripts python: Add support for input args in gecko script | Anup Sharma | |
| Refines the argument handling mechanism in the "gecko-report" script to enable better compatibility and improved user experience. The script now differentiates between scenarios where arguments are provided for record and report cases where gecko.py arguments are passed. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/ZNf7W+EIrrCSHZN0@yoga Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-08-15 | perf scripts python: Update audit-libs package name for python3 | Wei Li | |
| 'audit-libs-python' is the package for python2, update it for python3. On Ubuntu and Fedora, the new package is 'python3-audit'. Signed-off-by: Wei Li <liwei391@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Bin <huawei.libin@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230815131805.1237491-1-liwei391@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-08-15 | perf scripts python: Support syscall name parsing on arm64 | Wei Li | |
| In the result of "perf script syscall-counts" on arm64, the syscall events are not resolved currently. Add "aarch64" to audit uname list to support name parsing. * After the patch: [root@localhost ~]# perf script syscall-counts sleep 1 Press control+C to stop and show the summary syscall events: event count ---------------------------------------- ----------- mmap 6 close 5 mprotect 4 brk 3 newfstatat 3 openat 3 getrandom 1 prlimit64 1 munmap 1 clock_nanosleep 1 set_robust_list 1 set_tid_address 1 exit_group 1 read 1 faccessat 1 Signed-off-by: Wei Li <liwei391@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Bin <huawei.libin@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230815131735.1237221-1-liwei391@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-08-03 | perf script python: Cope with declarations after statements found in Python.h | Arnaldo Carvalho de Melo | |
| With -Werror the build was failing on fedora rawhide: [perfbuilder@27cfe44d67ed perf-6.5.0-rc2]$ gcc -v Using built-in specs. COLLECT_GCC=gcc COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/13/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-redhat-linux Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,m2,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --enable-libstdcxx-backtrace --with-libstdcxx-zoneinfo=/usr/share/zoneinfo --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl=/builddir/build/BUILD/gcc-13.2.1-20230728/obj-x86_64-redhat-linux/isl-install --enable-offload-targets=nvptx-none --without-cuda-driver --enable-offload-defaulted --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux --with-build-config=bootstrap-lto --enable-link-serialization=1 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 13.2.1 20230728 (Red Hat 13.2.1-1) (GCC) [perfbuilder@27cfe44d67ed perf-6.5.0-rc2]$ In file included from /usr/include/python3.12/Python.h:44, from scripts/python/Perf-Trace-Util/Context.c:14: /usr/include/python3.12/object.h: In function 'Py_SIZE': /usr/include/python3.12/object.h:217:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 217 | PyVarObject *var_ob = _PyVarObject_CAST(ob); | ^~~~~~~~~~~ In file included from /usr/include/python3.12/Python.h:53: /usr/include/python3.12/cpython/longintrepr.h: In function '_PyLong_CompactValue': /usr/include/python3.12/cpython/longintrepr.h:121:5: error: ISO C90 forbids 
mixed declarations and code [-Werror=declaration-after-statement] 121 | Py_ssize_t sign = 1 - (op->long_value.lv_tag & _PyLong_SIGN_MASK); | ^~~~~~~~~~ <SNIP> In file included from /usr/include/python3.12/Python.h:44, from util/scripting-engines/trace-event-python.c:22: /usr/include/python3.12/object.h: In function 'Py_SIZE': /usr/include/python3.12/object.h:217:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 217 | PyVarObject *var_ob = _PyVarObject_CAST(ob); | ^~~~~~~~~~~ CC /tmp/build/perf/util/units.o CC /tmp/build/perf/util/time-utils.o In file included from /usr/include/python3.12/Python.h:53: /usr/include/python3.12/cpython/longintrepr.h: In function '_PyLong_CompactValue': /usr/include/python3.12/cpython/longintrepr.h:121:5: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 121 | Py_ssize_t sign = 1 - (op->long_value.lv_tag & _PyLong_SIGN_MASK); | ^~~~~~~~~~ So add -Wno-declaration-after-statement to the python scripting CFLAGS. Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZMpdKeO8gU%2FcWDqH@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-07-28 | perf scripts python: Add command execution for gecko script | Anup Sharma | |
| This will enable the execution of gecko.py script using record and report commands in 'perf script'. And this will be also reflected at "perf script -l" command. For Example: perf script record gecko perf script report gecko Committer notes: As discussed on the perf tools office hours, I made -F 99 the default for the record script and removed the double -- on the report script so that the existing 'perf script' protocol for the combined operation: # perf script gecko Works, i.e. the record script pipes its stdout into the stdin of the report script, basically: /bin/sh /usr/libexec/perf-core/scripts/python/bin/gecko-record -F 99 -g -a -q -o - | \ /bin/sh /usr/libexec/perf-core/scripts/python/bin/gecko-report -i - Testing it: The resulting JSON file needs to be uploaded to https://profiler.firefox.com, Anup already has code to start a local http server on the trace_begin handler of the gecko python script, start firefox and feed it the JSON. The example below only collects sample for the specified workload, so that we don't produce thousands of lines, to collect system wide samples, use instead: # perf script gecko -a sleep 0.5 # nohup perf script gecko sleep 0.5 { "meta": { "interval": 1, "processType": 0, "product": "x86_64 GNU/Linux", "stackwalk": 1, "debug": 0, "gcpoison": 0, "asyncstack": 1, "startTime": 274601692.636, "shutdownTime": null, "version": 24, "presymbolicated": true, "categories": [ { "name": "User", "color": "yellow", "subcategories": [ "Other" ] }, { "name": "Kernel", "color": "orange", "subcategories": [ "Other" ] } ], "markerSchema": [] }, "libs": [], "threads": [ { "tid": 3344498, "pid": 3344498, "name": "sleep", "markers": { "schema": { "name": 0, "startTime": 1, "endTime": 2, "phase": 3, "category": 4, "data": 5 }, "data": [] }, "samples": { "schema": { "stack": 0, "time": 1, "responsiveness": 2 }, "data": [ [ 21, 274601692.636, 0 ], [ 23, 274601692.641, 0 ], [ 29, 274601692.643, 0 ], [ 42, 274601692.648, 0 ] ] }, "frameTable": { "schema": 
{ "location": 0, "relevantForJS": 1, "innerWindowID": 2, "implementation": 3, "optimizations": 4, "line": 5, "column": 6, "category": 7, "subcategory": 8 }, "data": [ [ 0, false, 0, null, null, null, null, 1, null ], [ 1, false, 0, null, null, null, null, 1, null ], [ 2, false, 0, null, null, null, null, 1, null ], [ 3, false, 0, null, null, null, null, 1, null ], [ 4, false, 0, null, null, null, null, 1, null ], [ 5, false, 0, null, null, null, null, 1, null ], [ 6, false, 0, null, null, null, null, 1, null ], [ 7, false, 0, null, null, null, null, 1, null ], [ 8, false, 0, null, null, null, null, 1, null ], [ 9, false, 0, null, null, null, null, 1, null ], [ 10, false, 0, null, null, null, null, 1, null ], [ 11, false, 0, null, null, null, null, 1, null ], [ 12, false, 0, null, null, null, null, 1, null ], [ 13, false, 0, null, null, null, null, 1, null ], [ 14, false, 0, null, null, null, null, 1, null ], [ 15, false, 0, null, null, null, null, 1, null ], [ 16, false, 0, null, null, null, null, 1, null ], [ 17, false, 0, null, null, null, null, 1, null ], [ 18, false, 0, null, null, null, null, 1, null ], [ 19, false, 0, null, null, null, null, 1, null ], [ 20, false, 0, null, null, null, null, 1, null ], [ 21, false, 0, null, null, null, null, 1, null ], [ 22, false, 0, null, null, null, null, 1, null ], [ 23, false, 0, null, null, null, null, 1, null ], [ 24, false, 0, null, null, null, null, 1, null ], [ 25, false, 0, null, null, null, null, 1, null ], [ 26, false, 0, null, null, null, null, 1, null ], [ 27, false, 0, null, null, null, null, 1, null ], [ 28, false, 0, null, null, null, null, 1, null ], [ 29, false, 0, null, null, null, null, 1, null ], [ 30, false, 0, null, null, null, null, 1, null ], [ 31, false, 0, null, null, null, null, 1, null ], [ 32, false, 0, null, null, null, null, 1, null ], [ 33, false, 0, null, null, null, null, 1, null ], [ 34, false, 0, null, null, null, null, 1, null ], [ 35, false, 0, null, null, null, null, 1, null ], [ 36, 
false, 0, null, null, null, null, 1, null ], [ 37, false, 0, null, null, null, null, 1, null ], [ 38, false, 0, null, null, null, null, 1, null ] ] }, "stackTable": { "schema": { "prefix": 0, "frame": 1 }, "data": [ [ null, 0 ], [ 0, 1 ], [ 1, 2 ], [ 2, 3 ], [ 3, 4 ], [ 4, 5 ], [ 5, 6 ], [ 6, 7 ], [ 7, 8 ], [ 8, 9 ], [ 9, 10 ], [ 10, 11 ], [ 11, 12 ], [ 12, 13 ], [ 13, 14 ], [ 14, 15 ], [ 15, 16 ], [ 16, 17 ], [ 17, 18 ], [ 18, 19 ], [ 19, 20 ], [ 20, 21 ], [ 20, 22 ], [ 22, 23 ], [ 11, 24 ], [ 24, 25 ], [ 25, 26 ], [ 26, 27 ], [ 27, 28 ], [ 28, 29 ], [ 9, 11 ], [ 30, 24 ], [ 31, 25 ], [ 32, 30 ], [ 33, 31 ], [ 34, 32 ], [ 35, 29 ], [ 36, 33 ], [ 37, 34 ], [ 38, 35 ], [ 39, 36 ], [ 40, 37 ], [ 41, 38 ] ] }, "stringTable": [ "__func__.0 (in [kernel.kallsyms].rodata)", "perf_trace_ext4_fc_track_inode (in [kernel.kallsyms])", "perf_trace_ext4_es_insert_delayed_block (in [kernel.kallsyms])", "ext4_es_show_pblock (in [kernel.kallsyms])", "perf_trace_ext4_ext_rm_leaf (in [kernel.kallsyms])", "devcgroup_access_write (in [kernel.kallsyms])", "devcgroup_update_access (in [kernel.kallsyms])", "propagate_exception (in [kernel.kallsyms])", "revalidate_active_exceptions (in [kernel.kallsyms])", "perf_trace_ext4_fc_commit_stop (in [kernel.kallsyms])", "perf_fetch_caller_regs (in [kernel.kallsyms])", "khugepaged (in [kernel.kallsyms])", "khugepaged_wait_work (in [kernel.kallsyms])", "freezable_schedule_timeout (in [kernel.kallsyms])", "freezer_count (in [kernel.kallsyms])", "try_to_freeze (in [kernel.kallsyms])", "try_to_freeze_unsafe (in [kernel.kallsyms])", "split_huge_pages_write (in [kernel.kallsyms])", "migrate_pages (in [kernel.kallsyms])", "unmap_and_move (in [kernel.kallsyms])", "__unmap_and_move (in [kernel.kallsyms])", "collect_events (in [kernel.kallsyms])", "uncore_down_prepare (in [kernel.kallsyms])", "perf_iommu_read (in [kernel.kallsyms])", "khugepaged_do_scan (in [kernel.kallsyms])", "khugepaged_scan_mm_slot (in [kernel.kallsyms])", "khugepaged_scan_file (in 
[kernel.kallsyms])", "need_resched (in [kernel.kallsyms])", "get_current (in [kernel.kallsyms])", "move_to_new_page (in [kernel.kallsyms])", "khugepaged_scan_pmd (in [kernel.kallsyms])", "trace_mm_khugepaged_scan_pmd (in [kernel.kallsyms])", "migrate_huge_page_move_mapping (in [kernel.kallsyms])", "do_huge_pmd_numa_page (in [kernel.kallsyms])", "pmd_pfn (in [kernel.kallsyms])", "protnone_mask (in [kernel.kallsyms])", "__pte_needs_invert (in [kernel.kallsyms])", "reclaim_high (in [kernel.kallsyms])", "memcg_memory_event (in [kernel.kallsyms])" ], "registerTime": 0, "unregisterTime": null, "processType": "default" } ], "processes": [], "pausedRanges": [] } # Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/cbf03cda175ea3dd2c6cd87bd3f12d803446cb95.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-07-28 | perf scripts python: Implement add sample function and thread processing | Anup Sharma | |
| A stack is created for storing func and dso from the callchain. The sample is added to a specific thread: the code first checks whether the thread exists in the Thread class, then calls the _add_sample function, which appends a new entry to the samples list. Callchain parsing and storing are also implemented. Moreover, removed the comment from thread. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/5a112be85ccdcdcd611e343f6a7a7482d01f6299.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-07-28 | perf scripts python: Implement add sample function and thread processing | Anup Sharma | |
| The intern_stack function is responsible for retrieving or creating a stack_id based on the provided frame_id and prefix_id. It first generates a key using the frame_id and prefix_id values. If the stack corresponding to the key is found in the stackMap, it is returned. Otherwise, a new stack is created by appending the prefix_id and frame_id to the stackTable. The key and the index of the newly created stack are added to the stackMap for future reference. The _intern_frame function is responsible for retrieving or creating a frame_id based on the provided frame string. If the frame_id corresponding to the frameString is found in the frameMap, it is returned. Otherwise, a new frame is created by appending relevant information to the frameTable and adding the frameString to the string_id through _intern_string. The _intern_string function gets a matching string, or saves the new string and returns a String ID. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Link: https://lore.kernel.org/r/4442f4b1ab4c7317cf940560a3a285fcdfbeeb08.1689961706.git.anupnewsmail@gmail.com Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-07-28 | perf scripts python: Add trace end processing and PRODUCT and CATEGORIES information | Anup Sharma | |
| The final output will now be presented in JSON format following the Gecko profile structure. Additionally, the inclusion of PRODUCT allows easy retrieval of header information for the UI. Furthermore, CATEGORIES have been introduced to enable customization of kernel and user colors using input arguments. To facilitate this functionality, an argparse-based parser has been implemented. Note: The implementation of threads will be addressed in subsequent commits; for now it has been commented out. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/fa6d027e4134c48e8a2ea45dd8f6b21e6a3418e4.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-07-28 | perf scripts python: Add classes and conversion functions | Anup Sharma | |
| This commit introduces new classes and conversion functions to facilitate the representation of Gecko profile information. The new classes Frame, Stack, Sample, and Thread are added to handle specific components of the profile data; a link to the origin docs has also been commented out. Additionally, inside the Thread class, a _to_json_dict() method has been created that converts the current thread data into the format expected by the GeckoThread JSON schema, as per the Gecko profile format specification. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/ab7b40bd32df7101a6f8b4a3aa41570b63b831ac.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-07-28 | perf scripts python: Extact necessary information from process event | Anup Sharma | |
| The script takes in a sample event dictionary (param_dict) and retrieves relevant data such as the timestamp, PID, TID, and comm for each event. The start time is also defined as a global variable, as it needs to be passed to trace_end for creating the Gecko meta information field. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/19910fefcfe4be03cd5c2aa3fec11d3f86c0381b.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-07-28 | perf scripts python: Add initial script file with usage information | Anup Sharma | |
| Add the necessary modules, including the Perf-Trace-Util library, and define the required functions and variables for using perf script python. The perf_trace_context and Core modules for tracing and processing events have also been imported. Add usage information. Signed-off-by: Anup Sharma <anupnewsmail@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/f2f1a62f1cc69f44a5414da46a26a4cf124d2744.1689961706.git.anupnewsmail@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-06-13 | perf python scripting: Get rid of unused import in arm-cs-trace-disasm | Sourabh Jain | |
| The arm-cs-trace-disasm.py script doesn't use the sys library, so remove the import. Reported by pylint: W0611: Unused import sys (unused-import) Signed-off-by: Sourabh Jain <sourabhjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/linux-perf-users/20230613164145.50488-2-atrajeev@linux.vnet.ibm.com Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-06-12 | perf thread: Add accessor functions for thread | Ian Rogers | |
| Using accessors will make it easier to add reference count checking in later patches. Committer notes: thread->nsinfo wasn't wrapped as it is used together with nsinfo__zput(), which does a trick to set the field with a refcount being dropped to NULL, and that doesn't work well with using thread__nsinfo(thread), which loses the &thread->nsinfo pointer. When refcount checking is added to 'struct thread', later in this series, nsinfo__zput(RC_CHK_ACCESS(thread)->nsinfo) will be used to check the thread pointer. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Brian Robbins <brianrob@linux.microsoft.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Fangrui Song <maskray@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Ye Xingchen <ye.xingchen@zte.com.cn> Cc: Yuan Can <yuancan@huawei.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230608232823.4027869-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-05-02 | perf script: Add new parameter in kfree_skb tracepoint to the python scripts using it | Sriram Yagnaraman | |
| Include reason parameter that was added in commit c504e5c2f9648a1e ("net: skb: introduce kfree_skb_reason()") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sriram Yagnaraman <sriram.yagnaraman@est.tech> Link: https://lore.kernel.org/r/20230426104149.14089-1-sriram.yagnaraman@est.tech Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-04-17 | perf script task-analyzer: Fix spelling mistake "miliseconds" -> "milliseconds" | Colin Ian King | |
| There is a spelling mistake in the help for the --ms option. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Hagen Paul Pfeifer <hagen@jauu.net> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Petar Gligoric <petar.gligoric@rohde-schwarz.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20230417174826.52963-1-colin.i.king@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-04-12 | perf scripts python intel-pt-events: Delete unused 'event_attr variable | Alexander Pantyukhin | |
| The 'event_attr' variable is never used later, so it can be deleted. An additional code simplification is to replace the string slice comparison with a "substring" function, so there is no need to know the length of specific words. Signed-off-by: Alexander Pantyukhin <apantykhin@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230114130533.2877-1-apantykhin@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-04-04 | perf map: Add accessor for dso | Ian Rogers | |
| Later changes will add reference count checking for struct map, with dso being the most frequently accessed variable. Add an accessor so that the reference count check is only necessary in one place. Additional changes: - add a dso variable to avoid repeated map__dso calls. - in builtin-mem.c dump_raw_samples, code only partially tested for dso == NULL. Make the possibility of NULL consistent. - in thread.c thread__memcpy fix use of spaces and use tabs. Committer notes: Did missing conversions on these files: tools/perf/arch/powerpc/util/skip-callchain-idx.c tools/perf/arch/powerpc/util/sym-handling.c tools/perf/ui/browsers/hists.c tools/perf/ui/gtk/annotate.c tools/perf/util/cs-etm.c tools/perf/util/thread.c tools/perf/util/unwind-libunwind-local.c tools/perf/util/unwind-libunwind.c Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu 
<song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-04-04 | perf maps: Add functions to access maps | Ian Rogers | |
| Introduce functions to access struct maps. These functions reduce the number of places reference counting is necessary. While tidying APIs do some small const-ification, in particular to unwind_libunwind_ops. Committer notes: Fixed up tools/perf/util/unwind-libunwind.c: - return ops->get_entries(cb, arg, thread, data, max_stack); + return ops->get_entries(cb, arg, thread, data, max_stack, best_effort); Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Darren Hart <dvhart@infradead.org> Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Eric Dumazet <edumazet@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Hao Luo <haoluo@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miaoqian Lin <linmq006@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Yury Norov <yury.norov@gmail.com> Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-03-15 | perf script: Fix Python support when no libtraceevent | Adrian Hunter | |
| Python scripting can be used without libtraceevent. In particular, scripting for Intel PT does not use tracepoints, and so does not need libtraceevent support. Alter the build and employ conditional compilation to allow Python scripting without libtraceevent. Example: Before: $ ldd `which perf` | grep -i python $ ldd `which perf` | grep -i libtraceevent $ perf record -e intel_pt//u uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.031 MB perf.data ] $ perf script intel-pt-events.py |& head -3 Error: Couldn't find script `intel-pt-events.py' See perf script -l for available scripts. After: $ ldd `which perf` | grep -i python libpython3.10.so.1.0 => /lib/x86_64-linux-gnu/libpython3.10.so.1.0 (0x00007f4bac400000) $ ldd `which perf` | grep -i libtraceevent $ perf script intel-pt-events.py | head Intel PT Branch Trace, Power Events, Event Trace and PTWRITE Switch In 8021/8021 [000] 11234.097713404 0/0 perf-exec 8021/8021 [000] 11234.098041726 psb offset: 0x0 0 [unknown] ([unknown]) perf-exec 8021/8021 [000] 11234.098041726 cbr 45 freq: 4505 MHz (161%) 0 [unknown] ([unknown]) uname 8021/8021 [000] 11234.098082170 branches:uH tr strt 0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) uname 8021/8021 [000] 11234.098082379 branches:uH tr end 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown]) uname 8021/8021 [000] 11234.098083629 branches:uH tr strt 0 [unknown] ([unknown]) => 7f3a8b9422b0 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) uname 8021/8021 [000] 11234.098083629 branches:uH call 7f3a8b9422b3 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 7f3a8b943050 _dl_start+0x0 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) uname 8021/8021 [000] 11234.098083837 branches:uH tr end 7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) => 0 [unknown] ([unknown]) IPC: 0.01 (9/938) uname 
8021/8021 [000] 11234.098084670 branches:uH tr strt 0 [unknown] ([unknown]) => 7f3a8b943060 _dl_start+0x10 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) Fixes: 378ef0f5d9d7f465 ("perf build: Use libtraceevent from the system") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20230315084321.14563-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-03-14 | perf scripts intel-pt-events.py: Fix IPC output for Python 2 | Roman Lozko | |
| Integers are not converted to floats during division in Python 2 which results in incorrect IPC values. Fix by switching to new division behavior. Fixes: a483e64c0b62e93a ("perf scripting python: intel-pt-events.py: Add --insn-trace and --src-trace") Signed-off-by: Roman Lozko <lozko.roma@gmail.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20230310150445.2925841-1-lozko.roma@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2023-01-19 | perf script flamegraph: Avoid d3-flame-graph package dependency | Ian Rogers | |
| Currently flame graph generation requires a d3-flame-graph template to be installed. Unfortunately this is hard to come by for things like Debian [1]. If the template isn't installed then ask if it should be downloaded from jsdelivr CDN. The downloaded HTML file is validated against an md5sum. If the download fails, generate a minimal flame graph with the javascript coming from links to jsdelivr CDN. v3. Adds a warning message and quits before download in live mode. v2. Change the warning to a prompt about downloading and add the --allow-download command line flag. Add an md5sum check for the downloaded HTML. [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=996839 Reviewed-by: Andreas Gerstmayr <agerstmayr@redhat.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: 996839@bugs.debian.org Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Brendan Gregg <brendan@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin Spier <spiermar@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230118072409.147786-1-irogers@google.com # v3 discussion Link: https://lore.kernel.org/r/20230112220024.32709-1-irogers@google.com # v2 discussion Link: https://lore.kernel.org/r/CAP-5=fXi_9zdhTAoYApiFQoLURAvpEatFzU3uL23o3zs=z25ZQ@mail.gmail.com # v1 discussion Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2022-12-14 | perf script: task-analyzer add csv support | Petar Gligoric | |
| This patch adds the possibility to write the trace and the summary as csv files to a user specified file. Such a format simplifies further data processing. This is achieved by having ";" as separators instead of spaces and solely one header per file. Additional parameters are taken into account, as in the normal usage of the script. Colors are turned off in the case of a csv output, thus the highlight option is also ignored. Usage: Write standard task to csv file: $ perf script report tasks-analyzer --csv <file> Write limited output to csv file in nanoseconds: $ perf script report tasks-analyzer --csv <file> --ns --limit-to-tasks 1337 Write summary to a csv file: $ perf script report tasks-analyzer --csv-summary <file> Write summary to csv file with additional schedule information: $ perf script report tasks-analyzer --csv-summary <file> --summary-extended Write both summary and standard task to a csv file: $ perf script report tasks-analyzer --csv --csv-summary The following examples illustrate what is possible with the CSV output. The first command sequence will record all scheduler switch events for 10 seconds; the task-analyzer then calculates task information like runtimes as CSV. A small python snippet using pandas and matplotlib will visualize the most frequent task (e.g. kworker/1:1) runtimes - each runtime as a bar in a bar chart: $ perf record -e sched:sched_switch -a -- sleep 10 $ perf script report tasks-analyzer --ns --csv tasks.csv $ cat << EOF > /tmp/freq-comm-runtimes-bar.py import pandas as pd import matplotlib.pyplot as plt df = pd.read_csv("tasks.csv", sep=';') most_freq_comm = df["COMM"].value_counts().idxmax() most_freq_runtimes = df[df["COMM"]==most_freq_comm]["Runtime"] plt.title(f"Runtimes for Task {most_freq_comm} in Nanoseconds") plt.bar(range(len(most_freq_runtimes)), most_freq_runtimes) plt.show() EOF $ python3 /tmp/freq-comm-runtimes-bar.py As a second example, the subsequent script generates a pie chart of all accumulated tasks runtimes for 10 seconds of system recordings: $ perf record -e sched:sched_switch -a -- sleep 10 $ perf script report tasks-analyzer --csv-summary task-summary.csv $ cat << EOF > /tmp/accumulated-task-pie.py import pandas as pd from matplotlib.pyplot import pie, axis, show df = pd.read_csv("task-summary.csv", sep=';') sums = df.groupby(df["Comm"])["Accumulated"].sum() axis("equal") pie(sums, labels=sums.index); show() EOF $ python3 /tmp/accumulated-task-pie.py A variety of other visualizations are possible in matplotlib and other environments. Of course, pandas, numpy and co. also allow easy statistical analysis of the data! Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20221206154406.41941-3-petar.gligor@gmail.com Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2022-12-14 | perf script: Introduce task analyzer python script | Hagen Paul Pfeifer | |
| Introduce a new 'perf script' to analyze task scheduling behavior. During the task analysis, some data is always needed - which goes beyond the simple time of switching on and off a task (process/thread). This concerns for example the runtime of a process or the frequency with which the process was called. This script serves to simplify this recurring analysis process. It immediately provides the user with helpful task characteristic information about the tasks' runtimes. Usage: Recording can be done in two ways: $ perf script record tasks-analyzer -- sleep 10 $ perf record -e sched:sched_switch -a -- sleep 10 The script can parse all perf.data files, most important: sched:sched_switch events are mandatory, other events will be ignored. The simplest report use case is to just call the script without arguments: $ perf script report tasks-analyzer Switched-In Switched-Out CPU PID TID Comm Runtime Time Out-In 15576.658891407 15576.659156086 4 2412 2428 gdbus 265 1949 15576.659111320 15576.659455410 0 2412 2412 gnome-shell 344 2267 15576.659491326 15576.659506173 2 74 74 kworker/2:1 15 13145 15576.659506173 15576.659825748 2 2858 2858 gnome-terminal- 320 63263 15576.659871270 15576.659902872 6 20932 20932 kworker/u16:0 32 2314582 15576.659909951 15576.659945501 3 27264 27264 sh 36 -1 15576.659853285 15576.659971052 7 27265 27265 perf 118 5050741 [...] What is not shown here are the ASCII color sequences. For example, if the task consists of only one thread, the TID is grayed out. Runtime is the time the task was running on the CPU, Time Out-In is the time between the process being scheduled *out* and scheduled back *in*. So the last time span between two executions. If -1 is printed, then the task simply ran for the first time in the measurements - an Out-In delta could not be calculated. In addition to the chronological representation, there is a summary on task level. This output can be additionally switched on via the --summary option and provides information such as max, min & average runtime per process. The maximum runtime is often important for debugging. The call looks like this: $ perf script report tasks-analyzer --summary Summary Task Information Runtime Information PID TID Comm Runs Accumulated Mean Median Min Max Max At 14 14 ksoftirqd/0 13 334 26 15 9 127 15571.621211956 15 15 rcu_preempt 133 1778 13 13 2 33 15572.581176024 16 16 migration/0 3 49 16 13 12 24 15571.608915425 20 20 migration/1 3 34 11 13 8 13 15571.639101555 25 25 migration/2 3 32 11 12 9 12 15575.639239896 [...] Besides these two options, there are a number of other options that change the output and behavior. This can be queried via --help. Options worth mentioning include: - filter-tasks - filter out unneeded tasks, --filter-task 1337,/sbin/init - highlight-tasks - more pleasant focusing, --highlight-tasks 1:red,mutt:yellow - extended-times - show combinations of elapsed times between schedule in/schedule out - summary-extended - summary with additional information, like maximum delta time statistics - rename-comms-by-tids - handy for inexpressive processnames like python, --rename 1337:my-python-app - ms - show timestamps in milliseconds, nanoseconds is also possible (--ns) - time-limit - limit the analyzer to a time range, --time-limit 15576.0:15576.1 Script is tested and prime time ready for python2 & python3: - make PYTHON=python3 prefix=/usr/local install - make PYTHON=python2 prefix=/usr/local install Signed-off-by: Hagen Paul Pfeifer <hagen@jauu.net> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20221206154406.41941-2-petar.gligor@gmail.com Signed-off-by: Petar Gligoric <petar.gligoric@rohde-schwarz.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2022-12-14 | perf build: Use libtraceevent from the system | Ian Rogers | |
| Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command line variables. If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the build, don't compile in libtraceevent and libtracefs support. This also disables CONFIG_TRACE that controls "perf trace". CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles, HAVE_LIBTRACEEVENT is used in C code. Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the commands kmem, kwork, lock, sched and timechart are removed. The majority of commands continue to work including "perf test". Committer notes: Fixed up a tools/perf/util/Build reject and added: #include <traceevent/event-parse.h> to tools/perf/util/scripting-engines/trace-event-perl.c. Committer testing: $ rpm -qi libtraceevent-devel Name : libtraceevent-devel Version : 1.5.3 Release : 2.fc36 Architecture: x86_64 Install Date: Mon 25 Jul 2022 03:20:19 PM -03 Group : Unspecified Size : 27728 License : LGPLv2+ and GPLv2+ Signature : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4 Source RPM : libtraceevent-1.5.3-2.fc36.src.rpm Build Date : Fri 15 Apr 2022 10:57:01 AM -03 Build Host : buildvm-x86-05.iad2.fedoraproject.org Packager : Fedora Project Vendor : Fedora Project URL : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/ Bug URL : https://bugz.fedoraproject.org/libtraceevent Summary : Development headers of libtraceevent Description : Development headers of libtraceevent-libs $ Default build: $ ldd ~/bin/perf | grep tracee libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000) $ # perf trace -e sched:* --max-events 10 0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1) 0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1) 0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120) 1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 
(gnome-terminal-), prio: 120) 1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120) 0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2) 0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2) 0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120) 1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1) 1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120) # Had to tweak tools/perf/util/setup.py to make sure the python binding shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is present in CFLAGS. Building with NO_LIBTRACEEVENT=1 uncovered some more build failures: - Make building of data-convert-bt.c to CONFIG_LIBTRACEEVENT=y - perf-$(CONFIG_LIBTRACEEVENT) += scripts/ - bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y - The python binding needed some fixups and util/trace-event.c can't be built and linked with the python binding shared object, so remove it in tools/perf/util/setup.py and exclude it from the list of dependencies in the python/perf.so Makefile.perf target. Building without libtraceevent-devel installed uncovered more build failures: - The python binding tools/perf/util/python.c was assuming that traceevent/parse-events.h was always available, which was the case when we defaulted to using the in-kernel tools/lib/traceevent/ files, now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like the other parts of it that deal with tracepoints. 
- We have to ifdef the rules in the Build files with CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't detect libtraceevent-devel installed in the system. Simplification here to avoid these two ways of disabling builtin-trace.c and not having CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean way. From Athira: <quote> tools/perf/arch/powerpc/util/Build -perf-y += kvm-stat.o +perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o </quote> Then, ditto for arm64 and s390, detected by container cross build tests. - s/390 uses test__checkevent_tracepoint() that is now only available if HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifdef HAVE_LIBTRACEEVENT. Also from Athira: <quote> With this change, I could successfully compile in these environment: - Without libtraceevent-devel installed - With libtraceevent-devel installed - With “make NO_LIBTRACEEVENT=1” </quote> Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for consistency with other libraries detected in tools/perf/. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2022-10-27 | perf scripts python: intel-pt-events.py: Add ability interleave output | Adrian Hunter | |
| Intel PT timestamps are not provided for every branch, let alone every instruction, so there can be many samples with the same timestamp. With per-cpu contexts, decoding is done for each CPU in turn, which can make it difficult to see what is happening on different CPUs at the same time. Currently the interleaving from perf script --itrace=i0ns is quite coarse grained. There are often long stretches executing on one CPU and nothing on another. Some people are interested in seeing what happened on multiple CPUs before a crash to debug races etc. To improve perf script interleaving for parallel execution, the intel-pt-events.py script has been enhanced to enable interleaving the output with the same timestamp from different CPUs. It is understood that interleaving is not perfect or causal. Add parameter --interleave [<n>] to interleave sample output for the same timestamp so that no more than n samples for a CPU are displayed in a row. 'n' defaults to 4. Note this only affects the order of output, and only when the timestamp is the same. Example: $ perf script intel-pt-events.py --insn-trace --interleave 3 ... 
bash 2267/2267 [004] 9323.692625625 563caa3c86f0 jz 0x563caa3c89c7 run_pending_traps+0x30 (/usr/bin/bash) IPC: 1.52 (38/25) bash 2267/2267 [004] 9323.692625625 563caa3c89c7 movq 0x118(%rsp), %rax run_pending_traps+0x307 (/usr/bin/bash) bash 2267/2267 [004] 9323.692625625 563caa3c89cf subq %fs:0x28, %rax run_pending_traps+0x30f (/usr/bin/bash) bash 2270/2270 [007] 9323.692625625 55dc58cabf02 jz 0x55dc58cabf48 unquoted_glob_pattern_p+0x102 (/usr/bin/bash) IPC: 1.56 (25/16) bash 2270/2270 [007] 9323.692625625 55dc58cabf04 cmp $0x5d, %al unquoted_glob_pattern_p+0x104 (/usr/bin/bash) bash 2270/2270 [007] 9323.692625625 55dc58cabf06 jnz 0x55dc58cabf10 unquoted_glob_pattern_p+0x106 (/usr/bin/bash) bash 2264/2264 [001] 9323.692625625 7fd556a4376c jbe 0x7fd556a43ac8 round_and_return+0x3fc (/usr/lib/x86_64-linux-gnu/libc.so.6) IPC: 4.30 (43/10) bash 2264/2264 [001] 9323.692625625 7fd556a43772 and $0x8, %edx round_and_return+0x402 (/usr/lib/x86_64-linux-gnu/libc.so.6) bash 2264/2264 [001] 9323.692625625 7fd556a43775 jnz 0x7fd556a43ac8 round_and_return+0x405 (/usr/lib/x86_64-linux-gnu/libc.so.6) bash 2267/2267 [004] 9323.692625625 563caa3c89d8 jnz 0x563caa3c8b11 run_pending_traps+0x318 (/usr/bin/bash) bash 2267/2267 [004] 9323.692625625 563caa3c89de add $0x128, %rsp run_pending_traps+0x31e (/usr/bin/bash) bash 2267/2267 [004] 9323.692625625 563caa3c89e5 popq %rbx run_pending_traps+0x325 (/usr/bin/bash) ... Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20221020152509.5298-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2022-08-01 | Merge remote-tracking branch 'torvalds/master' into perf/core | Arnaldo Carvalho de Melo | |
| To pick up the fixes that went upstream via acme/perf/urgent and to get to v5.19. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2022-07-27 | perf scripts python: Let script to be python2 compliant | Leo Yan | |
| The mainline kernel can be used on relatively old distros, e.g. RHEL 7. Such a distro has not upgraded from python2 to python3, which causes a build error because the python script is not python2 compliant. To fix the build failure, this patch changes from the python f-string format to the traditional string format. Fixes: 12fdd6c009da0d02 ("perf scripts python: Support Arm CoreSight trace data disassembly") Reported-by: Akemi Yagi <toracat@elrepo.org> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: ElRepo <contact@elrepo.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220725104220.1106663-1-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2022-07-20 | perf script python: intel-pt-events: Add machine_pid and vcpu | Adrian Hunter | |
| Add machine_pid and vcpu to the intel-pt-events.py script. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: kvm@vger.kernel.org Link: https://lore.kernel.org/r/20220711093218.10967-20-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2022-05-27 | perf scripts python: Support Arm CoreSight trace data disassembly | Leo Yan | |
| This commit adds a python script to parse CoreSight tracing events and print out source lines and disassembly; it generates a readable program execution flow for easier human inspection. The script receives CoreSight tracing packets, packet(n) followed by packet(n+1), each carrying three fields: addr, ip and cpu. packet::addr is the start address of the coming branch sample, and packet::ip is the last address of the branch sample. Therefore, a code section between branches starts at packet(n)::addr and stops at packet(n+1)::ip. As a result, we combine two consecutive packets to generate the address range for instructions: [ sample(n)::addr .. sample(n+1)::ip ] The script supports either objdump or llvm-objdump for disassembly, selected with option '-d'. If option '-d' is not specified, the script simply outputs source lines and symbols. The following shows usage with llvm-objdump and objdump to output disassembly. 
# perf script -s scripts/python/arm-cs-trace-disasm.py -- -d llvm-objdump-11 -k ./vmlinux ARM CoreSight Trace Data Assembler Dump ffff800008eb3198 <etm4_enable_hw>: ffff800008eb3310: c0 38 00 35 cbnz w0, 0xffff800008eb3a28 <etm4_enable_hw+0x890> ffff800008eb3314: 9f 3f 03 d5 dsb sy ffff800008eb3318: df 3f 03 d5 isb ffff800008eb331c: f5 5b 42 a9 ldp x21, x22, [sp, #32] ffff800008eb3320: fb 73 45 a9 ldp x27, x28, [sp, #80] ffff800008eb3324: e0 82 40 39 ldrb w0, [x23, #32] ffff800008eb3328: 60 00 00 34 cbz w0, 0xffff800008eb3334 <etm4_enable_hw+0x19c> ffff800008eb332c: e0 03 19 aa mov x0, x25 ffff800008eb3330: 8c fe ff 97 bl 0xffff800008eb2d60 <etm4_cs_lock.isra.0.part.0> main 6728/6728 [0004] 0.000000000 etm4_enable_hw+0x198 [kernel.kallsyms] ffff800008eb2d60 <etm4_cs_lock.isra.0.part.0>: ffff800008eb2d60: 1f 20 03 d5 nop ffff800008eb2d64: 1f 20 03 d5 nop ffff800008eb2d68: 3f 23 03 d5 hint #25 ffff800008eb2d6c: 00 00 40 f9 ldr x0, [x0] ffff800008eb2d70: 9f 3f 03 d5 dsb sy ffff800008eb2d74: 00 c0 3e 91 add x0, x0, #4016 ffff800008eb2d78: 1f 00 00 b9 str wzr, [x0] ffff800008eb2d7c: bf 23 03 d5 hint #29 ffff800008eb2d80: c0 03 5f d6 ret main 6728/6728 [0004] 0.000000000 etm4_cs_lock.isra.0.part.0+0x20 # perf script -s scripts/python/arm-cs-trace-disasm.py -- -d objdump -k ./vmlinux ARM CoreSight Trace Data Assembler Dump ffff800008eb3310 <etm4_enable_hw+0x178>: ffff800008eb3310: 350038c0 cbnz w0, ffff800008eb3a28 <etm4_enable_hw+0x890> ffff800008eb3314: d5033f9f dsb sy ffff800008eb3318: d5033fdf isb ffff800008eb331c: a9425bf5 ldp x21, x22, [sp, #32] ffff800008eb3320: a94573fb ldp x27, x28, [sp, #80] ffff800008eb3324: 394082e0 ldrb w0, [x23, #32] ffff800008eb3328: 34000060 cbz w0, ffff800008eb3334 <etm4_enable_hw+0x19c> ffff800008eb332c: aa1903e0 mov x0, x25 ffff800008eb3330: 97fffe8c bl ffff800008eb2d60 <etm4_cs_lock.isra.0.part.0> main 6728/6728 [0004] 0.000000000 etm4_enable_hw+0x198 [kernel.kallsyms] ffff800008eb2d60 <etm4_cs_lock.isra.0.part.0>: ffff800008eb2d60: 
d503201f nop ffff800008eb2d64: d503201f nop ffff800008eb2d68: d503233f paciasp ffff800008eb2d6c: f9400000 ldr x0, [x0] ffff800008eb2d70: d5033f9f dsb sy ffff800008eb2d74: 913ec000 add x0, x0, #0xfb0 ffff800008eb2d78: b900001f str wzr, [x0] ffff800008eb2d7c: d50323bf autiasp ffff800008eb2d80: d65f03c0 ret main 6728/6728 [0004] 0.000000000 etm4_cs_lock.isra.0.part.0+0x20 Signed-off-by: Leo Yan <leo.yan@linaro.org> Co-authored-by: Al Grant <al.grant@arm.com> Co-authored-by: Mathieu Poirier <mathieu.poirier@linaro.org> Co-authored-by: Tor Jeremiassen <tor@ti.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Eelco Chaudron <echaudro@redhat.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Tanmay Jagdale <tanmay@marvell.com> Cc: coresight@lists.linaro.org Cc: zengshun . wu <zengshun.wu@outlook.com> Link: https://lore.kernel.org/r/20220521130446.4163597-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2022-05-17 | perf scripts python: intel-pt-events.py: Print ptwrite value as a string if it is ASCII | Adrian Hunter | |
| It can be convenient to put a string value into a ptwrite payload as a quick and easy way to identify what is being printed. To make that useful, if the Intel ptwrite payload value contains only printable ASCII characters padded with NULLs, then print it also as a string. Using the example program from the "Emulated PTWRITE" section of tools/perf/Documentation/perf-intel-pt.txt: $ echo -n "Hello" | od -t x8 0000000 0000006f6c6c6548 0000005 $ perf record -e intel_pt//u ./eg_ptw 0x0000006f6c6c6548 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.016 MB perf.data ] $ perf script --itrace=ew intel-pt-events.py Intel PT Branch Trace, Power Events, Event Trace and PTWRITE Switch In 38524/38524 [001] 24166.044995916 0/0 eg_ptw 38524/38524 [001] 24166.045380004 ptwrite jmp IP: 0 payload: 0x6f6c6c6548 Hello 56532c7ce196 perf_emulate_ptwrite+0x16 (/home/ahunter/git/work/eg_ptw) End Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220509152400.376613-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2022-02-15 | perf scripts python: export-to-postgresql.py: Export all sample flags | Adrian Hunter | |
| Add sample flags to the PostgreSQL database definition and export. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-25-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2022-02-15 | perf scripts python: export-to-sqlite.py: Export all sample flags | Adrian Hunter | |
| Add sample flags to the SQLite database definition and export. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-24-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2022-02-15 | perf scripts python: intel-pt-events.py: Add Event Trace | Adrian Hunter | |
| Add Event Trace to the intel-pt-events.py script. This shows how to unpack the raw data from the new sample events in a Python script. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-22-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2021-12-28 | perf scripts python: intel-pt-events.py: Fix printing of switch events | Adrian Hunter | |
| The intel-pt-events.py script displays only the last of consecutive switch statements but that may not be the last switch event for the CPU. Fix by keeping a dictionary of last context switch keyed by CPU, and make it possible to see all switch events by adding option --all-switch-events. Fixes: a92bf335fd82eeee ("perf scripts python: intel-pt-events.py: Add branches to script") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Riccardo Mancini <rickyman7@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211215080636.149562-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2021-09-10 | perf scripts python: Fix passing arguments to stackcollapse report | Michael Petlan | |
| The '--' prevented arguments from being passed to the script, such as: $ perf script report stackcollapse -i my_perf.data Signed-off-by: Michael Petlan <mpetlan@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> LPU-Reference: 20200427142327.21172-1-mpetlan@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2021-08-30 | perf flamegraph: flamegraph.py script improvements | Andreas Gerstmayr | |
| * display perf.data header * display PIDs of user stacks * added option to change color scheme * default to blue/green color scheme to improve accessibility * correctly identify kernel stacks when kernel-debuginfo is installed Signed-off-by: Andreas Gerstmayr <agerstmayr@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20210830164729.116049-1-agerstmayr@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2021-06-01 | perf scripting python: intel-pt-events.py: Add --insn-trace and --src-trace | Adrian Hunter | |
| Add an instruction trace and a source trace to the intel-pt-events.py script. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210530192308.7382-14-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2021-06-01 | perf scripting python: exported-sql-viewer.py: Factor out libxed.py | Adrian Hunter | |
| Factor out libxed.py so it can be reused. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210530192308.7382-13-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2021-06-01 | perf scripting python: Add perf_sample_srcline() and perf_sample_srccode() | Adrian Hunter | |
| Add perf_sample_srcline() and perf_sample_srccode() to the perf_trace_context module so that a script can get the srcline or srccode information. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210530192308.7382-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2021-06-01 | perf scripting python: Add perf_set_itrace_options() | Adrian Hunter | |
| Add perf_set_itrace_options() to the perf_trace_context module so that a script can set the itrace options for a session if they have not been set already. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210530192308.7382-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2021-06-01 | perf scripting python: Add perf_sample_insn() | Adrian Hunter | |
| Add perf_sample_insn() to the perf_trace_context module so that a script can get the instruction bytes. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210530192308.7382-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2021-06-01 | perf scripting python: Assign perf_script_context | Adrian Hunter | |
| The scripting_context pointer itself does not change and nor does it need to. Put it directly into the script as a variable at the start so it does not have to be passed on each call into the script. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210530192308.7382-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2021-06-01 | perf scripting python: Simplify perf-trace-context module functions | Adrian Hunter | |
| Simplify perf-trace-context module functions by factoring out some common code. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210530192308.7382-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2021-06-01 | perf scripting python: Remove unnecessary 'static' | Adrian Hunter | |
| The variables are always assigned before use, making the 'static' storage class unnecessary. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210530192308.7382-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||
| 2021-05-25 | perf scripts python: intel-pt-events.py: Add branches to script | Adrian Hunter | |
| As an example, add branch information to intel-pt-events.py script. This shows how a simple python script can be used to customize perf script output for Intel PT branch traces or power event traces. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20210525095112.1399-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> | |||

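The 2022-05-17 entry above describes printing a ptwrite payload as a string when it holds only printable ASCII padded with NULs. A minimal sketch of that check follows. It is not the code from intel-pt-events.py: the function name `ptwrite_payload_as_string` is hypothetical, and the little-endian byte order is an assumption taken from the `od -t x8` output quoted in that commit message.

```python
import struct


def ptwrite_payload_as_string(payload):
    """Return the payload as text if it is printable ASCII padded with NULs.

    Hypothetical helper for illustration only. Returns None when the
    payload is empty or contains any non-printable byte.
    """
    # Assumption: the 64-bit payload is stored little-endian, so the first
    # character of the string is the least significant byte.
    raw = struct.pack("<Q", payload & 0xFFFFFFFFFFFFFFFF)

    # Drop the NUL padding at the end; an embedded NUL stays in place and
    # fails the printable test below.
    text = raw.rstrip(b"\x00")
    if not text:
        return None

    # Printable ASCII is the range 0x20 (space) to 0x7e (tilde).
    if all(0x20 <= byte <= 0x7E for byte in text):
        return text.decode("ascii")
    return None


if __name__ == "__main__":
    # Payload value taken from the example in the commit message.
    print(ptwrite_payload_as_string(0x6F6C6C6548))
```

With the payload from the commit message, `0x6f6c6c6548`, the helper yields the string `Hello`, matching the text shown next to the payload in the quoted `perf script` output.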