linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2023-11-27	perf test: Use existing config value for objdump path	James Clark
	There is already an existing config value for changing the objdump path, so instead of having two values that do the same thing, make 'perf test' use annotate.objdump as well. Signed-off-by: James Clark <james.clark@arm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Fangrui Song <maskray@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/r/ZU5Cx4LTrB5q0sIG@kernel.org Link: https://lore.kernel.org/r/20231113102327.695386-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf vendor events riscv: add T-HEAD C9xx JSON file	Inochi Amaoto
	Add JSON file of T-HEAD C9xx series events. The event idx (raw value) is summary as following: event id range \| support cpu 0x01 - 0x2a \| c906,c910,c920 The event ids are based on the public document of T-HEAD and cover the c900 series. These events are the max that c900 series support. Since T-HEAD let manufacturers decide whether events are usable, the final support of the perf events is determined by the pmu node of the soc dtb. Signed-off-by: Inochi Amaoto <inochiama@outlook.com> Tested-by: Guo Ren <guoren@kernel.org> Acked-by: Palmer Dabbelt <palmer@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chen Wang <unicorn_wang@outlook.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jisheng Zhang <jszhang@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Wei Fu <wefu@redhat.com> Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/IA1PR20MB495325FCF603BAA841E29281BBBAA@IA1PR20MB4953.namprd20.prod.outlook.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf vendor events: Add skx, clx, icx and spr upi bandwidth metric	Ian Rogers
	Add upi_data_receive_bw metric for skylakex, cascadelakex, icelakex and sapphirerapids. The metric was added to perfmon metrics in: https://github.com/intel/perfmon/pull/119 Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231109232732.2973015-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf tests: Skip data symbol test if buf1 symbol is missing	Adrian Hunter
	perf data symbol test depends on finding symbol buf1 in perf, and fails if perf has been stripped and no debug object is available. In that case, skip the test instead. Example: Before: $ strip tools/perf/perf $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf` $ tools/perf/perf test -v 'data symbol' 113: Test data symbol : --- start --- test child forked, pid 125646 Recording workload... [ perf record: Woken up 3 times to write data ] [ perf record: Captured and wrote 0.577 MB /tmp/__perf_test.perf.data.Jhbdp (7794 samples) ] Cleaning up files... test child finished with -1 ---- end ---- Test data symbol: FAILED! After: $ tools/perf/perf test -v 'data symbol' 113: Test data symbol : --- start --- test child forked, pid 125747 perf does not have symbol 'buf1' perf is missing symbols - skipping test test child finished with -2 ---- end ---- Test data symbol: Skip Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231123075848.9652-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf tests: Make data symbol test wait for perf to start	Adrian Hunter
	The perf data symbol test waits 1 second for perf to run and collect data, which may be too little if perf takes a long time to start up, which has been noticed on systems with many CPUs. Use existing wait_for_perf_to_start helper to wait for perf to start. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231123075848.9652-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf tests: Skip branch stack sampling test if brstack_bench symbol is missing	Adrian Hunter
	The test "Check branch stack sampling" depends on finding symbol brstack_bench (and several others) in perf, and fails if perf has been stripped and no debug object is available. In that case, skip the test instead. Example: Before: $ strip tools/perf/perf $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf` $ tools/perf/perf test -v 'branch stack sampling' 112: Check branch stack sampling : --- start --- test child forked, pid 123741 Testing user branch stack sampling + grep -E -m1 ^brstack_bench\+[^ ]/brstack_foo\+[^ ]/IND_CALL/.*$ /tmp/__perf_test.program.5Dz1U/perf.script + cleanup + rm -rf /tmp/__perf_test.program.5Dz1U test child finished with -1 ---- end ---- Check branch stack sampling: FAILED! After: $ tools/perf/perf test -v 'branch stack sampling' 112: Check branch stack sampling : --- start --- test child forked, pid 125157 perf does not have symbol 'brstack_bench' perf is missing symbols - skipping test test child finished with -2 ---- end ---- Check branch stack sampling: Skip Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231123075848.9652-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf tests: Skip Arm64 callgraphs test if leafloop symbol is missing	Adrian Hunter
	The test "Check Arm64 callgraphs are complete in fp mode" depends on finding symbol leafloop in perf, and fails if perf has been stripped and no debug object is available. In that case, skip the test instead. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231123075848.9652-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf tests: Skip record test if test_loop symbol is missing	Adrian Hunter
	perf record test depends on finding symbol test_loop in perf, and fails if perf has been stripped and no debug object is available. In that case, skip the test instead. Example: Note, building with perl support adds option -Wl,-E which causes the linker to add all (global) symbols to the dynamic symbol table. So the test_loop symbol, being global, does not get stripped unless NO_LIBPERL=1 Before: $ make NO_LIBPERL=1 -C tools/perf >/dev/null 2>&1 $ strip tools/perf/perf $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf` $ tools/perf/perf test -v 'record tests' 91: perf record tests : --- start --- test child forked, pid 118750 Basic --per-thread mode test Per-thread record [Failed missing output] Register capture test Register capture test [Success] Basic --system-wide mode test System-wide record [Skipped not supported] Basic target workload test Workload record [Failed missing output] test child finished with -1 ---- end ---- perf record tests: FAILED! After: $ tools/perf/perf test -v 'record tests' 91: perf record tests : --- start --- test child forked, pid 120025 perf does not have symbol 'test_loop' perf is missing symbols - skipping test test child finished with -2 ---- end ---- perf record tests: Skip Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231123075848.9652-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf tests: Skip pipe test if noploop symbol is missing	Adrian Hunter
	perf pipe recording and injection test depends on finding symbol noploop in perf, and fails if perf has been stripped and no debug object is available. In that case, skip the test instead. Example: Before: $ strip tools/perf/perf $ tools/perf/perf buildid-cache -p `realpath tools/perf/perf` $ tools/perf/perf test -v pipe 86: perf pipe recording and injection test : --- start --- test child forked, pid 47734 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] 47741 47741 -1 \|perf [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.000 MB - ] cannot find noploop function in pipe #1 test child finished with -1 ---- end ---- perf pipe recording and injection test: FAILED! After: $ tools/perf/perf test -v pipe 86: perf pipe recording and injection test : --- start --- test child forked, pid 48996 perf does not have symbol 'noploop' perf is missing symbols - skipping test test child finished with -2 ---- end ---- perf pipe recording and injection test: Skip Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231123075848.9652-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf tests lib: Add perf_has_symbol.sh	Adrian Hunter
	Some shell tests depend on finding symbols for perf itself, and fail if perf has been stripped and no debug object is available. Add helper functions to check if perf has a needed symbol. This is preparation for amending the tests themselves to be skipped if a needed symbol is not found. The functions make use of the "Symbols" test which reads and checks symbols from a dso, perf itself by default. Note the "Symbols" test will find symbols using the same method as other perf tests, including, for example, looking in the buildid cache. An alternative would be to prevent the needed symbols from being stripped, which seems to work with gcc's externally_visible attribute, but that attribute is not supported by clang. Another alternative would be to use option -Wl,-E (which is already used when perf is built with perl support) which causes the linker to add all (global) symbols to the dynamic symbol table. Then the required symbols need only be made global in scope to avoid being strippable. However that goes beyond what is needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231123075848.9652-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf header: Fix segfault on build_mem_topology() error path	Adrian Hunter
	Do not increase the node count unless a node has been successfully read, because it can lead to a segfault if an error occurs. For example, if perf exceeds the open file limit in memory_node__read(), which, on a test system, could be made to happen by setting the file limit to exactly 32: Before: $ ulimit -n 32 $ perf mem record --all-user -- sleep 1 [ perf record: Woken up 1 times to write data ] failed: can't open memory sysfs data perf: Segmentation fault Obtained 14 stack frames. perf(sighandler_dump_stack+0x48) [0x55f4b1f59558] /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f4ba1c42520] /lib/x86_64-linux-gnu/libc.so.6(free+0x1e) [0x7f4ba1ca53fe] perf(+0x178ff4) [0x55f4b1f48ff4] perf(+0x179a70) [0x55f4b1f49a70] perf(+0x17ef5d) [0x55f4b1f4ef5d] perf(+0x85c0b) [0x55f4b1e55c0b] perf(cmd_record+0xe1d) [0x55f4b1e5920d] perf(cmd_mem+0xc96) [0x55f4b1e80e56] perf(+0x130460) [0x55f4b1f00460] perf(main+0x689) [0x55f4b1e427d9] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90) [0x7f4ba1c29d90] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80) [0x7f4ba1c29e40] perf(_start+0x25) [0x55f4b1e42a25] Segmentation fault (core dumped) $ After: $ ulimit -n 32 $ perf mem record --all-user -- sleep 1 [ perf record: Woken up 1 times to write data ] failed: can't open memory sysfs data [ perf record: Captured and wrote 0.005 MB perf.data (11 samples) ] $ Fixes: f8e502b9d1b3b197 ("perf header: Ensure bitmaps are freed") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231123075848.9652-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf report: Remove warning on missing raw data for s390	Thomas Richter
	Command # ./perf report -i /tmp/111 -D > /dev/null emits an error message when a sample for event CRYPTO_ALL in the perf.data file does not contain any raw data. This is ok. Do not trigger this warning when the sample in the perf.data files does not contain any raw data at all. Check for availability of raw data for all events and return if none is available. Output before: # ./perf report -i /tmp/111 -D > /dev/null Invalid CRYPTO_ALL raw data encountered Invalid CRYPTO_ALL raw data encountered Invalid CRYPTO_ALL raw data encountered # Output after: # ./perf report -i /tmp/111 -D > /dev/null # Fixes: b539deafbadb2fc6 ("perf report: Add s390 raw data interpretation for PAI counters") Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20231122092703.3163191-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf tools: Add perf binary dependent rule for shellcheck log in Makefile.perf	Athira Rajeev
	Add rule in new Makefile "tests/Makefile.tests" for running shellcheck on shell test scripts. This automates below shellcheck into the build. $ for F in $(find tests/shell/ -perm -o=x -name '.sh'); do shellcheck -S warning $F; done Condition for shellcheck is added in Makefile.perf to avoid build breakage in the absence of shellcheck binary. Update Makefile.perf to contain new rule for "SHELLCHECK_TEST" which is for making shellcheck test as a dependency on perf binary. Added "tests/Makefile.tests" to run shellcheck on shellscripts in tests/shell. The make rule "SHLLCHECK_RUN" ensures that, every time during make, shellcheck will be run only on modified files during subsequent invocations. By this, if any newly added shell scripts or fixes in existing scripts breaks coding/formatting style, it will get captured during the perf build. Example build failure by modifying probe_vfs_getname.sh in tests/shell: In tests/shell/probe_vfs_getname.sh line 8: . $(dirname $0)/lib/probe.sh ^-----------^ SC2046 (warning): Quote this to prevent word splitting. For more information: https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt... make[3]: [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1 make[2]: * [Makefile.perf:686: SHELLCHECK_TEST] Error 2 make[2]: * Waiting for unfinished jobs.... make[1]: * [Makefile.perf:244: sub-make] Error 2 make: * [Makefile:70: all] Error 2 Here, like other files which gets created during compilation (ex: .builtin-bench.o.cmd or .perf.o.cmd ), create .shellcheck_log also as a hidden file. Example: tests/shell/.probe_vfs_getname.sh.shellcheck_log shellcheck is re-run if any of the script gets modified based on its dependency of this log file. After this, for testing, changed "tests/shell/trace+probe_vfs_getname.sh" to break shellcheck format. In the next make run, it is also captured: In tests/shell/probe_vfs_getname.sh line 8: . $(dirname $0)/lib/probe.sh ^-----------^ SC2046 (warning): Quote this to prevent word splitting. For more information: https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt... make[3]: * [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1 make[3]: * Waiting for unfinished jobs.... In tests/shell/trace+probe_vfs_getname.sh line 14: . $(dirname $0)/lib/probe.sh ^-----------^ SC2046 (warning): Quote this to prevent word splitting. For more information: https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt... make[3]: * [/root/athira/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.trace+probe_vfs_getname.sh.shellcheck_log] Error 1 make[2]: * [Makefile.perf:686: SHELLCHECK_TEST] Error 2 make[2]: * Waiting for unfinished jobs.... make[1]: * [Makefile.perf:244: sub-make] Error 2 make: * [Makefile:70: all] Error 2 Failure log can be found in the stdout of make itself. This is reported at build time. To be able to go ahead with the build or disable shellcheck even though it is known that some test is broken, add a "NO_SHELLCHECK" option. Example: make NO_SHELLCHECK=1 INSTALL libsubcmd_headers INSTALL libsymbol_headers INSTALL libapi_headers INSTALL libperf_headers INSTALL libbpf_headers LINK perf Note: This is tested on RHEL and also SLES. Use below check: "$(shell which shellcheck 2> /dev/null)" to look for presence of shellcheck binary. The approach "shell command -v" is not used here. In some of the distros(RHEL), command is available as executable file (/usr/bin/command). But in some distros(SLES), it is a shell builtin and not available as executable file. Committer testing: $ type shellcheck shellcheck is hashed (/usr/bin/shellcheck) $ rpm -qf /usr/bin/shellcheck ShellCheck-0.9.0-2.fc38.x86_64 $ $ alias m $ git diff diff --git a/tools/perf/tests/shell/probe_vfs_getname.sh b/tools/perf/tests/shell/probe_vfs_getname.sh index 554e12e83c55fd56..dbc14634678e2bf6 100755 --- a/tools/perf/tests/shell/probe_vfs_getname.sh +++ b/tools/perf/tests/shell/probe_vfs_getname.sh @@ -5,7 +5,7 @@ # Arnaldo Carvalho de Melo <acme@kernel.org>, 2017 # shellcheck source=lib/probe.sh -. "$(dirname $0)"/lib/probe.sh +. $(dirname $0)/lib/probe.sh skip_if_no_perf_probe \|\| exit 2 alias m='rm -rf ~/libexec/perf-core/ ; make -k CORESIGHT=1 O=/tmp/build/$(basename $PWD) -C tools/perf install-bin && perf test python' $ m make: Entering directory '/home/acme/git/perf-tools-next/tools/perf' BUILD: Doing 'make -j32' parallel build <SNIP> INSTALL libbpf_headers In tests/shell/probe_vfs_getname.sh line 8: . $(dirname $0)/lib/probe.sh ^-----------^ SC2046 (warning): Quote this to prevent word splitting. For more information: https://www.shellcheck.net/wiki/SC2046 -- Quote this to prevent word splitt... make[3]: * [/home/acme/git/perf-tools-next/tools/perf/tests/Makefile.tests:18: tests/shell/.probe_vfs_getname.sh.shellcheck_log] Error 1 make[2]: * [Makefile.perf:686: SHELLCHECK_TEST] Error 2 make[2]: * Waiting for unfinished jobs.... make[1]: * [Makefile.perf:244: sub-make] Error 2 make: *** [Makefile:113: install-bin] Error 2 make: Leaving directory '/home/acme/git/perf-tools-next/tools/perf' $ Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20231123160232.94253-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf vendor events riscv: Add StarFive Dubhe-90 JSON file	Ji Sheng Teoh
	Similar to StarFive's Dubhe-80, Dubhe-90 supports raw event id 0x00 - 0x22. Reuse Dubhe-80 firmware and common json file. The raw events are enabled through PMU node of DT binding. Besides raw event, add standard RISC-V firmware events to support monitoring of firmware event. Example of PMU DT node: pmu { compatible = "riscv,pmu"; riscv,raw-event-to-mhpmcounters = /* Event ID 1-31 / <0x00 0x00 0xFFFFFFFF 0xFFFFFFE0 0x00007FF8>, / Event ID 32-33 / <0x00 0x20 0xFFFFFFFF 0xFFFFFFFE 0x00007FF8>, / Event ID 34 */ <0x00 0x22 0xFFFFFFFF 0xFFFFFF22 0x00007FF8>; }; 'perf stat' output: [root@user]# perf stat -a \ -e access_mmu_stlb \ -e miss_mmu_stlb \ -e access_mmu_pte_c \ -e rob_flush \ -e btb_prediction_miss \ -e itlb_miss \ -e sync_del_fetch_g \ -e icache_miss \ -e bpu_br_retire \ -e bpu_br_miss \ -e ret_ins_retire \ -e ret_ins_miss \ -- openssl speed rsa2048 Doing 2048 bits private rsa's for 10s: 39 2048 bits private RSA's in 10.03s Doing 2048 bits public rsa's for 10s: 1469 2048 bits public RSA's in 9.47s version: 3.0.10 built on: Tue Aug 1 13:47:24 2023 UTC options: bn(64,64) CPUINFO: N/A sign verify sign/s verify/s rsa 2048 bits 0.257179s 0.006447s 3.9 155.1 Performance counter stats for 'system wide': 3112882 access_mmu_stlb 10550 miss_mmu_stlb 18251 access_mmu_pte_c 274765 rob_flush 22470560 btb_prediction_miss 3035839 itlb_miss 643549060 sync_del_fetch_g 133013 icache_miss 62982796 bpu_br_retire 287548 bpu_br_miss 8935910 ret_ins_retire 8308 ret_ins_miss 20.656182600 seconds time elapsed Reviewed-by: Ley Foon Tan <leyfoon.tan@starfivetech.com> Signed-off-by: Ji Sheng Teoh <jisheng.teoh@starfivetech.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nikita Shubin <n.shubin@yadro.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20231122030908.2981502-1-jisheng.teoh@starfivetech.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf tests coresight: Remove unused variables	zhujun2
	These variables are never referenced in the code, just remove them. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: zhujun2 <zhujun2@cmss.chinamobile.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: coresight@lists.linaro.org Link: https://lore.kernel.org/r/20231115064255.11057-1-zhujun2@cmss.chinamobile.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf lock: Fix a memory leak on an error path	zhaimingbing
	if a strdup-ed string is NULL,the allocated memory needs freeing. Signed-off-by: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Acked-by: Ingo Molnar <mingo@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20231124092657.10392-1-zhaimingbing@cmss.chinamobile.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf parse-events: Make legacy events lower priority than sysfs/JSON	Ian Rogers
	The perf tool has previously made legacy events the priority so with or without a PMU the legacy event would be opened: $ perf stat -e cpu-cycles,cpu/cpu-cycles/ true Using CPUID GenuineIntel-6-8D-1 intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch Attempting to add event pmu 'cpu' with 'cpu-cycles,' that may result in non-fatal errors After aliases, add event pmu 'cpu' with 'cpu-cycles,' that may result in non-fatal errors Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 833967 cpu -1 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ ... Fixes to make hybrid/BIG.little PMUs behave correctly, ie as core PMUs capable of opening legacy events on each, removing hard coded "cpu_core" and "cpu_atom" Intel PMU names, etc. caused a behavioral difference on Apple/ARM due to latent issues in the PMU driver reported in: https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@marcan.st/ As part of that report Mark Rutland <mark.rutland@arm.com> requested that legacy events not be higher in priority when a PMU is specified reversing what has until this change been perf's default behavior. With this change the above becomes: $ perf stat -e cpu-cycles,cpu/cpu-cycles/ true Using CPUID GenuineIntel-6-8D-1 Attempt to add: cpu/cpu-cycles=0/ ..after resolving event: cpu/event=0x3c/ Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ sys_perf_event_open: pid 827628 cpu -1 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ perf_event_attr: type 4 (PERF_TYPE_RAW) size 136 config 0x3c sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED\|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 exclude_guest 1 ------------------------------------------------------------ ... So the second event has become a raw event as /sys/devices/cpu/events/cpu-cycles exists. A fix was necessary to config_term_pmu in parse-events.c as check_alias expansion needs to happen after config_term_pmu, and config_term_pmu may need calling a second time because of this. config_term_pmu is updated to not use the legacy event when the PMU has such a named event (either from JSON or sysfs). The bulk of this change is updating all of the parse-events test expectations so that if a sysfs/JSON event exists for a PMU the test doesn't fail - a further sign, if it were needed, that the legacy event priority was a known and tested behavior of the perf tool. Reported-by: Hector Martin <marcan@marcan.st> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Hector Martin <marcan@marcan.st> Tested-by: Marc Zyngier <maz@kernel.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231123042922.834425-1-irogers@google.com [ Initialize the 'alias_rewrote_terms' variable to false to address a clang warning ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf cs-etm: Enable itrace option 'T'	Leo Yan
	Prior to Armv8.4, the feature FEAT_TRF is not supported by Arm CPUs. Consequently, the sysfs node 'ts_source' will not be set as 1 by the CoreSight ETM driver. On the other hand, the perf tool relies on the 'ts_source' node to determine whether the kernel timestamp is traced. Since the 'ts_source' is not set for Arm CPUs prior to Armv8.4, platforms in this case cannot utilize the traced timestamp as the kernel time. This patch enables the 'T' itrace option, which forcibly utilizes the traced timestamp as the kernel time. If users are aware that their working platform's Arm CoreSight shares the same counter with the kernel time, they can specify 'T' option to decode the traced timestamp as the kernel time. An usage example is: # perf record -e cs_etm// -- test_program # perf script --itrace=i10ibT # perf report --itrace=i10ibT Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20231014074513.1668000-3-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf auxtrace: Add 'T' itrace option for timestamp trace	Leo Yan
	An AUX trace can contain timestamp, but in some situations, the hardware trace module (e.g. Arm CoreSight) cannot decide the traced timestamp is the same source with CPU's time, thus the decoder can not use the timestamp trace for samples. This patch introduces 'T' itrace option. If users know the platforms they are working on have the same time counter with CPUs, users can use this new option to tell a decoder for using timestamp trace as kernel time. Signed-off-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20231014074513.1668000-2-leo.yan@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf test: Basic branch counter support	Kan Liang
	Add a basic test for the branch counter feature. The test verifies that - The new filter can be successfully applied on the supported platforms. - The counter value can be outputted via the perf report -D Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tinghao Zhang <tinghao.zhang@intel.com> Link: https://lore.kernel.org/r/20231107184020.1497571-1-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf script perl: Fail check on dynamic allocation	zhaimingbing
	Return ENOMEM when dynamic allocation failed. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20231120112356.8652-1-zhaimingbing@cmss.chinamobile.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf script python: Fail check on dynamic allocation	Paran Lee
	Add PyList_New() Fail check in get_field_numeric_entry() function and dynamic allocation checking for set_regs_in_dict(), python_start_script(). Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: MichelleJin <shjy180909@gmail.com> Signed-off-by: Paran Lee <p4ranlee@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Austin Kim <austindh.kim@gmail.com> Cc: Honggyu Kim <honggyu.kp@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20231120223218.9036-1-p4ranlee@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-27	perf test: Remove atomics from test_loop to avoid test failures	Nick Forrington
	The current use of atomics can lead to test failures, as tests (such as tests/shell/record.sh) search for samples with "test_loop" as the top-most stack frame, but find frames related to the atomic operation (e.g. __aarch64_ldadd4_relax). This change simply removes the "count" variable, as it is not necessary. Fixes: 1962ab6f6e0b39e4 ("perf test workload thloop: Make count increments atomic") Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Nick Forrington <nick.forrington@arm.com> Acked-by: Leo Yan <leo.yan@linaro.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231102162225.50028-1-nick.forrington@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-23	perf tools: Address python 3.6 DeprecationWarning for string scapes	Benjamin Gray
	Python 3.6 introduced a DeprecationWarning for invalid escape sequences. This is upgraded to a SyntaxWarning in Python 3.12, and will eventually be a syntax error. Fix these now to get ahead of it before it's an error. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Hartley Sweeten <hsweeten@visionengravers.com> Cc: Ian Abbott <abbotti@mev.co.uk> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Kieran Bingham <kbingham@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mykola Lysenko <mykolal@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Todd E Brandt <todd.e.brandt@linux.intel.com> Cc: Tom Rix <trix@redhat.com> Cc: linux-doc@vger.kernel.org Cc: linux-ia64@vger.kernel.org Cc: linux-kselftest@vger.kernel.org Cc: linux-pm@vger.kernel.org Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20230912060801.95533-6-bgray@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-22	perf build: Ensure sysreg-defs Makefile respects output dir	Oliver Upton
	Currently the sysreg-defs are written out to the source tree unconditionally, ignoring the specified output directory. Correct the build rule to emit the header to the output directory. Opportunistically reorganize the rules to avoid interleaving with the set of beauty make rules. Reported-by: Ian Rogers <irogers@google.com> Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20231121192956.919380-3-oliver.upton@linux.dev Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-11-22	tools perf: Add arm64 sysreg files to MANIFEST	Oliver Upton
	Ian pointed out that source tarballs are incomplete as of commit e2bdd172e665 ("perf build: Generate arm64's sysreg-defs.h and add to include path"), since the source files needed from the kernel tree do not appear in the manifest. Add them. Reported-by: Ian Rogers <irogers@google.com> Fixes: e2bdd172e665 ("perf build: Generate arm64's sysreg-defs.h and add to include path") Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20231121192956.919380-2-oliver.upton@linux.dev Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-11-22	tools/perf: Update tools's copy of mips syscall table	Namhyung Kim
	tldr; Just FYI, I'm carrying this on the perf tools tree. Full explanation: There used to be no copies, with tools/ code using kernel headers directly. From time to time tools/perf/ broke due to legitimate kernel hacking. At some point Linus complained about such direct usage. Then we adopted the current model. The way these headers are used in perf are not restricted to just including them to compile something. There are sometimes used in scripts that convert defines into string tables, etc, so some change may break one of these scripts, or new MSRs may use some different #define pattern, etc. E.g.: $ ls -1 tools/perf/trace/beauty/.sh \| head -5 tools/perf/trace/beauty/arch_errno_names.sh tools/perf/trace/beauty/drm_ioctl.sh tools/perf/trace/beauty/fadvise.sh tools/perf/trace/beauty/fsconfig.sh tools/perf/trace/beauty/fsmount.sh $ $ tools/perf/trace/beauty/fadvise.sh static const char fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ The tools/perf/check-headers.sh script, part of the tools/ build process, points out changes in the original files. So its important not to touch the copies in tools/ when doing changes in the original kernel headers, that will be done later, when check-headers.sh inform about the change to the perf tools hackers. Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: linux-mips@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231121225650.390246-14-namhyung@kernel.org
2023-11-22	tools/perf: Update tools's copy of s390 syscall table	Namhyung Kim
	tldr; Just FYI, I'm carrying this on the perf tools tree. Full explanation: There used to be no copies, with tools/ code using kernel headers directly. From time to time tools/perf/ broke due to legitimate kernel hacking. At some point Linus complained about such direct usage. Then we adopted the current model. The way these headers are used in perf are not restricted to just including them to compile something. There are sometimes used in scripts that convert defines into string tables, etc, so some change may break one of these scripts, or new MSRs may use some different #define pattern, etc. E.g.: $ ls -1 tools/perf/trace/beauty/.sh \| head -5 tools/perf/trace/beauty/arch_errno_names.sh tools/perf/trace/beauty/drm_ioctl.sh tools/perf/trace/beauty/fadvise.sh tools/perf/trace/beauty/fsconfig.sh tools/perf/trace/beauty/fsmount.sh $ $ tools/perf/trace/beauty/fadvise.sh static const char fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ The tools/perf/check-headers.sh script, part of the tools/ build process, points out changes in the original files. So its important not to touch the copies in tools/ when doing changes in the original kernel headers, that will be done later, when check-headers.sh inform about the change to the perf tools hackers. Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: linux-s390@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231121225650.390246-13-namhyung@kernel.org
2023-11-22	tools/perf: Update tools's copy of powerpc syscall table	Namhyung Kim
	tldr; Just FYI, I'm carrying this on the perf tools tree. Full explanation: There used to be no copies, with tools/ code using kernel headers directly. From time to time tools/perf/ broke due to legitimate kernel hacking. At some point Linus complained about such direct usage. Then we adopted the current model. The way these headers are used in perf are not restricted to just including them to compile something. There are sometimes used in scripts that convert defines into string tables, etc, so some change may break one of these scripts, or new MSRs may use some different #define pattern, etc. E.g.: $ ls -1 tools/perf/trace/beauty/.sh \| head -5 tools/perf/trace/beauty/arch_errno_names.sh tools/perf/trace/beauty/drm_ioctl.sh tools/perf/trace/beauty/fadvise.sh tools/perf/trace/beauty/fsconfig.sh tools/perf/trace/beauty/fsmount.sh $ $ tools/perf/trace/beauty/fadvise.sh static const char fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ The tools/perf/check-headers.sh script, part of the tools/ build process, points out changes in the original files. So its important not to touch the copies in tools/ when doing changes in the original kernel headers, that will be done later, when check-headers.sh inform about the change to the perf tools hackers. Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: linuxppc-dev@lists.ozlabs.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231121225650.390246-12-namhyung@kernel.org
2023-11-22	tools/perf: Update tools's copy of x86 syscall table	Namhyung Kim
	tldr; Just FYI, I'm carrying this on the perf tools tree. Full explanation: There used to be no copies, with tools/ code using kernel headers directly. From time to time tools/perf/ broke due to legitimate kernel hacking. At some point Linus complained about such direct usage. Then we adopted the current model. The way these headers are used in perf are not restricted to just including them to compile something. There are sometimes used in scripts that convert defines into string tables, etc, so some change may break one of these scripts, or new MSRs may use some different #define pattern, etc. E.g.: $ ls -1 tools/perf/trace/beauty/.sh \| head -5 tools/perf/trace/beauty/arch_errno_names.sh tools/perf/trace/beauty/drm_ioctl.sh tools/perf/trace/beauty/fadvise.sh tools/perf/trace/beauty/fsconfig.sh tools/perf/trace/beauty/fsmount.sh $ $ tools/perf/trace/beauty/fadvise.sh static const char fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ The tools/perf/check-headers.sh script, part of the tools/ build process, points out changes in the original files. So its important not to touch the copies in tools/ when doing changes in the original kernel headers, that will be done later, when check-headers.sh inform about the change to the perf tools hackers. Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231121225650.390246-11-namhyung@kernel.org
2023-11-22	tools headers: Update tools's copy of socket.h header	Namhyung Kim
	tldr; Just FYI, I'm carrying this on the perf tools tree. Full explanation: There used to be no copies, with tools/ code using kernel headers directly. From time to time tools/perf/ broke due to legitimate kernel hacking. At some point Linus complained about such direct usage. Then we adopted the current model. The way these headers are used in perf are not restricted to just including them to compile something. There are sometimes used in scripts that convert defines into string tables, etc, so some change may break one of these scripts, or new MSRs may use some different #define pattern, etc. E.g.: $ ls -1 tools/perf/trace/beauty/.sh \| head -5 tools/perf/trace/beauty/arch_errno_names.sh tools/perf/trace/beauty/drm_ioctl.sh tools/perf/trace/beauty/fadvise.sh tools/perf/trace/beauty/fsconfig.sh tools/perf/trace/beauty/fsmount.sh $ $ tools/perf/trace/beauty/fadvise.sh static const char fadvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [5] = "NOREUSE", }; $ The tools/perf/check-headers.sh script, part of the tools/ build process, points out changes in the original files. So its important not to touch the copies in tools/ when doing changes in the original kernel headers, that will be done later, when check-headers.sh inform about the change to the perf tools hackers. Cc: netdev@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20231121225650.390246-7-namhyung@kernel.org
2023-11-21	perf lock contention: Fix a build error on 32-bit	Yang Jihong
	Fix a build error on 32-bit system: util/bpf_lock_contention.c: In function 'lock_contention_get_name': util/bpf_lock_contention.c:253:50: error: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'u64 {aka long long unsigned int}' [-Werror=format=] snprintf(name_buf, sizeof(name_buf), "cgroup:%lu", cgrp_id); ~~^ %llu cc1: all warnings being treated as errors Fixes: d0c502e46e97 ("perf lock contention: Prepare to handle cgroups") Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: avagin@google.com Cc: daniel.diaz@linaro.org Link: https://lore.kernel.org/r/20231118024858.1567039-3-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-11-21	perf kwork: Fix a build error on 32-bit	Yang Jihong
	lkft reported a build error for 32-bit system: builtin-kwork.c: In function 'top_print_work': builtin-kwork.c:1646:28: error: format '%ld' expects argument of type 'long int', but argument 3 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=] 1646 \| ret += printf(" %ld ", PRINT_PID_WIDTH, work->id); \| ~~~^ ~~~~~~~~ \| \| \| \| long int u64 {aka long long unsigned int} \| %lld cc1: all warnings being treated as errors make[3]: *** [/builds/linux/tools/build/Makefile.build:106: /home/tuxbuild/.cache/tuxmake/builds/1/build/builtin-kwork.o] Error 1 Fix it. Fixes: 55c40e505234 ("perf kwork top: Introduce new top utility") Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: avagin@google.com Cc: daniel.diaz@linaro.org Link: https://lore.kernel.org/r/20231118024858.1567039-2-yangjihong1@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-11-15	perf vendor events riscv: Add StarFive Dubhe-80 JSON file	Ji Sheng Teoh
	StarFive's Dubhe-80 supports raw event id 0x00 - 0x22. The raw events are enabled through PMU node of DT binding. Besides raw event, add standard RISC-V firmware events to support monitoring of firmware event. Example of PMU DT node: pmu { compatible = "riscv,pmu"; riscv,raw-event-to-mhpmcounters = /* Event ID 1-31 / <0x00 0x00 0xFFFFFFFF 0xFFFFFFE0 0x00007FF8>, / Event ID 32-33 / <0x00 0x20 0xFFFFFFFF 0xFFFFFFFE 0x00007FF8>, / Event ID 34 */ <0x00 0x22 0xFFFFFFFF 0xFFFFFF22 0x00007FF8>; }; Example of 'perf stat' output: [root@user]# perf stat -a \ -e access_mmu_stlb \ -e miss_mmu_stlb \ -e access_mmu_pte_c \ -e rob_flush \ -e btb_prediction_miss \ -e itlb_miss \ -e sync_del_fetch_g \ -e icache_miss \ -e bpu_br_retire \ -e bpu_br_miss \ -e ret_ins_retire \ -e ret_ins_miss \ -- openssl speed rsa2048 Doing 2048 bits private rsa's for 10s: 39 2048 bits private RSA's in 10.14s Doing 2048 bits public rsa's for 10s: 1563 2048 bits public RSA's in 10.00s version: 3.0.11 built on: Tue Sep 19 13:02:31 2023 UTC options: bn(64,64) CPUINFO: N/A sign verify sign/s verify/s rsa 2048 bits 0.260000s 0.006398s 3.8 156.3 Performance counter stats for 'system wide': 1338350 access_mmu_stlb 1154025 miss_mmu_stlb 1162691 access_mmu_pte_c 34067 rob_flush 11212384 btb_prediction_miss 1256242 itlb_miss 652523491 sync_del_fetch_g 384465 icache_miss 64635789 bpu_br_retire 323440 bpu_br_miss 8785143 ret_ins_retire 31236 ret_ins_miss 20.760822480 seconds time elapsed Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Ji Sheng Teoh <jisheng.teoh@starfivetech.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Ley Foon Tan <leyfoon.tan@starfivetech.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nikita Shubin <n.shubin@yadro.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: linux-riscv@lists.infradead.org Link: https://lore.kernel.org/r/20231103082441.1389842-1-jisheng.teoh@starfivetech.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-15	perf report: Add s390 raw data interpretation for PAI counters	Thomas Richter
	Commit 39d62336f5c126ad ("s390/pai: add support for cryptography counters") added support for Processor Activity Instrumentation Facility (PAI) counters. These counters values are added as raw data with the perf sample during 'perf record'. Now add support to display these counters in the 'perf report' command. The counter number, its assigned name and value is now printed in addition to the hexadecimal output. Output before: # perf report -D 6 514766399626050 0x7b058 [0x48]: PERF_RECORD_SAMPLE(IP, 0x1): 303977/303977: 0 period: 1 addr: 0 ... thread: paitest:303977 ...... dso: <not found> 0x7b0a0@/root/perf.data.paicrypto [0x48]: event: 9 . . ... raw event: size 72 bytes . 0000: 00 00 00 09 00 01 00 48 00 00 00 00 00 00 00 00 .......H........ . 0010: 00 04 a3 69 00 04 a3 69 00 01 d4 2d 76 de a0 bb ...i...i...-v... . 0020: 00 00 00 00 00 01 5c 53 00 00 00 06 00 00 00 00 ......\S........ . 0030: 00 00 00 00 00 00 00 01 00 00 00 0c 00 07 00 00 ................ . 0040: 00 00 00 53 96 af 00 00 ...S.... Output after: # perf report -D 6 514766399626050 0x7b058 [0x48]: PERF_RECORD_SAMPLE(IP, 0x1): 303977/303977: 0 period: 1 addr: 0 ... thread: paitest:303977 ...... dso: <not found> 0x7b0a0@/root/perf.data.paicrypto [0x48]: event: 9 . . ... raw event: size 72 bytes . 0000: 00 00 00 09 00 01 00 48 00 00 00 00 00 00 00 00 .......H........ . 0010: 00 04 a3 69 00 04 a3 69 00 01 d4 2d 76 de a0 bb ...i...i...-v... . 0020: 00 00 00 00 00 01 5c 53 00 00 00 06 00 00 00 00 ......\S........ . 0030: 00 00 00 00 00 00 00 01 00 00 00 0c 00 07 00 00 ................ . 0040: 00 00 00 53 96 af 00 00 ...S.... Counter:007 km_aes_128 Value:0x00000000005396af <--- new Committer notes: Had to add ignore pragmas for that __packed function: +#pragma GCC diagnostic ignored "-Wpacked" +#pragma GCC diagnostic ignored "-Wattributes" Otherwise this doesn't build in things like debian experimentao cross building to mips64, etc. Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20231110110908.2312308-1-tmricht@linux.ibm.com [ Corrected non-existent commit referred to the right one: 39d62336f5c126ad ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-12	LSM: wireup Linux Security Module syscalls	Casey Schaufler
	Wireup lsm_get_self_attr, lsm_set_self_attr and lsm_list_modules system calls. Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Cc: linux-api@vger.kernel.org Reviewed-by: Mickaël Salaün <mic@digikod.net> [PM: forward ported beyond v6.6 due merge window changes] Signed-off-by: Paul Moore <paul@paul-moore.com>
2023-11-10	perf probe: Convert to check dwarf_getcfi feature	Namhyung Kim
	Now it has a feature check for the dwarf_getcfi(), use it and convert the code to check HAVE_DWARF_CFI_SUPPORT definition. Suggested-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: linux-toolchains@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Link: https://lore.kernel.org/r/20231110000012.3538610-10-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-10	perf dwarf-aux: Add die_find_variable_by_reg() helper	Namhyung Kim
	The die_find_variable_by_reg() will search for a variable or a parameter sub-DIE in the given scope DIE where the location matches to the given register. For the simplest and most common case, memory access usually happens with a base register and an offset to the field so the register holds a pointer in a variable or function parameter. Then we can find one if it has a location expression at the (instruction) address. This function only handles such a simple case for now. In this case, the expression has a DW_OP_regN operation where N < 32. If the register index (N) is greater than or equal to 32, DW_OP_regx operation with an operand which saves the value for the N would be used. It rejects expressions with more operations. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: linux-toolchains@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Link: https://lore.kernel.org/r/20231110000012.3538610-8-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-10	perf dwarf-aux: Add die_get_scopes() alternative to dwarf_getscopes()	Namhyung Kim
	The die_get_scopes() returns the number of enclosing DIEs for the given address and it fills an array of DIEs like dwarf_getscopes(). But it doesn't follow the abstract origin of inlined functions as we want information of the concrete instance. This is needed to check the location of parameters and local variables properly. Users can check the origin separately if needed. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: linux-toolchains@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Link: https://lore.kernel.org/r/20231110000012.3538610-7-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-10	perf dwarf-aux: Move #else block of #ifdef HAVE_DWARF_GETLOCATIONS_SUPPORT ↵	Namhyung Kim
	code to the header file It's a usual convention that the conditional code is handled in a header file. As I'm planning to add some more of them, let's move the current code to the header first. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: linux-toolchains@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Link: https://lore.kernel.org/r/20231110000012.3538610-6-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-10	perf dwarf-aux: Fix die_get_typename() for void *	Namhyung Kim
	The die_get_typename() is to return a C-like type name from DWARF debug entry and it follows data type if the target entry is a pointer type. But I found that void pointers don't have the type attribute to follow and then the function returns an error for that case. This results in a broken type string for void pointer types. For example, the following type entries are pointer types. <1><48c>: Abbrev Number: 4 (DW_TAG_pointer_type) <48d> DW_AT_byte_size : 8 <48d> DW_AT_type : <0x481> <1><491>: Abbrev Number: 211 (DW_TAG_pointer_type) <493> DW_AT_byte_size : 8 <1><494>: Abbrev Number: 4 (DW_TAG_pointer_type) <495> DW_AT_byte_size : 8 <495> DW_AT_type : <0x49e> The first one at offset 48c and the third one at offset 494 have type information. Then they are pointer types for the referenced types. But the second one at offset 491 doesn't have the type attribute. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: linux-toolchains@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Link: https://lore.kernel.org/r/20231110000012.3538610-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-10	perf tools: Add util/debuginfo.[ch] files	Namhyung Kim
	Split debuginfo data structure and related functions into a separate file so that it can be used by other components than the probe-finder. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: linux-toolchains@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Link: https://lore.kernel.org/r/20231110000012.3538610-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-10	perf annotate: Move raw_comment and raw_func_start fields out of 'struct ↵	Namhyung Kim
	ins_operands' Thoese two fields are used only for the jump_ops, so move them into the union to save some bytes. Also add jump__delete() callback not to free the fields as they didn't allocate new strings. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: WANG Rui <wangrui@loongson.cn> Cc: linux-toolchains@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Link: https://lore.kernel.org/r/20231110000012.3538610-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-10	perf annotate: Pass "-l" option to objdump conditionally	Namhyung Kim
	The "-l" option is to print line numbers in the objdump output. perf annotate TUI only can show the line numbers later but it causes big slow downs for the kernel binary. Similarly, showing source code also takes a long time and it already has an option to control it. $ time objdump ... -d -S -C vmlinux > /dev/null real 0m3.474s user 0m3.047s sys 0m0.428s $ time objdump ... -d -l -C vmlinux > /dev/null real 0m1.796s user 0m1.459s sys 0m0.338s $ time objdump ... -d -C vmlinux > /dev/null real 0m0.051s user 0m0.036s sys 0m0.016s As it's not needed for data type profiling, let's make it conditional so that it can skip the unnecessary work. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: linux-toolchains@vger.kernel.org Cc: linux-trace-devel@vger.kernel.org Link: https://lore.kernel.org/r/20231110000012.3538610-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-10	perf header: Additional note on AMD IBS for max_precise pmu cap	Arnaldo Carvalho de Melo
	x86 core PMU exposes supported maximum precision level via max_precise PMU capability. Although, AMD core PMU does not support precise mode, certain core PMU events with precise_ip > 0 are allowed and forwarded to IBS OP PMU. Display a note about this in the 'perf report' header output and document the details in the perf-list man page. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ross Zwisler <zwisler@chromium.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231107083331.901-2-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09	perf bpf: Don't synthesize BPF events when disabled	Ian Rogers
	If BPF sideband events are disabled on the command line, don't synthesize BPF events too. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Song Liu <song@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231102175735.2272696-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09	perf test: Add support for setting objdump binary via perf config	James Clark
	Add a 'perf config' variable that does the same thing as "perf test --objdump <x>". Also update the man page. Committer testing: # perf config test.objdump # perf test "object code reading" 26: Object code reading : Ok # perf config test.objdump=blah # perf config test.objdump test.objdump=blah # perf test "object code reading" 26: Object code reading : FAILED! # perf test -v "object code reading" 26: Object code reading : --- start --- test child forked, pid 600599 Looking at the vmlinux_path (8 entries long) Using /proc/kcore for kernel data Using /proc/kallsyms for symbols Parsing event 'cycles' Using CPUID AuthenticAMD-25-21-0 mmap size 528384B Reading object code for memory address: 0x4d9a02 File is: /home/acme/bin/perf On file address is: 0xd9a02 Objdump command is: blah -z -d --start-address=0x4d9a02 --stop-address=0x4d9a82 /home/acme/bin/perf objdump read too few bytes: 128 Bytes read differ from those read by objdump buf1 (dso): 0x48 0x85 0xff 0x74 0x29 0xe8 0x94 0xdf 0x07 0x00 0x8b 0x73 0x1c 0x48 0x8b 0x43 0x08 0xeb 0xa5 0x0f 0x1f 0x00 0x48 0x8b 0x45 0xe8 0x64 0x48 0x2b 0x04 0x25 0x28 0x00 0x00 0x00 0x75 0x0f 0x48 0x8b 0x5d 0xf8 0xc9 0xc3 0x0f 0x1f 0x00 0x48 0x8b 0x43 0x08 0xeb 0x84 0xe8 0xc5 0x3e 0xf3 0xff 0x0f 0x1f 0x44 0x00 0x00 0x55 0x48 0x89 0xe5 0x41 0x56 0x41 0x55 0x49 0x89 0xd5 0x41 0x54 0x49 0x89 0xfc 0x53 0x48 0x89 0xf3 0x48 0x83 0xec 0x30 0x48 0x8b 0x7e 0x20 0x64 0x48 0x8b 0x04 0x25 0x28 0x00 0x00 0x00 0x48 0x89 0x45 0xd8 0x31 0xc0 0x48 0x89 0x75 0xb0 0x48 0xc7 0x45 0xb8 0x00 0x00 0x00 0x00 0x48 0xc7 0x45 0xc0 0x00 0x00 0x00 0x00 0xe8 0xad 0xfa buf2 (objdump): 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00 test child finished with -1 ---- end ---- Object code reading: FAILED! # perf config test.objdump=/usr/bin/objdump # perf config test.objdump test.objdump=/usr/bin/objdump # perf test "object code reading" 26: Object code reading : Ok # Signed-off-by: James Clark <james.clark@arm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Fangrui Song <maskray@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Yonghong Song <yhs@fb.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20231106151051.129440-3-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09	perf test: Add option to change objdump binary	James Clark
	All of the other Perf subcommands that use objdump have an option to specify the binary, so add the same option to 'perf test'. This is useful if you have built the kernel with a different toolchain to the system one, where the system objdump may fail to disassemble vmlinux. Now this can be fixed with something like this: $ perf test --objdump llvm-objdump "object code reading" Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: James Clark <james.clark@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Fangrui Song <maskray@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Tom Rix <trix@redhat.com> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: llvm@lists.linux.dev Link: https://lore.kernel.org/r/20231106151051.129440-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09	perf tests offcpu: Adjust test case perf record offcpu profiling tests for s390	Thomas Richter
	On s390 using linux-next the test case: 87: perf record offcpu profiling tests fails. The root cause is this command # ./perf record --off-cpu -e dummy -- ./perf bench sched messaging -l 10 # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 0.231 [sec] [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.077 MB perf.data (401 samples) ] # It does not generate 800+ sample entries, on s390 usually around 40[1-9], sometimes a few more, but never more than 450. The higher the number of CPUs the lower the number of samples. Looking at function chain: bench_sched_messaging() +--> group() the senders and receiver threads are created. The senders and receivers call function ready() which writes one bytes and wait for a reply using poll system() call. As context switches are counted, the function ready() will trigger a context switch when no input data is available after the write system call. The write system call does not trigger context switches when the data size is small. And writing 1000 bytes (10 iterations with 100 bytes) is not much and certainly won't block. The 400+ context switch on s390 occur when the some receiver/sender threads call ready() and wait for the response from function bench_sched_messaging() being kicked off. Lower the number of expected context switches to 400 to succeed on s390. Suggested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Co-developed-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20231106091627.2022530-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-11-09	perf tools: Add the python_ext_build directory to .gitignore	Yang Jihong
	`python_ext_build` is the build directory for python.so, ignore it for cleaner git status. Signed-off-by: Yang Jihong <yangjihong1@huawei.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231030111438.1357962-2-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>