linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2024-04-29	selftests/powerpc: make sub-folders buildable on their own	Madhavan Srinivasan
	Build breaks when executing make with run_tests for sub-folders under powerpc. This is because, CFLAGS and GIT_VERSION macros are defined in Makefile of toplevel powerpc folder. make: Entering directory '/home/maddy/linux/tools/testing/selftests/powerpc/mm' gcc hugetlb_vs_thp_test.c ../harness.c ../utils.c -o /home/maddy/selftest_output//hugetlb_vs_thp_test hugetlb_vs_thp_test.c:6:10: fatal error: utils.h: No such file or directory 6 \| #include "utils.h" \| ^~~~~~~~~ compilation terminated. Fix this by adding the flags.mk in each sub-folder Makefile. Also remove the CFLAGS and GIT_VERSION macros from powerpc/ folder Makefile since the same is definied in flags.mk Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240229093711.581230-3-maddy@linux.ibm.com
2024-04-29	selftests/powerpc: Add flags.mk to support pmu buildable	Madhavan Srinivasan
	When running `make -C powerpc/pmu run_tests` from top level selftests directory, currently this error is being reported: make: Entering directory '/home/maddy/linux/tools/testing/selftests/powerpc/pmu' Makefile:40: warning: overriding recipe for target 'emit_tests' ../../lib.mk:111: warning: ignoring old recipe for target 'emit_tests' gcc -m64 count_instructions.c ../harness.c event.c lib.c ../utils.c loop.S -o /home/maddy/selftest_output//count_instructions In file included from count_instructions.c:13: event.h:12:10: fatal error: utils.h: No such file or directory 12 \| #include "utils.h" \| ^~~~~~~~~ compilation terminated. This is due to missing of include path in CFLAGS. That is, CFLAGS and GIT_VERSION macros are defined in the powerpc/ folder Makefile which in this case is not involved. To address the failure in case of executing specific sub-folder test directly, a new rule file has been addded by the patch called "flags.mk" under selftest/powerpc/ folder and is linked to all the Makefile of powerpc/pmu sub-folders. Reported-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> [mpe: Fixup ifeq, make GIT_VERSION simply expanded to avoid re-executing git describe] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240229093711.581230-2-maddy@linux.ibm.com
2024-04-29	selftests/powerpc: Re-order *FLAGS to follow lib.mk	Madhavan Srinivasan
	In some powerpc/ sub-folder Makefiles, CFLAGS are defined before lib.mk include. Clean it up by re-ordering the flags to follow after the mk include. This is needed to support sub-folders in powerpc/ buildable on its own. Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240229093711.581230-1-maddy@linux.ibm.com
2024-04-29	tools/power/x86/intel-speed-select: v1.19 release	Srinivas Pandruvada
	This version addresses issues with: - Support of SST BF/TF support per level - Increase number of CPUs displayed - Present all TRL levels for turbo-freq - Fix display for unsupported levels - Support multiple dies - Increase die count - Change CPU display for non compute domain Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-29	tools/power/x86/intel-speed-select: Display CPU as None for -1	Srinivas Pandruvada
	When there is no CPU in a power domain, display "None" instead of -1. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-29	tools/power/x86/intel-speed-select: SST BF/TF support per level	Srinivas Pandruvada
	SST BF and TF can be enabled/disabled per level. So check the current level support from the mask of supported levels. This change from a single level to mask for info.sst_tf_support and info.sst_tf_support is indicated by API version change. Use as mask for API version above 2. In this way there is no change in behavior when running on older kernel with API version 2. Since the tool can support now API version 3, update the supported API version. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-29	tools/power/x86/intel-speed-select: Increase number of CPUs displayed	Srinivas Pandruvada
	Currently max 128 CPUs can be displayed in the enable CPU list. Double the range. Since the size is big for stack allocation, change to static. Here changing to static is fine as these functions are called in serial. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-29	tools/power/x86/intel-speed-select: Present all TRL levels for turbo-freq	Srinivas Pandruvada
	For turbo-freq feature, only 3 levels of frequencies are displayed even if platform support more. Present all levels based on the CPU model. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-29	tools/power/x86/intel-speed-select: Fix display for unsupported levels	Srinivas Pandruvada
	During call to "intel-speed-select turbo-freq info" some junk values are reported for unsupported levels. Initialize the structure fact_info with 0s, so that isst_fact_display_information() will skip "0" values in the frequency. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-29	tools/power/x86/intel-speed-select: Support multiple dies	Srinivas Pandruvada
	When the die id is same as punit compute die ID, treat them same. In this case, when for_each_online_power_domain_in_set() is called, then don't loop for each punit in a die. Just loop for all punits in a package. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-29	tools/power/x86/intel-speed-select: Increase die count	Srinivas Pandruvada
	TPMI platform information supports up to 16 compute dies. So increase the range. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-29	tools/arch/x86/intel_sdsi: Add current meter support	David E. Box
	Add support to read the 'meter_current' file. The display is the same as the 'meter_certificate', but will show the current snapshot of the counters. Signed-off-by: David E. Box <david.e.box@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240411025856.2782476-10-david.e.box@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-29	tools/arch/x86/intel_sdsi: Simplify ascii printing	David E. Box
	Add #define for feature length and move NUL assignment from callers to get_feature(). Signed-off-by: David E. Box <david.e.box@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240411025856.2782476-9-david.e.box@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-29	tools/arch/x86/intel_sdsi: Fix meter_certificate decoding	David E. Box
	Fix errors in the calculation of the start position of the counters and in the display loop. While here, use a #define for the bundle count and size. Fixes: 7fdc03a7370f ("tools/arch/x86: intel_sdsi: Add support for reading meter certificates") Signed-off-by: David E. Box <david.e.box@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240411025856.2782476-8-david.e.box@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-29	tools/arch/x86/intel_sdsi: Fix meter_show display	David E. Box
	Fixes sdsi_meter_cert_show() to correctly decode and display the meter certificate output. Adds and displays a missing version field, displays the ASCII name of the signature, and fixes the print alignment. Fixes: 7fdc03a7370f ("tools/arch/x86: intel_sdsi: Add support for reading meter certificates") Signed-off-by: David E. Box <david.e.box@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240411025856.2782476-7-david.e.box@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-29	tools/arch/x86/intel_sdsi: Fix maximum meter bundle length	David E. Box
	The maximum number of bundles in the meter certificate was set to 8 which is much less than the maximum. Instead, since the bundles appear at the end of the file, set it based on the remaining file size from the bundle start position. Fixes: 7fdc03a7370f ("tools/arch/x86: intel_sdsi: Add support for reading meter certificates") Signed-off-by: David E. Box <david.e.box@linux.intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20240411025856.2782476-6-david.e.box@linux.intel.com Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2024-04-28	kselftest: arm64: Add a null pointer check	Kunwu Chan
	There is a 'malloc' call, which can be unsuccessful. This patch will add the malloc failure checking to avoid possible null dereference and give more information about test fail reasons. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Link: https://lore.kernel.org/r/20240423082102.2018886-1-chentao@kylinos.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28	arm64/sysreg: Update PIE permission encodings	Shiqi Liu
	Fix left shift overflow issue when the parameter idx is greater than or equal to 8 in the calculation of perm in PIRx_ELx_PERM macro. Fix this by modifying the encoding to use a long integer type. Signed-off-by: Shiqi Liu <shiqiliu@hust.edu.cn> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20240421063328.29710-1-shiqiliu@hust.edu.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-28	kselftest/arm64: Remove unused parameters in abi test	xieming
	Remove unused parameter i in tpidr2.c main function. Signed-off-by: xieming <xieming@kylinos.cn> Link: https://lore.kernel.org/r/20240422015730.89805-1-xieming@kylinos.cn Signed-off-by: Will Deacon <will@kernel.org>
2024-04-27	Merge tag 'riscv-for-linus-6.9-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Palmer Dabbelt: - A fix for TASK_SIZE on rv64/NOMMU, to reflect the lack of user/kernel separation - A fix to avoid loading rv64/NOMMU kernel past the start of RAM - A fix for RISCV_HWPROBE_EXT_ZVFHMIN on ilp32 to avoid signed integer overflow in the bitmask - The sud_test kselftest has been fixed to properly swizzle the syscall number into the return register, which are not the same on RISC-V - A fix for a build warning in the perf tools on rv32 - A fix for the CBO selftests, to avoid non-constants leaking into the inline asm - A pair of fixes for T-Head PBMT errata probing, which has been renamed MAE by the vendor * tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: RISC-V: selftests: cbo: Ensure asm operands match constraints, take 2 perf riscv: Fix the warning due to the incompatible type riscv: T-Head: Test availability bit before enabling MAE errata riscv: thead: Rename T-Head PBMT to MAE selftests: sud_test: return correct emulated syscall value on RISC-V riscv: hwprobe: fix invalid sign extension for RISCV_HWPROBE_EXT_ZVFHMIN riscv: Fix loading 64-bit NOMMU kernels past the start of RAM riscv: Fix TASK_SIZE on 64-bit NOMMU
2024-04-26	perf test: Reintroduce -p/--parallel and make -S/--sequential the default	Arnaldo Carvalho de Melo
	We can't default to doing parallel tests as there are tests that compete for the same resources and thus clash, for instance tests that put in place 'perf probe' probes, that clean the probes without regard to other tests needs, ARM64 coresight tests, Intel PT ones, etc. So reintroduce --p/--parallel and make -S/--sequential the default. We need to come up with infrastructure that state which tests can't run in parallel because they need exclusive access to some resource, something as simple as "probes" that would then avoid 'perf probe' tests from running while other such test is running, or make the tests more resilient, till then we can't use parallel mode as default. While at it, document all these options in the 'perf test' man page. Reported-by: Adrian Hunter <adrian.hunter@intel.com> Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reported-by: James Clark <james.clark@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/Ziwm18BqIn_vc1vn@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	tools headers: Synchronize linux/bits.h with the kernel sources	Arnaldo Carvalho de Melo
	To pick up the changes in this cset: 3c7a8e190bc58081 ("uapi: introduce uapi-friendly macros for GENMASK") That just causes perf to rebuild. Its just some macros going to an uapi header that we now have to grab a copy into tools/ as well. This addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/include/linux/bits.h include/linux/bits.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: https://lore.kernel.org/lkml/ZiwJsFOBez0MS4r9@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	tools headers x86 cpufeatures: Sync with the kernel sources to pick BHI ↵	Arnaldo Carvalho de Melo
	mitigation changes To pick the changes from: 95a6ccbdc7199a14 ("x86/bhi: Mitigate KVM by default") ec9404e40e8f3642 ("x86/bhi: Add BHI mitigation knob") be482ff9500999f5 ("x86/bhi: Enumerate Branch History Injection (BHI) bug") 0f4a837615ff925b ("x86/bhi: Define SPEC_CTRL_BHI_DIS_S") 7390db8aea0d64e9 ("x86/bhi: Add support for clearing branch history at syscall entry") This causes these perf files to be rebuilt and brings some X86_FEATURE that will be used when updating the copies of tools/arch/x86/lib/mem{cpy,set}_64.S with the kernel sources: CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o And addresses this perf build warning: Warning: Kernel ABI header differences: diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/ZirIx4kPtJwGFZS0@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf annotate: Fix data type profiling on stdio	Namhyung Kim
	The loop in hists__find_annotations() never set the 'nd' pointer to NULL and it makes stdio output repeating the last element forever. I think it doesn't set to NULL for TUI to prevent it from exiting unexpectedly. But it should just set on stdio mode. Fixes: d001c7a7f4736743 ("perf annotate-data: Add hist_entry__annotate_data_tui()") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240423020643.740029-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf build: Pretend scandirat is missing with msan	Ian Rogers
	Memory sanitizer lacks an interceptor for scandirat, reporting all memory it allocates as uninitialized. Memory sanitizer has a scandir interceptor so use the fallback function in this case. This allows 'perf test' to run under memory sanitizer. Additional notes from Ian on running in this mode: Note, as msan needs to instrument memory allocations libraries need to be compiled with it. I lacked the msan built libraries and so built with: ``` $ make -C tools/perf O=/tmp/perf DEBUG=1 EXTRA_CFLAGS="-O0 -g -fno-omit-frame-pointer -fsanitize=memory -fsanitize-memory-track-origins" CC=clang CXX=clang++ HOSTCC=clang NO_LIBTRACEEVENT=1 NO_LIBELF=1 BUILD_BPF_SKEL=0 NO_LIBPFM=1 ``` oh, I disabled libbpf here as the bpf system call also lacks msan interceptors. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240320163244.1287780-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf intel-pt: Fix unassigned instruction op (discovered by MemorySanitizer)	Adrian Hunter
	MemorySanitizer discovered instances where the instruction op value was not assigned.: WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x5581c00a76b3 in intel_pt_sample_flags tools/perf/util/intel-pt.c:1527:17 Uninitialized value was stored to memory at #0 0x5581c005ddf8 in intel_pt_walk_insn tools/perf/util/intel-pt-decoder/intel-pt-decoder.c:1256:25 The op value is used to set branch flags for branch instructions encountered when walking the code, so fix by setting op to INTEL_PT_OP_OTHER in other cases. Fixes: 4c761d805bb2d2ea ("perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type") Reported-by: Ian Rogers <irogers@google.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Closes: https://lore.kernel.org/linux-perf-users/20240320162619.1272015-1-irogers@google.com/ Link: https://lore.kernel.org/r/20240326083223.10883-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf record: Fix comment misspellings	Howard Chu
	Fix comment misspellings Signed-off-by: Howard Chu <howardchu95@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240425060427.1800663-1-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf annotate: Update DSO binary type when trying build-id	Namhyung Kim
	dso__disassemble_filename() tries to get the filename for objdump (or capstone) using build-id. But I found sometimes it didn't disassemble some functions. It turned out that those functions belong to a DSO which has no binary type set. It seems it sets the binary type for some special files only - like kernel (kallsyms or kcore) or BPF images. And there's a logic to skip dso with DSO_BINARY_TYPE__NOT_FOUND. As it's checked the build-id cache link, it should set the binary type as DSO_BINARY_TYPE__BUILD_ID_CACHE. Fixes: 873a83731f1cc85c ("perf annotate: Skip DSOs not found") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240425005157.1104789-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf annotate: Fallback disassemble to objdump when capstone fails	Namhyung Kim
	I found some cases that capstone failed to disassemble. Probably my capstone is an old version but anyway there's a chance it can fail. And then it silently stopped in the middle. In my case, it didn't understand "RDPKRU" instruction. Let's check if the capstone disassemble reached the end of the function and fallback to objdump if not. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240425005157.1104789-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf annotate-data: Check if 'struct annotation_source' was allocated on ↵	Namhyung Kim
	'perf report' TUI As it removed the sample accounting for code when no symbol sort key is given for 'perf report' TUI, it might not have allocated the 'struct annotated_source' yet. Let's check if it's NULL first. Fixes: 6cdd977ec24e1538 ("perf report: Do not collect sample histogram unnecessarily") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240424230015.1054013-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf test: Add a new test for 'perf annotate'	Namhyung Kim
	Add a basic 'perf annotate' test: $ ./perf test annotate -vv 76: perf annotate basic tests: --- start --- test child forked, pid 846989 fbcd0-fbd55 l noploop perf does have symbol 'noploop' Basic perf annotate test : 0 0xfbcd0 <noploop>: 0.00 : fbcd0: pushq %rbp 0.00 : fbcd1: movq %rsp, %rbp 0.00 : fbcd4: pushq %r12 0.00 : fbcd6: pushq %rbx 0.00 : fbcd7: movl $1, %ebx 0.00 : fbcdc: subq $0x10, %rsp 0.00 : fbce0: movq %fs:0x28, %rax 0.00 : fbce9: movq %rax, -0x18(%rbp) 0.00 : fbced: xorl %eax, %eax 0.00 : fbcef: testl %edi, %edi 0.00 : fbcf1: jle 0xfbd04 0.00 : fbcf3: movq (%rsi), %rdi 0.00 : fbcf6: movl $0xa, %edx 0.00 : fbcfb: xorl %esi, %esi 0.00 : fbcfd: callq 0x41920 0.00 : fbd02: movl %eax, %ebx 0.00 : fbd04: leaq -0x7b(%rip), %r12 # fbc90 <sighandler> 0.00 : fbd0b: movl $2, %edi 0.00 : fbd10: movq %r12, %rsi 0.00 : fbd13: callq 0x40a00 0.00 : fbd18: movl $0xe, %edi 0.00 : fbd1d: movq %r12, %rsi 0.00 : fbd20: callq 0x40a00 0.00 : fbd25: movl %ebx, %edi 0.00 : fbd27: callq 0x407c0 0.10 : fbd2c: movl 0x89785e(%rip), %eax # 993590 <done> 0.00 : fbd32: testl %eax, %eax 99.90 : fbd34: je 0xfbd2c 0.00 : fbd36: movq -0x18(%rbp), %rax 0.00 : fbd3a: subq %fs:0x28, %rax 0.00 : fbd43: jne 0xfbd50 0.00 : fbd45: addq $0x10, %rsp 0.00 : fbd49: xorl %eax, %eax 0.00 : fbd4b: popq %rbx 0.00 : fbd4c: popq %r12 0.00 : fbd4e: popq %rbp 0.00 : fbd4f: retq 0.00 : fbd50: callq 0x407e0 0.00 : fbcd0: pushq %rbp 0.00 : fbcd1: movq %rsp, %rbp 0.00 : fbcd4: pushq %r12 0.00 : fbcd0: push %rbp 0.00 : fbcd1: mov %rsp,%rbp 0.00 : fbcd4: push %r12 Basic annotate test [Success] ---- end(0) ---- 76: perf annotate basic tests : Ok Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240424001231.849972-1-namhyung@kernel.org [ Improved a bit the error messages ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Tidy the setting of the default event name	Ian Rogers
	Add comments. Pass ownership of the event name to save on a strdup. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-17-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Minor grouping tidy up	Ian Rogers
	Add comments. Ensure leader->group_name is freed before overwriting it. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-16-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-event: Constify event_symbol arrays	Ian Rogers
	Moves 352 bytes from .data to .data.rel.ro. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Improvements to modifier parsing	Ian Rogers
	Use a struct/bitmap rather than a copied string from lexer. In lexer give improved error message when too many precise flags are given or repeated modifiers. Before: $ perf stat -e 'cycles:kuk' true event syntax error: 'cycles:kuk' \___ Bad modifier ... $ perf stat -e 'cycles:pppp' true event syntax error: 'cycles:pppp' \___ Bad modifier ... $ perf stat -e '{instructions:p,cycles:pp}:pp' -a true event syntax error: '..cycles:pp}:pp' \___ Bad modifier ... After: $ perf stat -e 'cycles:kuk' true event syntax error: 'cycles:kuk' \___ Duplicate modifier 'k' (kernel) ... $ perf stat -e 'cycles:pppp' true event syntax error: 'cycles:pppp' \___ Maximum precise value is 3 ... $ perf stat -e '{instructions:p,cycles:pp}:pp' true event syntax error: '..cycles:pp}:pp' \___ Maximum combined precise value is 3, adding precision to "cycles:pp" ... Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Inline parse_events_evlist_error	Ian Rogers
	Inline parse_events_evlist_error that is only used in parse_events_error. Modify parse_events_error to not report a parser error unless errors haven't already been reported. Make it clearer that the latter case only happens for unrecognized input. Before: $ perf stat -e 'cycles/period=99999999999999999999/' true event syntax error: 'cycles/period=99999999999999999999/' \___ parser error event syntax error: '..les/period=99999999999999999999/' \___ Bad base 10 number "99999999999999999999" Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events $ perf stat -e 'cycles:xyz' true event syntax error: 'cycles:xyz' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events After: $ perf stat -e 'cycles/period=99999999999999999999/xyz' true event syntax error: '..les/period=99999999999999999999/xyz' \___ Bad base 10 number "99999999999999999999" Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events $ perf stat -e 'cycles:xyz' true event syntax error: 'cycles:xyz' \___ Unrecognized input Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Improve error message for bad numbers	Ian Rogers
	Use the error handler from the parse_state to give a more informative error message. Before: $ perf stat -e 'cycles/period=99999999999999999999/' true event syntax error: 'cycles/period=99999999999999999999/' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events After: $ perf stat -e 'cycles/period=99999999999999999999/' true event syntax error: 'cycles/period=99999999999999999999/' \___ parser error event syntax error: '..les/period=99999999999999999999/' \___ Bad base 10 number "99999999999999999999" Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-12-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Inline parse_events_update_lists	Ian Rogers
	The helper function just wraps a splice and free. Making the free inline removes a comment, so then it just wraps a splice which we can make inline too. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Prefer sysfs/JSON hardware events over legacy	Ian Rogers
	It was requested that RISC-V be able to add events to the perf tool so the PMU driver didn't need to map legacy events to config encodings: https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@rivosinc.com/ This change makes the priority of events specified without a PMU the same as those specified with a PMU, namely sysfs and JSON events are checked first before using the legacy encoding. The hw_term is made more generic as a hardware_event that encodes a pair of string and int value, allowing parse_events_multi_pmu_add to fall back on a known encoding when the sysfs/JSON adding fails for core events. As this covers PE_VALUE_SYM_HW, that token is removed and related code simplified. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Constify parse_events_add_numeric	Ian Rogers
	Allow the term list to be const so that other functions can pass const term lists. Add const as necessary to called functions. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Handle PE_TERM_HW in name_or_raw	Ian Rogers
	Avoid duplicate logic for name_or_raw and PE_TERM_HW by having a rule to turn PE_TERM_HW into a name_or_raw. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Legacy cache names on all PMUs and lower priority	Ian Rogers
	Prior behavior is to not look for legacy cache names in sysfs/JSON and to create events on all core PMUs. New behavior is to look for sysfs/JSON events first on all PMUs, for core PMUs add a legacy event if the sysfs/JSON event isn't present. This is done so that there is consistency with how event names in terms are handled and their prioritization of sysfs/JSON over legacy. It may make sense to use a legacy cache event name as an event name on a non-core PMU so we should allow it. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf tests parse-events: Use "branches" rather than "cache-references"	Ian Rogers
	Switch from "cache-references" to "branches" in test as Intel has a sysfs event for "cache-references" and changing the priority for sysfs over legacy causes the test to fail. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf pmu: Refactor perf_pmu__match()	Ian Rogers
	Move all implementation to pmu code. Don't allocate a fnmatch wildcard pattern, matching ignoring the suffix already handles this, and only use fnmatch if the given PMU name has a '*' in it. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Avoid copying an empty list	Ian Rogers
	In parse_events_add_pmu, delay copying the list of terms until it is known the list contains terms. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Directly pass PMU to parse_events_add_pmu()	Ian Rogers
	Avoid passing the name of a PMU then finding it again, just directly pass the PMU. parse_events_multi_pmu_add_or_add_pmu() is the only version that needs to find a PMU, so move the find there. Remove the error message as parse_events_multi_pmu_add_or_add_pmu will given an error at the end when a name isn't either a PMU name or event name. Without the error message being created the location in the input parameter (loc) can be removed. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf parse-events: Factor out '<event_or_pmu>/.../' parsing	Ian Rogers
	Factor out the case of an event or PMU name followed by a slash based term list. This is with a view to sharing the code with new legacy hardware parsing. Use early return to reduce indentation in the code. Make parse_events_add_pmu static now it doesn't need sharing with parse-events.y. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Atish Patra <atishp@rivosinc.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Beeman Strong <beeman@rivosinc.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240416061533.921723-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf scripts python: Add a script to run instances of 'perf script' in parallel	Adrian Hunter
	Add a Python script to run a perf script command multiple times in parallel, using perf script options --cpu and --time so that each job processes a different chunk of the data. Extend perf script tests to test also the new script. The script supports the use of normal 'perf script' options like --dlfilter and --script, so that the benefit of running parallel jobs naturally extends to them also. In addition, a command can be provided (refer --pipe-to option) to pipe standard output to a custom command. Refer to the script's own help text at the end of the patch for more details. The script is useful for Intel PT traces, that can be efficiently decoded by 'perf script' when split by CPU and/or time ranges. Running jobs in parallel can decrease the overall decoding time. Committer testing: Ian reported that shellcheck found some issues, I installed it as there are no warnings about it not being available, but when available it fails the build with: TEST /tmp/build/perf-tools-next/tests/shell/script.sh.shellcheck_log CC /tmp/build/perf-tools-next/util/header.o In tests/shell/script.sh line 20: rm -rf "${temp_dir}/"* ^-------------^ SC2115 (warning): Use "${var:?}" to ensure this never expands to /* . In tests/shell/script.sh line 83: output1_dir="${temp_dir}/output1" ^---------^ SC2034 (warning): output1_dir appears unused. Verify use (or export if used externally). In tests/shell/script.sh line 84: output2_dir="${temp_dir}/output2" ^---------^ SC2034 (warning): output2_dir appears unused. Verify use (or export if used externally). In tests/shell/script.sh line 86: python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}" ^-----------^ SC2154 (warning): output_dir is referenced but not assigned (did you mean 'output1_dir'?). For more information: https://www.shellcheck.net/wiki/SC2034 -- output1_dir appears unused. Verif... https://www.shellcheck.net/wiki/SC2115 -- Use "${var:?}" to ensure this nev... https://www.shellcheck.net/wiki/SC2154 -- output_dir is referenced but not ... Did these fixes: - rm -rf "${temp_dir}/"* + rm -rf "${temp_dir:?}/"* And: @@ -83,8 +83,8 @@ test_parallel_perf() output1_dir="${temp_dir}/output1" output2_dir="${temp_dir}/output2" perf record -o "${perf_data}" --sample-cpu uname - python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}" - python3 "${pp}" -o "${output_dir}" --jobs 4 --verbose --per-cpu -- perf script -i "${perf_data}" + python3 "${pp}" -o "${output1_dir}" --jobs 4 --verbose -- perf script -i "${perf_data}" + python3 "${pp}" -o "${output2_dir}" --jobs 4 --verbose --per-cpu -- perf script -i "${perf_data}" After that: root@number:~# perf test -vv "perf script tests" 97: perf script tests: --- start --- test child forked, pid 4084139 DB test [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.032 MB /tmp/perf-test-script.T4MJDr0L6J/perf.data (7 samples) ] <SNIP> DB test [Success] parallel-perf test Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.034 MB /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data (7 samples) ] Starting: perf script --time=,91898.301878499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --time=91898.301878500,91898.301905999 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --time=91898.301906000,91898.301933499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --time=91898.301933500, -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --time=91898.301878500,91898.301905999 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --time=91898.301906000,91898.301933499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 4 jobs: 2 completed, 2 running Finished: perf script --time=,91898.301878499 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --time=91898.301933500, -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 4 jobs: 4 completed, 0 running All jobs finished successfully parallel-perf.py done Starting: perf script --cpu=0 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=1 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=2 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=3 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=0 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=1 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=2 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=3 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 4 completed, 0 running Starting: perf script --cpu=4 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=5 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=6 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=7 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=4 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=5 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=6 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=7 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 8 completed, 0 running Starting: perf script --cpu=8 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=9 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=10 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=11 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=8 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=9 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=10 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=11 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 12 completed, 0 running Starting: perf script --cpu=12 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=13 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=14 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=15 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=12 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=13 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=14 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=15 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 16 completed, 0 running Starting: perf script --cpu=16 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=17 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=18 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=19 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=16 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=17 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=18 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=19 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 20 completed, 0 running Starting: perf script --cpu=20 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=21 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=22 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=23 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=20 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=21 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=22 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=23 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 24 completed, 0 running Starting: perf script --cpu=24 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=25 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=26 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Starting: perf script --cpu=27 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=25 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=26 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data Finished: perf script --cpu=27 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 27 completed, 1 running Finished: perf script --cpu=24 -i /tmp/perf-test-script.T4MJDr0L6J/pp-perf.data There are 28 jobs: 28 completed, 0 running All jobs finished successfully parallel-perf.py done parallel-perf test [Success] --- Cleaning up --- ---- end(0) ---- 97: perf script tests : Ok root@number:~# Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240423133248.10206-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	tools lib rbtree: Pick some improvements from the kernel rbtree code	Arnaldo Carvalho de Melo
	The tools/lib/rbtree.c code came from the kernel, removing the EXPORT_SYMBOL() that make sense only there, unfortunately it is not being checked with tools/perf/check_headers.sh, will try to remedy this, till then pick the improvements from: b0687c1119b4e8c8 ("lib/rbtree: use '+' instead of '\|' for setting color.") That I noticed by doing: diff -u tools/lib/rbtree.c lib/rbtree.c diff -u tools/include/linux/rbtree_augmented.h include/linux/rbtree_augmented.h There is one other cases, but lets pick it in separate patches. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Noah Goldstein <goldstein.w.n@gmail.com> Link: https://lore.kernel.org/lkml/ZigZzeFoukzRKG1Q@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26	perf tests shell kprobes: Add missing description as used by 'perf test' output	Arnaldo Carvalho de Melo
	Before: root@x1:~# perf test 76 76: SPDX-License-Identifier: GPL-2.0 : Ok root@x1:~# After: root@x1:~# perf test 76 76: Add 'perf probe's, list and remove them. : Ok root@x1:~# Reviewed-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Veronika Molnarova <vmolnaro@redhat.com> Link: https://lore.kernel.org/lkml/ZigRDKUGkcDqD-yW@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>