path: root/tools/perf
Age | Commit message | Author
2025-02-19 | perf report: Fix input reload/switch with symbol sort key | Dmitry Vyukov
Currently the code checks that there is no "ipc" in the sort order and then adds an "ipc" string. This always errors out on the second pass after input reload/switch, since the sort order already contains "ipc". Do the ipc check/fixup only on the first pass. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Link: https://lore.kernel.org/r/20250108063628.215577-1-dvyukov@google.com Fixes: ec6ae74fe8f0 ("perf report: Display average IPC and IPC coverage per symbol") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
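The first-pass-only fixup described in this commit can be illustrated with a minimal Python sketch. It is not the perf implementation; the function name `fixup_sort_order` and the `first_pass` flag are hypothetical, chosen only to show why repeating the check after a reload would fail.

```python
def fixup_sort_order(sort_order: str, first_pass: bool) -> str:
    """Append the "ipc" key once; later passes reuse the fixed-up string.

    Hypothetical helper for illustration, not the perf code.
    """
    keys = sort_order.split(",")
    if first_pass:
        # The check is only meaningful before the key has been appended.
        if "ipc" in keys:
            raise ValueError('"ipc" must not be given explicitly')
        keys.append("ipc")
    return ",".join(keys)


order = fixup_sort_order("sym", first_pass=True)
# After an input reload/switch the string already contains "ipc",
# so the check and fixup are skipped instead of erroring out.
order_after_reload = fixup_sort_order(order, first_pass=False)
print(order, order_after_reload)
```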
2025-02-19 | perf report: Support switching data w/ and w/o callchains | Namhyung Kim
The symbol_conf.use_callchain should be reset when switching to a new data file, otherwise report__setup_sample_type() will show an error message that callchains are enabled but there is no callchain data. The function will also turn on callchains if the data has PERF_SAMPLE_CALLCHAIN, so I think it's ok to reset symbol_conf.use_callchain here. Link: https://lore.kernel.org/r/20250211060745.294289-2-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19 | perf report: Switch data file correctly in TUI | Namhyung Kim
The 's' key switches to a new data file and loads the data in the same window. switch_data_file() shows a popup menu to select the data file the user wants and updates the 'input_name' global variable. But cmd_report() didn't update data.path from the new 'input_name' and kept using the old file. This is a fairly old bug and I assume people don't use this feature much. :) Link: https://lore.kernel.org/r/20250211060745.294289-1-namhyung@kernel.org Closes: https://lore.kernel.org/linux-perf-users/89e678bc-f0af-4929-a8a6-a2666f1294a4@linaro.org Fixes: f5fc14124c5cefdd ("perf tools: Add data object to handle perf data file") Reported-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19 | perf tools: Fix up some comments and code to properly use the event_source bus | Greg Kroah-Hartman
In sysfs, the perf events are all located in /sys/bus/event_source/devices/ but some places ended up hard-coding the location to be at the root of /sys/devices/ which could be very risky as you do not exactly know what type of device you are accessing in sysfs at that location. So fix this all up by properly pointing everything at the bus device list instead of the root of the sysfs devices/ tree. Cc: stable <stable@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/2025021955-implant-excavator-179d@gregkh Signed-off-by: Namhyung Kim <namhyung@kernel.org>
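As a rough illustration of reading PMUs from the bus device list rather than from the root of /sys/devices/, the following Python sketch lists the entries under /sys/bus/event_source/devices/. The helper name `list_pmus` and its optional directory argument are assumptions made for this example; perf itself does this in C.

```python
import os

# Location named in the commit message above.
EVENT_SOURCE_DEVICES = "/sys/bus/event_source/devices"


def list_pmus(devices_dir: str = EVENT_SOURCE_DEVICES) -> list:
    """Return the sorted names found in the event_source bus device list.

    Hypothetical helper: returns an empty list if the directory is absent
    (for example on a system without sysfs mounted).
    """
    if not os.path.isdir(devices_dir):
        return []
    return sorted(os.listdir(devices_dir))


print(list_pmus())
```

Every entry found this way is an event source, which is the property the commit relies on; a name looked up directly under /sys/devices/ carries no such guarantee.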
2025-02-19 | perf list: Also append PMU name in verbose mode | James Clark
When listing in verbose mode, the long description is used but the PMU name isn't appended. There doesn't seem to be a reason to exclude it when asking for more information, so use the same print block for both long and short descriptions.

Before:
  $ perf list -v
  ...
  inst_retired
       [Instruction architecturally executed]

After:
  $ perf list -v
  ...
  inst_retired
       [Instruction architecturally executed. Unit: armv8_cortex_a57]

Signed-off-by: James Clark <james.clark@linaro.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250219151622.1097289-1-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19 | perf vendor events arm64: Fix incorrect CPU_CYCLE in metrics expr | Yangyu Chen
Some existing metrics for Neoverse N3 and V3 expressions use CPU_CYCLE to represent the number of cycles, but this is incorrect. The correct event to use is CPU_CYCLES. I encountered this issue while working on a patch to add pmu events for Cortex A720 and A520 by reusing the existing patch for Neoverse N3 and V3 by James Clark [1] and my check script [2] reported this issue. [1] https://lore.kernel.org/lkml/20250122163504.2061472-1-james.clark@linaro.org/ [2] https://github.com/cyyself/arm-pmu-check Signed-off-by: Yangyu Chen <cyy@cyyself.name> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/tencent_D4ED18476ADCE818E31084C60E3E72C14907@qq.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
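The kind of check that catches this class of error can be sketched in a few lines of Python. This is not the check script linked in the commit; the function `unknown_events` and the simple identifier regex are assumptions for illustration, and a real metric expression grammar has functions and literals that this sketch does not handle.

```python
import re


def unknown_events(expr: str, known_events: set) -> set:
    """Return identifiers in a metric expression that are not known events.

    Simplistic: any token starting with a letter or underscore is treated
    as an event name.
    """
    identifiers = set(re.findall(r"[A-Za-z_][A-Za-z0-9_.]*", expr))
    return identifiers - known_events


known = {"CPU_CYCLES", "INST_RETIRED"}
print(unknown_events("INST_RETIRED / CPU_CYCLE", known))   # {'CPU_CYCLE'}
print(unknown_events("INST_RETIRED / CPU_CYCLES", known))  # set()
```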
2025-02-18 | perf script: Fix hangup in offline flamegraph report | Namhyung Kim
A recent change in the flamegraph script fixed an issue with live mode but created another for offline mode. Live mode needs to pass "-" to the -i option to read from stdin. The perf script code actually has logic to pass the option, but the script was written with "-- $@", which prevented the option from reaching perf script. So the previous commit added a hard-coded "-i -" to the report command. That is a problem for offline mode, which expects input from a file and now gets stuck reading from stdin. Let's remove the "-i - --" part and let it pass the options properly to perf script. Closes: https://lore.kernel.org/linux-perf-users/c41e4b04-e1fd-45ab-80b0-ec2ac6e94310@linux.ibm.com Fixes: 23e0a63c6dd3f69c ("perf script: force stdin for flamegraph in live mode") Reported-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Anubhav Shelat <ashelat@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 | perf hist: Shrink struct hist_entry size | Dmitry Vyukov
Reorder the struct fields by size to reduce padding, and reduce the size of struct simd_flags from 8 bytes to 1 byte. This reduces struct hist_entry size by 8 bytes (592->584) and leaves a single, more usable 6-byte padding hole. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/7c1cb1c8f9901e945162701ba7269d0f9c70be89.1739437531.git.dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
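The effect of ordering fields by size can be reproduced with Python's ctypes, which follows the platform C ABI for layout. The two structures below are invented for this example and are unrelated to the real struct hist_entry; they only show how moving small fields next to each other removes padding.

```python
import ctypes


class Unordered(ctypes.Structure):
    # A 1-byte field before an 8-byte field forces padding after it,
    # and the trailing 1-byte field forces padding at the end.
    _fields_ = [("flag_a", ctypes.c_uint8),
                ("counter", ctypes.c_uint64),
                ("flag_b", ctypes.c_uint8)]


class Reordered(ctypes.Structure):
    # Largest field first; the two small fields share one padded tail.
    _fields_ = [("counter", ctypes.c_uint64),
                ("flag_a", ctypes.c_uint8),
                ("flag_b", ctypes.c_uint8)]


print(ctypes.sizeof(Unordered), ctypes.sizeof(Reordered))
```

On a typical x86-64 Linux ABI this should report 24 and 16 bytes; other ABIs may give different numbers, but the reordered layout is not expected to be larger.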
2025-02-18 | perf test: Add tests for latency and parallelism profiling | Dmitry Vyukov
Ensure basic operation of latency/parallelism profiling and that main latency/parallelism record/report invocations don't fail/crash. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/c129c8f02f328f68e1e9ef2cdc582f8a9786a97d.1739437531.git.dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 | perf report: Add latency and parallelism profiling documentation | Dmitry Vyukov
Describe latency and parallelism profiling, related flags, and differences with the currently only supported CPU-consumption-centric profiling. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/a13f270ed33cedb03ce9ebf9ddbd064854ca0f19.1739437531.git.dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 | perf report: Add --latency flag | Dmitry Vyukov
Add a record/report --latency flag that allows capturing and showing latency-centric profiles rather than the default CPU-consumption-centric profiles. For latency profiles, record captures context switch events and report shows Latency as the first column. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/e9640464bcbc47dde2cb557003f421052ebc9eec.1739437531.git.dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 | perf report: Add latency output field | Dmitry Vyukov
Latency output field is similar to overhead, but represents overhead for latency rather than CPU consumption. It's re-scaled from overhead by dividing weight by the current parallelism level at the time of the sample. It effectively models profiling with 1 sample taken per unit of wall-clock time rather than unit of CPU time. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/b6269518758c2166e6ffdc2f0e24cfdecc8ef9c1.1739437531.git.dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
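The re-scaling described above (sample weight divided by the parallelism level at the time of the sample) can be worked through with a small Python sketch. The sample data and the helper name `latency_weight` are made up for illustration.

```python
def latency_weight(weight: float, parallelism: int) -> float:
    """Scale a sample's weight by the parallelism level when it was taken."""
    if parallelism < 1:
        raise ValueError("parallelism must be at least 1")
    return weight / parallelism


# (weight, parallelism) pairs: one serial sample, four taken while
# four threads were running at once. Invented numbers.
samples = [(1000, 1), (1000, 4), (1000, 4), (1000, 4), (1000, 4)]

overhead_total = sum(w for w, _ in samples)
latency_total = sum(latency_weight(w, p) for w, p in samples)

serial_overhead_share = 1000 / overhead_total
serial_latency_share = latency_weight(1000, 1) / latency_total
print(serial_overhead_share, serial_latency_share)  # 0.2 0.5
```

In this example the serial sample accounts for 20% of CPU overhead but 50% of latency, which is the kind of difference a wall-clock-centric view is meant to expose.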
2025-02-18 | perf report: Add parallelism filter | Dmitry Vyukov
Add parallelism filter that can be used to look at specific parallelism levels only. The format is the same as cpu lists. For example:

  Only single-threaded samples: --parallelism=1
  Low parallelism only: --parallelism=1-4
  High parallelism only: --parallelism=64-128

Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/e61348985ff0a6a14b07c39e880edbd60a8f8635.1739437531.git.dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
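A cpu-list-style specification such as the ones in the examples can be parsed as sketched below. This is an illustrative Python helper, not perf's parser; the name `parse_level_list` is hypothetical, and comma-separated entries are assumed to work because the commit says the format matches cpu lists.

```python
def parse_level_list(spec: str) -> set:
    """Parse a list such as "1", "1-4" or "1,8-16" into a set of levels."""
    levels = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            levels.update(range(int(low), int(high) + 1))  # inclusive range
        else:
            levels.add(int(part))
    return levels


def sample_passes(parallelism: int, spec: str) -> bool:
    """Keep a sample only if its parallelism level is in the filter."""
    return parallelism in parse_level_list(spec)


print(sample_passes(3, "1-4"), sample_passes(8, "1-4"))  # True False
```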
2025-02-18 | perf report: Switch filtered from u8 to u16 | Dmitry Vyukov
All u8 bits are already taken; adding one more filter leads to an unpleasant failure mode where the code compiles w/o warnings but the last filters silently don't work. Add a typedef and switch to u16. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/32b4ce1731126c88a2d9e191dc87e39ae4651cb7.1739437531.git.dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
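The silent failure mode can be modelled in Python by masking to the storage width, which is what happens when a ninth flag bit is stored into an 8-bit field in C. The constant `NINTH_FILTER` and the helper `set_filter` are hypothetical names for this sketch.

```python
U8_MASK = 0xFF
U16_MASK = 0xFFFF

# Hypothetical ninth filter bit, one past the eight that fit in a u8.
NINTH_FILTER = 1 << 8


def set_filter(filtered: int, bit: int, storage_mask: int) -> int:
    """Set a filter bit, then truncate to the width of the storage type."""
    return (filtered | bit) & storage_mask


# With 8-bit storage the new bit is truncated away and the filter never applies.
print(set_filter(0, NINTH_FILTER, U8_MASK))   # 0
# With 16-bit storage the bit survives.
print(set_filter(0, NINTH_FILTER, U16_MASK))  # 256
```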
2025-02-17 | perf report: Add parallelism sort key | Dmitry Vyukov
Show parallelism level in profiles if requested by user. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/7f7bb87cbaa51bf1fb008a0d68b687423ce4bad4.1739437531.git.dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-17 | perf report: Add machine parallelism | Dmitry Vyukov
Add calculation of the current parallelism level (number of threads actively running on CPUs). The parallelism level can be shown in reports on its own, and to calculate latency overheads. Signed-off-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/0f8c1b8eb12619029e31b3d5c0346f4616a5aeda.1739437531.git.dvyukov@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
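One way to picture the parallelism level is as a running count of threads that have been switched in and not yet switched out. The Python sketch below assumes a simplified event stream of (timestamp, tid, "in" or "out") tuples; this event shape and the function name are assumptions for illustration, and the actual perf code may track this differently.

```python
def parallelism_timeline(events):
    """Return (timestamp, level) pairs from ordered switch events.

    events: iterable of (timestamp, tid, kind) with kind "in" or "out",
    sorted by timestamp. Simplified model, not the perf implementation.
    """
    running = set()
    timeline = []
    for timestamp, tid, kind in events:
        if kind == "in":
            running.add(tid)
        else:
            running.discard(tid)
        # The level is the number of threads currently on a CPU.
        timeline.append((timestamp, len(running)))
    return timeline


events = [(0, 1, "in"), (1, 2, "in"), (2, 1, "out"), (3, 2, "out")]
print(parallelism_timeline(events))  # [(0, 1), (1, 2), (2, 1), (3, 0)]
```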
2025-02-14 | perf tools: Fix compile error on sample->user_regs | Namhyung Kim
It was recently changed to allocate dynamically, but some arch-dependent code was not updated to use perf_sample__user_regs(). Fixes: dc6d2bc2d893a878 ("perf sample: Make user_regs and intr_regs optional") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250214191641.756664-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-14 | perf tools: Fix compilation error on arm64 | Leo Yan
Since the commit dc6d2bc2d893 ("perf sample: Make user_regs and intr_regs optional"), the build for Arm64 reports errors:

    arch/arm64/util/unwind-libdw.c: In function ‘libdw__arch_set_initial_registers’:
    arch/arm64/util/unwind-libdw.c:11:32: error: initialization of ‘struct regs_dump *’ from incompatible pointer type ‘struct regs_dump **’ [-Werror=incompatible-pointer-types]
       11 |  struct regs_dump *user_regs = &ui->sample->user_regs;
          |                                ^
    cc1: all warnings being treated as errors
    make[6]: *** [/home/niayan01/linux/tools/build/Makefile.build:85: arch/arm64/util/unwind-libdw.o] Error 1
    make[5]: *** [/home/niayan01/linux/tools/build/Makefile.build:138: util] Error 2
    arch/arm64/tests/dwarf-unwind.c: In function ‘test__arch_unwind_sample’:
    arch/arm64/tests/dwarf-unwind.c:48:27: error: initialization of ‘struct regs_dump *’ from incompatible pointer type ‘struct regs_dump **’ [-Werror=incompatible-pointer-types]
       48 |  struct regs_dump *regs = &sample->user_regs;
          |                           ^

To fix the issue, use the helper perf_sample__user_regs() to retrieve the user_regs.

Fixes: dc6d2bc2d893 ("perf sample: Make user_regs and intr_regs optional") Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250214111025.14478-1-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf sample: Make user_regs and intr_regs optional | Ian Rogers
The struct regs_dump contains 512 bytes of cache_regs, meaning the two values in perf_sample contribute 1088 bytes of its total 1384 bytes size. Initializing this much memory has a cost reported by Tavian Barnes <tavianator@tavianator.com> as about 2.5% when running `perf script --itrace=i0`: https://lore.kernel.org/lkml/d841b97b3ad2ca8bcab07e4293375fb7c32dfce7.1736618095.git.tavianator@tavianator.com/ Adrian Hunter <adrian.hunter@intel.com> replied that the zero initialization was necessary and couldn't simply be removed. This patch aims to strike a middle ground of still zeroing the perf_sample, but removing 79% of its size by making user_regs and intr_regs optional pointers to zalloc-ed memory. To support the allocation, accessors are created for user_regs and intr_regs. To support correct cleanup, perf_sample__init and perf_sample__exit functions are created and added throughout the code base. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250113194345.1537821-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
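The 79% figure and the allocate-on-first-access pattern can both be checked with a short Python sketch. The `Sample` class is a stand-in invented for this example, not the C struct perf_sample; the per-dump size of 544 bytes is simply half of the 1088 bytes quoted in the commit message.

```python
TOTAL_SAMPLE_SIZE = 1384     # bytes, from the commit message
REGS_CONTRIBUTION = 1088     # bytes for user_regs plus intr_regs
REGS_DUMP_SIZE = REGS_CONTRIBUTION // 2

saving_percent = round(100 * REGS_CONTRIBUTION / TOTAL_SAMPLE_SIZE)
print(saving_percent)  # 79


class Sample:
    """Stand-in for a sample whose register dump is allocated lazily."""

    def __init__(self):
        # Analogous to initializing the sample: no register memory yet.
        self._user_regs = None

    def user_regs(self):
        # Accessor: allocate zeroed storage only on first use.
        if self._user_regs is None:
            self._user_regs = bytearray(REGS_DUMP_SIZE)
        return self._user_regs

    def exit(self):
        # Cleanup: drop the storage if it was ever allocated.
        self._user_regs = None
```

Samples that never touch the registers pay nothing for them, while callers that go through the accessor always see zero-initialized storage.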
2025-02-12 | perf test stat_all_metrics: Ensure missing events fail test | Ian Rogers
Issue reported by Thomas Falcon and diagnosed by Kan Liang here: https://lore.kernel.org/lkml/d44036481022c27d83ce0faf8c7f77042baedb34.camel@intel.com/ Metrics with missing events can be erroneously skipped if they contain FP, AMX or PMM events. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-25-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update Tigerlake events/metrics | Ian Rogers
Update events from v1.16 to v1.17. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.17: https://github.com/intel/perfmon/commit/e1d5ac3412450bf049301cb26206d03c41066b83 The TMA 5.02 addition is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-24-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update SkylakeX events/metrics | Ian Rogers
Update events from v1.35 to v1.36. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.36: https://github.com/intel/perfmon/commit/f6801e5c145406f355f40e1746f836eaa1426cf9 The TMA 5.02 addition is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-23-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update Skylake metrics | Ian Rogers
Update TMA metrics from 4.8 to 5.02. The TMA 5.02 addition is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-22-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update Sierraforest events/metrics | Ian Rogers
Update events from v1.04 to v1.07. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.08: https://github.com/intel/perfmon/commit/7ae9c45ccf42cea2dc0b867ec1030ab5a8445b9f https://github.com/intel/perfmon/commit/903b3d0a0a61bb6064013db9eb4c26457dacfea6 https://github.com/intel/perfmon/commit/825c4361473e676119b51f04c7896a8cfa8a5ea5 https://github.com/intel/perfmon/commit/bafe6a7b5cbee92c31ec19dfcefd6dcc243e4e8a The TMA 5.02 addition is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Update uncore IIO events umask with the change: https://github.com/intel/perfmon/commit/d78e8a166537c9ceab4f2e901dc96c53667a2174 which should address an issue originally raised by Michael Petlan: Reported-by: Michael Petlan <mpetlan@redhat.com> Closes: https://lore.kernel.org/all/alpine.LRH.2.20.2401300733310.11354@Diego/ Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-21-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update Sapphirerapids events/metrics | Ian Rogers
Update events from v1.23 to v1.25. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.25: https://github.com/intel/perfmon/commit/78d6273c546329052429e3a005491b58fbe1167b https://github.com/intel/perfmon/commit/f069ed9d0b69b02d76d4b4c59dfc75b62bfb2254 The TMA 5.02 addition is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Update uncore IIO events umask with the change: https://github.com/intel/perfmon/commit/d78e8a166537c9ceab4f2e901dc96c53667a2174 which should address an issue originally raised by Michael Petlan: Reported-by: Michael Petlan <mpetlan@redhat.com> Closes: https://lore.kernel.org/all/alpine.LRH.2.20.2401300733310.11354@Diego/ Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-20-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update Rocketlake events/metrics | Ian Rogers
Update events from v1.03 to v1.04. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.04: https://github.com/intel/perfmon/commit/015d5a5eab6850e6367ee4f82e4808e166eaf5a5 The TMA 5.02 addition is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-19-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update Meteorlake events/metrics | Ian Rogers
Update events from v1.10 to v1.12. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.12: https://github.com/intel/perfmon/commit/d8fe70c91bf8f166ba08edd4d02fd7846a3fd956 https://github.com/intel/perfmon/commit/b9dabd05ff44af24fde0682e16d1a716c932f0d0 This updates the mapfile.csv for the 0xB5 CPUID variant of meteorlake. https://github.com/intel/perfmon/commit/c3094bc9bbaff30071874a492afc3369554d572e The TMA 5.02 addition is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-18-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update/add Lunarlake events/metrics | Ian Rogers
Update events from v1.01 to v1.10. Add TMA metrics 5.02. Bring in the event updates v1.11: https://github.com/intel/perfmon/commit/af329039e8a0bee7c9274fc0a18781cf8e572256 https://github.com/intel/perfmon/commit/4a1cff8cebe9791a1ceb91ca39fc64e9139a3993 https://github.com/intel/perfmon/commit/cbc3b0dc19e8fc52c9604f1da301648ed69f012b https://github.com/intel/perfmon/commit/28f4b24f9152a0ee1fb3435535628384ad881c22 https://github.com/intel/perfmon/commit/172900e962fdd34ddb80879f4f91add5f773ca29 https://github.com/intel/perfmon/commit/dab0308f7a27d2c644e08d63436b790a207fb22e The TMA 5.02 addition is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-17-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update IcelakeX events/metrics | Ian Rogers
Update events from v1.26 to v1.27. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.27: https://github.com/intel/perfmon/commit/6ee80d0532a778caee68d6e29d8e05278567e69f The TMA 5.02 update is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-16-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update Icelake events/metrics | Ian Rogers
Update events from v1.22 to v1.24. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.24: https://github.com/intel/perfmon/commit/d4f10746cf549466723d17cd214e1ee9cb7bac11 The TMA 5.02 update is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-15-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update HaswellX events/metrics | Ian Rogers
Update events from v28 to v29. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v29: https://github.com/intel/perfmon/commit/71dbf03aba964f79fb096c9ded385c8a486a99b3 The TMA 5.02 update is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-14-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update Haswell events/metrics | Ian Rogers
Update events from v35 to v36. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v36: https://github.com/intel/perfmon/commit/616ec6fc0315dac35c1bea0abc7f59e21a2d51c0 The TMA 5.02 update is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Remove duplicate event UNC_CLOCK.SOCKET that was erroneously left in uncore-other.json. Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-13-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update/add Graniterapids events/metrics | Ian Rogers
Update events from v1.02 to v1.06. Add TMA metrics 5.02. Bring in the event updates v1.06: https://github.com/intel/perfmon/commit/de5502e51a86b0cf42d0807d4e8ed3c6299b4e6c https://github.com/intel/perfmon/commit/79b9e512eab58641941a0b8d10ffe75914a87e17 https://github.com/intel/perfmon/commit/bc74a895e461b5ac720559da667e83a8fedf7829 The TMA 5.02 addition is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Update uncore IIO events umask with the change: https://github.com/intel/perfmon/commit/d78e8a166537c9ceab4f2e901dc96c53667a2174 which should address an issue originally raised by Michael Petlan: Reported-by: Michael Petlan <mpetlan@redhat.com> Closes: https://lore.kernel.org/all/alpine.LRH.2.20.2401300733310.11354@Diego/ Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update GrandRidge events/metrics | Ian Rogers
Update events from v1.03 to v1.05. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.05: https://github.com/intel/perfmon/commit/3b2e3528fbfb5576f443607ac9d772de88aed72c https://github.com/intel/perfmon/commit/9bc1815536ff1f6fe73693a19a410b6a711740c2 The TMA 5.02 update is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Update uncore IIO events umask with the change: https://github.com/intel/perfmon/commit/d78e8a166537c9ceab4f2e901dc96c53667a2174 which should address an issue originally raised by Michael Petlan: Reported-by: Michael Petlan <mpetlan@redhat.com> Closes: https://lore.kernel.org/all/alpine.LRH.2.20.2401300733310.11354@Diego/ Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update EmeraldRapids events/metrics | Ian Rogers
Update events from v1.09 to v1.11. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.11: https://github.com/intel/perfmon/commit/bffcec00a184bb93d505f182047cf889d124fbd5 https://github.com/intel/perfmon/commit/a63da6de48046c365ab91c5001bfd5d907d5a1d6 The TMA 5.02 update is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Update uncore IIO events umask with the change: https://github.com/intel/perfmon/commit/d78e8a166537c9ceab4f2e901dc96c53667a2174 which should address an issue originally raised by Michael Petlan: Reported-by: Michael Petlan <mpetlan@redhat.com> Closes: https://lore.kernel.org/all/alpine.LRH.2.20.2401300733310.11354@Diego/ Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Add Clearwaterforest events | Ian Rogers
Add events v1.00. Bring in the events from: https://github.com/intel/perfmon/tree/main/CWF/events Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update CascadelakeX events/metrics | Ian Rogers
Update events from v1.22 to v1.23. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.23: https://github.com/intel/perfmon/commit/8f3665f6be4688fd1dd1e713ba49ca16ec93b856 The TMA 5.02 update is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update BroadwellX events/metrics | Ian Rogers
Update events from v22 to v23. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v23: https://github.com/intel/perfmon/commit/679982113f4bfa16cee19d5408a7f8e309e3ac23 The TMA 5.02 update is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update BroadwellDE events/metrics | Ian Rogers
Update events from v11 to v12. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v12: https://github.com/intel/perfmon/commit/e0b83388d545e527933031ddb2a1d22d65040de1 The TMA 5.02 update is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Update Broadwell events/metrics | Ian Rogers
Update events from v29 to v30. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v30: https://github.com/intel/perfmon/commit/9a1827b2ac3927a455ae7df5aa3d1e1b10e69f15 The TMA 5.02 update is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 | perf vendor events: Add Arrowlake events/metrics | Ian Rogers
Add events v1.07. Add TMA metrics based on v5.02. Bring in the events from: https://github.com/intel/perfmon/tree/main/ARL/events TMA 5.02 is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12  perf vendor events: Update AlderlakeN events/metrics  Ian Rogers
Update events from v1.27 to v1.28. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.28: https://github.com/intel/perfmon/commit/801f43f22ec6bd23fbb5d18860f395d61e7f4081 The TMA 5.02 update is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-developed-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12  perf vendor events: Update Alderlake events/metrics  Ian Rogers
Update events from v1.27 to v1.28. Update TMA metrics from 4.8 to 5.02. Bring in the event updates v1.28: https://github.com/intel/perfmon/commit/801f43f22ec6bd23fbb5d18860f395d61e7f4081 The TMA 5.02 update is from (with subsequent fixes): https://github.com/intel/perfmon/commit/1d72913b2d938781fb28f3cc3507aaec5c22d782 Co-authored-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Caleb Biggers <caleb.biggers@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250211213031.114209-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12  perf tools: Use symfs when opening debuginfo by path  Namhyung Kim
I found that it failed to load a binary using the --symfs option. Say I have a binary in /home/user/prog/xxx and a perf data file with it. If I move them to a different machine and use --symfs, it tries to find the binary in some locations under symfs using dso__read_binary_type_filename(), but not the last one.

  ${symfs}/usr/lib/debug/home/user/prog/xxx.debug
  ${symfs}/usr/lib/debug/home/user/prog/xxx
  ${symfs}/home/user/prog/.debug/xxx
  /home/user/prog/xxx

It should check ${symfs}/home/user/prog/xxx. Let's fix it.

Reviewed-by: Ian Rogers <irogers@google.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20250212221445.437481-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
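The lookup order described above can be sketched in a few lines of C. This is only an illustration of the idea, not perf's code: `build_candidates()`, `NR_CANDIDATES`, and `PATH_LEN` are hypothetical names, and the real logic lives behind dso__read_binary_type_filename(). The point is that the last candidate, the plain path of the binary, also gets the symfs prefix.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NR_CANDIDATES 4		/* hypothetical: one slot per path listed above */
#define PATH_LEN      256	/* hypothetical buffer size for this sketch */

/*
 * Hypothetical helper (not a perf function): fill 'out' with the candidate
 * paths for a binary at dir/name, each one prefixed with 'symfs'.
 * Returns the number of candidates written.
 */
static int build_candidates(const char *symfs, const char *dir, const char *name,
			    char out[NR_CANDIDATES][PATH_LEN])
{
	assert(symfs && dir && name);
	assert(strlen(name) > 0);

	snprintf(out[0], PATH_LEN, "%s/usr/lib/debug%s/%s.debug", symfs, dir, name);
	snprintf(out[1], PATH_LEN, "%s/usr/lib/debug%s/%s", symfs, dir, name);
	snprintf(out[2], PATH_LEN, "%s%s/.debug/%s", symfs, dir, name);
	/* The case the commit addresses: the plain path is looked up under symfs too. */
	snprintf(out[3], PATH_LEN, "%s%s/%s", symfs, dir, name);

	return NR_CANDIDATES;
}
```

Calling it with a made-up symfs of `/mnt/sysroot`, directory `/home/user/prog`, and name `xxx` yields `/mnt/sysroot/home/user/prog/xxx` as the final candidate.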
2025-02-12  perf trace: Add --summary-mode option  Namhyung Kim
The --summary-mode option will select how to show the syscall summary at the end. By default, it'll show the summary for each thread and it's the same as if --summary-mode=thread is passed. The other option is to show total summary, which is --summary-mode=total.

I'd like to have this instead of a separate option like --total-summary because we may want to add a new summary mode (by cgroup) later.

  $ sudo ./perf trace -as --summary-mode=total sleep 1

  Summary of events:

  total, 21580 events

  syscall            calls errors     total       min       avg       max  stddev
                                     (msec)    (msec)    (msec)    (msec)     (%)
  --------------- -------- ------ --------- --------- --------- --------- -------
  epoll_wait          1305      0 14716.712     0.000    11.277   551.529   8.87%
  futex               1256     89 13331.197     0.000    10.614   733.722  15.49%
  poll                 669      0  6806.618     0.000    10.174   459.316  11.77%
  ppoll                220      0  3968.797     0.000    18.040   516.775  25.35%
  clock_nanosleep        1      0  1000.027  1000.027  1000.027  1000.027   0.00%
  epoll_pwait           21      0   592.783     0.000    28.228   522.293  88.29%
  nanosleep             16      0    60.515     0.000     3.782    10.123  33.33%
  ioctl                510      0     4.284     0.001     0.008     0.182   8.84%
  recvmsg             1434    775     3.497     0.001     0.002     0.174   6.37%
  write               1393      0     2.854     0.001     0.002     0.017   1.79%
  read                1063    100     2.236     0.000     0.002     0.083   5.11%
  ...

Reviewed-by: Howard Chu <howardchu95@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250205205443.1986408-5-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
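The reasoning for a single mode option over one flag per mode can be illustrated with a small parser. This is a sketch under assumptions: `enum summary_mode_sketch`, `summary_mode_names`, and `parse_summary_mode()` are hypothetical names and do not come from perf. Only the two mode strings, "thread" and "total", and the default of "thread" are taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names; perf's own enum and option parser may differ. */
enum summary_mode_sketch {
	SUMMARY_MODE_THREAD,	/* default: one summary per thread */
	SUMMARY_MODE_TOTAL,	/* one summary covering all threads */
	SUMMARY_MODE_NR,	/* number of valid modes; doubles as "invalid" */
};

static const char *const summary_mode_names[SUMMARY_MODE_NR] = {
	[SUMMARY_MODE_THREAD] = "thread",
	[SUMMARY_MODE_TOTAL]  = "total",
};

/* Map the option argument to a mode; NULL means the option was not given. */
static enum summary_mode_sketch parse_summary_mode(const char *arg)
{
	int i;

	if (arg == NULL)
		return SUMMARY_MODE_THREAD;

	for (i = 0; i < SUMMARY_MODE_NR; i++) {
		assert(summary_mode_names[i] != NULL);
		if (strcmp(arg, summary_mode_names[i]) == 0)
			return (enum summary_mode_sketch)i;
	}
	return SUMMARY_MODE_NR;	/* unknown mode string */
}
```

With this shape, a later mode such as the cgroup one mentioned above would need one more enum value and one more table entry, rather than a new command-line flag.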
2025-02-12  perf tools: Get rid of now-unused rb_resort.h  Namhyung Kim
It was only used in perf trace and it switched to use hashmap instead. Let's delete the code. Reviewed-by: Howard Chu <howardchu95@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250205205443.1986408-4-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12  perf trace: Convert syscall_stats to hashmap  Namhyung Kim
It was using an rbtree-based int-list as a hash, with custom resort logic on top of it. As we have hashmap, let's convert to it and add a custom sort function that sorts the hashmap entries through an array. It should be faster and more lightweight. It also prepares for supporting system-wide syscall stats. No functional changes intended. Reviewed-by: Howard Chu <howardchu95@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250205205443.1986408-3-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
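The "sort the entries using an array" step can be sketched with plain qsort(). This is not the perf implementation: `struct stat_entry`, `cmp_total_desc()`, and `sort_stats()` are hypothetical, the hash table itself is left out, and ordering by total time (largest first) is an assumption based on the summary output shown in the --summary-mode commit.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-syscall record; perf's real struct differs. */
struct stat_entry {
	int    syscall_id;
	double total_msec;
};

/* Comparator for qsort(): larger total time sorts first. */
static int cmp_total_desc(const void *a, const void *b)
{
	const struct stat_entry *ea = a;
	const struct stat_entry *eb = b;

	if (ea->total_msec < eb->total_msec)
		return 1;
	if (ea->total_msec > eb->total_msec)
		return -1;
	return 0;
}

/*
 * Sort a flat array of entries. In the real code the array would first be
 * filled by walking the hashmap; here the caller provides it directly.
 */
static void sort_stats(struct stat_entry *entries, size_t nr)
{
	assert(entries != NULL || nr == 0);
	if (nr > 0)
		qsort(entries, nr, sizeof(*entries), cmp_total_desc);
}
```

A hash table gives cheap per-syscall updates while tracing, and the one-off copy into an array defers ordering to print time, which is where a sorted view is actually needed.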
2025-02-12  perf trace: Allocate syscall stats only if summary is on  Namhyung Kim
The syscall stats are used only when summary is requested. Let's avoid unnecessary operations. While at it, let's pass 'trace' pointer directly instead of passing 'output' file pointer and 'summary' option in the 'trace' separately. Reviewed-by: Howard Chu <howardchu95@gmail.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250205205443.1986408-2-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12  perf tests: Fix Tool PMU test segfault  James Clark
tool_pmu__event_to_str() now handles skipped events by returning NULL, so it's wrong to re-check for a skip on the resulting string. Calling tool_pmu__skip_event() with a NULL string results in a segfault, so remove the unnecessary skip to fix it:

  $ perf test -vv "parsing with PMU name"
  12.2: Parsing with PMU name:
  ...
  ---- unexpected signal (11) ----
  12.2: Parsing with PMU name : FAILED!

Fixes: ee8aef2d2321 ("perf tools: Add skip check in tool_pmu__event_to_str()") Signed-off-by: James Clark <james.clark@linaro.org> Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Tested-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250212163859.1489916-1-james.clark@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
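The failure mode above is the usual one for a lookup that can return NULL: the caller has to treat NULL as "skipped" and stop, rather than hand the pointer to another string routine. A minimal sketch, with hypothetical stand-ins (`event_to_str_sketch()` and `event_matches()` are not perf functions, and the small name table is made up for illustration):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for a name lookup that returns NULL for skipped
 * or out-of-range events. Index 1 plays the role of a skipped event.
 */
static const char *event_to_str_sketch(int ev)
{
	static const char *const names[] = { "duration_time", NULL, "user_time" };

	if (ev < 0 || ev >= (int)(sizeof(names) / sizeof(names[0])))
		return NULL;
	return names[ev];
}

/* Returns 1 if event 'ev' is named 'want', 0 otherwise (including skipped). */
static int event_matches(int ev, const char *want)
{
	const char *str = event_to_str_sketch(ev);

	assert(want != NULL);
	if (str == NULL)	/* skipped: the lookup already decided, do not re-check */
		return 0;
	return strcmp(str, want) == 0;
}
```

The single NULL test replaces any second "should this be skipped?" call on the string, which is the same simplification the fix makes.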
2025-02-10  perf tools: Add skip check in tool_pmu__event_to_str()  Kan Liang
Some topdown related metrics may fail on hybrid machines.

  $ perf stat -M tma_frontend_bound
  Cannot resolve IDs for tma_frontend_bound:
  cpu_atom@TOPDOWN_FE_BOUND.ALL@ / (8 * cpu_atom@CPU_CLK_UNHALTED.CORE@)

In find_tool_events(), tool_pmu__event_to_str() is used to compare the tool_events. It only checks the event name, not the PMU or arch. So tool_events[TOOL_PMU__EVENT_SLOTS] is set to true, because the p-core Topdown metrics have a "slots" event. The tool_events are shared, so when parsing the e-core metrics, "slots" is automatically added.

The "slots" event as a tool event should only be available on arm64. It has a different meaning on X86. tool_pmu__skip_event() is intended to handle this case. Apply it in tool_pmu__event_to_str() as well.

There is no sanity check in expr__get_id(). Add the check.

Closes: https://lore.kernel.org/lkml/608077bc-4139-4a97-8dc4-7997177d95c4@linux.intel.com/ Fixes: 069057239a67 ("perf tool_pmu: Move expr literals to tool_pmu") Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: thomas.falcon@intel.com Link: https://lore.kernel.org/r/20250207152844.302167-1-kan.liang@linux.intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>