summaryrefslogtreecommitdiff
path: root/tools/perf/util
AgeCommit message (Collapse)Author
2017-02-02perf diff: Fix -o/--order option behavior (again)Namhyung Kim
Commit 21e6d8428664 ("perf diff: Use perf_hpp__register_sort_field interface") changed list_add() to perf_hpp__register_sort_field(). This resulted in a behavior change since the field was added to the tail instead of the head. So the -o option is mostly ignored due to its order in the list. This patch fixes it by adding perf_hpp__prepend_sort_field(). Signed-off-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Fixes: 21e6d8428664 ("perf diff: Use perf_hpp__register_sort_field interface") Link: http://lkml.kernel.org/r/20170118051457.30946-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-31perf tools: Create for_each_event macro for tracepoints iterationTaeung Song
Similar to for_each_subsystem and for_each_event in util/parse-events.c, add new macro 'for_each_event' for easy iteration over the tracepoints in order to be more compact and readable. Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1485862711-20216-2-git-send-email-treeze.taeung@gmail.com [ Slight change to keep existing style for checking strcmp() return ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-31tools perf util: Make rm_rf(path) argument constJoe Stringer
rm_rf() doesn't modify its path argument, and a future caller will pass a string constant into it to delete. Signed-off-by: Joe Stringer <joe@ovn.org> Cc: Alexei Starovoitov <ast@fb.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Wang Nan <wangnan0@huawei.com> Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/20170126212001.14103-5-joe@ovn.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-31perf callchain: Reference count mapsKrister Johansen
If dso__load_kcore frees all of the existing maps, but one has already been attached to a callchain cursor node, then we can get a SIGSEGV in any function that happens to try to use this invalid cursor. Use the existing map refcount mechanism to forestall cleanup of a map until the cursor iterates past the node. Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@kernel.org Fixes: 84c2cafa2889 ("perf tools: Reference count struct map") Link: http://lkml.kernel.org/r/20170106062331.GB2707@templeofstupid.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Two trivial overlapping changes conflicts in MPLS and mlx5. Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-27perf tools: Propagate perf_config() errorsArnaldo Carvalho de Melo
Previously these were being ignored, sometimes silently. Stop doing that, emitting debug messages and handling the errors. Testing it: $ cat ~/.perfconfig cat: /home/acme/.perfconfig: No such file or directory $ perf stat -e cycles usleep 1 Performance counter stats for 'usleep 1': 938,996 cycles:u 0.003813731 seconds time elapsed $ perf top --stdio Error: You may not have permission to collect system-wide stats. Consider tweaking /proc/sys/kernel/perf_event_paranoid, <SNIP> [ perf record: Captured and wrote 0.019 MB perf.data (7 samples) ] [acme@jouet linux]$ perf report --stdio # To display the perf.data header info, please use --header/--header-only options. # Overhead Command Shared Object Symbol # ........ ....... ................. ......................... 71.77% usleep libc-2.24.so [.] _dl_addr 27.07% usleep ld-2.24.so [.] _dl_next_ld_env_entry 1.13% usleep [kernel.kallsyms] [k] page_fault $ $ touch ~/.perfconfig $ ls -la ~/.perfconfig -rw-rw-r--. 1 acme acme 0 Jan 27 12:14 /home/acme/.perfconfig $ $ perf stat -e instructions usleep 1 Performance counter stats for 'usleep 1': 244,610 instructions:u 0.000805383 seconds time elapsed $ [root@jouet ~]# chown acme.acme ~/.perfconfig [root@jouet ~]# perf stat -e cycles usleep 1 Warning: File /root/.perfconfig not owned by current user or root, ignoring it. Performance counter stats for 'usleep 1': 937,615 cycles 0.000836931 seconds time elapsed # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-j2rq96so6xdqlr8p8rd6a3jx@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-27perf config: Do not consider an error not to have any perfconfig fileArnaldo Carvalho de Melo
While propagating the errors from perf_config(), which were being completely ignored, everything stopped working for people without a ~/.perfconfig file, because the perf_config_set__init() was considering an error not to have a .perfconfig file, duh, fix it by checking the errno after the failed stat() call. It should also not return an error when it says it is ignoring the file, and also a empty file should not return an error either. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: Wang Nan <wangnan0@huawei.com> Fixes: 8beeb00f2c84 ("perf config: Use new perf_config_set__init() to initialize config set") Link: http://lkml.kernel.org/n/tip-ygpbab3apbs6l8wr97xedwks@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26Merge tag 'perf-core-for-mingo-4.11-20170126' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull the latest perf/core updates from Arnaldo Carvalho de Melo: New features: - Introduce 'perf ftrace' a perf front end to the kernel's ftrace function and function_graph tracer, defaulting to the "function_graph" tracer, more work will be done in reviving this effort, forward porting it from its initial patch submission (Namhyung Kim) - Add 'e' and 'c' hotkeys to expand/collapse call chains for a single hist entry in the 'perf report' and 'perf top' TUI (Jiri Olsa) Fixes: - Fix wrong register name for arm64, used in 'perf probe' (He Kuang) - Fix map offsets in relocation in libbpf (Joe Stringer) - Fix looking up dwarf unwind stack info (Matija Glavinic Pecotic) Infrastructure changes: - libbpf prog functions sync with what is exported via uapi (Joe Stringer) Trivial changes: - Remove unnecessary checks and assignments in 'perf probe's try_to_find_absolute_address() (Markus Elfring) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-26perf util: Add more debug message on failure pathNamhyung Kim
It's helpful for debugging on tracing features. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jeremy Eder <jeder@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com>, Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> Link: http://lkml.kernel.org/n/tip-rjysr9ljiesymgk4qblteaty@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26perf util: Save pid-cmdline mapping into tracing headerNamhyung Kim
Current trace info data lacks the saved cmdline mapping which is needed for pevent to find out the comm of a task. Add this and bump up the version number so that perf can determine its presence when reading. This is mostly corresponding to trace.dat file version 6, but still lacks 4 byte of number of cpus, and 10 bytes of type string - and I think we don't need those anyway. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Jeremy Eder <jeder@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com>, Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt <rostedt@goodmis.org> [ Change version test from == to >= ] Link: http://lkml.kernel.org/n/tip-vaooqpxsikxbb3359p0corcb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26perf scripting perl: Do not die() when not founding event for a typeArnaldo Carvalho de Melo
Do just like handling other cases i.e. print some debug message and ignore the sample. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-t7kzlm3cxyvbd7d9n9554ai9@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26perf probe: Delete an unnecessary assignment in try_to_find_absolute_address()Markus Elfring
Remove an error code assignment which is redundant in an if branch for the handling of a memory allocation failure because the same value was set for the local variable "err" before. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-janitors@vger.kernel.org Link: http://lkml.kernel.org/r/0ede09ec-79b6-c8bd-5b20-02c63ed98aab@users.sourceforge.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-26perf probe: Delete an unnecessary check in try_to_find_absolute_address()Markus Elfring
Remove a condition check which is unnecessary at the end because this source code place should usually only be reached with a non-zero pointer. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Milian Wolff <milian.wolff@kdab.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Cc: kernel-janitors@vger.kernel.org Link: http://lkml.kernel.org/r/a3f2473b-6383-a326-bce0-b826423608b8@users.sourceforge.net Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-25lib, traceevent: add PRINT_HEX_STR variantDaniel Borkmann
Add support for the __print_hex_str() macro that was added for tracing, so that user space tools such as perf can understand it as well. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-25Merge branch 'linus' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-01-18perf unwind: Fix looking up dwarf unwind stack infoMatija Glavinic Pecotic
Using perf with call graph method dwarf fails to provide backtrace support for stripped binary even though .gnu_debuglink points to *.dbg flavor with properly populated debug symbols. Problem is reproduced on ARM (v7, v8), kernels 3.14.y, 4.4.y and 4.10.rc3. Perf is configured with libunwind, and unwind dwarf support [1]. Test code (stress_bt.c) can be found on [2]. Running (explicitly disable other unwinding methods): $ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \ -fno-asynchronous-unwind-tables stress_bt.c $ perf record -N --call-graph dwarf ./stress_bt $ perf report results in properly generated call graph. Stripping the binary and running it results with missing call graph. Expected result is to have call graph: $ gcc -g -o stress_bt -fomit-frame-pointer -fno-unwind-tables \ -fno-asynchronous-unwind-tables stress_bt.c $ objcopy --only-keep-debug stress_bt stress_bt.dbg $ objcopy --strip-debug stress_bt $ objcopy --add-gnu-debuglink=stress_bt.dbg stress_bt $ perf record -N --call-graph dwarf ./stress_bt $ perf report Problem is that perf doesn't try to read symbols pointed by gnu debuglink. Patch adds checking, and reading of the symbols from debuglink and symsrc. Order of the check is to first check within dso, then check whether symsrc is defined and try to read from it. Finally, debuglink is checked. Default locations of debug files are discussed in [3] and [4]. Comments on RFC are on [5]. [1] https://wiki.linaro.org/LEG/Engineering/TOOLS/perf-callstack-unwinding [2] [1]#Backtrace_stress_application [3] https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html [4] https://sourceware.org/binutils/docs/binutils/objcopy.html [5] https://lkml.org/lkml/2016/8/22/473 Signed-off-by: Matija Glavinic Pecotic <matija.glavinic-pecotic.ext@nokia.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Sverdlin <alexander.sverdlin@nokia.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lkml.kernel.org/r/d309d40a-463f-482b-68e1-1465326efdc1@nokia.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-17perf evlist: Fix typo in deliver_sample()Soramichi AKIYAMA
This patch fixes a typo: s/delievery/delivery/ Signed-off-by: Soramichi Akiyama <akiyama@m.soramichi.jp> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170117222233.dfd92de0ad701e7c53396950@m.soramichi.jp Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-17perf tools: Move two variables usied in libperf from perf.cSoramichi AKIYAMA
The use_browser and perf_version_string variables are both declared in perf.c but they are also referenced by other functions of libperf.a. Therefore a user linking an own main() with libperf.a must declare those two variables in their files even if the files never use the browser or the version information. This patch fixes this issue by moving use_browser and perf_version_string out of perf.c to some other files. Signed-off-by: Soramichi Akiyama <akiyama@m.soramichi.jp> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20170117002237.c1aec0ce3b4d675dca018deb@m.soramichi.jp Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16perf probe: Fix to probe on gcc generated functions in modulesMasami Hiramatsu
Fix to probe on gcc generated functions on modules. Since probing on a module is based on its symbol name, it should be adjusted on actual symbols. E.g. without this fix, perf probe shows probe definition on non-exist symbol as below. $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -F in_range* in_range.isra.12 $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range p:probe/in_range nf_nat:in_range+0 With this fix, perf probe correctly shows a probe on gcc-generated symbol. $ perf probe -m build-x86_64/net/netfilter/nf_nat.ko -D in_range p:probe/in_range nf_nat:in_range.isra.12+0 This also fixes same problem on online module as below. $ perf probe -m i915 -D assert_plane p:probe/assert_plane i915:assert_plane.constprop.134+0 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/148411450673.9978.14905987549651656075.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16perf probe: Add error checks to offline probe post-processingMasami Hiramatsu
Add error check codes on post processing and improve it for offline probe events as: - post processing fails if no matched symbol found in map(-ENOENT) or strdup() failed(-ENOMEM). - Even if the symbol name is the same, it updates symbol address and offset. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/148411443738.9978.4617979132625405545.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16perf probe: Fix to show correct locations for events on modulesMasami Hiramatsu
Fix to show correct locations for events on modules by relocating given address instead of retrying after failure. This happens when the module text size is big enough, bigger than sh_addr, because the original code retries with given address + sh_addr if it failed to find CU DIE at the given address. Any address smaller than sh_addr always fails and it retries with the correct address, but addresses bigger than sh_addr will get a CU DIE which is on the given address (not adjusted by sh_addr). In my environment(x86-64), the sh_addr of ".text" section is 0x10030. Since i915 is a huge kernel module, we can see this issue as below. $ grep "[Tt] .*\[i915\]" /proc/kallsyms | sort | head -n1 ffffffffc0270000 t i915_switcheroo_can_switch [i915] ffffffffc0270000 + 0x10030 = ffffffffc0280030, so we'll check symbols cross this boundary. $ grep "[Tt] .*\[i915\]" /proc/kallsyms | grep -B1 ^ffffffffc028\ | head -n 2 ffffffffc027ff80 t haswell_init_clock_gating [i915] ffffffffc0280110 t valleyview_init_clock_gating [i915] So setup probes on both function and see what happen. $ sudo ./perf probe -m i915 -a haswell_init_clock_gating \ -a valleyview_init_clock_gating Added new events: probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915) probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915) You can now use it in all perf tools, such as: perf record -e probe:valleyview_init_clock_gating -aR sleep 1 $ sudo ./perf probe -l probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915) probe:valleyview_init_clock_gating (on i915_vga_set_decode:4@gpu/drm/i915/i915_drv.c in i915) As you can see, haswell_init_clock_gating is correctly shown, but valleyview_init_clock_gating is not. With this patch, both events are shown correctly. $ sudo ./perf probe -l probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915) probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915) Committer notes: In my case: # perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating Added new events: probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915) probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915) You can now use it in all perf tools, such as: perf record -e probe:valleyview_init_clock_gating -aR sleep 1 # perf probe -l probe:haswell_init_clock_gating (on i915_getparam+432@gpu/drm/i915/i915_drv.c in i915) probe:valleyview_init_clock_gating (on __i915_printk+240@gpu/drm/i915/i915_drv.c in i915) # # readelf -SW /lib/modules/4.9.0+/build/vmlinux | egrep -w '.text|Name' [Nr] Name Type Address Off Size ES Flg Lk Inf Al [ 1] .text PROGBITS ffffffff81000000 200000 822fd3 00 AX 0 0 4096 # So both are b0rked, now with the fix: # perf probe -m i915 -a haswell_init_clock_gating -a valleyview_init_clock_gating Added new events: probe:haswell_init_clock_gating (on haswell_init_clock_gating in i915) probe:valleyview_init_clock_gating (on valleyview_init_clock_gating in i915) You can now use it in all perf tools, such as: perf record -e probe:valleyview_init_clock_gating -aR sleep 1 # perf probe -l probe:haswell_init_clock_gating (on haswell_init_clock_gating@gpu/drm/i915/intel_pm.c in i915) probe:valleyview_init_clock_gating (on valleyview_init_clock_gating@gpu/drm/i915/intel_pm.c in i915) # Both looks correct. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/148411436777.9978.1440275861947194930.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-16perf pmu: Factor out scale conversion codeAndi Kleen
Move the scale factor parsing code to an own function to reuse it in an upcoming patch. v2: Return error in case strdup returns NULL. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170103150833.6694-2-andi@firstfloor.org [ Keep returning -ENOMEM when strdup() fails in perf_pmu__parse_scale()/convert_scale() ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11perf record: Add switch-output size warningJiri Olsa
Adding switch-output size warning if the requested size of lower than the wakeup ring buffer size. $ perf record --switch-output=1K ls WARNING: switch-output data size lower than wakeup kernel buffer size (258K) expect bigger perf.data sizes ... Signed-off-by: Jiri Olsa <jolsa@kernel.org> Suggested-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1483955520-29063-6-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11perf tools: Add unit_number__scnprintf functionJiri Olsa
Add unit_number__scnprintf function to display size units and use it in -m option info message. Before: $ perf record -m 10M ls rounding mmap pages size to 16777216 bytes (4096 pages) ... After: $ perf record -m 10M ls rounding mmap pages size to 16M (4096 pages) ... Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/r/1483955520-29063-2-git-send-email-jolsa@kernel.org [ Rename it to unit_number__scnprintf for consistency ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11perf evlist: Fix typo in perf_evlist__start_workload()Soramichi Akiyama
This patch fixes a typo: s/enable to/unable to/ Signed-off-by: Soramichi AKIYAMA <akiyama@m.soramichi.jp> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: bcf3145fbeb1 ("perf evlist: Enhance perf_evlist__start_workload()") Link: http://lkml.kernel.org/r/20170110200006.e1f7a766b4faf1f107ae2e1b@m.soramichi.jp [ Wasn't applying, fixed it up by hand, added Fixes: tag ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-11perf machine: Add a kallsyms loading constructorArnaldo Carvalho de Melo
To reduce the boilerplate for searching for functions in the running kernel and modules. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-93iqzayafpaxaguoiwjqezgz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-04perf probe: Fix to probe on gcc generated symbols for offline kernelMasami Hiramatsu
Fix perf-probe to show probe definition on gcc generated symbols for offline kernel (including cross-arch kernel image). gcc sometimes optimizes functions and generate new symbols with suffixes such as ".constprop.N" or ".isra.N" etc. Since those symbol names are not recorded in DWARF, we have to find correct generated symbols from offline ELF binary to probe on it (kallsyms doesn't correct it). For online kernel or uprobes we don't need it because those are rebased on _text, or a section relative address. E.g. Without this: $ perf probe -k build-arm/vmlinux -F __slab_alloc* __slab_alloc.constprop.9 $ perf probe -k build-arm/vmlinux -D __slab_alloc p:probe/__slab_alloc __slab_alloc+0 If you put above definition on target machine, it should fail because there is no __slab_alloc in kallsyms. With this fix, perf probe shows correct probe definition on __slab_alloc.constprop.9: $ perf probe -k build-arm/vmlinux -D __slab_alloc p:probe/__slab_alloc __slab_alloc.constprop.9+0 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/148350060434.19001.11864836288580083501.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-04perf probe: Fix --funcs to show correct symbols for offline moduleMasami Hiramatsu
Fix --funcs (-F) option to show correct symbols for offline module. Since previous perf-probe uses machine__findnew_module_map() for offline module, even if user passes a module file (with full path) which is for other architecture, perf-probe always tries to load symbol map for current kernel module. This fix uses dso__new_map() to load the map from given binary as same as a map for user applications. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/148350053478.19001.15435255244512631545.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-03perf symbols: Robustify reading of build-id from sysfsArnaldo Carvalho de Melo
Markus reported that perf segfaults when reading /sys/kernel/notes from a kernel linked with GNU gold, due to what looks like a gold bug, so do some bounds checking to avoid crashing in that case. Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de> Report-Link: http://lkml.kernel.org/r/20161219161821.GA294@x4 Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ryhgs6a6jxvz207j2636w31c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-01-02perf probe: Fix to get correct modname from elf headerMasami Hiramatsu
Since 'perf probe' supports cross-arch probes, it is possible to analyze different arch kernel image which has different bits-per-long. In that case, it fails to get the module name because it uses the MOD_NAME_OFFSET macro based on the host machine bits-per-long, instead of the target arch bits-per-long. This fixes above issue by changing modname-offset based on the target archs bit width. This is ok because linux kernel uses LP64 model on 64bit arch. E.g. without this (on x86_64, and target module is arm32): $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup p:probe/configfs_lookup :configfs_lookup+0 ^-Here is an empty module name. With this fix, you can see correct module name: $ perf probe -m build-arm/fs/configfs/configfs.ko -D configfs_lookup p:probe/configfs_lookup configfs:configfs_lookup+0 Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/148337043836.6752.383495516397005695.stgit@devbox Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-20perf diff: Do not overwrite valid build idKan Liang
Fixes a perf diff regression issue which was introduced by commit 5baecbcd9c9a ("perf symbols: we can now read separate debug-info files based on a build ID") The binary name could be same when perf diff different binaries. Build id is used to distinguish between them. However, the previous patch assumes the same binary name has same build id. So it overwrites the build id according to the binary name, regardless of whether the build id is set or not. Check the has_build_id in dso__load. If the build id is already set, use it. Before the fix: $ perf diff 1.perf.data 2.perf.data # Event 'cycles' # # Baseline Delta Shared Object Symbol # ........ ....... ................ ............................. # 99.83% -99.80% tchain_edit [.] f2 0.12% +99.81% tchain_edit [.] f3 0.02% -0.01% [ixgbe] [k] ixgbe_read_reg After the fix: $ perf diff 1.perf.data 2.perf.data # Event 'cycles' # # Baseline Delta Shared Object Symbol # ........ ....... ................ ............................. # 99.83% +0.10% tchain_edit [.] f3 0.12% -0.08% tchain_edit [.] f2 Signed-off-by: Kan Liang <kan.liang@intel.com> Cc: Andi Kleen <andi@firstfloor.org> CC: Dima Kogan <dima@secretsauce.net> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Fixes: 5baecbcd9c9a ("perf symbols: we can now read separate debug-info files based on a build ID") Link: http://lkml.kernel.org/r/1481642984-13593-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-20perf annotate: Don't throw error for zero length symbolsRavi Bangoria
'perf report --tui' exits with error when it finds a sample of zero length symbol (i.e. addr == sym->start == sym->end). Actually these are valid samples. Don't exit TUI and show report with such symbols. Reported-and-Tested-by: Anton Blanchard <anton@samba.org> Link: https://lkml.org/lkml/2016/10/8/189 Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Chris Riyder <chris.ryder@arm.com> Cc: linuxppc-dev@lists.ozlabs.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: stable@kernel.org # v4.9+ Link: http://lkml.kernel.org/r/1479804050-5028-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15perf annotate: Fix jump target outside of function address rangeRavi Bangoria
If jump target is outside of function range, perf is not handling it correctly. Especially when target address is lesser than function start address, target offset will be negative. But, target address declared to be unsigned, converts negative number into 2's complement. See below example. Here target of 'jumpq' instruction at 34cf8 is 34ac0 which is lesser than function start address(34cf0). 34ac0 - 34cf0 = -0x230 = 0xfffffffffffffdd0 Objdump output: 0000000000034cf0 <__sigaction>: __GI___sigaction(): 34cf0: lea -0x20(%rdi),%eax 34cf3: cmp -bashx1,%eax 34cf6: jbe 34d00 <__sigaction+0x10> 34cf8: jmpq 34ac0 <__GI___libc_sigaction> 34cfd: nopl (%rax) 34d00: mov 0x386161(%rip),%rax # 3bae68 <_DYNAMIC+0x2e8> 34d07: movl -bashx16,%fs:(%rax) 34d0e: mov -bashxffffffff,%eax 34d13: retq perf annotate before applying patch: __GI___sigaction /usr/lib64/libc-2.22.so lea -0x20(%rdi),%eax cmp -bashx1,%eax v jbe 10 v jmpq fffffffffffffdd0 nop 10: mov _DYNAMIC+0x2e8,%rax movl -bashx16,%fs:(%rax) mov -bashxffffffff,%eax retq perf annotate after applying patch: __GI___sigaction /usr/lib64/libc-2.22.so lea -0x20(%rdi),%eax cmp -bashx1,%eax v jbe 10 ^ jmpq 34ac0 <__GI___libc_sigaction> nop 10: mov _DYNAMIC+0x2e8,%rax movl -bashx16,%fs:(%rax) mov -bashxffffffff,%eax retq Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1480953407-7605-3-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15perf annotate: Support jump instruction with target as second operandRavi Bangoria
Architectures like PowerPC have jump instructions that includes a target address as a second operand. For example, 'bne cr7,0xc0000000000f6154'. Add support for such instruction in perf annotate. objdump o/p: c0000000000f6140: ld r9,1032(r31) c0000000000f6144: cmpdi cr7,r9,0 c0000000000f6148: bne cr7,0xc0000000000f6154 c0000000000f614c: ld r9,2312(r30) c0000000000f6150: std r9,1032(r31) c0000000000f6154: ld r9,88(r31) Corresponding perf annotate o/p: Before patch: ld r9,1032(r31) cmpdi cr7,r9,0 v bne 3ffffffffff09f2c ld r9,2312(r30) std r9,1032(r31) 74: ld r9,88(r31) After patch: ld r9,1032(r31) cmpdi cr7,r9,0 v bne 74 ld r9,2312(r30) std r9,1032(r31) 74: ld r9,88(r31) Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1480953407-7605-2-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15perf evsel: Allow to ignore missing pidJiri Olsa
Adding perf_evsel::ignore_missing_cpu_thread bool. When set true, it allows perf to ignore error of missing pid of perf event syscall. We remove missing thread id from the thread_map, so the rest of the processing like ioctl and mmap won't get disturbed with -1 fd. The reason for supporting this is to ease up monitoring group of pids, that 'disappear' before perf opens their event. This currently leads perf to report error and exit and makes perf record's -u option unusable under certain setup. With this change we will allow this race and ignore such failure with following warning: WARNING: Ignored open failure for pid 8605 Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/20161213074622.GA3084@krava Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15perf thread_map: Add thread_map__remove functionJiri Olsa
Add thread_map__remove function to remove thread from thread map. Add automated test also. Committer notes: Testing it: # perf test "Remove thread map" 39: Remove thread map : Ok # perf test -v "Remove thread map" 39: Remove thread map : --- start --- test child forked, pid 4483 2 threads: 4482, 4483 1 thread: 4483 0 thread: test child finished with 0 ---- end ---- Remove thread map: Ok # Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1481538943-21874-4-git-send-email-jolsa@kernel.org [ Added stdlib.h, to get the free() declaration ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-15perf evsel: Use variable instead of repeating lengthy FD macroJiri Olsa
It's more readable and will ease up following patches. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/r/1481538943-21874-3-git-send-email-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-07perf callchain: Introduce callchain_cursor__copy()Namhyung Kim
The callchain_cursor__copy() function is to save current callchain captured by a cursor. It'll be used to keep callchains when switching to idle task for each cpu. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <andi@firstfloor.org> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20161206034010.6499-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05perf annotate: Show raw form for jump instruction with indirect targetRavi Bangoria
For jump instructions that does not include target address as direct operand, show the original disassembled line for them. This is needed for certain powerpc jump instructions that use target address in a register (such as bctr, btar, ...). Before: ld r12,32088(r12) mtctr r12 v bctr ffffffffffffca2c std r2,24(r1) addis r12,r2,-1 After: ld r12,32088(r12) mtctr r12 v bctr std r2,24(r1) addis r12,r2,-1 Committer notes: Testing it using a perf.data file and vmlinux for powerpc64, cross-annotating it on a x86_64 workstation: Before: .__bpf_prog_run vmlinux.powerpc │ std r10,512(r9) ▒ │ lbz r9,0(r31) ▒ │ rldicr r9,r9,3,60 ▒ │ ldx r9,r30,r9 ▒ │ mtctr r9 ▒ 100.00 │ ↓ bctr 3fffffffffe01510 ▒ │ lwa r10,4(r31) ▒ │ lwz r9,0(r31) ▒ <SNIP> Invalid jump offset: 3fffffffffe01510 After: .__bpf_prog_run vmlinux.powerpc │ std r10,512(r9) ▒ │ lbz r9,0(r31) ▒ │ rldicr r9,r9,3,60 ▒ │ ldx r9,r30,r9 ▒ │ mtctr r9 ▒ 100.00 │ ↓ bctr ▒ │ lwa r10,4(r31) ▒ │ lwz r9,0(r31) ▒ <SNIP> Invalid jump offset: 3fffffffffe01510 This, in turn, uncovers another problem with jumps without operands, the ENTER/-> operation, to jump to the target, still continues using the bogus target :-) BTW, this was the file used for the above tests: [acme@jouet ravi_bangoria]$ perf report --header-only -i perf.data.f22vm.powerdev # ======== # captured on: Thu Nov 24 12:40:38 2016 # hostname : pdev-f22-qemu # os release : 4.4.10-200.fc22.ppc64 # perf version : 4.9.rc1.g6298ce # arch : ppc64 # nrcpus online : 48 # nrcpus avail : 48 # cpudesc : POWER7 (architected), altivec supported # cpuid : 74,513 # total memory : 4158976 kB # cmdline : /home/ravi/Workspace/linux/tools/perf/perf record -a # event : name = cycles:ppp, , size = 112, { sample_period, sample_freq } = 4000, sample_type = IP|TID|TIME|CPU|PERIOD, disabled = 1, inherit = 1, mmap = 1, c # HEADER_CPU_TOPOLOGY info available, use -I to display # HEADER_NUMA_TOPOLOGY info available, use -I to display # pmu mappings: cpu = 4, software = 1, tracepoint = 2, breakpoint = 5 # missing features: HEADER_TRACING_DATA HEADER_BRANCH_STACK HEADER_GROUP_DESC HEADER_AUXTRACE HEADER_STAT HEADER_CACHE # ======== # [acme@jouet ravi_bangoria]$ Suggested-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Riyder <chris.ryder@arm.com> Cc: Kim Phillips <kim.phillips@arm.com> Cc: Markus Trippelsdorf <markus@trippelsdorf.de> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Taeung Song <treeze.taeung@gmail.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/r/1480953407-7605-1-git-send-email-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05perf clang: Compile BPF script using builtin clang supportWang Nan
After this patch, perf utilizes builtin clang support to build BPF script, no longer depend on external clang, but fallbacking to it if for some reason the builtin compiling framework fails. Test: $ type clang -bash: type: clang: not found $ cat ~/.perfconfig $ echo '#define LINUX_VERSION_CODE 0x040700' > ./test.c $ cat ./tools/perf/tests/bpf-script-example.c >> ./test.c $ ./perf record -v --dry-run -e ./test.c 2>&1 | grep builtin bpf: successfull builtin compilation $ Can't pass cflags so unable to include kernel headers now. Will be fixed by following commits. Committer notes: Make sure '-v' comes before the '-e ./test.c' in the command line otherwise the 'verbose' variable will not be set when the bpf event is parsed and thus the pr_debug indicating a 'successfull builtin compilation' will not be output, as the debug level (1) will be less than what 'verbose' has at that point (0). Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-16-wangnan0@huawei.com [ Spell check/reflow successfull pr_debug string ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05perf clang: Support compile IR to BPF object and add testcaseWang Nan
getBPFObjectFromModule() is introduced to compile LLVM IR(Module) to BPF object. Add new testcase for it. Test result: $ ./buildperf/perf test -v clang 51: builtin clang support : 51.1: builtin clang compile C source to IR : --- start --- test child forked, pid 21822 test child finished with 0 ---- end ---- builtin clang support subtest 0: Ok 51.2: builtin clang compile C source to ELF object : --- start --- test child forked, pid 21823 test child finished with 0 ---- end ---- builtin clang support subtest 1: Ok Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-15-wangnan0@huawei.com [ Remove redundant "Test" from entry descriptions ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05perf clang: Update test case to use real BPF scriptWang Nan
Allow C++ code to use util.h and tests/llvm.h. Let 'perf test' compile a real BPF script. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-14-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05perf clang: Allow passing CFLAGS to builtin clangWang Nan
Improve getModuleFromSource() API to accept a cflags list. This feature will be used to pass LINUX_VERSION_CODE and -I flags. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-13-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05perf clang: Use real file system for #includeWang Nan
Utilize clang's OverlayFileSystem facility, allow CompilerInstance to access real file system. With this patch the '#include' directive can be used. Add a new getModuleFromSource for real file. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-12-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05perf clang: Add builtin clang support ant test caseWang Nan
Add basic clang support in clang.cpp and test__clang() testcase. The first testcase checks if builtin clang is able to generate LLVM IR. tests/clang.c is a proxy. Real testcase resides in utils/c++/clang-test.cpp in c++ and exports C interface to perf test subsystem. Test result: $ perf test -v clang 51: builtin clang support : 51.1: Test builtin clang compile C source to IR : --- start --- test child forked, pid 13215 test child finished with 0 ---- end ---- Test builtin clang support subtest 0: Ok Committer note: Make sure you've enabled CLANG and LLVM builtin support by setting the LIBCLANGLLVM variable on the make command line, e.g.: make LIBCLANGLLVM=1 O=/tmp/build/perf -C tools/perf install-bin Otherwise you'll get this when trying to do the 'perf test' call above: # perf test clang 51: builtin clang support : Skip (not compiled in) # Signed-off-by: Wang Nan <wangnan0@huawei.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-11-wangnan0@huawei.com [ Removed "Test" from descriptions, redundant and already removed from all the other entries ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05perf llvm: Extract helpers in llvm-utils.cWang Nan
The following commits will use builtin clang to compile BPF scripts. llvm__get_kbuild_opts() and llvm__get_nr_cpus() are extracted to help building '-DKERNEL_VERSION_CODE' and '-D__NR_CPUS__' macros. Doing object dumping in bpf loader, so further builtin clang compiling needn't consider it. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-7-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-05perf tools: Pass context to perf hook functionsWang Nan
Pass a pointer to perf hook functions so they receive context information during setup. Signed-off-by: Wang Nan <wangnan0@huawei.com> Cc: Alexei Starovoitov <ast@fb.com> Cc: He Kuang <hekuang@huawei.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Stringer <joe@ovn.org> Cc: Zefan Li <lizefan@huawei.com> Cc: pi3orama@163.com Link: http://lkml.kernel.org/r/20161126070354.141764-6-wangnan0@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01perf annotate: AArch64 supportKim Phillips
This is a regex converted version from the original: https://lkml.org/lkml/2016/5/19/461 Add basic support to recognise AArch64 assembly. This allows perf to identify AArch64 instructions that branch to other parts within the same function, thereby properly annotating them. Rebased onto new cross-arch annotation bits: https://lkml.org/lkml/2016/11/25/546 Sample output: security_file_permission vmlinux 5.80 │ ← ret ▒ │70: ldr w0, [x21,#68] ▒ 4.44 │ ↓ tbnz d0 ▒ │ mov w0, #0x24 // #36 ▒ 1.37 │ ands w0, w22, w0 ▒ │ ↑ b.eq 60 ▒ 1.37 │ ↓ tbnz e4 ▒ │ mov w19, #0x20000 // #131072 ▒ 1.02 │ ↓ tbz ec ▒ │90:┌─→ldr x3, [x21,#24] ▒ 1.37 │ │ add x21, x21, #0x10 ▒ │ │ mov w2, w19 ▒ 1.02 │ │ mov x0, x21 ▒ │ │ mov x1, x3 ▒ 1.71 │ │ ldr x20, [x3,#48] ▒ │ │→ bl __fsnotify_parent ▒ 0.68 │ │↑ cbnz 60 ▒ │ │ mov x2, x21 ▒ 1.37 │ │ mov w1, w19 ▒ │ │ mov x0, x20 ▒ 0.68 │ │ mov w5, #0x0 // #0 ▒ │ │ mov x4, #0x0 // #0 ▒ 1.71 │ │ mov w3, #0x1 // #1 ▒ │ │→ bl fsnotify ▒ 1.37 │ │↑ b 60 ▒ │d0:│ mov w0, #0x0 // #0 ▒ │ │ ldp x19, x20, [sp,#16] ▒ │ │ ldp x21, x22, [sp,#32] ▒ │ │ ldp x29, x30, [sp],#48 ▒ │ │← ret ▒ │e4:│ mov w19, #0x10000 // #65536 ▒ │ └──b 90 ◆ │ec: brk #0x800 ▒ Press 'h' for help on key bindings Signed-off-by: Kim Phillips <kim.phillips@arm.com> Signed-off-by: Chris Ryder <chris.ryder@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20161130092344.012e18e3e623bea395162f95@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01perf annotate: Use arch->objdump.comment_char in dec__parse()Kim Phillips
Presume neglected in commit 786c1b5 "perf annotate: Start supporting cross arch annotation". This doesn't fix a bug since none of the affected arches support parsing dec/inc instructions yet. Signed-off-by: Kim Phillips <kim.phillips@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Chris Ryder <chris.ryder@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20161130092333.1cca5dd2c77e1790d61c1e9c@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2016-12-01perf tools: Move parse_nsec_time to time-utils.cDavid Ahern
Code move only; no functional change intended. Committer notes: Fix the build on Ubuntu 16.04 x86-64 cross-compiling to S/390, with this set of auto-detected features: ... dwarf: [ on ] ... dwarf_getlocations: [ on ] ... glibc: [ on ] ... gtk2: [ OFF ] ... libaudit: [ OFF ] ... libbfd: [ OFF ] ... libelf: [ on ] ... libnuma: [ OFF ] ... numa_num_possible_cpus: [ OFF ] ... libperl: [ OFF ] ... libpython: [ OFF ] ... libslang: [ OFF ] ... libcrypto: [ OFF ] ... libunwind: [ OFF ] ... libdw-dwarf-unwind: [ on ] ... zlib: [ on ] ... lzma: [ OFF ] ... get_cpuid: [ OFF ] ... bpf: [ on ] Where it was failing with: CC /tmp/build/perf/util/time-utils.o util/time-utils.c: In function 'parse_nsec_time': util/time-utils.c:17:13: error: implicit declaration of function 'strtoul' [-Werror=implicit-function-declaration] time_sec = strtoul(str, &end, 10); ^ util/time-utils.c:17:2: error: nested extern declaration of 'strtoul' [-Werror=nested-externs] time_sec = strtoul(str, &end, 10); ^ util/time-utils.c: In function 'perf_time__parse_str': util/time-utils.c:93:2: error: implicit declaration of function 'free' [-Werror=implicit-function-declaration] free(str); ^ util/time-utils.c:93:2: error: incompatible implicit declaration of built-in function 'free' [-Werror] util/time-utils.c:93:2: note: include '<stdlib.h>' or provide a declaration of 'free' Do as suggested and add a '#include <stdlib.h>' to get the free() and strtoul() declarations and fix the build. Signed-off-by: David Ahern <dsahern@gmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1480439746-42695-3-git-send-email-dsahern@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>