author | Ingo Molnar <mingo@kernel.org> | 2016-04-27 17:02:24 +0200 |
---|---|---|
committer | Ingo Molnar <mingo@kernel.org> | 2016-04-27 17:02:24 +0200 |
commit | a8944c5bf86dc6c153a71f2a386738c0d3f5ff9c (patch) | |
tree | a251b1d510831dc071eadbbbe3e38a85fe643365 /tools/perf/builtin-script.c | |
parent | 67d61296ffcc850bffdd4466430cb91e5328f39a (diff) | |
parent | 4cb93446c587d56e2a54f4f83113daba2c0b6dee (diff) |
Merge tag 'perf-core-for-mingo-20160427' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- perf trace --pf maj/min/all works with --call-graph: (Arnaldo Carvalho de Melo)
Tracing write syscalls and major page faults with callchains while starting
firefox, limiting the stack to 5 frames:
# perf trace -e write --pf maj --max-stack 5 firefox
589.549 ( 0.014 ms): firefox/15377 write(fd: 4, buf: 0x7fff80acc898, count: 151) = 151
[0xfaed] (/usr/lib64/libpthread-2.22.so)
fire_glxtest_process+0x5c (/usr/lib64/firefox/libxul.so)
InstallGdkErrorHandler+0x41 (/usr/lib64/firefox/libxul.so)
XREMain::XRE_mainInit+0x12c (/usr/lib64/firefox/libxul.so)
XREMain::XRE_main+0x1e4 (/usr/lib64/firefox/libxul.so)
760.704 ( 0.000 ms): firefox/15332 majfault [gtk_tree_view_accessible_get_type+0x0] => /usr/lib64/libgtk-3.so.0.1800.9@0xa0850 (x.)
gtk_tree_view_accessible_get_type+0x0 (/usr/lib64/libgtk-3.so.0.1800.9)
gtk_tree_view_class_intern_init+0x1a54 (/usr/lib64/libgtk-3.so.0.1800.9)
g_type_class_ref+0x6dd (/usr/lib64/libgobject-2.0.so.0.4600.2)
[0x115378] (/usr/lib64/libgnutls.so.30.6.3)
This automagically selects "--call-graph dwarf", use "--call-graph fp" on systems
where -fno-omit-frame-pointer was used to build the components of interest, to
incur less overhead, or tune "--call-graph dwarf" appropriately, see 'perf record --help'.
- Allow /proc/sys/kernel/perf_event_max_stack, that defaults to the old hard coded value
of PERF_MAX_STACK_DEPTH (127), useful for huge callstacks for things like Groovy, Ruby, etc,
and also to reduce overhead by limiting it to a smaller value, upcoming work will allow
this to be done per-event (Arnaldo Carvalho de Melo)
- Make 'perf trace --min-stack' be honoured by --pf and --event (Arnaldo Carvalho de Melo)
- Make 'perf evlist -v' decode perf_event_attr->branch_sample_type (Arnaldo Carvalho de Melo)
# perf record --call lbr usleep 1
# perf evlist -v
cycles:ppp: ... sample_type: IP|TID|TIME|CALLCHAIN|PERIOD|BRANCH_STACK, ...
branch_sample_type: USER|CALL_STACK|NO_FLAGS|NO_CYCLES
#
- Clear dummy entry accumulated period, fixing such 'perf top/report' output
as: (Kan Liang)
4769.98% 0.01% 0.00% 0.01% tchain_edit [kernel] [k] update_fast_timekeeper
- System calls with pid_t arguments get them augmented with the COMM event
more thoroughly:
# trace -e perf_event_open perf stat -e cycles -p 15608
6.876 ( 0.014 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15608 (hexchat), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
6.882 ( 0.005 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15639 (gmain), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
6.889 ( 0.005 ms): perf_event_open(attr_uptr: 0x2ae20d8, pid: 15640 (gdbus), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 5
^^^^^^^^^^^^^^^^^^
^C
- Fix offline module name mismatch issue in 'perf probe' (Ravi Bangoria)
- Fix module probe issue if no dwarf support in (Ravi Bangoria)
Assorted fixes:
- Fix off-by-one in write_buildid() (Andrey Ryabinin)
- Fix segfault when printing callchains in 'perf script' (Chris Phlipot)
- Replace assignment with comparison on assert check in 'perf test' entry (Colin Ian King)
- Fix off-by-one comparison in intel-pt code (Colin Ian King)
- Close target file on error path in 'perf probe' (Masami Hiramatsu)
- Set default kprobe group name if not given in 'perf probe' (Masami Hiramatsu)
- Avoid partial perf_event_header reads (Wang Nan)
Infrastructure changes:
- Update x86's syscall_64.tbl copy, adding preadv2 & pwritev2 (Arnaldo Carvalho de Melo)
- Make the x86 clean quiet wrt syscall table removal (Jiri Olsa)
Cleanups:
- Simplify wrapper for LOCK_PI in 'perf bench futex' (Davidlohr Bueso)
- Remove duplicate const qualifier (Eric Engestrom)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Diffstat (limited to 'tools/perf/builtin-script.c')
-rw-r--r-- | tools/perf/builtin-script.c | 16 |
1 files changed, 9 insertions, 7 deletions
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 5099740aa50bc..efca81679bb31 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -570,12 +570,12 @@ static void print_sample_bts(struct perf_sample *sample,
 	/* print branch_from information */
 	if (PRINT_FIELD(IP)) {
 		unsigned int print_opts = output[attr->type].print_ip_opts;
-		struct callchain_cursor *cursor = NULL, cursor_callchain;
+		struct callchain_cursor *cursor = NULL;
 
 		if (symbol_conf.use_callchain && sample->callchain &&
-		    thread__resolve_callchain(al->thread, &cursor_callchain, evsel,
+		    thread__resolve_callchain(al->thread, &callchain_cursor, evsel,
 					      sample, NULL, NULL, scripting_max_stack) == 0)
-			cursor = &cursor_callchain;
+			cursor = &callchain_cursor;
 
 		if (cursor == NULL) {
 			putchar(' ');
@@ -789,12 +789,12 @@ static void process_event(struct perf_script *script,
 		printf("%16" PRIu64, sample->weight);
 
 	if (PRINT_FIELD(IP)) {
-		struct callchain_cursor *cursor = NULL, cursor_callchain;
+		struct callchain_cursor *cursor = NULL;
 
 		if (symbol_conf.use_callchain && sample->callchain &&
-		    thread__resolve_callchain(al->thread, &cursor_callchain, evsel,
+		    thread__resolve_callchain(al->thread, &callchain_cursor, evsel,
 					      sample, NULL, NULL, scripting_max_stack) == 0)
-			cursor = &cursor_callchain;
+			cursor = &callchain_cursor;
 
 		putchar(cursor ? '\n' : ' ');
 		sample__fprintf_sym(sample, al, 0, output[attr->type].print_ip_opts, cursor, stdout);
@@ -2031,7 +2031,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 	OPT_UINTEGER(0, "max-stack", &scripting_max_stack,
 		     "Set the maximum stack depth when parsing the callchain, "
 		     "anything beyond the specified depth will be ignored. "
-		     "Default: " __stringify(PERF_MAX_STACK_DEPTH)),
+		     "Default: kernel.perf_event_max_stack or " __stringify(PERF_MAX_STACK_DEPTH)),
 	OPT_BOOLEAN('I', "show-info", &show_full_info,
 		    "display extended information from perf.data file"),
 	OPT_BOOLEAN('\0', "show-kernel-path", &symbol_conf.show_kernel_path,
@@ -2067,6 +2067,8 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 		NULL
 	};
 
+	scripting_max_stack = sysctl_perf_event_max_stack;
+
 	setup_scripting();
 
 	argc = parse_options_subcommand(argc, argv, options, script_subcommands, script_usage,
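
The diff makes the --max-stack default follow kernel.perf_event_max_stack, with PERF_MAX_STACK_DEPTH (127) as the documented fallback. The standalone sketch below only illustrates that "sysctl value, else the old constant" idea and is not the perf tool's own implementation: the helper name read_sysctl_int is hypothetical, and it assumes the sysctl is exposed at /proc/sys/kernel/perf_event_max_stack as the commit message describes.

```c
#include <stdio.h>

/* Old hard-coded limit named in the commit message. */
#define PERF_MAX_STACK_DEPTH 127

/*
 * Hypothetical helper, for illustration only: read an integer sysctl
 * from a /proc/sys path. Returns fallback when the file cannot be
 * opened or parsed, e.g. on a kernel that predates the knob.
 */
static int read_sysctl_int(const char *path, int fallback)
{
	FILE *fp = fopen(path, "r");
	int value;

	if (fp == NULL)
		return fallback;
	if (fscanf(fp, "%d", &value) != 1)
		value = fallback;
	fclose(fp);
	return value;
}
```

A caller would obtain the default with read_sysctl_int("/proc/sys/kernel/perf_event_max_stack", PERF_MAX_STACK_DEPTH) before parsing options, so that an explicit --max-stack on the command line still overrides it, which mirrors where the diff assigns scripting_max_stack.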