Age | Commit message (Collapse) | Author |
|
Add perf_data__create_dir() to create nr files inside 'struct perf_data'
path directory:
int perf_data__create_dir(struct perf_data *data, int nr);
and function to close that data:
void perf_data__close_dir(struct perf_data *data);
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
And display the error message from removing the old data file:
$ perf record ls
Can't remove old data: Permission denied (perf.data.old)
Perf session creation failed.
$ perf record ls
Can't remove old data: Unknown file found (perf.data.old)
Perf session creation failed.
Not sure how to make fail the rename (after we successfully remove the
destination file/dir) to show the message, anyway let's have it there.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Change check_backup() to call rm_rf_perf_data() instead of unlink() to
work over directory paths.
Also move the call earlier in the code, before we fork for file/dir, so
it can backup also directory data.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To remove perf.data including the directory, with checking on expected
files and no other directories inside.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Suggested-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add pattern argument to rm_rf_depth() (and rename it to rm_rf_depth_pat())
to specify the name pattern files need to match inside the directory.
The function fails if we find different file to remove.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Adding depth argument to rm_rf (and renaming it to rm_rf_depth) to
specify the depth we will go searching for files to remove.
It will be used to specify single depth for perf.data directory removal
in following patch.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190224190656.30163-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Do insertions and removal of filters during rehash in higher volumes.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add checking of newly added trace.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Track the basic codepaths of delta rehash handling,
using mlxsw tracepoints. Use IPv6 addresses.
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement test that runs 5 instances of tc replace filter in parallel with
5 instances of tc del filter from same tp instance. Each instance uses its
own filter handle and key range.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement test that runs 5 instances of tc add filter in parallel with 5
instances of tc del filter from same tp instance. Each instance uses its
own filter handle and key range.
Extend tdc_multibatch.py with additional options required to implement the
test: common prefix for all generated batch files, first value of filter
handle range, MAC address prefix modifier. These are necessary to allow
creating batch files with unique keys and handle ranges with multiple
invocation of tdc_multibatch.py helper script.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement test that verifies concurrent deletion of rules by executing 10
tc instances that delete flower filters in same handle range. In this case
only one tc instance succeeds in deleting a filter with particular handle.
To mitigate expected failures of all other instances, run tc with 'force'
option to continue processing batch file in case of errors and expect xargs
to return code '123' that indicates that invocation of command(s) exited
with error in range 1-125.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement test that verifies concurrent replacement of rules by executing
10 tc instances that replace flower filters in same handle range.
Extend tdc_multibatch.py script with new optional CLI argument that is used
to generate all batch files with same filter handle range.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement test that verifies parallel rules replacement by adding 1 million
flower filters and then replacing them with 10 concurrent tc instances.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement test that verifies parallel rules deletion by adding 1 million
flower filters and then deleting them with 10 concurrent tc instances.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Implement test that verifies parallel rules insertion by adding 1 million
flower filters with 10 concurrent tc instances. Put it to standalone
'concurrency' category.
Implement tdc_multibatch.py helper script that is used to generate multiple
batch files for concurrent tc execution. Extend config with new 'BATCH_DIR'
variable to specify temporary output directory that is used to store batch
files generated by tdc_multibatch.py.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extend tdc_batch.py with several optional CLI arguments that are used for
implementation of concurrency tests in following patches in this set:
- Add optional argument to specify range of filter handles used in batch
file [fitler_handle, filter_handle+number). This is needed for testing
filter deletion where it is necessary to know exact handles of configured
filters.
- Add optional argument to specify filter operation type (possible values
are ['add', 'del', 'replace']) instead of hardcoded "add" value. This
allows generation of batches for filter addition, deletion and
replacement.
- Add optional argument to allow user to change mac address prefix that is
used for all filters in batch. This is necessary to allow generating
multiple batches with unique flower classifier keys.
Signed-off-by: Vlad Buslov <vladbu@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Three conflicts, one of which, for marvell10g.c is non-trivial and
requires some follow-up from Heiner or someone else.
The issue is that Heiner converted the marvell10g driver over to
use the generic c45 code as much as possible.
However, in 'net' a bug fix appeared which makes sure that a new
local mask (MDIO_AN_10GBT_CTRL_ADV_NBT_MASK) with value 0x01e0
is cleared.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Just like commit e2ba732a1681 ("selftests: fib_tests: sleep after
changing carrier"), wait one second to allow linkwatch to propagate the
carrier change to the stack.
There are two sets of carrier tests. The first slept after the carrier
was set to off, and when the second set ran, it was likely that the
linkwatch would be able to run again without much delay, reducing the
likelihood of a race. However, if you run 'fib_tests.sh -t carrier' on a
loop, you will quickly notice the failures.
Sleeping on the second set of tests make the failures go away.
Cc: David Ahern <dsahern@gmail.com>
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a 'path' member to 'struct perf_data'. It will keep the configured
path for the data (const char *). The path in struct perf_data_file is
now dynamically allocated (duped) from it.
This scheme is useful/used in following patches where struct
perf_data::path holds the 'configure' directory path and struct
perf_data_file::path holds the allocated path for specific files.
Also it actually makes the code little simpler.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190221094145.9151-3-jolsa@kernel.org
[ Fixup data-convert-bt.c missing conversion ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We are about to add support for multiple files, so we need each file to
keep its size.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190221094145.9151-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a new report to display top calls by elapsed time. It displays calls
in descending order of time elapsed between when the function was called
and when it returned.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If no selection is made on the 'Selected branches' dialog, then the
output is the same as the 'All branches' report. That is not really an
error, and is not desirable for future reports, so remove it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Remove SQLTableDialogDataItem as it is no longer used.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Create new dialog data item classes to replace SQLTableDialogDataItem.
This separates out different dialog data items and makes it easier to
add new ones. SQLTableDialogDataItem is removed in a separate patch
because it makes the diff more readable.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The report name is a report variable so move it into into ReportVars.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Factor out ReportVars to provide a single container for information from
report dialogs.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Factor out ReportDialogBase so it can be re-used.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move column headers from SQLAutoTableModel into SQLTableModel so that
they can be used for other models based on SQLTableModel.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
calls table
The Call Graph depends on the calls table which is optional when exporting
data, so hide the Call Graph option if there is no calls table.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Remove leftover debugging prints.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
exported-sql-viewer.py is a standalone python script and requires a
shebang. Also only python2 is supported at present. Restore the shebang
but use the more flexible 'env' form.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Fixes: a38352de4495 ("perf script python: Remove explicit shebang from Python script")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
x86 retpoline functions pollute the call graph by showing up everywhere
there is an indirect branch, but they do not really mean anything. Make
changes so that the default retpoline functions will no longer appear in
the call graph. Note this only affects the call graph, since all the
original branches are left unchanged.
This does not handle function return thunks, nor is there any
improvement for the handling of inline thunks or extern thunks.
Example:
$ cat simple-retpoline.c
__attribute__((noinline)) int bar(void)
{
return -1;
}
int foo(void)
{
return bar() + 1;
}
__attribute__((indirect_branch("thunk"))) int main()
{
int (*volatile fn)(void) = foo;
fn();
return fn();
}
$ gcc -ggdb3 -Wall -Wextra -O2 -o simple-retpoline simple-retpoline.c
$ objdump -d simple-retpoline
<SNIP>
0000000000001040 <main>:
1040: 48 83 ec 18 sub $0x18,%rsp
1044: 48 8d 05 25 01 00 00 lea 0x125(%rip),%rax # 1170 <foo>
104b: 48 89 44 24 08 mov %rax,0x8(%rsp)
1050: 48 8b 44 24 08 mov 0x8(%rsp),%rax
1055: e8 1f 01 00 00 callq 1179 <__x86_indirect_thunk_rax>
105a: 48 8b 44 24 08 mov 0x8(%rsp),%rax
105f: 48 83 c4 18 add $0x18,%rsp
1063: e9 11 01 00 00 jmpq 1179 <__x86_indirect_thunk_rax>
<SNIP>
0000000000001160 <bar>:
1160: b8 ff ff ff ff mov $0xffffffff,%eax
1165: c3 retq
<SNIP>
0000000000001170 <foo>:
1170: e8 eb ff ff ff callq 1160 <bar>
1175: 83 c0 01 add $0x1,%eax
1178: c3 retq
0000000000001179 <__x86_indirect_thunk_rax>:
1179: e8 07 00 00 00 callq 1185 <__x86_indirect_thunk_rax+0xc>
117e: f3 90 pause
1180: 0f ae e8 lfence
1183: eb f9 jmp 117e <__x86_indirect_thunk_rax+0x5>
1185: 48 89 04 24 mov %rax,(%rsp)
1189: c3 retq
<SNIP>
$ perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0,017 MB simple-retpoline.perf.data ]
$ perf script -i simple-retpoline.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
2019-01-08 14:03:37.851655 Creating database...
2019-01-08 14:03:37.863256 Writing records...
2019-01-08 14:03:38.069750 Adding indexes
2019-01-08 14:03:38.078799 Done
$ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py simple-retpoline.db
Before:
main
-> __x86_indirect_thunk_rax
-> __x86_indirect_thunk_rax
-> foo
-> bar
After:
main
-> foo
-> bar
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190109091835.5570-7-adrian.hunter@intel.com
[ Remove (sym->name != NULL) test, this is not a pointer and breaks the build with clang version 7.0.1 (Fedora 7.0.1-2.fc30) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
trie_delete_elem() was deleting an entry even though it was not matching
if the prefixlen was correct. This patch adds a check on matchlen.
Reproducer:
$ sudo bpftool map create /sys/fs/bpf/mylpm type lpm_trie key 8 value 1 entries 128 name mylpm flags 1
$ sudo bpftool map update pinned /sys/fs/bpf/mylpm key hex 10 00 00 00 aa bb cc dd value hex 01
$ sudo bpftool map dump pinned /sys/fs/bpf/mylpm
key: 10 00 00 00 aa bb cc dd value: 01
Found 1 element
$ sudo bpftool map delete pinned /sys/fs/bpf/mylpm key hex 10 00 00 00 ff ff ff ff
$ echo $?
0
$ sudo bpftool map dump pinned /sys/fs/bpf/mylpm
Found 0 elements
A similar reproducer is added in the selftests.
Without the patch:
$ sudo ./tools/testing/selftests/bpf/test_lpm_map
test_lpm_map: test_lpm_map.c:485: test_lpm_delete: Assertion `bpf_map_delete_elem(map_fd, key) == -1 && errno == ENOENT' failed.
Aborted
With the patch: test_lpm_map runs without errors.
Fixes: e454cf595853 ("bpf: Implement map_delete_elem for BPF_MAP_TYPE_LPM_TRIE")
Cc: Craig Gallek <kraig@google.com>
Signed-off-by: Alban Crequy <alban@kinvolk.io>
Acked-by: Craig Gallek <kraig@google.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
Improve thread_stack__no_call_return() to better handle 'returns' that
do not match the stack i.e. 'no call'. See code comments for details.
The example below shows how retpolines are affected:
Example:
$ cat simple-retpoline.c
__attribute__((noinline)) int bar(void)
{
return -1;
}
int foo(void)
{
return bar() + 1;
}
__attribute__((indirect_branch("thunk"))) int main()
{
int (*volatile fn)(void) = foo;
fn();
return fn();
}
$ gcc -ggdb3 -Wall -Wextra -O2 -o simple-retpoline simple-retpoline.c
$ objdump -d simple-retpoline
<SNIP>
0000000000001040 <main>:
1040: 48 83 ec 18 sub $0x18,%rsp
1044: 48 8d 05 25 01 00 00 lea 0x125(%rip),%rax # 1170 <foo>
104b: 48 89 44 24 08 mov %rax,0x8(%rsp)
1050: 48 8b 44 24 08 mov 0x8(%rsp),%rax
1055: e8 1f 01 00 00 callq 1179 <__x86_indirect_thunk_rax>
105a: 48 8b 44 24 08 mov 0x8(%rsp),%rax
105f: 48 83 c4 18 add $0x18,%rsp
1063: e9 11 01 00 00 jmpq 1179 <__x86_indirect_thunk_rax>
<SNIP>
0000000000001160 <bar>:
1160: b8 ff ff ff ff mov $0xffffffff,%eax
1165: c3 retq
<SNIP>
0000000000001170 <foo>:
1170: e8 eb ff ff ff callq 1160 <bar>
1175: 83 c0 01 add $0x1,%eax
1178: c3 retq
0000000000001179 <__x86_indirect_thunk_rax>:
1179: e8 07 00 00 00 callq 1185 <__x86_indirect_thunk_rax+0xc>
117e: f3 90 pause
1180: 0f ae e8 lfence
1183: eb f9 jmp 117e <__x86_indirect_thunk_rax+0x5>
1185: 48 89 04 24 mov %rax,(%rsp)
1189: c3 retq
<SNIP>
$ perf record -o simple-retpoline.perf.data -e intel_pt/cyc/u ./simple-retpoline
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0,017 MB simple-retpoline.perf.data ]
$ perf script -i simple-retpoline.perf.data --itrace=be -s ~/libexec/perf-core/scripts/python/export-to-sqlite.py simple-retpoline.db branches calls
2019-01-08 14:03:37.851655 Creating database...
2019-01-08 14:03:37.863256 Writing records...
2019-01-08 14:03:38.069750 Adding indexes
2019-01-08 14:03:38.078799 Done
$ ~/libexec/perf-core/scripts/python/exported-sql-viewer.py simple-retpoline.db
Before:
main
-> __x86_indirect_thunk_rax
-> __x86_indirect_thunk_rax
-> __x86_indirect_thunk_rax
-> bar
After:
main
-> __x86_indirect_thunk_rax
-> __x86_indirect_thunk_rax
-> foo
-> bar
Committer testing:
Chose "Reports", Then "Context-Sensitive Call Graph" and then go on
expanding:
Before:
simple-retpolin
PID:PID
_start
_start
__libc_start_main
main
__x86_indirect_thunk_rax
__x86_indirect_thunk_rax
bar
After:
Remove the "simple.retpoline.db" file, run again the 'perf script' line
to regenerate the .db file and run the exported-sql-viewer.py again to
get the same all the way to 'main', then, from there, including 'main':
main
__x86_indirect_thunk_rax
__x86_indirect_thunk_rax
foo
bar
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20190109091835.5570-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The output of "perf annotate -l --stdio xxx" changed since commit 425859ff0de33
("perf annotate: No need to calculate notes->start twice") removed notes->start
assignment in symbol__calc_lines(). It will get failed in
find_address_in_section() from symbol__tty_annotate() subroutine as the
a2l->addr is wrong. So the annotate summary doesn't report the line number of
source code correctly.
Before fix:
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ cat common_while_1.c
void hotspot_1(void)
{
volatile int i;
for (i = 0; i < 0x10000000; i++);
for (i = 0; i < 0x10000000; i++);
for (i = 0; i < 0x10000000; i++);
}
int main(void)
{
hotspot_1();
return 0;
}
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ gcc common_while_1.c -g -o common_while_1
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.488 MB perf.data (12498 samples) ]
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio
Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
----------------------------------------------
19.30 common_while_1[32]
19.03 common_while_1[4e]
19.01 common_while_1[16]
5.04 common_while_1[13]
4.99 common_while_1[4b]
4.78 common_while_1[2c]
4.77 common_while_1[10]
4.66 common_while_1[2f]
4.59 common_while_1[51]
4.59 common_while_1[35]
4.52 common_while_1[19]
4.20 common_while_1[56]
0.51 common_while_1[48]
Percent | Source code & Disassembly of common_while_1 for cycles:ppp (12480 samples, percent: local period)
-----------------------------------------------------------------------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: 00000000000005fa <hotspot_1>:
: hotspot_1():
: void hotspot_1(void)
: {
0.00 : 5fa: push %rbp
0.00 : 5fb: mov %rsp,%rbp
: volatile int i;
:
: for (i = 0; i < 0x10000000; i++);
0.00 : 5fe: movl $0x0,-0x4(%rbp)
0.00 : 605: jmp 610 <hotspot_1+0x16>
0.00 : 607: mov -0x4(%rbp),%eax
common_while_1[10] 4.77 : 60a: add $0x1,%eax
common_while_1[13] 5.04 : 60d: mov %eax,-0x4(%rbp)
common_while_1[16] 19.01 : 610: mov -0x4(%rbp),%eax
common_while_1[19] 4.52 : 613: cmp $0xfffffff,%eax
0.00 : 618: jle 607 <hotspot_1+0xd>
: for (i = 0; i < 0x10000000; i++);
...
After fix:
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf record ./common_while_1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.488 MB perf.data (12500 samples) ]
liwei@euler:~/main_code/hulk_work/hulk/tools/perf$ sudo ./perf annotate -l -s hotspot_1 --stdio
Sorted summary for file /home/liwei/main_code/hulk_work/hulk/tools/perf/common_while_1
----------------------------------------------
33.34 common_while_1.c:5
33.34 common_while_1.c:6
33.32 common_while_1.c:7
Percent | Source code & Disassembly of common_while_1 for cycles:ppp (12482 samples, percent: local period)
-----------------------------------------------------------------------------------------------------------------
:
:
:
: Disassembly of section .text:
:
: 00000000000005fa <hotspot_1>:
: hotspot_1():
: void hotspot_1(void)
: {
0.00 : 5fa: push %rbp
0.00 : 5fb: mov %rsp,%rbp
: volatile int i;
:
: for (i = 0; i < 0x10000000; i++);
0.00 : 5fe: movl $0x0,-0x4(%rbp)
0.00 : 605: jmp 610 <hotspot_1+0x16>
0.00 : 607: mov -0x4(%rbp),%eax
common_while_1.c:5 4.70 : 60a: add $0x1,%eax
4.89 : 60d: mov %eax,-0x4(%rbp)
common_while_1.c:5 19.03 : 610: mov -0x4(%rbp),%eax
common_while_1.c:5 4.72 : 613: cmp $0xfffffff,%eax
0.00 : 618: jle 607 <hotspot_1+0xd>
: for (i = 0; i < 0x10000000; i++);
0.00 : 61a: movl $0x0,-0x4(%rbp)
0.00 : 621: jmp 62c <hotspot_1+0x32>
0.00 : 623: mov -0x4(%rbp),%eax
common_while_1.c:6 4.54 : 626: add $0x1,%eax
4.73 : 629: mov %eax,-0x4(%rbp)
common_while_1.c:6 19.54 : 62c: mov -0x4(%rbp),%eax
common_while_1.c:6 4.54 : 62f: cmp $0xfffffff,%eax
...
Signed-off-by: Wei Li <liwei391@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 425859ff0de33 ("perf annotate: No need to calculate notes->start twice")
Link: http://lkml.kernel.org/r/20190221095716.39529-1-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Let rm_rf() remove a file if it's provided by path, not just
directories.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So it does not screw up single -v verbose output.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a missing new line into pr_debug call in perf_event__synthesize_bpf_events(),
so that the error message does not screw the verbose output.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: http://lkml.kernel.org/r/20190220122800.864-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add support to add/remove fields for specific event types in -F option.
It's now possible to use '+-' after event type, like:
# cat > test.c
#include <stdio.h>
int main(void)
{
printf("Hello world\n");
while(1) {}
}
^D
# gcc -g -o test test.c
# perf probe -x test 'test.c:5'
# perf record -e '{cpu/cpu-cycles,period=10000/,probe_test:main}:S' ./test
...
# perf script -Ftrace:+period,-cpu
test 3859 396291.117343: 10275 cpu/cpu-cycles,period=10000/: 7f..
test 3859 396291.118234: 11041 cpu/cpu-cycles,period=10000/: ffffff..
test 3859 396291.118234: 1 probe_test:main:
test 3859 396291.118248: 8668 cpu/cpu-cycles,period=10000/: ffffff..
test 3859 396291.118263: 10139 cpu/cpu-cycles,period=10000/: ffffff..
Committer testing:
Couldn't make the test above work, but tested it with:
# perf probe -x hello main
Added new event:
probe_hello:main (on main in /home/acme/c/hello)
You can now use it in all perf tools, such as:
perf record -e probe_hello:main -aR sleep 1
# perf record -e probe_hello:main ./hello
hello, world
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.025 MB perf.data (1 samples) ]
# perf script
hello 21454 [002] 254116.874005: probe_hello:main: (401126)
#
# perf script -Ftrace:+period,-cpu
hello 21454 254116.874005: 1 probe_hello:main: (401126)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Force sample_type setup for slave events in group leader sessions.
We don't get sample for slave events, we make them when delivering group
leader sample. Set the slave event to follow the master sample_type to
ease up report.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There's no reason to deliver a sample with zero period. It means there
was no value for slave event since its last group leader sample.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190220122800.864-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Test case 'control_msg' has been updated to peek non-data record and
then verify the type of record received. Subsequently, the same record
is retrieved without MSG_PEEK flag in recvmsg().
Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Initial use case:
Dumping the maps setup by tools/perf/examples/bpf/augmented_raw_syscalls.c,
which so far are just booleans, showing just non-zeroed entries:
# cat ~/.perfconfig
[llvm]
dump-obj = true
clang-opt = -g
[trace]
#add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
add_events = /wb/augmented_raw_syscalls.o
$ date
Tue Feb 19 16:29:33 -03 2019
$ ls -la /wb/augmented_raw_syscalls.o
-rwxr-xr-x. 1 root root 14048 Jan 24 12:09 /wb/augmented_raw_syscalls.o
$ file /wb/augmented_raw_syscalls.o
/wb/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped
$
# trace -e recvmmsg,sendmmsg --map-dump foobar
ERROR: BPF map "foobar" not found
# trace -e recvmmsg,sendmmsg --map-dump filtered_pids
ERROR: BPF map "filtered_pids" not found
# trace -e recvmmsg,sendmmsg --map-dump pids_filtered
[2583] = 1,
[2267] = 1,
^Z
[1]+ Stopped trace -e recvmmsg,sendmmsg --map-dump pids_filtered
# pidof trace
2267
# ps ax|grep gnome-terminal|grep -v grep
2583 ? Ssl 58:33 /usr/libexec/gnome-terminal-server
^C
# trace -e recvmmsg,sendmmsg --map-dump syscalls
[299] = 1,
[307] = 1,
^C
# grep x64_recvmmsg arch/x86/entry/syscalls/syscall_64.tbl
299 64 recvmmsg __x64_sys_recvmmsg
# grep x64_sendmmsg arch/x86/entry/syscalls/syscall_64.tbl
307 64 sendmmsg __x64_sys_sendmmsg
#
Next step probably will be something like 'perf stat's --interval-print and
--interval-clear.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-ztxj25rtx37ixo9cfajt8ocy@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
At some point I'll suggest moving this to libbpf, for now I'll
experiment with ways to dump BPF maps set by events in 'perf trace',
starting with a very basic dumper for the current very limited needs
of the augmented_raw_syscalls code: dumping booleans.
Having functions that apply to the map keys and values and do table
lookup in things like syscall id to string tables should come next.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-lz14w0esqyt1333aon05jpwc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit 489338a717a0 ("perf tests evsel-tp-sched: Fix bitwise operator")
causes test case 14 "Parse sched tracepoints fields" to fail on s390.
This test succeeds on x86.
In fact this test now fails on all architectures with type char treated
as type unsigned char.
The root cause is the signed-ness of character arrays in the tracepoints
sched_switch for structure members prev_comm and next_comm.
On s390 the output of:
[root@m35lp76 perf]# cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
name: sched_switch
ID: 287
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
...
field:char prev_comm[16]; offset:8; size:16; signed:0;
...
field:char next_comm[16]; offset:40; size:16; signed:0;
reveals the character arrays prev_comm and next_comm are per
default unsigned char and have values in the range of 0..255.
On x86 both fields are signed as this output shows:
[root@f29]# cat /sys/kernel/debug/tracing/events/sched/sched_switch/format
name: sched_switch
ID: 287
format:
field:unsigned short common_type; offset:0; size:2; signed:0;
...
field:char prev_comm[16]; offset:8; size:16; signed:1;
...
field:char next_comm[16]; offset:40; size:16; signed:1;
and the character arrays prev_comm and next_comm are per default signed
char and have values in the range of -1..127. The implementation of
type char is architecture specific.
Since the character arrays in both tracepoints sched_switch and
sched_wakeup should contain ascii characters, simply omit the check for
signedness in the test case.
Output before:
[root@m35lp76 perf]# ./perf test -F 14
14: Parse sched tracepoints fields :
--- start ---
sched:sched_switch: "prev_comm" signedness(0) is wrong, should be 1
sched:sched_switch: "next_comm" signedness(0) is wrong, should be 1
sched:sched_wakeup: "comm" signedness(0) is wrong, should be 1
---- end ----
14: Parse sched tracepoints fields : FAILED!
[root@m35lp76 perf]#
Output after:
[root@m35lp76 perf]# ./perf test -Fv 14
14: Parse sched tracepoints fields :
--- start ---
---- end ----
Parse sched tracepoints fields: Ok
[root@m35lp76 perf]#
Fixes: 489338a717a0 ("perf tests evsel-tp-sched: Fix bitwise operator")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Link: http://lkml.kernel.org/r/20190219153639.31267-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
According to the current documentation the flags section is placed after
the file header itself but the code assumes to find the flags section
after the data section. This change updates the documentation to that
assumption.
Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190219154515.3954-2-jonas.rabenstein@studium.uni-erlangen.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The content of the HEADER_CMDLINE feature header is a perf_header_string_list
of the argument vector and not a perf_header_string of the commandline.
Signed-off-by: Jonas Rabenstein <jonas.rabenstein@studium.uni-erlangen.de>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: http://lkml.kernel.org/r/20190219154515.3954-1-jonas.rabenstein@studium.uni-erlangen.de
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
bpftool has support for attach types "stream_verdict" and
"stream_parser" but the documentation was referring to them as
"skb_verdict" and "skb_parse". The inconsistency comes from commit
b7d3826c2ed6 ("bpf: bpftool, add support for attaching programs to
maps").
This patch changes the documentation to match the implementation:
- "bpftool prog help"
- man pages
- bash completion
Signed-off-by: Alban Crequy <alban@kinvolk.io>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
We can't assume inlined symbols with the same name are equal, because
their address range may be different. This will cause the symbols with
different addresses be shadowed when adding to the hist entry, and lead
to ERANGE error when checking the symbol address during sample parse,
the addr should be within the range of [sym.start, sym.end].
The error message is like: "0x36aea60 [0x8]: failed to process type: 68".
The second parameter of symbol__new() is the length of the fake symbol
for the inline frame, which is the subtraction of the end and start
address of base_sym.
Signed-off-by: He Kuang <hekuang@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: aa441895f7b4 ("perf report: Compare symbol name for inlined frames when sorting")
Link: http://lkml.kernel.org/r/20190219130531.15692-1-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|