summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-08-15perf scripts python: Fix missing call_path_id in export-to-postgresql scriptAdrian Hunter
The export does not work if only branches are exported because of a missing column in the samples table. Fix by adding the missing call_path_id. Fixes: 3521f3bc9dae ("perf script: Update export-to-postgresql to support callchain export") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Link: http://lkml.kernel.org/r/1501749090-20357-2-git-send-email-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-15perf test shell vfs_getname: Skip for tools built with NO_LIBDWARF=1Arnaldo Carvalho de Melo
If that is the case, or if the required lib is not present, e.g. elfutils-devel in Fedora systems, then just skip the tests requiring DWARF analysis. Before: # rpm -e elfutils-devel # perf test ping vfs_getname 60: Use vfs_getname probe to get syscall args filenames : FAILED! 61: probe libc's inet_pton & backtrace it with ping : Ok 62: Check open filename arg using perf trace + vfs_getname: FAILED! 63: Add vfs_getname probe to get syscall args filenames : FAILED! # After: # perf test vfs_getname 60: Use vfs_getname probe to get syscall args filenames : Skip 62: Check open filename arg using perf trace + vfs_getname: Skip 63: Add vfs_getname probe to get syscall args filenames : Skip # Then, reinstalling elfutils-devel, rebuilding the tool and running again: # perf test vfs_getname 60: Use vfs_getname probe to get syscall args filenames : Ok 62: Check open filename arg using perf trace + vfs_getname: Ok 63: Add vfs_getname probe to get syscall args filenames : Ok # Reported-by: Kim Phillips <kim.phillips@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-d67tvn401fxrwr97pu5ihfb1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-15perf test shell: Check if 'perf probe' is available, skip tests if notArnaldo Carvalho de Melo
Add a library function that checks if 'perf probe' is built into the tool being tested, skipping tests that need it. Testing it on a system after removing the library needed to build 'probe' as a perf subcommand: # perf test ping vfs_getname 59: Use vfs_getname probe to get syscall args filenames : Skip 60: probe libc's inet_pton & backtrace it with ping : Skip 61: Check open filename arg using perf trace + vfs_getname: Skip 62: Add vfs_getname probe to get syscall args filenames : Skip # perf probe perf: 'probe' is not a perf-command. See 'perf --help'. # Now reinstalling elfutils-libelf-devel on this Fedora 26 system to rebuild perf and then retest this: # perf test ping vfs_getname 60: Use vfs_getname probe to get syscall args filenames : Ok 61: probe libc's inet_pton & backtrace it with ping : Ok 62: Check open filename arg using perf trace + vfs_getname: Ok 63: Add vfs_getname probe to get syscall args filenames : Ok # Reported-by: Kim Phillips <kim.phillips@arm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-ctdck2gzsskqhjzu3ebb62zm@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-15perf tests shell: Remove duplicate skip_if_no_debuginfo() functionArnaldo Carvalho de Melo
Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-3zxjswdbs2au3ih0rino0iy1@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-14Merge tag 'perf-core-for-mingo-4.14-20170814' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo: Infrastructure changes: - Do not consider empty files as valid srclines (Milian Wolff) - Fix wrong size in perf_record_mmap for last kernel module, which resulted in erroneous symbol resolution in at least s390x (Thomas Richter) - Add missing newline to expr parser error messages (Andi Kleen) - Fix saved values rbtree lookup in 'perf stat' (Andi Kleen) - Add support for shell based tests in 'perf test', add a few that run 'perf probe', 'perf trace', using kprobes, uprobes to check the output of those tools and the effects on the system, checking, for instance, DWARF backtraces from uprobes (Arnaldo Carvalho de Melo) Arch specific changes: - Add ppc64le to audit uname list in the python scripting support (Naveen N. Rao) - Update POWER9 vendor events tables (Sukadev Bhattiprolu) - Fix module symbol adjustment for s390x (Thomas Richter) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-11perf test shell: Add uprobes + backtrace ping testArnaldo Carvalho de Melo
Installs a probe on libc's inet_pton function, that will use uprobes, then use 'perf trace' on a ping to localhost asking for just one packet with the a backtrace 3 levels deep, check that it is what we expect. This needs no debuginfo package, all is done using the libc ELF symtab and the CFI info in the binaries. Testing it: # perf test ping 61: probe libc's inet_pton & backtrace it with ping : Ok In verbose mode: # perf test -v ping 61: probe libc's inet_pton & backtrace it with ping : --- start --- test child forked, pid 1007 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.058 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.058/0.058/0.058/0.000 ms 0.000 probe_libc:inet_pton:(7f75fce12a20)) __GI___inet_pton (/usr/lib64/libc-2.24.so) getaddrinfo (/usr/lib64/libc-2.24.so) _init (/usr/bin/ping) test child finished with 0 ---- end ---- probe libc's inet_pton & backtrace it with ping: Ok # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-idrntt4nbg15aafu8hjmv7sk@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf report: Fix module symbol adjustment for s390xThomas Richter
The 'perf report' tool does not display the addresses of kernel module symbols correctly. For example symbol qeth_send_ipa_cmd in kernel module qeth.ko has this relative address for function qeth_send_ipa_cmd(): [root@s8360047 linux]# nm -g drivers/s390/net/qeth.ko | fgrep send_ipa_cmd 0000000000013088 T qeth_send_ipa_cmd The module is loaded at address: [root@s8360047 linux]# cat /sys/module/qeth/sections/.text 0x000003ff80296d20 [root@s8360047 linux]# This should result in a start address of: 0x13088 + 0x3ff80296d20 = 0x3ff802a9da8 Using crash to verify the address on a live system: [root@s8360046 linux]# crash vmlinux crash 7.1.9++ Copyright (C) 2002-2016 Red Hat, Inc. Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation [...] crash> mod -s qeth drivers/s390/net/qeth.ko MODULE NAME SIZE OBJECT FILE 3ff8028d700 qeth 151552 drivers/s390/net/qeth.ko crash> sym qeth_send_ipa_cmd 3ff802a9da8 (T) qeth_send_ipa_cmd [qeth] /root/linux/drivers/s390/net/qeth_core_main.c: 2944 crash> Now perf report displays the address of symbol qeth_send_ipa_cmd: symbol__new: qeth_send_ipa_cmd 0x130f0-0x132ce There is a difference of 0x68 between the entry in the symbol table (see nm command above) and perf. The difference is from the offset the .text segment of qeth.ko: [root@s8360047 perf]# readelf -a drivers/s390/net/qeth.ko Section Headers: [Nr] Name Type Address Offset Size EntSize Flags Link Info Align [ 0] NULL 0000000000000000 00000000 0000000000000000 0000000000000000 0 0 0 [ 1] .note.gnu.build-i NOTE 0000000000000000 00000040 0000000000000024 0000000000000000 A 0 0 4 [ 2] .text PROGBITS 0000000000000000 00000068 000000000001c8a0 0000000000000000 AX 0 0 8 As seen the .text segment has an offset of 0x68 with start address 0x0. Therefore 0x68 is added to the address of qeth_send_ipa_cmd and thus 0x13088 + 0x68 = 0x130f0 is displayed. This is wrong, perf report needs to display the start address of symbol qeth_send_ipa_cmd at 0x13088 + qeth.ko.text section start address. The qeth.ko module .text start address is available in the qeth.ko DSO map. Just identify the kernel module symbols and correct the addresses. With the fix I see this correct address for symbol: symbol__new: qeth_send_ipa_cmd 0x3ff802a9da8-0x3ff802a9f86 Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com> LPU-Reference: 20170803134902.47207-1-tmricht@linux.vnet.ibm.com Link: http://lkml.kernel.org/n/tip-q8lktlpoxb5e3dj52u1s1rw4@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf record: Fix wrong size in perf_record_mmap for last kernel moduleThomas Richter
During work on perf report for s390 I ran into the following issue: 0 0x318 [0x78]: PERF_RECORD_MMAP -1/0: [0x3ff804d6990(0xfffffc007fb2966f) @ 0]: x /lib/modules/4.12.0perf1+/kernel/drivers/s390/net/qeth_l2.ko This is a PERF_RECORD_MMAP entry of the perf.data file with an invalid module size for qeth_l2.ko (the s390 ethernet device driver). Even a mainframe does not have 0xfffffc007fb2966f bytes of main memory. It turned out that this wrong size is created by the perf record command. What happens is this function call sequence from __cmd_record(): perf_session__new(): perf_session__create_kernel_maps(): machine__create_kernel_maps(): machine__create_modules(): Creates map for all loaded kernel modules. modules__parse(): Reads /proc/modules and extracts module name and load address (1st and last column) machine__create_module(): Called for every module found in /proc/modules. Creates a new map for every module found and enters module name and start address into the map. Since the module end address is unknown it is set to zero. This ends up with a kernel module map list sorted by module start addresses. All module end addresses are zero. Last machine__create_kernel_maps() calls function map_groups__fixup_end(). This function iterates through the maps and assigns each map entry's end address the successor map entry start address. The last entry of the map group has no successor, so ~0 is used as end to consume the remaining memory. Later __cmd_record calls function record__synthesize() which in turn calls perf_event__synthesize_kernel_mmap() and perf_event__synthesize_modules() to create PERF_REPORT_MMAP entries into the perf.data file. On s390 this results in the last module qeth_l2.ko (which has highest start address, see module table: [root@s8360047 perf]# cat /proc/modules qeth_l2 86016 1 - Live 0x000003ff804d6000 qeth 266240 1 qeth_l2, Live 0x000003ff80296000 ccwgroup 24576 1 qeth, Live 0x000003ff80218000 vmur 36864 0 - Live 0x000003ff80182000 qdio 143360 2 qeth_l2,qeth, Live 0x000003ff80002000 [root@s8360047 perf]# ) to be the last entry and its map has an end address of ~0. When the PERF_RECORD_MMAP entry is created for kernel module qeth_l2.ko its start address and length is written. The length is calculated in line: event->mmap.len = pos->end - pos->start; and results in 0xffffffffffffffff - 0x3ff804d6990(*) = 0xfffffc007fb2966f (*) On s390 the module start address is actually determined by a __weak function named arch__fix_module_text_start() in machine__create_module(). I think this improvable. We can use the module size (2nd column of /proc/modules) to get each loaded kernel module size and calculate its end address. Only for map entries which do not have a valid end address (end is still zero) we can use the heuristic we have now, that is use successor start address or ~0. Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com> Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com> LPU-Reference: 20170803134902.47207-2-tmricht@linux.vnet.ibm.com Link: http://lkml.kernel.org/n/tip-nmoqij5b5vxx7rq2ckwu8iaj@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf srcline: Do not consider empty files as valid srclinesMilian Wolff
Sometimes we get a non-null, but empty, string for the filename from bfd. This then results in srclines of the form ":0", which is different from the canonical SRCLINE_UNKNOWN in the form "??:0". Set the file to NULL if it is empty to fix this. Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yao Jin <yao.jin@linux.intel.com> Link: http://lkml.kernel.org/r/20170806212446.24925-14-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf util: Take elf_name as const string in dso__demangle_symMilian Wolff
The input string is not modified and thus can be passed in as a pointer to const data. Signed-off-by: Milian Wolff <milian.wolff@kdab.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Yao Jin <yao.jin@linux.intel.com> Link: http://lkml.kernel.org/r/20170806212446.24925-3-milian.wolff@kdab.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf test shell: Add test using vfs_getname + 'perf trace'Arnaldo Carvalho de Melo
Uses the 'perf test shell' library to add probe:vfs_getname to the system then use it with 'perf trace' using 'touch' to write to a temp file, then checks that that was captured by the vfs_getname was used by 'perf trace', that already handles "probe:vfs_getname" if present, and used in the "open" syscall "filename" argument beautifier. Testing it: # perf test "trace + vfs_getname" 61: Check open filename arg using perf trace + vfs_getname: Ok # # perf test -v "trace + vfs_getname" 61: Check open filename arg using perf trace + vfs_getname: --- start --- test child forked, pid 30846 Added new event: probe:vfs_getname (on getname_flags:72 with pathname=result->name:string) You can now use it in all perf tools, such as: perf record -e probe:vfs_getname -aR sleep 1 2.237 ( 0.012 ms): touch/30855 open(filename: /tmp/temporary_file.kmoWQ, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) = 3 test child finished with 0 ---- end ---- Check open filename arg using perf trace + vfs_getname: Ok # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-j02nobfvvn9c7yrphdsnbqx0@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf test shell: Add test using probe:vfs_getname and verifying resultsArnaldo Carvalho de Melo
This test uses the 'perf test shell' library to add probe:vfs_getname to the system then use it with 'perf record' using 'touch' to write to a temp file, then checks that that was captured by the vfs_getname probe in the generated perf.data file, with the temp file name as the pathname argument. Using it: # perf test "Use vfs_getname" 60: Use vfs_getname probe to get syscall args filenames: Ok # perf test -v "Use vfs_getname" 60: Use vfs_getname probe to get syscall args filenames: --- start --- test child forked, pid 16414 Added new event: probe:vfs_getname (on getname_flags:72 with pathname=result->name:string) You can now use it in all perf tools, such as: perf record -e probe:vfs_getname -aR sleep 1 Recording open file: [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.022 MB /tmp/vaca.perf.data.QZsn7 (13 samples) ] Looking at perf.data file for vfs_getname records for the file we touched: touch 16421 [002] 1255152.879561: probe:vfs_getname: (ffffffffa626e608) pathname="/tmp/vaca.l10SL" test child finished with 0 ---- end ---- Use vfs_getname probe to get syscall args filenames: Ok # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-t555fnhbcbxnukltk23dqxur@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf test shell: Move vfs_getname probe function to libArnaldo Carvalho de Melo
Multiple tests will be able to reuse these functions, to test things like perf report, 'trace', etc, using this probe. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-48xagvozhouhyi8fjota6o2d@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf test shell: Install shell testsArnaldo Carvalho de Melo
Now that we have shell tests, install them. Developers don't need this pass, as 'perf test' will look first at the in tree scripts at tools/perf/tests/shell/. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-j21u4v0jsehi0lpwqwjb4j45@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf test shell: Add 'probe_vfs_getname' shell testArnaldo Carvalho de Melo
First perf shell test: # perf test vfs_getname 60: Add vfs_getname probe to get syscall args filenames: Ok # In verbose mode: # perf test -v vfs_getname 60: Add vfs_getname probe to get syscall args filenames: --- start --- test child forked, pid 19146 Added new event: probe:vfs_getname (on getname_flags:72 with pathname=result->name:string) You can now use it in all perf tools, such as: perf record -e probe:vfs_getname -aR sleep 1 test child finished with 0 ---- end ---- Add vfs_getname probe to get syscall args filenames: Ok # And if the vmlinux file is not found: # mv ../build/v4.12.0-rc6+/vmlinux ../build/v4.12.0-rc6+/vmlinux.hidden # perf test vfs_getname 60: Add vfs_getname probe to get syscall args filenames: Skip # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-8f3n22c1yn516ev30s603ow2@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf test: Make 'list' use same filtering code as main 'perf test'Arnaldo Carvalho de Melo
Before: # perf test Synth 39: Synthesize thread map : Ok 41: Synthesize cpu map : Ok 42: Synthesize stat config : Ok 43: Synthesize stat : Ok 44: Synthesize stat round : Ok 45: Synthesize attr update : Ok # perf test list Synth # After: # perf test Synth 39: Synthesize thread map : Ok 41: Synthesize cpu map : Ok 42: Synthesize stat config : Ok 43: Synthesize stat : Ok 44: Synthesize stat round : Ok 45: Synthesize attr update : Ok # perf test list Synth 39: Synthesize thread map 41: Synthesize cpu map 42: Synthesize stat config 43: Synthesize stat 44: Synthesize stat round 45: Synthesize attr update # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-v95tqqzuwawsmds3zn2mosje@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf test: Add infrastructure to run shell based testsArnaldo Carvalho de Melo
To allow testing by directly using perf tools in scripts, checking that the effects on the system are the ones expected and that the output produced is as well the desired one. For instance, adding a probe at a well known location with 'perf probe', then checking that the results from using that probe to record are the desired ones, etc. The next csets will introduce tests using this new testing infrastructure. The scripts should return 0 for Ok, 1 for FAIL and 2 for SKIP. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-swbpn7amrjqffh83lsr39s9p@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf test: Add 'struct test *' to the test functionsArnaldo Carvalho de Melo
This way we'll be able to pass more test specific parameters without having to change this function signature. Will be used by the upcoming 'shell tests', shell scripts that will call perf tools and check if they work as expected, comparing its effects on the system (think 'perf probe foo') the output produced, etc. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-wq250w7j1opbzyiynozuajbl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf test: Make 'list' subcommand match main 'perf test' numbering/matchingArnaldo Carvalho de Melo
Before: # perf test Synth 39: Synthesize thread map : Ok 41: Synthesize cpu map : Ok 42: Synthesize stat config : Ok 43: Synthesize stat : Ok 44: Synthesize stat round : Ok 45: Synthesize attr update : Ok # # perf test list Synth 1: Synthesize thread map 2: Synthesize cpu map 3: Synthesize stat config 4: Synthesize stat 5: Synthesize stat round 6: Synthesize attr update # After: # perf test Synth 39: Synthesize thread map : Ok 41: Synthesize cpu map : Ok 42: Synthesize stat config : Ok 43: Synthesize stat : Ok 44: Synthesize stat round : Ok 45: Synthesize attr update : Ok # # perf test list Synth 39: Synthesize thread map 41: Synthesize cpu map 42: Synthesize stat config 43: Synthesize stat 44: Synthesize stat round 45: Synthesize attr update # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Matt Fleming <matt.fleming@intel.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: http://lkml.kernel.org/n/tip-pjhuhkphs7o3tkbqrukfv6bz@git.kernel.org Fixes: e8210cefb7e1 ("perf tests: Introduce iterator function for tests") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf tools: Add missing newline to expr parser error messagesAndi Kleen
Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170724234015.5165-6-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf stat: Fix saved values rbtree lookupAndi Kleen
The stat shadow saved values rbtree is indexed by a pointer. Fix the comparison function: - We cannot return a pointer delta as an int because that loses bits on 64bit. - Doing pointer arithmetic on the struct pointer only works if the objects are spaced by the multiple of the object size, which is not guaranteed for individual malloc'ed object Replace it with a proper comparison. This fixes various problems with values not being found. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170724234015.5165-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf vendor events powerpc: Update POWER9 eventsSukadev Bhattiprolu
Update and cleanup POWER9 PMU events. Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Link: http://lkml.kernel.org/r/20170802174617.GA32545@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf vendor events powerpc: remove suffix in mapfileSukadev Bhattiprolu
Drop the .json suffix for events directory in the mapfile.csv. Now that we have separate JSON files for each topic in a CPU (eg: see tools/perf/pmu-events/arch/powerpc/power8/*.json) the .json suffix in the mapfile is misleading and redundant. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com> Cc: Anton Blanchard <anton@au1.ibm.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lkml.kernel.org/r/20170802174617.GA32545@us.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11perf scripting python: Add ppc64le to audit uname listNaveen N. Rao
Before patch: $ uname -m ppc64le $ ./perf script -s ./scripts/python/syscall-counts.py Install the audit-libs-python package to get syscall names. For example: # apt-get install python-audit (Ubuntu) # yum install audit-libs-python (Fedora) etc. Press control+C to stop and show the summary ^CWarning: 4 out of order events recorded. syscall events: event count ---------------------------------------- ----------- 4 504638 54 1206 221 42 55 21 3 12 167 10 11 8 6 7 125 6 5 6 108 5 162 4 90 4 45 3 33 3 311 1 246 1 238 1 93 1 91 1 After patch: ./perf script -s ./scripts/python/syscall-counts.py Press control+C to stop and show the summary ^CWarning: 5 out of order events recorded. syscall events: event count ---------------------------------------- ----------- write 643411 ioctl 1206 futex 54 fcntl 27 poll 14 read 12 execve 8 close 7 mprotect 6 open 6 nanosleep 5 fstat 5 mmap 4 inotify_add_watch 3 brk 3 access 3 timerfd_settime 1 clock_gettime 1 epoll_wait 1 ftruncate 1 munmap 1 Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Acked-by: Paul Clarke <pc@us.ibm.com> Cc: linuxppc-dev@lists.ozlabs.org Link: http://lkml.kernel.org/n/tip-bnl67p1alkvx97pn9moxz3qp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-10Merge tag 'perf-core-for-mingo-4.14-20170801' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core Pull perf/core improvements from Arnaldo Carvalho de Melo: User visible changes: - Beautifiers for the 'cmd' arg of several ioctl types, including: sound, DRM, KVM, vhost virtio and perf_events. This was done by using scripts that extract the information from the UAPI headers, generating string tables that are then used in the 'perf trace' syscall argument ioctl beautifier. More work needed to further use it, for instance, to use the _IOC_DIR value where it is used sanely to suppress the third argument, to set formatters for non-pointer values and ultimately for using eBPF + pahole-like code to collect + beautify structs in the third arg. Using the current scheme of having tools/ copies of kernel headers we'll make sure tooling stays working when changes are made to the kernel ABI headers and will be notified when they get changed, reducing the time for 'perf trace' to support new ABIs and allowing the tools/perf/ codebase to have the definitions it needs to build in dozens of distros/versions, as routinely tested using containers for, at this time, 47 environments. (Arnaldo Carvalho de Melo) Infrastructure changes: - Clarify header version warning message (Ingo Molnar) - Sync kernel ABI headers with tooling headers (Ingo Molnar, Arnaldo Carvalho de Melo) Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10kprobes/x86: Do not jump-optimize kprobes on irq entry codeMasami Hiramatsu
Since the kernel segment registers are not prepared at the entry of irq-entry code, if a kprobe on such code is jump-optimized, accessing per-CPU variables may cause a kernel panic. However, if the kprobe is not optimized, it triggers an int3 exception and sets segment registers correctly. With this patch we check the probe-address and if it is in the irq-entry code, it prohibits optimizing such kprobes. This means we can continue probing such interrupt handlers by kprobes but it is not optimized anymore. Reported-by: Francis Deslauriers <francis.deslauriers@efficios.com> Tested-by: Francis Deslauriers <francis.deslauriers@efficios.com> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Chris Zankel <chris@zankel.net> Cc: David S . Miller <davem@davemloft.net> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: linux-arch@vger.kernel.org Cc: linux-cris-kernel@axis.com Cc: mathieu.desnoyers@efficios.com Link: http://lkml.kernel.org/r/150172795654.27216.9824039077047777477.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10irq: Make the irqentry text section unconditionalMasami Hiramatsu
Generate irqentry and softirqentry text sections without any Kconfig dependencies. This will add extra sections, but there should be no performace impact. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Chris Zankel <chris@zankel.net> Cc: David S . Miller <davem@davemloft.net> Cc: Francis Deslauriers <francis.deslauriers@efficios.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: linux-arch@vger.kernel.org Cc: linux-cris-kernel@axis.com Cc: mathieu.desnoyers@efficios.com Link: http://lkml.kernel.org/r/150172789110.27216.3955739126693102122.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10cris: Mark _stext and _end as char-arrays, not single char variablesMasami Hiramatsu
Mark _stext and _end as character arrays instead of single character variable, like include/asm-generic/sections.h does. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Chris Zankel <chris@zankel.net> Cc: David S . Miller <davem@davemloft.net> Cc: Francis Deslauriers <francis.deslauriers@efficios.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: linux-arch@vger.kernel.org Cc: linux-cris-kernel@axis.com Cc: mathieu.desnoyers@efficios.com Link: http://lkml.kernel.org/r/150172782555.27216.2805751327900543374.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10xtensa: Mark _stext and _end as char-arrays, not single char variablesMasami Hiramatsu
Mark _stext and _end as character arrays instead of single character variables, like include/asm-generic/sections.h does. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Chris Zankel <chris@zankel.net> Cc: David S . Miller <davem@davemloft.net> Cc: Francis Deslauriers <francis.deslauriers@efficios.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: linux-arch@vger.kernel.org Cc: linux-cris-kernel@axis.com Cc: mathieu.desnoyers@efficios.com Link: http://lkml.kernel.org/r/150172775958.27216.12951305461398200544.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10h8300: Mark _stext and _etext as char-arrays, not single char variablesMasami Hiramatsu
Mark _stext and _etext as character arrays instead of single character variables, like include/asm-generic/sections.h does. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: Chris Zankel <chris@zankel.net> Cc: David S . Miller <davem@davemloft.net> Cc: Francis Deslauriers <francis.deslauriers@efficios.com> Cc: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Mikael Starvik <starvik@axis.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: linux-arch@vger.kernel.org Cc: linux-cris-kernel@axis.com Cc: mathieu.desnoyers@efficios.com Link: http://lkml.kernel.org/r/150172769415.27216.12021110228384155707.stgit@devbox Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10perf/core: Reduce context switch overheadleilei.lin
Skip most of the PMU context switching overhead when ctx->nr_events is 0. 50% performance overhead was observed under an extreme testcase. Signed-off-by: leilei.lin <leilei.lin@alibaba-inc.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: acme@kernel.org Cc: alexander.shishkin@linux.intel.com Cc: eranian@gmail.com Cc: jolsa@redhat.com Cc: linxiulei@gmail.com Cc: yang_oliver@hotmail.com Link: http://lkml.kernel.org/r/20170809002921.69813-1-leilei.lin@alibaba-inc.com [ Rewrote the changelog. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10perf/x86/amd/uncore: Get correct number of cores sharing last level cacheJanakarajan Natarajan
In Family 17h, the number of cores sharing a cache level is obtained from the Cache Properties CPUID leaf (0x8000001d) by passing in the cache level in ECX. In prior families, a cache level of 2 was used to determine this information. To get the right information, irrespective of Family, iterate over the cache levels using CPUID 0x8000001d. The last level cache is the last value to return a non-zero value in EAX. Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/5ab569025b39cdfaeca55b571d78c0fc800bdb69.1497452002.git.Janakarajan.Natarajan@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10perf/x86/amd/uncore: Rename cpufeatures macro for cache countersJanakarajan Natarajan
In Family 17h, L3 is the last level cache as opposed to L2 in previous families. Avoid this name confusion and rename X86_FEATURE_PERFCTR_L2 to X86_FEATURE_PERFCTR_LLC to indicate the performance counter on the last level of cache. Signed-off-by: Janakarajan Natarajan <Janakarajan.Natarajan@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/016311029fdecdc3fdc13b7ed865c6cbf48b2f15.1497452002.git.Janakarajan.Natarajan@amd.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10Merge branch 'perf/urgent' into perf/core, to pick up fixesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10perf/core: Fix time on IOC_ENABLEPeter Zijlstra
Vince reported that when we do IOC_ENABLE/IOC_DISABLE while the task is SIGSTOP'ed state the timestamps go wobbly. It turns out we indeed fail to correctly account time while in 'OFF' state and doing IOC_ENABLE without getting scheduled in exposes the problem. Further thinking about this problem, it occurred to me that we can suffer a similar fate when we migrate an uncore event between CPUs. The perf_event_install() on the 'new' CPU will do add_event_to_ctx() which will reset all the time stamp, resulting in a subsequent update_event_times() to overwrite the total_time_* fields with smaller values. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10perf/x86: Fix RDPMC vs. mm_struct trackingPeter Zijlstra
Vince reported the following rdpmc() testcase failure: > Failing test case: > > fd=perf_event_open(); > addr=mmap(fd); > exec() // without closing or unmapping the event > fd=perf_event_open(); > addr=mmap(fd); > rdpmc() // GPFs due to rdpmc being disabled The problem is of course that exec() plays tricks with what is current->mm, only destroying the old mappings after having installed the new mm. Fix this confusion by passing along vma->vm_mm instead of relying on current->mm. Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Fixes: 1e0fb9ec679c ("perf: Add pmu callbacks to track event mapping and unmapping") Link: http://lkml.kernel.org/r/20170802173930.cstykcqefmqt7jau@hirez.programming.kicks-ass.net [ Minor cleanups. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-09Merge tag 'pinctrl-v4.13-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: "These are the pin control fixes I have gathered since the return from my vacation. They boiled in -next a while so let's get them in. Apart from the documentation build it is purely driver fixes. Which is nice. The Intel fixes seem kind of important. - Fix the documentation build as the docs were moved - Correct the UART pin list on the Intel Merrifield - Fix pin assignment and number of pins on the Marvell Armada 37xx pin controller - Cover the Setzer models in the Chromebook DMI quirk in the Intel cheryview driver so they start working - Add the missing "sim" function to the sunxi driver - Fix USB pin definitions on Uniphier Pro4 - Smatch fix for invalid reference in the zx pin control driver" * tag 'pinctrl-v4.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: generic: update references to Documentation/pinctrl.txt pinctrl: intel: merrifield: Correct UART pin lists pinctrl: armada-37xx: Fix number of pin in south bridge pinctrl: armada-37xx: Fix the pin 23 on south bridge pinctrl: cherryview: Add Setzer models to the Chromebook DMI quirk pinctrl: sunxi: add a missing function of A10/A20 pinctrl driver pinctrl: uniphier: fix USB3 pin assignment for Pro4 pinctrl: zte: fix dereference of 'data' in zx_set_mux()
2017-08-09futex: Remove unnecessary warning from get_futex_keyMel Gorman
Commit 65d8fc777f6d ("futex: Remove requirement for lock_page() in get_futex_key()") removed an unnecessary lock_page() with the side-effect that page->mapping needed to be treated very carefully. Two defensive warnings were added in case any assumption was missed and the first warning assumed a correct application would not alter a mapping backing a futex key. Since merging, it has not triggered for any unexpected case but Mark Rutland reported the following bug triggering due to the first warning. kernel BUG at kernel/futex.c:679! Internal error: Oops - BUG: 0 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 3695 Comm: syz-executor1 Not tainted 4.13.0-rc3-00020-g307fec773ba3 #3 Hardware name: linux,dummy-virt (DT) task: ffff80001e271780 task.stack: ffff000010908000 PC is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679 LR is at get_futex_key+0x6a4/0xcf0 kernel/futex.c:679 pc : [<ffff00000821ac14>] lr : [<ffff00000821ac14>] pstate: 80000145 The fact that it's a bug instead of a warning was due to an unrelated arm64 problem, but the warning itself triggered because the underlying mapping changed. This is an application issue but from a kernel perspective it's a recoverable situation and the warning is unnecessary so this patch removes the warning. The warning may potentially be triggered with the following test program from Mark although it may be necessary to adjust NR_FUTEX_THREADS to be a value smaller than the number of CPUs in the system. #include <linux/futex.h> #include <pthread.h> #include <stdio.h> #include <stdlib.h> #include <sys/mman.h> #include <sys/syscall.h> #include <sys/time.h> #include <unistd.h> #define NR_FUTEX_THREADS 16 pthread_t threads[NR_FUTEX_THREADS]; void *mem; #define MEM_PROT (PROT_READ | PROT_WRITE) #define MEM_SIZE 65536 static int futex_wrapper(int *uaddr, int op, int val, const struct timespec *timeout, int *uaddr2, int val3) { syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3); } void *poll_futex(void *unused) { for (;;) { futex_wrapper(mem, FUTEX_CMP_REQUEUE_PI, 1, NULL, mem + 4, 1); } } int main(int argc, char *argv[]) { int i; mem = mmap(NULL, MEM_SIZE, MEM_PROT, MAP_SHARED | MAP_ANONYMOUS, -1, 0); printf("Mapping @ %p\n", mem); printf("Creating futex threads...\n"); for (i = 0; i < NR_FUTEX_THREADS; i++) pthread_create(&threads[i], NULL, poll_futex, NULL); printf("Flipping mapping...\n"); for (;;) { mmap(mem, MEM_SIZE, MEM_PROT, MAP_FIXED | MAP_SHARED | MAP_ANONYMOUS, -1, 0); } return 0; } Reported-and-tested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Mel Gorman <mgorman@suse.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@vger.kernel.org # 4.7+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-08-09Merge branch 'i2c/for-current' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: "The main thing is to allow empty id_tables for ACPI to make some drivers get probed again. It looks a bit bigger than usual because it needs some internal renaming, too. Other than that, there is a fix for broken DSTDs, a super simple enablement for ARM MPS, and two documentation fixes which I'd like to see in v4.13 already" * 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: rephrase explanation of I2C_CLASS_DEPRECATED i2c: allow i2c-versatile for ARM MPS platforms i2c: designware: Some broken DSTDs use 1MiHz instead of 1MHz i2c: designware: Print clock freq on invalid clock freq error i2c: core: Allow empty id_table in ACPI case as well i2c: mux: pinctrl: mention correct module name in Kconfig help text
2017-08-09Merge branch 'for-linus' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "Three patches that should go into this release. Two of them are from Paolo and fix up some corner cases with BFQ, and the last patch is from Ming and fixes up a potential usage count imbalance regression due to the recent NOWAIT work" * 'for-linus' of git://git.kernel.dk/linux-block: blk-mq: don't leak preempt counter/q_usage_counter when allocating rq failed block, bfq: consider also in_service_entity to state whether an entity is active block, bfq: reset in_service_entity if it becomes idle
2017-08-09Merge branch 'linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: "Fix two regressions in the inside-secure driver with respect to hmac(sha1)" * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: inside-secure - fix the sha state length in hmac_sha1_setkey crypto: inside-secure - fix invalidation check in hmac_sha1_setkey
2017-08-09Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: "The pull requests are getting smaller, that's progress I suppose :-) 1) Fix infinite loop in CIPSO option parsing, from Yujuan Qi. 2) Fix remote checksum handling in VXLAN and GUE tunneling drivers, from Koichiro Den. 3) Missing u64_stats_init() calls in several drivers, from Florian Fainelli. 4) TCP can set the congestion window to an invalid ssthresh value after congestion window reductions, from Yuchung Cheng. 5) Fix BPF jit branch generation on s390, from Daniel Borkmann. 6) Correct MIPS ebpf JIT merge, from David Daney. 7) Correct byte order test in BPF test_verifier.c, from Daniel Borkmann. 8) Fix various crashes and leaks in ASIX driver, from Dean Jenkins. 9) Handle SCTP checksums properly in mlx4 driver, from Davide Caratti. 10) We can potentially enter tcp_connect() with a cached route already, due to fastopen, so we have to explicitly invalidate it. 11) skb_warn_bad_offload() can bark in legitimate situations, fix from Willem de Bruijn" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits) net: avoid skb_warn_bad_offload false positives on UFO qmi_wwan: fix NULL deref on disconnect ppp: fix xmit recursion detection on ppp channels rds: Reintroduce statistics counting tcp: fastopen: tcp_connect() must refresh the route net: sched: set xt_tgchk_param par.net properly in ipt_init_target net: dsa: mediatek: add adjust link support for user ports net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets qed: Fix a memory allocation failure test in 'qed_mcp_cmd_init()' hysdn: fix to a race condition in put_log_buffer s390/qeth: fix L3 next-hop in xmit qeth hdr asix: Fix small memory leak in ax88772_unbind() asix: Ensure asix_rx_fixup_info members are all reset asix: Add rx->ax_skb = NULL after usbnet_skb_return() bpf: fix selftest/bpf/test_pkt_md_access on s390x netvsc: fix race on sub channel creation bpf: fix byte order test in test_verifier xgene: Always get clk source, but ignore if it's missing for SGMII ports MIPS: Add missing file for eBPF JIT. bpf, s390: fix build for libbpf and selftest suite ...
2017-08-08net: avoid skb_warn_bad_offload false positives on UFOWillem de Bruijn
skb_warn_bad_offload triggers a warning when an skb enters the GSO stack at __skb_gso_segment that does not have CHECKSUM_PARTIAL checksum offload set. Commit b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise") observed that SKB_GSO_DODGY producers can trigger the check and that passing those packets through the GSO handlers will fix it up. But, the software UFO handler will set ip_summed to CHECKSUM_NONE. When __skb_gso_segment is called from the receive path, this triggers the warning again. Make UFO set CHECKSUM_UNNECESSARY instead of CHECKSUM_NONE. On Tx these two are equivalent. On Rx, this better matches the skb state (checksum computed), as CHECKSUM_NONE here means no checksum computed. See also this thread for context: http://patchwork.ozlabs.org/patch/799015/ Fixes: b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08qmi_wwan: fix NULL deref on disconnectBjørn Mork
qmi_wwan_disconnect is called twice when disconnecting devices with separate control and data interfaces. The first invocation will set the interface data to NULL for both interfaces to flag that the disconnect has been handled. But the matching NULL check was left out when qmi_wwan_disconnect was added, resulting in this oops: usb 2-1.4: USB disconnect, device number 4 qmi_wwan 2-1.4:1.6 wwp0s29u1u4i6: unregister 'qmi_wwan' usb-0000:00:1d.0-1.4, WWAN/QMI device BUG: unable to handle kernel NULL pointer dereference at 00000000000000e0 IP: qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan] PGD 0 P4D 0 Oops: 0000 [#1] SMP Modules linked in: <stripped irrelevant module list> CPU: 2 PID: 33 Comm: kworker/2:1 Tainted: G E 4.12.3-nr44-normandy-r1500619820+ #1 Hardware name: LENOVO 4291LR7/4291LR7, BIOS CBET4000 4.6-810-g50522254fb 07/21/2017 Workqueue: usb_hub_wq hub_event [usbcore] task: ffff8c882b716040 task.stack: ffffb8e800d84000 RIP: 0010:qmi_wwan_disconnect+0x25/0xc0 [qmi_wwan] RSP: 0018:ffffb8e800d87b38 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000000001 RSI: ffff8c8824f3f1d0 RDI: ffff8c8824ef6400 RBP: ffff8c8824ef6400 R08: 0000000000000000 R09: 0000000000000000 R10: ffffb8e800d87780 R11: 0000000000000011 R12: ffffffffc07ea0e8 R13: ffff8c8824e2e000 R14: ffff8c8824e2e098 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8c8835300000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000000000e0 CR3: 0000000229ca5000 CR4: 00000000000406e0 Call Trace: ? usb_unbind_interface+0x71/0x270 [usbcore] ? device_release_driver_internal+0x154/0x210 ? qmi_wwan_unbind+0x6d/0xc0 [qmi_wwan] ? usbnet_disconnect+0x6c/0xf0 [usbnet] ? qmi_wwan_disconnect+0x87/0xc0 [qmi_wwan] ? usb_unbind_interface+0x71/0x270 [usbcore] ? device_release_driver_internal+0x154/0x210 Reported-and-tested-by: Nathaniel Roach <nroach44@gmail.com> Fixes: c6adf77953bc ("net: usb: qmi_wwan: add qmap mux protocol support") Cc: Daniele Palmas <dnlplm@gmail.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08ppp: fix xmit recursion detection on ppp channelsGuillaume Nault
Commit e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices") dropped the xmit_recursion counter incrementation in ppp_channel_push() and relied on ppp_xmit_process() for this task. But __ppp_channel_push() can also send packets directly (using the .start_xmit() channel callback), in which case the xmit_recursion counter isn't incremented anymore. If such packets get routed back to the parent ppp unit, ppp_xmit_process() won't notice the recursion and will call ppp_channel_push() on the same channel, effectively creating the deadlock situation that the xmit_recursion mechanism was supposed to prevent. This patch re-introduces the xmit_recursion counter incrementation in ppp_channel_push(). Since the xmit_recursion variable is now part of the parent ppp unit, incrementation is skipped if the channel doesn't have any. This is fine because only packets routed through the parent unit may enter the channel recursively. Finally, we have to ensure that pch->ppp is not going to be modified while executing ppp_channel_push(). Instead of taking this lock only while calling ppp_xmit_process(), we now have to hold it for the full ppp_channel_push() execution. This respects the ppp locks ordering which requires locking ->upl before ->downl. Fixes: e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices") Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08rds: Reintroduce statistics countingHåkon Bugge
In commit 7e3f2952eeb1 ("rds: don't let RDS shutdown a connection while senders are present"), refilling the receive queue was removed from rds_ib_recv(), along with the increment of s_ib_rx_refill_from_thread. Commit 73ce4317bf98 ("RDS: make sure we post recv buffers") re-introduces filling the receive queue from rds_ib_recv(), but does not add the statistics counter. rds_ib_recv() was later renamed to rds_ib_recv_path(). This commit reintroduces the statistics counting of s_ib_rx_refill_from_thread and s_ib_rx_refill_from_cq. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Knut Omang <knut.omang@oracle.com> Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com> Reviewed-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08tcp: fastopen: tcp_connect() must refresh the routeEric Dumazet
With new TCP_FASTOPEN_CONNECT socket option, there is a possibility to call tcp_connect() while socket sk_dst_cache is either NULL or invalid. +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4 +0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0 +0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0 +0 connect(4, ..., ...) = 0 << sk->sk_dst_cache becomes obsolete, or even set to NULL >> +1 sendto(4, ..., 1000, MSG_FASTOPEN, ..., ...) = 1000 We need to refresh the route otherwise bad things can happen, especially when syzkaller is running on the host :/ Fixes: 19f6d3f3c8422 ("net/tcp-fastopen: Add new API support") Reported-by: Dmitry Vyukov <dvyukov@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Wei Wang <weiwan@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Wei Wang <weiwan@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08net: sched: set xt_tgchk_param par.net properly in ipt_init_targetXin Long
Now xt_tgchk_param par in ipt_init_target is a local varibale, par.net is not initialized there. Later when xt_check_target calls target's checkentry in which it may access par.net, it would cause kernel panic. Jaroslav found this panic when running: # ip link add TestIface type dummy # tc qd add dev TestIface ingress handle ffff: # tc filter add dev TestIface parent ffff: u32 match u32 0 0 \ action xt -j CONNMARK --set-mark 4 This patch is to pass net param into ipt_init_target and set par.net with it properly in there. v1->v2: As Wang Cong pointed, I missed ipt_net_id != xt_net_id, so fix it by also passing net_id to __tcf_ipt_init. v2->v3: Missed the fixes tag, so add it. Fixes: ecb2421b5ddf ("netfilter: add and use nf_ct_netns_get/put") Reported-by: Jaroslav Aster <jaster@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08net: dsa: mediatek: add adjust link support for user portsJohn Crispin
Manually adjust the port settings of user ports once PHY polling has completed. This patch extends the adjust_link callback to configure the per port PMCR register, applying the proper values polled from the PHY. Without this patch flow control was not always getting setup properly. Signed-off-by: Shashidhar Lakkavalli <shashidhar.lakkavalli@openmesh.com> Signed-off-by: Muciri Gatimu <muciri@openmesh.com> Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-08net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packetsDavide Caratti
if the NIC fails to validate the checksum on TCP/UDP, and validation of IP checksum is successful, the driver subtracts the pseudo-header checksum from the value obtained by the hardware and sets CHECKSUM_COMPLETE. Don't do that if protocol is IPPROTO_SCTP, otherwise CRC32c validation fails. V2: don't test MLX4_CQE_STATUS_IPV6 if MLX4_CQE_STATUS_IPV4 is set Reported-by: Shuang Li <shuali@redhat.com> Fixes: f8c6455bb04b ("net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE") Signed-off-by: Davide Caratti <dcaratti@redhat.com> Acked-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>