summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2018-01-17perf unwind: Do not look just at the global callchain_param.record_modeArnaldo Carvalho de Melo
When setting up DWARF callchains on specific events, without using 'record' or 'trace' --call-graph, but instead doing it like: perf trace -e cycles/call-graph=dwarf/ The unwind__prepare_access() call in thread__insert_map() when we process PERF_RECORD_MMAP(2) metadata events were not being performed, precluding us from using per-event DWARF callchains, handling them just when we asked for all events to be DWARF, using "--call-graph dwarf". We do it in the PERF_RECORD_MMAP because we have to look at one of the executable maps to figure out the executable type (64-bit, 32-bit) of the DSO laid out in that mmap. Also to look at the architecture where the perf.data file was recorded. All this probably should be deferred to when we process a sample for some thread that has callchains, so that we do this processing only for the threads with samples, not for all of them. For now, fix using DWARF on specific events. Before: # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.048 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.048/0.048/0.048/0.000 ms 0.000 probe_libc:inet_pton:(7fe9597bb350)) Problem processing probe_libc:inet_pton callchain, skipping... # After: # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.060 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.060/0.060/0.060/0.000 ms 0.000 probe_libc:inet_pton:(7fd4aa930350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffaa804e51af3f] (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) [0xffffaa804e51b379] (/usr/bin/ping) # # perf trace --call-graph=dwarf --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.057 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.057/0.057/0.057/0.000 ms 0.000 probe_libc:inet_pton:(7f9363b9e350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffa9e8a14e0f3f] (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) [0xffffa9e8a14e1379] (/usr/bin/ping) # # perf trace --call-graph=fp --no-syscalls -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.077 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.077/0.077/0.077/0.000 ms 0.000 probe_libc:inet_pton:(7f4947e1c350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffaa716d88ef3f] (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) [0xffffaa716d88f379] (/usr/bin/ping) # # perf trace --no-syscalls -e probe_libc:inet_pton/call-graph=fp/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.078 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.078/0.078/0.078/0.000 ms 0.000 probe_libc:inet_pton:(7fa157696350)) __GI___inet_pton (/usr/lib64/libc-2.26.so) getaddrinfo (/usr/lib64/libc-2.26.so) [0xffffa9ba39c74f40] (/usr/bin/ping) # Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/r/20180116182650.GE16107@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17perf callchain: Fix attr.sample_max_stack settingArnaldo Carvalho de Melo
When setting the "dwarf" unwinder for a specific event and not specifying the max-stack, the attr.sample_max_stack ended up using an uninitialized callchain_param.max_stack, fix it by using designated initializers for that callchain_param variable, zeroing all non explicitely initialized struct members. Here is what happened: # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 callchain: type DWARF callchain: stack dump size 8192 perf_event_attr: type 2 size 112 config 0x730 { sample_period, sample_freq } 1 sample_type IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC exclude_callchain_user 1 { wakeup_events, wakeup_watermark } 1 sample_regs_user 0xff0fff sample_stack_user 8192 sample_max_stack 50656 sys_perf_event_open failed, error -75 Value too large for defined data type # perf trace -vv --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 callchain: type DWARF callchain: stack dump size 8192 perf_event_attr: type 2 size 112 config 0x730 sample_type IP|TID|TIME|ADDR|CALLCHAIN|CPU|PERIOD|RAW|REGS_USER|STACK_USER|DATA_SRC exclude_callchain_user 1 sample_regs_user 0xff0fff sample_stack_user 8192 sample_max_stack 30448 sys_perf_event_open failed, error -75 Value too large for defined data type # Now the attr.sample_max_stack is set to zero and the above works as expected: # perf trace --no-syscalls --max-stack 4 -e probe_libc:inet_pton/call-graph=dwarf/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.072 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.072/0.072/0.072/0.000 ms 0.000 probe_libc:inet_pton:(7feb7a998350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffaa39b6108f3f] (/usr/bin/ping) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-is9tramondqa9jlxxsgcm9iz@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17perf tools: Add ARM Statistical Profiling Extensions (SPE) supportKim Phillips
'perf record' and 'perf report --dump-raw-trace' supported in this release. Example usage: # perf record -e arm_spe/ts_enable=1,pa_enable=1/ dd if=/dev/zero of=/dev/null count=10000 # perf report --dump-raw-trace Note that the perf.data file is portable, so the report can be run on another architecture host if necessary. Output will contain raw SPE data and its textual representation, such as: 0x5c8 [0x30]: PERF_RECORD_AUXTRACE size: 0x200000 offset: 0 ref: 0x1891ad0e idx: 1 tid: 2227 cpu: 1 . . ... ARM SPE data: size 2097152 bytes . 00000000: 49 00 LD . 00000002: b2 c0 3b 29 0f 00 00 ff ff VA 0xffff00000f293bc0 . 0000000b: b3 c0 eb 24 fb 00 00 00 80 PA 0xfb24ebc0 ns=1 . 00000014: 9a 00 00 LAT 0 XLAT . 00000017: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS . 00000019: b0 00 c4 15 08 00 00 ff ff PC 0xff00000815c400 el3 ns=1 . 00000022: 98 00 00 LAT 0 TOT . 00000025: 71 36 6c 21 2c 09 00 00 00 TS 39395093558 . 0000002e: 49 00 LD . 00000030: b2 80 3c 29 0f 00 00 ff ff VA 0xffff00000f293c80 . 00000039: b3 80 ec 24 fb 00 00 00 80 PA 0xfb24ec80 ns=1 . 00000042: 9a 00 00 LAT 0 XLAT . 00000045: 42 16 EV RETIRED L1D-ACCESS TLB-ACCESS . 00000047: b0 f4 11 16 08 00 00 ff ff PC 0xff0000081611f4 el3 ns=1 . 00000050: 98 00 00 LAT 0 TOT . 00000053: 71 36 6c 21 2c 09 00 00 00 TS 39395093558 . 0000005c: 48 00 INSN-OTHER . 0000005e: 42 02 EV RETIRED . 00000060: b0 2c ef 7f 08 00 00 ff ff PC 0xff0000087fef2c el3 ns=1 . 00000069: 98 00 00 LAT 0 TOT . 0000006c: 71 d1 6f 21 2c 09 00 00 00 TS 39395094481 ... Other release notes: - applies to acme's perf/{core,urgent} branches, likely elsewhere - Report is self-contained within the tool. Record requires enabling the kernel SPE driver by setting CONFIG_ARM_SPE_PMU. - The intel-bts implementation was used as a starting point; its min/default/max buffer sizes and power of 2 pages granularity need to be revisited for ARM SPE - Recording across multiple SPE clusters/domains not supported - Snapshot support (record -S), and conversion to native perf events (e.g., via 'perf inject --itrace'), are also not supported - Technically both cs-etm and spe can be used simultaneously, however disabled for simplicity in this release Signed-off-by: Kim Phillips <kim.phillips@arm.com> Reviewed-by: Dongjiu Geng <gengdongjiu@huawei.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Pawel Moll <pawel.moll@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wang Nan <wangnan0@huawei.com> Cc: Will Deacon <will.deacon@arm.com> Link: http://lkml.kernel.org/r/20180114132850.0b127434b704a26bad13268f@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17tools lib traceevent: Fix get_field_str() for dynamic stringsSteven Rostedt (VMware)
If a field is a dynamic string, get_field_str() returned just the offset/size value and not the string. Have it parse the offset/size correctly to return the actual string. Otherwise filtering fails when trying to filter fields that are dynamic strings. Reported-by: Gopanapalli Pradeep <prap_hai@yahoo.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20180112004823.146333275@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17tools lib traceevent: Fix missing break in FALSE case of ↵Taeung Song
pevent_filter_clear_trivial() Currently the FILTER_TRIVIAL_FALSE case has a missing break statement, if the trivial type is FALSE, it will also run into the TRUE case, and always be skipped as the TRUE statement will continue the loop on the inverse condition of the FALSE statement. Reported-by: Namhyung Kim <namhyung@kernel.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Taeung Song <treeze.taeung@gmail.com> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20180112004823.012918807@goodmis.org Link: http://lkml.kernel.org/r/1493218540-12296-1-git-send-email-treeze.taeung@gmail.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17tools lib traceevent: Add UL suffix to MISSING_EVENTSMichael Sartain
Add UL suffix to MISSING_EVENTS since ints shouldn't be left shifted by 31. Signed-off-by: Michael Sartain <mikesart@fastmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20171016165542.13038-4-mikesart@fastmail.com Link: http://lkml.kernel.org/r/20180112004822.829533885@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17tools lib traceevent: Use asprintf when possibleFederico Vaga
It makes the code clearer and less error prone. clearer: - less code - the code is now using the same format to create strings dynamically less error prone: - no magic number +2 +9 +5 to compute the size - no copy&paste of the strings to compute the size and to concatenate The function `asprintf` is not POSIX standard but the program was already using it. Later it can be decided to use only POSIX functions, then we can easly replace all the `asprintf(3)` with a local implementation of that function. Signed-off-by: Federico Vaga <federico.vaga@vaga.pv.it> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Federico Vaga <federico.vaga@vaga.pv.it> Link: http://lkml.kernel.org/r/20170802221558.9684-2-federico.vaga@vaga.pv.it Link: http://lkml.kernel.org/r/20180112004822.686281649@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17tools lib traceevent: Show contents (in hex) of data of unrecognized type ↵Steven Rostedt (VMware)
records When a record has an unrecognized type, an error message is reported, but it would also be helpful to see the contents of that record. At least show what it is in hex, instead of just showing a blank line. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20180112004822.542204577@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17tools lib traceevent: Handle new pointer processing of bprint stringsSteven Rostedt (VMware)
The Linux kernel printf() has some extended use cases that dereference the pointer. This is dangerouse for tracing because the pointer that is dereferenced can change or even be unmapped. It also causes issues when the trace data is extracted, because user space does not have access to the contents of the pointer even if it still exists. To handle this, the kernel was updated to process these dereferenced pointers at the time they are recorded, and not post processed. Now they exist in the tracing buffer, and no dereference is needed at the time of reading the trace. The event parsing library needs to handle this new case. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20180112004822.403349289@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17tools lib traceevent: Simplify pointer print logic and fix %pFSteven Rostedt (VMware)
When processing %pX in pretty_print(), simplify the logic slightly by incrementing the ptr to the format string if isalnum(ptr[1]) is true. This follows the logic a bit more closely to what is in the kernel. Also, this fixes a small bug where %pF was not giving the offset of the function. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20180112004822.260262257@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17tools lib traceevent: Print value of unknown symbolic fieldsJan Kiszka
Aligns trace-cmd with the behavior of the kernel. Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/e60c889f-55e7-4ee8-0e50-151e435ffd8c@siemens.com Link: http://lkml.kernel.org/r/20180112004822.118332436@goodmis.org Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17tools lib traceevent: Show value of flags that have not been parsedSteven Rostedt (VMware)
If the value contains bits that are not defined by print_flags() helper, then show the remaining bits. This aligns with the functionality of the kernel. Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/e60c889f-55e7-4ee8-0e50-151e435ffd8c@siemens.com Link: http://lkml.kernel.org/r/20180112004821.976225232@goodmis.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17tools lib traceevent: Fix bad force_token escape sequenceMichael Sartain
Older kernels have a bug that creates invalid symbols. event-parse.c handles them by replacing them with a "%s" token. But the fix included an extra backslash, and "\%s" was added incorrectly. Signed-off-by: Michael Sartain <mikesart@fastmail.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20180112004821.827168881@goodmis.org Link: http://lkml.kernel.org/r/d320000d37c10ce0912851e1fb78d1e0c946bcd9.1497486273.git.mikesart@fastmail.com Signed-off-by: Steven Rostedt <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Overlapping changes all over. The mini-qdisc bits were a little bit tricky, however. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-17libbpf: Makefile set specified permission modeJesper Dangaard Brouer
The third parameter to do_install was not used by $(INSTALL) command. Fix this by only setting the -m option when the third parameter is supplied. The use of a third parameter was introduced in commit eb54e522a000 ("bpf: install libbpf headers on 'make install'"). Without this change, the header files are install as executables files (755). Fixes: eb54e522a000 ("bpf: install libbpf headers on 'make install'") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-17libbpf: cleanup Makefile, remove unused elementsJesper Dangaard Brouer
The plugin_dir_SQ variable is not used, remove it. The function update_dir is also unused, remove it. The variable $VERSION_FILES is empty, remove it. These all originates from the introduction of the Makefile, and is likely a copy paste from tools/lib/traceevent/Makefile. Fixes: 1b76c13e4b36 ("bpf tools: Introduce 'bpf' library and add bpf feature check") Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-17libbpf: install the header file libbpf.hJesper Dangaard Brouer
It seems like an oversight not to install the header file for libbpf, given the libbpf.so + libbpf.a files are installed. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-17libbpf: fix string comparison for guessing eBPF program typeQuentin Monnet
libbpf is able to deduce the type of a program from the name of the ELF section in which it is located. However, the comparison is made on the first n characters, n being determined with sizeof() applied to the reference string (e.g. "xdp"). When such section names are supposed to receive a suffix separated with a slash (e.g. "kprobe/"), using sizeof() takes the final NUL character of the reference string into account, which implies that both strings must be equal. Instead, the desired behaviour would consist in taking the length of the string, *without* accounting for the ending NUL character, and to make sure the reference string is a prefix to the ELF section name. Subtract 1 to the total size of the string for obtaining the length for the comparison. Fixes: 583c90097f72 ("libbpf: add ability to guess program type based on section name") Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-17tools: bpftool: add -DPACKAGE when including bfd.hJiong Wang
bfd.h is requiring including of config.h except when PACKAGE or PACKAGE_VERSION are defined. /* PR 14072: Ensure that config.h is included first. */ #if !defined PACKAGE && !defined PACKAGE_VERSION #error config.h must be included before this header #endif This check has been introduced since May-2012. It doesn't show up in bfd.h on some Linux distribution, probably because distributions have remove it when building the package. However, sometimes the user might just build libfd from source code then link bpftool against it. For this case, bfd.h will be original that we need to define PACKAGE or PACKAGE_VERSION. Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Jiong Wang <jiong.wang@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-16bpf: reject stores into ctx via st and xaddDaniel Borkmann
Alexei found that verifier does not reject stores into context via BPF_ST instead of BPF_STX. And while looking at it, we also should not allow XADD variant of BPF_STX. The context rewriter is only assuming either BPF_LDX_MEM- or BPF_STX_MEM-type operations, thus reject anything other than that so that assumptions in the rewriter properly hold. Add test cases as well for BPF selftests. Fixes: d691f9e8d440 ("bpf: allow programs to write to certain skb fields") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Two read past end of buffer fixes in AF_KEY, from Eric Biggers. 2) Memory leak in key_notify_policy(), from Steffen Klassert. 3) Fix overflow with bpf arrays, from Daniel Borkmann. 4) Fix RDMA regression with mlx5 due to mlx5 no longer using pci_irq_get_affinity(), from Saeed Mahameed. 5) Missing RCU read locking in nl80211_send_iface() when it calls ieee80211_bss_get_ie(), from Dominik Brodowski. 6) cfg80211 should check dev_set_name()'s return value, from Johannes Berg. 7) Missing module license tag in 9p protocol, from Stephen Hemminger. 8) Fix crash due to too small MTU in udp ipv6 sendmsg, from Mike Maloney. 9) Fix endless loop in netlink extack code, from David Ahern. 10) TLS socket layer sets inverted error codes, resulting in an endless loop. From Robert Hering. 11) Revert openvswitch erspan tunnel support, it's mis-designed and we need to kill it before it goes into a real release. From William Tu. 12) Fix lan78xx failures in full speed USB mode, from Yuiko Oshino. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (54 commits) net, sched: fix panic when updating miniq {b,q}stats qed: Fix potential use-after-free in qed_spq_post() nfp: use the correct index for link speed table lan78xx: Fix failure in USB Full Speed sctp: do not allow the v4 socket to bind a v4mapped v6 address sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf sctp: reinit stream if stream outcnt has been change by sinit in sendmsg ibmvnic: Fix pending MAC address changes netlink: extack: avoid parenthesized string constant warning ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY net: Allow neigh contructor functions ability to modify the primary_key sh_eth: fix dumping ARSTR Revert "openvswitch: Add erspan tunnel support." net/tls: Fix inverted error codes to avoid endless loop ipv6: ip6_make_skb() needs to clear cork.base.dst sctp: avoid compiler warning on implicit fallthru net: ipv4: Make "ip route get" match iif lo rules again. netlink: extack needs to be reset each time through loop tipc: fix a memory leak in tipc_nl_node_get_link() ipv6: fix udpv6 sendmsg crash caused by too small MTU ...
2018-01-16selftests: Fix loss of test output in run_kselftests.shMichael Ellerman
Commit fbcab13d2e25 ("selftests: silence test output by default") changed the run_tests logic as well as the logic to generate run_kselftests.sh to redirect test output away from the console. As discussed on the list and at kernel summit, this is not a desirable default as it means in order to debug a failure the console output is not sufficient, you also need access to the test machine to get the full test logs. Additionally it's impolite to write directly to /tmp/$TEST_NAME on shared systems. The change to the run_tests logic was reverted in commit a323335e62cc ("selftests: lib.mk: print individual test results to console by default"), and instead a summary option was added so that quiet output could be requested. However the change to run_kselftests.sh was left as-is. This commit applies the same logic to the run_kselftests.sh code, ie. the script now takes a "--summary" option which suppresses the output, but shows all output by default. Additionally instead of writing to /tmp/$TEST_NAME the output is redirected to the directory where the generated test script is located. Fixes: fbcab13d2e25 ("selftests: silence test output by default") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2018-01-16selftest: ftrace: Fix to add 256 kprobe events correctlyMasami Hiramatsu
Current multiple-kprobe testcase only tries to add kprobe events on first 256 text symbols. However kprobes fails to probe on some text symbols (like blacklisted symbols). Thus in the worst case, the test can not add any kprobe events. To avoid that, continue to try adding kprobe events until 256 events. Also it confirms the number of registered kprobe events. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2018-01-16selftest: ftrace: Fix to pick text symbols for kprobesMasami Hiramatsu
Fix to pick text symbols for multiple kprobe testcase. kallsyms shows text symbols with " t " or " T " but current testcase picks all symbols including "t", so it picks data symbols if it includes 't' (e.g. "str"). This fixes it to find symbol lines with " t " or " T " (including spaces). Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Reported-by: Russell King <linux@armlinux.org.uk> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2018-01-16bpftool: recognize BPF_PROG_TYPE_CGROUP_DEVICE programsRoman Gushchin
Bpftool doesn't recognize BPF_PROG_TYPE_CGROUP_DEVICE programs, so the prog show command prints the numeric type value: $ bpftool prog show 1: type 15 name bpf_prog1 tag ac9f93dbfd6d9b74 loaded_at Jan 15/07:58 uid 0 xlated 96B jited 105B memlock 4096B This patch defines the corresponding textual representation: $ bpftool prog show 1: cgroup_device name bpf_prog1 tag ac9f93dbfd6d9b74 loaded_at Jan 15/07:58 uid 0 xlated 96B jited 105B memlock 4096B Signed-off-by: Roman Gushchin <guro@fb.com> Cc: Jakub Kicinski <jakub.kicinski@netronome.com> Cc: Quentin Monnet <quentin.monnet@netronome.com> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Alexei Starovoitov <ast@kernel.org> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-16objtool: Improve error message for bad file argumentJosh Poimboeuf
If a nonexistent file is supplied to objtool, it complains with a non-helpful error: open: No such file or directory Improve it to: objtool: Can't open 'foo': No such file or directory Reported-by: Markus <M4rkusXXL@web.de> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/406a3d00a21225eee2819844048e17f68523ccf6.1516025651.git.jpoimboe@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-15usercopy: Enhance and rename report_usercopy()Kees Cook
In preparation for refactoring the usercopy checks to pass offset to the hardened usercopy report, this renames report_usercopy() to the more accurate usercopy_abort(), marks it as noreturn because it is, adds a hopefully helpful comment for anyone investigating such reports, makes the function available to the slab allocators, and adds new "detail" and "offset" arguments. Signed-off-by: Kees Cook <keescook@chromium.org>
2018-01-15Merge 4.15-rc8 into usb-nextGreg Kroah-Hartman
We want the USB fixes in here as well for merge issues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2018-01-14bpf: offload: add map offload infrastructureJakub Kicinski
BPF map offload follow similar path to program offload. At creation time users may specify ifindex of the device on which they want to create the map. Map will be validated by the kernel's .map_alloc_check callback and device driver will be called for the actual allocation. Map will have an empty set of operations associated with it (save for alloc and free callbacks). The real device callbacks are kept in map->offload->dev_ops because they have slightly different signatures. Map operations are called in process context so the driver may communicate with HW freely, msleep(), wait() etc. Map alloc and free callbacks are muxed via existing .ndo_bpf, and are always called with rtnl lock held. Maps and programs are guaranteed to be destroyed before .ndo_uninit (i.e. before unregister_netdev() returns). Map callbacks are invoked with bpf_devs_lock *read* locked, drivers must take care of exclusive locking if necessary. All offload-specific branches are marked with unlikely() (through bpf_map_is_dev_bound()), given that branch penalty will be negligible compared to IO anyway, and we don't want to penalize SW path unnecessarily. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-01-14Merge branch 'x86-pti-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 pti updates from Thomas Gleixner: "This contains: - a PTI bugfix to avoid setting reserved CR3 bits when PCID is disabled. This seems to cause issues on a virtual machine at least and is incorrect according to the AMD manual. - a PTI bugfix which disables the perf BTS facility if PTI is enabled. The BTS AUX buffer is not globally visible and causes the CPU to fault when the mapping disappears on switching CR3 to user space. A full fix which restores BTS on PTI is non trivial and will be worked on. - PTI bugfixes for EFI and trusted boot which make sure that the user space visible page table entries have the NX bit cleared - removal of dead code in the PTI pagetable setup functions - add PTI documentation - add a selftest for vsyscall to verify that the kernel actually implements what it advertises. - a sysfs interface to expose vulnerability and mitigation information so there is a coherent way for users to retrieve the status. - the initial spectre_v2 mitigations, aka retpoline: + The necessary ASM thunk and compiler support + The ASM variants of retpoline and the conversion of affected ASM code + Make LFENCE serializing on AMD so it can be used as speculation trap + The RSB fill after vmexit - initial objtool support for retpoline As I said in the status mail this is the most of the set of patches which should go into 4.15 except two straight forward patches still on hold: - the retpoline add on of LFENCE which waits for ACKs - the RSB fill after context switch Both should be ready to go early next week and with that we'll have covered the major holes of spectre_v2 and go back to normality" * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits) x86,perf: Disable intel_bts when PTI security/Kconfig: Correct the Documentation reference for PTI x86/pti: Fix !PCID and sanitize defines selftests/x86: Add test_vsyscall x86/retpoline: Fill return stack buffer on vmexit x86/retpoline/irq32: Convert assembler indirect jumps x86/retpoline/checksum32: Convert assembler indirect jumps x86/retpoline/xen: Convert Xen hypercall indirect jumps x86/retpoline/hyperv: Convert assembler indirect jumps x86/retpoline/ftrace: Convert ftrace assembler indirect jumps x86/retpoline/entry: Convert entry assembler indirect jumps x86/retpoline/crypto: Convert crypto assembler indirect jumps x86/spectre: Add boot time option to select Spectre v2 mitigation x86/retpoline: Add initial retpoline support objtool: Allow alternatives to be ignored objtool: Detect jumps to retpoline thunks x86/pti: Make unpoison of pgd for trusted boot work for real x86/alternatives: Fix optimize_nops() checking sysfs/cpu: Fix typos in vulnerability documentation x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC ...
2018-01-13tools/objtool/Makefile: don't assume sync-check.sh is executableAndrew Morton
patch(1) loses the x bit. So if a user follows our patching instructions in Documentation/admin-guide/README.rst, their kernel will not compile. Fixes: 3bd51c5a371de ("objtool: Move kernel headers/code sync check to a script") Reported-by: Nicolas Bock <nicolasbock@gentoo.org> Reported-by Joakim Tjernlund <Joakim.Tjernlund@infinera.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-13selftests/x86: Add test_vsyscallAndy Lutomirski
This tests that the vsyscall entries do what they're expected to do. It also confirms that attempts to read the vsyscall page behave as expected. If changes are made to the vsyscall code or its memory map handling, running this test in all three of vsyscall=none, vsyscall=emulate, and vsyscall=native are helpful. (Because it's easy, this also compares the vsyscall results to their vDSO equivalents.) Note to KAISER backporters: please test this under all three vsyscall modes. Also, in the emulate and native modes, make sure that test_vsyscall_64 agrees with the command line or config option as to which mode you're in. It's quite easy to mess up the kernel such that native mode accidentally emulates or vice versa. Greg, etc: please backport this to all your Meltdown-patched kernels. It'll help make sure the patches didn't regress vsyscalls. CSigned-off-by: Andy Lutomirski <luto@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/2b9c5a174c1d60fd7774461d518aa75598b1d8fd.1515719552.git.luto@kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-12perf trace: Fix setting of --call-graph/--max-stack for non-syscall eventsArnaldo Carvalho de Melo
The raw_syscalls:sys_{enter,exit} were first supported in 'perf trace', together with minor and major page faults, then we supported --call-graph, then --max-stack, but when the other tracepoints got supported, and bpf, etc, I forgot to make those global call-graph settings apply to them. Fix it by realizing that the global --max-stack and --call-graph settings are done via: OPT_CALLBACK(0, "call-graph", &trace.opts, "record_mode[,record_size]", record_callchain_help, &record_parse_callchain_opt), And then, when we go to parse the events in -e via: OPT_CALLBACK('e', "event", &trace, "event", "event/syscall selector. use 'perf list' to list available events", trace__parse_events_option), And trace__parse_sevents_option() calls: struct option o = OPT_CALLBACK('e', "event", &trace->evlist, "event", "event selector. use 'perf list' to list available events", parse_events_option); err = parse_events_option(&o, lists[0], 0); parse_events_option() will override the global --call-graph and --max-stack if the "call-graph" and/or "max-stack" terms are in the event definition, such as in the probe_libc:inet_pton event in one of the examples below (-e probe_libc:inet_pton/max-stack=2). Before: # perf trace --mmap 1024 --call-graph dwarf -e sendto,probe_libc:inet_pton ping -6 -c 1 ::1 1.525 ( ): probe_libc:inet_pton:(7f77f3ac9350)) PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.071 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.071/0.071/0.071/0.000 ms 1.677 ( 0.081 ms): ping/31296 sendto(fd: 3, buff: 0x55681b652720, len: 64, addr: 0x55681b650640, addr_len: 28) = 64 __libc_sendto (/usr/lib64/libc-2.26.so) [0xffffaa97e4bc9cef] (/usr/bin/ping) [0xffffaa97e4bc656d] (/usr/bin/ping) [0xffffaa97e4bc7d0a] (/usr/bin/ping) [0xffffaa97e4bca447] (/usr/bin/ping) [0xffffaa97e4bc2f91] (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) [0xffffaa97e4bc3379] (/usr/bin/ping) # After: # perf trace --mmap 1024 --call-graph dwarf -e sendto,probe_libc:inet_pton ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.089 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.089/0.089/0.089/0.000 ms 1.955 ( ): probe_libc:inet_pton:(7f383a311350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) [0xffffaa5d91444f3f] (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) [0xffffaa5d91445379] (/usr/bin/ping) 2.140 ( 0.101 ms): ping/32047 sendto(fd: 3, buff: 0x55a26edd0720, len: 64, addr: 0x55a26edce640, addr_len: 28) = 64 __libc_sendto (/usr/lib64/libc-2.26.so) [0xffffaa5d9144bcef] (/usr/bin/ping) [0xffffaa5d9144856d] (/usr/bin/ping) [0xffffaa5d91449d0a] (/usr/bin/ping) [0xffffaa5d9144c447] (/usr/bin/ping) [0xffffaa5d91444f91] (/usr/bin/ping) __libc_start_main (/usr/lib64/libc-2.26.so) [0xffffaa5d91445379] (/usr/bin/ping) # Same thing for --max-stack, the global one: # perf trace --max-stack 3 -e sendto,probe_libc:inet_pton ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.097 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.097/0.097/0.097/0.000 ms 1.577 ( ): probe_libc:inet_pton:(7f32f3957350)) __inet_pton (inlined) gaih_inet.constprop.7 (/usr/lib64/libc-2.26.so) __GI_getaddrinfo (inlined) 1.738 ( 0.108 ms): ping/32103 sendto(fd: 3, buff: 0x55c3132d7720, len: 64, addr: 0x55c3132d5640, addr_len: 28) = 64 __libc_sendto (/usr/lib64/libc-2.26.so) [0xffffaa3cecf44cef] (/usr/bin/ping) [0xffffaa3cecf4156d] (/usr/bin/ping) # And then setting up a global setting (dwarf, max-stack=4), that will affect the raw_syscall:sys_enter for the 'sendto' syscall and that will be overriden in the probe_libc:inet_pton call to just one entry. # perf trace --max-stack=4 --call-graph dwarf -e sendto -e probe_libc:inet_pton/max-stack=1/ ping -6 -c 1 ::1 PING ::1(::1) 56 data bytes 64 bytes from ::1: icmp_seq=1 ttl=64 time=0.090 ms --- ::1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.090/0.090/0.090/0.000 ms 2.140 ( ): probe_libc:inet_pton:(7f9fe9337350)) __GI___inet_pton (/usr/lib64/libc-2.26.so) 2.283 ( 0.103 ms): ping/31804 sendto(fd: 3, buff: 0x55c7f3e19720, len: 64, addr: 0x55c7f3e17640, addr_len: 28) = 64 __libc_sendto (/usr/lib64/libc-2.26.so) [0xffffaa380c402cef] (/usr/bin/ping) [0xffffaa380c3ff56d] (/usr/bin/ping) [0xffffaa380c400d0a] (/usr/bin/ping) # Install iputils-debuginfo to get those /usr/bin/ping addresses resolved, those routines are not on its .dymsym nor .symtab :-) Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-qgl2gse8elhh9zztw4ajopg3@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-12perf evsel: Check if callchain is enabled before setting it upArnaldo Carvalho de Melo
The construct: if (callchain_param) perf_evsel__config_callchain(evsel, opts, &callchain_param); happens in several places, so make perf_evsel__config_callchain() work just like free(NULL), do nothing if param->enabled is not set. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-ykk0qzxnxwx3o611ctjnmxav@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-12perf tools: Fix copyfile_offset update of output offsetJiri Olsa
We need to increase output offset in each iteration, not decrease it as we currently do. I guess we were lucky to finish in most cases in first iteration, so the bug never showed. However it shows a lot when working with big (~4GB) size data. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: 9c9f5a2f1944 ("perf tools: Introduce copyfile_offset() function") Link: http://lkml.kernel.org/r/20180109133923.25406-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-12selftests: media_tests: Add SPDX license identifierShuah Khan
Replace GPL license statement with SPDX GPL-2.0 license identifier. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2018-01-12selftests: kselftest.h: Add SPDX license identifierShuah Khan
Replace GPL license statement with SPDX GPL-2.0 license identifier. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2018-01-12selftests: kselftest_install.sh: Add SPDX license identifierShuah Khan
Replace GPL license statement with SPDX GPL-2.0 license identifier. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2018-01-12selftests: gen_kselftest_tar.h: Add SPDX license identifierShuah Khan
Replace GPL license statement with SPDX GPL-2.0 license identifier. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2018-01-12selftests: media_tests: Fix Makefile 'clean' target warningShuah Khan
Remove 'clean' target and change TEST_PROGS to TEST_GEN_PROGS so the common lib.mk 'clean' target clean these generated files. TEST_PROGS is for shell scripts and not for generated test executables. Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2018-01-12tools/testing: Fix trailing semicolonLuis de Bethencourt
The trailing semicolon is an empty statement that does no operation. Removing it since it doesn't do anything. Signed-off-by: Luis de Bethencourt <luisbg@kernel.org> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2018-01-12perf trace: No need to set PERF_SAMPLE_IDENTIFIER explicitelyArnaldo Carvalho de Melo
Since 75562573bab3 ("perf tools: Add support for PERF_SAMPLE_IDENTIFIER") we don't need explicitely set PERF_SAMPLE_IDENTIFIER, as perf_evlist__config() will do this for us, i.e. when there are more than one evsel in an evlist, it will check if some evsel has a sample_type different than the one on the first evsel in the list, setting PERF_SAMPLE_IDENTIFIER in that case. So, to simplify 'perf trace' codebase, ditch that check. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Hendrick Brueckner <brueckner@linux.vnet.ibm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Thomas Richter <tmricht@linux.vnet.ibm.com> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-12xq6orhwttee2tdtu96ucrp@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-12perf script python: Add script to profile and resolve physical mem typeKan Liang
There could be different types of memory in the system. E.g normal System Memory, Persistent Memory. To understand how the workload maps to those memories, it's important to know the I/O statistics of them. Perf can collect physical addresses, but those are raw data. It still needs extra work to resolve the physical addresses. Provide a script to facilitate the physical addresses resolving and I/O statistics. Profile with MEM_INST_RETIRED.ALL_LOADS or MEM_UOPS_RETIRED.ALL_LOADS event if any of them is available. Look up the /proc/iomem and resolve the physical address. Provide memory type summary. Here is an example output: # perf script report mem-phys-addr Event: mem_inst_retired.all_loads:P Memory type count percentage ---------------------------------------- ----------- ----------- System RAM 74 53.2% Persistent Memory 55 39.6% N/A --- Changes since V2: - Apply the new license rules. - Add comments for globals Changes since V1: - Do not mix DLA and Load Latency. Do not compare the loads and stores. Only profile the loads. - Use event name to replace the RAW event Signed-off-by: Kan Liang <Kan.liang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lkml.kernel.org/r/1515099595-34770-1-git-send-email-kan.liang@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-12perf evlist: Remove trailing semicolonLuis de Bethencourt
The trailing semicolon is an empty statement that does no operation. Removing it since it doesn't do anything. Signed-off-by: Luis de Bethencourt <luisbg@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20180111155020.9782-1-luisbg@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
BPF alignment tests got a conflict because the registers are output as Rn_w instead of just Rn in net-next, and in net a fixup for a testcase prohibits logical operations on pointers before using them. Also, we should attempt to patch BPF call args if JIT always on is enabled. Instead, if we fail to JIT the subprogs we should pass an error back up and fail immediately. Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-12objtool: Allow alternatives to be ignoredJosh Poimboeuf
Getting objtool to understand retpolines is going to be a bit of a challenge. For now, take advantage of the fact that retpolines are patched in with alternatives. Just read the original (sane) non-alternative instruction, and ignore the patched-in retpoline. This allows objtool to understand the control flow *around* the retpoline, even if it can't yet follow what's inside. This means the ORC unwinder will fail to unwind from inside a retpoline, but will work fine otherwise. Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-3-git-send-email-dwmw@amazon.co.uk
2018-01-12objtool: Detect jumps to retpoline thunksJosh Poimboeuf
A direct jump to a retpoline thunk is really an indirect jump in disguise. Change the objtool instruction type accordingly. Objtool needs to know where indirect branches are so it can detect switch statement jump tables. This fixes a bunch of warnings with CONFIG_RETPOLINE like: arch/x86/events/intel/uncore_nhmex.o: warning: objtool: nhmex_rbox_msr_enable_event()+0x44: sibling call from callable instruction with modified stack frame kernel/signal.o: warning: objtool: copy_siginfo_to_user()+0x91: sibling call from callable instruction with modified stack frame ... Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: Rik van Riel <riel@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: thomas.lendacky@amd.com Cc: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Jiri Kosina <jikos@kernel.org> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Kees Cook <keescook@google.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Greg Kroah-Hartman <gregkh@linux-foundation.org> Cc: Paul Turner <pjt@google.com> Link: https://lkml.kernel.org/r/1515707194-20531-2-git-send-email-dwmw@amazon.co.uk
2018-01-11perf evsel: Fix incorrect handling of type _TERM_DRV_CFGMathieu Poirier
Commit ("d0565132605f perf evsel: Enable type checking for perf_evsel_config_term types") assumes PERF_EVSEL__CONFIG_TERM_DRV_CFG isn't used and as such adds a BUG_ON(). Since the enumeration type is used in macro ADD_CONFIG_TERM() the change break CoreSight trace acquisition. This patch restores the original code. Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Fixes: d0565132605f ("perf evsel: Enable type checking for perf_evsel_config_term types") Link: http://lkml.kernel.org/r/1515617211-32024-1-git-send-email-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-01-10bpf: arsh is not supported in 32 bit alu thus reject itDaniel Borkmann
The following snippet was throwing an 'unknown opcode cc' warning in BPF interpreter: 0: (18) r0 = 0x0 2: (7b) *(u64 *)(r10 -16) = r0 3: (cc) (u32) r0 s>>= (u32) r0 4: (95) exit Although a number of JITs do support BPF_ALU | BPF_ARSH | BPF_{K,X} generation, not all of them do and interpreter does neither. We can leave existing ones and implement it later in bpf-next for the remaining ones, but reject this properly in verifier for the time being. Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") Reported-by: syzbot+93c4904c5c70348a6890@syzkaller.appspotmail.com Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-01-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Daniel Borkmann says: ==================== pull-request: bpf 2018-01-09 The following pull-request contains BPF updates for your *net* tree. The main changes are: 1) Prevent out-of-bounds speculation in BPF maps by masking the index after bounds checks in order to fix spectre v1, and add an option BPF_JIT_ALWAYS_ON into Kconfig that allows for removing the BPF interpreter from the kernel in favor of JIT-only mode to make spectre v2 harder, from Alexei. 2) Remove false sharing of map refcount with max_entries which was used in spectre v1, from Daniel. 3) Add a missing NULL psock check in sockmap in order to fix a race, from John. 4) Fix test_align BPF selftest case since a recent change in verifier rejects the bit-wise arithmetic on pointers earlier but test_align update was missing, from Alexei. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>