linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2019-11-24	selftests/bpf: Integrate verbose verifier log into test_progs	Andrii Nakryiko
	Add exra level of verboseness, activated by -vvv argument. When -vv is specified, verbose libbpf and verifier log (level 1) is output, even for successful tests. With -vvv, verifier log goes to level 2. This is extremely useful to debug verifier failures, as well as just see the state and flow of verification. Before this, you'd have to go and modify load_program()'s source code inside libbpf to specify extra log_level flags, which is suboptimal to say the least. Currently -vv and -vvv triggering verifier output is integrated into test_stub's bpf_prog_load as well as bpf_verif_scale.c tests. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191120003548.4159797-1-andriin@fb.com
2019-11-24	selftests, bpftool: Skip the build test if not in tree	Jakub Kicinski
	If selftests are copied over to another machine/location for execution the build test of bpftool will obviously not work, since the sources are not copied. Skip it if we can't find bpftool's Makefile. Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191119105010.19189-3-quentin.monnet@netronome.com
2019-11-24	selftests, bpftool: Set EXIT trap after usage function	Quentin Monnet
	The trap on EXIT is used to clean up any temporary directory left by the build attempts. It is not needed when the user simply calls the script with its --help option, and may not be needed either if we add checks (e.g. on the availability of bpftool files) before the build attempts. Let's move this trap and related variables lower down in the code, so that we don't accidentally change the value returned from the script on early exits at pre-checks. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Link: https://lore.kernel.org/bpf/20191119105010.19189-2-quentin.monnet@netronome.com
2019-11-24	tools, bpf: Fix build for 'make -s tools/bpf O=<dir>'	Quentin Monnet
	Building selftests with 'make TARGETS=bpf kselftest' was fixed in commit 55d554f5d140 ("tools: bpf: Use !building_out_of_srctree to determine srctree"). However, by updating $(srctree) in tools/bpf/Makefile for in-tree builds only, we leave out the case where we pass an output directory to build BPF tools, but $(srctree) is not set. This typically happens for: $ make -s tools/bpf O=/tmp/foo Makefile:40: /tools/build/Makefile.feature: No such file or directory Fix it by updating $(srctree) in the Makefile not only for out-of-tree builds, but also if $(srctree) is empty. Detected with test_bpftool_build.sh. Fixes: 55d554f5d140 ("tools: bpf: Use !building_out_of_srctree to determine srctree") Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Link: https://lore.kernel.org/bpf/20191119105626.21453-1-quentin.monnet@netronome.com
2019-11-24	tools, bpftool: Fix warning on ignored return value for 'read'	Quentin Monnet
	When building bpftool, a warning was introduced by commit a94364603610 ("bpftool: Allow to read btf as raw data"), because the return value from a call to 'read()' is ignored. Let's address it. Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Andrii Nakryiko <andriin@fb.com> Link: https://lore.kernel.org/bpf/20191119111706.22440-1-quentin.monnet@netronome.com
2019-11-22	Merge branch 'next/seccomp' into for-next	Paul Walmsley

2019-11-22	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Minor conflict in drivers/s390/net/qeth_l2_main.c, kept the lock from commit c8183f548902 ("s390/qeth: fix potential deadlock on workqueue flush"), removed the code which was removed by commit 9897d583b015 ("s390/qeth: consolidate some duplicated HW cmd code"). Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
2019-11-22	perf parse: Fix potential memory leak when handling tracepoint errors	Ian Rogers
	An error may be in place when tracepoint_error is called, use parse_events__handle_error to avoid a memory leak and to capture the first and last error. Error detected by LLVM's libFuzzer using the following event: $ perf stat -e 'msr/event/,f:e' event syntax error: 'msr/event/,f:e' \___ can't access trace events Error: No permissions to read /sys/kernel/debug/tracing/events/f/e Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/tracing/' Initial error: event syntax error: 'msr/event/,f:e' \___ no value assigned for term Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: clang-built-linux@googlegroups.com Link: http://lore.kernel.org/lkml/20191120180925.21787-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf probe: Fix spelling mistake "addrees" -> "address"	Colin Ian King
	There is a spelling mistake in a pr_warning message. Fix it. Signed-off-by: Colin King <colin.king@canonical.com> Acked-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: http://lore.kernel.org/lkml/20191121092623.374896-1-colin.king@canonical.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	libtraceevent: Fix memory leakage in copy_filter_type	Hewenliang
	It is necessary to free the memory that we have allocated when error occurs. Fixes: ef3072cd1d5c ("tools lib traceevent: Get rid of die in add_filter_type()") Signed-off-by: Hewenliang <hewenliang4@huawei.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Tzvetomir Stoyanov <tstoyanov@vmware.com> Link: http://lore.kernel.org/lkml/20191119014415.57210-1-hewenliang4@huawei.com Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	libtraceevent: Fix header installation	Sudip Mukherjee
	When we passed some location in DESTDIR, install_headers called do_install with DESTDIR as part of the second argument. But do_install is again using '$(DESTDIR_SQ)$2', so as a result the headers were installed in a location $DESTDIR/$DESTDIR. In my testing I passed DESTDIR=/home/sudip/test and the headers were installed in: /home/sudip/test/home/sudip/test/usr/include/traceevent. Lets remove DESTDIR from the second argument of do_install so that the headers are installed in the correct location. Signed-off-by: Sudipm Mukherjee <sudipm.mukherjee@gmail.com> Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Sudipm Mukherjee <sudipm.mukherjee@gmail.com> Cc: linux-trace-devel@vger.kernel.org Link: http://lore.kernel.org/lkml/20191114133719.309-1-sudipm.mukherjee@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf intel-bts: Does not support AUX area sampling	Adrian Hunter
	Add an error message because Intel BTS does not support AUX area sampling. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-16-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf intel-pt: Add support for decoding AUX area samples	Adrian Hunter
	Add support for dumping, queuing and decoding AUX area samples. Decoding samples is the same as regular decoding, except in the case where there are no timestamps, in which case buffers are decoded immediately before the sample event. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-15-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf intel-pt: Add support for recording AUX area samples	Adrian Hunter
	Set up the default number of mmap pages, default sample size and default psb_period for AUX area sampling. Add documentation also. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-14-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf pmu: When using default config, record which bits of config were ↵	Adrian Hunter
	changed by the user Default config for a PMU is defined before selected events are parsed. That allows the user-entered config to override the default config. However that does not allow for changing the default config based on other options. For example, if the user chooses AUX area sampling mode, in the case of Intel PT, the psb_period needs to be small for sampling, so there is a need to set the default psb_period to 0 (2 KiB) in that case. However that should not override a value set by the user. To allow for that, when using default config, record which bits of config were changed by the user. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-13-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf auxtrace: Add support for queuing AUX area samples	Adrian Hunter
	Add functions to queue AUX area samples in advance (auxtrace_queue_data()) or individually (auxtrace_queues__add_sample()) or find out what queue a sample belongs on (auxtrace_queues__sample_queue()). auxtrace_queue_data() can also queue snapshot data which keeps snapshots and samples ordered with respect to each other in case support for that is desired. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-12-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf session: Add facility to peek at all events	Adrian Hunter
	AUX area samples are not limited in how far back in time the sample could start. Consequently samples must be queued in advance to allow for time-ordered processing. To achieve that, add perf_session__peek_events() that walks and peeks at all the events. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf auxtrace: Add support for dumping AUX area samples	Adrian Hunter
	Add support for dumping AUX area samples i.e. via the perf script/report -D (--dump-raw-trace) option. Committer notes: Add __maybe_unused to the two args for auxtrace__dump_auxtrace_sample() for when we don't HAVE_AUXTRACE_SUPPORT. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf inject: Cut AUX area samples	Adrian Hunter
	After decoding AUX area samples, the AUX area data is no longer needed (having been replaced by synthesized events) so cut it out. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf record: Add aux-sample-size config term	Adrian Hunter
	To allow individual events to be selected for AUX area sampling, add aux-sample-size config term. attr.aux_sample_size is updated by auxtrace_parse_sample_options() so that the existing validation will see the value. Any event that has a non-zero aux_sample_size will cause AUX area sampling to be configured, irrespective of the --aux-sample option. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf record: Add support for AUX area sampling	Adrian Hunter
	Add a 'perf record' option '--aux-sample' to request AUX area sampling. AUX area sampling uses an overwriting buffer much like snapshot mode, so adjust the AUX buffer mmapping accordingly. To make it easy to queue samples for decoding, synthesize an ID index. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf auxtrace: Add support for AUX area sample recording	Adrian Hunter
	Add support for parsing and validating AUX area sample options. At present, the only option is the sample size, but it is also necessary to ensure that events are in a group with an AUX area event as the leader. Committer note: Add missing 'static inline' in front of auxtrace_parse_sample_options() for when we don't HAVE_AUXTRACE_SUPPORT. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf auxtrace: Move perf_evsel__find_pmu()	Adrian Hunter
	Move perf_evsel__find_pmu() so it can be used without forward declaration. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-22	perf record: Add a function to test for kernel support for AUX area sampling	Adrian Hunter
	Architectures are expected to know if AUX area sampling is supported by the hardware. Add a function perf_can_aux_sample() which will determine whether the kernel supports it. Committer notes: I reported that this message was taking place on a kernel without the required bits: # perf record --aux-sample -e '{intel_pt//u,branch-misses:u}' Error: The sys_perf_event_open() syscall returned with 7 (Argument list too long) for event (branch-misses:u). /bin/dmesg \| grep -i perf may provide additional information. Adrian sent a patch addressing it, with this explanation: ---- perf_can_aux_sample_size() always returned true because it did not pass the attribute size to sys_perf_event_open, nor correctly check the return value and errno. ---- After applying it I get, later in the series, when --aux-sample is added: # perf record --aux-sample -e '{intel_pt//u,branch-misses:u}' AUX area sampling is not supported by kernel Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-21	tools: hv: add vmbus testing tool	Branden Bonaby
	This is a userspace tool to drive the testing. Currently it supports introducing user specified delay in the host to guest communication path on a per-channel basis. Signed-off-by: Branden Bonaby <brandonbonaby94@gmail.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2019-11-21	selftests/x86/sigreturn/32: Invalidate DS and ES when abusing the kernel	Andy Lutomirski
	If the kernel accidentally uses DS or ES while the user values are loaded, it will work fine for sane userspace. In the interest of simulating maximally insane userspace, make sigreturn_32 zero out DS and ES for the nasty parts so that inadvertent use of these segments will crash. Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@kernel.org
2019-11-21	selftests/x86/mov_ss_trap: Fix the SYSENTER test	Andy Lutomirski
	For reasons that I haven't quite fully diagnosed, running mov_ss_trap_32 on a 32-bit kernel results in an infinite loop in userspace. This appears to be because the hacky SYSENTER test doesn't segfault as desired; instead it corrupts the program state such that it infinite loops. Fix it by explicitly clearing EBP before doing SYSENTER. This will give a more reliable segfault. Fixes: 59c2a7226fc5 ("x86/selftests: Add mov_to_ss test") Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: stable@kernel.org
2019-11-21	Merge tag 'gpio-v5.4-5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "A last set of small fixes for GPIO, this cycle was quite busy. - Fix debounce delays on the MAX77620 GPIO expander - Use the correct unit for debounce times on the BD70528 GPIO expander - Get proper deps for parallel builds of the GPIO tools - Add a specific ACPI quirk for the Terra Pad 1061" * tag 'gpio-v5.4-5' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpiolib: acpi: Add Terra Pad 1061 to the run_edge_events_on_boot_blacklist tools: gpio: Correctly add make dependencies for gpio_utils gpio: bd70528: Use correct unit for debounce times gpio: max77620: Fixup debounce delays
2019-11-21	perf tools: Add kernel AUX area sampling definitions	Adrian Hunter
	Add kernel AUX area sampling definitions, which brings perf_event.h into line with the kernel version. New sample type PERF_SAMPLE_AUX requests a sample of the AUX area buffer. New perf_event_attr member 'aux_sample_size' specifies the desired size of the sample. Also add support for parsing samples containing AUX area data i.e. PERF_SAMPLE_AUX. Committer notes: I squashed the first two patches in this series to avoid breaking automatic bisection, i.e. after applying only the original first patch in this series we would have: # perf test -v parsing 26: Sample parsing : --- start --- test child forked, pid 17018 sample format has changed, some new PERF_SAMPLE_ bit was introduced - test needs updating test child finished with -1 ---- end ---- Sample parsing: FAILED! # With the two paches combined: # perf test parsing 26: Sample parsing : Ok # Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: http://lore.kernel.org/lkml/20191115124225.5247-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-21	tools/power/x86/intel-speed-select: Display TRL buckets for just base config ↵	Srinivas Pandruvada
	level When only base config level is present, this tool is displaying TRL (Turbo-ratio-limits) by reading legacy MSR. In this case, also present core count for TRL by reading MSR 0x1AE. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-11-21	tools/power/x86/intel-speed-select: Ignore missing config level	Srinivas Pandruvada
	It is possible that certain config levels are not available, even if the max level includes the level. There can be missing levels in some platforms. So ignore the level when called for information dump for all levels and fail if specifically ask for the missing level. Here the changes is to continue reading information about other levels even if we fail to get information for the current level. But use the "processed" flag to indicate the failure. When the "processed" flag is not set, don't dump information about that level. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2019-11-21	Merge branch 'kvm-tsx-ctrl' into HEAD	Paolo Bonzini
	Conflicts: arch/x86/kvm/vmx/vmx.c
2019-11-21	selftests/powerpc: spectre_v2 test must be built 64-bit	Michael Ellerman
	The spectre_v2 test must be built 64-bit, it includes hand-written asm that is 64-bit only, and segfaults if built 32-bit. Fixes: c790c3d2b0ec ("selftests/powerpc: Add a test of spectre_v2 mitigations") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20191120023924.13130-1-mpe@ellerman.id.au
2019-11-20	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next	David S. Miller
	Daniel Borkmann says: ==================== pull-request: bpf-next 2019-11-20 The following pull-request contains BPF updates for your net-next tree. We've added 81 non-merge commits during the last 17 day(s) which contain a total of 120 files changed, 4958 insertions(+), 1081 deletions(-). There are 3 trivial conflicts, resolve it by always taking the chunk from 196e8ca74886c433: <<<<<<< HEAD ======= void bpf_map_area_mmapable_alloc(u64 size, int numa_node); >>>>>>> 196e8ca74886c433dcfc64a809707074b936aaf5 <<<<<<< HEAD void bpf_map_area_alloc(u64 size, int numa_node) ======= static void __bpf_map_area_alloc(u64 size, int numa_node, bool mmapable) >>>>>>> 196e8ca74886c433dcfc64a809707074b936aaf5 <<<<<<< HEAD if (size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) { ======= / kmalloc()'ed memory can't be mmap()'ed */ if (!mmapable && size <= (PAGE_SIZE << PAGE_ALLOC_COSTLY_ORDER)) { >>>>>>> 196e8ca74886c433dcfc64a809707074b936aaf5 The main changes are: 1) Addition of BPF trampoline which works as a bridge between kernel functions, BPF programs and other BPF programs along with two new use cases: i) fentry/fexit BPF programs for tracing with practically zero overhead to call into BPF (as opposed to k[ret]probes) and ii) attachment of the former to networking related programs to see input/output of networking programs (covering xdpdump use case), from Alexei Starovoitov. 2) BPF array map mmap support and use in libbpf for global data maps; also a big batch of libbpf improvements, among others, support for reading bitfields in a relocatable manner (via libbpf's CO-RE helper API), from Andrii Nakryiko. 3) Extend s390x JIT with usage of relative long jumps and loads in order to lift the current 64/512k size limits on JITed BPF programs there, from Ilya Leoshkevich. 4) Add BPF audit support and emit messages upon successful prog load and unload in order to have a timeline of events, from Daniel Borkmann and Jiri Olsa. 5) Extension to libbpf and xdpsock sample programs to demo the shared umem mode (XDP_SHARED_UMEM) as well as RX-only and TX-only sockets, from Magnus Karlsson. 6) Several follow-up bug fixes for libbpf's auto-pinning code and a new API call named bpf_get_link_xdp_info() for retrieving the full set of prog IDs attached to XDP, from Toke Høiland-Jørgensen. 7) Add BTF support for array of int, array of struct and multidimensional arrays and enable it for skb->cb[] access in kfree_skb test, from Martin KaFai Lau. 8) Fix AF_XDP by using the correct number of channels from ethtool, from Luigi Rizzo. 9) Two fixes for BPF selftest to get rid of a hang in test_tc_tunnel and to avoid xdping to be run as standalone, from Jiri Benc. 10) Various BPF selftest fixes when run with latest LLVM trunk, from Yonghong Song. 11) Fix a memory leak in BPF fentry test run data, from Colin Ian King. 12) Various smaller misc cleanups and improvements mostly all over BPF selftests and samples, from Daniel T. Lee, Andre Guedes, Anders Roxell, Mao Wenan, Yue Haibing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-19	selftests/bpf: Enforce no-ALU32 for test_progs-no_alu32	Andrii Nakryiko
	With the most recent Clang, alu32 is enabled by default if -mcpu=probe or -mcpu=v3 is specified. Use a separate build rule with -mcpu=v2 to enforce no ALU32 mode. Suggested-by: Yonghong Song <yhs@fb.com> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20191120002510.4130605-1-andriin@fb.com
2019-11-19	libbpf: Fix call relocation offset calculation bug	Andrii Nakryiko
	When relocating subprogram call, libbpf doesn't take into account relo->text_off, which comes from symbol's value. This generally works fine for subprograms implemented as static functions, but breaks for global functions. Taking a simplified test_pkt_access.c as an example: __attribute__ ((noinline)) static int test_pkt_access_subprog1(volatile struct __sk_buff skb) { return skb->len 2; } __attribute__ ((noinline)) static int test_pkt_access_subprog2(int val, volatile struct __sk_buff skb) { return skb->len + val; } SEC("classifier/test_pkt_access") int test_pkt_access(struct __sk_buff skb) { if (test_pkt_access_subprog1(skb) != skb->len * 2) return TC_ACT_SHOT; if (test_pkt_access_subprog2(2, skb) != skb->len + 2) return TC_ACT_SHOT; return TC_ACT_UNSPEC; } When compiled, we get two relocations, pointing to '.text' symbol. .text has st_value set to 0 (it points to the beginning of .text section): 0000000000000008 000000050000000a R_BPF_64_32 0000000000000000 .text 0000000000000040 000000050000000a R_BPF_64_32 0000000000000000 .text test_pkt_access_subprog1 and test_pkt_access_subprog2 offsets (targets of two calls) are encoded within call instruction's imm32 part as -1 and 2, respectively: 0000000000000000 test_pkt_access_subprog1: 0: 61 10 00 00 00 00 00 00 r0 = (u32 )(r1 + 0) 1: 64 00 00 00 01 00 00 00 w0 <<= 1 2: 95 00 00 00 00 00 00 00 exit 0000000000000018 test_pkt_access_subprog2: 3: 61 10 00 00 00 00 00 00 r0 = (u32 )(r1 + 0) 4: 04 00 00 00 02 00 00 00 w0 += 2 5: 95 00 00 00 00 00 00 00 exit 0000000000000000 test_pkt_access: 0: bf 16 00 00 00 00 00 00 r6 = r1 ===> 1: 85 10 00 00 ff ff ff ff call -1 2: bc 01 00 00 00 00 00 00 w1 = w0 3: b4 00 00 00 02 00 00 00 w0 = 2 4: 61 62 00 00 00 00 00 00 r2 = (u32 )(r6 + 0) 5: 64 02 00 00 01 00 00 00 w2 <<= 1 6: 5e 21 08 00 00 00 00 00 if w1 != w2 goto +8 <LBB0_3> 7: bf 61 00 00 00 00 00 00 r1 = r6 ===> 8: 85 10 00 00 02 00 00 00 call 2 9: bc 01 00 00 00 00 00 00 w1 = w0 10: 61 62 00 00 00 00 00 00 r2 = (u32 )(r6 + 0) 11: 04 02 00 00 02 00 00 00 w2 += 2 12: b4 00 00 00 ff ff ff ff w0 = -1 13: 1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB0_3> 14: b4 00 00 00 02 00 00 00 w0 = 2 0000000000000078 LBB0_3: 15: 95 00 00 00 00 00 00 00 exit Now, if we compile example with global functions, the setup changes. Relocations are now against specifically test_pkt_access_subprog1 and test_pkt_access_subprog2 symbols, with test_pkt_access_subprog2 pointing 24 bytes into its respective section (.text), i.e., 3 instructions in: 0000000000000008 000000070000000a R_BPF_64_32 0000000000000000 test_pkt_access_subprog1 0000000000000048 000000080000000a R_BPF_64_32 0000000000000018 test_pkt_access_subprog2 Calls instructions now encode offsets relative to function symbols and are both set ot -1: 0000000000000000 test_pkt_access_subprog1: 0: 61 10 00 00 00 00 00 00 r0 = (u32 )(r1 + 0) 1: 64 00 00 00 01 00 00 00 w0 <<= 1 2: 95 00 00 00 00 00 00 00 exit 0000000000000018 test_pkt_access_subprog2: 3: 61 20 00 00 00 00 00 00 r0 = (u32 )(r2 + 0) 4: 0c 10 00 00 00 00 00 00 w0 += w1 5: 95 00 00 00 00 00 00 00 exit 0000000000000000 test_pkt_access: 0: bf 16 00 00 00 00 00 00 r6 = r1 ===> 1: 85 10 00 00 ff ff ff ff call -1 2: bc 01 00 00 00 00 00 00 w1 = w0 3: b4 00 00 00 02 00 00 00 w0 = 2 4: 61 62 00 00 00 00 00 00 r2 = (u32 )(r6 + 0) 5: 64 02 00 00 01 00 00 00 w2 <<= 1 6: 5e 21 09 00 00 00 00 00 if w1 != w2 goto +9 <LBB2_3> 7: b4 01 00 00 02 00 00 00 w1 = 2 8: bf 62 00 00 00 00 00 00 r2 = r6 ===> 9: 85 10 00 00 ff ff ff ff call -1 10: bc 01 00 00 00 00 00 00 w1 = w0 11: 61 62 00 00 00 00 00 00 r2 = (u32 )(r6 + 0) 12: 04 02 00 00 02 00 00 00 w2 += 2 13: b4 00 00 00 ff ff ff ff w0 = -1 14: 1e 21 01 00 00 00 00 00 if w1 == w2 goto +1 <LBB2_3> 15: b4 00 00 00 02 00 00 00 w0 = 2 0000000000000080 LBB2_3: 16: 95 00 00 00 00 00 00 00 exit Thus the right formula to calculate target call offset after relocation should take into account relocation's target symbol value (offset within section), call instruction's imm32 offset, and (subtracting, to get relative instruction offset) instruction index of call instruction itself. All that is shifted by number of instructions in main program, given all sub-programs are copied over after main program. Convert few selftests relying on bpf-to-bpf calls to use global functions instead of static ones. Fixes: 48cca7e44f9f ("libbpf: add support for bpf_call") Reported-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20191119224447.3781271-1-andriin@fb.com
2019-11-19	perf report: Jump to symbol source view from total cycles view	Jin Yao
	This patch supports jumping from tui total cycles view to symbol source view. For example, perf record -b ./div perf report --total-cycles In total cycles view, we can select one entry and press 'a' or press ENTER key to jump to symbol source view. This patch also sets sort_order to NULL in cmd_report() which will use the default branch sort order. The percent value in new annotate view will be consistent with the percent in annotate view switched from perf report (we observed the original percent gap with previous patches). v2: --- Fix the 'make NO_SLANG=1' error. (set __maybe_unused to annotation_opts in block_hists_tui_browse()). Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191118140849.20714-2-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-19	perf util: Move block TUI function to ui browsers	Jin Yao
	It would be nice if we could jump to the assembler/source view (like the normal perf report) from total cycles view. This patch moves the block_hists_tui_browse from block-info.c to ui/browsers/hists.c in order to reuse some browser codes (i.e do_annotate) for implementing new annotation view. v2: --- Fix the 'make NO_SLANG=1' error. (Change 'int block_hists_tui_browse()' to 'static inline int block_hists_tui_browse()') Signed-off-by: Jin Yao <yao.jin@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jin Yao <yao.jin@intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20191118140849.20714-1-yao.jin@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-19	perf session: Fix decompression of PERF_RECORD_COMPRESSED records	Alexey Budankov
	Avoid termination of trace loading in case the last record in the decompressed buffer partly resides in the following mmaped PERF_RECORD_COMPRESSED record. In this case NULL value returned by fetch_mmaped_event() means to proceed to the next mmaped record then decompress it and load compressed events. The issue can be reproduced like this: $ perf record -z -- some_long_running_workload $ perf report --stdio -vv decomp (B): 44519 to 163000 decomp (B): 48119 to 174800 decomp (B): 65527 to 131072 fetch_mmaped_event: head=0x1ffe0 event->header_size=0x28, mmap_size=0x20000: fuzzed perf.data? Error: failed to process sample ... Testing: 71: Zstd perf.data compression/decompression : Ok $ tools/perf/perf report -vv --stdio decomp (B): 59593 to 262160 decomp (B): 4438 to 16512 decomp (B): 285 to 880 Looking at the vmlinux_path (8 entries long) Using vmlinux for symbols decomp (B): 57474 to 261248 prefetch_event: head=0x3fc78 event->header_size=0x28, mmap_size=0x3fc80: fuzzed or compressed perf.data? decomp (B): 25 to 32 decomp (B): 52 to 120 ... Fixes: 57fc032ad643 ("perf session: Avoid infinite loop when seeing invalid header.size") Link: https://marc.info/?l=linux-kernel&m=156580812427554&w=2 Co-developed-by: Jiri Olsa <jolsa@kernel.org> Acked-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/cf782c34-f3f8-2f9f-d6ab-145cee0d5322@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-19	perf dso: Move dso_id from 'struct map' to 'struct dso'	Arnaldo Carvalho de Melo
	And take it into account when looking up DSOs when we have the dso_id fields obtained from somewhere, like from PERF_RECORD_MMAP2 records. Instances of struct map pointing to the same DSO pathname but with anything in dso_id different are in fact different DSOs, so better have different 'struct dso' instances to reflect that. At some point we may want to get copies of the contents of the different objects if we want to do correct annotation or other analysis. With this we get 'struct map' 24 bytes leaner: $ pahole -C map ~/bin/perf struct map { union { struct rb_node rb_node __attribute__((__aligned__(8))); /* 0 24 / struct list_head node; / 0 16 / } __attribute__((__aligned__(8))); / 0 24 / u64 start; / 24 8 / u64 end; / 32 8 / _Bool erange_warned:1; / 40: 0 1 / _Bool priv:1; / 40: 1 1 / / XXX 6 bits hole, try to pack / / XXX 3 bytes hole, try to pack / u32 prot; / 44 4 / u64 pgoff; / 48 8 / u64 reloc; / 56 8 / / --- cacheline 1 boundary (64 bytes) --- / u64 (map_ip)(struct map , u64); / 64 8 / u64 (unmap_ip)(struct map , u64); / 72 8 / struct dso dso; /* 80 8 / refcount_t refcnt; / 88 4 / u32 flags; / 92 4 / / size: 96, cachelines: 2, members: 13 / / sum members: 92, holes: 1, sum holes: 3 / / sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits / / forced alignments: 1 / / last cacheline: 32 bytes */ } __attribute__((__aligned__(8))); $ Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-g4hxxmraplo7wfjmk384mfsb@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-19	net-af_xdp: Use correct number of channels from ethtool	Luigi Rizzo
	Drivers use different fields to report the number of channels, so take the maximum of all data channels (rx, tx, combined) when determining the size of the xsk map. The current code used only 'combined' which was set to 0 in some drivers e.g. mlx4. Tested: compiled and run xdpsock -q 3 -r -S on mlx4 Signed-off-by: Luigi Rizzo <lrizzo@google.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/bpf/20191119001951.92930-1-lrizzo@google.com
2019-11-19	perf dsos: Remove unused dsos__find() method	Arnaldo Carvalho de Melo
	Not used anywhere, nuke it. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-teqz0eqcw43mnt7i3me44esw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-19	perf map: Move comparision of map's dso_id to a separate function	Arnaldo Carvalho de Melo
	We'll use it when doing DSO lookups using dso_ids. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-u2nr1oq03o0i29w2ay9jx03s@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-19	perf map: Pass a dso_id to map__new()	Arnaldo Carvalho de Melo
	Instead of the 4 fields, a step in the direction of moving this to struct dso. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-gp5s1xgxacurmih5d1l94ymy@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-19	perf map: Move maj/min/ino/ino_generation to separate struct	Arnaldo Carvalho de Melo
	And this patch highlights where these fields are being used: in the sort order where it uses it to compare maps and classify samples taking into account not just the DSO, but those DSO id fields. I think these should be used to differentiate DSOs with the same name but different 'struct dso_id' fields, i.e. these fields should move to 'struct dso' and then be used as part of the key when doing lookups for DSOs, in addition to the DSO name. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/n/tip-8v5isitqy0dup47nnwkpc80f@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-18	selftests: forwarding: Add speed and auto-negotiation test	Amit Cohen
	Check configurations and packets transference with different variations of autoneg and speed. Test plan: 1. Test force of same speed with autoneg off 2. Test force of different speeds with autoneg off (should fail) 3. One side is autoneg on and other side sets force of common speeds 4. One side is autoneg on and other side only advertises a subset of the common speeds (one speed of the subset) 5. One side is autoneg on and other side only advertises a subset of the common speeds. Check that highest speed is negotiated 6. Test autoneg on, but each side advertises different speeds (should fail) Signed-off-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18	selftests: forwarding: lib.sh: Add wait for dev with timeout	Amit Cohen
	Add a function that waits for device with maximum number of iterations. It enables to limit the waiting and prevent infinite loop. This will be used by the subsequent patch which will set two ports to different speeds in order to make sure they cannot negotiate a link. Waiting for all the setup is limited with 10 minutes for each device. Signed-off-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18	selftests: forwarding: Add ethtool_lib.sh	Amit Cohen
	Functions: 1. speeds_arr_get The function returns an array of speed values from /usr/include/linux/ethtool.h The array looks as follows: [10baseT/Half] = 0, [10baseT/Full] = 1, ... 2. ethtool_set: params: cmd The function runs ethtool by cmd (ethtool -s cmd) and checks if there was an error in configuration 3. dev_speeds_get: params: dev, with_mode (0 or 1), adver (0 or 1) return value: Array of supported/Advertised link modes with/without mode * Example 1: speeds_get swp1 0 0 return: 1000 10000 40000 * Example 2: speeds_get swp1 1 1 return: 1000baseKX/Full 10000baseKR/Full 40000baseCR4/Full 4. common_speeds_get: params: dev1, dev2, with_mode (0 or 1), adver (0 or 1) return value: Array of common speeds of dev1 and dev2 * Example: common_speeds_get swp1 swp2 0 0 return: 1000 10000 Assuming that swp1 supports 1000 10000 40000 and swp2 supports 1000 10000 Signed-off-by: Amit Cohen <amitc@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18	selftests: mlxsw: Check devlink device before running test	Danielle Ratson
	The scale test for Spectrum-2 should only be invoked for Spectrum-2. Skip the test otherwise. Signed-off-by: Danielle Ratson <danieller@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-11-18	selftests: mlxsw: Add router scale test for Spectrum-2	Danielle Ratson
	Same as for Spectrum-1, test the ability to add the maximum number of routes possible to the switch. Invoke the test from the 'resource_scale' wrapper script. Signed-off-by: Danielle Ratson <danieller@mellanox.com> Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>