linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2022-05-11	Merge remote-tracking branch 'torvalds/master' into perf/core	Arnaldo Carvalho de Melo
	Get fixes sent via perf/urgent, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-11	selftests: xsk: make stat tests not spin on getsockopt	Magnus Karlsson
	Convert the stats tests from spinning on the getsockopt to just check getsockopt once when the Rx thread has received all the packets. The actual completion of receiving the last packet forms a natural point in time when the receiver is ready to call the getsockopt to check the stats. In the previous version , we just span on the getsockopt until we received the right answer. This could be forever or just getting the "correct" answer by shear luck. The pacing_on variable can now be dropped since all test can now handle pacing properly. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-10-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-11	selftests: xsk: make the stats tests normal tests	Magnus Karlsson
	Make the stats tests look and feel just like normal tests instead of bunched under the umbrella of TEST_STATS. This means we will always run each of them even if one fails. Also gets rid of some special case code. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-9-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-11	selftests: xsk: introduce validation functions	Magnus Karlsson
	Introduce validation functions that can be optionally called by the Rx and Tx threads. These are then used to replace the Rx and Tx stats dispatchers. This so that we in the next commit can make the stats tests proper normal tests and not be some special case, as today. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-8-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-11	selftests: xsk: cleanup veth pair at ctrl-c	Magnus Karlsson
	Remove the veth pair when the tests are aborted by pressing ctrl-c. Currently in this situation, the veth pair is left on the system polluting the netdev space. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-7-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-11	selftests: xsk: add timeout to tests	Magnus Karlsson
	Add a timeout to the tests so that if all packets have not been received within 3 seconds, fail the ongoing test. Hinders a test from dead-locking if there is something wrong. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-6-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-11	selftests: xsk: fix reporting of failed tests	Magnus Karlsson
	Fix the reporting of failed tests as it was broken in several ways. First, a failed test was reported as both failed and passed messing up the count. Second, tests were not aborted after a failure and could generate more "failures" messing up the count even more. Third, the failure reporting from the application to the shell script was wrong. It always reported pass. And finally, the handling of the failures in the launch script was not correct. Correct all this by propagating the failure up through the function calls to a calling function that can abort the test. A receiver or sender thread will mark the new variable in the test spec called fail, if a test has failed. This is then picked up by the main thread when everyone else has exited and this is then marked and propagated up to the calling script. Also add a summary function in the calling script so that a user does not have to go through the sub tests to see if something has failed. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-5-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-11	selftests: xsk: run all tests for busy-poll	Magnus Karlsson
	Execute all xsk selftests for busy-poll mode too. Currently they were only run for the standard interrupt driven softirq mode. Replace the unused option queue-id with the new option busy-poll. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-4-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-11	selftests: xsk: do not send zero-length packets	Magnus Karlsson
	Do not try to send packets of zero length since they are dropped by veth after commit 726e2c5929de84 ("veth: Ensure eth header is in skb's linear part"). Replace these two packets with packets of length 60 so that they are not dropped. Also clean up the confusing naming. MIN_PKT_SIZE was really MIN_ETH_PKT_SIZE and PKT_SIZE was both MIN_ETH_SIZE and the default packet size called just PKT_SIZE. Make it consistent by using the right define in the right place. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-3-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-11	selftests: xsk: cleanup bash scripts	Magnus Karlsson
	Remove the spec-file that is not used any longer from the shell scripts. Also remove an unused option. Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com> Link: https://lore.kernel.org/r/20220510115604.8717-2-magnus.karlsson@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-11	perf/ibs: Fix comment	Ravi Bangoria
	s/IBS Op Data 2/IBS Op Data 1/ for MSR 0xc0011035. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20220509044914.1473-9-ravi.bangoria@amd.com
2022-05-11	libbpf: Add bpf_program__set_insns function	Jiri Olsa
	Adding bpf_program__set_insns that allows to set new instructions for a BPF program. This is a very advanced libbpf API and users need to know what they are doing. This should be used from prog_prepare_load_fn callback only. We can have changed instructions after calling prog_prepare_load_fn callback, reloading them. One of the users of this new API will be perf's internal BPF prologue generation. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510074659.2557731-2-jolsa@kernel.org
2022-05-11	libbpf: Clean up ringbuf size adjustment implementation	Andrii Nakryiko
	Drop unused iteration variable, move overflow prevention check into the for loop. Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary") Reported-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220510185159.754299-1-andrii@kernel.org
2022-05-10	selftest/bpf: The test cases of BPF cookie for fentry/fexit/fmod_ret/lsm.	Kui-Feng Lee
	Make sure BPF cookies are correct for fentry/fexit/fmod_ret/lsm. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-6-kuifeng@fb.com
2022-05-10	libbpf: Assign cookies to links in libbpf.	Kui-Feng Lee
	Add a cookie field to the attributes of bpf_link_create(). Add bpf_program__attach_trace_opts() to attach a cookie to a link. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-5-kuifeng@fb.com
2022-05-10	bpf, x86: Attach a cookie to fentry/fexit/fmod_ret/lsm.	Kui-Feng Lee
	Pass a cookie along with BPF_LINK_CREATE requests. Add a bpf_cookie field to struct bpf_tracing_link to attach a cookie. The cookie of a bpf_tracing_link is available by calling bpf_get_attach_cookie when running the BPF program of the attached link. The value of a cookie will be set at bpf_tramp_run_ctx by the trampoline of the link. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-4-kuifeng@fb.com
2022-05-10	bpf, x86: Generate trampolines from bpf_tramp_links	Kui-Feng Lee
	Replace struct bpf_tramp_progs with struct bpf_tramp_links to collect struct bpf_tramp_link(s) for a trampoline. struct bpf_tramp_link extends bpf_link to act as a linked list node. arch_prepare_bpf_trampoline() accepts a struct bpf_tramp_links to collects all bpf_tramp_link(s) that a trampoline should call. Change BPF trampoline and bpf_struct_ops to pass bpf_tramp_links instead of bpf_tramp_progs. Signed-off-by: Kui-Feng Lee <kuifeng@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220510205923.3206889-2-kuifeng@fb.com
2022-05-10	selftests/bpf: Add attach bench test	Jiri Olsa
	Adding test that reads all functions from ftrace available_filter_functions file and attach them all through kprobe_multi API. It also prints stats info with -v option, like on my setup: test_bench_attach: found 48712 functions test_bench_attach: attached in 1.069s test_bench_attach: detached in 0.373s Acked-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20220510122616.2652285-6-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-10	selftests/bpf: Add bpf link iter test	Dmitrii Dolgov
	Add a simple test for bpf link iterator Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Link: https://lore.kernel.org/r/20220510155233.9815-5-9erthalion6@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-10	selftests/bpf: Use ASSERT_* instead of CHECK	Dmitrii Dolgov
	Replace usage of CHECK with a corresponding ASSERT_* macro for bpf_iter tests. Only done if the final result is equivalent, no changes when replacement means loosing some information, e.g. from formatting string. Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Link: https://lore.kernel.org/r/20220510155233.9815-4-9erthalion6@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-10	selftests/bpf: Fix result check for test_bpf_hash_map	Dmitrii Dolgov
	The original condition looks like a typo, verify the skeleton loading result instead. Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com> Link: https://lore.kernel.org/r/20220510155233.9815-3-9erthalion6@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-10	selftests/bpf: Replace bpf_trace_printk in tunnel kernel code	Kaixi Fan
	Replace bpf_trace_printk with bpf_printk in test_tunnel_kern.c. function bpf_printk is more easier and useful than bpf_trace_printk. Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com> Link: https://lore.kernel.org/r/20220430074844.69214-4-fankaixi.li@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-10	selftests/bpf: Move vxlan tunnel testcases to test_progs	Kaixi Fan
	Move vxlan tunnel testcases from test_tunnel.sh to test_progs. And add vxlan tunnel source testcases also. Other tunnel testcases will be moved to test_progs step by step in the future. Rename bpf program section name as SEC("tc") because test_progs bpf loader could not load sections with name SEC("gre_set_tunnel"). Because of this, add bpftool to load bpf programs in test_tunnel.sh. Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com> Link: https://lore.kernel.org/r/20220430074844.69214-3-fankaixi.li@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-10	bpf: Add source ip in "struct bpf_tunnel_key"	Kaixi Fan
	Add tunnel source ip field in "struct bpf_tunnel_key". Add related code to set and get tunnel source field. Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com> Link: https://lore.kernel.org/r/20220430074844.69214-2-fankaixi.li@bytedance.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-05-10	bpftool: bpf_link_get_from_fd support for LSM programs in lskel	KP Singh
	bpf_link_get_from_fd currently returns a NULL fd for LSM programs. LSM programs are similar to tracing programs and can also use skel_raw_tracepoint_open. Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20220509214905.3754984-1-kpsingh@kernel.org
2022-05-10	perf annotate: Add --percent-limit option	Namhyung Kim
	Like in 'perf report' and 'perf top', Add this option to limit the number of functions it displays based on the overhead value in percent. This affects only stdio and stdio2 output modes. Without this, it shows very long disassembly lines for every function in the data file. If users don't want this behavior, they can set a value in percent to suppress that. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220502232015.697243-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10	selftests/bpf: Handle batch operations for map-in-map bpf-maps	Takshak Chahande
	This patch adds up test cases that handles 4 combinations: a) outer map: BPF_MAP_TYPE_ARRAY_OF_MAPS inner maps: BPF_MAP_TYPE_ARRAY and BPF_MAP_TYPE_HASH b) outer map: BPF_MAP_TYPE_HASH_OF_MAPS inner maps: BPF_MAP_TYPE_ARRAY and BPF_MAP_TYPE_HASH Signed-off-by: Takshak Chahande <ctakshak@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20220510082221.2390540-2-ctakshak@fb.com
2022-05-10	perf auxtrace: Record whether an auxtrace mmap is needed	Adrian Hunter
	Add a flag needs_auxtrace_mmap to record whether an auxtrace mmap is needed, in preparation for correctly determining whether or not an auxtrace mmap is needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10	libperf evlist: Add evsel as a parameter to ->idx()	Adrian Hunter
	Add evsel as a parameter to ->idx() in preparation for correctly determining whether an auxtrace mmap is needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10	libperf evlist: Move ->idx() into mmap_per_evsel()	Adrian Hunter
	Move ->idx() into mmap_per_evsel() in preparation for adding evsel as a parameter. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10	libperf evlist: Remove ->idx() per_cpu parameter	Adrian Hunter
	Remove ->idx() per_cpu parameter because it isn't needed. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10	perf auxtrace: Do not mix up mmap idx	Adrian Hunter
	The idx is with respect to evlist not evsel. That hasn't mattered because they are the same at present. Prepare for that not being the case, which it won't be when sideband tracking events are allowed on all CPUs even when auxtrace is limited to selected CPUs. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10	perf auxtrace: Move evlist__enable_event_idx() to auxtrace.c	Adrian Hunter
	evlist__enable_event_idx() is used only by auxtrace. Move it to auxtrace.c in preparation for making it even more auxtrace specific. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10	perf evlist: Use libperf functions in evlist__enable_event_idx()	Adrian Hunter
	evlist__enable_event_idx() is used only for auxtrace events which are never system_wide. Simplify by using libperf enable event functions. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-10	libperf evsel: Add perf_evsel__enable_thread()	Adrian Hunter
	Add perf_evsel__enable_thread() as a counterpart to perf_evsel__enable_cpu(), to enable all events for a thread. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Link: http://lore.kernel.org/lkml/20220506122601.367589-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-05-09	selftests: clarify common error when running gup_test	Joel Savitz
	The gup_test binary will fail showing only the output of perror("open") in the case that /sys/kernel/debug/gup_test is not found. This will almost always be due to CONFIG_GUP_TEST not being set, which enables compilation of a kernel that provides this file. Add a short error message to clarify this failure and point the user to the solution. Link: https://lkml.kernel.org/r/20220502224942.995427-1-jsavitz@redhat.com Signed-off-by: Joel Savitz <jsavitz@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Nico Pache <npache@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09	mm/page-flags: reuse PG_mappedtodisk as PG_anon_exclusive for PageAnon() pages	David Hildenbrand
	The basic question we would like to have a reliable and efficient answer to is: is this anonymous page exclusive to a single process or might it be shared? We need that information for ordinary/single pages, hugetlb pages, and possibly each subpage of a THP. Introduce a way to mark an anonymous page as exclusive, with the ultimate goal of teaching our COW logic to not do "wrong COWs", whereby GUP pins lose consistency with the pages mapped into the page table, resulting in reported memory corruptions. Most pageflags already have semantics for anonymous pages, however, PG_mappedtodisk should never apply to pages in the swapcache, so let's reuse that flag. As PG_has_hwpoisoned also uses that flag on the second tail page of a compound page, convert it to PG_error instead, which is marked as PF_NO_TAIL, so never used for tail pages. Use custom page flag modification functions such that we can do additional sanity checks. The semantics we'll put into some kernel doc in the future are: " PG_anon_exclusive is usually only expressive in combination with a page table entry. Depending on the page table entry type it might store the following information: Is what's mapped via this page table entry exclusive to the single process and can be mapped writable without further checks? If not, it might be shared and we might have to COW. For now, we only expect PTE-mapped THPs to make use of PG_anon_exclusive in subpages. For other anonymous compound folios (i.e., hugetlb), only the head page is logically mapped and holds this information. For example, an exclusive, PMD-mapped THP only has PG_anon_exclusive set on the head page. When replacing the PMD by a page table full of PTEs, PG_anon_exclusive, if set on the head page, will be set on all tail pages accordingly. Note that converting from a PTE-mapping to a PMD mapping using the same compound page is currently not possible and consequently doesn't require care. If GUP wants to take a reliable pin (FOLL_PIN) on an anonymous page, it should only pin if the relevant PG_anon_exclusive is set. In that case, the pin will be fully reliable and stay consistent with the pages mapped into the page table, as the bit cannot get cleared (e.g., by fork(), KSM) while the page is pinned. For anonymous pages that are mapped R/W, PG_anon_exclusive can be assumed to always be set because such pages cannot possibly be shared. The page table lock protecting the page table entry is the primary synchronization mechanism for PG_anon_exclusive; GUP-fast that does not take the PT lock needs special care when trying to clear the flag. Page table entry types and PG_anon_exclusive: * Present: PG_anon_exclusive applies. * Swap: the information is lost. PG_anon_exclusive was cleared. * Migration: the entry holds this information instead. PG_anon_exclusive was cleared. * Device private: PG_anon_exclusive applies. * Device exclusive: PG_anon_exclusive applies. * HW Poison: PG_anon_exclusive is stale and not changed. If the page may be pinned (FOLL_PIN), clearing PG_anon_exclusive is not allowed and the flag will stick around until the page is freed and folio->mapping is cleared. " We won't be clearing PG_anon_exclusive on destructive unmapping (i.e., zapping) of page table entries, page freeing code will handle that when also invalidate page->mapping to not indicate PageAnon() anymore. Letting information about exclusivity stick around will be an important property when adding sanity checks to unpinning code. Note that we properly clear the flag in free_pages_prepare() via PAGE_FLAGS_CHECK_AT_PREP for each individual subpage of a compound page, so there is no need to manually clear the flag. Link: https://lkml.kernel.org/r/20220428083441.37290-12-david@redhat.com Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Christoph Hellwig <hch@lst.de> Cc: David Rientjes <rientjes@google.com> Cc: Don Dutile <ddutile@redhat.com> Cc: Hugh Dickins <hughd@google.com> Cc: Jan Kara <jack@suse.cz> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Khalid Aziz <khalid.aziz@oracle.com> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Cc: Liang Zhang <zhangliang5@huawei.com> Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Mike Kravetz <mike.kravetz@oracle.com> Cc: Mike Rapoport <rppt@linux.ibm.com> Cc: Nadav Amit <namit@vmware.com> Cc: Oded Gabbay <oded.gabbay@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Roman Gushchin <guro@fb.com> Cc: Shakeel Butt <shakeelb@google.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09	bpftool: Declare generator name	Jason Wang
	Most code generators declare its name so did this for bfptool. Signed-off-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220509090247.5457-1-jasowang@redhat.com
2022-05-09	selftests: vm: Makefile: rename TARGETS to VMTARGETS	Joel Savitz
	The tools/testing/selftests/vm/Makefile uses the variable TARGETS internally to generate a list of platform-specific binary build targets suffixed with _{32,64}. When building the selftests using its own Makefile directly, such as via the following command run in a kernel tree: One receives an error such as the following: make: Entering directory '/root/linux/tools/testing/selftests' make --no-builtin-rules ARCH=x86 -C ../../.. headers_install make[1]: Entering directory '/root/linux' INSTALL ./usr/include make[1]: Leaving directory '/root/linux' make[1]: Entering directory '/root/linux/tools/testing/selftests/vm' make[1]: * No rule to make target 'vm.c', needed by '/root/linux/tools/testing/selftests/vm/vm_64'. Stop. make[1]: Leaving directory '/root/linux/tools/testing/selftests/vm' make: * [Makefile:175: all] Error 2 make: Leaving directory '/root/linux/tools/testing/selftests' The TARGETS variable passed to tools/testing/selftests/Makefile collides with the TARGETS used in tools/testing/selftests/vm/Makefile, so rename the latter to VMTARGETS, eliminating the collision with no functional change. Link: https://lkml.kernel.org/r/20220504213454.1282532-1-jsavitz@redhat.com Fixes: f21fda8f6453 ("selftests: vm: pkeys: fix multilib builds for x86") Signed-off-by: Joel Savitz <jsavitz@redhat.com> Acked-by: Nico Pache <npache@redhat.com> Cc: Joel Savitz <jsavitz@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sandipan Das <sandipan@linux.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-05-09	bpftool: Output message if no helpers found in feature probing	Milan Landaverde
	Currently in libbpf, we have hardcoded program types that are not supported for helper function probing (e.g. tracing, ext, lsm). Due to this (and other legitimate failures), bpftool feature probe returns empty for those program type helper functions. Instead of implying to the user that there are no helper functions available for a program type, we output a message to the user explaining that helper function probing failed for that program type. Signed-off-by: Milan Landaverde <milan@mdaverde.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220504161356.3497972-3-milan@mdaverde.com
2022-05-09	bpftool: Adjust for error codes from libbpf probes	Milan Landaverde
	Originally [1], libbpf's (now deprecated) probe functions returned a bool to acknowledge support but the new APIs return an int with a possible negative error code to reflect probe failure. This change decides for bpftool to declare maps and helpers are not available on probe failures. [1]: https://lore.kernel.org/bpf/20220202225916.3313522-3-andrii@kernel.org/ Signed-off-by: Milan Landaverde <milan@mdaverde.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20220504161356.3497972-2-milan@mdaverde.com
2022-05-09	selftests/bpf: Test libbpf's ringbuf size fix up logic	Andrii Nakryiko
	Make sure we always excercise libbpf's ringbuf map size adjustment logic by specifying non-zero size that's definitely not a page size multiple. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-10-andrii@kernel.org
2022-05-09	libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary	Andrii Nakryiko
	Kernel imposes a pretty particular restriction on ringbuf map size. It has to be a power-of-2 multiple of page size. While generally this isn't hard for user to satisfy, sometimes it's impossible to do this declaratively in BPF source code or just plain inconvenient to do at runtime. One such example might be BPF libraries that are supposed to work on different architectures, which might not agree on what the common page size is. Let libbpf find the right size for user instead, if it turns out to not satisfy kernel requirements. If user didn't set size at all, that's most probably a mistake so don't upsize such zero size to one full page, though. Also we need to be careful about not overflowing __u32 max_entries. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-9-andrii@kernel.org
2022-05-09	libbpf: Provide barrier() and barrier_var() in bpf_helpers.h	Andrii Nakryiko
	Add barrier() and barrier_var() macros into bpf_helpers.h to be used by end users. While a bit advanced and specialized instruments, they are sometimes indispensable. Instead of requiring each user to figure out exact asm volatile incantations for themselves, provide them from bpf_helpers.h. Also remove conflicting definitions from selftests. Some tests rely on barrier_var() definition being nothing, those will still work as libbpf does the #ifndef/#endif guarding for barrier() and barrier_var(), allowing users to redefine them, if necessary. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-8-andrii@kernel.org
2022-05-09	selftests/bpf: Add bpf_core_field_offset() tests	Andrii Nakryiko
	Add test cases for bpf_core_field_offset() helper. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-7-andrii@kernel.org
2022-05-09	libbpf: Complete field-based CO-RE helpers with field offset helper	Andrii Nakryiko
	Add bpf_core_field_offset() helper to complete field-based CO-RE helpers. This helper can be useful for feature-detection and for some more advanced cases of field reading (e.g., reading flexible array members). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-6-andrii@kernel.org
2022-05-09	selftests/bpf: Use both syntaxes for field-based CO-RE helpers	Andrii Nakryiko
	Excercise both supported forms of bpf_core_field_exists() and bpf_core_field_size() helpers: variable-based field reference and type/field name-based one. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-5-andrii@kernel.org
2022-05-09	libbpf: Improve usability of field-based CO-RE helpers	Andrii Nakryiko
	Allow to specify field reference in two ways: - if user has variable of necessary type, they can use variable-based reference (my_var.my_field or my_var_ptr->my_field). This was the only supported syntax up till now. - now, bpf_core_field_exists() and bpf_core_field_size() support also specifying field in a fashion similar to offsetof() macro, by specifying type of the containing struct/union separately and field name separately: bpf_core_field_exists(struct my_type, my_field). This forms is quite often more convenient in practice and it matches type-based CO-RE helpers that support specifying type by its name without requiring any variables. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-4-andrii@kernel.org
2022-05-09	libbpf: Make __kptr and __kptr_ref unconditionally use btf_type_tag() attr	Andrii Nakryiko
	It will be annoying and surprising for users of __kptr and __kptr_ref if libbpf silently ignores them just because Clang used for compilation didn't support btf_type_tag(). It's much better to get clear compiler error than debug BPF verifier failures later on. Fixes: ef89654f2bc7 ("libbpf: Add kptr type tag macros to bpf_helpers.h") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-3-andrii@kernel.org
2022-05-09	selftests/bpf: Prevent skeleton generation race	Andrii Nakryiko
	Prevent "classic" and light skeleton generation rules from stomping on each other's toes due to the use of the same <obj>.linked{1,2,3}.o naming pattern. There is no coordination and synchronizataion between .skel.h and .lskel.h rules, so they can easily overwrite each other's intermediate object files, leading to errors like: /bin/sh: line 1: 170928 Bus error (core dumped) /data/users/andriin/linux/tools/testing/selftests/bpf/tools/sbin/bpftool gen skeleton /data/users/andriin/linux/tools/testing/selftests/bpf/test_ksyms_weak.linked3.o name test_ksyms_weak > /data/users/andriin/linux/tools/testing/selftests/bpf/test_ksyms_weak.skel.h make: * [Makefile:507: /data/users/andriin/linux/tools/testing/selftests/bpf/test_ksyms_weak.skel.h] Error 135 make: * Deleting file '/data/users/andriin/linux/tools/testing/selftests/bpf/test_ksyms_weak.skel.h' Fix by using different suffix for light skeleton rule. Fixes: c48e51c8b07a ("bpf: selftests: Add selftests for module kfunc support") Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20220509004148.1801791-2-andrii@kernel.org