|
The AF_XDP userspace part of xdp_hw_metadata sees a non-zero value as a signal
that rx_timestamp and rx_hash are available in the data_meta area. The
kernel-side BPF-prog code doesn't initialize these members when the kernel
returns an error, e.g. -EOPNOTSUPP. This memory area is not guaranteed to
be zeroed and can contain garbage/previous values, which will then be read
and interpreted by the AF_XDP userspace side.
Tested this on different drivers. The experience is that for most packets
this data_meta area is zeroed, but occasionally it contains garbage data.
Example of failure tested on ixgbe:
poll: 1 (0)
xsk_ring_cons__peek: 1
0x18ec788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000
rx_hash: 3697961069
rx_timestamp: 9024981991734834796 (sec:9024981991.7348)
0x18ec788: complete idx=8 addr=8000
Converting to date:
date -d @9024981991
2255-12-28T20:26:31 CET
I chose a simple fix in this patch: when the kfunc fails or isn't supported,
assign zero to the corresponding struct meta value.
It's up to the individual BPF programmer to do something smarter that fits
their use-case, e.g. getting a software timestamp and setting a flag that
gives the type of timestamp.
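A minimal sketch of that approach on the BPF-prog side, assuming a data_meta
layout with rx_timestamp and rx_hash members like xdp_hw_metadata's
(illustrative, not the exact patch):

err = bpf_xdp_metadata_rx_timestamp(ctx, &meta->rx_timestamp);
if (err)
	meta->rx_timestamp = 0;	/* zero signals "not available" to userspace */

err = bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash);
if (err)
	meta->rx_hash = 0;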
Fixes: 297a3f124155 ("selftests/bpf: Simple program to dump XDP RX metadata")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/167527271027.937063.5177725618616476592.stgit@firesoul
|
|
kfuncs are allowed to be static, or not use one or more of their
arguments. For example, bpf_xdp_metadata_rx_hash() in net/core/xdp.c is
meant to be implemented by drivers, with the default implementation just
returning -EOPNOTSUPP. As described in [0], such kfuncs can have their
arguments elided, which can cause BTF encoding to be skipped. The new
__bpf_kfunc macro should address this, and this patch adds a selftest
which verifies that a static kfunc with at least one unused argument can
still be encoded and invoked by a BPF program.
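A hedged sketch of such a kfunc; the name and argument list are illustrative,
not the selftest's actual function:

/* Static kfunc that ignores its second argument. The __bpf_kfunc annotation
 * keeps the compiler from eliding the function or its arguments, so BTF
 * encoding still works.
 */
__bpf_kfunc static u32 bpf_kfunc_call_test_static_unused_arg(u32 used, u32 unused)
{
	return used;
}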
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20230201173016.342758-5-void@manifault.com
|
|
sk_assign is failing on an s390x machine running Debian "bookworm" for
two reasons: a legacy server_map definition and an uninitialized addrlen in
the recvfrom() call.
Fix by adding a new-style server_map definition and dropping addrlen
(recvfrom() allows NULL values for src_addr and addrlen).
Since the test should support tc built without libbpf, build the prog
twice: with the old-style definition and with the new-style definition,
then select the right one at runtime. This could be done at compile
time too, but this would not be cross-compilation friendly.
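For reference, a BTF-defined ("new-style") server_map declaration along the
lines described above could look like this (key/value types here are
assumptions for illustration):

struct {
	__uint(type, BPF_MAP_TYPE_SOCKMAP);
	__uint(max_entries, 1);
	__type(key, int);
	__type(value, __u64);
} server_map SEC(".maps");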
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230129190501.1624747-2-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Use bpf_probe_read_kernel() and bpf_probe_read_kernel_str() instead
of bpf_probe_read() and bpf_probe_read_str().
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-21-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Use the correct datatype for the 'values' map's values; currently the test
works by accident, since on little-endian machines it is sometimes
acceptable to access a u64 as a u32.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-20-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Use a syscall macro to access nanosleep()'s first argument;
currently the code uses gprs[2] instead of orig_gpr2.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-18-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
s390x cache line size is 256 bytes, so skb_shared_info must be aligned
on a much larger boundary than for x86.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-17-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Use syscall macros to access the setdomainname() arguments; currently
the code uses gprs[2] instead of orig_gpr2 for the first argument.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-16-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
s390x ABI requires the caller to zero- or sign-extend the arguments.
eBPF already deals with zero-extension (by definition of its ABI), but
not with sign-extension.
Add a test to cover that potentially problematic area.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-15-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Use bpf_probe_read_kernel() instead of bpf_probe_read(), which is not
defined on all architectures.
While at it, improve the error handling: do not hide the verifier log,
and check the return values of bpf_probe_read_kernel() and
bpf_copy_from_user().
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20230128000650.1516334-10-iii@linux.ibm.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Ensure that whenever bpf_setsockopt() is called with a SOL_TCP-level
option on a ktls-enabled socket, the call is accepted by the
system. The provided test makes sure of this by performing the check
while the server-side socket is in the CLOSE_WAIT state. At
this stage, ktls is still enabled on the server socket and can be used
to test whether bpf_setsockopt() works correctly with Linux.
Signed-off-by: Kui-Feng Lee <kuifeng@meta.com>
Link: https://lore.kernel.org/r/20230125201608.908230-3-kuifeng@meta.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
In a set of prior changes, we added the ability for struct_ops programs
to be sleepable. This patch enhances the dummy_st_ops selftest suite to
validate this behavior by adding a new sleepable struct_ops entry to
dummy_st_ops.
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230125164735.785732-5-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
A recent patch added a new set of kfuncs for allocating, freeing,
manipulating, and querying cpumasks. This patch adds a new 'cpumask'
selftest suite which verifies their behavior.
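As a rough illustration of what such a suite exercises (kfunc names come from
the series description; the exact signatures and program shape are
assumptions):

struct bpf_cpumask *bpf_cpumask_create(void) __ksym;
void bpf_cpumask_set_cpu(u32 cpu, struct bpf_cpumask *cpumask) __ksym;
void bpf_cpumask_release(struct bpf_cpumask *cpumask) __ksym;

SEC("tp_btf/task_newtask")
int BPF_PROG(cpumask_alloc_and_free, struct task_struct *task, u64 clone_flags)
{
	struct bpf_cpumask *mask;

	mask = bpf_cpumask_create();
	if (!mask)
		return 0;
	bpf_cpumask_set_cpu(0, mask);
	bpf_cpumask_release(mask);	/* acquired cpumasks must be released */
	return 0;
}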
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230125143816.721952-5-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Now that defining trusted fields in a struct is supported, we should add
selftests to verify the behavior. This patch adds a few such testcases.
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230125143816.721952-4-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
KF_TRUSTED_ARGS kfuncs currently have a subtle and insidious bug in
validating pointers to scalars. Say that you have a kfunc like the
following, which takes an array as the first argument:
bool bpf_cpumask_empty(const struct cpumask *cpumask)
{
	return cpumask_empty(cpumask);
}
...
BTF_ID_FLAGS(func, bpf_cpumask_empty, KF_TRUSTED_ARGS)
...
If a BPF program were to invoke the kfunc with a NULL argument, it would
crash the kernel. The reason is that struct cpumask is defined as a
bitmap, which is itself defined as an array, and is accessed as a memory
address by bitmap operations. So when the verifier analyzes the
register, it interprets it as a pointer to a scalar struct, which is an
array of size 8. check_mem_reg() then sees that the register is NULL and
returns 0, and the kernel crashes when the kfunc passes the NULL pointer
down to the cpumask wrappers.
To fix this, this patch adds a check for KF_ARG_PTR_TO_MEM which
verifies that the register doesn't contain a possibly-NULL pointer if
the kfunc is KF_TRUSTED_ARGS.
Signed-off-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/r/20230125143816.721952-2-void@manifault.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
It turns out splice() is one of the syscalls that uses the current maximum
number of arguments (six). This is perfect for testing, so extend the
bpf_syscall_macro selftest to also trace the splice() syscall, using the
BPF_KSYSCALL() macro. This makes sure all the syscall argument register
definitions are correct.
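A minimal sketch of what tracing splice() with BPF_KSYSCALL() can look like
(handler name and body are illustrative):

SEC("ksyscall/splice")
int BPF_KSYSCALL(splice_enter, int fd_in, loff_t *off_in, int fd_out,
		 loff_t *off_out, size_t len, unsigned int flags)
{
	/* all six arguments are recovered via the arch-specific
	 * syscall argument register definitions
	 */
	bpf_printk("splice: fd_in=%d fd_out=%d", fd_in, fd_out);
	return 0;
}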
Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com> # arm64
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x
Link: https://lore.kernel.org/bpf/20230120200914.3008030-25-andrii@kernel.org
|
|
Update the uprobe_autoattach selftest to validate architecture-specific
argument passing through registers. Use the new BPF_UPROBE and
BPF_URETPROBE macros, and construct both the BPF side and the user-space
side in such a way that for different architectures we fetch and check a
different number of arguments, matching the architecture-specific limit on
how many registers are available for argument passing.
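For illustration, a BPF_UPROBE handler fetching a couple of register-passed
arguments could look like this (target function and argument names are
hypothetical):

SEC("uprobe")
int BPF_UPROBE(handle_target_func, int a, long b)
{
	bpf_printk("arg1=%d arg2=%ld", a, b);
	return 0;
}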
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Alan Maguire <alan.maguire@oracle.com> # arm64
Tested-by: Ilya Leoshkevich <iii@linux.ibm.com> # s390x
Link: https://lore.kernel.org/bpf/20230120200914.3008030-12-andrii@kernel.org
|
|
In commit 537c3f66eac1 ("selftests/bpf: add generic BPF program tester-loader"),
a new mechanism was added to the BPF selftest framework to allow testsuites to
use macros to define expected failing testcases.
This allows any testsuite which tests verification failure to remove a good
amount of boilerplate code. This patch updates the task_kfunc selftest suite
to use these new macros.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20230120021844.3048244-1-void@manifault.com
|
|
To be used for verification of driver implementations. Note that
the skb path is gone from the series, but I'm still keeping the
implementation for any possible future work.
$ xdp_hw_metadata <ifname>
On the other machine:
$ echo -n xdp | nc -u -q1 <target> 9091 # for AF_XDP
$ echo -n skb | nc -u -q1 <target> 9092 # for skb
Sample output:
# xdp
xsk_ring_cons__peek: 1
0x19f9090: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000
rx_timestamp_supported: 1
rx_timestamp: 1667850075063948829
0x19f9090: complete idx=8 addr=8000
# skb
found skb hwtstamp = 1668314052.854274681
Decoding:
# xdp
rx_timestamp=1667850075.063948829
$ date -d @1667850075
Mon Nov 7 11:41:15 AM PST 2022
$ date
Mon Nov 7 11:42:05 AM PST 2022
# skb
$ date -d @1668314052
Sat Nov 12 08:34:12 PM PST 2022
$ date
Sat Nov 12 08:37:06 PM PST 2022
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-18-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
- create new netns
- create veth pair (veTX+veRX)
- setup AF_XDP socket for both interfaces
- attach bpf to veRX
- send packet via veTX
- verify the packet has expected metadata at veRX
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/r/20230119221536.3349901-12-sdf@google.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
First, test that we allow overwriting dynptr slots and reinitializing
them in the unreferenced case, and disallow overwriting in the referenced
case. Include tests to ensure slices obtained from destroyed dynptrs are
invalidated on their destruction. The destruction needs to be scoped, as
in: slices of dynptr A should not be invalidated when dynptr B is
destroyed. Next, test that MEM_UNINIT doesn't allow writing to dynptr stack
slots.
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-13-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Try creating a dynptr, then overwriting the second slot with the first slot
of another dynptr. Then, the first slot of the first dynptr should also be
invalidated, but without our fix that does not happen. As a consequence,
the unfixed case allows passing the first dynptr (as the kernel check only
checks the slot_type and then first_slot == true).
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-12-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Ensure that variable offset is handled correctly and that the verifier takes
both the fixed and variable parts into account. Also ensure that only a
constant var_off is allowed.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-11-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add verifier tests that verify the new pruning behavior for STACK_DYNPTR
slots, and ensure that state equivalence takes into account changes to
the old and current verifier state correctly. Also ensure that the
stacksafe changes are actually enabling pruning in case states are
equivalent from pruning PoV.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-10-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
A set of macros useful for writing naked BPF functions using inline
assembly. E.g. as follows:
struct map_struct {
	...
} map SEC(".maps");

SEC(...)
__naked int foo_test(void)
{
	asm volatile(
		"r0 = 0;"
		"*(u64*)(r10 - 8) = r0;"
		"r1 = %[map] ll;"
		"r2 = r10;"
		"r2 += -8;"
		"call %[bpf_map_lookup_elem];"
		"r0 = 0;"
		"exit;"
		:
		: __imm(bpf_map_lookup_elem),
		  __imm_addr(map)
		: __clobber_all);
}
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
[ Kartikeya: Add acks, include __clobber_common from Andrii ]
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-9-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The previous commit implemented destroy_if_dynptr_stack_slot. It
destroys the dynptr that a given spi belongs to, but still doesn't
invalidate the slices that belong to such a dynptr. While for a
referenced dynptr we don't allow the overwrite and return an error
early, for an unreferenced dynptr we still allow the overwrite and
destroy the dynptr.
To be able to enable precise and scoped invalidation of dynptr slices in
this case, we must be able to associate the source dynptr of slices that
have been obtained using bpf_dynptr_data. When doing destruction, only
slices belonging to the dynptr being destructed should be invalidated,
and nothing else. Currently, dynptr slices belonging to different
dynptrs are indistinguishable.
Hence, allocate a unique id to each dynptr (CONST_PTR_TO_DYNPTR and
those on stack). This will be stored as part of reg->id. Whenever using
bpf_dynptr_data, transfer this unique dynptr id to the returned
PTR_TO_MEM_OR_NULL slice pointer, and store it in a new per-PTR_TO_MEM
dynptr_id register state member.
Finally, after establishing such a relationship between dynptrs and
their slices, implement precise invalidation logic that only invalidates
slices belonging to the destroyed dynptr in destroy_if_dynptr_stack_slot.
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-5-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, while reads are disallowed for dynptr stack slots, writes are
not. Reads don't work from both direct access and helpers, while writes
do work in both cases, but have the effect of overwriting the slot_type.
While this is fine, handling for a few edge cases is missing. Firstly,
a user can partially overwrite the stack slots of a dynptr.
Consider the following layout:
spi: [d][d][?]
      2  1  0
The first slot is at spi 2, the second at spi 1.
Now, do a write of 1 to 8 bytes for spi 1.
This will essentially write either STACK_MISC for all slot_types, or
STACK_MISC and STACK_ZERO (in case of a size < BPF_REG_SIZE partial write
of zeroes). The end result is that the slot is scrubbed.
Now, the layout is:
spi: [d][m][?]
      2  1  0
Now suppose the user initializes spi 1 as a dynptr.
We get:
spi: [d][d][d]
      2  1  0
But this time, both spi 2 and spi 1 have first_slot = true.
Now, when passing spi 2 to a dynptr helper, it will be considered
initialized, as the helper does not check whether the second slot has
first_slot == false. And spi 1 already works as normal.
This effectively replaces the size + offset of the first dynptr, hence
allowing invalid OOB reads and writes.
Make a few changes to protect against this:
When writing to PTR_TO_STACK using BPF insns, when we touch spi of a
STACK_DYNPTR type, mark both first and second slot (regardless of which
slot we touch) as STACK_INVALID. Reads are already prevented.
Second, prevent writing to stack memory from helpers if the range may
contain any STACK_DYNPTR slots. Reads are already prevented.
For helpers, we cannot allow writes to destroy dynptrs, because depending
on its arguments a helper may take both uninit_mem and a dynptr at the
same time. This would mean the helper may write to the uninit_mem
before it reads the dynptr, which would be bad.
PTR_TO_MEM: [?????dd]
Depending on the code inside the helper, it may end up overwriting the
dynptr contents first and then reading them as the dynptr argument.
The verifier would only simulate the destruction when it does byte-by-byte
access simulation in check_helper_call for meta.access_size, and would
fail to catch this case, as it happens after the argument checks.
The same would need to be done for any other non-trivial objects created
on the stack in the future, such as bpf_list_head on stack, or
bpf_rb_root on stack.
A common misunderstanding in the current code is that MEM_UNINIT means
writes, but note that writes may also be performed even without
MEM_UNINIT in the case of helpers; in that case, the code after the
meta && meta->raw_mode handling will complain when it sees STACK_DYNPTR,
so the invalid-read case also covers writes to potential STACK_DYNPTR
slots. The only loophole was the meta->raw_mode case, which simulated
writes through instructions that could overwrite them.
A future series sequenced after this will focus on the clean up of
helper access checks and bugs around that.
Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-4-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, the dynptr function is not checking the variable offset part
of PTR_TO_STACK that it needs to check. The fixed offset is considered
when computing the stack pointer index, but if the variable offset was
not a constant (such that it could not be accumulated in reg->off), we
will end up with a discrepancy where the runtime pointer does not point
to the actual stack slot we mark as STACK_DYNPTR.
It is impossible to precisely track dynptr state when the variable offset
is not constant, hence, just like for bpf_timer, kptr, bpf_spin_lock, etc.,
simply reject the case where reg->var_off is not constant. Then,
consider both reg->off and reg->var_off.value when computing the stack
pointer index.
A new helper dynptr_get_spi is introduced to hide these details, since
the dynptr needs to be located in multiple places outside of the
process_dynptr_func checks; once we know it's a PTR_TO_STACK, we need to
enforce these checks in all of those places.
Note that it is disallowed for unprivileged users to have a non-constant
var_off, so this problem should only be triggerable from programs having
CAP_PERFMON. However, its effects can vary.
Without the fix, it is possible to replace the contents of the dynptr
arbitrarily by making the verifier mark different stack slots than the
actual location and then doing writes to the actual stack address of the
dynptr at runtime.
Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20230121002241.2113993-3-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
If CONFIG_NF_CONNTRACK=m, there are no definitions of NF_NAT_MANIP_SRC
and NF_NAT_MANIP_DST in vmlinux.h, and building test_bpf_nf.c fails:
$ make -C tools/testing/selftests/bpf/
CLNG-BPF [test_maps] test_bpf_nf.bpf.o
progs/test_bpf_nf.c:160:42: error: use of undeclared identifier 'NF_NAT_MANIP_SRC'
bpf_ct_set_nat_info(ct, &saddr, sport, NF_NAT_MANIP_SRC);
^
progs/test_bpf_nf.c:163:42: error: use of undeclared identifier 'NF_NAT_MANIP_DST'
bpf_ct_set_nat_info(ct, &daddr, dport, NF_NAT_MANIP_DST);
^
2 errors generated.
Copy the definitions from include/net/netfilter/nf_nat.h to test_bpf_nf.c
and, in order to avoid redefinitions if CONFIG_NF_CONNTRACK=y, rename them
with a ___local suffix. This is similar to commit 1058b6a78db2 ("selftests/bpf:
Do not fail build if CONFIG_NF_CONNTRACK=m/n").
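The locally copied definitions would look roughly like this (a sketch; the
values mirror include/net/netfilter/nf_nat.h):

/* Copied locally with a ___local suffix to avoid clashing with vmlinux.h
 * when CONFIG_NF_CONNTRACK=y.
 */
enum nf_nat_manip_type___local {
	NF_NAT_MANIP_SRC___local,
	NF_NAT_MANIP_DST___local
};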
Fixes: b06b45e82b59 ("selftests/bpf: add tests for bpf_ct_set_nat_info kfunc")
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/1674028604-7113-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Convert to struct mnt_idmap.
Last cycle we merged the necessary infrastructure in
256c8aed2b42 ("fs: introduce dedicated idmap type for mounts").
This is just the conversion to struct mnt_idmap.
Currently we still pass around the plain namespace that was attached to a
mount. This is in general pretty convenient but it makes it easy to
conflate namespaces that are relevant on the filesystem with namespaces
that are relevent on the mount level. Especially for non-vfs developers
without detailed knowledge in this area this can be a potential source for
bugs.
Once the conversion to struct mnt_idmap is done, all helpers down to the
really low-level helpers will take a struct mnt_idmap argument instead of
two namespace arguments. This way it becomes impossible to conflate the two,
eliminating the possibility of such bugs. All of the vfs and all filesystems
only operate on struct mnt_idmap.
Acked-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner (Microsoft) <brauner@kernel.org>
|
|
Add ipip6 and ip6ip decap testcases. Verify that bpf_skb_adjust_room()
correctly decapsulates ipip6 and ip6ip tunnel packets.
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/dfd2d8cfdf9111bd129170d4345296f53bee6a67.1673574419.git.william.xuanziyang@huawei.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Add the missing space after 'dest' variable assignment.
This change will resolve the following checkpatch.pl
script error:
ERROR: spaces required around that '+=' (ctx:VxW)
Signed-off-by: Roberto Valenzuela <valenzuelarober@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20230113180257.39769-1-valenzuelarober@gmail.com
|
|
Add a new test where some of the packets are not passed to the AF_XDP
socket and instead get an XDP_DROP verdict. This is important as it
tests the recycling mechanism of the buffer pool. If a packet is not
sent to the AF_XDP socket, the buffer the packet resides in is instead
recycled so it can be used again without the round-trip to user
space. The test introduces a new XDP program that drops every other
packet.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20230111093526.11682-13-magnus.karlsson@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Get rid of the built-in XDP program that was part of the old libbpf
code in xsk.c and replace it with an eBPF program built using the
framework used by all the other BPF selftests. This will form the base for
adding more programs in later commits.
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20230111093526.11682-12-magnus.karlsson@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
bpf-next 2023-01-04
We've added 45 non-merge commits during the last 21 day(s) which contain
a total of 50 files changed, 1454 insertions(+), 375 deletions(-).
The main changes are:
1) Fixes, improvements and refactoring of parts of BPF verifier's
state equivalence checks, from Andrii Nakryiko.
2) Fix a few corner cases in libbpf's BTF-to-C converter in particular
around padding handling and enums, also from Andrii Nakryiko.
3) Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to better
support decap on GRE tunnel devices not operating in collect-metadata mode,
from Christian Ehrig.
4) Improve x86 JIT's codegen for PROBE_MEM runtime error checks,
from Dave Marchevsky.
5) Remove the need for trace_printk_lock for bpf_trace_printk
and bpf_trace_vprintk helpers, from Jiri Olsa.
6) Add proper documentation for BPF_MAP_TYPE_SOCK{MAP,HASH} maps,
from Maryam Tahhan.
7) Improvements in libbpf's btf_parse_elf error handling, from Changbin Du.
8) Bigger batch of improvements to BPF tracing code samples,
from Daniel T. Lee.
9) Add LoongArch support to libbpf's bpf_tracing helper header,
from Hengqi Chen.
10) Fix a libbpf compiler warning in perf_event_open_probe on arm32,
from Khem Raj.
11) Optimize bpf_local_storage_elem by removing 56 bytes of padding,
from Martin KaFai Lau.
12) Use pkg-config to locate libelf for resolve_btfids build,
from Shen Jiamin.
13) Various libbpf improvements around API documentation and errno
handling, from Xin Liu.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (45 commits)
libbpf: Return -ENODATA for missing btf section
libbpf: Add LoongArch support to bpf_tracing.h
libbpf: Restore errno after pr_warn.
libbpf: Added the description of some API functions
libbpf: Fix invalid return address register in s390
samples/bpf: Use BPF_KSYSCALL macro in syscall tracing programs
samples/bpf: Fix tracex2 by using BPF_KSYSCALL macro
samples/bpf: Change _kern suffix to .bpf with syscall tracing program
samples/bpf: Use vmlinux.h instead of implicit headers in syscall tracing program
samples/bpf: Use kyscall instead of kprobe in syscall tracing program
bpf: rename list_head -> graph_root in field info types
libbpf: fix errno is overwritten after being closed.
bpf: fix regs_exact() logic in regsafe() to remap IDs correctly
bpf: perform byte-by-byte comparison only when necessary in regsafe()
bpf: reject non-exact register type matches in regsafe()
bpf: generalize MAYBE_NULL vs non-MAYBE_NULL rule
bpf: reorganize struct bpf_reg_state fields
bpf: teach refsafe() to take into account ID remapping
bpf: Remove unused field initialization in bpf's ctl_table
selftests/bpf: Add jit probe_mem corner case tests to s390x denylist
...
====================
Link: https://lore.kernel.org/r/20230105000926.31350-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Verify that nullness information is not propagated in the branches
of register-to-register JEQ and JNE operations if one of them is a
PTR_TO_BTF_ID. Implement this at the C level so we can use CO-RE.
Signed-off-by: Hao Sun <sunhao.th@gmail.com>
Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/r/20221222024414.29539-2-sunhao.th@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
When bpf_skb_adjust_room() shrinks the skb such that its csum_start
is invalid, skb->ip_summed should be reset from CHECKSUM_PARTIAL to
CHECKSUM_NONE.
Commit 54c3f1a81421 ("bpf: pull before calling skb_postpull_rcsum()")
fixed it.
This patch adds a test to ensure skb->ip_summed is changed from
CHECKSUM_PARTIAL to CHECKSUM_NONE after bpf_skb_adjust_room().
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20221221185653.1589961-1-martin.lau@linux.dev
|
|
This patch adds a test exercising logic that was fixed / improved in
the previous patch in the series, as well as general sanity checking for
the JIT's PROBE_MEM logic, which should've been unaffected by the previous
patch.
The added verifier test does the following:
* Acquire a referenced kptr to struct prog_test_ref_kfunc using
existing net/bpf/test_run.c kfunc
* Helper returns ptr to a specific prog_test_ref_kfunc whose first
two fields - both ints - have been prepopulated w/ vals 42 and
108, respectively
* kptr_xchg the acquired ptr into an arraymap
* Do a direct map_value load of the just-added ptr
* Goal of all this setup is to get an unreferenced kptr pointing to
struct with ints of known value, which is the result of this step
* Using unreferenced kptr obtained in previous step, do loads of
prog_test_ref_kfunc.a (offset 0) and .b (offset 4)
* Then incr the kptr by 8 and load prog_test_ref_kfunc.a again (this
time at offset -8)
* Add all the loaded ints together and return
Before the PROBE_MEM fixes in the previous patch, the loads at offsets 0 and
4 would succeed, while the load at offset -8 would incorrectly fail the
runtime check emitted by the JIT and zero out the dst reg as a result. This
is confirmed by a retval of 150 for this test before the previous patch -
since the second .a read is zeroed out - and a retval of 192 with the fixed
logic.
The test exercises the two optimizations to fixed logic added in last
patch as well:
* First load, with insn "r8 = *(u32 *)(r9 + 0)" exercises "insn->off
is 0, no need to add / sub from src_reg" optimization
* Third load, with insn "r9 = *(u32 *)(r9 - 8)" exercises "src_reg ==
dst_reg, no need to restore src_reg after load" optimization
Signed-off-by: Dave Marchevsky <davemarchevsky@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20221216214319.3408356-2-davemarchevsky@fb.com
|
|
This patch adds a selftest simulating a GRE sender and receiver using
tunnel headers without tunnel keys. It validates that packets encapsulated
using BPF_F_NO_TUNNEL_KEY are decapsulated by a GRE receiver not
configured with tunnel keys.
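A hedged sketch of the sender side, setting GRE tunnel metadata without a key
via the BPF_F_NO_TUNNEL_KEY flag (addresses, return codes, and the program
name are illustrative, and the usual tunnel-test includes are assumed):

SEC("tc")
int gre_set_tunnel_no_key(struct __sk_buff *skb)
{
	struct bpf_tunnel_key key = {};

	key.remote_ipv4 = 0xac100164;	/* 172.16.1.100, illustrative */
	key.tunnel_ttl = 64;

	if (bpf_skb_set_tunnel_key(skb, &key, sizeof(key),
				   BPF_F_ZERO_CSUM_TX | BPF_F_NO_TUNNEL_KEY))
		return TC_ACT_SHOT;
	return TC_ACT_OK;
}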
Signed-off-by: Christian Ehrig <cehrig@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Link: https://lore.kernel.org/bpf/20221218051734.31411-2-cehrig@cloudflare.com
|
|
Fix a bug in btf_dump's logic for determining whether a given struct type is
packed or not. The notion of "natural alignment" is not needed and is
even harmful in this case, so drop it altogether. The biggest difference
in btf_is_struct_packed() compared to its original implementation is
that we don't really use btf__align_of() to determine the overall alignment
of a struct type (because it could be 1 for both packed and non-packed
structs, depending on specific field definitions), and instead just use each
field's actual alignment to calculate whether any field requires packing, or
whether the struct's overall size necessitates packing.
Add two simple test cases that demonstrate the difference this change
would make.
Fixes: ea2ce1ba99aa ("libbpf: Fix BTF-to-C converter's padding logic")
Reported-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20221215183605.4149488-1-andrii@kernel.org
|
|
This adds a simple test for inserting an XDP program into a cpumap that is
"owned" by an XDP program that was loaded as PROG_TYPE_EXT (as libxdp
does). Prior to the kernel fix this would fail because the map type
ownership would be set to PROG_TYPE_EXT instead of being resolved to
PROG_TYPE_XDP.
v5:
- Fix a few nits from Andrii, add his ACK
v4:
- Use skeletons for selftest
v3:
- Update comment to better explain the cause
- Add Yonghong's ACK
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/r/20221214230254.790066-2-toke@redhat.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Add a few hand-crafted cases and a few randomized cases, found using the
script from [0], that test btf_dump's padding logic.
[0] https://lore.kernel.org/bpf/85f83c333f5355c8ac026f835b18d15060725fcb.camel@ericsson.com/
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221212211505.558851-7-andrii@kernel.org
|
|
It turns out that the btf_dump API doesn't handle a bunch of tricky corner
cases, as reported by Per, and further discovered using his testing
Python script ([0]).
This patch revamps btf_dump's padding logic significantly, making it
more correct and also avoiding unnecessary explicit padding where the
compiler would pad naturally. This overall topic turned out to be very
tricky and subtle; there are lots of subtle corner cases. The comments
in the code try to give some clues, but the comments themselves are
supposed to be paired with a good understanding of C alignment and padding
rules. Plus some experimentation to figure out subtle things like
whether `long :0;` means that the struct is now forced to be long-aligned
(it turns out it's not).
Anyways, Per's script, while not completely correct in some known
situations, doesn't show any obvious cases where this logic breaks, so
this is a nice improvement over its previous state.
Some selftests had to be adjusted to accommodate better use of natural
alignment rules, eliminating some unnecessary padding, or changing it to
`type: 0;` alignment markers.
Note also that when we are in between bitfields we emit an explicit bit
size, while otherwise we use `: 0`; this feels much more natural in
practice.
The next patch will add a few more test cases, found through Per's
randomized script.
[0] https://lore.kernel.org/bpf/85f83c333f5355c8ac026f835b18d15060725fcb.camel@ericsson.com/
Reported-by: Per Sundström XP <per.xp.sundstrom@ericsson.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221212211505.558851-6-andrii@kernel.org
|
|
Add a few custom enum definitions testing the mode(byte) and mode(word)
attributes.
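Illustrative definitions of the kind being added (names and values are made
up, not the actual test cases):

enum byte_sized_enum {
	BYTE_SIZED_VAL = 1,
} __attribute__((mode(byte)));	/* forced to a 1-byte representation */

enum word_sized_enum {
	WORD_SIZED_VAL = 1,
} __attribute__((mode(word)));	/* forced to a word-sized representation */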
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221212211505.558851-4-andrii@kernel.org
|
|
The kernel test robot reported a bpf selftest build failure when CONFIG_SMP
is not set. The error message is shown below:
>> progs/rcu_read_lock.c:256:34: error: no member named 'last_wakee' in 'struct task_struct'
last_wakee = task->real_parent->last_wakee;
~~~~~~~~~~~~~~~~~ ^
1 error generated.
When CONFIG_SMP is not set, the field 'last_wakee' is not available in struct
'task_struct', hence the above compilation failure. To fix the issue, let us
choose another field, 'group_leader', which is available regardless of
whether CONFIG_SMP is set or not.
Fixes: fe147956fca4 ("bpf/selftests: Add selftests for new task kfuncs")
Fixes: 48671232fcb8 ("selftests/bpf: Add tests for bpf_rcu_read_lock()")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: David Vernet <void@manifault.com>
Link: https://lore.kernel.org/bpf/20221213012224.379581-1-yhs@fb.com
|
|
The original support for bpf_user_ringbuf_drain callbacks simply
short-circuited checks for the dynptr state, allowing users to pass
PTR_TO_DYNPTR (now CONST_PTR_TO_DYNPTR) to helpers that initialize a
dynptr. This bug would have also surfaced with other dynptr helpers in
the future that changed dynptr view or modified it in some way.
Include test cases for both helpers, i.e. bpf_dynptr_from_mem and
bpf_ringbuf_reserve_dynptr, and ensure the verifier rejects both of them.
Without the fix, both of these programs load and pass verification.
While at it, remove the sys_nanosleep target from the failure cases' SEC
definition, as there is no such tracepoint.
Acked-by: David Vernet <void@manifault.com>
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20221207204141.308952-8-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
ARG_PTR_TO_DYNPTR is akin to ARG_PTR_TO_TIMER, ARG_PTR_TO_KPTR, where
the underlying register type is subjected to more special checks to
determine the type of object represented by the pointer and its state
consistency.
Move dynptr checks into their own 'process_dynptr_func' function so that
this is consistent and in line with existing code. This also makes it easier
to reuse this code for kfunc handling.
Then, reuse this consolidated function in kfunc dynptr handling too.
Note that for kfuncs, the arg_type constraint of DYNPTR_TYPE_LOCAL has
been lifted.
Acked-by: David Vernet <void@manifault.com>
Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20221207204141.308952-2-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Convert big chunks of the dynptr and map_kptr subtests to use the generic
verification tester. They are switched from using manually maintained
tables of test cases, specifying the program name and expected verifier
error message, to btf_decl_tag-based annotations directly on the
corresponding BPF programs: __failure to specify that a BPF program is
expected to fail verification, and __msg() to specify the expected log
message.
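A sketch of what such an annotated program looks like (the expected message
here is illustrative, not the exact verifier output):

SEC("?raw_tp")
__failure __msg("Expected an initialized dynptr")
int use_uninit_dynptr(void *ctx)
{
	struct bpf_dynptr ptr;
	char buf[8];

	/* reading from a never-initialized dynptr must be rejected */
	bpf_dynptr_read(buf, sizeof(buf), &ptr, 0, 0);
	return 0;
}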
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20221207201648.2990661-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
It's become a common pattern to have a collection of small BPF programs
in one BPF object file, each representing one test case. On user-space
side of such tests we maintain a table of program names and expected
failure or success, along with optional expected verifier log message.
This works, but each set of tests reimplements this mundane code over and
over again, which is a waste of time for anyone trying to add a new set
of tests. Furthermore, it's quite error prone, as it's way too easy to miss
some entries in these manually maintained test tables (as evidenced by the
dynptr_fail tests, in which the ringbuf_release_uninit_dynptr subtest was
accidentally missed; this is fixed in the next patch).
So this patch implements a generic test_loader, which accepts a skeleton
name and handles the rest of the details: it opens and loads the BPF object
file, making sure each program is tested in isolation. Optionally, each test
case can specify the expected BPF verifier log message. In case of failure,
the tester makes sure to report the verifier log; in verbose mode it reports
the verifier log unconditionally.
Now, the interesting deviation from existing custom implementations is
the use of btf_decl_tag attribute to specify expected-to-fail vs
expected-to-succeed markers and, optionally, expected log message
directly next to BPF program source code, eliminating the need to
manually create and update table of tests.
We define a few macros wrapping btf_decl_tag with a convention that all
btf_decl_tag values start with a "comment:" prefix, and then use a very
simple "just_some_text_tag" or "some_key_name=<value>" pattern to define
things like expected success/failure, the expected verifier message, or an
extra verifier log level (if necessary). This approach is demonstrated by
the next patch, in which two existing sets of failure tests are converted.
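Based on the convention described above, the wrapper macros could look
roughly like this (a sketch, not necessarily the exact definitions):

#define __failure         __attribute__((btf_decl_tag("comment:test_expect_failure")))
#define __success         __attribute__((btf_decl_tag("comment:test_expect_success")))
#define __msg(msg)        __attribute__((btf_decl_tag("comment:test_expect_msg=" msg)))
#define __log_level(lvl)  __attribute__((btf_decl_tag("comment:test_log_level=" #lvl)))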
The tester supports both expected-to-fail and expected-to-succeed programs,
though this patch set didn't convert any existing expected-to-succeed
programs yet, as existing tests couple BPF program loading with their
further execution through attach or test_prog_run. One way to allow
testing such scenarios would be the ability to specify a custom callback,
executed for each successfully loaded BPF program. This is left for
follow-up patches, after some more analysis of existing test cases.
This test_loader is, hopefully, a start of a test_verifier-like runner,
but integrated into test_progs infrastructure. It will allow much better
"user experience" of defining low-level verification tests that can take
advantage of all the libbpf-provided nicety features on BPF side: global
variables, declarative maps, etc. All while having a choice of defining
it in C or as BPF assembly (through __attribute__((naked)) functions and
using embedded asm), depending on what makes most sense in each
particular case. This will be explored in follow up patches as well.
Acked-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20221207201648.2990661-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|