In the current state, an erroneous call to
bpf_object__find_map_by_name(NULL, ...) leads to a segmentation
fault through the following call chain:
bpf_object__find_map_by_name(obj = NULL, ...)
-> bpf_object__for_each_map(pos, obj = NULL)
-> bpf_object__next_map((obj = NULL), NULL)
-> return (obj = NULL)->maps
While calling bpf_object__find_map_by_name with obj = NULL is
obviously incorrect, this should not lead to a segmentation
fault but rather be handled gracefully.
As __bpf_map__iter already handles this situation correctly, we
can delegate the check for the regular case there and only add
a check in case the prev or next parameter is NULL.
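A minimal sketch of the resulting iterator, assuming libbpf's internal
layout (names abridged):

struct bpf_map *
bpf_object__next_map(const struct bpf_object *obj, const struct bpf_map *prev)
{
	if (prev == NULL && obj != NULL)
		return obj->maps;	/* first map of a valid object */

	/* __bpf_map__iter() already handles obj == NULL gracefully */
	return __bpf_map__iter(prev, obj, 1);
}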
Signed-off-by: Andreas Ziegler <ziegler.andreas@siemens.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240703083436.505124-1-ziegler.andreas@siemens.com
|
|
Now that the s390x JIT supports exceptions, remove the respective tests
from the denylist.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240703005047.40915-4-iii@linux.ibm.com
|
|
Implement the following three pieces required from the JIT:
- A "top-level" BPF prog (exception_boundary) must save all
non-volatile registers, and not only the ones that it clobbers.
- A "handler" BPF prog (exception_cb) must switch stack to that of
exception_boundary, and restore the registers that exception_boundary
saved.
- arch_bpf_stack_walk() must unwind the stack and provide the results
in a way that satisfies both bpf_throw() and exception_cb.
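For reference, the generic hook has this shape; consume_fn is invoked
once per frame until it returns false:

/* walk the current stack, reporting ip/sp/bp of each frame to consume_fn */
void arch_bpf_stack_walk(bool (*consume_fn)(void *cookie, u64 ip, u64 sp, u64 bp),
			 void *cookie);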
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240703005047.40915-3-iii@linux.ibm.com
|
|
Using a mask instead of an array saves a small amount of memory and
allows marking multiple registers as seen with a simple "or". Another
positive side-effect is that it speeds up verification with jitterbug.
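A hedged sketch of the idea (names illustrative, not the exact s390 JIT
code):

u16 seen_regs;		/* bit N is set once %rN has been used */

static void reg_set_seen(struct bpf_jit *jit, u32 reg)
{
	jit->seen_regs |= BIT(reg);		/* one register */
}

static void regs_set_seen(struct bpf_jit *jit, u32 r1, u32 r2)
{
	jit->seen_regs |= BIT(r1) | BIT(r2);	/* two at once, one "or" */
}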
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240703005047.40915-2-iii@linux.ibm.com
|
|
After commit 0ede61d8589c ("file: convert to SLAB_TYPESAFE_BY_RCU") this
loop always iterates exactly one time. Delete the for statement and pull
the code in by one tab.
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/ZoWJF51D4zWb6f5t@stanley.mountain
|
|
When BPF_TRAMP_F_CALL_ORIG is not set, stack space for arguments passed
on the stack doesn't need to be reserved, because the original function
is not called.
Only reserve space for stacked arguments when BPF_TRAMP_F_CALL_ORIG is
set.
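A hedged sketch of the resulting sizing logic (names illustrative):

/* only the call to the original function consumes the stacked-args area */
if (flags & BPF_TRAMP_F_CALL_ORIG)
	stack_size += nr_stack_args * 8;	/* 8 bytes per stacked argument */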
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/bpf/20240708114758.64414-1-puranjay@kernel.org
|
|
Use the .map_alloc_check callback to perform allocation checks before
allocating memory for the devmap.
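A hedged sketch of the wiring; the checks shown are illustrative only:

static int dev_map_alloc_check(union bpf_attr *attr)
{
	/* reject invalid attributes before any memory is allocated */
	if (attr->max_entries == 0 || attr->key_size != 4)
		return -EINVAL;
	return 0;
}

const struct bpf_map_ops dev_map_ops = {
	.map_alloc_check = dev_map_alloc_check,
	/* ... remaining ops unchanged ... */
};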
Signed-off-by: Florian Lehner <dev@der-flo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240615101158.57889-1-dev@der-flo.net
|
|
Now that the s390x JIT supports arena, remove the respective tests from
the denylist.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-13-iii@linux.ibm.com
|
|
Check that __sync_*() functions don't cause kernel panics when handling
freed arena pages.
x86_64 does not support some arena atomics yet, and aarch64 may or may
not support them, based on the availability of LSE atomics at run time.
Do not enable this test for these architectures for simplicity.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-12-iii@linux.ibm.com
|
|
While clang uses __attribute__((address_space(1))) both for defining
arena pointers and arena globals, GCC requires different syntax for
both. __arena already covers the first use case; introduce __arena_global
to cover the second one.
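A hedged sketch of how the two macros can diverge, assuming the
__BPF_FEATURE_ADDR_SPACE_CAST feature macro distinguishes the compilers:

#if defined(__BPF_FEATURE_ADDR_SPACE_CAST)
/* clang: one attribute serves both arena pointers and arena globals */
#define __arena        __attribute__((address_space(1)))
#define __arena_global __attribute__((address_space(1)))
#else
/* GCC: globals need a different spelling, e.g. a section placement */
#define __arena_global SEC(".addr_space.1")
#endif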
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-11-iii@linux.ibm.com
|
|
s390x supports most BPF atomics using single instructions, which
makes implementing arena support a matter of adding the arena address to
the base register (unfortunately, atomics do not support index
registers) and wrapping the respective native instruction in probing
sequences.
An exception is BPF_XCHG, which is implemented using two different
memory accesses and a loop. Make sure there are enough extable entries
for both instructions. Compute the base address once for both memory
accesses. Since on exception we need to land after the loop, emit the
nops manually.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-10-iii@linux.ibm.com
|
|
Now that BPF_PROBE_MEM32 and address space cast instructions are
implemented, tell the verifier that the JIT supports arena.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-9-iii@linux.ibm.com
|
|
The new address cast instruction translates arena offsets to userspace
addresses. NULL pointers must not be translated.
The common code sets up the mappings in such a way that it's enough to
replace the higher 32 bits to achieve the desired result. s390x has
an instruction for exactly this: INSERT IMMEDIATE.
Implement the sequence using 3 instructions: LOAD AND TEST, BRANCH
RELATIVE ON CONDITION and INSERT IMMEDIATE.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-8-iii@linux.ibm.com
|
|
BPF_PROBE_MEM32 is a new mode for LDX, ST and STX instructions. The JIT
is supposed to add the start address of the kernel arena mapping to the
%dst register, and use a probing variant of the respective memory
access.
Reuse the existing probing infrastructure for that. Put the arena
address into the literal pool, load it into %r1 and use that as an
index register. Do not clear any registers in ex_handler_bpf() for
failing ST and STX instructions.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-7-iii@linux.ibm.com
|
|
Currently we land on the nop, which is unnecessary: we can just as well
begin executing the next instruction. Furthermore, the upcoming arena
support for the loop-based BPF_XCHG implementation will require landing
on an instruction that comes after the loop.
So land on the next JITed instruction, which covers both cases.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-6-iii@linux.ibm.com
|
|
Currently probe insns are handled by two "if" statements at the
beginning and at the end of bpf_jit_insn(). The first one needs to be
in sync with the huge insn->code statement that follows it, which was
not a problem so far, since the check is small.
The introduction of arena will make it significantly larger, and it
will no longer be obvious whether it is in sync with the opcode switch.
Move these statements to the new bpf_jit_probe_load_pre() and
bpf_jit_probe_post() functions, and call them only from cases that need
them.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-5-iii@linux.ibm.com
|
|
Commit 7fc8c362e782 ("s390/bpf: encode register within extable entry")
introduced explicit passing of the number of the register to be cleared
to ex_handler_bpf(), which replaced deducing it from the respective
native load instruction using get_probe_mem_regno().
Replace the second and last usage in the same manner, and remove this
function.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-4-iii@linux.ibm.com
|
|
The upcoming arena support for the loop-based BPF_XCHG implementation
requires emitting nop and extable entries separately. Move nop handling
into a separate function, and keep track of the nop offset.
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-3-iii@linux.ibm.com
|
|
Zero-extending results of atomic probe operations fails with:
verifier bug. zext_dst is set, but no reg is defined
The problem is that insn_def_regno() handles BPF_ATOMICs, but not
BPF_PROBE_ATOMICs. Fix by adding the missing condition.
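The added condition, roughly (a sketch of insn_def_regno()'s BPF_STX
case):

case BPF_STX:
	if ((BPF_MODE(insn->code) == BPF_ATOMIC ||
	     BPF_MODE(insn->code) == BPF_PROBE_ATOMIC) &&	/* the missing case */
	    (insn->imm & BPF_FETCH)) {
		if (insn->imm == BPF_CMPXCHG)
			return BPF_REG_0;
		return insn->src_reg;
	}
	return -1;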
Fixes: d503a04f8bc0 ("bpf: Add support for certain atomics in bpf_arena to x86 JIT")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240701234304.14336-2-iii@linux.ibm.com
|
|
As Quentin said [0], BPF map pinning will fail if the pinmaps path is not
under the bpffs, like:
libbpf: specified path /home/ubuntu/test/sock_ops_map is not on BPF FS
Error: failed to pin all maps
[0] https://github.com/libbpf/bpftool/issues/146
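For illustration, the pinmaps directory has to live on a bpffs mount,
e.g. (paths hypothetical):

mount -t bpf bpffs /sys/fs/bpf
bpftool prog loadall sock_ops.bpf.o /sys/fs/bpf/sock_ops \
	pinmaps /sys/fs/bpf/sock_ops_maps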
Fixes: 3767a94b3253 ("bpftool: add pinmaps argument to the load/loadall")
Signed-off-by: Tao Chen <chen.dylane@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Quentin Monnet <qmo@kernel.org>
Reviewed-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20240702131150.15622-1-chen.dylane@gmail.com
|
|
Add a test case where the 7th argument is a struct for architectures
with 8 argument registers, and increase the complexity of the struct.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240702121944.1091530-4-pulehui@huaweicloud.com
|
|
Factor out the many-args tests from tracing_struct and rename some
functions to make more sense. Meanwhile, remove the unnecessary skeleton
detach operation, as it is covered by the skeleton destroy operation.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20240702121944.1091530-3-pulehui@huaweicloud.com
|
|
This patch adds support for 12 function arguments to the riscv64 bpf
trampoline. The current bpf trampoline supports scalar arguments of at
most sizeof(u64) bytes [0] and struct arguments of at most 16 bytes [1].
Therefore, we focus on the situation where scalars are at most XLEN bits
and aggregates' total size does not exceed 2×XLEN bits, per the riscv
calling convention [2].
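As an illustration of that boundary (hypothetical prototype; with
XLEN=64, registers a0-a7 carry the first eight arguments and the rest
spill to the stack):

struct two_xlen { long lo, hi; };	/* 2*XLEN-bit aggregate, still OK */

long traced_func(long a1, long a2, long a3, long a4, long a5, long a6,
		 struct two_xlen a7, long a8, long a9, long a10,
		 long a11, long a12);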
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://elixir.bootlin.com/linux/v6.8/source/kernel/bpf/btf.c#L6184 [0]
Link: https://elixir.bootlin.com/linux/v6.8/source/kernel/bpf/btf.c#L6769 [1]
Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/download/draft-20230929-e5c800e661a53efe3c2678d71a306323b60eb13b/riscv-abi.pdf [2]
Link: https://lore.kernel.org/bpf/20240702121944.1091530-2-pulehui@huaweicloud.com
|
|
Introduce dynamic adjustment capabilities for fill_size and comp_size
parameters to support larger batch sizes beyond the previous 2K limit.
Update HW_SW_MAX_RING_SIZE test cases to evaluate AF_XDP's robustness by
pushing hardware and software ring sizes to their limits. This test
ensures AF_XDP's reliability amidst potential producer/consumer throttling
due to maximum ring utilization.
Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20240702055916.48071-3-tushar.vyavahare@intel.com
|
|
Previously, the HW_SW_MIN_RING_SIZE and HW_SW_MAX_RING_SIZE test cases
were not validating Tx/Rx traffic at all, due to an early return after
changing the HW ring size in testapp_validate_traffic().
Fix the flow by checking the return value of set_ring_size() and acting
upon it rather than terminating the test case there.
Signed-off-by: Tushar Vyavahare <tushar.vyavahare@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/bpf/20240702055916.48071-2-tushar.vyavahare@intel.com
|
|
Delete extra blank lines inside of test_selftest().
Signed-off-by: Zhu Jun <zhujun2@cmss.chinamobile.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240627031905.7133-1-zhujun2@cmss.chinamobile.com
|
|
We use bpf_prog_pack to aggregate bpf programs into a huge page to
relieve the iTLB pressure on the system. We can apply it to the bpf
trampoline as well, as Song has already implemented it in the core and
on x86 [0]. This patch applies bpf_prog_pack to the RV64 bpf trampoline.
Since Song and Puranjay have done a lot of work for bpf_prog_pack on
RV64, implementing this function is easy.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Björn Töpel <bjorn@rivosinc.com> #riscv
Link: https://lore.kernel.org/all/20231206224054.492250-1-song@kernel.org [0]
Link: https://lore.kernel.org/bpf/20240622030437.3973492-4-pulehui@huaweicloud.com
|
|
We get the size of the trampoline image during the dry run phase and
allocate memory based on that size. The allocated image will then be
populated with instructions during the real patch phase. But after
commit 26ef208c209a ("bpf: Use arch_bpf_trampoline_size"), the "im"
argument is inconsistent between the dry run and real patch phases. This
may cause emit_imm in RV64 to generate a different number of
instructions when generating the "im" address, potentially causing
out-of-bounds issues. Let's emit the maximum number of instructions for
the "im" address during the dry run to fix this problem.
Fixes: 26ef208c209a ("bpf: Use arch_bpf_trampoline_size")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240622030437.3973492-3-pulehui@huaweicloud.com
|
|
For a trampoline using bpf_prog_pack, we need to generate a rw_image
buffer with a size of (image_end - image). For a regular trampoline, we
use the precise image size generated by arch_bpf_trampoline_size to
allocate rw_image. But for the struct_ops trampoline, we allocate
rw_image directly with a size close to PAGE_SIZE. We do not need to
allocate that much, as the patch size is usually much smaller than
PAGE_SIZE. Let's use the precise image size for it too.
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Björn Töpel <bjorn@rivosinc.com> #riscv
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20240622030437.3973492-2-pulehui@huaweicloud.com
|
|
Coverity points out that after calling btf__new_empty_split() the wrong
value is checked for error.
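A hedged sketch of the corrected pattern (variable names hypothetical):

new_base = btf__new_empty_split(src_btf);
if (!new_base) {	/* check the value just returned, not a stale one */
	err = -errno;
	goto done;
}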
Fixes: 58e185a0dc35 ("libbpf: Add btf__distill_base() creating split BTF with distilled base BTF")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240629100058.2866763-1-alan.maguire@oracle.com
|
|
Introduce e2e selftest for bpf_xdp_flow_lookup kfunc through
xdp_flowtable utility.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/b74393fb4539aecbbd5ac7883605f86a95fb0b6b.1719698275.git.lorenzo@kernel.org
|
|
Introduce bpf_xdp_flow_lookup kfunc in order to perform the lookup
of a given flowtable entry based on a fib tuple of incoming traffic.
bpf_xdp_flow_lookup can be used as a building block to offload to xdp
the processing of the sw flowtable when a hw flowtable is not available.
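The kfunc's shape, roughly, per this series:

struct flow_offload_tuple_rhash *
bpf_xdp_flow_lookup(struct xdp_md *ctx, struct bpf_fib_lookup *fib_tuple,
		    struct bpf_flowtable_opts *opts, u32 opts_len);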
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://lore.kernel.org/bpf/55d38a4e5856f6d1509d823ff4e98aaa6d356097.1719698275.git.lorenzo@kernel.org
|
|
This adds a small internal mapping table so that a new bpf (xdp) kfunc
can perform lookups in a flowtable.
As-is, an xdp program has access to the device pointer, but no way to do
a lookup in a flowtable -- there is no way to obtain the needed struct
without questionable stunts.
This allows obtaining an nf_flowtable pointer given a net_device
structure.
In order to keep backward compatibility, the infrastructure allows the
user to add a given device to multiple flowtables, but the lookup will
always return the first added mapping, since it assumes the right
configuration is a 1:1 mapping between flowtables and net_devices.
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Pablo Neira Ayuso <pablo@netfilter.org>
Link: https://lore.kernel.org/bpf/9f20e2c36f494b3bf177328718367f636bb0b2ab.1719698275.git.lorenzo@kernel.org
|
|
ARRAY_SIZE is used in multiple places; move its definition into the
bpf_misc.h header.
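The shared definition is presumably the idiomatic one:

#ifndef ARRAY_SIZE
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#endif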
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20240626134719.3893748-1-jolsa@kernel.org
|
|
When building with clang for ARCH=i386, the following errors are
observed:
CC kernel/bpf/btf_relocate.o
./tools/lib/bpf/btf_relocate.c:206:23: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
206 | info[id].needs_size = true;
| ^ ~
./tools/lib/bpf/btf_relocate.c:256:25: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
256 | base_info.needs_size = true;
| ^ ~
2 errors generated.
The problem is that we use 1-bit and 31-bit bitfields in a signed int.
Changing to
bool needs_size:1;
unsigned int size:31;
...resolves the error and pahole reports that 4 bytes are used
for the underlying representation:
$ pahole btf_name_info tools/lib/bpf/btf_relocate.o
struct btf_name_info {
const char * name; /* 0 8 */
unsigned int needs_size:1; /* 8: 0 4 */
unsigned int size:31; /* 8: 1 4 */
__u32 id; /* 12 4 */
/* size: 16, cachelines: 1, members: 4 */
/* last cacheline: 16 bytes */
};
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240624192903.854261-1-alan.maguire@oracle.com
|
|
Guard close() with extra link_fd[i] > 0 and fexit_fd[i] > 0
checks to prevent close(-1).
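A hedged sketch of the guarded cleanup (loop bound hypothetical):

for (i = 0; i < prog_cnt; i++) {
	if (link_fd[i] > 0)
		close(link_fd[i]);
	if (fexit_fd[i] > 0)
		close(fexit_fd[i]);
}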
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240623131753.2133829-1-make24@iscas.ac.cn
|
|
Add new negative selftests which are intended to cover the
out-of-bounds memory access that could be performed on a
CONST_PTR_TO_DYNPTR within functions taking an ARG_PTR_TO_DYNPTR |
MEM_RDONLY as an argument, and the acceptance of invalid register types,
i.e. PTR_TO_BTF_ID, within functions taking an ARG_PTR_TO_DYNPTR |
MEM_RDONLY.
Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://lore.kernel.org/r/20240625062857.92760-2-mattbobrowski@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Currently, it's possible to pass a modified CONST_PTR_TO_DYNPTR to
a global function as an argument. The adverse effect of this is that
BPF helpers can continue to make use of this modified
CONST_PTR_TO_DYNPTR from within the context of the global function,
which can unintentionally result in out-of-bounds memory accesses and
therefore compromise overall system stability, i.e.
[ 244.157771] BUG: KASAN: slab-out-of-bounds in bpf_dynptr_data+0x137/0x140
[ 244.161345] Read of size 8 at addr ffff88810914be68 by task test_progs/302
[ 244.167151] CPU: 0 PID: 302 Comm: test_progs Tainted: G O E 6.10.0-rc3-00131-g66b586715063 #533
[ 244.174318] Call Trace:
[ 244.175787] <TASK>
[ 244.177356] dump_stack_lvl+0x66/0xa0
[ 244.179531] print_report+0xce/0x670
[ 244.182314] ? __virt_addr_valid+0x200/0x3e0
[ 244.184908] kasan_report+0xd7/0x110
[ 244.187408] ? bpf_dynptr_data+0x137/0x140
[ 244.189714] ? bpf_dynptr_data+0x137/0x140
[ 244.192020] bpf_dynptr_data+0x137/0x140
[ 244.194264] bpf_prog_b02a02fdd2bdc5fa_global_call_bpf_dynptr_data+0x22/0x26
[ 244.198044] bpf_prog_b0fe7b9d7dc3abde_callback_adjust_bpf_dynptr_reg_off+0x1f/0x23
[ 244.202136] bpf_user_ringbuf_drain+0x2c7/0x570
[ 244.204744] ? 0xffffffffc0009e58
[ 244.206593] ? __pfx_bpf_user_ringbuf_drain+0x10/0x10
[ 244.209795] bpf_prog_33ab33f6a804ba2d_user_ringbuf_callback_const_ptr_to_dynptr_reg_off+0x47/0x4b
[ 244.215922] bpf_trampoline_6442502480+0x43/0xe3
[ 244.218691] __x64_sys_prlimit64+0x9/0xf0
[ 244.220912] do_syscall_64+0xc1/0x1d0
[ 244.223043] entry_SYSCALL_64_after_hwframe+0x77/0x7f
[ 244.226458] RIP: 0033:0x7ffa3eb8f059
[ 244.228582] Code: 08 89 e8 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 8f 1d 0d 00 f7 d8 64 89 01 48
[ 244.241307] RSP: 002b:00007ffa3e9c6eb8 EFLAGS: 00000206 ORIG_RAX: 000000000000012e
[ 244.246474] RAX: ffffffffffffffda RBX: 00007ffa3e9c7cdc RCX: 00007ffa3eb8f059
[ 244.250478] RDX: 00007ffa3eb162b4 RSI: 0000000000000000 RDI: 00007ffa3e9c7fb0
[ 244.255396] RBP: 00007ffa3e9c6ed0 R08: 00007ffa3e9c76c0 R09: 0000000000000000
[ 244.260195] R10: 0000000000000000 R11: 0000000000000206 R12: ffffffffffffff80
[ 244.264201] R13: 000000000000001c R14: 00007ffc5d6b4260 R15: 00007ffa3e1c7000
[ 244.268303] </TASK>
Add a check_func_arg_reg_off() call to the path in which the BPF
verifier verifies the arguments of global functions, specifically
those which take an argument of type ARG_PTR_TO_DYNPTR |
MEM_RDONLY. Also, process_dynptr_func() doesn't appear to perform any
explicit and strict type matching on the supplied register type, so
let's also enforce that the register supplied by the caller is of type
PTR_TO_STACK or CONST_PTR_TO_DYNPTR.
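A hedged sketch of the enforcement (error message illustrative):

if (reg->type != PTR_TO_STACK && reg->type != CONST_PTR_TO_DYNPTR) {
	verbose(env, "arg#%d expected pointer to stack or const ptr to dynptr\n",
		regno);
	return -EINVAL;
}
err = check_func_arg_reg_off(env, reg, regno, ARG_PTR_TO_DYNPTR);
if (err)
	return err;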
Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://lore.kernel.org/r/20240625062857.92760-1-mattbobrowski@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since f663a03c8e35 ("bpf, x64: Remove tail call detection"),
tail_call_reachable is no longer detected by the x86 JIT; it is provided
by the verifier instead.
Therefore, in test_bpf, tail_call_reachable must be set in test cases
before running them.
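A hedged sketch of the per-test setup this implies:

/* the verifier would normally set this; test_bpf bypasses the verifier */
fp->aux->tail_call_reachable = true;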
Fix and test:
[ 174.828662] test_bpf: #0 Tail call leaf jited:1 170 PASS
[ 174.829574] test_bpf: #1 Tail call 2 jited:1 244 PASS
[ 174.830363] test_bpf: #2 Tail call 3 jited:1 296 PASS
[ 174.830924] test_bpf: #3 Tail call 4 jited:1 719 PASS
[ 174.831863] test_bpf: #4 Tail call load/store leaf jited:1 197 PASS
[ 174.832240] test_bpf: #5 Tail call load/store jited:1 326 PASS
[ 174.832240] test_bpf: #6 Tail call error path, max count reached jited:1 2214 PASS
[ 174.835713] test_bpf: #7 Tail call count preserved across function calls jited:1 609751 PASS
[ 175.446098] test_bpf: #8 Tail call error path, NULL target jited:1 472 PASS
[ 175.447597] test_bpf: #9 Tail call error path, index out of range jited:1 206 PASS
[ 175.448833] test_bpf: test_tail_calls: Summary: 10 PASSED, 0 FAILED, [10/10 JIT'ed]
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202406251415.c51865bc-oliver.sang@intel.com
Fixes: f663a03c8e35 ("bpf, x64: Remove tail call detection")
Signed-off-by: Leon Hwang <hffilwlqm@gmail.com>
Link: https://lore.kernel.org/r/20240625145351.40072-1-hffilwlqm@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When upgrading to libbpf 1.3 we noticed a big performance hit while
loading programs using CORE on non base-BTF symbols. This was tracked
down to the new BTF sanity check logic. The issue is that the base BTF
definitions are checked first for the base BTF and then again for every
module BTF.
Loading 5 dummy programs (using libbpf-rs) that are using CORE on a
non-base BTF symbol on my system:
- Before this fix: 3s.
- With this fix: 0.1s.
Fix this by only checking the types starting at the BTF start id. This
should ensure the base BTF is still checked as expected but only once
(btf->start_id == 1 when creating the base BTF), and then only
additional types are checked for each module BTF.
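A hedged sketch of the corrected loop (helper names approximate):

/* validate only the types this BTF object owns; for the base BTF,
 * start_id == 1, so the base is still fully checked, exactly once
 */
for (id = btf->start_id; id < btf__type_cnt(btf); id++) {
	err = btf_validate_type(btf, btf_type_by_id(btf, id), id);
	if (err)
		return err;
}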
Fixes: 3903802bb99a ("libbpf: Add basic BTF sanity validation")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20240624090908.171231-1-atenart@kernel.org
|
|
The kernel test robot reports that the kernel build fails with the
resilient split BTF changes.
Examining the associated config and code, we see that
btf_relocate_id() is defined under CONFIG_DEBUG_INFO_BTF_MODULES.
Moving it outside the #ifdef solves the issue.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202406221742.d2srFLVI-lkp@intel.com/
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/r/20240623135224.27981-1-alan.maguire@oracle.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch does the following to address IETF feedback:
* Remove mention of "program type" and reference future
docs (and mention platform-specific docs exist) for
helper functions and BTF. Addresses Roman Danyliw's
comments based on GENART review from Ines Robles [0].
* Add reference for endianness as requested by John
Scudder [1].
* Added bit numbers to top of 32-bit wide format diagrams
as requested by Paul Wouters [2].
* Added more text about why BPF doesn't stand for anything, based
on text from ebpf.io [3], as requested by Eric Vyncke and
Gunter Van de Velde [4].
* Replaced "htobe16" (and similar) and the direction-specific
description with just "be16" (and similar) and a direction-agnostic
description, to match the direction-agnostic description in
the Byteswap Instructions section. Based on feedback from Eric
Vyncke [5].
[0] https://mailarchive.ietf.org/arch/msg/bpf/DvDgDWOiwk05OyNlWlAmELZFPlM/
[1] https://mailarchive.ietf.org/arch/msg/bpf/eKNXpU4jCLjsbZDSw8LjI29M3tM/
[2] https://mailarchive.ietf.org/arch/msg/bpf/hGk8HkYxeZTpdu9qW_MvbGKj7WU/
[3] https://ebpf.io/what-is-ebpf/#what-do-ebpf-and-bpf-stand-for
[4] https://mailarchive.ietf.org/arch/msg/bpf/i93lzdN3ewnzzS_JMbinCIYxAIU/
[5] https://mailarchive.ietf.org/arch/msg/bpf/KBWXbMeDcSrq4vsKR_KkBbV6hI4/
Acked-by: David Vernet <void@manifault.com>
Signed-off-by: Dave Thaler <dthaler1968@googlemail.com>
Link: https://lore.kernel.org/r/20240623150453.10613-1-dthaler1968@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Alan Maguire says:
====================
bpf: resilient split BTF followups
Follow-up to resilient split BTF series [1],
- cleaning up libbpf relocation code (patch 1);
- adding 'struct module' support for base BTF data (patch 2);
- splitting out field iteration code into separate file (patch 3);
- sharing libbpf relocation code with the kernel (patch 4);
- adding a kbuild --btf_features flag to generate distilled base
BTF in the module-specific case where KBUILD_EXTMOD is true
(patch 5); and
- adding test coverage for module-based kfunc dtor (patch 6)
Generation of distilled base BTF for modules requires the pahole patch
at [2], but without it we just won't get distilled base BTF (and thus BTF
relocation on module load) for bpf_testmod.ko.
Changes since v1 [3]:
- fixed line lengths and made comparison an explicit == 0 (Andrii, patch 1)
- moved btf_iter.c changes to separate patch (Andrii, patch 3)
- grouped common targets in kernel/bpf/Makefile (Andrii, patch 4)
- updated bpf_testmod ctx alloc to use GFP_ATOMIC, and updated dtor
selftest to use map-based dtor cleanup (Eduard, patch 6)
[1] https://lore.kernel.org/bpf/20240613095014.357981-1-alan.maguire@oracle.com/
[2] https://lore.kernel.org/bpf/20240517102714.4072080-1-alan.maguire@oracle.com/
[3] https://lore.kernel.org/bpf/20240618162449.809994-1-alan.maguire@oracle.com/
====================
Link: https://lore.kernel.org/r/20240620091733.1967885-1-alan.maguire@oracle.com
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
|
|
Add simple kfuncs to create/destroy a context type to bpf_testmod,
register them, and add a kfunc_call test to use them. This provides
test coverage for registration of dtor kfuncs from modules.
By transferring the context pointer to a map value as a __kptr
we also trigger the map-based dtor cleanup logic, improving test
coverage.
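A hedged sketch of module-side dtor registration (identifiers
approximate):

BTF_ID_LIST(bpf_testmod_dtor_ids)
BTF_ID(struct, bpf_testmod_ctx)
BTF_ID(func, bpf_testmod_ctx_release)

static const struct btf_id_dtor_kfunc bpf_testmod_dtors[] = {
	{
		.btf_id       = bpf_testmod_dtor_ids[0],
		.kfunc_btf_id = bpf_testmod_dtor_ids[1],
	},
};

err = register_btf_id_dtor_kfuncs(bpf_testmod_dtors,
				  ARRAY_SIZE(bpf_testmod_dtors),
				  THIS_MODULE);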
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-7-alan.maguire@oracle.com
|
|
Support creation of module BTF along with distilled base BTF;
the latter is stored in a .BTF.base ELF section and supplements
split BTF references to base BTF with information about base types,
allowing for later relocation of split BTF with a (possibly
changed) base. resolve_btfids detects the presence of a .BTF.base
section and will use it instead of the base BTF it is passed in
BTF id resolution.
Modules will be built with a distilled .BTF.base section for an
external module build, i.e.
make -C . M=path2/module
...while an in-tree module build as part of a normal kernel build will
not generate distilled base BTF; this is because in-tree modules
change with the kernel and do not require BTF relocation for the
running vmlinux.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-6-alan.maguire@oracle.com
|
|
Share relocation implementation with the kernel. As part of this,
we also need the type/string iteration functions so also share
btf_iter.c file. Relocation code in kernel and userspace is identical
save for the implementation of the reparenting of split BTF to the
relocated base BTF and retrieval of the BTF header from "struct btf";
these small functions need separate user-space and kernel implementations
for the separate "struct btf"s they operate upon.
One other wrinkle on the kernel side is we have to map .BTF.ids in
modules as they were generated with the type ids used at BTF encoding
time. btf_relocate() optionally returns an array mapping from old BTF
ids to relocated ids, so we use that to fix up these references where
needed for kfuncs.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-5-alan.maguire@oracle.com
|
|
This will allow it to be shared with the kernel. No functional change.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-4-alan.maguire@oracle.com
|
|
...as this will allow split BTF modules with a base BTF
representation (rather than the full vmlinux BTF at time of
BTF encoding) to resolve their references to kernel types in a
way that is more resilient to small changes in kernel types.
This will allow modules that are not rebuilt every time the kernel is
to provide more resilient BTF, rather than having it invalidated
every time BTF ids for core kernel types change.
Fields are ordered to avoid holes in struct module.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-3-alan.maguire@oracle.com
|
|
Use less verbose names in the BTF relocation code, and fix an
off-by-one error and a typo in btf_relocate.c. Simplify the loop over
matching distilled types: instead of assigning a _next value in the loop
body, move the match check conditions into the loop guard.
Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-2-alan.maguire@oracle.com
|
|
Add a selftest to verify that struct_ops maps are auto-attached by
bpf skeleton's `*__attach` function.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240621180324.238379-1-yatsenko@meta.com
|