Age | Commit message (Collapse) | Author |
|
Merge in the fixes we had queued in case there was another -rc.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Alexei Starovoitov says:
====================
pull-request: bpf-next 2021-11-01
We've added 181 non-merge commits during the last 28 day(s) which contain
a total of 280 files changed, 11791 insertions(+), 5879 deletions(-).
The main changes are:
1) Fix bpf verifier propagation of 64-bit bounds, from Alexei.
2) Parallelize bpf test_progs, from Yucong and Andrii.
3) Deprecate various libbpf apis including af_xdp, from Andrii, Hengqi, Magnus.
4) Improve bpf selftests on s390, from Ilya.
5) bloomfilter bpf map type, from Joanne.
6) Big improvements to JIT tests especially on Mips, from Johan.
7) Support kernel module function calls from bpf, from Kumar.
8) Support typeless and weak ksym in light skeleton, from Kumar.
9) Disallow unprivileged bpf by default, from Pawan.
10) BTF_KIND_DECL_TAG support, from Yonghong.
11) Various bpftool cleanups, from Quentin.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (181 commits)
libbpf: Deprecate AF_XDP support
kbuild: Unify options for BTF generation for vmlinux and modules
selftests/bpf: Add a testcase for 64-bit bounds propagation issue.
bpf: Fix propagation of signed bounds from 64-bit min/max into 32-bit.
bpf: Fix propagation of bounds from 64-bit min/max into 32-bit and var_off.
selftests/bpf: Fix also no-alu32 strobemeta selftest
bpf: Add missing map_delete_elem method to bloom filter map
selftests/bpf: Add bloom map success test for userspace calls
bpf: Add alignment padding for "map_extra" + consolidate holes
bpf: Bloom filter map naming fixups
selftests/bpf: Add test cases for struct_ops prog
bpf: Add dummy BPF STRUCT_OPS for test purpose
bpf: Factor out helpers for ctx access checking
bpf: Factor out a helper to prepare trampoline for struct_ops prog
selftests, bpf: Fix broken riscv build
riscv, libbpf: Add RISC-V (RV64) support to bpf_tracing.h
tools, build: Add RISC-V to HOSTARCH parsing
riscv, bpf: Increase the maximum number of iterations
selftests, bpf: Add one test for sockmap with strparser
selftests, bpf: Fix test_txmsg_ingress_parser error
...
====================
Link: https://lore.kernel.org/r/20211102013123.9005-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This tests the sysctl options for ARP/ND:
/net/ipv4/conf/<iface>/arp_evict_nocarrier
/net/ipv4/conf/all/arp_evict_nocarrier
/net/ipv6/conf/<iface>/ndisc_evict_nocarrier
/net/ipv6/conf/all/ndisc_evict_nocarrier
Signed-off-by: James Prestwood <prestwoj@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Deprecate AF_XDP support in libbpf ([0]). This has been moved to
libxdp as it is a better fit for that library. The AF_XDP support only
uses the public libbpf functions and can therefore just use libbpf as
a library from libxdp. The libxdp APIs are exactly the same so it
should just be linking with libxdp instead of libbpf for the AF_XDP
functionality. If not, please submit a bug report. Linking with both
libraries is supported but make sure you link in the correct order so
that the new functions in libxdp are used instead of the deprecated
ones in libbpf.
Libxdp can be found at https://github.com/xdp-project/xdp-tools.
[0] Closes: https://github.com/libbpf/libbpf/issues/270
Signed-off-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20211029090111.4733-1-magnus.karlsson@gmail.com
|
|
./test_progs-no_alu32 -vv -t twfw
Before the 64-bit_into_32-bit fix:
19: (25) if r1 > 0x3f goto pc+6
R1_w=inv(id=0,umax_value=63,var_off=(0x0; 0xff),s32_max_value=255,u32_max_value=255)
and eventually:
invalid access to map value, value_size=8 off=7 size=8
R6 max value is outside of the allowed memory range
libbpf: failed to load object 'no_alu32/twfw.o'
After the fix:
19: (25) if r1 > 0x3f goto pc+6
R1_w=inv(id=0,umax_value=63,var_off=(0x0; 0x3f))
verif_twfw:OK
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211101222153.78759-3-alexei.starovoitov@gmail.com
|
|
Before this fix:
166: (b5) if r2 <= 0x1 goto pc+22
from 166 to 189: R2=invP(id=1,umax_value=1,var_off=(0x0; 0xffffffff))
After this fix:
166: (b5) if r2 <= 0x1 goto pc+22
from 166 to 189: R2=invP(id=1,umax_value=1,var_off=(0x0; 0x1))
While processing BPF_JLE the reg_set_min_max() would set true_reg->umax_value = 1
and call __reg_combine_64_into_32(true_reg).
Without the fix it would not pass the condition:
if (__reg64_bound_u32(reg->umin_value) && __reg64_bound_u32(reg->umax_value))
since umin_value == 0 at this point.
Before commit 10bf4e83167c the umin was incorrectly ingored.
The commit 10bf4e83167c fixed the correctness issue, but pessimized
propagation of 64-bit min max into 32-bit min max and corresponding var_off.
Fixes: 10bf4e83167c ("bpf: Fix propagation of 32 bit unsigned bounds from 64 bit bounds")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211101222153.78759-1-alexei.starovoitov@gmail.com
|
|
Previous fix aded bpf_clamp_umax() helper use to re-validate boundaries.
While that works correctly, it introduces more branches, which blows up
past 1 million instructions in no-alu32 variant of strobemeta selftests.
Switching len variable from u32 to u64 also fixes the issue and reduces
the number of validated instructions, so use that instead. Fix this
patch and bpf_clamp_umax() removed, both alu32 and no-alu32 selftests
pass.
Fixes: 0133c20480b1 ("selftests/bpf: Fix strobemeta selftest regression")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211101230118.1273019-1-andrii@kernel.org
|
|
This patch has two changes:
1) Adds a new function "test_success_cases" to test
successfully creating + adding + looking up a value
in a bloom filter map from the userspace side.
2) Use bpf_create_map instead of bpf_create_map_xattr in
the "test_fail_cases" and test_inner_map to make the
code look cleaner.
Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029224909.1721024-4-joannekoong@fb.com
|
|
This patch makes 2 changes regarding alignment padding
for the "map_extra" field.
1) In the kernel header, "map_extra" and "btf_value_type_id"
are rearranged to consolidate the hole.
Before:
struct bpf_map {
...
u32 max_entries; /* 36 4 */
u32 map_flags; /* 40 4 */
/* XXX 4 bytes hole, try to pack */
u64 map_extra; /* 48 8 */
int spin_lock_off; /* 56 4 */
int timer_off; /* 60 4 */
/* --- cacheline 1 boundary (64 bytes) --- */
u32 id; /* 64 4 */
int numa_node; /* 68 4 */
...
bool frozen; /* 117 1 */
/* XXX 10 bytes hole, try to pack */
/* --- cacheline 2 boundary (128 bytes) --- */
...
struct work_struct work; /* 144 72 */
/* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */
struct mutex freeze_mutex; /* 216 144 */
/* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */
u64 writecnt; /* 360 8 */
/* size: 384, cachelines: 6, members: 26 */
/* sum members: 354, holes: 2, sum holes: 14 */
/* padding: 16 */
/* forced alignments: 2, forced holes: 1, sum forced holes: 10 */
} __attribute__((__aligned__(64)));
After:
struct bpf_map {
...
u32 max_entries; /* 36 4 */
u64 map_extra; /* 40 8 */
u32 map_flags; /* 48 4 */
int spin_lock_off; /* 52 4 */
int timer_off; /* 56 4 */
u32 id; /* 60 4 */
/* --- cacheline 1 boundary (64 bytes) --- */
int numa_node; /* 64 4 */
...
bool frozen /* 113 1 */
/* XXX 14 bytes hole, try to pack */
/* --- cacheline 2 boundary (128 bytes) --- */
...
struct work_struct work; /* 144 72 */
/* --- cacheline 3 boundary (192 bytes) was 24 bytes ago --- */
struct mutex freeze_mutex; /* 216 144 */
/* --- cacheline 5 boundary (320 bytes) was 40 bytes ago --- */
u64 writecnt; /* 360 8 */
/* size: 384, cachelines: 6, members: 26 */
/* sum members: 354, holes: 1, sum holes: 14 */
/* padding: 16 */
/* forced alignments: 2, forced holes: 1, sum forced holes: 14 */
} __attribute__((__aligned__(64)));
2) Add alignment padding to the bpf_map_info struct
More details can be found in commit 36f9814a494a ("bpf: fix uapi hole
for 32 bit compat applications")
Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029224909.1721024-3-joannekoong@fb.com
|
|
Running a BPF_PROG_TYPE_STRUCT_OPS prog for dummy_st_ops::test_N()
through bpf_prog_test_run(). Four test cases are added:
(1) attach dummy_st_ops should fail
(2) function return value of bpf_dummy_ops::test_1() is expected
(3) pointer argument of bpf_dummy_ops::test_1() works as expected
(4) multiple arguments passed to bpf_dummy_ops::test_2() are correct
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211025064025.2567443-5-houtao1@huawei.com
|
|
This patch is closely related to commit 6016df8fe874 ("selftests/bpf:
Fix broken riscv build"). When clang includes the system include
directories, but targeting BPF program, __BITS_PER_LONG defaults to
32, unless explicitly set. Work around this problem, by explicitly
setting __BITS_PER_LONG to __riscv_xlen.
Signed-off-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211028161057.520552-5-bjorn@kernel.org
|
|
Add macros for 64-bit RISC-V PT_REGS to bpf_tracing.h.
Signed-off-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211028161057.520552-4-bjorn@kernel.org
|
|
Add RISC-V to the HOSTARCH parsing, so that ARCH is "riscv", and not
"riscv32" or "riscv64".
This affects the perf and libbpf builds, so that arch specific
includes are correctly picked up for RISC-V.
Signed-off-by: Björn Töpel <bjorn@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20211028161057.520552-3-bjorn@kernel.org
|
|
Add the test to check sockmap with strparser is working well.
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211029141216.211899-3-liujian56@huawei.com
|
|
After "skmsg: lose offset info in sk_psock_skb_ingress", the test case
with ktls failed. This because ktls parser(tls_read_size) return value
is 285 not 256.
The case like this:
tls_sk1 --> redir_sk --> tls_sk2
tls_sk1 sent out 512 bytes data, after tls related processing redir_sk
recved 570 btyes data, and redirect 512 (skb_use_parser) bytes data to
tls_sk2; but tls_sk2 needs 285 * 2 bytes data, receive timeout occurred.
Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20211029141216.211899-2-liujian56@huawei.com
|
|
After most recent nightly Clang update strobemeta selftests started
failing with the following error (relevant portion of assembly included):
1624: (85) call bpf_probe_read_user_str#114
1625: (bf) r1 = r0
1626: (18) r2 = 0xfffffffe
1628: (5f) r1 &= r2
1629: (55) if r1 != 0x0 goto pc+7
1630: (07) r9 += 104
1631: (6b) *(u16 *)(r9 +0) = r0
1632: (67) r0 <<= 32
1633: (77) r0 >>= 32
1634: (79) r1 = *(u64 *)(r10 -456)
1635: (0f) r1 += r0
1636: (7b) *(u64 *)(r10 -456) = r1
1637: (79) r1 = *(u64 *)(r10 -368)
1638: (c5) if r1 s< 0x1 goto pc+778
1639: (bf) r6 = r8
1640: (0f) r6 += r7
1641: (b4) w1 = 0
1642: (6b) *(u16 *)(r6 +108) = r1
1643: (79) r3 = *(u64 *)(r10 -352)
1644: (79) r9 = *(u64 *)(r10 -456)
1645: (bf) r1 = r9
1646: (b4) w2 = 1
1647: (85) call bpf_probe_read_user_str#114
R1 unbounded memory access, make sure to bounds check any such access
In the above code r0 and r1 are implicitly related. Clang knows that,
but verifier isn't able to infer this relationship.
Yonghong Song narrowed down this "regression" in code generation to
a recent Clang optimization change ([0]), which for BPF target generates
code pattern that BPF verifier can't handle and loses track of register
boundaries.
This patch works around the issue by adding an BPF assembly-based helper
that helps to prove to the verifier that upper bound of the register is
a given constant by controlling the exact share of generated BPF
instruction sequence. This fixes the immediate issue for strobemeta
selftest.
[0] https://github.com/llvm/llvm-project/commit/acabad9ff6bf13e00305d9d8621ee8eafc1f8b08
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20211029182907.166910-1-andrii@kernel.org
|
|
This is selftest script for amt interface.
This script includes basic forwarding scenarion and torture scenario.
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently the simult_flows.sh self-tests are not very stable,
especially when running on slow VMs.
The tests measure runtime for transfers on multiple subflows
and check that the time is near the theoretical maximum.
The current test infra introduces a bit of jitter in test
runtime, due to multiple explicit delays. Additionally the
runtime is measured by the shell script wrapper. On a slow
VM, the script overhead is measurable and subject to relevant
jitter.
One solution to make the test more stable would be adding more
slack to the expected time; that could possibly hide real
regressions. Instead move the measurement inside the command
doing the transfer, and drop most unneeded sleeps.
Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In listener_ns, we should pass srv_proto argument to mptcp_connect command,
not cl_proto.
Fixes: 7d1e6f1639044 ("selftests: mptcp: add testcase for active-back")
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Before fix:
| Case IPv6 rejection returned 0, expected 1
|FAIL - 1/4 cases failed
With the fix:
| OK
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make sure to use pclose() to properly close the pipe opened by popen().
Fixes: 81f77fd0deeb ("bpf: add selftest for stackmap with BPF_F_STACK_BUILD_ID")
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20211026143409.42666-1-andrea.righi@canonical.com
|
|
When I fixed IGMPv3/MLDv2 to use the bridge's multicast_membership_interval
value which is chosen by user-space instead of calculating it based on
multicast_query_interval and multicast_query_response_interval I forgot
to update the selftests relying on that behaviour. Now we have to
manually set the expected GMI value to perform the tests correctly and get
proper results (similar to IGMPv2 behaviour).
Fixes: fac3cb82a54a ("net: bridge: mcast: use multicast_membership_interval for IGMPv3")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Update .gitignore with newly added tests:
tools/testing/selftests/net/af_unix/test_unix_oob
tools/testing/selftests/net/gro
tools/testing/selftests/net/ioam6_parser
tools/testing/selftests/net/toeplitz
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
TBF can be used as a root qdisc, in which case it is supposed to configure
port shaper. Add a test that verifies that this is so by installing a root
TBF with a ETS or PRIO below it, and then expecting individual bands to all
be shaped according to the root TBF configuration.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
TBF can be used as a root qdisc, with the usual ETS/RED/TBF hierarchy below
it. This use should now be offloaded. Add a test that verifies that it is.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The allocated ring buffer is never freed, do so in the cleanup path.
Fixes: f446b570ac7e ("bpf/selftests: Update the IMA test to use BPF ring buffer")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-9-memxor@gmail.com
|
|
Similar to the fix in commit:
e31eec77e4ab ("bpf: selftests: Fix fd cleanup in get_branch_snapshot")
We use designated initializer to set fds to -1 without breaking on
future changes to MAX_SERVER constant denoting the array size.
The particular close(0) occurs on non-reuseport tests, so it can be seen
with -n 115/{2,3} but not 115/4. This can cause problems with future
tests if they depend on BTF fd never being acquired as fd 0, breaking
internal libbpf assumptions.
Fixes: 0ab5539f8584 ("selftests/bpf: Tests for BPF_SK_LOOKUP attach point")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-8-memxor@gmail.com
|
|
Also, avoid using CO-RE features, as lskel doesn't support CO-RE, yet.
Include both light and libbpf skeleton in same file to test both of them
together.
In c48e51c8b07a ("bpf: selftests: Add selftests for module kfunc support"),
I added support for generating both lskel and libbpf skel for a BPF
object, however the name parameter for bpftool caused collisions when
included in same file together. This meant that every test needed a
separate file for a libbpf/light skeleton separation instead of
subtests.
Change that by appending a "_lskel" suffix to the name for files using
light skeleton, and convert all existing users.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-7-memxor@gmail.com
|
|
There are some instances where we don't use O_CLOEXEC when opening an
fd, fix these up. Otherwise, it is possible that a parallel fork causes
these fds to leak into a child process on execve.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-6-memxor@gmail.com
|
|
Add a simple wrapper for passing an fd and getting a new one >= 3 if it
is one of 0, 1, or 2. There are two primary reasons to make this change:
First, libbpf relies on the assumption a certain BPF fd is never 0 (e.g.
most recently noticed in [0]). Second, Alexei pointed out in [1] that
some environments reset stdin, stdout, and stderr if they notice an
invalid fd at these numbers. To protect against both these cases, switch
all internal BPF syscall wrappers in libbpf to always return an fd >= 3.
We only need to modify the syscall wrappers and not other code that
assumes a valid fd by doing >= 0, to avoid pointless churn, and because
it is still a valid assumption. The cost paid is two additional syscalls
if fd is in range [0, 2].
[0]: e31eec77e4ab ("bpf: selftests: Fix fd cleanup in get_branch_snapshot")
[1]: https://lore.kernel.org/bpf/CAADnVQKVKY8o_3aU8Gzke443+uHa-eGoM0h7W4srChMXU1S4Bg@mail.gmail.com
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-5-memxor@gmail.com
|
|
This extends existing ksym relocation code to also support relocating
weak ksyms. Care needs to be taken to zero out the src_reg (currently
BPF_PSEUOD_BTF_ID, always set for gen_loader by bpf_object__relocate_data)
when the BTF ID lookup fails at runtime. This is not a problem for
libbpf as it only sets ext->is_set when BTF ID lookup succeeds (and only
proceeds in case of failure if ext->is_weak, leading to src_reg
remaining as 0 for weak unresolved ksym).
A pattern similar to emit_relo_kfunc_btf is followed of first storing
the default values and then jumping over actual stores in case of an
error. For src_reg adjustment, we also need to perform it when copying
the populated instruction, so depending on if copied insn[0].imm is 0 or
not, we decide to jump over the adjustment.
We cannot reach that point unless the ksym was weak and resolved and
zeroed out, as the emit_check_err will cause us to jump to cleanup
label, so we do not need to recheck whether the ksym is weak before
doing the adjustment after copying BTF ID and BTF FD.
This is consistent with how libbpf relocates weak ksym. Logging
statements are added to show the relocation result and aid debugging.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-4-memxor@gmail.com
|
|
This uses the bpf_kallsyms_lookup_name helper added in previous patches
to relocate typeless ksyms. The return value ENOENT can be ignored, and
the value written to 'res' can be directly stored to the insn, as it is
overwritten to 0 on lookup failure. For repeating symbols, we can simply
copy the previously populated bpf_insn.
Also, we need to take care to not close fds for typeless ksym_desc, so
reuse the 'off' member's space to add a marker for typeless ksym and use
that to skip them in cleanup_relos.
We add a emit_ksym_relo_log helper that avoids duplicating common
logging instructions between typeless and weak ksym (for future commit).
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-3-memxor@gmail.com
|
|
This helper allows us to get the address of a kernel symbol from inside
a BPF_PROG_TYPE_SYSCALL prog (used by gen_loader), so that we can
relocate typeless ksym vars.
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20211028063501.2239335-2-memxor@gmail.com
|
|
filter
This patch adds benchmark tests for comparing the performance of hashmap
lookups without the bloom filter vs. hashmap lookups with the bloom filter.
Checking the bloom filter first for whether the element exists should
overall enable a higher throughput for hashmap lookups, since if the
element does not exist in the bloom filter, we can avoid a costly lookup in
the hashmap.
On average, using 5 hash functions in the bloom filter tended to perform
the best across the widest range of different entry sizes. The benchmark
results using 5 hash functions (running on 8 threads on a machine with one
numa node, and taking the average of 3 runs) were roughly as follows:
value_size = 4 bytes -
10k entries: 30% faster
50k entries: 40% faster
100k entries: 40% faster
500k entres: 70% faster
1 million entries: 90% faster
5 million entries: 140% faster
value_size = 8 bytes -
10k entries: 30% faster
50k entries: 40% faster
100k entries: 50% faster
500k entres: 80% faster
1 million entries: 100% faster
5 million entries: 150% faster
value_size = 16 bytes -
10k entries: 20% faster
50k entries: 30% faster
100k entries: 35% faster
500k entres: 65% faster
1 million entries: 85% faster
5 million entries: 110% faster
value_size = 40 bytes -
10k entries: 5% faster
50k entries: 15% faster
100k entries: 20% faster
500k entres: 65% faster
1 million entries: 75% faster
5 million entries: 120% faster
Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-6-joannekoong@fb.com
|
|
This patch adds benchmark tests for the throughput (for lookups + updates)
and the false positive rate of bloom filter lookups, as well as some
minor refactoring of the bash script for running the benchmarks.
These benchmarks show that as the number of hash functions increases,
the throughput and the false positive rate of the bloom filter decreases.
>From the benchmark data, the approximate average false-positive rates
are roughly as follows:
1 hash function = ~30%
2 hash functions = ~15%
3 hash functions = ~5%
4 hash functions = ~2.5%
5 hash functions = ~1%
6 hash functions = ~0.5%
7 hash functions = ~0.35%
8 hash functions = ~0.15%
9 hash functions = ~0.1%
10 hash functions = ~0%
For reference data, the benchmarks run on one thread on a machine
with one numa node for 1 to 5 hash functions for 8-byte and 64-byte
values are as follows:
1 hash function:
50k entries
8-byte value
Lookups - 51.1 M/s operations
Updates - 33.6 M/s operations
False positive rate: 24.15%
64-byte value
Lookups - 15.7 M/s operations
Updates - 15.1 M/s operations
False positive rate: 24.2%
100k entries
8-byte value
Lookups - 51.0 M/s operations
Updates - 33.4 M/s operations
False positive rate: 24.04%
64-byte value
Lookups - 15.6 M/s operations
Updates - 14.6 M/s operations
False positive rate: 24.06%
500k entries
8-byte value
Lookups - 50.5 M/s operations
Updates - 33.1 M/s operations
False positive rate: 27.45%
64-byte value
Lookups - 15.6 M/s operations
Updates - 14.2 M/s operations
False positive rate: 27.42%
1 mil entries
8-byte value
Lookups - 49.7 M/s operations
Updates - 32.9 M/s operations
False positive rate: 27.45%
64-byte value
Lookups - 15.4 M/s operations
Updates - 13.7 M/s operations
False positive rate: 27.58%
2.5 mil entries
8-byte value
Lookups - 47.2 M/s operations
Updates - 31.8 M/s operations
False positive rate: 30.94%
64-byte value
Lookups - 15.3 M/s operations
Updates - 13.2 M/s operations
False positive rate: 30.95%
5 mil entries
8-byte value
Lookups - 41.1 M/s operations
Updates - 28.1 M/s operations
False positive rate: 31.01%
64-byte value
Lookups - 13.3 M/s operations
Updates - 11.4 M/s operations
False positive rate: 30.98%
2 hash functions:
50k entries
8-byte value
Lookups - 34.1 M/s operations
Updates - 20.1 M/s operations
False positive rate: 9.13%
64-byte value
Lookups - 8.4 M/s operations
Updates - 7.9 M/s operations
False positive rate: 9.21%
100k entries
8-byte value
Lookups - 33.7 M/s operations
Updates - 18.9 M/s operations
False positive rate: 9.13%
64-byte value
Lookups - 8.4 M/s operations
Updates - 7.7 M/s operations
False positive rate: 9.19%
500k entries
8-byte value
Lookups - 32.7 M/s operations
Updates - 18.1 M/s operations
False positive rate: 12.61%
64-byte value
Lookups - 8.4 M/s operations
Updates - 7.5 M/s operations
False positive rate: 12.61%
1 mil entries
8-byte value
Lookups - 30.6 M/s operations
Updates - 18.9 M/s operations
False positive rate: 12.54%
64-byte value
Lookups - 8.0 M/s operations
Updates - 7.0 M/s operations
False positive rate: 12.52%
2.5 mil entries
8-byte value
Lookups - 25.3 M/s operations
Updates - 16.7 M/s operations
False positive rate: 16.77%
64-byte value
Lookups - 7.9 M/s operations
Updates - 6.5 M/s operations
False positive rate: 16.88%
5 mil entries
8-byte value
Lookups - 20.8 M/s operations
Updates - 14.7 M/s operations
False positive rate: 16.78%
64-byte value
Lookups - 7.0 M/s operations
Updates - 6.0 M/s operations
False positive rate: 16.78%
3 hash functions:
50k entries
8-byte value
Lookups - 25.1 M/s operations
Updates - 14.6 M/s operations
False positive rate: 7.65%
64-byte value
Lookups - 5.8 M/s operations
Updates - 5.5 M/s operations
False positive rate: 7.58%
100k entries
8-byte value
Lookups - 24.7 M/s operations
Updates - 14.1 M/s operations
False positive rate: 7.71%
64-byte value
Lookups - 5.8 M/s operations
Updates - 5.3 M/s operations
False positive rate: 7.62%
500k entries
8-byte value
Lookups - 22.9 M/s operations
Updates - 13.9 M/s operations
False positive rate: 2.62%
64-byte value
Lookups - 5.6 M/s operations
Updates - 4.8 M/s operations
False positive rate: 2.7%
1 mil entries
8-byte value
Lookups - 19.8 M/s operations
Updates - 12.6 M/s operations
False positive rate: 2.60%
64-byte value
Lookups - 5.3 M/s operations
Updates - 4.4 M/s operations
False positive rate: 2.69%
2.5 mil entries
8-byte value
Lookups - 16.2 M/s operations
Updates - 10.7 M/s operations
False positive rate: 4.49%
64-byte value
Lookups - 4.9 M/s operations
Updates - 4.1 M/s operations
False positive rate: 4.41%
5 mil entries
8-byte value
Lookups - 18.8 M/s operations
Updates - 9.2 M/s operations
False positive rate: 4.45%
64-byte value
Lookups - 5.2 M/s operations
Updates - 3.9 M/s operations
False positive rate: 4.54%
4 hash functions:
50k entries
8-byte value
Lookups - 19.7 M/s operations
Updates - 11.1 M/s operations
False positive rate: 1.01%
64-byte value
Lookups - 4.4 M/s operations
Updates - 4.0 M/s operations
False positive rate: 1.00%
100k entries
8-byte value
Lookups - 19.5 M/s operations
Updates - 10.9 M/s operations
False positive rate: 1.00%
64-byte value
Lookups - 4.3 M/s operations
Updates - 3.9 M/s operations
False positive rate: 0.97%
500k entries
8-byte value
Lookups - 18.2 M/s operations
Updates - 10.6 M/s operations
False positive rate: 2.05%
64-byte value
Lookups - 4.3 M/s operations
Updates - 3.7 M/s operations
False positive rate: 2.05%
1 mil entries
8-byte value
Lookups - 15.5 M/s operations
Updates - 9.6 M/s operations
False positive rate: 1.99%
64-byte value
Lookups - 4.0 M/s operations
Updates - 3.4 M/s operations
False positive rate: 1.99%
2.5 mil entries
8-byte value
Lookups - 13.8 M/s operations
Updates - 7.7 M/s operations
False positive rate: 3.91%
64-byte value
Lookups - 3.7 M/s operations
Updates - 3.6 M/s operations
False positive rate: 3.78%
5 mil entries
8-byte value
Lookups - 13.0 M/s operations
Updates - 6.9 M/s operations
False positive rate: 3.93%
64-byte value
Lookups - 3.5 M/s operations
Updates - 3.7 M/s operations
False positive rate: 3.39%
5 hash functions:
50k entries
8-byte value
Lookups - 16.4 M/s operations
Updates - 9.1 M/s operations
False positive rate: 0.78%
64-byte value
Lookups - 3.5 M/s operations
Updates - 3.2 M/s operations
False positive rate: 0.77%
100k entries
8-byte value
Lookups - 16.3 M/s operations
Updates - 9.0 M/s operations
False positive rate: 0.79%
64-byte value
Lookups - 3.5 M/s operations
Updates - 3.2 M/s operations
False positive rate: 0.78%
500k entries
8-byte value
Lookups - 15.1 M/s operations
Updates - 8.8 M/s operations
False positive rate: 1.82%
64-byte value
Lookups - 3.4 M/s operations
Updates - 3.0 M/s operations
False positive rate: 1.78%
1 mil entries
8-byte value
Lookups - 13.2 M/s operations
Updates - 7.8 M/s operations
False positive rate: 1.81%
64-byte value
Lookups - 3.2 M/s operations
Updates - 2.8 M/s operations
False positive rate: 1.80%
2.5 mil entries
8-byte value
Lookups - 10.5 M/s operations
Updates - 5.9 M/s operations
False positive rate: 0.29%
64-byte value
Lookups - 3.2 M/s operations
Updates - 2.4 M/s operations
False positive rate: 0.28%
5 mil entries
8-byte value
Lookups - 9.6 M/s operations
Updates - 5.7 M/s operations
False positive rate: 0.30%
64-byte value
Lookups - 3.2 M/s operations
Updates - 2.7 M/s operations
False positive rate: 0.30%
Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-5-joannekoong@fb.com
|
|
This patch adds test cases for bpf bloom filter maps. They include tests
checking against invalid operations by userspace, tests for using the
bloom filter map as an inner map, and a bpf program that queries the
bloom filter map for values added by a userspace program.
Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-4-joannekoong@fb.com
|
|
This patch adds the libbpf infrastructure for supporting a
per-map-type "map_extra" field, whose definition will be
idiosyncratic depending on map type.
For example, for the bloom filter map, the lower 4 bits of
map_extra is used to denote the number of hash functions.
Please note that until libbpf 1.0 is here, the
"bpf_create_map_params" struct is used as a temporary
means for propagating the map_extra field to the kernel.
Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-3-joannekoong@fb.com
|
|
This patch adds the kernel-side changes for the implementation of
a bpf bloom filter map.
The bloom filter map supports peek (determining whether an element
is present in the map) and push (adding an element to the map)
operations.These operations are exposed to userspace applications
through the already existing syscalls in the following way:
BPF_MAP_LOOKUP_ELEM -> peek
BPF_MAP_UPDATE_ELEM -> push
The bloom filter map does not have keys, only values. In light of
this, the bloom filter map's API matches that of queue stack maps:
user applications use BPF_MAP_LOOKUP_ELEM/BPF_MAP_UPDATE_ELEM
which correspond internally to bpf_map_peek_elem/bpf_map_push_elem,
and bpf programs must use the bpf_map_peek_elem and bpf_map_push_elem
APIs to query or add an element to the bloom filter map. When the
bloom filter map is created, it must be created with a key_size of 0.
For updates, the user will pass in the element to add to the map
as the value, with a NULL key. For lookups, the user will pass in the
element to query in the map as the value, with a NULL key. In the
verifier layer, this requires us to modify the argument type of
a bloom filter's BPF_FUNC_map_peek_elem call to ARG_PTR_TO_MAP_VALUE;
as well, in the syscall layer, we need to copy over the user value
so that in bpf_map_peek_elem, we know which specific value to query.
A few things to please take note of:
* If there are any concurrent lookups + updates, the user is
responsible for synchronizing this to ensure no false negative lookups
occur.
* The number of hashes to use for the bloom filter is configurable from
userspace. If no number is specified, the default used will be 5 hash
functions. The benchmarks later in this patchset can help compare the
performance of using different number of hashes on different entry
sizes. In general, using more hashes decreases both the false positive
rate and the speed of a lookup.
* Deleting an element in the bloom filter map is not supported.
* The bloom filter map may be used as an inner map.
* The "max_entries" size that is specified at map creation time is used
to approximate a reasonable bitmap size for the bloom filter, and is not
otherwise strictly enforced. If the user wishes to insert more entries
into the bloom filter than "max_entries", they may do so but they should
be aware that this may lead to a higher false positive rate.
Signed-off-by: Joanne Koong <joannekoong@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211027234504.30744-2-joannekoong@fb.com
|
|
include/net/sock.h
7b50ecfcc6cd ("net: Rename ->stream_memory_read to ->sock_is_readable")
4c1e34c0dbff ("vsock: Enable y2038 safe timeval for timeout")
drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
0daa55d033b0 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table")
e77bcdd1f639 ("octeontx2-af: Display all enabled PF VF rsrc_alloc entries.")
Adjacent code addition in both cases, keep both.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch delete ns_src/ns_dst/ns_redir namespaces before recreating
them, making the test more robust.
Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211025223345.2136168-5-fallentree@fb.com
|
|
This patch makes attach_probe uses its own method as attach point,
avoiding conflict with other tests like bpf_cookie.
Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211025223345.2136168-4-fallentree@fb.com
|
|
Increase memory to 4G, 8 SMP core with host cpu passthrough. This
make it run faster in parallel mode and more likely to succeed.
Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211025223345.2136168-2-fallentree@fb.com
|
|
Add a flag to `enum libbpf_strict_mode' to disable the global
`bpf_objects_list', preventing race conditions when concurrent threads
call bpf_object__open() or bpf_object__close().
bpf_object__next() will return NULL if this option is set.
Callers may achieve the same workflow by tracking bpf_objects in
application code.
[0] Closes: https://github.com/libbpf/libbpf/issues/293
Signed-off-by: Joe Burton <jevburton@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211026223528.413950-1-jevburton.kernel@gmail.com
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2021-10-26
We've added 12 non-merge commits during the last 7 day(s) which contain
a total of 23 files changed, 118 insertions(+), 98 deletions(-).
The main changes are:
1) Fix potential race window in BPF tail call compatibility check, from Toke Høiland-Jørgensen.
2) Fix memory leak in cgroup fs due to missing cgroup_bpf_offline(), from Quanyang Wang.
3) Fix file descriptor reference counting in generic_map_update_batch(), from Xu Kuohai.
4) Fix bpf_jit_limit knob to the max supported limit by the arch's JIT, from Lorenz Bauer.
5) Fix BPF sockmap ->poll callbacks for UDP and AF_UNIX sockets, from Cong Wang and Yucong Sun.
6) Fix BPF sockmap concurrency issue in TCP on non-blocking sendmsg calls, from Liu Jian.
7) Fix build failure of INODE_STORAGE and TASK_STORAGE maps on !CONFIG_NET, from Tejun Heo.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf: Fix potential race in tail call compatibility check
bpf: Move BPF_MAP_TYPE for INODE_STORAGE and TASK_STORAGE outside of CONFIG_NET
selftests/bpf: Use recv_timeout() instead of retries
net: Implement ->sock_is_readable() for UDP and AF_UNIX
skmsg: Extract and reuse sk_msg_is_readable()
net: Rename ->stream_memory_read to ->sock_is_readable
tcp_bpf: Fix one concurrency problem in the tcp_bpf_send_verdict function
cgroup: Fix memory leak caused by missing cgroup_bpf_offline
bpf: Fix error usage of map_fd and fdget() in generic_map_update_batch()
bpf: Prevent increasing bpf_jit_limit above max
bpf: Define bpf_jit_alloc_exec_limit for arm64 JIT
bpf: Define bpf_jit_alloc_exec_limit for riscv JIT
====================
Link: https://lore.kernel.org/r/20211026201920.11296-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We use non-blocking sockets in those tests, retrying for
EAGAIN is ugly because there is no upper bound for the packet
arrival time, at least in theory. After we fix poll() on
sockmap sockets, now we can switch to select()+recv().
Signed-off-by: Yucong Sun <sunyucong@gmail.com>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20211008203306.37525-5-xiyou.wangcong@gmail.com
|
|
After adding the previous patches, the constraint that all the router
interface MAC addresses have the same prefix is no longer relevant.
Remove the test cases that validated that this constraint is honored.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When all the RIF MAC profiles are in use, test that it is possible to
change the MAC of a netdev (i.e., a RIF) when its MAC profile is not
shared with other RIFs. Test that replacement fails when the MAC profile
is shared.
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Verify that MAC profile changes are indeed applied and that packets are
forwarded with the correct source MAC.
Output example:
$ ./rif_mac_profiles.sh
TEST: h1->h2: new mac profile [ OK ]
TEST: h2->h1: new mac profile [ OK ]
TEST: h1->h2: edit mac profile [ OK ]
TEST: h2->h1: edit mac profile [ OK ]
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Query the maximum number of supported RIF MAC profiles using
devlink-resource and verify that all available MAC profiles can be utilized
and that an error is generated when user space tries to exceed this number.
Output example in Spectrum-2:
$ TESTS='rif_mac_profile' ./resource_scale.sh
TEST: 'rif_mac_profile' 4 [ OK ]
TEST: 'rif_mac_profile' overflow 5 [ OK ]
Signed-off-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Function in modules could appear in /proc/kallsyms in random order.
ffffffffa02608a0 t bpf_testmod_loop_test
ffffffffa02600c0 t __traceiter_bpf_testmod_test_writable_bare
ffffffffa0263b60 d __tracepoint_bpf_testmod_test_write_bare
ffffffffa02608c0 T bpf_testmod_test_read
ffffffffa0260d08 t __SCT__tp_func_bpf_testmod_test_writable_bare
ffffffffa0263300 d __SCK__tp_func_bpf_testmod_test_read
ffffffffa0260680 T bpf_testmod_test_write
ffffffffa0260860 t bpf_testmod_test_mod_kfunc
Therefore, we cannot reliably use kallsyms_find_next() to find the end of
a function. Replace it with a simple guess (start + 128). This is good
enough for this test.
Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20211022234814.318457-1-songliubraving@fb.com
|