Age | Commit message (Collapse) | Author |
|
Coverity points out that after calling btf__new_empty_split() the wrong
value is checked for error.
Fixes: 58e185a0dc35 ("libbpf: Add btf__distill_base() creating split BTF with distilled base BTF")
Reported-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240629100058.2866763-1-alan.maguire@oracle.com
|
|
Introduce e2e selftest for bpf_xdp_flow_lookup kfunc through
xdp_flowtable utility.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/b74393fb4539aecbbd5ac7883605f86a95fb0b6b.1719698275.git.lorenzo@kernel.org
|
|
ARRAY_SIZE is used on multiple places, move its definition in
bpf_misc.h header.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20240626134719.3893748-1-jolsa@kernel.org
|
|
When building with clang for ARCH=i386, the following errors are
observed:
CC kernel/bpf/btf_relocate.o
./tools/lib/bpf/btf_relocate.c:206:23: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
206 | info[id].needs_size = true;
| ^ ~
./tools/lib/bpf/btf_relocate.c:256:25: error: implicit truncation from 'int' to a one-bit wide bit-field changes value from 1 to -1 [-Werror,-Wsingle-bit-bitfield-constant-conversion]
256 | base_info.needs_size = true;
| ^ ~
2 errors generated.
The problem is we use 1-bit, 31-bit bitfields in a signed int.
Changing to
bool needs_size: 1;
unsigned int size:31;
...resolves the error and pahole reports that 4 bytes are used
for the underlying representation:
$ pahole btf_name_info tools/lib/bpf/btf_relocate.o
struct btf_name_info {
const char * name; /* 0 8 */
unsigned int needs_size:1; /* 8: 0 4 */
unsigned int size:31; /* 8: 1 4 */
__u32 id; /* 12 4 */
/* size: 16, cachelines: 1, members: 4 */
/* last cacheline: 16 bytes */
};
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240624192903.854261-1-alan.maguire@oracle.com
|
|
Guard close() with extra link_fd[i] > 0 and fexit_fd[i] > 0
check to prevent close(-1).
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240623131753.2133829-1-make24@iscas.ac.cn
|
|
and reg->type check
Add new negative selftests which are intended to cover the
out-of-bounds memory access that could be performed on a
CONST_PTR_TO_DYNPTR within functions taking a ARG_PTR_TO_DYNPTR |
MEM_RDONLY as an argument, and acceptance of invalid register types
i.e. PTR_TO_BTF_ID within functions taking a ARG_PTR_TO_DYNPTR |
MEM_RDONLY.
Reported-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Link: https://lore.kernel.org/r/20240625062857.92760-2-mattbobrowski@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When upgrading to libbpf 1.3 we noticed a big performance hit while
loading programs using CORE on non base-BTF symbols. This was tracked
down to the new BTF sanity check logic. The issue is the base BTF
definitions are checked first for the base BTF and then again for every
module BTF.
Loading 5 dummy programs (using libbpf-rs) that are using CORE on a
non-base BTF symbol on my system:
- Before this fix: 3s.
- With this fix: 0.1s.
Fix this by only checking the types starting at the BTF start id. This
should ensure the base BTF is still checked as expected but only once
(btf->start_id == 1 when creating the base BTF), and then only
additional types are checked for each module BTF.
Fixes: 3903802bb99a ("libbpf: Add basic BTF sanity validation")
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20240624090908.171231-1-atenart@kernel.org
|
|
add simple kfuncs to create/destroy a context type to bpf_testmod,
register them and add a kfunc_call test to use them. This provides
test coverage for registration of dtor kfuncs from modules.
By transferring the context pointer to a map value as a __kptr
we also trigger the map-based dtor cleanup logic, improving test
coverage.
Suggested-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-7-alan.maguire@oracle.com
|
|
Share relocation implementation with the kernel. As part of this,
we also need the type/string iteration functions so also share
btf_iter.c file. Relocation code in kernel and userspace is identical
save for the impementation of the reparenting of split BTF to the
relocated base BTF and retrieval of the BTF header from "struct btf";
these small functions need separate user-space and kernel implementations
for the separate "struct btf"s they operate upon.
One other wrinkle on the kernel side is we have to map .BTF.ids in
modules as they were generated with the type ids used at BTF encoding
time. btf_relocate() optionally returns an array mapping from old BTF
ids to relocated ids, so we use that to fix up these references where
needed for kfuncs.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-5-alan.maguire@oracle.com
|
|
This will allow it to be shared with the kernel. No functional change.
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-4-alan.maguire@oracle.com
|
|
Use less verbose names in BTF relocation code and fix off-by-one error
and typo in btf_relocate.c. Simplify loop over matching distilled
types, moving from assigning a _next value in loop body to moving
match check conditions into the guard.
Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240620091733.1967885-2-alan.maguire@oracle.com
|
|
Adding selftest to verify that struct_ops maps are auto attached by
bpf skeleton's `*__attach` function.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240621180324.238379-1-yatsenko@meta.com
|
|
This patch changes a few tests to make use of regular expressions.
Fixed tests otherwise fail when compiled with GCC.
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240617141458.471620-3-cupertino.miranda@oracle.com
|
|
Add support for __regex and __regex_unpriv macros to check the test
execution output against a regular expression. This is similar to __msg
and __msg_unpriv, however those expect do substring matching.
Signed-off-by: Cupertino Miranda <cupertino.miranda@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240617141458.471620-2-cupertino.miranda@oracle.com
|
|
I encountered an issue when building the test_progs from the repository [1]:
$ pwd
/work/Qemu/x86_64/linux-6.10-rc2/tools/testing/selftests/bpf/
$ make test_progs V=1
[...]
./tools/sbin/bpftool gen object ./ip_check_defrag.bpf.linked2.o ./ip_check_defrag.bpf.linked1.o
libbpf: failed to find symbol for variable 'bpf_dynptr_slice' in section '.ksyms'
Error: failed to link './ip_check_defrag.bpf.linked1.o': No such file or directory (2)
[...]
Upon investigation, I discovered that the btf_types referenced in the '.ksyms'
section had a kind of BTF_KIND_FUNC instead of BTF_KIND_VAR:
$ bpftool btf dump file ./ip_check_defrag.bpf.linked1.o
[...]
[2] DATASEC '.ksyms' size=0 vlen=2
type_id=16 offset=0 size=0 (FUNC 'bpf_dynptr_from_skb')
type_id=17 offset=0 size=0 (FUNC 'bpf_dynptr_slice')
[...]
[16] FUNC 'bpf_dynptr_from_skb' type_id=82 linkage=extern
[17] FUNC 'bpf_dynptr_slice' type_id=85 linkage=extern
[...]
For a detailed analysis, please refer to [2]. We can add a kind checking to
fix the issue.
[1] https://github.com/eddyz87/bpf/tree/binsort-btf-dedup
[2] https://lore.kernel.org/all/0c0ef20c-c05e-4db9-bad7-2cbc0d6dfae7@oracle.com/
Fixes: 8fd27bf69b86 ("libbpf: Add BPF static linker BTF and BTF.ext support")
Signed-off-by: Donglin Peng <dolinux.peng@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240619122355.426405-1-dolinux.peng@gmail.com
|
|
New versions of bpftool now emit additional link placeholders for BPF
maps (struct_ops maps are the only maps right now that support
attachment), and set up BPF skeleton in such a way that libbpf will
auto-attach BPF maps automatically, assumming libbpf is recent enough
(v1.5+). Old libbpf will do nothing with those links and won't attempt
to auto-attach maps. This allows user code to handle both pre-v1.5 and
v1.5+ versions of libbpf at runtime, if necessary.
But if users don't have (or don't want to) control bpftool version that
generates skeleton, then they can't just assume that skeleton will have
link placeholders. To make this detection possible and easy, let's add
the following to generated skeleton header file:
#define BPF_SKEL_SUPPORTS_MAP_AUTO_ATTACH 1
This can be used during compilation time to guard code that accesses
skel->links.<map> slots.
Note, if auto-attachment is undesirable, libbpf allows to disable this
through bpf_map__set_autoattach(map, false). This is necessary only on
libbpf v1.5+, older libbpf doesn't support map auto-attach anyways.
Libbpf version can be detected at compilation time using
LIBBPF_MAJOR_VERSION and LIBBPF_MINOR_VERSION macros, or at runtime with
libbpf_major_version() and libbpf_minor_version() APIs.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/bpf/20240618183832.2535876-1-andrii@kernel.org
|
|
This reverts [1] and changes return value for bpf_session_cookie
in bpf selftests. Having long * might lead to problems on 32-bit
architectures.
Fixes: 2b8dd87332cd ("bpf: Make bpf_session_cookie() kfunc return long *")
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240619081624.1620152-1-jolsa@kernel.org
|
|
Since start_server_str() is added now, it can be used in script
test_tcp_check_syncookie_user.c instead of start_server_addr() to
simplify the code.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/5d2f442261d37cff16c1f1b21a2b188508ab67fa.1718932493.git.tanggeliang@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Since start_server_str() is added now, it can be used in mptcp.c in
start_mptcp_server() instead of using helpers make_sockaddr() and
start_server_addr() to simplify the code.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/16fb3e2cd60b64b5470b0e69f1aa233feaf2717c.1718932493.git.tanggeliang@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
In test_bpf_ip_check_defrag_ok(), the new helper client_socket() can be
used to replace connect_to_fd_opts() with "noconnect" opts, and the strcut
member "noconnect" of network_helper_opts can be dropped now, always
connect to server in connect_to_fd_opts().
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/f45760becce51986e4e08283c7df0f933eb0da14.1718932493.git.tanggeliang@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch extracts a new helper client_socket() from connect_to_fd_opts()
to create the client socket, but don't connect to the server. Then
connect_to_fd_opts() can be implemented using client_socket() and
connect_fd_to_addr(). This helper can be used in connect_to_addr() too,
and make "noconnect" opts useless.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/4169c554e1cee79223feea49a1adc459d55e1ffe.1718932493.git.tanggeliang@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This patch moves "post_socket_cb" and "noconnect" into connect_to_addr(),
then connect_to_fd_opts() can be implemented by getsockname() and
connect_to_addr(). This change makes connect_to_* interfaces more unified.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/4569c30533e14c22fae6c05070aad809720551c1.1718932493.git.tanggeliang@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The opts.{type, noconnect} is at least a bit non intuitive or unnecessary.
The only use case now is in test_bpf_ip_check_defrag_ok which ends up
bypassing most (or at least some) of the connect_to_fd_opts() logic. It's
much better that test should have its own connect_to_fd_opts() instead.
This patch adds a new "type" parameter for connect_to_fd_opts(), then
opts->type and getsockopt(SO_TYPE) can be replaced by "type" parameter in
it.
In connect_to_fd(), use getsockopt(SO_TYPE) to get "type" value and pass
it to connect_to_fd_opts().
In bpf_tcp_ca.c and cgroup_v1v2.c, "SOCK_STREAM" types are passed to
connect_to_fd_opts(), and in ip_check_defrag.c, different types "SOCK_RAW"
and "SOCK_DGRAM" are passed to it.
With these changes, the strcut member "type" of network_helper_opts can be
dropped now.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Link: https://lore.kernel.org/r/cfd20b5ad4085c1d1af5e79df3b09013a407199f.1718932493.git.tanggeliang@kylinos.cn
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Now that btf_parse_elf() handles .BTF.base section presence,
we need to ensure that resolve_btfids uses .BTF.base when present
rather than the vmlinux base BTF passed in via the -B option.
Detect .BTF.base section presence and unset the base BTF path
to ensure that BTF ELF parsing will do the right thing.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240613095014.357981-7-alan.maguire@oracle.com
|
|
Update btf_parse_elf() to check if .BTF.base section is present.
The logic is as follows:
if .BTF.base section exists:
distilled_base := btf_new(.BTF.base)
if distilled_base:
btf := btf_new(.BTF, .base_btf=distilled_base)
if base_btf:
btf_relocate(btf, base_btf)
else:
btf := btf_new(.BTF)
return btf
In other words:
- if .BTF.base section exists, load BTF from it and use it as a base
for .BTF load;
- if base_btf is specified and .BTF.base section exist, relocate newly
loaded .BTF against base_btf.
Signed-off-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20240613095014.357981-6-alan.maguire@oracle.com
|
|
Ensure relocated BTF looks as expected; in this case identical to
original split BTF, with a few duplicate anonymous types added to
split BTF by the relocation process. Also add relocation tests
for edge cases like missing type in base BTF and multiple types
of the same name.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240613095014.357981-5-alan.maguire@oracle.com
|
|
Map distilled base BTF type ids referenced in split BTF and their
references to the base BTF passed in, and if the mapping succeeds,
reparent the split BTF to the base BTF.
Relocation is done by first verifying that distilled base BTF
only consists of named INT, FLOAT, ENUM, FWD, STRUCT and
UNION kinds; then we sort these to speed lookups. Once sorted,
the base BTF is iterated, and for each relevant kind we check
for an equivalent in distilled base BTF. When found, the
mapping from distilled -> base BTF id and string offset is recorded.
In establishing mappings, we need to ensure we check STRUCT/UNION
size when the STRUCT/UNION is embedded in a split BTF STRUCT/UNION,
and when duplicate names exist for the same STRUCT/UNION. Otherwise
size is ignored in matching STRUCT/UNIONs.
Once all mappings are established, we can update type ids
and string offsets in split BTF and reparent it to the new base.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240613095014.357981-4-alan.maguire@oracle.com
|
|
Test generation of split+distilled base BTF, ensuring that
- named base BTF STRUCTs and UNIONs are represented as 0-vlen sized
STRUCT/UNIONs
- named ENUM[64]s are represented as 0-vlen named ENUM[64]s
- anonymous struct/unions are represented in full in split BTF
- anonymous enums are represented in full in split BTF
- types unreferenced from split BTF are not present in distilled
base BTF
Also test that with vmlinux BTF and split BTF based upon it,
we only represent needed base types referenced from split BTF
in distilled base.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240613095014.357981-3-alan.maguire@oracle.com
|
|
To support more robust split BTF, adding supplemental context for the
base BTF type ids that split BTF refers to is required. Without such
references, a simple shuffling of base BTF type ids (without any other
significant change) invalidates the split BTF. Here the attempt is made
to store additional context to make split BTF more robust.
This context comes in the form of distilled base BTF providing minimal
information (name and - in some cases - size) for base INTs, FLOATs,
STRUCTs, UNIONs, ENUMs and ENUM64s along with modified split BTF that
points at that base and contains any additional types needed (such as
TYPEDEF, PTR and anonymous STRUCT/UNION declarations). This
information constitutes the minimal BTF representation needed to
disambiguate or remove split BTF references to base BTF. The rules
are as follows:
- INT, FLOAT, FWD are recorded in full.
- if a named base BTF STRUCT or UNION is referred to from split BTF, it
will be encoded as a zero-member sized STRUCT/UNION (preserving
size for later relocation checks). Only base BTF STRUCT/UNIONs
that are either embedded in split BTF STRUCT/UNIONs or that have
multiple STRUCT/UNION instances of the same name will _need_ size
checks at relocation time, but as it is possible a different set of
types will be duplicates in the later to-be-resolved base BTF,
we preserve size information for all named STRUCT/UNIONs.
- if an ENUM[64] is named, a ENUM forward representation (an ENUM
with no values) of the same size is used.
- in all other cases, the type is added to the new split BTF.
Avoiding struct/union/enum/enum64 expansion is important to keep the
distilled base BTF representation to a minimum size.
When successful, new representations of the distilled base BTF and new
split BTF that refers to it are returned. Both need to be freed by the
caller.
So to take a simple example, with split BTF with a type referring
to "struct sk_buff", we will generate distilled base BTF with a
0-member STRUCT sk_buff of the appropriate size, and the split BTF
will refer to it instead.
Tools like pahole can utilize such split BTF to populate the .BTF
section (split BTF) and an additional .BTF.base section. Then
when the split BTF is loaded, the distilled base BTF can be used
to relocate split BTF to reference the current (and possibly changed)
base BTF.
So for example if "struct sk_buff" was id 502 when the split BTF was
originally generated, we can use the distilled base BTF to see that
id 502 refers to a "struct sk_buff" and replace instances of id 502
with the current (relocated) base BTF sk_buff type id.
Distilled base BTF is small; when building a kernel with all modules
using distilled base BTF as a test, overall module size grew by only
5.3Mb total across ~2700 modules.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240613095014.357981-2-alan.maguire@oracle.com
|
|
Improve arena based tests and add several C and asm tests
with specific pattern.
These tests would have failed without add_const verifier support.
Also add several loop_inside_iter*() tests that are not related to add_const,
but nice to have.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240613013815.953-5-alexei.starovoitov@gmail.com
|
|
Add big endian support for can_loop/cond_break macros.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/bpf/20240613013815.953-4-alexei.starovoitov@gmail.com
|
|
Compilers can generate the code
r1 = r2
r1 += 0x1
if r2 < 1000 goto ...
use knowledge of r2 range in subsequent r1 operations
So remember constant delta between r2 and r1 and update r1 after 'if' condition.
Unfortunately LLVM still uses this pattern for loops with 'can_loop' construct:
for (i = 0; i < 1000 && can_loop; i++)
The "undo" pass was introduced in LLVM
https://reviews.llvm.org/D121937
to prevent this optimization, but it cannot cover all cases.
Instead of fighting middle end optimizer in BPF backend teach the verifier
about this pattern.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20240613013815.953-3-alexei.starovoitov@gmail.com
|
|
Add special test to be sure that only __nullable BTF params can be
replaced by NULL. This patch adds fake kfuncs in bpf_testmod to
properly test different params.
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://lore.kernel.org/r/20240613211817.1551967-6-vadfed@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The bench shows some improvements, around 4% faster on decrypt.
Before:
Benchmark 'crypto-decrypt' started.
Iter 0 (325.719us): hits 5.105M/s ( 5.105M/prod), drops 0.000M/s, total operations 5.105M/s
Iter 1 (-17.295us): hits 5.224M/s ( 5.224M/prod), drops 0.000M/s, total operations 5.224M/s
Iter 2 ( 5.504us): hits 4.630M/s ( 4.630M/prod), drops 0.000M/s, total operations 4.630M/s
Iter 3 ( 9.239us): hits 5.148M/s ( 5.148M/prod), drops 0.000M/s, total operations 5.148M/s
Iter 4 ( 37.885us): hits 5.198M/s ( 5.198M/prod), drops 0.000M/s, total operations 5.198M/s
Iter 5 (-53.282us): hits 5.167M/s ( 5.167M/prod), drops 0.000M/s, total operations 5.167M/s
Iter 6 (-17.809us): hits 5.186M/s ( 5.186M/prod), drops 0.000M/s, total operations 5.186M/s
Summary: hits 5.092 ± 0.228M/s ( 5.092M/prod), drops 0.000 ±0.000M/s, total operations 5.092 ± 0.228M/s
After:
Benchmark 'crypto-decrypt' started.
Iter 0 (268.912us): hits 5.312M/s ( 5.312M/prod), drops 0.000M/s, total operations 5.312M/s
Iter 1 (124.869us): hits 5.354M/s ( 5.354M/prod), drops 0.000M/s, total operations 5.354M/s
Iter 2 (-36.801us): hits 5.334M/s ( 5.334M/prod), drops 0.000M/s, total operations 5.334M/s
Iter 3 (254.628us): hits 5.334M/s ( 5.334M/prod), drops 0.000M/s, total operations 5.334M/s
Iter 4 (-77.691us): hits 5.275M/s ( 5.275M/prod), drops 0.000M/s, total operations 5.275M/s
Iter 5 (-164.510us): hits 5.313M/s ( 5.313M/prod), drops 0.000M/s, total operations 5.313M/s
Iter 6 (-81.376us): hits 5.346M/s ( 5.346M/prod), drops 0.000M/s, total operations 5.346M/s
Summary: hits 5.326 ± 0.029M/s ( 5.326M/prod), drops 0.000 ±0.000M/s, total operations 5.326 ± 0.029M/s
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://lore.kernel.org/r/20240613211817.1551967-5-vadfed@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Adjust selftests to use nullable option for state and IV arg.
Reviewed-by: Eduard Zingerman <eddyz87@gmail.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://lore.kernel.org/r/20240613211817.1551967-4-vadfed@meta.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When selftests are built with a new enough clang, the arena selftests
opt-in to use LLVM address_space attribute annotations for arena
pointers.
These annotations are not emitted by kfunc prototype generation. This
causes compilation errors when clang sees conflicting prototypes.
Fix by opting arena selftests out of using generated kfunc prototypes.
Fixes: 770abbb5a25a ("bpftool: Support dumping kfunc prototypes from BTF")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202406131810.c1B8hTm8-lkp@intel.com/
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/fc59a617439ceea9ad8dfbb4786843c2169496ae.1718295425.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Adjust skb program test to run with checksum validation.
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240606145851.229116-2-vadfed@meta.com
|
|
Add special flag to validate that TC BPF program properly updates
checksum information in skb.
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240606145851.229116-1-vadfed@meta.com
|
|
This patch enables dumping kfunc prototypes from bpftool. This is useful
b/c with this patch, end users will no longer have to manually define
kfunc prototypes. For the kernel tree, this also means we can optionally
drop kfunc prototypes from:
tools/testing/selftests/bpf/bpf_kfuncs.h
tools/testing/selftests/bpf/bpf_experimental.h
Example usage:
$ make PAHOLE=/home/dxu/dev/pahole/build/pahole -j30 vmlinux
$ ./tools/bpf/bpftool/bpftool btf dump file ./vmlinux format c | rg "__ksym;" | head -3
extern void cgroup_rstat_updated(struct cgroup *cgrp, int cpu) __weak __ksym;
extern void cgroup_rstat_flush(struct cgroup *cgrp) __weak __ksym;
extern struct bpf_key *bpf_lookup_user_key(u32 serial, u64 flags) __weak __ksym;
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/bf6c08f9263c4bd9d10a717de95199d766a13f61.1718207789.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The xfrm_info selftest locally defines an aliased type such that folks
with CONFIG_XFRM_INTERFACE=m/n configs can still build the selftests.
See commit aa67961f3243 ("selftests/bpf: Allow building bpf tests with CONFIG_XFRM_INTERFACE=[m|n]").
Thus, it is simpler if this selftest opts out of using enerated kfunc
prototypes. The preprocessor macro this commit uses will be introduced
in the final commit.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/afe0bb1c50487f52542cdd5230c4aef9e36ce250.1718207789.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The bpf-nf selftests play various games with aliased types such that
folks with CONFIG_NF_CONNTRACK=m/n configs can still build the
selftests. See commits:
1058b6a78db2 ("selftests/bpf: Do not fail build if CONFIG_NF_CONNTRACK=m/n")
92afc5329a5b ("selftests/bpf: Fix build errors if CONFIG_NF_CONNTRACK=m")
Thus, it is simpler if these selftests opt out of using generated kfunc
prototypes. The preprocessor macro this commit uses will be introduced
in the final commit.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/044a5b10cb3abd0d71cb1c818ee0bfc4a2239332.1718207789.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Previously, kfunc declarations in bpf_kfuncs.h (and others) used "user
facing" types for kfuncs prototypes while the actual kfunc definitions
used "kernel facing" types. More specifically: bpf_dynptr vs
bpf_dynptr_kern, __sk_buff vs sk_buff, and xdp_md vs xdp_buff.
It wasn't an issue before, as the verifier allows aliased types.
However, since we are now generating kfunc prototypes in vmlinux.h (in
addition to keeping bpf_kfuncs.h around), this conflict creates
compilation errors.
Fix this conflict by using "user facing" types in kfunc definitions.
This results in more casts, but otherwise has no additional runtime
cost.
Note, similar to 5b268d1ebcdc ("bpf: Have bpf_rdonly_cast() take a const
pointer"), we also make kfuncs take const arguments where appropriate in
order to make the kfunc more permissive.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/b58346a63a0e66bc9b7504da751b526b0b189a67.1718207789.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
With generated kfunc prototypes, the existing callback names will
conflict. Fix by namespacing with a bpf_ prefix.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/efe7aadad8a054e5aeeba94b1d2e4502eee09d7a.1718207789.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The prototype in progs/map_percpu_stats.c is not in line with how the
actual kfuncs are defined in kernel/bpf/map_iter.c. This causes
compilation errors when kfunc prototypes are generated from BTF.
Fix by aligning with actual kfunc definitions.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/0497e11a71472dcb71ada7c90ad691523ae87c3b.1718207789.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
The prototype in progs/nested_trust_common.h is not in line with how the
actual kfuncs are defined in kernel/bpf/cpumask.c. This causes compilation
errors when kfunc prototypes are generated from BTF.
Fix by aligning with actual kfunc definitions.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/437936a4e554b02e04566dd6e3f0a5d08370cc8c.1718207789.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Some prototypes in progs/get_func_ip_test.c were not in line with how the
actual kfuncs are defined in net/bpf/test_run.c. This causes compilation
errors when kfunc prototypes are generated from BTF.
Fix by aligning with actual kfunc definitions.
Also remove two unused prototypes.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/1e68870e7626b7b9c6420e65076b307fc404a2f0.1718207789.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
bpf_iter_task_vma_new() is defined as taking a u64 as its 3rd argument.
u64 is a unsigned long long. bpf_experimental.h was defining the
prototype as unsigned long.
Fix by using __u64.
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/fab4509bfee914f539166a91c3ff41e949f3df30.1718207789.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
When CONFIG_NETKIT=y,
bpftool-cgroup shows error even if the cgroup's path is correct:
$ bpftool cgroup tree /sys/fs/cgroup
CgroupPath
ID AttachType AttachFlags Name
Error: can't query bpf programs attached to /sys/fs/cgroup: No such device or address
>From strace and kernel tracing, I found netkit returned ENXIO and this command failed.
I think this AttachType(BPF_NETKIT_PRIMARY) is not relevant to cgroup.
bpftool-cgroup should query just only cgroup-related attach types.
v2->v3:
- removed an unnecessary check
v1->v2:
- used an array of cgroup attach types
Signed-off-by: Kenta Tada <tadakentaso@gmail.com>
Reviewed-by: Quentin Monnet <qmo@kernel.org>
Link: https://lore.kernel.org/r/20240607111704.6716-1-tadakentaso@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:
====================
pull-request: bpf-next 2024-06-06
We've added 54 non-merge commits during the last 10 day(s) which contain
a total of 50 files changed, 1887 insertions(+), 527 deletions(-).
The main changes are:
1) Add a user space notification mechanism via epoll when a struct_ops
object is getting detached/unregistered, from Kui-Feng Lee.
2) Big batch of BPF selftest refactoring for sockmap and BPF congctl
tests, from Geliang Tang.
3) Add BTF field (type and string fields, right now) iterator support
to libbpf instead of using existing callback-based approaches,
from Andrii Nakryiko.
4) Extend BPF selftests for the latter with a new btf_field_iter
selftest, from Alan Maguire.
5) Add new kfuncs for a generic, open-coded bits iterator,
from Yafang Shao.
6) Fix BPF selftests' kallsyms_find() helper under kernels configured
with CONFIG_LTO_CLANG_THIN, from Yonghong Song.
7) Remove a bunch of unused structs in BPF selftests,
from David Alan Gilbert.
8) Convert test_sockmap section names into names understood by libbpf
so it can deduce program type and attach type, from Jakub Sitnicki.
9) Extend libbpf with the ability to configure log verbosity
via LIBBPF_LOG_LEVEL environment variable, from Mykyta Yatsenko.
10) Fix BPF selftests with regards to bpf_cookie and find_vma flakiness
in nested VMs, from Song Liu.
11) Extend riscv32/64 JITs to introduce shift/add helpers to generate Zba
optimization, from Xiao Wang.
12) Enable BPF programs to declare arrays and struct fields with kptr,
bpf_rb_root, and bpf_list_head, from Kui-Feng Lee.
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits)
selftests/bpf: Drop useless arguments of do_test in bpf_tcp_ca
selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca
selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca
selftests/bpf: Add start_test helper in bpf_tcp_ca
selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
libbpf: Auto-attach struct_ops BPF maps in BPF skeleton
selftests/bpf: Add btf_field_iter selftests
selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT
libbpf: Remove callback-based type/string BTF field visitor helpers
bpftool: Use BTF field iterator in btfgen
libbpf: Make use of BTF field iterator in BTF handling code
libbpf: Make use of BTF field iterator in BPF linker code
libbpf: Add BTF field iterator
selftests/bpf: Ignore .llvm.<hash> suffix in kallsyms_find()
selftests/bpf: Fix bpf_cookie and find_vma in nested VM
selftests/bpf: Test global bpf_list_head arrays.
selftests/bpf: Test global bpf_rb_root arrays and fields in nested struct types.
selftests/bpf: Test kptr arrays and kptrs in nested struct fields.
bpf: limit the number of levels of a nested struct type.
bpf: look into the types of the fields of a struct type recursively.
...
====================
Link: https://lore.kernel.org/r/20240606223146.23020-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ACLs that reside in the algorithmic TCAM (A-TCAM) in Spectrum-2 and
newer ASICs can share the same mask if their masks only differ in up to
8 consecutive bits. For example, consider the following filters:
# tc filter add dev swp1 ingress pref 1 proto ip flower dst_ip 192.0.2.0/24 action drop
# tc filter add dev swp1 ingress pref 1 proto ip flower dst_ip 198.51.100.128/25 action drop
The second filter can use the same mask as the first (dst_ip/24) with a
delta of 1 bit.
However, the above only works because the two filters have different
values in the common unmasked part (dst_ip/24). When entries have the
same value in the common unmasked part they create undesired collisions
in the device since many entries now have the same key. This leads to
firmware errors such as [1] and to a reduced scale.
Fix by adjusting the hash table key to only include the value in the
common unmasked part. That is, without including the delta bits. That
way the driver will detect the collision during filter insertion and
spill the filter into the circuit TCAM (C-TCAM).
Add a test case that fails without the fix and adjust existing cases
that check C-TCAM spillage according to the above limitation.
[1]
mlxsw_spectrum2 0000:06:00.0: EMAD reg access failed (tid=3379b18a00003394,reg_id=3027(ptce3),type=write,status=8(resource not available))
Fixes: c22291f7cf45 ("mlxsw: spectrum: acl: Implement delta for ERP")
Reported-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Tested-by: Alexander Zubkov <green@qrator.net>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|