Age | Commit message (Collapse) | Author |
|
[ Upstream commit b42693415b86f608049cf1b4870adc1dc65e58b0 ]
C++ enum forward declarations are fundamentally not compatible with pure
C enum definitions, and so libbpf's use of `enum bpf_stats_type;`
forward declaration in libbpf/bpf.h public API header is causing C++
compilation issues.
More details can be found in [0], but it comes down to C++ supporting
enum forward declaration only with explicitly specified backing type:
enum bpf_stats_type: int;
In C (and I believe it's a GCC extension also), such forward declaration
is simply:
enum bpf_stats_type;
Further, in Linux UAPI this enum is defined in pure C way:
enum bpf_stats_type { BPF_STATS_RUN_TIME = 0; }
And even though in both cases backing type is int, which can be
confirmed by looking at DWARF information, for C++ compiler actual enum
definition and forward declaration are incompatible.
To eliminate this problem, for C++ mode define input argument as int,
which makes enum unnecessary in libbpf public header. This solves the
issue and as demonstrated by next patch doesn't cause any unwanted
compiler warnings, at least with default warnings setting.
[0] https://stackoverflow.com/questions/42766839/c11-enum-forward-causes-underlying-type-mismatch
[1] Closes: https://github.com/libbpf/libbpf/issues/249
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20221130200013.2997831-1-andrii@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f3c51fe02c55bd944662714e5b91b96dc271ad9f ]
When examining module BTF, it is common to see core kernel structures
such as sk_buff, net_device duplicated in the module. After adding
debug messaging to BTF it turned out that much of the problem
was down to the identical struct test failing during deduplication;
sometimes the compiler adds identical structs. However
it turns out sometimes that type ids of identical struct members
can also differ, even when the containing structs are still identical.
To take an example, for struct sk_buff, debug messaging revealed
that the identical struct matching was failing for the anon
struct "headers"; specifically for the first field:
__u8 __pkt_type_offset[0]; /* 128 0 */
Looking at the code in BTF deduplication, we have code that guards
against the possibility of identical struct definitions, down to
type ids, and identical array definitions. However in this case
we have a struct which is being defined twice but does not have
identical type ids since each duplicate struct has separate type
ids for the above array member. A similar problem (though not
observed) could occur for struct-in-struct.
The solution is to make the "identical struct" test check members
not just for matching ids, but to also check if they in turn are
identical structs or arrays.
The results of doing this are quite dramatic (for some modules
at least); I see the number of type ids drop from around 10000
to just over 1000 in one module for example.
For testing use latest pahole or apply [1], otherwise dedups
can fail for the reasons described there.
Also fix return type of btf_dedup_identical_arrays() as
suggested by Andrii to match boolean return type used
elsewhere.
Fixes: efdd3eb8015e ("libbpf: Accommodate DWARF/compiler bug with duplicated structs")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1666622309-22289-1-git-send-email-alan.maguire@oracle.com
[1] https://lore.kernel.org/bpf/1666364523-9648-1-git-send-email-alan.maguire
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d0d382f95a9270dcf803539d6781d6bd67e3f5b2 ]
When there are no program sections, obj->programs is left unallocated,
and find_prog_by_sec_insn()'s search lands on &obj->programs[0] == NULL,
and will cause null-pointer dereference in the following access to
prog->sec_idx.
Guard the search with obj->nr_programs similar to what's being done in
__bpf_program__iter() to prevent null-pointer access from happening.
Fixes: db2b8b06423c ("libbpf: Support CO-RE relocations for multi-prog sections")
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221012022353.7350-4-shung-hsi.yu@suse.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 35a855509e6ee3442477c8ebc6827b5b5d32a7b5 ]
ELF section data pointer returned by libelf may be NULL (if section has
SHT_NOBITS), so null check section data pointer before attempting to
copy license and kversion section.
Fixes: cb1e5e961991 ("bpf tools: Collect version and license from ELF sections")
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221012022353.7350-3-shung-hsi.yu@suse.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 51deedc9b8680953437dfe359e5268120de10e30 ]
This commit replace e_shnum with the elf_getshdrnum() helper to fix two
oss-fuzz-reported heap-buffer overflow in __bpf_object__open. Both
reports are incorrectly marked as fixed and while still being
reproducible in the latest libbpf.
# clusterfuzz-testcase-minimized-bpf-object-fuzzer-5747922482888704
libbpf: loading object 'fuzz-object' from buffer
libbpf: sec_cnt is 0
libbpf: elf: section(1) .data, size 0, link 538976288, flags 2020202020202020, type=2
libbpf: elf: section(2) .data, size 32, link 538976288, flags 202020202020ff20, type=1
=================================================================
==13==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x6020000000c0 at pc 0x0000005a7b46 bp 0x7ffd12214af0 sp 0x7ffd12214ae8
WRITE of size 4 at 0x6020000000c0 thread T0
SCARINESS: 46 (4-byte-write-heap-buffer-overflow-far-from-bounds)
#0 0x5a7b45 in bpf_object__elf_collect /src/libbpf/src/libbpf.c:3414:24
#1 0x5733c0 in bpf_object_open /src/libbpf/src/libbpf.c:7223:16
#2 0x5739fd in bpf_object__open_mem /src/libbpf/src/libbpf.c:7263:20
...
The issue lie in libbpf's direct use of e_shnum field in ELF header as
the section header count. Where as libelf implemented an extra logic
that, when e_shnum == 0 && e_shoff != 0, will use sh_size member of the
initial section header as the real section header count (part of ELF
spec to accommodate situation where section header counter is larger
than SHN_LORESERVE).
The above inconsistency lead to libbpf writing into a zero-entry calloc
area. So intead of using e_shnum directly, use the elf_getshdrnum()
helper provided by libelf to retrieve the section header counter into
sec_cnt.
Fixes: 0d6988e16a12 ("libbpf: Fix section counting logic")
Fixes: 25bbbd7a444b ("libbpf: Remove assumptions about uniqueness of .rodata/.data/.bss maps")
Signed-off-by: Shung-Hsi Yu <shung-hsi.yu@suse.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40868
Link: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40957
Link: https://lore.kernel.org/bpf/20221012022353.7350-2-shung-hsi.yu@suse.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e19db6762c18ab1ddf7a3ef4d0023780c24dc1e8 ]
Add explicit error message if BPF object file is still using legacy BPF
map definitions in SEC("maps"). Before this change, if BPF object file
is still using legacy map definition user will see a bit confusing:
libbpf: elf: skipping unrecognized data section(4) maps
libbpf: prog 'handler': bad map relo against 'server_map' in section 'maps'
Now libbpf will be explicit about rejecting "maps" ELF section:
libbpf: elf: legacy map definitions in 'maps' section are not supported by libbpf v1.0+
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220803214202.23750-1-andrii@kernel.org
Stable-dep-of: 51deedc9b868 ("libbpf: Use elf_getshdrnum() instead of e_shnum")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0dc9254e03704c75f2ebc9cbef2ce4de83fba603 ]
In the arm64 version of parse_usdt_arg(), when sscanf returns 2, reg_name
is allocated but not freed. Fix it.
Fixes: 0f8619929c57 ("libbpf: Usdt aarch64 arg parsing support")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20221011120108.782373-3-xukuohai@huaweicloud.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 93c660ca40b5d2f7c1b1626e955a8e9fa30e0749 ]
ASAN reports an use-after-free in btf_dump_name_dups:
ERROR: AddressSanitizer: heap-use-after-free on address 0xffff927006db at pc 0xaaaab5dfb618 bp 0xffffdd89b890 sp 0xffffdd89b928
READ of size 2 at 0xffff927006db thread T0
#0 0xaaaab5dfb614 in __interceptor_strcmp.part.0 (test_progs+0x21b614)
#1 0xaaaab635f144 in str_equal_fn tools/lib/bpf/btf_dump.c:127
#2 0xaaaab635e3e0 in hashmap_find_entry tools/lib/bpf/hashmap.c:143
#3 0xaaaab635e72c in hashmap__find tools/lib/bpf/hashmap.c:212
#4 0xaaaab6362258 in btf_dump_name_dups tools/lib/bpf/btf_dump.c:1525
#5 0xaaaab636240c in btf_dump_resolve_name tools/lib/bpf/btf_dump.c:1552
#6 0xaaaab6362598 in btf_dump_type_name tools/lib/bpf/btf_dump.c:1567
#7 0xaaaab6360b48 in btf_dump_emit_struct_def tools/lib/bpf/btf_dump.c:912
#8 0xaaaab6360630 in btf_dump_emit_type tools/lib/bpf/btf_dump.c:798
#9 0xaaaab635f720 in btf_dump__dump_type tools/lib/bpf/btf_dump.c:282
#10 0xaaaab608523c in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:236
#11 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
#12 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
#13 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
#14 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
#15 0xaaaab5d65990 (test_progs+0x185990)
0xffff927006db is located 11 bytes inside of 16-byte region [0xffff927006d0,0xffff927006e0)
freed by thread T0 here:
#0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4)
#1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191
#2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163
#3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106
#4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157
#5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519
#6 0xaaaab6353e10 in btf__add_field tools/lib/bpf/btf.c:2032
#7 0xaaaab6084fcc in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:232
#8 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
#9 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
#10 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
#11 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
#12 0xaaaab5d65990 (test_progs+0x185990)
previously allocated by thread T0 here:
#0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4)
#1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191
#2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163
#3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106
#4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157
#5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519
#6 0xaaaab6353ff0 in btf_add_enum_common tools/lib/bpf/btf.c:2070
#7 0xaaaab6354080 in btf__add_enum tools/lib/bpf/btf.c:2102
#8 0xaaaab6082f50 in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:162
#9 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875
#10 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062
#11 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697
#12 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308
#13 0xaaaab5d65990 (test_progs+0x185990)
The reason is that the key stored in hash table name_map is a string
address, and the string memory is allocated by realloc() function, when
the memory is resized by realloc() later, the old memory may be freed,
so the address stored in name_map references to a freed memory, causing
use-after-free.
Fix it by storing duplicated string address in name_map.
Fixes: 919d2b1dbb07 ("libbpf: Allow modification of BTF and add btf__add_str API")
Signed-off-by: Xu Kuohai <xukuohai@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/bpf/20221011120108.782373-2-xukuohai@huaweicloud.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 689eb2f1ba46b4b02195ac2a71c55b96d619ebf8 ]
Using page size as max_entries when probing ring buffer map, else the
probe may fail on host with 64KB page size (e.g., an ARM64 host).
After the fix, the output of "bpftool feature" on above host will be
correct.
Before :
eBPF map_type ringbuf is NOT available
eBPF map_type user_ringbuf is NOT available
After :
eBPF map_type ringbuf is available
eBPF map_type user_ringbuf is available
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221116072351.1168938-2-houtao@huaweicloud.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit dfd0afbf151d85411b371e841f62b81ee5d1ca54 upstream.
GCC 11.3.0 fails to compile btf_dump.c due to the following error,
which seems to originate in btf_dump_struct_data where the returned
value would be uninitialized if btf_vlen returns zero.
btf_dump.c: In function ‘btf_dump_dump_type_data’:
btf_dump.c:2363:12: error: ‘err’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
2363 | if (err < 0)
| ^
Fixes: 920d16af9b42 ("libbpf: BTF dumper support for typed data")
Signed-off-by: David Michael <fedora.dm0@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/87zgcu60hq.fsf@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 927cbb478adf917e0a142b94baa37f06279cc466 ]
The maximum size of ringbuf is 2GB on x86-64 host, so 2 * max_entries
will overflow u32 when mapping producer page and data pages. Only
casting max_entries to size_t is not enough, because for 32-bits
application on 64-bits kernel the size of read-only mmap region
also could overflow size_t.
So fixing it by casting the size of read-only mmap region into a __u64
and checking whether or not there will be overflow during mmap.
Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221116072351.1168938-3-houtao@huaweicloud.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 5fd2a60aecf3a42b14fa371c55b3dbb18b229230 ]
We need to pass '*link' to final libbpf_get_error,
because that one holds the return value, not 'link'.
Fixes: 4fa5bcfe07f7 ("libbpf: Allow BPF program auto-attach handlers to bail out")
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20221114145257.882322-1-jolsa@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 51e05a8cf8eb34da7473823b7f236a77adfef0b4 ]
I accidentally found that a change in commit 1045b03e07d8 ("netlink: fix
overrun in attribute iteration") was not synchronized to the function
`nla_ok` in tools/lib/bpf/nlattr.c, I think it is necessary to modify,
this patch will do it.
Signed-off-by: Xin Liu <liuxin350@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220930090708.62394-1-liuxin350@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9e32084ef1c33a87a736d6ce3fcb95b60dac9aa1 ]
Currently, resolve_full_path() requires executable permission for both
programs and shared libraries. This causes failures on distos like Debian
since the shared libraries are not installed executable and Linux is not
requiring shared libraries to have executable permissions. Let's remove
executable permission check for shared libraries.
Reported-by: Goro Fuji <goro@fastly.com>
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220806102021.3867130-1-hengqi.chen@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit d25f40ff68aa61c838947bb9adee6c6b36e77453 ]
GCC expects the always_inline attribute to only be set on inline
functions, as such we should make all functions with this attribute
use the __always_inline macro which makes the function inline and
sets the attribute.
Fixes errors like:
/home/buildroot/bpf-next/tools/testing/selftests/bpf/tools/include/bpf/bpf_tracing.h:439:1: error: ‘always_inline’ function might not be inlinable [-Werror=attributes]
439 | ____##name(unsigned long long *ctx, ##args)
| ^~~~
Signed-off-by: James Hilliard <james.hilliard1@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20220803151403.793024-1-james.hilliard1@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 87dbdc230d162bf9ee1ac77c8ade178b6b1e199e ]
Drop the requirement for system-wide kernel UAPI headers to provide full
struct btf_enum64 definition. This is an unexpected requirement that
slipped in libbpf 1.0 and put unnecessary pressure ([0]) on users to have
a bleeding-edge kernel UAPI header from unreleased Linux 6.0.
To achieve this, we forward declare struct btf_enum64. But that's not
enough as there is btf_enum64_value() helper that expects to know the
layout of struct btf_enum64. So we get a bit creative with
reinterpreting memory layout as array of __u32 and accesing lo32/hi32
fields as array elements. Alternative way would be to have a local
pointer variable for anonymous struct with exactly the same layout as
struct btf_enum64, but that gets us into C++ compiler errors complaining
about invalid type casts. So play it safe, if ugly.
[0] Closes: https://github.com/libbpf/libbpf/issues/562
Fixes: d90ec262b35b ("libbpf: Add enum64 support for btf_dump")
Reported-by: Toke Høiland-Jørgensen <toke@toke.dk>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://lore.kernel.org/bpf/20220927042940.147185-1-andrii@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit dbdea9b36fb61da3b9a1be0dd63542e2bfd3e5d7 ]
When attach_prog_fd field was removed in libbpf 1.0 and replaced with
`long: 0` placeholder, it actually shifted all the subsequent fields by
8 byte. This is due to `long: 0` promising to adjust next field's offset
to long-aligned offset. But in this case we were already long-aligned
as pin_root_path is a pointer. So `long: 0` had no effect, and thus
didn't feel the gap created by removed attach_prog_fd.
Non-zero bitfield should have been used instead. I validated using
pahole. Originally kconfig field was at offset 40. With `long: 0` it's
at offset 32, which is wrong. With this change it's back at offset 40.
While technically libbpf 1.0 is allowed to break backwards
compatibility and applications should have been recompiled against
libbpf 1.0 headers, but given how trivial it is to preserve memory
layout, let's fix this.
Reported-by: Grant Seltzer Richman <grantseltzer@gmail.com>
Fixes: 146bf811f5ac ("libbpf: remove most other deprecated high-level APIs")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220923230559.666608-1-andrii@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7620bffbf72cd66a5d18e444a143b5b5989efa87 ]
We found that function btf_dump__dump_type_data can be called by the
user as an API, but in this function, the `opts` parameter may be used
as a null pointer.This causes `opts->indent_str` to trigger a NULL
pointer exception.
Fixes: 2ce8450ef5a3 ("libbpf: add bpf_object__open_{file, mem} w/ extensible opts")
Signed-off-by: Xin Liu <liuxin350@huawei.com>
Signed-off-by: Weibin Kong <kongweibin2@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220917084809.30770-1-liuxin350@huawei.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 749c202cb6ea40f4d7ac95c4a1217a7b506f43a8 ]
Fix SIGSEGV caused by libbpf trying to find attach type in vmlinux BTF
for freplace programs. It's wrong to search in vmlinux BTF and libbpf
doesn't even mark vmlinux BTF as required for freplace programs. So
trying to search anything in obj->vmlinux_btf might cause NULL
dereference if nothing else in BPF object requires vmlinux BTF.
Instead, error out if freplace (EXT) program doesn't specify
attach_prog_fd during at the load time.
Fixes: 91abb4a6d79d ("libbpf: Support attachment of BPF tracing programs to kernel modules")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220909193053.577111-3-andrii@kernel.org
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3045f42a64324d339125a8a1a1763bb9e1e08300 ]
GCC-11 warns about the possibly unitialized err variable in
probe_map_create:
libbpf_probes.c: In function 'probe_map_create':
libbpf_probes.c:361:38: error: 'err' may be used uninitialized in this function [-Werror=maybe-uninitialized]
361 | return fd < 0 && err == exp_err ? 1 : 0;
| ~~~~^~~~~~~~~~
Fixes: 878d8def0603 ("libbpf: Rework feature-probing APIs")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20220801025109.1206633-1-f.fainelli@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
Originally, (refer commit f90d194a867a5a1d ("perf evlist: Do not poll
events that use the system_wide flag") there wasn't much reason to poll
system-wide events because:
1. The mmaps get "merged" via set-output anyway (the per-cpu case)
2. perf reads all mmaps when any event is woken
3. system-wide mmaps do not fill up as fast as the mmaps for user
selected events
But there was 1 reason not to poll which was that it prevented correct
termination due to POLLHUP on all user selected events. That issue is
now easily resolved by using fdarray_flag__nonfilterable.
With the advent of commit ae4f8ae16a078964 ("libperf evlist: Allow
mixing per-thread and per-cpu mmaps"), system-wide mmaps can be used
also in the per-thread case where reason 1 does not apply.
Fix the omission of system-wide events from polling by using the
fdarray_flag__nonfilterable flag.
Example:
Before:
$ perf record --no-bpf-event -vvv -e intel_pt// --per-thread uname 2>err.txt
Linux
$ grep 'sys_perf_event_open.*=\|pollfd' err.txt
sys_perf_event_open: pid 155076 cpu -1 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid 155076 cpu -1 group_fd -1 flags 0x8 = 6
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 15
thread_data[0x55fb43c29e80]: pollfd[0] <- event_fd=5
thread_data[0x55fb43c29e80]: pollfd[1] <- event_fd=6
thread_data[0x55fb43c29e80]: pollfd[2] <- non_perf_event fd=4
After:
$ perf record --no-bpf-event -vvv -e intel_pt// --per-thread uname 2>err.txt
Linux
$ grep 'sys_perf_event_open.*=\|pollfd' err.txt
sys_perf_event_open: pid 156316 cpu -1 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid 156316 cpu -1 group_fd -1 flags 0x8 = 6
sys_perf_event_open: pid -1 cpu 0 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid -1 cpu 2 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid -1 cpu 3 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid -1 cpu 4 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid -1 cpu 5 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid -1 cpu 6 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid -1 cpu 7 group_fd -1 flags 0x8 = 15
thread_data[0x55cc19e58e80]: pollfd[0] <- event_fd=5
thread_data[0x55cc19e58e80]: pollfd[1] <- event_fd=6
thread_data[0x55cc19e58e80]: pollfd[2] <- event_fd=7
thread_data[0x55cc19e58e80]: pollfd[3] <- event_fd=9
thread_data[0x55cc19e58e80]: pollfd[4] <- event_fd=10
thread_data[0x55cc19e58e80]: pollfd[5] <- event_fd=11
thread_data[0x55cc19e58e80]: pollfd[6] <- event_fd=12
thread_data[0x55cc19e58e80]: pollfd[7] <- event_fd=13
thread_data[0x55cc19e58e80]: pollfd[8] <- event_fd=14
thread_data[0x55cc19e58e80]: pollfd[9] <- event_fd=15
thread_data[0x55cc19e58e80]: pollfd[10] <- non_perf_event fd=4
Fixes: ae4f8ae16a078964 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20220915122612.81738-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The offending commit removed mmap_per_thread(), which did not consider
the different set-output rules for per-thread mmaps i.e. in the per-thread
case set-output is used for file descriptors of the same thread not the
same cpu.
This was not immediately noticed because it only happens with
multi-threaded targets and we do not have a test for that yet.
Reinstate mmap_per_thread() expanding it to cover also system-wide per-cpu
events i.e. to continue to allow the mixing of per-thread and per-cpu
mmaps.
Debug messages (with -vv) show the file descriptors that are opened with
sys_perf_event_open. New debug messages are added (needs -vvv) that show
also which file descriptors are mmapped and which are redirected with
set-output.
In the per-cpu case (cpu != -1) file descriptors for the same CPU are
set-output to the first file descriptor for that CPU.
In the per-thread case (cpu == -1) file descriptors for the same thread are
set-output to the first file descriptor for that thread.
Example (process 17489 has 2 threads):
Before (but with new debug prints):
$ perf record --no-bpf-event -vvv --per-thread -p 17489
<SNIP>
sys_perf_event_open: pid 17489 cpu -1 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid 17490 cpu -1 group_fd -1 flags 0x8 = 6
<SNIP>
libperf: idx 0: mmapping fd 5
libperf: idx 0: set output fd 6 -> 5
failed to mmap with 22 (Invalid argument)
After:
$ perf record --no-bpf-event -vvv --per-thread -p 17489
<SNIP>
sys_perf_event_open: pid 17489 cpu -1 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid 17490 cpu -1 group_fd -1 flags 0x8 = 6
<SNIP>
libperf: mmap_per_thread: nr cpu values (may include -1) 1 nr threads 2
libperf: idx 0: mmapping fd 5
libperf: idx 1: mmapping fd 6
<SNIP>
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.018 MB perf.data (15 samples) ]
Per-cpu example (process 20341 has 2 threads, same as above):
$ perf record --no-bpf-event -vvv -p 20341
<SNIP>
sys_perf_event_open: pid 20341 cpu 0 group_fd -1 flags 0x8 = 5
sys_perf_event_open: pid 20342 cpu 0 group_fd -1 flags 0x8 = 6
sys_perf_event_open: pid 20341 cpu 1 group_fd -1 flags 0x8 = 7
sys_perf_event_open: pid 20342 cpu 1 group_fd -1 flags 0x8 = 8
sys_perf_event_open: pid 20341 cpu 2 group_fd -1 flags 0x8 = 9
sys_perf_event_open: pid 20342 cpu 2 group_fd -1 flags 0x8 = 10
sys_perf_event_open: pid 20341 cpu 3 group_fd -1 flags 0x8 = 11
sys_perf_event_open: pid 20342 cpu 3 group_fd -1 flags 0x8 = 12
sys_perf_event_open: pid 20341 cpu 4 group_fd -1 flags 0x8 = 13
sys_perf_event_open: pid 20342 cpu 4 group_fd -1 flags 0x8 = 14
sys_perf_event_open: pid 20341 cpu 5 group_fd -1 flags 0x8 = 15
sys_perf_event_open: pid 20342 cpu 5 group_fd -1 flags 0x8 = 16
sys_perf_event_open: pid 20341 cpu 6 group_fd -1 flags 0x8 = 17
sys_perf_event_open: pid 20342 cpu 6 group_fd -1 flags 0x8 = 18
sys_perf_event_open: pid 20341 cpu 7 group_fd -1 flags 0x8 = 19
sys_perf_event_open: pid 20342 cpu 7 group_fd -1 flags 0x8 = 20
<SNIP>
libperf: mmap_per_cpu: nr cpu values 8 nr threads 2
libperf: idx 0: mmapping fd 5
libperf: idx 0: set output fd 6 -> 5
libperf: idx 1: mmapping fd 7
libperf: idx 1: set output fd 8 -> 7
libperf: idx 2: mmapping fd 9
libperf: idx 2: set output fd 10 -> 9
libperf: idx 3: mmapping fd 11
libperf: idx 3: set output fd 12 -> 11
libperf: idx 4: mmapping fd 13
libperf: idx 4: set output fd 14 -> 13
libperf: idx 5: mmapping fd 15
libperf: idx 5: set output fd 16 -> 15
libperf: idx 6: mmapping fd 17
libperf: idx 6: set output fd 18 -> 17
libperf: idx 7: mmapping fd 19
libperf: idx 7: set output fd 20 -> 19
<SNIP>
[ perf record: Woken up 7 times to write data ]
[ perf record: Captured and wrote 0.020 MB perf.data (17 samples) ]
Fixes: ae4f8ae16a078964 ("libperf evlist: Allow mixing per-thread and per-cpu mmaps")
Reported-by: Tomáš Trnka <trnka@scm.com>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=216441
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220905114209.8389-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
It checks a various combination of the read format settings and verify
it return the value in a proper position. The test uses task-clock
software events to guarantee it's always active and sets enabled/running
time.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220819003644.508916-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The perf_counts_values should be increased to read the new lost data.
Also adjust values after read according the read format.
This supports PERF_FORMAT_GROUP which has a different data format but
it's only available for leader events. Currently it doesn't have an API
to read sibling (member) events in the group. But users may read the
sibling event directly.
Also reading from mmap would be disabled when the read format has ID or
LOST bit as it's not exposed via mmap.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220819003644.508916-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
A mask encoding of a cpu map is laid out as:
u16 nr
u16 long_size
unsigned long mask[];
However, the mask may be 8-byte aligned meaning there is a 4-byte pad
after long_size. This means 32-bit and 64-bit builds see the mask as
being at different offsets. On top of this the structure is in the byte
data[] encoded as:
u16 type
char data[]
This means the mask's struct isn't the required 4 or 8 byte aligned, but
is offset by 2. Consequently the long reads and writes are causing
undefined behavior as the alignment is broken.
Fix the mask struct by creating explicit 32 and 64-bit variants, use a
union to avoid data[] and casts; the struct must be packed so the
layout matches the existing perf.data layout. Taking an address of a
member of a packed struct breaks alignment so pass the packed
perf_record_cpu_map_data to functions, so they can access variables with
the right alignment.
As the 64-bit version has 4 bytes of padding, optimizing writing to only
write the 32-bit version.
Committer notes:
Disable warnings about 'packed' that break the build in some arches like
riscv64, but just around that specific struct.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Colin Ian King <colin.king@intel.com>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220614143353.1559597-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Allows max() to be used with 'const struct perf_cpu_maps *'.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Colin Ian King <colin.king@intel.com>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220614143353.1559597-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth, bpf, can and netfilter.
A little larger than usual but it's all fixes, no late features. It's
large partially because of timing, and partially because of follow ups
to stuff that got merged a week or so before the merge window and
wasn't as widely tested. Maybe the Bluetooth fixes are a little
alarming so we'll address that, but the rest seems okay and not scary.
Notably we're including a fix for the netfilter Kconfig [1], your WiFi
warning [2] and a bluetooth fix which should unblock syzbot [3].
Current release - regressions:
- Bluetooth:
- don't try to cancel uninitialized works [3]
- L2CAP: fix use-after-free caused by l2cap_chan_put
- tls: rx: fix device offload after recent rework
- devlink: fix UAF on failed reload and leftover locks in mlxsw
Current release - new code bugs:
- netfilter:
- flowtable: fix incorrect Kconfig dependencies [1]
- nf_tables: fix crash when nf_trace is enabled
- bpf:
- use proper target btf when exporting attach_btf_obj_id
- arm64: fixes for bpf trampoline support
- Bluetooth:
- ISO: unlock on error path in iso_sock_setsockopt()
- ISO: fix info leak in iso_sock_getsockopt()
- ISO: fix iso_sock_getsockopt for BT_DEFER_SETUP
- ISO: fix memory corruption on iso_pinfo.base
- ISO: fix not using the correct QoS
- hci_conn: fix updating ISO QoS PHY
- phy: dp83867: fix get nvmem cell fail
Previous releases - regressions:
- wifi: cfg80211: fix validating BSS pointers in
__cfg80211_connect_result [2]
- atm: bring back zatm uAPI after ATM had been removed
- properly fix old bug making bonding ARP monitor mode not being able
to work with software devices with lockless Tx
- tap: fix null-deref on skb->dev in dev_parse_header_protocol
- revert "net: usb: ax88179_178a needs FLAG_SEND_ZLP" it helps some
devices and breaks others
- netfilter:
- nf_tables: many fixes rejecting cross-object linking which may
lead to UAFs
- nf_tables: fix null deref due to zeroed list head
- nf_tables: validate variable length element extension
- bgmac: fix a BUG triggered by wrong bytes_compl
- bcmgenet: indicate MAC is in charge of PHY PM
Previous releases - always broken:
- bpf:
- fix bad pointer deref in bpf_sys_bpf() injected via test infra
- disallow non-builtin bpf programs calling the prog_run command
- don't reinit map value in prealloc_lru_pop
- fix UAFs during the read of map iterator fd
- fix invalidity check for values in sk local storage map
- reject sleepable program for non-resched map iterator
- mptcp:
- move subflow cleanup in mptcp_destroy_common()
- do not queue data on closed subflows
- virtio_net: fix memory leak inside XDP_TX with mergeable
- vsock: fix memory leak when multiple threads try to connect()
- rework sk_user_data sharing to prevent psock leaks
- geneve: fix TOS inheriting for ipv4
- tunnels & drivers: do not use RT_TOS for IPv6 flowlabel
- phy: c45 baset1: do not skip aneg configuration if clock role is
not specified
- rose: avoid overflow when /proc displays timer information
- x25: fix call timeouts in blocking connects
- can: mcp251x: fix race condition on receive interrupt
- can: j1939:
- replace user-reachable WARN_ON_ONCE() with netdev_warn_once()
- fix memory leak of skbs in j1939_session_destroy()
Misc:
- docs: bpf: clarify that many things are not uAPI
- seg6: initialize induction variable to first valid array index (to
silence clang vs objtool warning)
- can: ems_usb: fix clang 14's -Wunaligned-access warning"
* tag 'net-6.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (117 commits)
net: atm: bring back zatm uAPI
dpaa2-eth: trace the allocated address instead of page struct
net: add missing kdoc for struct genl_multicast_group::flags
nfp: fix use-after-free in area_cache_get()
MAINTAINERS: use my korg address for mt7601u
mlxsw: minimal: Fix deadlock in ports creation
bonding: fix reference count leak in balance-alb mode
net: usb: qmi_wwan: Add support for Cinterion MV32
bpf: Shut up kern_sys_bpf warning.
net/tls: Use RCU API to access tls_ctx->netdev
tls: rx: device: don't try to copy too much on detach
tls: rx: device: bound the frag walk
net_sched: cls_route: remove from list when handle is 0
selftests: forwarding: Fix failing tests with old libnet
net: refactor bpf_sk_reuseport_detach()
net: fix refcount bug in sk_psock_get (2)
selftests/bpf: Ensure sleepable program is rejected by hash map iter
selftests/bpf: Add write tests for sk local storage map iterator
selftests/bpf: Add tests for reading a dangling map iter fd
bpf: Only allow sleepable program for resched-able iterator
...
|
|
The verifier cannot perform sufficient validation of bpf_attr->test.ctx_in
pointer, therefore bpf programs should not be allowed to call BPF_PROG_RUN
command from within the program.
To fix this issue split bpf_sys_bpf() bpf helper into normal kern_sys_bpf()
kernel function that can only be used by the kernel light skeleton directly.
Reported-by: YiFei Zhu <zhuyifei@google.com>
Fixes: b1d18a7574d0 ("bpf: Extend sys_bpf commands for bpf_syscall programs.")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Pull bitmap updates from Yury Norov:
- fix the duplicated comments on bitmap_to_arr64() (Qu Wenruo)
- optimize out non-atomic bitops on compile-time constants (Alexander
Lobakin)
- cleanup bitmap-related headers (Yury Norov)
- x86/olpc: fix 'logical not is only applied to the left hand side'
(Alexander Lobakin)
- lib/nodemask: inline wrappers around bitmap (Yury Norov)
* tag 'bitmap-6.0-rc1' of https://github.com/norov/linux: (26 commits)
lib/nodemask: inline next_node_in() and node_random()
powerpc: drop dependency on <asm/machdep.h> in archrandom.h
x86/olpc: fix 'logical not is only applied to the left hand side'
lib/cpumask: move some one-line wrappers to header file
headers/deps: mm: align MANITAINERS and Docs with new gfp.h structure
headers/deps: mm: Split <linux/gfp_types.h> out of <linux/gfp.h>
headers/deps: mm: Optimize <linux/gfp.h> header dependencies
lib/cpumask: move trivial wrappers around find_bit to the header
lib/cpumask: change return types to unsigned where appropriate
cpumask: change return types to bool where appropriate
lib/bitmap: change type of bitmap_weight to unsigned long
lib/bitmap: change return types to bool where appropriate
arm: align find_bit declarations with generic kernel
iommu/vt-d: avoid invalid memory access via node_online(NUMA_NO_NODE)
lib/test_bitmap: test the tail after bitmap_to_arr64()
lib/bitmap: fix off-by-one in bitmap_to_arr64()
lib: test_bitmap: add compile-time optimization/evaluations assertions
bitmap: don't assume compiler evaluates small mem*() builtins calls
net/ice: fix initializing the bitmap in the switch code
bitops: let optimize out non-atomic bitops on compile-time constants
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Introduce 'perf lock contention' subtool, using new lock contention
tracepoints and using BPF for in kernel aggregation and then
userspace processing using the perf tooling infrastructure for
resolving symbols, target specification, etc.
Since the new lock contention tracepoints don't provide lock names,
get up to 8 stack traces and display the first non-lock function
symbol name as a caller:
$ perf lock report -F acquired,contended,avg_wait,wait_total
Name acquired contended avg wait total wait
update_blocked_a... 40 40 3.61 us 144.45 us
kernfs_fop_open+... 5 5 3.64 us 18.18 us
_nohz_idle_balance 3 3 2.65 us 7.95 us
tick_do_update_j... 1 1 6.04 us 6.04 us
ep_scan_ready_list 1 1 3.93 us 3.93 us
Supports the usual 'perf record' + 'perf report' workflow as well as
a BCC/bpftrace like mode where you start the tool and then press
control+C to get results:
$ sudo perf lock contention -b
^C
contended total wait max wait avg wait type caller
42 192.67 us 13.64 us 4.59 us spinlock queue_work_on+0x20
23 85.54 us 10.28 us 3.72 us spinlock worker_thread+0x14a
6 13.92 us 6.51 us 2.32 us mutex kernfs_iop_permission+0x30
3 11.59 us 10.04 us 3.86 us mutex kernfs_dop_revalidate+0x3c
1 7.52 us 7.52 us 7.52 us spinlock kthread+0x115
1 7.24 us 7.24 us 7.24 us rwlock:W sys_epoll_wait+0x148
2 7.08 us 3.99 us 3.54 us spinlock delayed_work_timer_fn+0x1b
1 6.41 us 6.41 us 6.41 us spinlock idle_balance+0xa06
2 2.50 us 1.83 us 1.25 us mutex kernfs_iop_lookup+0x2f
1 1.71 us 1.71 us 1.71 us mutex kernfs_iop_getattr+0x2c
...
- Add new 'perf kwork' tool to trace time properties of kernel work
(such as softirq, and workqueue), uses eBPF skeletons to collect info
in kernel space, aggregating data that then gets processed by the
userspace tool, e.g.:
# perf kwork report
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
----------------------------------------------------------------------------------------------------
nvme0q5:130 | 004 | 1.101 ms | 49 | 0.051 ms | 26035.056403 s | 26035.056455 s |
amdgpu:162 | 002 | 0.176 ms | 9 | 0.046 ms | 26035.268020 s | 26035.268066 s |
nvme0q24:149 | 023 | 0.161 ms | 55 | 0.009 ms | 26035.655280 s | 26035.655288 s |
nvme0q20:145 | 019 | 0.090 ms | 33 | 0.014 ms | 26035.939018 s | 26035.939032 s |
nvme0q31:156 | 030 | 0.075 ms | 21 | 0.010 ms | 26035.052237 s | 26035.052247 s |
nvme0q8:133 | 007 | 0.062 ms | 12 | 0.021 ms | 26035.416840 s | 26035.416861 s |
nvme0q6:131 | 005 | 0.054 ms | 22 | 0.010 ms | 26035.199919 s | 26035.199929 s |
nvme0q19:144 | 018 | 0.052 ms | 14 | 0.010 ms | 26035.110615 s | 26035.110625 s |
nvme0q7:132 | 006 | 0.049 ms | 13 | 0.007 ms | 26035.125180 s | 26035.125187 s |
nvme0q18:143 | 017 | 0.033 ms | 14 | 0.007 ms | 26035.169698 s | 26035.169705 s |
nvme0q17:142 | 016 | 0.013 ms | 1 | 0.013 ms | 26035.565147 s | 26035.565160 s |
enp5s0-rx-0:164 | 006 | 0.004 ms | 4 | 0.002 ms | 26035.928882 s | 26035.928884 s |
enp5s0-tx-0:166 | 008 | 0.003 ms | 3 | 0.002 ms | 26035.870923 s | 26035.870925 s |
--------------------------------------------------------------------------------------------------------
See commit log messages for more examples with extra options to limit
the events time window, etc.
- Add support for new AMD IBS (Instruction Based Sampling) features:
With the DataSrc extensions, the source of data can be decoded among:
- Local L3 or other L1/L2 in CCX.
- A peer cache in a near CCX.
- Data returned from DRAM.
- A peer cache in a far CCX.
- DRAM address map with "long latency" bit set.
- Data returned from MMIO/Config/PCI/APIC.
- Extension Memory (S-Link, GenZ, etc - identified by the CS target
and/or address map at DF's choice).
- Peer Agent Memory.
- Support hardware tracing with Intel PT on guest machines, combining
the traces with the ones in the host machine.
- Add a "-m" option to 'perf buildid-list' to show kernel and modules
build-ids, to display all of the information needed to do external
symbolization of kernel stack traces, such as those collected by
bpf_get_stackid().
- Add arch TSC frequency information to perf.data file headers.
- Handle changes in the binutils disassembler function signatures in
perf, bpftool and bpf_jit_disasm (Acked by the bpftool maintainer).
- Fix building the perf perl binding with the newest gcc in distros
such as fedora rawhide, where some new warnings were breaking the
build as perf uses -Werror.
- Add 'perf test' entry for branch stack sampling.
- Add ARM SPE system wide 'perf test' entry.
- Add user space counter reading tests to 'perf test'.
- Build with python3 by default, if available.
- Add python converter script for the vendor JSON event files.
- Update vendor JSON files for most Intel cores.
- Add vendor JSON File for Intel meteorlake.
- Add Arm Cortex-A78C and X1C JSON vendor event files.
- Add workaround to symbol address reading from ELF files without phdr,
falling back to the previoous equation.
- Convert legacy map definition to BTF-defined in the perf BPF script
test.
- Rework prologue generation code to stop using libbpf deprecated APIs.
- Add default hybrid events for 'perf stat' on x86.
- Add topdown metrics in the default 'perf stat' on the hybrid machines
(big/little cores).
- Prefer sampled CPU when exporting JSON in 'perf data convert'
- Fix ('perf stat CSV output linter') and ("Check branch stack
sampling") 'perf test' entries on s390.
* tag 'perf-tools-for-v6.0-2022-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (169 commits)
perf stat: Refactor __run_perf_stat() common code
perf lock: Print the number of lost entries for BPF
perf lock: Add --map-nr-entries option
perf lock: Introduce struct lock_contention
perf scripting python: Do not build fail on deprecation warnings
genelf: Use HAVE_LIBCRYPTO_SUPPORT, not the never defined HAVE_LIBCRYPTO
perf build: Suppress openssl v3 deprecation warnings in libcrypto feature test
perf parse-events: Break out tracepoint and printing
perf parse-events: Don't #define YY_EXTRA_TYPE
tools bpftool: Don't display disassembler-four-args feature test
tools bpftool: Fix compilation error with new binutils
tools bpf_jit_disasm: Don't display disassembler-four-args feature test
tools bpf_jit_disasm: Fix compilation error with new binutils
tools perf: Fix compilation error with new binutils
tools include: add dis-asm-compat.h to handle version differences
tools build: Don't display disassembler-four-args feature test
tools build: Add feature test for init_disassemble_info API changes
perf test: Add ARM SPE system wide test
perf tools: Rework prologue generation code
perf bpf: Convert legacy map definition to BTF-defined
...
|
|
Add an extensible variant of bpf_obj_get() capable of setting the
`file_flags` parameter.
This parameter is needed to enable unprivileged access to BPF maps.
Without a method like this, users must manually make the syscall.
Signed-off-by: Joe Burton <jevburton@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220729202727.3311806-1-jevburton.kernel@gmail.com
|
|
Commit 708ac5bea0ce ("libbpf: add ksyscall/kretsyscall sections support
for syscall kprobes") added the arch_specific_syscall_pfx() function,
which returns a string representing the architecture in use. As it turns
out this function is currently not aware of Power PC, where NULL is
returned. That's being flagged by the libbpf CI system, which builds for
ppc64le and the compiler sees a NULL pointer being passed in to a %s
format string.
With this change we add representations for two more architectures, for
Power PC and Power PC 64, and also adjust the string format logic to
handle NULL pointers gracefully, in an attempt to prevent similar issues
with other architectures in the future.
Fixes: 708ac5bea0ce ("libbpf: add ksyscall/kretsyscall sections support for syscall kprobes")
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220728222345.3125975-1-deso@posteo.net
|
|
Explicitly list known quirks. Mention that socket-related syscalls can be
invoked via socketcall().
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20220726134008.256968-2-iii@linux.ibm.com
|
|
The return from strcmp() is inverted so it wrongly returns true instead
of false and vice versa.
Fixes: a1c9d61b19cb ("libbpf: Improve library identification for uprobe binary path resolution")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/YtZ+/dAA195d99ak@kili
|
|
The code here is supposed to take a signed int and store it in a signed
long long. Unfortunately, the way that the type promotion works with
this conditional statement is that it takes a signed int, type promotes
it to a __u32, and then stores that as a signed long long. The result is
never negative.
This is from static analysis, but I made a little test program just to
test it before I sent the patch:
#include <stdio.h>
int main(void)
{
unsigned long long src = -1ULL;
signed long long dst1, dst2;
int is_signed = 1;
dst1 = is_signed ? *(int *)&src : *(unsigned int *)0;
dst2 = is_signed ? (signed long long)*(int *)&src : *(unsigned int *)0;
printf("%lld\n", dst1);
printf("%lld\n", dst2);
return 0;
}
Fixes: d90ec262b35b ("libbpf: Add enum64 support for btf_dump")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/YtZ+LpgPADm7BeEd@kili
|
|
Add machine_pid and vcpu to struct perf_record_auxtrace_error. The existing
fmt member is used to identify the new format.
The new members make it possible to easily differentiate errors from guest
machines.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-18-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When injecting events from a guest perf.data file, the events will have
separate sample ID numbers. These ID numbers can then be used to determine
which machine an event belongs to. To facilitate that, add machine_pid and
vcpu to id_index records. For backward compatibility, these are added at
the end of the record, and the length of the record is used to determine
if they are present or not.
Note, this is needed because the events from a guest perf.data file contain
the pid/tid of the process running at that time inside the VM not the
pid/tid of the (QEMU) hypervisor thread. So a way is needed to relate
guest events back to the guest machine and VCPU, and using sample ID
numbers for that is relatively simple and convenient.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The snprintf() function returns the number of bytes it *would* have
copied if there were enough space. So it can return > the
sizeof(gen->attach_target).
Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/r/YtZ+oAySqIhFl6/J@kili
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Make libbpf adjust RINGBUF map size (rounding it up to closest power-of-2
of page_size) more eagerly: during open phase when initializing the map
and on explicit calls to bpf_map__set_max_entries().
Such approach allows user to check actual size of BPF ringbuf even
before it's created in the kernel, but also it prevents various edge
case scenarios where BPF ringbuf size can get out of sync with what it
would be in kernel. One of them (reported in [0]) is during an attempt
to pin/reuse BPF ringbuf.
Move adjust_ringbuf_sz() helper closer to its first actual use. The
implementation of the helper is unchanged.
Also make detection of whether bpf_object is already loaded more robust
by checking obj->loaded explicitly, given that map->fd can be < 0 even
if bpf_object is already loaded due to ability to disable map creation
with bpf_map__set_autocreate(map, false).
[0] Closes: https://github.com/libbpf/libbpf/pull/530
Fixes: 0087a681fa8c ("libbpf: Automatically fix up BPF_MAP_TYPE_RINGBUF size, if necessary")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220715230952.2219271-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Teach libbpf to fallback to tracefs mount point (/sys/kernel/tracing) if
debugfs (/sys/kernel/debug/tracing) isn't mounted.
Acked-by: Yonghong Song <yhs@fb.com>
Suggested-by: Connor O'Brien <connoro@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220715185736.898848-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add SEC("ksyscall")/SEC("ksyscall/<syscall_name>") and corresponding
kretsyscall variants (for return kprobes) to allow users to kprobe
syscall functions in kernel. These special sections allow to ignore
complexities and differences between kernel versions and host
architectures when it comes to syscall wrapper and corresponding
__<arch>_sys_<syscall> vs __se_sys_<syscall> differences, depending on
whether host kernel has CONFIG_ARCH_HAS_SYSCALL_WRAPPER (though libbpf
itself doesn't rely on /proc/config.gz for detecting this, see
BPF_KSYSCALL patch for how it's done internally).
Combined with the use of BPF_KSYSCALL() macro, this allows to just
specify intended syscall name and expected input arguments and leave
dealing with all the variations to libbpf.
In addition to SEC("ksyscall+") and SEC("kretsyscall+") add
bpf_program__attach_ksyscall() API which allows to specify syscall name
at runtime and provide associated BPF cookie value.
At the moment SEC("ksyscall") and bpf_program__attach_ksyscall() do not
handle all the calling convention quirks for mmap(), clone() and compat
syscalls. It also only attaches to "native" syscall interfaces. If host
system supports compat syscalls or defines 32-bit syscalls in 64-bit
kernel, such syscall interfaces won't be attached to by libbpf.
These limitations may or may not change in the future. Therefore it is
recommended to use SEC("kprobe") for these syscalls or if working with
compat and 32-bit interfaces is required.
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220714070755.3235561-5-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Improve BPF_KPROBE_SYSCALL (and rename it to shorter BPF_KSYSCALL to
match libbpf's SEC("ksyscall") section name, added in next patch) to use
__kconfig variable to determine how to properly fetch syscall arguments.
Instead of relying on hard-coded knowledge of whether kernel's
architecture uses syscall wrapper or not (which only reflects the latest
kernel versions, but is not necessarily true for older kernels and won't
necessarily hold for later kernel versions on some particular host
architecture), determine this at runtime by attempting to create
perf_event (with fallback to kprobe event creation through tracefs on
legacy kernels, just like kprobe attachment code is doing) for kernel
function that would correspond to bpf() syscall on a system that has
CONFIG_ARCH_HAS_SYSCALL_WRAPPER set (e.g., for x86-64 it would try
'__x64_sys_bpf').
If host kernel uses syscall wrapper, syscall kernel function's first
argument is a pointer to struct pt_regs that then contains syscall
arguments. In such case we need to use bpf_probe_read_kernel() to fetch
actual arguments (which we do through BPF_CORE_READ() macro) from inner
pt_regs.
But if the kernel doesn't use syscall wrapper approach, input
arguments can be read from struct pt_regs directly with no probe reading.
All this feature detection is done without requiring /proc/config.gz
existence and parsing, and BPF-side helper code uses newly added
LINUX_HAS_SYSCALL_WRAPPER virtual __kconfig extern to keep in sync with
user-side feature detection of libbpf.
BPF_KSYSCALL() macro can be used both with SEC("kprobe") programs that
define syscall function explicitly (e.g., SEC("kprobe/__x64_sys_bpf"))
and SEC("ksyscall") program added in the next patch (which are the same
kprobe program with added benefit of libbpf determining correct kernel
function name automatically).
Kretprobe and kretsyscall (added in next patch) programs don't need
BPF_KSYSCALL as they don't provide access to input arguments. Normal
BPF_KRETPROBE is completely sufficient and is recommended.
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220714070755.3235561-4-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Libbpf supports single virtual __kconfig extern currently: LINUX_KERNEL_VERSION.
LINUX_KERNEL_VERSION isn't coming from /proc/kconfig.gz and is intead
customly filled out by libbpf.
This patch generalizes this approach to support more such virtual
__kconfig externs. One such extern added in this patch is
LINUX_HAS_BPF_COOKIE which is used for BPF-side USDT supporting code in
usdt.bpf.h instead of using CO-RE-based enum detection approach for
detecting bpf_get_attach_cookie() BPF helper. This allows to remove
otherwise not needed CO-RE dependency and keeps user-space and BPF-side
parts of libbpf's USDT support strictly in sync in terms of their
feature detection.
We'll use similar approach for syscall wrapper detection for
BPF_KSYSCALL() BPF-side macro in follow up patch.
Generally, currently libbpf reserves CONFIG_ prefix for Kconfig values
and LINUX_ for virtual libbpf-backed externs. In the future we might
extend the set of prefixes that are supported. This can be done without
any breaking changes, as currently any __kconfig extern with
unrecognized name is rejected.
For LINUX_xxx externs we support the normal "weak rule": if libbpf
doesn't recognize given LINUX_xxx extern but such extern is marked as
__weak, it is not rejected and defaults to zero. This follows
CONFIG_xxx handling logic and will allow BPF applications to
opportunistically use newer libbpf virtual externs without breaking on
older libbpf versions unnecessarily.
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220714070755.3235561-2-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Add support for writing a custom event reader, by exposing the ring
buffer.
With the new API perf_buffer__buffer() you will get access to the
raw mmaped()'ed per-cpu underlying memory of the ring buffer.
This region contains both the perf buffer data and header
(struct perf_event_mmap_page), which manages the ring buffer
state (head/tail positions, when accessing the head/tail position
it's important to take into consideration SMP).
With this type of low level access one can implement different types of
consumers here are few simple examples where this API helps with:
1. perf_event_read_simple is allocating using malloc, perhaps you want
to handle the wrap-around in some other way.
2. Since perf buf is per-cpu then the order of the events is not
guarnteed, for example:
Given 3 events where each event has a timestamp t0 < t1 < t2,
and the events are spread on more than 1 CPU, then we can end
up with the following state in the ring buf:
CPU[0] => [t0, t2]
CPU[1] => [t1]
When you consume the events from CPU[0], you could know there is
a t1 missing, (assuming there are no drops, and your event data
contains a sequential index).
So now one can simply do the following, for CPU[0], you can store
the address of t0 and t2 in an array (without moving the tail, so
there data is not perished) then move on the CPU[1] and set the
address of t1 in the same array.
So you end up with something like:
void **arr[] = [&t0, &t1, &t2], now you can consume it orderely
and move the tails as you process in order.
3. Assuming there are multiple CPUs and we want to start draining the
messages from them, then we can "pick" with which one to start with
according to the remaining free space in the ring buffer.
Signed-off-by: Jon Doron <jond@wiz.io>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220715181122.149224-1-arilou@gmail.com
|
|
bitmap_weight() doesn't return negative values, so change it's type
to unsigned long. It may help compiler to generate better code and
catch bugs.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
|
|
Some bitmap functions return boolean results in int variables. Fix it
by changing return types to bool.
Signed-off-by: Yury Norov <yury.norov@gmail.com>
|
|
BPF map name is limited to BPF_OBJ_NAME_LEN.
A map name is defined as being longer than BPF_OBJ_NAME_LEN,
it will be truncated to BPF_OBJ_NAME_LEN when a userspace program
calls libbpf to create the map. A pinned map also generates a path
in the /sys. If the previous program wanted to reuse the map,
it can not get bpf_map by name, because the name of the map is only
partially the same as the name which get from pinned path.
The syscall information below show that map name "process_pinned_map"
is truncated to "process_pinned_".
bpf(BPF_OBJ_GET, {pathname="/sys/fs/bpf/process_pinned_map",
bpf_fd=0, file_flags=0}, 144) = -1 ENOENT (No such file or directory)
bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4,
value_size=4,max_entries=1024, map_flags=0, inner_map_fd=0,
map_name="process_pinned_",map_ifindex=0, btf_fd=3, btf_key_type_id=6,
btf_value_type_id=10,btf_vmlinux_value_type_id=0}, 72) = 4
This patch check that if the name of pinned map are the same as the
actual name for the first (BPF_OBJ_NAME_LEN - 1),
bpf map still uses the name which is included in bpf object.
Fixes: 26736eb9a483 ("tools: libbpf: allow map reuse")
Signed-off-by: Anquan Wu <leiqi96@hotmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/OSZP286MB1725CEA1C95C5CB8E7CCC53FB8869@OSZP286MB1725.JPNP286.PROD.OUTLOOK.COM
|
|
binary_path is a required non-null parameter for bpf_program__attach_usdt
and bpf_program__attach_uprobe_opts. Check it against NULL to prevent
coredump on strchr.
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220712025745.2703995-1-hengqi.chen@gmail.com
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-07-09
We've added 94 non-merge commits during the last 19 day(s) which contain
a total of 125 files changed, 5141 insertions(+), 6701 deletions(-).
The main changes are:
1) Add new way for performing BTF type queries to BPF, from Daniel Müller.
2) Add inlining of calls to bpf_loop() helper when its function callback is
statically known, from Eduard Zingerman.
3) Implement BPF TCP CC framework usability improvements, from Jörn-Thorben Hinz.
4) Add LSM flavor for attaching per-cgroup BPF programs to existing LSM
hooks, from Stanislav Fomichev.
5) Remove all deprecated libbpf APIs in prep for 1.0 release, from Andrii Nakryiko.
6) Add benchmarks around local_storage to BPF selftests, from Dave Marchevsky.
7) AF_XDP sample removal (given move to libxdp) and various improvements around AF_XDP
selftests, from Magnus Karlsson & Maciej Fijalkowski.
8) Add bpftool improvements for memcg probing and bash completion, from Quentin Monnet.
9) Add arm64 JIT support for BPF-2-BPF coupled with tail calls, from Jakub Sitnicki.
10) Sockmap optimizations around throughput of UDP transmissions which have been
improved by 61%, from Cong Wang.
11) Rework perf's BPF prologue code to remove deprecated functions, from Jiri Olsa.
12) Fix sockmap teardown path to avoid sleepable sk_psock_stop, from John Fastabend.
13) Fix libbpf's cleanup around legacy kprobe/uprobe on error case, from Chuang Wang.
14) Fix libbpf's bpf_helpers.h to work with gcc for the case of its sec/pragma
macro, from James Hilliard.
15) Fix libbpf's pt_regs macros for riscv to use a0 for RC register, from Yixun Lan.
16) Fix bpftool to show the name of type BPF_OBJ_LINK, from Yafang Shao.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (94 commits)
selftests/bpf: Fix xdp_synproxy build failure if CONFIG_NF_CONNTRACK=m/n
bpf: Correctly propagate errors up from bpf_core_composites_match
libbpf: Disable SEC pragma macro on GCC
bpf: Check attach_func_proto more carefully in check_return_code
selftests/bpf: Add test involving restrict type qualifier
bpftool: Add support for KIND_RESTRICT to gen min_core_btf command
MAINTAINERS: Add entry for AF_XDP selftests files
selftests, xsk: Rename AF_XDP testing app
bpf, docs: Remove deprecated xsk libbpf APIs description
selftests/bpf: Add benchmark for local_storage RCU Tasks Trace usage
libbpf, riscv: Use a0 for RC register
libbpf: Remove unnecessary usdt_rel_ip assignments
selftests/bpf: Fix few more compiler warnings
selftests/bpf: Fix bogus uninitialized variable warning
bpftool: Remove zlib feature test from Makefile
libbpf: Cleanup the legacy uprobe_event on failed add/attach_event()
libbpf: Fix wrong variable used in perf_event_uprobe_open_legacy()
libbpf: Cleanup the legacy kprobe_event on failed add/attach_event()
selftests/bpf: Add type match test against kernel's task_struct
selftests/bpf: Add nested type to type based tests
...
====================
Link: https://lore.kernel.org/r/20220708233145.32365-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This change addresses a comment made earlier [0] about a missing return
of an error when __bpf_core_types_match is invoked from
bpf_core_composites_match, which could have let to us erroneously
ignoring errors.
Regarding the typedef name check pointed out in the same context, it is
not actually an issue, because callers of the function perform a name
check for the root type anyway. To make that more obvious, let's add
comments to the function (similar to what we have for
bpf_core_types_are_compat, which is called in pretty much the same
context).
[0]: https://lore.kernel.org/bpf/165708121449.4919.13204634393477172905.git-patchwork-notify@kernel.org/T/#m55141e8f8cfd2e8d97e65328fa04852870d01af6
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220707211931.3415440-1-deso@posteo.net
|