summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2024-07-29selftests/bpf: Add tests for ldsx of pkt data/data_end/data_meta accessesYonghong Song
The following tests are added to verifier_ldsx.c: - sign extension of data/data_end/data_meta for tcx programs. The actual checking is in bpf_skb_is_valid_access() which is called by sk_filter, cg_skb, lwt, tc(tcx) and sk_skb. - sign extension of data/data_end/data_meta for xdp programs. - sign extension of data/data_end for flow_dissector programs. All newly-added tests have verification failure with message "invalid bpf_context access". Without previous patch, all these tests succeeded verification. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240723153444.2430365-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29tools/runqslower: Fix LDFLAGS and add LDLIBS supportTony Ambardar
Actually use previously defined LDFLAGS during build and add support for LDLIBS to link extra standalone libraries e.g. 'argp' which is not provided by musl libc. Fixes: 585bf4640ebe ("tools: runqslower: Add EXTRA_CFLAGS and EXTRA_LDFLAGS support") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20240723003045.2273499-1-tony.ambardar@gmail.com
2024-07-29selftests/bpf: Fix wrong binary in Makefile log outputTony Ambardar
Make log output incorrectly shows 'test_maps' as the binary name for every 'CLNG-BPF' build step, apparently picking up the last value defined for the $(TRUNNER_BINARY) variable. Update the 'CLANG_BPF_BUILD_RULE' variants to fix this confusing output. Current output: CLNG-BPF [test_maps] access_map_in_map.bpf.o GEN-SKEL [test_progs] access_map_in_map.skel.h ... CLNG-BPF [test_maps] access_map_in_map.bpf.o GEN-SKEL [test_progs-no_alu32] access_map_in_map.skel.h ... CLNG-BPF [test_maps] access_map_in_map.bpf.o GEN-SKEL [test_progs-cpuv4] access_map_in_map.skel.h After fix: CLNG-BPF [test_progs] access_map_in_map.bpf.o GEN-SKEL [test_progs] access_map_in_map.skel.h ... CLNG-BPF [test_progs-no_alu32] access_map_in_map.bpf.o GEN-SKEL [test_progs-no_alu32] access_map_in_map.skel.h ... CLNG-BPF [test_progs-cpuv4] access_map_in_map.bpf.o GEN-SKEL [test_progs-cpuv4] access_map_in_map.skel.h Fixes: a5d0c26a2784 ("selftests/bpf: Add a cpuv4 test runner for cpu=v4 testing") Fixes: 89ad7420b25c ("selftests/bpf: Drop the need for LLVM's llc") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/bpf/20240720052535.2185967-1-tony.ambardar@gmail.com
2024-07-29selftests/bpf: Add uprobe multi consumers testJiri Olsa
Adding test that attaches/detaches multiple consumers on single uprobe and verifies all were hit as expected. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240722202758.3889061-3-jolsa@kernel.org
2024-07-29selftests/bpf: Add uprobe fail tests for uprobe multiJiri Olsa
Adding tests for checking on recovery after failing to attach uprobe. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240722202758.3889061-2-jolsa@kernel.org
2024-07-29selftests/bpf: Fix compilation failure when CONFIG_NET_FOU!=yArtem Savkov
Without CONFIG_NET_FOU bpf selftests are unable to build because of missing definitions. Add ___local versions of struct bpf_fou_encap and enum bpf_fou_encap_type to fix the issue. Signed-off-by: Artem Savkov <asavkov@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240723071031.3389423-1-asavkov@redhat.com
2024-07-29selftests/bpf: Fix error linking uprobe_multi on mipsTony Ambardar
Linking uprobe_multi.c on mips64el fails due to relocation overflows, when the GOT entries required exceeds the default maximum. Add a specific CFLAGS (-mxgot) for uprobe_multi.c on MIPS that allows using a larger GOT and avoids errors such as: /tmp/ccBTNQzv.o: in function `bench': uprobe_multi.c:49:(.text+0x1d7720): relocation truncated to fit: R_MIPS_GOT_DISP against `uprobe_multi_func_08188' uprobe_multi.c:49:(.text+0x1d7730): relocation truncated to fit: R_MIPS_GOT_DISP against `uprobe_multi_func_08189' ... collect2: error: ld returned 1 exit status Fixes: 519dfeaf5119 ("selftests/bpf: Add uprobe_multi test program") Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/14eb7b70f8ccef9834874d75eb373cb9292129da.1721692479.git.tony.ambardar@gmail.com
2024-07-29selftests/bpf: Add missing system defines for mipsTony Ambardar
Update get_sys_includes in Makefile with missing MIPS-related definitions to fix many, many compilation errors building selftests/bpf. The following added defines drive conditional logic in system headers for word-size and endianness selection: MIPSEL, MIPSEB _MIPS_SZPTR _MIPS_SZLONG _MIPS_SIM, _ABIO32, _ABIN32, _ABI64 Signed-off-by: Tony Ambardar <tony.ambardar@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/f3cfceaf5299cdd2ac0e0a36072d6ca7be23e603.1721692479.git.tony.ambardar@gmail.com
2024-07-29selftests/bpf: Don't include .d files on make cleanIhor Solodrai
Ignore generated %.test.o dependencies when make goal is clean or docs-clean. Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/all/oNTIdax7aWGJdEgabzTqHzF4r-WTERrV1e1cNaPQMp-UhYUQpozXqkbuAlLBulczr6I99-jM5x3dxv56JJowaYBkm765R9Aa9kyrVuCl_kA=@pm.me Link: https://lore.kernel.org/bpf/K69Y8OKMLXBWR0dtOfsC4J46-HxeQfvqoFx1CysCm7u19HRx4MB6yAKOFkM6X-KAx2EFuCcCh_9vYWpsgQXnAer8oQ8PMeDEuiRMYECuGH4=@pm.me
2024-07-29selftests/bpf: Add a test for mmap-able map in mapSong Liu
Regular BPF hash map is not mmap-able from user space. However, map-in-map with outer map of type BPF_MAP_TYPE_HASH_OF_MAPS and mmap-able array as inner map can perform similar operations as a mmap-able hash map. This can be used by applications that benefit from fast accesses to some local data. Add a selftest to show this use case. Signed-off-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20240723051455.1589192-1-song@kernel.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Drop __start_server in network_helpersGeliang Tang
The helper start_server_addr() is a wrapper of __start_server(), the only difference between them is __start_server() accepts a sockaddr type address parameter, but start_server_addr() accepts a sockaddr_storage one. This patch drops __start_server(), and updates the callers to invoke start_server_addr() instead. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/31399df7cb957b7c233e79963b0aa0dc4278d273.1721475357.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Drop inetaddr_len in sk_lookupGeliang Tang
No need to use a dedicated helper inetaddr_len() to get the length of the IPv4 or IPv6 address, it can be got by make_sockaddr(), this patch drops it. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/32e2a4122921051da38a6e4fbb2ebee5f0af5a4e.1721475357.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Drop make_socket in sk_lookupGeliang Tang
This patch uses the public network helers client_socket() + make_sockaddr() in sk_lookup.c to create the client socket, set the timeout sockopts, and make the connecting address. The local defined function make_socket() can be dropped then. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/588771977ac48c27f73526d8421a84b91d7cf218.1721475357.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Drop make_client in sk_lookupGeliang Tang
This patch uses the new helper connect_to_addr_str() in sk_lookup.c to create the client socket and connect to the server, instead of using local defined function make_client(). This local function can be dropped then. Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/058199d7ab46802249dae066ca22c98f6be508ee.1721475357.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Workaround strict bpf_lsm return value check.Alexei Starovoitov
test_progs-no_alu32 -t libbpf_get_fd_by_id_opts is being rejected by the verifier with the following error due to compiler optimization: 6: (67) r0 <<= 62 ; R0_w=scalar(smax=0x4000000000000000,umax=0xc000000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xc000000000000000)) 7: (c7) r0 s>>= 63 ; R0_w=scalar(smin=smin32=-1,smax=smax32=0) ; @ test_libbpf_get_fd_by_id_opts.c:0 8: (57) r0 &= -13 ; R0_w=scalar(smax=0x7ffffffffffffff3,umax=0xfffffffffffffff3,smax32=0x7ffffff3,umax32=0xfffffff3,var_off=(0x0; 0xfffffffffffffff3)) ; int BPF_PROG(check_access, struct bpf_map *map, fmode_t fmode) @ test_libbpf_get_fd_by_id_opts.c:27 9: (95) exit At program exit the register R0 has smax=9223372036854775795 should have been in [-4095, 0] Workaround by adding barrier(). Eventually the verifier will be able to recognize it. Fixes: 5d99e198be27 ("bpf, lsm: Add check for BPF LSM return value") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Add verifier tests for bpf lsmXu Kuohai
Add verifier tests to check bpf lsm return values and disabled hooks. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20240719110059.797546-10-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Add test for lsm tail callXu Kuohai
Add test for lsm tail call to ensure tail call can only be used between bpf lsm progs attached to the same hook. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20240719110059.797546-9-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Add return value checks for failed testsXu Kuohai
The return ranges of some bpf lsm test progs can not be deduced by the verifier accurately. To avoid erroneous rejections, add explicit return value checks for these progs. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20240719110059.797546-8-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Avoid load failure for token_lsm.cXu Kuohai
The compiler optimized the two bpf progs in token_lsm.c to make return value from the bool variable in the "return -1" path, causing an unexpected rejection: 0: R1=ctx() R10=fp0 ; int BPF_PROG(bpf_token_capable, struct bpf_token *token, int cap) @ bpf_lsm.c:17 0: (b7) r6 = 0 ; R6_w=0 ; if (my_pid == 0 || my_pid != (bpf_get_current_pid_tgid() >> 32)) @ bpf_lsm.c:19 1: (18) r1 = 0xffffc9000102a000 ; R1_w=map_value(map=bpf_lsm.bss,ks=4,vs=5) 3: (61) r7 = *(u32 *)(r1 +0) ; R1_w=map_value(map=bpf_lsm.bss,ks=4,vs=5) R7_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 4: (15) if r7 == 0x0 goto pc+11 ; R7_w=scalar(smin=umin=umin32=1,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 5: (67) r7 <<= 32 ; R7_w=scalar(smax=0x7fffffff00000000,umax=0xffffffff00000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xffffffff00000000)) 6: (c7) r7 s>>= 32 ; R7_w=scalar(smin=0xffffffff80000000,smax=0x7fffffff) 7: (85) call bpf_get_current_pid_tgid#14 ; R0=scalar() 8: (77) r0 >>= 32 ; R0_w=scalar(smin=0,smax=umax=0xffffffff,var_off=(0x0; 0xffffffff)) 9: (5d) if r0 != r7 goto pc+6 ; R0_w=scalar(smin=smin32=0,smax=umax=umax32=0x7fffffff,var_off=(0x0; 0x7fffffff)) R7=scalar(smin=smin32=0,smax=umax=umax32=0x7fffffff,var_off=(0x0; 0x7fffffff)) ; if (reject_capable) @ bpf_lsm.c:21 10: (18) r1 = 0xffffc9000102a004 ; R1_w=map_value(map=bpf_lsm.bss,ks=4,vs=5,off=4) 12: (71) r6 = *(u8 *)(r1 +0) ; R1_w=map_value(map=bpf_lsm.bss,ks=4,vs=5,off=4) R6_w=scalar(smin=smin32=0,smax=umax=smax32=umax32=255,var_off=(0x0; 0xff)) ; @ bpf_lsm.c:0 13: (87) r6 = -r6 ; R6_w=scalar() 14: (67) r6 <<= 56 ; R6_w=scalar(smax=0x7f00000000000000,umax=0xff00000000000000,smin32=0,smax32=umax32=0,var_off=(0x0; 0xff00000000000000)) 15: (c7) r6 s>>= 56 ; R6_w=scalar(smin=smin32=-128,smax=smax32=127) ; int BPF_PROG(bpf_token_capable, struct bpf_token *token, int cap) @ bpf_lsm.c:17 16: (bf) r0 = r6 ; R0_w=scalar(id=1,smin=smin32=-128,smax=smax32=127) R6_w=scalar(id=1,smin=smin32=-128,smax=smax32=127) 17: (95) exit At program exit the register R0 has smin=-128 smax=127 should have been in [-4095, 0] To avoid this failure, change the variable type from bool to int. Signed-off-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20240719110059.797546-7-xukuohai@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Ensure the unsupported struct_ops prog cannot be loadedMartin KaFai Lau
There is an existing "bpf_tcp_ca/unsupp_cong_op" test to ensure the unsupported tcp-cc "get_info" struct_ops prog cannot be loaded. This patch adds a new test in the bpf_testmod such that the unsupported ops test does not depend on other kernel subsystem where its supporting ops may be changed in the future. Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20240722183049.2254692-4-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Fix the missing tramp_1 to tramp_40 ops in cfi_stubsMartin KaFai Lau
The tramp_1 to tramp_40 ops is not set in the cfi_stubs in the bpf_testmod_ops. It fails the struct_ops_multi_pages test after retiring the unsupported_ops in the earlier patch. This patch initializes them in a loop during the bpf_testmod_init(). Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20240722183049.2254692-3-martin.lau@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29bpftool: Add document for net attach/detach on tcx subcommandTao Chen
This commit adds sample output for net attach/detach on tcx subcommand. Signed-off-by: Tao Chen <chen.dylane@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/bpf/20240721144252.96264-1-chen.dylane@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29bpftool: Add bash-completion for tcx subcommandTao Chen
This commit adds bash-completion for attaching tcx program on interface. Signed-off-by: Tao Chen <chen.dylane@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/bpf/20240721144238.96246-1-chen.dylane@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29bpftool: Add net attach/detach command to tcx progTao Chen
Now, attach/detach tcx prog supported in libbpf, so we can add new command 'bpftool attach/detach tcx' to attach tcx prog with bpftool for user. # bpftool prog load tc_prog.bpf.o /sys/fs/bpf/tc_prog # bpftool prog show ... 192: sched_cls name tc_prog tag 187aeb611ad00cfc gpl loaded_at 2024-07-11T15:58:16+0800 uid 0 xlated 152B jited 97B memlock 4096B map_ids 100,99,97 btf_id 260 # bpftool net attach tcx_ingress name tc_prog dev lo # bpftool net ... tc: lo(1) tcx/ingress tc_prog prog_id 29 # bpftool net detach tcx_ingress dev lo # bpftool net ... tc: # bpftool net attach tcx_ingress name tc_prog dev lo # bpftool net tc: lo(1) tcx/ingress tc_prog prog_id 29 Test environment: ubuntu_22_04, 6.7.0-060700-generic Signed-off-by: Tao Chen <chen.dylane@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/bpf/20240721144221.96228-1-chen.dylane@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29bpftool: Refactor xdp attach/detach type judgmentTao Chen
This commit no logical changed, just increases code readability and facilitates TCX prog expansion, which will be implemented in the next patch. Signed-off-by: Tao Chen <chen.dylane@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/bpf/20240721143353.95980-2-chen.dylane@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Add testcases for tailcall hierarchy fixingLeon Hwang
Add some test cases to confirm the tailcall hierarchy issue has been fixed. On x64, the selftests result is: cd tools/testing/selftests/bpf && ./test_progs -t tailcalls 327/18 tailcalls/tailcall_bpf2bpf_hierarchy_1:OK 327/19 tailcalls/tailcall_bpf2bpf_hierarchy_fentry:OK 327/20 tailcalls/tailcall_bpf2bpf_hierarchy_fexit:OK 327/21 tailcalls/tailcall_bpf2bpf_hierarchy_fentry_fexit:OK 327/22 tailcalls/tailcall_bpf2bpf_hierarchy_fentry_entry:OK 327/23 tailcalls/tailcall_bpf2bpf_hierarchy_2:OK 327/24 tailcalls/tailcall_bpf2bpf_hierarchy_3:OK 327 tailcalls:OK Summary: 1/24 PASSED, 0 SKIPPED, 0 FAILED On arm64, the selftests result is: cd tools/testing/selftests/bpf && ./test_progs -t tailcalls 327/18 tailcalls/tailcall_bpf2bpf_hierarchy_1:OK 327/19 tailcalls/tailcall_bpf2bpf_hierarchy_fentry:OK 327/20 tailcalls/tailcall_bpf2bpf_hierarchy_fexit:OK 327/21 tailcalls/tailcall_bpf2bpf_hierarchy_fentry_fexit:OK 327/22 tailcalls/tailcall_bpf2bpf_hierarchy_fentry_entry:OK 327/23 tailcalls/tailcall_bpf2bpf_hierarchy_2:OK 327/24 tailcalls/tailcall_bpf2bpf_hierarchy_3:OK 327 tailcalls:OK Summary: 1/24 PASSED, 0 SKIPPED, 0 FAILED Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Leon Hwang <hffilwlqm@gmail.com> Link: https://lore.kernel.org/r/20240714123902.32305-4-hffilwlqm@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Update comments find_equal_scalars->sync_linked_regsEduard Zingerman
find_equal_scalars() is renamed to sync_linked_regs(), this commit updates existing references in the selftests comments. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240718202357.1746514-5-eddyz87@gmail.com
2024-07-29selftests/bpf: Tests for per-insn sync_linked_regs() precision trackingEduard Zingerman
Add a few test cases to verify precision tracking for scalars gaining range because of sync_linked_regs(): - check what happens when more than 6 registers might gain range in sync_linked_regs(); - check if precision is propagated correctly when operand of conditional jump gained range in sync_linked_regs() and one of linked registers is marked precise; - check if precision is propagated correctly when operand of conditional jump gained range in sync_linked_regs() and a other-linked operand of the conditional jump is marked precise; - add a minimized reproducer for precision tracking bug reported in [0]; - Check that mark_chain_precision() for one of the conditional jump operands does not trigger equal scalars precision propagation. [0] https://lore.kernel.org/bpf/CAEf4BzZ0xidVCqB47XnkXcNhkPWF6_nTV7yt+_Lf0kcFEut2Mg@mail.gmail.com/ Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240718202357.1746514-4-eddyz87@gmail.com
2024-07-29bpf: Remove mark_precise_scalar_ids()Eduard Zingerman
Function mark_precise_scalar_ids() is superseded by bt_sync_linked_regs() and equal scalars tracking in jump history. mark_precise_scalar_ids() propagates precision over registers sharing same ID on parent/child state boundaries, while jump history records allow bt_sync_linked_regs() to propagate same information with instruction level granularity, which is strictly more precise. This commit removes mark_precise_scalar_ids() and updates test cases in progs/verifier_scalar_ids to reflect new verifier behavior. The tests are updated in the following manner: - mark_precise_scalar_ids() propagated precision regardless of presence of conditional jumps, while new jump history based logic only kicks in when conditional jumps are present. Hence test cases are augmented with conditional jumps to still trigger precision propagation. - As equal scalars tracking no longer relies on parent/child state boundaries some test cases are no longer interesting, such test cases are removed, namely: - precision_same_state and precision_cross_state are superseded by linked_regs_bpf_k; - precision_same_state_broken_link and equal_scalars_broken_link are superseded by linked_regs_broken_link. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240718202357.1746514-3-eddyz87@gmail.com
2024-07-29bpf: Track equal scalars history on per-instruction levelEduard Zingerman
Use bpf_verifier_state->jmp_history to track which registers were updated by find_equal_scalars() (renamed to collect_linked_regs()) when conditional jump was verified. Use recorded information in backtrack_insn() to propagate precision. E.g. for the following program: while verifying instructions 1: r1 = r0 | 2: if r1 < 8 goto ... | push r0,r1 as linked registers in jmp_history 3: if r0 > 16 goto ... | push r0,r1 as linked registers in jmp_history 4: r2 = r10 | 5: r2 += r0 v mark_chain_precision(r0) while doing mark_chain_precision(r0) 5: r2 += r0 | mark r0 precise 4: r2 = r10 | 3: if r0 > 16 goto ... | mark r0,r1 as precise 2: if r1 < 8 goto ... | mark r0,r1 as precise 1: r1 = r0 v Technically, do this as follows: - Use 10 bits to identify each register that gains range because of sync_linked_regs(): - 3 bits for frame number; - 6 bits for register or stack slot number; - 1 bit to indicate if register is spilled. - Use u64 as a vector of 6 such records + 4 bits for vector length. - Augment struct bpf_jmp_history_entry with a field 'linked_regs' representing such vector. - When doing check_cond_jmp_op() remember up to 6 registers that gain range because of sync_linked_regs() in such a vector. - Don't propagate range information and reset IDs for registers that don't fit in 6-value vector. - Push a pair {instruction index, linked registers vector} to bpf_verifier_state->jmp_history. - When doing backtrack_insn() check if any of recorded linked registers is currently marked precise, if so mark all linked registers as precise. This also requires fixes for two test_verifier tests: - precise: test 1 - precise: test 2 Both tests contain the following instruction sequence: 19: (bf) r2 = r9 ; R2=scalar(id=3) R9=scalar(id=3) 20: (a5) if r2 < 0x8 goto pc+1 ; R2=scalar(id=3,umin=8) 21: (95) exit 22: (07) r2 += 1 ; R2_w=scalar(id=3+1,...) 23: (bf) r1 = r10 ; R1_w=fp0 R10=fp0 24: (07) r1 += -8 ; R1_w=fp-8 25: (b7) r3 = 0 ; R3_w=0 26: (85) call bpf_probe_read_kernel#113 The call to bpf_probe_read_kernel() at (26) forces r2 to be precise. Previously, this forced all registers with same id to become precise immediately when mark_chain_precision() is called. After this change, the precision is propagated to registers sharing same id only when 'if' instruction is backtracked. Hence verification log for both tests is changed: regs=r2,r9 -> regs=r2 for instructions 25..20. Fixes: 904e6ddf4133 ("bpf: Use scalar ids in mark_chain_precision()") Reported-by: Hao Sun <sunhao.th@gmail.com> Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240718202357.1746514-2-eddyz87@gmail.com Closes: https://lore.kernel.org/bpf/CAEf4BzZ0xidVCqB47XnkXcNhkPWF6_nTV7yt+_Lf0kcFEut2Mg@mail.gmail.com/
2024-07-29selftests/bpf: Use auto-dependencies for test objectsIhor Solodrai
Make use of -M compiler options when building .test.o objects to generate .d files and avoid re-building all tests every time. Previously, if a single test bpf program under selftests/bpf/progs/*.c has changed, make would rebuild all the *.bpf.o, *.skel.h and *.test.o objects, which is a lot of unnecessary work. A typical dependency chain is: progs/x.c -> x.bpf.o -> x.skel.h -> x.test.o -> trunner_binary However for many tests it's not a 1:1 mapping by name, and so far %.test.o have been simply dependent on all %.skel.h files, and %.skel.h files on all %.bpf.o objects. Avoid full rebuilds by instructing the compiler (via -MMD) to produce *.d files with real dependencies, and appropriately including them. Exploit make feature that rebuilds included makefiles if they were changed by setting %.test.d as prerequisite for %.test.o files. A couple of examples of compilation time speedup (after the first clean build): $ touch progs/verifier_and.c && time make -j8 Before: real 0m16.651s After: real 0m2.245s $ touch progs/read_vsyscall.c && time make -j8 Before: real 0m15.743s After: real 0m1.575s A drawback of this change is that now there is an overhead due to make processing lots of .d files, which potentially may slow down unrelated targets. However a time to make all from scratch hasn't changed significantly: $ make clean && time make -j8 Before: real 1m31.148s After: real 1m30.309s Suggested-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/VJihUTnvtwEgv_mOnpfy7EgD9D2MPNoHO-MlANeLIzLJPGhDeyOuGKIYyKgk0O6KPjfM-MuhtvPwZcngN8WFqbTnTRyCSMc2aMZ1ODm1T_g=@pm.me
2024-07-29selftests/bpf: Add connect_to_addr_str helperGeliang Tang
Similar to connect_to_addr() helper for connecting to a server with the given sockaddr_storage type address, this patch adds a new helper named connect_to_addr_str() for connecting to a server with the given string type address "addr_str", together with its "family" and "port" as other parameters of connect_to_addr_str(). In connect_to_addr_str(), the parameters "family", "addr_str" and "port" are used to create a sockaddr_storage type address "addr" by invoking make_sockaddr(). Then pass this "addr" together with "addrlen", "type" and "opts" to connect_to_addr(). Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/647e82170831558dbde132a7a3d86df660dba2c4.1721282219.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Drop must_fail from network_helper_optsGeliang Tang
The struct member "must_fail" of network_helper_opts() is only used in cgroup_v1v2 tests, it makes sense to drop it from network_helper_opts. Return value (fd) of connect_to_fd_opts() and the expect errno (EPERM) can be checked in cgroup_v1v2.c directly, no need to check them in connect_fd_to_addr() in network_helpers.c. This also makes connect_fd_to_addr() function useless. It can be replaced by connect(). Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/3faf336019a9a48e2e8951f4cdebf19e3ac6e441.1721282219.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-29selftests/bpf: Drop type of connect_to_fd_optsGeliang Tang
The "type" parameter of connect_to_fd_opts() is redundant of "server_fd". Since the "type" can be obtained inside by invoking getsockopt(SO_TYPE), without passing it in as a parameter. This patch drops the "type" parameter of connect_to_fd_opts() and updates its callers. Suggested-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn> Link: https://lore.kernel.org/r/50d8ce7ab7ab0c0f4d211fc7cc4ebe3d3f63424c.1721282219.git.tanggeliang@kylinos.cn Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-07-25Merge tag 'net-6.11-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. A lot of networking people were at a conference last week, busy catching COVID, so relatively short PR. Current release - regressions: - tcp: process the 3rd ACK with sk_socket for TFO and MPTCP Current release - new code bugs: - l2tp: protect session IDR and tunnel session list with one lock, make sure the state is coherent to avoid a warning - eth: bnxt_en: update xdp_rxq_info in queue restart logic - eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field Previous releases - regressions: - xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len, the field reuses previously un-validated pad Previous releases - always broken: - tap/tun: drop short frames to prevent crashes later in the stack - eth: ice: add a per-VF limit on number of FDIR filters - af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash" * tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) tun: add missing verification for short frame tap: add missing verification for short frame mISDN: Fix a use after free in hfcmulti_tx() gve: Fix an edge case for TSO skb validity check bnxt_en: update xdp_rxq_info in queue restart logic tcp: process the 3rd ACK with sk_socket for TFO/MPTCP selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len bpf: Fix a segment issue when downgrading gso_size net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling MAINTAINERS: make Breno the netconsole maintainer MAINTAINERS: Update bonding entry net: nexthop: Initialize all fields in dumped nexthops net: stmmac: Correct byte order of perfect_match selftests: forwarding: skip if kernel not support setting bridge fdb learning limit tipc: Return non-zero value from tipc_udp_addr2str() on error netfilter: nft_set_pipapo_avx2: disable softinterrupts ice: Fix recipe read procedure ice: Add a per-VF limit on number of FDIR filters net: bonding: correctly annotate RCU in bond_should_notify_peers() ...
2024-07-25Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2024-07-25 We've added 14 non-merge commits during the last 8 day(s) which contain a total of 19 files changed, 177 insertions(+), 70 deletions(-). The main changes are: 1) Fix af_unix to disable MSG_OOB handling for sockets in BPF sockmap and BPF sockhash. Also add test coverage for this case, from Michal Luczaj. 2) Fix a segmentation issue when downgrading gso_size in the BPF helper bpf_skb_adjust_room(), from Fred Li. 3) Fix a compiler warning in resolve_btfids due to a missing type cast, from Liwei Song. 4) Fix stack allocation for arm64 to align the stack pointer at a 16 byte boundary in the fexit_sleep BPF selftest, from Puranjay Mohan. 5) Fix a xsk regression to require a flag when actuating tx_metadata_len, from Stanislav Fomichev. 6) Fix function prototype BTF dumping in libbpf for prototypes that have no input arguments, from Andrii Nakryiko. 7) Fix stacktrace symbol resolution in perf script for BPF programs containing subprograms, from Hou Tao. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len bpf: Fix a segment issue when downgrading gso_size tools/resolve_btfids: Fix comparison of distinct pointer types warning in resolve_btfids bpf, events: Use prog to emit ksymbol event for main program selftests/bpf: Test sockmap redirect for AF_UNIX MSG_OOB selftests/bpf: Parametrize AF_UNIX redir functions to accept send() flags selftests/bpf: Support SOCK_STREAM in unix_inet_redir_to_connected() af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash bpftool: Fix typo in usage help libbpf: Fix no-args func prototype BTF dumping syntax MAINTAINERS: Update powerpc BPF JIT maintainers MAINTAINERS: Update email address of Naveen selftests/bpf: fexit_sleep: Fix stack allocation for arm64 ==================== Link: https://patch.msgid.link/20240725114312.32197-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-25selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata testStanislav Fomichev
This flag is now required to use tx_metadata_len. Fixes: 40808a237d9c ("selftests/bpf: Add TX side to xdp_metadata") Reported-by: Julian Schindel <mail@arctic-alpaca.de> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Link: https://lore.kernel.org/bpf/20240713015253.121248-3-sdf@fomichev.me
2024-07-24Merge tag 'random-6.11-rc1-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: "This adds getrandom() support to the vDSO. First, it adds a new kind of mapping to mmap(2), MAP_DROPPABLE, which lets the kernel zero out pages anytime under memory pressure, which enables allocating memory that never gets swapped to disk but also doesn't count as being mlocked. Then, the vDSO implementation of getrandom() is introduced in a generic manner and hooked into random.c. Next, this is implemented on x86. (Also, though it's not ready for this pull, somebody has begun an arm64 implementation already) Finally, two vDSO selftests are added. There are also two housekeeping cleanup commits" * tag 'random-6.11-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: MAINTAINERS: add random.h headers to RNG subsection random: note that RNDGETPOOL was removed in 2.6.9-rc2 selftests/vDSO: add tests for vgetrandom x86: vdso: Wire up getrandom() vDSO implementation random: introduce generic vDSO getrandom() implementation mm: add MAP_DROPPABLE for designating always lazily freeable mappings
2024-07-24Merge tag 'vfs-6.11-rc1.fixes.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: "VFS: - The new 64bit mount ids start after the old mount id, i.e., at the first non-32 bit value. However, we started counting one id too late and thus lost 4294967296 as the first valid id. Fix that. - Update a few comments on some vfs_*() creation helpers. - Move copying of the xattr name out from the locks required to start a filesystem write. - Extend the filelock lock UAF fix to the compat code as well. - Now that we added the ability to look up an inode under RCU it's possible that lockless hash lookup can find and lock an inode after it gets I_FREEING set. It then waits until inode teardown in evict() is finished. The flag however is still set after evict() has woken up all waiters. If the inode lock is taken late enough on the waiting side after hash removal and wakeup happened the waiting thread will never be woken. Before RCU based lookup this was synchronized via the inode_hash_lock. But since unhashing requires the inode lock as well we can check whether the inode is unhashed while holding inode lock even without holding inode_hash_lock. pidfd: - The nsproxy structure contains nearly all of the namespaces associated with a task. When a namespace type isn't supported nsproxy might contain a NULL pointer or always point to the initial namespace type. The logic isn't consistent. So when deriving namespace fds we need to ensure that the namespace type is supported. First, so that we don't risk dereferncing NULL pointers. The correct bigger fix would be to change all namespaces to always set a valid namespace pointer in struct nsproxy independent of whether or not it is compiled in. But that requires quite a few changes. Second, so that we don't allow deriving namespace fds when the namespace type doesn't exist and thus when they couldn't also be derived via /proc/self/ns/. - Add missing selftests for the new pidfd ioctls to derive namespace fds. This simply extends the already existing testsuite. netfs: - Fix debug logging and fix kconfig variable name so it actually works. - Fix writeback that goes both to the server and cache. The streams are only activated once a subreq is added. When a server write happens the subreq doesn't need to have finished by the time the cache write is started. If the server write has already finished by the time the cache write is about to start the cache write will operate on a folio that might already have been reused. Fix this by preactivating the cache write. - Limit cachefiles subreq size for cache writes to MAX_RW_COUNT" * tag 'vfs-6.11-rc1.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: inode: clarify what's locked vfs: Fix potential circular locking through setxattr() and removexattr() filelock: Fix fcntl/close race recovery compat path fs: use all available ids cachefiles: Set the max subreq size for cache writes to MAX_RW_COUNT netfs: Fix writeback that needs to go to both server and cache pidfs: add selftests for new namespace ioctls pidfs: handle kernels without namespaces cleanly pidfs: when time ns disabled add check for ioctl vfs: correct the comments of vfs_*() helpers vfs: handle __wait_on_freeing_inode() and evict() race netfs: Rename CONFIG_FSCACHE_DEBUG to CONFIG_NETFS_DEBUG netfs: Revert "netfs: Switch debug logging to pr_debug()"
2024-07-24selftests: forwarding: skip if kernel not support setting bridge fdb ↵Hangbin Liu
learning limit If the testing kernel doesn't support setting fdb_max_learned or show fdb_n_learned, just skip it. Or we will get errors like ./bridge_fdb_learning_limit.sh: line 218: [: null: integer expression expected ./bridge_fdb_learning_limit.sh: line 225: [: null: integer expression expected Fixes: 6f84090333bb ("selftests: forwarding: bridge_fdb_learning_limit: Add a new selftest") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Johannes Nixdorf <jnixdorf-oss@avm.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-07-24pidfs: add selftests for new namespace ioctlsChristian Brauner
Add selftests to verify that deriving namespace file descriptors from pidfd file descriptors works correctly. Link: https://lore.kernel.org/r/20240722-work-pidfs-69dbea91edab@brauner Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-23Merge tag 'perf-tools-fixes-for-v6.11-2024-07-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Namhyung Kim: "Two fixes for building perf and other tools: - Fix breakage in tracing tools due to pkg-config for libtrace{event,fs} - Fix build of perf when libunwind is used" * tag 'perf-tools-fixes-for-v6.11-2024-07-23' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: perf dso: Fix build when libunwind is enabled tools/latency: Use pkg-config in lib_setup of Makefile.config tools/rtla: Use pkg-config in lib_setup of Makefile.config tools/verification: Use pkg-config in lib_setup of Makefile.config tools: Make pkg-config dependency checks usable by other tools perf build: Warn if libtracefs is not found
2024-07-23Merge tag 'kbuild-v6.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild updates from Masahiro Yamada: - Remove tristate choice support from Kconfig - Stop using the PROVIDE() directive in the linker script - Reduce the number of links for the combination of CONFIG_KALLSYMS and CONFIG_DEBUG_INFO_BTF - Enable the warning for symbol reference to .exit.* sections by default - Fix warnings in RPM package builds - Improve scripts/make_fit.py to generate a FIT image with separate base DTB and overlays - Improve choice value calculation in Kconfig - Fix conditional prompt behavior in choice in Kconfig - Remove support for the uncommon EMAIL environment variable in Debian package builds - Remove support for the uncommon "name <email>" form for the DEBEMAIL environment variable - Raise the minimum supported GNU Make version to 4.0 - Remove stale code for the absolute kallsyms - Move header files commonly used for host programs to scripts/include/ - Introduce the pacman-pkg target to generate a pacman package used in Arch Linux - Clean up Kconfig * tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (65 commits) kbuild: doc: gcc to CC change kallsyms: change sym_entry::percpu_absolute to bool type kallsyms: unify seq and start_pos fields of struct sym_entry kallsyms: add more original symbol type/name in comment lines kallsyms: use \t instead of a tab in printf() kallsyms: avoid repeated calculation of array size for markers kbuild: add script and target to generate pacman package modpost: use generic macros for hash table implementation kbuild: move some helper headers from scripts/kconfig/ to scripts/include/ Makefile: add comment to discourage tools/* addition for kernel builds kbuild: clean up scripts/remove-stale-files kconfig: recursive checks drop file/lineno kbuild: rpm-pkg: introduce a simple changelog section for kernel.spec kallsyms: get rid of code for absolute kallsyms kbuild: Create INSTALL_PATH directory if it does not exist kbuild: Abort make on install failures kconfig: remove 'e1' and 'e2' macros from expression deduplication kconfig: remove SYMBOL_CHOICEVAL flag kconfig: add const qualifiers to several function arguments kconfig: call expr_eliminate_yn() at least once in expr_eliminate_dups() ...
2024-07-23Merge tag 'livepatching-for-6.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching update from Petr Mladek: - show patch->replace flag in sysfs - add or improve few selftests * tag 'livepatching-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: livepatch: Replace snprintf() with sysfs_emit() selftests/livepatch: Add selftests for "replace" sysfs attribute livepatch: Add "replace" sysfs attribute selftests: livepatch: Test atomic replace against multiple modules selftests/livepatch: define max test-syscall processes
2024-07-23Merge branch 'for-6.11/sysfs-patch-replace' into for-linusPetr Mladek
2024-07-22tools/resolve_btfids: Fix comparison of distinct pointer types warning in ↵Liwei Song
resolve_btfids Add a type cast for set8->pairs to fix below compile warning: main.c: In function 'sets_patch': main.c:699:50: warning: comparison of distinct pointer types lacks a cast 699 | BUILD_BUG_ON(set8->pairs != &set8->pairs[0].id); | ^~ Fixes: 9707ac4fe2f5 ("tools/resolve_btfids: Refactor set sorting with types from btf_ids.h") Signed-off-by: Liwei Song <liwei.song.lsong@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/bpf/20240722083305.4009723-1-liwei.song.lsong@gmail.com
2024-07-21Merge tag 'mm-nonmm-stable-2024-07-21-15-07' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - In the series "treewide: Refactor heap related implementation", Kuan-Wei Chiu has significantly reworked the min_heap library code and has taught bcachefs to use the new more generic implementation. - Yury Norov's series "Cleanup cpumask.h inclusion in core headers" reworks the cpumask and nodemask headers to make things generally more rational. - Kuan-Wei Chiu has sent along some maintenance work against our sorting library code in the series "lib/sort: Optimizations and cleanups". - More library maintainance work from Christophe Jaillet in the series "Remove usage of the deprecated ida_simple_xx() API". - Ryusuke Konishi continues with the nilfs2 fixes and clanups in the series "nilfs2: eliminate the call to inode_attach_wb()". - Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix GDB command error". - Plus the usual shower of singleton patches all over the place. Please see the relevant changelogs for details. * tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (98 commits) ia64: scrub ia64 from poison.h watchdog/perf: properly initialize the turbo mode timestamp and rearm counter tsacct: replace strncpy() with strscpy() lib/bch.c: use swap() to improve code test_bpf: convert comma to semicolon init/modpost: conditionally check section mismatch to __meminit* init: remove unused __MEMINIT* macros nilfs2: Constify struct kobj_type nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro math: rational: add missing MODULE_DESCRIPTION() macro lib/zlib: add missing MODULE_DESCRIPTION() macro fs: ufs: add MODULE_DESCRIPTION() lib/rbtree.c: fix the example typo ocfs2: add bounds checking to ocfs2_check_dir_entry() fs: add kernel-doc comments to ocfs2_prepare_orphan_dir() coredump: simplify zap_process() selftests/fpu: add missing MODULE_DESCRIPTION() macro compiler.h: simplify data_race() macro build-id: require program headers to be right after ELF header resource: add missing MODULE_DESCRIPTION() ...
2024-07-21Merge tag 'mm-stable-2024-07-21-14-50' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - In the series "mm: Avoid possible overflows in dirty throttling" Jan Kara addresses a couple of issues in the writeback throttling code. These fixes are also targetted at -stable kernels. - Ryusuke Konishi's series "nilfs2: fix potential issues related to reserved inodes" does that. This should actually be in the mm-nonmm-stable tree, along with the many other nilfs2 patches. My bad. - More folio conversions from Kefeng Wang in the series "mm: convert to folio_alloc_mpol()" - Kemeng Shi has sent some cleanups to the writeback code in the series "Add helper functions to remove repeated code and improve readability of cgroup writeback" - Kairui Song has made the swap code a little smaller and a little faster in the series "mm/swap: clean up and optimize swap cache index". - In the series "mm/memory: cleanly support zeropage in vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David Hildenbrand has reworked the rather sketchy handling of the use of the zeropage in MAP_SHARED mappings. I don't see any runtime effects here - more a cleanup/understandability/maintainablity thing. - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling of higher addresses, for aarch64. The (poorly named) series is "Restructure va_high_addr_switch". - The core TLB handling code gets some cleanups and possible slight optimizations in Bang Li's series "Add update_mmu_tlb_range() to simplify code". - Jane Chu has improved the handling of our fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in the series "Enhance soft hwpoison handling and injection". - Jeff Johnson has sent a billion patches everywhere to add MODULE_DESCRIPTION() to everything. Some landed in this pull. - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang has simplified migration's use of hardware-offload memory copying. - Yosry Ahmed performs more folio API conversions in his series "mm: zswap: trivial folio conversions". - In the series "large folios swap-in: handle refault cases first", Chuanhua Han inches us forward in the handling of large pages in the swap code. This is a cleanup and optimization, working toward the end objective of full support of large folio swapin/out. - In the series "mm,swap: cleanup VMA based swap readahead window calculation", Huang Ying has contributed some cleanups and a possible fixlet to his VMA based swap readahead code. - In the series "add mTHP support for anonymous shmem" Baolin Wang has taught anonymous shmem mappings to use multisize THP. By default this is a no-op - users must opt in vis sysfs controls. Dramatic improvements in pagefault latency are realized. - David Hildenbrand has some cleanups to our remaining use of page_mapcount() in the series "fs/proc: move page_mapcount() to fs/proc/internal.h". - David also has some highmem accounting cleanups in the series "mm/highmem: don't track highmem pages manually". - Build-time fixes and cleanups from John Hubbard in the series "cleanups, fixes, and progress towards avoiding "make headers"". - Cleanups and consolidation of the core pagemap handling from Barry Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers and utilize them". - Lance Yang's series "Reclaim lazyfree THP without splitting" has reduced the latency of the reclaim of pmd-mapped THPs under fairly common circumstances. A 10x speedup is seen in a microbenchmark. It does this by punting to aother CPU but I guess that's a win unless all CPUs are pegged. - hugetlb_cgroup cleanups from Xiu Jianfeng in the series "mm/hugetlb_cgroup: rework on cftypes". - Miaohe Lin's series "Some cleanups for memory-failure" does just that thing. - Someone other than SeongJae has developed a DAMON feature in Honggyu Kim's series "DAMON based tiered memory management for CXL memory". This adds DAMON features which may be used to help determine the efficiency of our placement of CXL/PCIe attached DRAM. - DAMON user API centralization and simplificatio work in SeongJae Park's series "mm/damon: introduce DAMON parameters online commit function". - In the series "mm: page_type, zsmalloc and page_mapcount_reset()" David Hildenbrand does some maintenance work on zsmalloc - partially modernizing its use of pageframe fields. - Kefeng Wang provides more folio conversions in the series "mm: remove page_maybe_dma_pinned() and page_mkclean()". - More cleanup from David Hildenbrand, this time in the series "mm/memory_hotplug: use PageOffline() instead of PageReserved() for !ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline() pages" and permits the removal of some virtio-mem hacks. - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and __folio_add_anon_rmap()" is a cleanup to the anon folio handling in preparation for mTHP (multisize THP) swapin. - Kefeng Wang's series "mm: improve clear and copy user folio" implements more folio conversions, this time in the area of large folio userspace copying. - The series "Docs/mm/damon/maintaier-profile: document a mailing tool and community meetup series" tells people how to get better involved with other DAMON developers. From SeongJae Park. - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does that. - David Hildenbrand sends along more cleanups, this time against the migration code. The series is "mm/migrate: move NUMA hinting fault folio isolation + checks under PTL". - Jan Kara has found quite a lot of strangenesses and minor errors in the readahead code. He addresses this in the series "mm: Fix various readahead quirks". - SeongJae Park's series "selftests/damon: test DAMOS tried regions and {min,max}_nr_regions" adds features and addresses errors in DAMON's self testing code. - Gavin Shan has found a userspace-triggerable WARN in the pagecache code. The series "mm/filemap: Limit page cache size to that supported by xarray" addresses this. The series is marked cc:stable. - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations and cleanup" cleans up and slightly optimizes KSM. - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of code motion. The series (which also makes the memcg-v1 code Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put under config option" and "mm: memcg: put cgroup v1-specific memcg data under CONFIG_MEMCG_V1" - Dan Schatzberg's series "Add swappiness argument to memory.reclaim" adds an additional feature to this cgroup-v2 control file. - The series "Userspace controls soft-offline pages" from Jiaqi Yan permits userspace to stop the kernel's automatic treatment of excessive correctable memory errors. In order to permit userspace to monitor and handle this situation. - Kefeng Wang's series "mm: migrate: support poison recover from migrate folio" teaches the kernel to appropriately handle migration from poisoned source folios rather than simply panicing. - SeongJae Park's series "Docs/damon: minor fixups and improvements" does those things. - In the series "mm/zsmalloc: change back to per-size_class lock" Chengming Zhou improves zsmalloc's scalability and memory utilization. - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for pinning memfd folios" makes the GUP code use FOLL_PIN rather than bare refcount increments. So these paes can first be moved aside if they reside in the movable zone or a CMA block. - Andrii Nakryiko has added a binary ioctl()-based API to /proc/pid/maps for much faster reading of vma information. The series is "query VMAs from /proc/<pid>/maps". - In the series "mm: introduce per-order mTHP split counters" Lance Yang improves the kernel's presentation of developer information related to multisize THP splitting. - Michael Ellerman has developed the series "Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)". This permits userspace to use all available huge page sizes. - In the series "revert unconditional slab and page allocator fault injection calls" Vlastimil Babka removes a performance-affecting and not very useful feature from slab fault injection. * tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits) mm/mglru: fix ineffective protection calculation mm/zswap: fix a white space issue mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio mm/hugetlb: fix possible recursive locking detected warning mm/gup: clear the LRU flag of a page before adding to LRU batch mm/numa_balancing: teach mpol_to_str about the balancing mode mm: memcg1: convert charge move flags to unsigned long long alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting lib: reuse page_ext_data() to obtain codetag_ref lib: add missing newline character in the warning message mm/mglru: fix overshooting shrinker memory mm/mglru: fix div-by-zero in vmpressure_calc_level() mm/kmemleak: replace strncpy() with strscpy() mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB mm: ignore data-race in __swap_writepage hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr mm: shmem: rename mTHP shmem counters mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async() mm/migrate: putback split folios when numa hint migration fails ...
2024-07-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm updates from Paolo Bonzini: "ARM: - Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement - Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogenous hardware - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol - FPSIMD/SVE support for nested, including merged trap configuration and exception routing - New command-line parameter to control the WFx trap behavior under KVM - Introduce kCFI hardening in the EL2 hypervisor - Fixes + cleanups for handling presence/absence of FEAT_TCRX - Miscellaneous fixes + documentation updates LoongArch: - Add paravirt steal time support - Add support for KVM_DIRTY_LOG_INITIALLY_SET - Add perf kvm-stat support for loongarch RISC-V: - Redirect AMO load/store access fault traps to guest - perf kvm stat support - Use guest files for IMSIC virtualization, when available s390: - Assortment of tiny fixes which are not time critical x86: - Fixes for Xen emulation - Add a global struct to consolidate tracking of host values, e.g. EFER - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX - Print the name of the APICv/AVIC inhibits in the relevant tracepoint - Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor - Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop - Update to the newfangled Intel CPU FMS infrastructure - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored - Misc cleanups x86 - MMU: - Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leafs SPTEs - Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages - Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards x86 - AMD: - Make per-CPU save_area allocations NUMA-aware - Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code - Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measure the initial image into guest memory, and finalizing it before launching it. Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data x86 - Intel: - Remove an unnecessary EPT TLB flush when enabling hardware - Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake eents for a vCPU executing HLT in L2 (with HLT-exiting disable by L1) - KVM: x86: Suppress MMIO that is triggered during task switch emulation Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if emulator detects exception") for more details on KVM's limitations with respect to emulated MMIO during complex emulator flows Generic: - Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE, because the special casing needed by these pages is not due to just unmovability (and in fact they are only unmovable because the CPU cannot access them) - New ioctl to populate the KVM page tables in advance, which is useful to mitigate KVM page faults during guest boot or after live migration. The code will also be used by TDX, but (probably) not through the ioctl - Enable halt poll shrinking by default, as Intel found it to be a clear win - Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out() - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs - Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout Selftests: - Remove dead code in the memslot modification stress test - Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs - Print the guest pseudo-RNG seed only when it changes, to avoid spamming the log for tests that create lots of VMs - Make the PMU counters test less flaky when counting LLC cache misses by doing CLFLUSH{OPT} in every loop iteration" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits) crypto: ccp: Add the SNP_VLEK_LOAD command KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops KVM: x86: Replace static_call_cond() with static_call() KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event x86/sev: Move sev_guest.h into common SEV header KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event KVM: x86: Suppress MMIO that is triggered during task switch emulation KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory() KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault" KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory KVM: Document KVM_PRE_FAULT_MEMORY ioctl mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE perf kvm: Add kvm-stat for loongarch64 LoongArch: KVM: Add PV steal time support in guest side ...
2024-07-20Merge tag 'libnvdimm-for-6.11' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm updates from Ira Weiny: - One small cleanup to use sizeof(*pointer) - Add MODULE_DESCRIPTIONS() to eliminate make W=1 warnings * tag 'libnvdimm-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: testing: nvdimm: Add MODULE_DESCRIPTION() macros testing: nvdimm: iomap: add MODULE_DESCRIPTION() dax: add missing MODULE_DESCRIPTION() macros nvdimm: add missing MODULE_DESCRIPTION() macros ACPI: NFIT: add missing MODULE_DESCRIPTION() macro nvdimm/btt: use sizeof(*pointer) instead of sizeof(type)