summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-11-12Merge branch 'bpf-support-private-stack-for-bpf-progs'Alexei Starovoitov
Yonghong Song says: ==================== bpf: Support private stack for bpf progs The main motivation for private stack comes from nested scheduler in sched-ext from Tejun. The basic idea is that - each cgroup will its own associated bpf program, - bpf program with parent cgroup will call bpf programs in immediate child cgroups. Let us say we have the following cgroup hierarchy: root_cg (prog0): cg1 (prog1): cg11 (prog11): cg111 (prog111) cg112 (prog112) cg12 (prog12): cg121 (prog121) cg122 (prog122) cg2 (prog2): cg21 (prog21) cg22 (prog22) cg23 (prog23) In the above example, prog0 will call a kfunc which will call prog1 and prog2 to get sched info for cg1 and cg2 and then the information is summarized and sent back to prog0. Similarly, prog11 and prog12 will be invoked in the kfunc and the result will be summarized and sent back to prog1, etc. The following illustrates a possible call sequence: ... -> bpf prog A -> kfunc -> ops.<callback_fn> (bpf prog B) ... Currently, for each thread, the x86 kernel allocate 16KB stack. Each bpf program (including its subprograms) has maximum 512B stack size to avoid potential stack overflow. Nested bpf programs further increase the risk of stack overflow. To avoid potential stack overflow caused by bpf programs, this patch set supported private stack and bpf program stack space is allocated during jit time. Using private stack for bpf progs can reduce or avoid potential kernel stack overflow. Currently private stack is applied to tracing programs like kprobe/uprobe, perf_event, tracepoint, raw tracepoint and struct_ops progs. Tracing progs enable private stack if any subprog stack size is more than a threshold (i.e. 64B). Struct-ops progs enable private stack based on particular struct op implementation which can enable private stack before verification at per-insn level. Struct-ops progs have the same treatment as tracing progs w.r.t when to enable private stack. For all these progs, the kernel will do recursion check (no nesting for per prog per cpu) to ensure that private stack won't be overwritten. The bpf_prog_aux struct has a callback func recursion_detected() which can be implemented by kernel subsystem to synchronously detect recursion, report error, etc. Only x86_64 arch supports private stack now. It can be extended to other archs later. Please see each individual patch for details. Change logs: v11 -> v12: - v11 link: https://lore.kernel.org/bpf/20241109025312.148539-1-yonghong.song@linux.dev/ - Fix a bug where allocated percpu space is less than actual private stack. - Add guard memory (before and after actual prog stack) to detect potential underflow/overflow. v10 -> v11: - v10 link: https://lore.kernel.org/bpf/20241107024138.3355687-1-yonghong.song@linux.dev/ - Use two bool variables, priv_stack_requested (used by struct-ops only) and jits_use_priv_stack, in order to make code cleaner. - Set env->prog->aux->jits_use_priv_stack to true if any subprog uses private stack. This is for struct-ops use case to kick in recursion protection. v9 -> v10: - v9 link: https://lore.kernel.org/bpf/20241104193455.3241859-1-yonghong.song@linux.dev/ - Simplify handling async cbs by making those async cb related progs using normal kernel stack. - Do percpu allocation in jit instead of verifier. v8 -> v9: - v8 link: https://lore.kernel.org/bpf/20241101030950.2677215-1-yonghong.song@linux.dev/ - Use enum to express priv stack mode. - Use bits in bpf_subprog_info struct to do subprog recursion check between main/async and async subprogs. - Fix potential memory leak. - Rename recursion detection func from recursion_skipped() to recursion_detected(). v7 -> v8: - v7 link: https://lore.kernel.org/bpf/20241029221637.264348-1-yonghong.song@linux.dev/ - Add recursion_skipped() callback func to bpf_prog->aux structure such that if a recursion miss happened and bpf_prog->aux->recursion_skipped is not NULL, the callback fn will be called so the subsystem can do proper action based on their respective design. v6 -> v7: - v6 link: https://lore.kernel.org/bpf/20241020191341.2104841-1-yonghong.song@linux.dev/ - Going back to do private stack allocation per prog instead per subtree. This can simplify implementation and avoid verifier complexity. - Handle potential nested subprog run if async callback exists. - Use struct_ops->check_member() callback to set whether a particular struct-ops prog wants private stack or not. v5 -> v6: - v5 link: https://lore.kernel.org/bpf/20241017223138.3175885-1-yonghong.song@linux.dev/ - Instead of using (or not using) private stack at struct_ops level, each prog in struct_ops can decide whether to use private stack or not. v4 -> v5: - v4 link: https://lore.kernel.org/bpf/20241010175552.1895980-1-yonghong.song@linux.dev/ - Remove bpf_prog_call() related implementation. - Allow (opt-in) private stack for sched-ext progs. v3 -> v4: - v3 link: https://lore.kernel.org/bpf/20240926234506.1769256-1-yonghong.song@linux.dev/ There is a long discussion in the above v3 link trying to allow private stack to be used by kernel functions in order to simplify implementation. But unfortunately we didn't find a workable solution yet, so we return to the approach where private stack is only used by bpf programs. - Add bpf_prog_call() kfunc. v2 -> v3: - Instead of per-subprog private stack allocation, allocate private stacks at main prog or callback entry prog. Subprogs not main or callback progs will increment the inherited stack pointer to be their frame pointer. - Private stack allows each prog max stack size to be 512 bytes, intead of the whole prog hierarchy to be 512 bytes. - Add some tests. ==================== Link: https://lore.kernel.org/r/20241112163902.2223011-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12selftests/bpf: Add struct_ops prog private stack testsYonghong Song
Add three tests for struct_ops using private stack. ./test_progs -t struct_ops_private_stack #336/1 struct_ops_private_stack/private_stack:OK #336/2 struct_ops_private_stack/private_stack_fail:OK #336/3 struct_ops_private_stack/private_stack_recur:OK #336 struct_ops_private_stack:OK The following is a snippet of a struct_ops check_member() implementation: u32 moff = __btf_member_bit_offset(t, member) / 8; switch (moff) { case offsetof(struct bpf_testmod_ops3, test_1): prog->aux->priv_stack_requested = true; prog->aux->recursion_detected = test_1_recursion_detected; fallthrough; default: break; } return 0; The first test is with nested two different callback functions where the first prog has more than 512 byte stack size (including subprogs) with private stack enabled. The second test is a negative test where the second prog has more than 512 byte stack size without private stack enabled. The third test is the same callback function recursing itself. At run time, the jit trampoline recursion check kicks in to prevent the recursion. The recursion_detected() callback function is implemented by the bpf_testmod, the following message in dmesg bpf_testmod: oh no, recursing into test_1, recursion_misses 1 demonstrates the callback function is indeed triggered when recursion miss happens. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163938.2225528-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12bpf: Support private stack for struct_ops progsYonghong Song
For struct_ops progs, whether a particular prog uses private stack depends on prog->aux->priv_stack_requested setting before actual insn-level verification for that prog. One particular implementation is to piggyback on struct_ops->check_member(). The next patch has an example for this. The struct_ops->check_member() sets prog->aux->priv_stack_requested to be true which enables private stack usage. The struct_ops prog follows the same rule as kprobe/tracing progs after function bpf_enable_priv_stack(). For example, even a struct_ops prog requests private stack, it could still use normal kernel stack if the stack size is small (< 64 bytes). Similar to tracing progs, nested same cpu same prog run will be skipped. A field (recursion_detected()) is added to bpf_prog_aux structure. If bpf_prog->aux->recursion_detected is implemented by the struct_ops subsystem and nested same cpu/prog happens, the function will be triggered to report an error, collect related info, etc. Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163933.2224962-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12selftests/bpf: Add tracing prog private stack testsYonghong Song
Some private stack tests are added including: - main prog only with stack size greater than BPF_PSTACK_MIN_SIZE. - main prog only with stack size smaller than BPF_PSTACK_MIN_SIZE. - prog with one subprog having MAX_BPF_STACK stack size and another subprog having non-zero small stack size. - prog with callback function. - prog with exception in main prog or subprog. - prog with async callback without nesting - prog with async callback with possible nesting Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163927.2224750-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12bpf, x86: Support private stack in jitYonghong Song
Private stack is allocated in function bpf_int_jit_compile() with alignment 8. Private stack allocation size includes the stack size determined by verifier and additional space to protect stack overflow and underflow. See below an illustration: ---> memory address increasing [8 bytes to protect overflow] [normal stack] [8 bytes to protect underflow] If overflow/underflow is detected, kernel messages will be emited in dmesg like BPF private stack overflow/underflow detected for prog Fx BPF Private stack overflow/underflow detected for prog bpf_prog_a41699c234a1567a_subprog1x Those messages are generated when I made some changes to jitted code to intentially cause overflow for some progs. For the jited prog, The x86 register 9 (X86_REG_R9) is used to replace bpf frame register (BPF_REG_10). The private stack is used per subprog per cpu. The X86_REG_R9 is saved and restored around every func call (not including tailcall) to maintain correctness of X86_REG_R9. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163922.2224385-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12bpf, x86: Avoid repeated usage of bpf_prog->aux->stack_depthYonghong Song
Refactor the code to avoid repeated usage of bpf_prog->aux->stack_depth in do_jit() func. If the private stack is used, the stack_depth will be 0 for that prog. Refactoring make it easy to adjust stack_depth. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163917.2224189-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12bpf: Enable private stack for eligible subprogsYonghong Song
If private stack is used by any subprog, set that subprog prog->aux->jits_use_priv_stack to be true so later jit can allocate private stack for that subprog properly. Also set env->prog->aux->jits_use_priv_stack to be true if any subprog uses private stack. This is a use case for a single main prog (no subprogs) to use private stack, and also a use case for later struct-ops progs where env->prog->aux->jits_use_priv_stack will enable recursion check if any subprog uses private stack. Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163912.2224007-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12bpf: Find eligible subprogs for private stack supportYonghong Song
Private stack will be allocated with percpu allocator in jit time. To avoid complexity at runtime, only one copy of private stack is available per cpu per prog. So runtime recursion check is necessary to avoid stack corruption. Current private stack only supports kprobe/perf_event/tp/raw_tp which has recursion check in the kernel, and prog types that use bpf trampoline recursion check. For trampoline related prog types, currently only tracing progs have recursion checking. To avoid complexity, all async_cb subprogs use normal kernel stack including those subprogs used by both main prog subtree and async_cb subtree. Any prog having tail call also uses kernel stack. To avoid jit penalty with private stack support, a subprog stack size threshold is set such that only if the stack size is no less than the threshold, private stack is supported. The current threshold is 64 bytes. This avoids jit penality if the stack usage is small. A useless 'continue' is also removed from a loop in func check_max_stack_depth(). Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163907.2223839-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12Merge branch 'selftests-bpf-fix-for-bpf_signal-stalls-watchdog-for-test_progs'Alexei Starovoitov
Eduard Zingerman says: ==================== selftests/bpf: fix for bpf_signal stalls, watchdog for test_progs Test case 'bpf_signal' had been recently reported to stall, both on the mailing list [1] and CI [2]. The stall is caused by CPU cycles perf event not being delivered within expected time frame, before test process enters system call and waits indefinitely. This patch-set addresses the issue in several ways: - A watchdog timer is added to test_progs.c runner: - it prints current sub-test name to stderr if sub-test takes longer than 10 seconds to finish; - it terminates process executing sub-test if sub-test takes longer than 120 seconds to finish. - The test case is updated to await perf event notification with a timeout and a few retries, this serves two purposes: - busy loops longer to increase the time frame for CPU cycles event generation/delivery; - makes a timeout, not stall, a worst case scenario. - The test case is updated to lower frequency of perf events, as high frequency of such events caused events generation throttling, which in turn delayed events delivery by amount of time sufficient to cause test case failure. Note: librt pthread-based timer API is used to implement watchdog timer. I chose this API over SIGALRM because signal handler execution within test process context was sufficient to trigger perf event delivery for send_signal/send_signal_nmi_thread_remote test case, w/o any additional changes. Thus I concluded that SIGALRM based implementation interferes with tests execution. [1] https://lore.kernel.org/bpf/CAP01T75OUeE8E-Lw9df84dm8ag2YmHW619f1DmPSVZ5_O89+Bg@mail.gmail.com/ [2] https://github.com/kernel-patches/bpf/actions/runs/11791485271/job/32843996871 ==================== Link: https://lore.kernel.org/r/20241112110906.3045278-1-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12selftests/bpf: update send_signal to lower perf evemts frequencyEduard Zingerman
Similar to commit [1] sample perf events less often in test_send_signal_nmi(). This should reduce perf events throttling. [1] 7015843afcaf ("selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241112110906.3045278-5-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12selftests/bpf: allow send_signal test to timeoutEduard Zingerman
The following invocation: $ t1=send_signal/send_signal_perf_thread_remote \ t2=send_signal/send_signal_nmi_thread_remote \ ./test_progs -t $t1,$t2 Leads to send_signal_nmi_thread_remote to be stuck on a line 180: /* wait for result */ err = read(pipe_c2p[0], buf, 1); In this test case: - perf event PERF_COUNT_HW_CPU_CYCLES is created for parent process; - BPF program is attached to perf event, and sends a signal to child process when event occurs; - parent program burns some CPU in busy loop and calls read() to get notification from child that it received a signal. The perf event is declared with .sample_period = 1. This forces perf to throttle events, and under some unclear conditions the event does not always occur while parent is in busy loop. After parent enters read() system call CPU cycles event won't be generated for parent anymore. Thus, if perf event had not occurred already the test is stuck. This commit updates the parent to wait for notification with a timeout, doing several iterations of busy loop + read_with_timeout(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241112110906.3045278-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12selftests/bpf: add read_with_timeout() utility functionEduard Zingerman
int read_with_timeout(int fd, char *buf, size_t count, long usec) As a regular read(2), but allows to specify a timeout in micro-seconds. Returns -EAGAIN on timeout. Implemented using select(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241112110906.3045278-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12selftests/bpf: watchdog timer for test_progsEduard Zingerman
This commit provides a watchdog timer that sets a limit of how long a single sub-test could run: - if sub-test runs for 10 seconds, the name of the test is printed (currently the name of the test is printed only after it finishes); - if sub-test runs for 120 seconds, the running thread is terminated with SIGSEGV (to trigger crash_handler() and get a stack trace). Specifically: - the timer is armed on each call to run_one_test(); - re-armed at each call to test__start_subtest(); - is stopped when exiting run_one_test(). Default timeout could be overridden using '-w' or '--watchdog-timeout' options. Value 0 can be used to turn the timer off. Here is an example execution: $ ./ssh-exec.sh ./test_progs -w 5 -t \ send_signal/send_signal_perf_thread_remote,send_signal/send_signal_nmi_thread_remote WATCHDOG: test case send_signal/send_signal_nmi_thread_remote executes for 5 seconds, terminating with SIGSEGV Caught signal #11! Stack trace: ./test_progs(crash_handler+0x1f)[0x9049ef] /lib64/libc.so.6(+0x40d00)[0x7f1f1184fd00] /lib64/libc.so.6(read+0x4a)[0x7f1f1191cc4a] ./test_progs[0x720dd3] ./test_progs[0x71ef7a] ./test_progs(test_send_signal+0x1db)[0x71edeb] ./test_progs[0x9066c5] ./test_progs(main+0x5ed)[0x9054ad] /lib64/libc.so.6(+0x2a088)[0x7f1f11839088] /lib64/libc.so.6(__libc_start_main+0x8b)[0x7f1f1183914b] ./test_progs(_start+0x25)[0x527385] #292 send_signal:FAIL test_send_signal_common:PASS:reading pipe 0 nsec test_send_signal_common:PASS:reading pipe error: size 0 0 nsec test_send_signal_common:PASS:incorrect result 0 nsec test_send_signal_common:PASS:pipe_write 0 nsec test_send_signal_common:PASS:setpriority 0 nsec Timer is implemented using timer_{create,start} librt API. Internally librt uses pthreads for SIGEV_THREAD timers, so this change adds a background timer thread to the test process. Because of this a few checks in tests 'bpf_iter' and 'iters' need an update to account for an extra thread. For parallelized scenario the watchdog is also created for each worker fork. If one of the workers gets stuck, it would be terminated by a watchdog. In theory, this might lead to a scenario when all worker threads are exhausted, however this should not be a problem for server_main(), as it would exit with some of the tests not run. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241112110906.3045278-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-11Merge branch 'libbpf-stringify-error-codes-in-log-messages'Andrii Nakryiko
Mykyta Yatsenko says: ==================== libbpf: stringify error codes in log messages From: Mykyta Yatsenko <yatsenko@meta.com> Libbpf may report error in 2 ways: 1. Numeric errno 2. Errno's text representation, returned by strerror Both ways may be confusing for users: numeric code requires people to know how to find its meaning and strerror may be too generic and unclear. These patches modify libbpf error reporting by swapping numeric codes and strerror with the standard short error name, for example: "failed to attach: -22" becomes "failed to attach: -EINVAL". ==================== Link: https://patch.msgid.link/20241111212919.368971-1-mykyta.yatsenko5@gmail.com Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-11libbpf: Stringify errno in log messages in the remaining codeMykyta Yatsenko
Convert numeric error codes into the string representations in log messages in the rest of libbpf source files. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111212919.368971-5-mykyta.yatsenko5@gmail.com
2024-11-11libbpf: Stringify errno in log messages in btf*.cMykyta Yatsenko
Convert numeric error codes into the string representations in log messages in btf.c and btf_dump.c. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111212919.368971-4-mykyta.yatsenko5@gmail.com
2024-11-11libbpf: Stringify errno in log messages in libbpf.cMykyta Yatsenko
Convert numeric error codes into the string representations in log messages in libbpf.c. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111212919.368971-3-mykyta.yatsenko5@gmail.com
2024-11-11libbpf: Introduce errstr() for stringifying errnoMykyta Yatsenko
Add function errstr(int err) that allows converting numeric error codes into string representations. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111212919.368971-2-mykyta.yatsenko5@gmail.com
2024-11-11bpf: Replace the document for PTR_TO_BTF_ID_OR_NULLMenglong Dong
Commit c25b2ae13603 ("bpf: Replace PTR_TO_XXX_OR_NULL with PTR_TO_XXX | PTR_MAYBE_NULL") moved the fields around and misplaced the documentation for "PTR_TO_BTF_ID_OR_NULL". So, let's replace it in the proper place. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111124911.1436911-1-dongml2@chinatelecom.cn
2024-11-11tools/bpf: Fix the wrong format specifier in bpf_jit_disasmLuo Yifan
There is a static checker warning that the %d in format string is mismatched with the corresponding argument type, which could result in incorrect printed data. This patch fixes it. Signed-off-by: Luo Yifan <luoyifan@cmss.chinamobile.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111021004.272293-1-luoyifan@cmss.chinamobile.com
2024-11-11kbuild,bpf: Pass make jobs' value to paholeFlorian Schmaus
Pass the value of make's -j/--jobs argument to pahole, to avoid out of memory errors and make pahole respect the "jobs" value of make. On systems with little memory but many cores, invoking pahole using -j without argument potentially creates too many pahole instances, causing an out-of-memory situation. Instead, we should pass make's "jobs" value as an argument to pahole's -j, which is likely configured to be (much) lower than the actual core count on such systems. If make was invoked without -j, either via cmdline or MAKEFLAGS, then JOBS will be simply empty, resulting in the existing behavior, as expected. Signed-off-by: Florian Schmaus <flo@geekplace.eu> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Link: https://lore.kernel.org/bpf/20241102100452.793970-1-flo@geekplace.eu
2024-11-11Merge branch 'refactor-lock-management'Alexei Starovoitov
Kumar Kartikeya Dwivedi says: ==================== Refactor lock management This set refactors lock management in the verifier in preparation for spin locks that can be acquired multiple times. In addition to this, unnecessary code special case reference leak logic for callbacks is also dropped, that is no longer necessary. See patches for details. Changelog: ---------- v5 -> v6 v5: https://lore.kernel.org/bpf/20241109225243.2306756-1-memxor@gmail.com * Move active_locks mutation to {acquire,release}_lock_state (Alexei) v4 -> v5 v4: https://lore.kernel.org/bpf/20241109074347.1434011-1-memxor@gmail.com * Make active_locks part of bpf_func_state (Alexei) * Remove unneeded in_callback_fn logic for references v3 -> v4 v3: https://lore.kernel.org/bpf/20241104151716.2079893-1-memxor@gmail.com * Address comments from Alexei * Drop struct bpf_active_lock definition * Name enum type, expand definition to multiple lines * s/REF_TYPE_BPF_LOCK/REF_TYPE_LOCK/g * Change active_lock type to int * Fix type of 'type' in acquire_lock_state * Filter by taking type explicitly in find_lock_state * WARN for default case in refsafe switch statement v2 -> v3 v2: https://lore.kernel.org/bpf/20241103212252.547071-1-memxor@gmail.com * Rebase on bpf-next to resolve merge conflict v1 -> v2 v1: https://lore.kernel.org/bpf/20241103205856.345580-1-memxor@gmail.com * Fix refsafe state comparison to check callback_ref and ptr separately. ==================== Link: https://lore.kernel.org/r/20241109231430.2475236-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-11bpf: Drop special callback reference handlingKumar Kartikeya Dwivedi
Logic to prevent callbacks from acquiring new references for the program (i.e. leaving acquired references), and releasing caller references (i.e. those acquired in parent frames) was introduced in commit 9d9d00ac29d0 ("bpf: Fix reference state management for synchronous callbacks"). This was necessary because back then, the verifier simulated each callback once (that could potentially be executed N times, where N can be zero). This meant that callbacks that left lingering resources or cleared caller resources could do it more than once, operating on undefined state or leaking memory. With the fixes to callback verification in commit ab5cfac139ab ("bpf: verify callbacks as if they are called unknown number of times"), all of this extra logic is no longer necessary. Hence, drop it as part of this commit. Cc: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241109231430.2475236-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-11bpf: Refactor active lock managementKumar Kartikeya Dwivedi
When bpf_spin_lock was introduced originally, there was deliberation on whether to use an array of lock IDs, but since bpf_spin_lock is limited to holding a single lock at any given time, we've been using a single ID to identify the held lock. In preparation for introducing spin locks that can be taken multiple times, introduce support for acquiring multiple lock IDs. For this purpose, reuse the acquired_refs array and store both lock and pointer references. We tag the entry with REF_TYPE_PTR or REF_TYPE_LOCK to disambiguate and find the relevant entry. The ptr field is used to track the map_ptr or btf (for bpf_obj_new allocations) to ensure locks can be matched with protected fields within the same "allocation", i.e. bpf_obj_new object or map value. The struct active_lock is changed to an int as the state is part of the acquired_refs array, and we only need active_lock as a cheap way of detecting lock presence. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241109231430.2475236-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-11selftests/bpf: skip the timer_lockup test for single-CPU nodesViktor Malik
The timer_lockup test needs 2 CPUs to work, on single-CPU nodes it fails to set thread affinity to CPU 1 since it doesn't exist: # ./test_progs -t timer_lockup test_timer_lockup:PASS:timer_lockup__open_and_load 0 nsec test_timer_lockup:PASS:pthread_create thread1 0 nsec test_timer_lockup:PASS:pthread_create thread2 0 nsec timer_lockup_thread:PASS:cpu affinity 0 nsec timer_lockup_thread:FAIL:cpu affinity unexpected error: 22 (errno 0) test_timer_lockup:PASS: 0 nsec #406 timer_lockup:FAIL Skip the test if only 1 CPU is available. Signed-off-by: Viktor Malik <vmalik@redhat.com> Fixes: 50bd5a0c658d1 ("selftests/bpf: Add timer lockup selftest") Tested-by: Philo Lu <lulie@linux.alibaba.com> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241107115231.75200-1-vmalik@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-11Merge branch 'fix-lockdep-warning-for-htab-of-map'Alexei Starovoitov
Hou Tao says: ==================== The patch set fixes a lockdep warning for htab of map. The warning is found when running test_maps. The warning occurs when htab_put_fd_value() attempts to acquire map_idr_lock to free the map id of the inner map while already holding the bucket lock (raw_spinlock_t). The fix moves the invocation of free_htab_elem() after htab_unlock_bucket() and adds a test case to verify the solution. ==================== Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-11selftests/bpf: Test the update operations for htab of mapsHou Tao
Add test cases to verify the following four update operations on htab of maps don't trigger lockdep warning: (1) add then delete (2) add, overwrite, then delete (3) add, then lookup_and_delete (4) add two elements, then lookup_and_delete_batch Test cases are added for pre-allocated and non-preallocated htab of maps respectively. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241106063542.357743-4-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-11selftests/bpf: Move ENOTSUPP from bpf_util.hHou Tao
Moving the definition of ENOTSUPP into bpf_util.h to remove the duplicated definitions in multiple files. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241106063542.357743-3-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-11bpf: Call free_htab_elem() after htab_unlock_bucket()Hou Tao
For htab of maps, when the map is removed from the htab, it may hold the last reference of the map. bpf_map_fd_put_ptr() will invoke bpf_map_free_id() to free the id of the removed map element. However, bpf_map_fd_put_ptr() is invoked while holding a bucket lock (raw_spin_lock_t), and bpf_map_free_id() attempts to acquire map_idr_lock (spinlock_t), triggering the following lockdep warning: ============================= [ BUG: Invalid wait context ] 6.11.0-rc4+ #49 Not tainted ----------------------------- test_maps/4881 is trying to lock: ffffffff84884578 (map_idr_lock){+...}-{3:3}, at: bpf_map_free_id.part.0+0x21/0x70 other info that might help us debug this: context-{5:5} 2 locks held by test_maps/4881: #0: ffffffff846caf60 (rcu_read_lock){....}-{1:3}, at: bpf_fd_htab_map_update_elem+0xf9/0x270 #1: ffff888149ced148 (&htab->lockdep_key#2){....}-{2:2}, at: htab_map_update_elem+0x178/0xa80 stack backtrace: CPU: 0 UID: 0 PID: 4881 Comm: test_maps Not tainted 6.11.0-rc4+ #49 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), ... Call Trace: <TASK> dump_stack_lvl+0x6e/0xb0 dump_stack+0x10/0x20 __lock_acquire+0x73e/0x36c0 lock_acquire+0x182/0x450 _raw_spin_lock_irqsave+0x43/0x70 bpf_map_free_id.part.0+0x21/0x70 bpf_map_put+0xcf/0x110 bpf_map_fd_put_ptr+0x9a/0xb0 free_htab_elem+0x69/0xe0 htab_map_update_elem+0x50f/0xa80 bpf_fd_htab_map_update_elem+0x131/0x270 htab_map_update_elem+0x50f/0xa80 bpf_fd_htab_map_update_elem+0x131/0x270 bpf_map_update_value+0x266/0x380 __sys_bpf+0x21bb/0x36b0 __x64_sys_bpf+0x45/0x60 x64_sys_call+0x1b2a/0x20d0 do_syscall_64+0x5d/0x100 entry_SYSCALL_64_after_hwframe+0x76/0x7e One way to fix the lockdep warning is using raw_spinlock_t for map_idr_lock as well. However, bpf_map_alloc_id() invokes idr_alloc_cyclic() after acquiring map_idr_lock, it will trigger a similar lockdep warning because the slab's lock (s->cpu_slab->lock) is still a spinlock. Instead of changing map_idr_lock's type, fix the issue by invoking htab_put_fd_value() after htab_unlock_bucket(). However, only deferring the invocation of htab_put_fd_value() is not enough, because the old map pointers in htab of maps can not be saved during batched deletion. Therefore, also defer the invocation of free_htab_elem(), so these to-be-freed elements could be linked together similar to lru map. There are four callers for ->map_fd_put_ptr: (1) alloc_htab_elem() (through htab_put_fd_value()) It invokes ->map_fd_put_ptr() under a raw_spinlock_t. The invocation of htab_put_fd_value() can not simply move after htab_unlock_bucket(), because the old element has already been stashed in htab->extra_elems. It may be reused immediately after htab_unlock_bucket() and the invocation of htab_put_fd_value() after htab_unlock_bucket() may release the newly-added element incorrectly. Therefore, saving the map pointer of the old element for htab of maps before unlocking the bucket and releasing the map_ptr after unlock. Beside the map pointer in the old element, should do the same thing for the special fields in the old element as well. (2) free_htab_elem() (through htab_put_fd_value()) Its caller includes __htab_map_lookup_and_delete_elem(), htab_map_delete_elem() and __htab_map_lookup_and_delete_batch(). For htab_map_delete_elem(), simply invoke free_htab_elem() after htab_unlock_bucket(). For __htab_map_lookup_and_delete_batch(), just like lru map, linking the to-be-freed element into node_to_free list and invoking free_htab_elem() for these element after unlock. It is safe to reuse batch_flink as the link for node_to_free, because these elements have been removed from the hash llist. Because htab of maps doesn't support lookup_and_delete operation, __htab_map_lookup_and_delete_elem() doesn't have the problem, so kept it as is. (3) fd_htab_map_free() It invokes ->map_fd_put_ptr without raw_spinlock_t. (4) bpf_fd_htab_map_update_elem() It invokes ->map_fd_put_ptr without raw_spinlock_t. After moving free_htab_elem() outside htab bucket lock scope, using pcpu_freelist_push() instead of __pcpu_freelist_push() to disable the irq before freeing elements, and protecting the invocations of bpf_mem_cache_free() with migrate_{disable|enable} pair. Signed-off-by: Hou Tao <houtao1@huawei.com> Link: https://lore.kernel.org/r/20241106063542.357743-2-houtao@huaweicloud.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-11Merge branch 'bpf-add-uprobe-session-support'Andrii Nakryiko
Jiri Olsa says: ==================== bpf: Add uprobe session support hi, this patchset is adding support for session uprobe attachment and using it through bpf link for bpf programs. The session means that the uprobe consumer is executed on entry and return of probed function with additional control: - entry callback can control execution of the return callback - entry and return callbacks can share data/cookie Uprobe changes (on top of perf/core [1] are posted in here [2]. This patchset is based on bpf-next/master and will be merged once we pull [2] in bpf-next/master. v9 changes: - rebased on bpf-next/master with perf/core tag merged (thanks Peter!) thanks, jirka [1] git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/core [2] https://lore.kernel.org/bpf/20241018202252.693462-1-jolsa@kernel.org/T/#ma43c549c4bf684ca1b17fa638aa5e7cbb46893e9 --- ==================== Link: https://lore.kernel.org/r/20241108134544.480660-1-jolsa@kernel.org Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-11selftests/bpf: Add threads to consumer testJiri Olsa
With recent uprobe fix [1] the sync time after unregistering uprobe is much longer and prolongs the consumer test which creates and destroys hundreds of uprobes. This change adds 16 threads (which fits the test logic) and speeds up the test. Before the change: # perf stat --null ./test_progs -t uprobe_multi_test/consumers #421/9 uprobe_multi_test/consumers:OK #421 uprobe_multi_test:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Performance counter stats for './test_progs -t uprobe_multi_test/consumers': 28.818778973 seconds time elapsed 0.745518000 seconds user 0.919186000 seconds sys After the change: # perf stat --null ./test_progs -t uprobe_multi_test/consumers 2>&1 #421/9 uprobe_multi_test/consumers:OK #421 uprobe_multi_test:OK Summary: 1/1 PASSED, 0 SKIPPED, 0 FAILED Performance counter stats for './test_progs -t uprobe_multi_test/consumers': 3.504790814 seconds time elapsed 0.012141000 seconds user 0.751760000 seconds sys [1] commit 87195a1ee332 ("uprobes: switch to RCU Tasks Trace flavor for better performance") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-14-jolsa@kernel.org
2024-11-11selftests/bpf: Add uprobe sessions to consumer testJiri Olsa
Adding uprobe session consumers to the consumer test, so we get the session into the test mix. In addition scaling down the test to have just 1 uprobe and 1 uretprobe, otherwise the test time grows and is unsuitable for CI even with threads. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-13-jolsa@kernel.org
2024-11-11selftests/bpf: Add uprobe session single consumer testJiri Olsa
Testing that the session ret_handler bypass works on single uprobe with multiple consumers, each with different session ignore return value. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-12-jolsa@kernel.org
2024-11-11selftests/bpf: Add kprobe session verifier test for return valueJiri Olsa
Making sure kprobe.session program can return only [0,1] values. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-11-jolsa@kernel.org
2024-11-11selftests/bpf: Add uprobe session verifier test for return valueJiri Olsa
Making sure uprobe.session program can return only [0,1] values. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-10-jolsa@kernel.org
2024-11-11selftests/bpf: Add uprobe session recursive testJiri Olsa
Adding uprobe session test that verifies the cookie value is stored properly when single uprobe-ed function is executed recursively. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-9-jolsa@kernel.org
2024-11-11selftests/bpf: Add uprobe session cookie testJiri Olsa
Adding uprobe session test that verifies the cookie value get properly propagated from entry to return program. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-8-jolsa@kernel.org
2024-11-11selftests/bpf: Add uprobe session testJiri Olsa
Adding uprobe session test and testing that the entry program return value controls execution of the return probe program. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-7-jolsa@kernel.org
2024-11-11libbpf: Add support for uprobe multi session attachJiri Olsa
Adding support to attach program in uprobe session mode with bpf_program__attach_uprobe_multi function. Adding session bool to bpf_uprobe_multi_opts struct that allows to load and attach the bpf program via uprobe session. the attachment to create uprobe multi session. Also adding new program loader section that allows: SEC("uprobe.session/bpf_fentry_test*") and loads/attaches uprobe program as uprobe session. Adding sleepable hook (uprobe.session.s) as well. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-6-jolsa@kernel.org
2024-11-11bpf: Add support for uprobe multi session contextJiri Olsa
Placing bpf_session_run_ctx layer in between bpf_run_ctx and bpf_uprobe_multi_run_ctx, so the session data can be retrieved from uprobe_multi link. Plus granting session kfuncs access to uprobe session programs. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-5-jolsa@kernel.org
2024-11-11bpf: Add support for uprobe multi session attachJiri Olsa
Adding support to attach BPF program for entry and return probe of the same function. This is common use case which at the moment requires to create two uprobe multi links. Adding new BPF_TRACE_UPROBE_SESSION attach type that instructs kernel to attach single link program to both entry and exit probe. It's possible to control execution of the BPF program on return probe simply by returning zero or non zero from the entry BPF program execution to execute or not the BPF program on return probe respectively. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-4-jolsa@kernel.org
2024-11-11bpf: Force uprobe bpf program to always return 0Jiri Olsa
As suggested by Andrii make uprobe multi bpf programs to always return 0, so they can't force uprobe removal. Keeping the int return type for uprobe_prog_run, because it will be used in following session changes. Fixes: 89ae89f53d20 ("bpf: Add multi uprobe link") Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-3-jolsa@kernel.org
2024-11-11bpf: Allow return values 0 and 1 for kprobe sessionJiri Olsa
The kprobe session program can return only 0 or 1, instruct verifier to check for that. Fixes: 535a3692ba72 ("bpf: Add support for kprobe session attach") Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241108134544.480660-2-jolsa@kernel.org
2024-11-11selftests/bpf: Fix uprobe consumer test (again)Jiri Olsa
The new uprobe changes bring some new behaviour that we need to reflect in the consumer test. Now pending uprobe instance in the kernel can survive longer and thus might call uretprobe consumer callbacks in some situations in which, previously, such callback would be omitted. We now need to take that into account in uprobe-multi consumer tests. The idea being that uretprobe under test either stayed from before to after (uret_stays + test_bit) or uretprobe instance survived and we have uretprobe active in after (uret_survives + test_bit). uret_survives just states that uretprobe survives if there are *any* uretprobes both before and after (overlapping or not, doesn't matter) and uprobe was attached before. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241107094337.3848210-1-jolsa@kernel.org
2024-11-11bpf: Remove trailing whitespace in verifier.rstAbhinav Saxena
Remove trailing whitespace in Documentation/bpf/verifier.rst. Signed-off-by: Abhinav Saxena <xandfury@gmail.com> Link: https://lore.kernel.org/r/20241107063708.106340-2-xandfury@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-11selftests/bpf: Allow building with extra flagsViktor Malik
In order to specify extra compilation or linking flags to BPF selftests, it is possible to set EXTRA_CFLAGS and EXTRA_LDFLAGS from the command line. The problem is that they are not propagated to sub-make calls (runqslower, bpftool, libbpf) and in the better case are not applied, in the worse case cause the entire build fail. Propagate EXTRA_CFLAGS and EXTRA_LDFLAGS to the sub-makes. This, for instance, allows to build selftests as PIE with $ make EXTRA_CFLAGS='-fPIE' EXTRA_LDFLAGS='-pie' Without this change, the command would fail because libbpf.a would not be built with -fPIE and other PIE binaries would not link against it. The only problem is that we have to explicitly provide empty EXTRA_CFLAGS='' and EXTRA_LDFLAGS='' to the builds of kernel modules as we don't want to build modules with flags used for userspace (the above example would fail as kernel doesn't support PIE). Signed-off-by: Viktor Malik <vmalik@redhat.com> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-06Merge tag 'perf-core-for-bpf-next' from tip treeAndrii Nakryiko
Stable tag for bpf-next's uprobe work. Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-11-04Merge branch 'handle-possible-null-trusted-raw_tp-arguments'Alexei Starovoitov
Kumar Kartikeya Dwivedi says: ==================== Handle possible NULL trusted raw_tp arguments More context is available in [0], but the TLDR; is that the verifier incorrectly assumes that any raw tracepoint argument will always be non-NULL. This means that even when users correctly check possible NULL arguments, the verifier can remove the NULL check due to incorrect knowledge of the NULL-ness of the pointer. Secondly, kernel helpers or kfuncs taking these trusted tracepoint arguments incorrectly assume that all arguments will always be valid non-NULL. In this set, we mark raw_tp arguments as PTR_MAYBE_NULL on top of PTR_TRUSTED, but special case their behavior when dereferencing them or pointer arithmetic over them is involved. When passing trusted args to helpers or kfuncs, raw_tp programs are permitted to pass possibly NULL pointers in such cases. Any loads into such maybe NULL trusted PTR_TO_BTF_ID is promoted to a PROBE_MEM load to handle emanating page faults. The verifier will ensure NULL checks on such pointers are preserved and do not lead to dead code elimination. This new behavior is not applied when ref_obj_id is non-zero, as those pointers do not belong to raw_tp arguments, but instead acquired objects. Since helpers and kfuncs already require attention for PTR_TO_BTF_ID (non-trusted) pointers, we do not implement any protection for such cases in this patch set, and leave it as future work for an upcoming series. A selftest is included with this patch set to verify the new behavior, and it crashes the kernel without the first patch. [0]: https://lore.kernel.org/bpf/CAADnVQLMPPavJQR6JFsi3dtaaLHB816JN4HCV_TFWohJ61D+wQ@mail.gmail.com Changelog: ---------- v2 -> v3 v2: https://lore.kernel.org/bpf/20241103184144.3765700-1-memxor@gmail.com * Fix lenient check around check_ptr_to_btf_access allowing any PTR_TO_BTF_ID with PTR_MAYBE_NULL to be deref'd. * Add Juri and Jiri's Tested-by, Reviewed-by resp. v1 -> v2 v1: https://lore.kernel.org/bpf/20241101000017.3424165-1-memxor@gmail.com * Add patch to clean up users of gettid (Andrii) * Avoid nested blocks in sefltest (Andrii) * Prevent code motion optimization in selftest using barrier() ==================== Link: https://lore.kernel.org/r/20241104171959.2938862-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-04selftests/bpf: Add tests for raw_tp null handlingKumar Kartikeya Dwivedi
Ensure that trusted PTR_TO_BTF_ID accesses perform PROBE_MEM handling in raw_tp program. Without the previous fix, this selftest crashes the kernel due to a NULL-pointer dereference. Also ensure that dead code elimination does not kick in for checks on the pointer. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241104171959.2938862-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-04selftests/bpf: Clean up open-coded gettid syscall invocationsKumar Kartikeya Dwivedi
Availability of the gettid definition across glibc versions supported by BPF selftests is not certain. Currently, all users in the tree open-code syscall to gettid. Convert them to a common macro definition. Reviewed-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20241104171959.2938862-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>