summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-09-28net: ipa: kill unused status exceptionsAlex Elder
Only the deaggregation status exception type is ever actually used. If any other status exception type is reported we basically ignore it, and consume the packet. Remove the unused definitions of status exception type symbols; they can be added back when we actually handle them. Separately, two consecutive if statements test the same condition near the top of ipa_endpoint_suspend_one(). Instead, use a single test with a block that combines the previously-separate lines of code. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net: ipa: kill unused status opcodesAlex Elder
Three status opcodes are not currently supported. Symbols representing their numeric values are defined but never used. Remove those unused definitions; they can be defined again when they actually get used. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net: ipa: kill definition of TRE_FLAGS_IEOB_FMASKAlex Elder
In "gsi_trans.c", the field mask TRE_FLAGS_IEOB_FMASK is defined but never used. Although there's no harm in defining this, remove it for now and redefine it at some future date if it becomes needed. This is warned about if "W=2" is added to the build command. Signed-off-by: Alex Elder <elder@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28Merge branch 'bpf: add helpers to support BTF-based kernel'Alexei Starovoitov
Alan Maguire says: ==================== This series attempts to provide a simple way for BPF programs (and in future other consumers) to utilize BPF Type Format (BTF) information to display kernel data structures in-kernel. The use case this functionality is applied to here is to support a snprintf()-like helper to copy a BTF representation of kernel data to a string, and a BPF seq file helper to display BTF data for an iterator. There is already support in kernel/bpf/btf.c for "show" functionality; the changes here generalize that support from seq-file specific verifier display to the more generic case and add another specific use case; rather than seq_printf()ing the show data, it is copied to a supplied string using a snprintf()-like function. Other future consumers of the show functionality could include a bpf_printk_btf() function which printk()ed the data instead. Oops messaging in particular would be an interesting application for such functionality. The above potential use case hints at a potential reply to a reasonable objection that such typed display should be solved by tracing programs, where the in-kernel tracing records data and the userspace program prints it out. While this is certainly the recommended approach for most cases, I believe having an in-kernel mechanism would be valuable also. Critically in BPF programs it greatly simplifies debugging and tracing of such data to invoking a simple helper. One challenge raised in an earlier iteration of this work - where the BTF printing was implemented as a printk() format specifier - was that the amount of data printed per printk() was large, and other format specifiers were far simpler. Here we sidestep that concern by printing components of the BTF representation as we go for the seq file case, and in the string case the snprintf()-like operation is intended to be a basis for perf event or ringbuf output. The reasons for avoiding bpf_trace_printk are that 1. bpf_trace_printk() strings are restricted in size and cannot display anything beyond trivial data structures; and 2. bpf_trace_printk() is for debugging purposes only. As Alexei suggested, a bpf_trace_puts() helper could solve this in the future but it still would be limited by the 1000 byte limit for traced strings. Default output for an sk_buff looks like this (zeroed fields are omitted): (struct sk_buff){ .transport_header = (__u16)65535, .mac_header = (__u16)65535, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x000000007524fd8b, .data = (unsigned char *)0x000000007524fd8b, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, } Flags can modify aspects of output format; see patch 3 for more details. Changes since v6: - Updated safe data size to 32, object name size to 80. This increases the number of safe copies done, but performance is not a key goal here. WRT name size the largest type name length in bpf-next according to "pahole -s" is 64 bytes, so that still gives room for additional type qualifiers, parens etc within the name limit (Alexei, patch 2) - Remove inlines and converted as many #defines to functions as was possible. In a few cases - btf_show_type_value[s]() specifically - I left these as macros as btf_show_type_value[s]() prepends and appends format strings to the format specifier (in order to include indentation, delimiters etc so a macro makes that simpler (Alexei, patch 2) - Handle btf_resolve_size() error in btf_show_obj_safe() (Alexei, patch 2) - Removed clang loop unroll in BTF snprintf test (Alexei) - switched to using bpf_core_type_id_kernel(type) as suggested by Andrii, and Alexei noted that __builtin_btf_type_id(,1) should be used (patch 4) - Added skip logic if __builtin_btf_type_id is not available (patches 4,8) - Bumped limits on bpf iters to support printing larger structures (Alexei, patch 5) - Updated overflow bpf_iter tests to reflect new iter max size (patch 6) - Updated seq helper to use type id only (Alexei, patch 7) - Updated BTF task iter test to use task struct instead of struct fs_struct since new limits allow a task_struct to be displayed (patch 8) - Fixed E2BIG handling in iter task (Alexei, patch 8) Changes since v5: - Moved btf print prepare into patch 3, type show seq with flags into patch 2 (Alexei, patches 2,3) - Fixed build bot warnings around static declarations and printf attributes - Renamed functions to snprintf_btf/seq_printf_btf (Alexei, patches 3-6) Changes since v4: - Changed approach from a BPF trace event-centric design to one utilizing a snprintf()-like helper and an iter helper (Alexei, patches 3,5) - Added tests to verify BTF output (patch 4) - Added support to tests for verifying BTF type_id-based display as well as type name via __builtin_btf_type_id (Andrii, patch 4). - Augmented task iter tests to cover the BTF-based seq helper. Because a task_struct's BTF-based representation would overflow the PAGE_SIZE limit on iterator data, the "struct fs_struct" (task->fs) is displayed for each task instead (Alexei, patch 6). Changes since v3: - Moved to RFC since the approach is different (and bpf-next is closed) - Rather than using a printk() format specifier as the means of invoking BTF-enabled display, a dedicated BPF helper is used. This solves the issue of printk() having to output large amounts of data using a complex mechanism such as BTF traversal, but still provides a way for the display of such data to be achieved via BPF programs. Future work could include a bpf_printk_btf() function to invoke display via printk() where the elements of a data structure are printk()ed one at a time. Thanks to Petr Mladek, Andy Shevchenko and Rasmus Villemoes who took time to look at the earlier printk() format-specifier-focused version of this and provided feedback clarifying the problems with that approach. - Added trace id to the bpf_trace_printk events as a means of separating output from standard bpf_trace_printk() events, ensuring it can be easily parsed by the reader. - Added bpf_trace_btf() helper tests which do simple verification of the various display options. Changes since v2: - Alexei and Yonghong suggested it would be good to use probe_kernel_read() on to-be-shown data to ensure safety during operation. Safe copy via probe_kernel_read() to a buffer object in "struct btf_show" is used to support this. A few different approaches were explored including dynamic allocation and per-cpu buffers. The downside of dynamic allocation is that it would be done during BPF program execution for bpf_trace_printk()s using %pT format specifiers. The problem with per-cpu buffers is we'd have to manage preemption and since the display of an object occurs over an extended period and in printk context where we'd rather not change preemption status, it seemed tricky to manage buffer safety while considering preemption. The approach of utilizing stack buffer space via the "struct btf_show" seemed like the simplest approach. The stack size of the associated functions which have a "struct btf_show" on their stack to support show operation (btf_type_snprintf_show() and btf_type_seq_show()) stays under 500 bytes. The compromise here is the safe buffer we use is small - 256 bytes - and as a result multiple probe_kernel_read()s are needed for larger objects. Most objects of interest are smaller than this (e.g. "struct sk_buff" is 224 bytes), and while task_struct is a notable exception at ~8K, performance is not the priority for BTF-based display. (Alexei and Yonghong, patch 2). - safe buffer use is the default behaviour (and is mandatory for BPF) but unsafe display - meaning no safe copy is done and we operate on the object itself - is supported via a 'u' option. - pointers are prefixed with 0x for clarity (Alexei, patch 2) - added additional comments and explanations around BTF show code, especially around determining whether objects such zeroed. Also tried to comment safe object scheme used. (Yonghong, patch 2) - added late_initcall() to initialize vmlinux BTF so that it would not have to be initialized during printk operation (Alexei, patch 5) - removed CONFIG_BTF_PRINTF config option as it is not needed; CONFIG_DEBUG_INFO_BTF can be used to gate test behaviour and determining behaviour of type-based printk can be done via retrieval of BTF data; if it's not there BTF was unavailable or broken (Alexei, patches 4,6) - fix bpf_trace_printk test to use vmlinux.h and globals via skeleton infrastructure, removing need for perf events (Andrii, patch 8) Changes since v1: - changed format to be more drgn-like, rendering indented type info along with type names by default (Alexei) - zeroed values are omitted (Arnaldo) by default unless the '0' modifier is specified (Alexei) - added an option to print pointer values without obfuscation. The reason to do this is the sysctls controlling pointer display are likely to be irrelevant in many if not most tracing contexts. Some questions on this in the outstanding questions section below... - reworked printk format specifer so that we no longer rely on format %pT<type> but instead use a struct * which contains type information (Rasmus). This simplifies the printk parsing, makes use more dynamic and also allows specification by BTF id as well as name. - removed incorrect patch which tried to fix dereferencing of resolved BTF info for vmlinux; instead we skip modifiers for the relevant case (array element type determination) (Alexei). - fixed issues with negative snprintf format length (Rasmus) - added test cases for various data structure formats; base types, typedefs, structs, etc. - tests now iterate through all typedef, enum, struct and unions defined for vmlinux BTF and render a version of the target dummy value which is either all zeros or all 0xff values; the idea is this exercises the "skip if zero" and "print everything" cases. - added support in BPF for using the %pT format specifier in bpf_trace_printk() - added BPF tests which ensure %pT format specifier use works (Alexei). ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28selftests/bpf: Add test for bpf_seq_printf_btf helperAlan Maguire
Add a test verifying iterating over tasks and displaying BTF representation of task_struct succeeds. Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-9-git-send-email-alan.maguire@oracle.com
2020-09-28bpf: Add bpf_seq_printf_btf helperAlan Maguire
A helper is added to allow seq file writing of kernel data structures using vmlinux BTF. Its signature is long bpf_seq_printf_btf(struct seq_file *m, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); Flags and struct btf_ptr definitions/use are identical to the bpf_snprintf_btf helper, and the helper returns 0 on success or a negative error value. Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-8-git-send-email-alan.maguire@oracle.com
2020-09-28selftests/bpf: Fix overflow tests to reflect iter size increaseAlan Maguire
bpf iter size increase to PAGE_SIZE << 3 means overflow tests assuming page size need to be bumped also. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-7-git-send-email-alan.maguire@oracle.com
2020-09-28bpf: Bump iter seq size to support BTF representation of large data structuresAlan Maguire
BPF iter size is limited to PAGE_SIZE; if we wish to display BTF-based representations of larger kernel data structures such as task_struct, this will be insufficient. Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-6-git-send-email-alan.maguire@oracle.com
2020-09-28selftests/bpf: Add bpf_snprintf_btf helper testsAlan Maguire
Tests verifying snprintf()ing of various data structures, flags combinations using a tp_btf program. Tests are skipped if __builtin_btf_type_id is not available to retrieve BTF type ids. Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-5-git-send-email-alan.maguire@oracle.com
2020-09-28bpf: Add bpf_snprintf_btf helperAlan Maguire
A helper is added to support tracing kernel type information in BPF using the BPF Type Format (BTF). Its signature is long bpf_snprintf_btf(char *str, u32 str_size, struct btf_ptr *ptr, u32 btf_ptr_size, u64 flags); struct btf_ptr * specifies - a pointer to the data to be traced - the BTF id of the type of data pointed to - a flags field is provided for future use; these flags are not to be confused with the BTF_F_* flags below that control how the btf_ptr is displayed; the flags member of the struct btf_ptr may be used to disambiguate types in kernel versus module BTF, etc; the main distinction is the flags relate to the type and information needed in identifying it; not how it is displayed. For example a BPF program with a struct sk_buff *skb could do the following: static struct btf_ptr b = { }; b.ptr = skb; b.type_id = __builtin_btf_type_id(struct sk_buff, 1); bpf_snprintf_btf(str, sizeof(str), &b, sizeof(b), 0, 0); Default output looks like this: (struct sk_buff){ .transport_header = (__u16)65535, .mac_header = (__u16)65535, .end = (sk_buff_data_t)192, .head = (unsigned char *)0x000000007524fd8b, .data = (unsigned char *)0x000000007524fd8b, .truesize = (unsigned int)768, .users = (refcount_t){ .refs = (atomic_t){ .counter = (int)1, }, }, } Flags modifying display are as follows: - BTF_F_COMPACT: no formatting around type information - BTF_F_NONAME: no struct/union member names/types - BTF_F_PTR_RAW: show raw (unobfuscated) pointer values; equivalent to %px. - BTF_F_ZERO: show zero-valued struct/union members; they are not displayed by default Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-4-git-send-email-alan.maguire@oracle.com
2020-09-28bpf: Move to generic BTF show support, apply it to seq files/stringsAlan Maguire
generalize the "seq_show" seq file support in btf.c to support a generic show callback of which we support two instances; the current seq file show, and a show with snprintf() behaviour which instead writes the type data to a supplied string. Both classes of show function call btf_type_show() with different targets; the seq file or the string to be written. In the string case we need to track additional data - length left in string to write and length to return that we would have written (a la snprintf). By default show will display type information, field members and their types and values etc, and the information is indented based upon structure depth. Zeroed fields are omitted. Show however supports flags which modify its behaviour: BTF_SHOW_COMPACT - suppress newline/indent. BTF_SHOW_NONAME - suppress show of type and member names. BTF_SHOW_PTR_RAW - do not obfuscate pointer values. BTF_SHOW_UNSAFE - do not copy data to safe buffer before display. BTF_SHOW_ZERO - show zeroed values (by default they are not shown). Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-3-git-send-email-alan.maguire@oracle.com
2020-09-28bpf: Provide function to get vmlinux BTF informationAlan Maguire
It will be used later for BPF structure display support Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/1601292670-1616-2-git-send-email-alan.maguire@oracle.com
2020-09-28libbpf: Add btf__new_empty() to create an empty BTF objectAndrii Nakryiko
Add an ability to create an empty BTF object from scratch. This is going to be used by pahole for BTF encoding. And also by selftest for convenient creation of BTF objects. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-7-andriin@fb.com
2020-09-28libbpf: Allow modification of BTF and add btf__add_str APIAndrii Nakryiko
Allow internal BTF representation to switch from default read-only mode, in which raw BTF data is a single non-modifiable block of memory with BTF header, types, and strings layed out sequentially and contiguously in memory, into a writable representation with types and strings data split out into separate memory regions, that can be dynamically expanded. Such writable internal representation is transparent to users of libbpf APIs, but allows to append new types and strings at the end of BTF, which is a typical use case when generating BTF programmatically. All the basic guarantees of BTF types and strings layout is preserved, i.e., user can get `struct btf_type *` pointer and read it directly. Such btf_type pointers might be invalidated if BTF is modified, so some care is required in such mixed read/write scenarios. Switch from read-only to writable configuration happens automatically the first time when user attempts to modify BTF by either adding a new type or new string. It is still possible to get raw BTF data, which is a single piece of memory that can be persisted in ELF section or into a file as raw BTF. Such raw data memory is also still owned by BTF and will be freed either when BTF object is freed or if another modification to BTF happens, as any modification invalidates BTF raw representation. This patch adds the first two BTF manipulation APIs: btf__add_str(), which allows to add arbitrary strings to BTF string section, and btf__find_str() which allows to find existing string offset, but not add it if it's missing. All the added strings are automatically deduplicated. This is achieved by maintaining an additional string lookup index for all unique strings. Such index is built when BTF is switched to modifiable mode. If at that time BTF strings section contained duplicate strings, they are not de-duplicated. This is done specifically to not modify the existing content of BTF (types, their string offsets, etc), which can cause confusion and is especially important property if there is struct btf_ext associated with struct btf. By following this "imperfect deduplication" process, btf_ext is kept consitent and correct. If deduplication of strings is necessary, it can be forced by doing BTF deduplication, at which point all the strings will be eagerly deduplicated and all string offsets both in struct btf and struct btf_ext will be updated. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-6-andriin@fb.com
2020-09-28libbpf: Extract generic string hashing function for reuseAndrii Nakryiko
Calculating a hash of zero-terminated string is a common need when using hashmap, so extract it for reuse. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-5-andriin@fb.com
2020-09-28libbpf: Generalize common logic for managing dynamically-sized arraysAndrii Nakryiko
Managing dynamically-sized array is a common, but not trivial functionality, which significant amount of logic and code to implement properly. So instead of re-implementing it all the time, extract it into a helper function ans reuse. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-4-andriin@fb.com
2020-09-28libbpf: Remove assumption of single contiguous memory for BTF dataAndrii Nakryiko
Refactor internals of struct btf to remove assumptions that BTF header, type data, and string data are layed out contiguously in a memory in a single memory allocation. Now we have three separate pointers pointing to the start of each respective are: header, types, strings. In the next patches, these pointers will be re-assigned to point to independently allocated memory areas, if BTF needs to be modified. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-3-andriin@fb.com
2020-09-28libbpf: Refactor internals of BTF type indexAndrii Nakryiko
Refactor implementation of internal BTF type index to not use direct pointers. Instead it uses offset relative to the start of types data section. This allows for types data to be reallocatable, enabling implementation of modifiable BTF. As now getting type by ID has an extra indirection step, convert all internal type lookups to a new helper btf_type_id(), that returns non-const pointer to a type by its ID. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/bpf/20200926011357.2366158-2-andriin@fb.com
2020-09-28selftests: Remove fmod_ret from test_overheadToke Høiland-Jørgensen
The test_overhead prog_test included an fmod_ret program that attached to __set_task_comm() in the kernel. However, this function was never listed as allowed for return modification, so this only worked because of the verifier skipping tests when a trampoline already existed for the attach point. Now that the verifier checks have been fixed, remove fmod_ret from the test so it works again. Fixes: 4eaf0b5c5e04 ("selftest/bpf: Fmod_ret prog and implement test_overhead as part of bench") Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28bpf: verifier: refactor check_attach_btf_id()Toke Høiland-Jørgensen
The check_attach_btf_id() function really does three things: 1. It performs a bunch of checks on the program to ensure that the attachment is valid. 2. It stores a bunch of state about the attachment being requested in the verifier environment and struct bpf_prog objects. 3. It allocates a trampoline for the attachment. This patch splits out (1.) and (3.) into separate functions which will perform the checks, but return the computed values instead of directly modifying the environment. This is done in preparation for reusing the checks when the actual attachment is happening, which will allow tracing programs to have multiple (compatible) attachments. This also fixes a bug where a bunch of checks were skipped if a trampoline already existed for the tracing target. Fixes: 6ba43b761c41 ("bpf: Attachment verification for BPF_MODIFY_RETURN") Fixes: 1e6c62a88215 ("bpf: Introduce sleepable BPF programs") Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28bpf: change logging calls from verbose() to bpf_log() and use log pointerToke Høiland-Jørgensen
In preparation for moving code around, change a bunch of references to env->log (and the verbose() logging helper) to use bpf_log() and a direct pointer to struct bpf_verifier_log. While we're touching the function signature, mark the 'prog' argument to bpf_check_type_match() as const. Also enhance the bpf_verifier_log_needed() check to handle NULL pointers for the log struct so we can re-use the code with logging disabled. Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28bpf: disallow attaching modify_return tracing functions to other BPF programsToke Høiland-Jørgensen
From the checks and commit messages for modify_return, it seems it was never the intention that it should be possible to attach a tracing program with expected_attach_type == BPF_MODIFY_RETURN to another BPF program. However, check_attach_modify_return() will only look at the function name, so if the target function starts with "security_", the attach will be allowed even for bpf2bpf attachment. Fix this oversight by also blocking the modification if a target program is supplied. Fixes: 18644cec714a ("bpf: Fix use-after-free in fmod_ret check") Fixes: 6ba43b761c41 ("bpf: Attachment verification for BPF_MODIFY_RETURN") Acked-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28Merge branch 'Sockmap copying'Alexei Starovoitov
Lorenz Bauer says: ==================== Changes in v2: - Check sk_fullsock in map_update_elem (Martin) Enable calling map_update_elem on sockmaps from bpf_iter context. This in turn allows us to copy a sockmap by iterating its elements. The change itself is tiny, all thanks to the ground work from Martin, whose series [1] this patch is based on. I updated the tests to do some copying, and also included two cleanups. I'm sending this out now rather than when Martin's series has landed because I hope this can get in before the merge window (potentially) closes this weekend. 1: https://lore.kernel.org/bpf/20200925000337.3853598-1-kafai@fb.com/ ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2020-09-28selftest: bpf: Test copying a sockmap and sockhashLorenz Bauer
Since we can now call map_update_elem(sockmap) from bpf_iter context it's possible to copy a sockmap or sockhash in the kernel. Add a selftest which exercises this. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200928090805.23343-5-lmb@cloudflare.com
2020-09-28selftests: bpf: Remove shared header from sockmap iter testLorenz Bauer
The shared header to define SOCKMAP_MAX_ENTRIES is a bit overkill. Dynamically allocate the sock_fd array based on bpf_map__max_entries instead. Suggested-by: Yonghong Song <yhs@fb.com> Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Yonghong Song <yhs@fb.com> Link: https://lore.kernel.org/bpf/20200928090805.23343-4-lmb@cloudflare.com
2020-09-28selftests: bpf: Add helper to compare socket cookiesLorenz Bauer
We compare socket cookies to ensure that insertion into a sockmap worked. Pull this out into a helper function for use in other tests. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200928090805.23343-3-lmb@cloudflare.com
2020-09-28bpf: sockmap: Enable map_update_elem from bpf_iterLorenz Bauer
Allow passing a pointer to a BTF struct sock_common* when updating a sockmap or sockhash. Since BTF pointers can fault and therefore be NULL at runtime we need to add an additional !sk check to sock_map_update_elem. Since we may be passed a request or timewait socket we also need to check sk_fullsock. Doing this allows calling map_update_elem on sockmap from bpf_iter context, which uses BTF pointers. Signed-off-by: Lorenz Bauer <lmb@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200928090805.23343-2-lmb@cloudflare.com
2020-09-28Merge branch 'ibmvnic-refactor-some-send-handle-functions'David S. Miller
Lijun Pan says: ==================== ibmvnic: refactor some send/handle functions This patch series rename and factor some send crq request functions. The new naming aligns better with handle* functions such that it make the code easier to read and search by new contributors. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28ibmvnic: create send_control_ip_offloadLijun Pan
Factor send_control_ip_offload out of handle_query_ip_offload_rsp. Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28ibmvnic: create send_query_ip_offloadLijun Pan
Factor send_query_ip_offload out of handle_request_cap_rsp to pair with handle_query_ip_offload_rsp. Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28ibmvnic: rename send_map_query to send_query_mapLijun Pan
The new name send_query_map pairs with handle_query_map_rsp. Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28ibmvnic: rename ibmvnic_send_req_caps to send_request_capLijun Pan
The new name send_request_cap pairs with handle_request_cap_rsp. Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28ibmvnic: rename send_cap_queries to send_query_capLijun Pan
The new name send_query_cap pairs with handle_query_cap_rsp. Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28ibmvnic: set up 200GBPS speedLijun Pan
Set up the speed according to crq->query_phys_parms.rsp.speed. Fix IBMVNIC_10GBPS typo. Fixes: f8d6ae0d27ec ("ibmvnic: Report actual backing device speed and duplex values") Signed-off-by: Lijun Pan <ljp@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28atm: atmtcp: Constify atmtcp_v_dev_opsRikard Falkeborn
The only usage of atmtcp_v_dev_ops is to pass its address to atm_dev_register() which takes a pointer to const, and comparing its address to another address, which does not modify it. Make it const to allow the compiler to put it in read-only memory. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28ip6gre: avoid tx_error when sending MLD/DAD on external tunnelsDavide Caratti
similarly to what has been done with commit 9d149045b3c0 ("geneve: change from tx_error to tx_dropped on missing metadata"), avoid reporting errors to userspace in case the kernel doesn't find any tunnel information for a skb that is going to be transmitted: an increase of tx_dropped is enough. tested with the following script: # for t in ip6gre ip6gretap ip6erspan; do > ip link add dev gre6-test0 type $t external > ip address add dev gre6-test0 2001:db8::1/64 > ip link set dev gre6-test0 up > sleep 30 > ip -s -j link show dev gre6-test0 | jq \ > '.[0].stats64.tx | {"errors": .errors, "dropped": .dropped}' > ip link del dev gre6-test0 > done Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28Merge branch 'net-smc-introduce-SMC-Dv2-support'David S. Miller
Karsten Graul says: ==================== net/smc: introduce SMC-Dv2 support SMC-Dv2 support (see https://www.ibm.com/support/pages/node/6326337) provides multi-subnet support for SMC-D, eliminating the current same-subnet restriction. The new version detects if any of the virtual ISM devices are on the same system and can therefore be used for an SMC-Dv2 connection. Furthermore, SMC-Dv2 eliminates the need for PNET IDs on s390. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: CLC decline - V2 enhancementsUrsula Braun
This patch covers the small SMCD version 2 changes for CLC decline. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: introduce CLC first contact extensionUrsula Braun
SMC Version 2 defines a first contact extension for CLC accept and CLC confirm. This patch covers sending and receiving of the CLC first contact extension. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: CLC accept / confirm V2Ursula Braun
The new format of SMCD V2 CLC accept and confirm is introduced, and building and checking of SMCD V2 CLC accepts / confirms is adapted accordingly. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: determine accepted ISM devicesUrsula Braun
SMCD Version 2 allows to propose up to 8 additional ISM devices offered to the peer as candidates for SMCD communication. This patch covers the server side, i.e. selection of an ISM device matching one of the proposed ISM devices, that will be used for CLC accept Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: build and send V2 CLC proposalUrsula Braun
The new format of an SMCD V2 CLC proposal is introduced, and building and checking of SMCD V2 CLC proposals is adapted accordingly. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: determine proposed ISM devicesUrsula Braun
SMCD Version 2 allows to propose up to 8 additional ISM devices offered to the peer as candidates for SMCD communication. This patch covers determination of the ISM devices to be proposed. ISM devices without PNETID are preferred, since ISM devices with PNETID are a V1 leftover and will disappear over the time. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: introduce list of pnetids for Ethernet devicesUrsula Braun
SMCD version 2 allows usage of ISM devices with hardware PNETID only, if an Ethernet net_device exists with the same hardware PNETID. This requires to maintain a list of pnetids belonging to Ethernet net_devices, which is covered by this patch. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: introduce CHID callback for ISM devicesUrsula Braun
With SMCD version 2 the CHIDs of ISM devices are needed for the CLC handshake. This patch provides the new callback to retrieve the CHID of an ISM device. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: introduce System Enterprise ID (SEID)Ursula Braun
SMCD version 2 defines a System Enterprise ID (short SEID). This patch contains the SEID creation and adds the callback to retrieve the created SEID. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: prepare for more proposed ISM devicesUrsula Braun
SMCD Version 2 allows proposing of up to 8 ISM devices in addition to the native ISM device of SMCD Version 1. This patch prepares the struct smc_init_info to deal with these additional 8 ISM devices. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: split CLC confirm/accept data to be sentUrsula Braun
When sending CLC confirm and CLC accept, separate the trailing part of the message from the initial part (to be prepared for future first contact extension). Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: separate find device functionsUrsula Braun
This patch provides better separation of device determinations in function smc_listen_work(). No functional change. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-09-28net/smc: CLC header fields renamingUrsula Braun
SMCD version 2 defines 2 more bits in the CLC header to specify version 2 types. This patch prepares better naming of the CLC header fields. No functional change. Signed-off-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>