Age | Commit message (Collapse) | Author |
|
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Add a couple large ecmp group nexthop selftests to cover
the remnant fixed by d69100b8eee27c2d60ee52df76e0b80a8d492d34.
The tests create 100 x32 ecmp groups of ipv4 and ipv6 and then
dump them. On kernels without the fix, they will fail due
to data remnant during the dump.
Signed-off-by: Stephen Worley <sworley@cumulusnetworks.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
this command hangs forever:
# tc qdisc add dev eth0 root fq_pie flows 65536
watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [tc:1028]
[...]
CPU: 1 PID: 1028 Comm: tc Not tainted 5.7.0-rc6+ #167
RIP: 0010:fq_pie_init+0x60e/0x8b7 [sch_fq_pie]
Code: 4c 89 65 50 48 89 f8 48 c1 e8 03 42 80 3c 30 00 0f 85 2a 02 00 00 48 8d 7d 10 4c 89 65 58 48 89 f8 48 c1 e8 03 42 80 3c 30 00 <0f> 85 a7 01 00 00 48 8d 7d 18 48 c7 45 10 46 c3 23 00 48 89 f8 48
RSP: 0018:ffff888138d67468 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
RAX: 1ffff9200018d2b2 RBX: ffff888139c1c400 RCX: ffffffffffffffff
RDX: 000000000000c5e8 RSI: ffffc900000e5000 RDI: ffffc90000c69590
RBP: ffffc90000c69580 R08: fffffbfff79a9699 R09: fffffbfff79a9699
R10: 0000000000000700 R11: fffffbfff79a9698 R12: ffffc90000c695d0
R13: 0000000000000000 R14: dffffc0000000000 R15: 000000002347c5e8
FS: 00007f01e1850e40(0000) GS:ffff88814c880000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000067c340 CR3: 000000013864c000 CR4: 0000000000340ee0
Call Trace:
qdisc_create+0x3fd/0xeb0
tc_modify_qdisc+0x3be/0x14a0
rtnetlink_rcv_msg+0x5f3/0x920
netlink_rcv_skb+0x121/0x350
netlink_unicast+0x439/0x630
netlink_sendmsg+0x714/0xbf0
sock_sendmsg+0xe2/0x110
____sys_sendmsg+0x5b4/0x890
___sys_sendmsg+0xe9/0x160
__sys_sendmsg+0xd3/0x170
do_syscall_64+0x9a/0x370
entry_SYSCALL_64_after_hwframe+0x44/0xa9
we can't accept 65536 as a valid number for 'nflows', because the loop on
'idx' in fq_pie_init() will never end. The extack message is correct, but
it doesn't say that 0 is not a valid number for 'flows': while at it, fix
this also. Add a tdc selftest to check correct validation of 'flows'.
CC: Ivan Vecera <ivecera@redhat.com>
Fixes: ec97ecf1ebe4 ("net: sched: add Flow Queue PIE packet scheduler")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Ivan Vecera <ivecera@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Linux 5.7-rc7
|
|
To align with recent recommended values. Will be configurable by future
patches.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Revert
45e29d119e99 ("x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long")
and add a comment to discourage someone else from making the same
mistake again.
It turns out that some user code fails to compile if __X32_SYSCALL_BIT
is unsigned long. See, for example [1] below.
[ bp: Massage and do the same thing in the respective tools/ header. ]
Fixes: 45e29d119e99 ("x86/syscalls: Make __X32_SYSCALL_BIT be unsigned long")
Reported-by: Thorsten Glaser <t.glaser@tarent.de>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@kernel.org
Link: [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=954294
Link: https://lkml.kernel.org/r/92e55442b744a5951fdc9cfee10badd0a5f7f828.1588983892.git.luto@kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux
Pull cpupower utility updates for v5.8-rc1 from Shuah Khan:
"This cpupower update for Linux 5.8-rc1 consists of a single
patch to fix coccicheck unneeded semicolon warning."
* tag 'linux-cpupower-5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
cpupower: Remove unneeded semicolon
|
|
The MSCC bug fix in 'net' had to be slightly adjusted because the
register accesses are done slightly differently in net-next.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull networking fixes from David Miller:
1) Fix RCU warnings in ipv6 multicast router code, from Madhuparna
Bhowmik.
2) Nexthop attributes aren't being checked properly because of
mis-initialized iterator, from David Ahern.
3) Revert iop_idents_reserve() change as it caused performance
regressions and was just working around what is really a UBSAN bug
in the compiler. From Yuqi Jin.
4) Read MAC address properly from ROM in bmac driver (double iteration
proceeds past end of address array), from Jeremy Kerr.
5) Add Microsoft Surface device IDs to r8152, from Marc Payne.
6) Prevent reference to freed SKB in __netif_receive_skb_core(), from
Boris Sukholitko.
7) Fix ACK discard behavior in rxrpc, from David Howells.
8) Preserve flow hash across packet scrubbing in wireguard, from Jason
A. Donenfeld.
9) Cap option length properly for SO_BINDTODEVICE in AX25, from Eric
Dumazet.
10) Fix encryption error checking in kTLS code, from Vadim Fedorenko.
11) Missing BPF prog ref release in flow dissector, from Jakub Sitnicki.
12) dst_cache must be used with BH disabled in tipc, from Eric Dumazet.
13) Fix use after free in mlxsw driver, from Jiri Pirko.
14) Order kTLS key destruction properly in mlx5 driver, from Tariq
Toukan.
15) Check devm_platform_ioremap_resource() return value properly in
several drivers, from Tiezhu Yang.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (71 commits)
net: smsc911x: Fix runtime PM imbalance on error
net/mlx4_core: fix a memory leak bug.
net: ethernet: ti: cpsw: fix ASSERT_RTNL() warning during suspend
net: phy: mscc: fix initialization of the MACsec protocol mode
net: stmmac: don't attach interface until resume finishes
net: Fix return value about devm_platform_ioremap_resource()
net/mlx5: Fix error flow in case of function_setup failure
net/mlx5e: CT: Correctly get flow rule
net/mlx5e: Update netdev txq on completions during closure
net/mlx5: Annotate mutex destroy for root ns
net/mlx5: Don't maintain a case of del_sw_func being null
net/mlx5: Fix cleaning unmanaged flow tables
net/mlx5: Fix memory leak in mlx5_events_init
net/mlx5e: Fix inner tirs handling
net/mlx5e: kTLS, Destroy key object after destroying the TIS
net/mlx5e: Fix allowed tc redirect merged eswitch offload cases
net/mlx5: Avoid processing commands before cmdif is ready
net/mlx5: Fix a race when moving command interface to events mode
net/mlx5: Add command entry handling completion
rxrpc: Fix a memory leak in rxkad_verify_response()
...
|
|
Remove unused variable "i", which was triggering a compiler warning.
Fixes: 29750f71a9b4 ("hugetlb_cgroup: add hugetlb_cgroup reservation tests")
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-By: Mina Almasry <almasrymina@google.com>
Cc: Brian Geffon <bgeffon@google.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Link: http://lkml.kernel.org/r/20200517001245.361762-2-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add mremap_dontunmap to .gitignore.
Fixes: 0c28759ee3c9 ("selftests: add MREMAP_DONTUNMAP selftest")
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Brian Geffon <bgeffon@google.com>
Link: http://lkml.kernel.org/r/20200517002509.362401-2-jhubbard@nvidia.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2020-05-23
The following pull-request contains BPF updates for your *net-next* tree.
We've added 50 non-merge commits during the last 8 day(s) which contain
a total of 109 files changed, 2776 insertions(+), 2887 deletions(-).
The main changes are:
1) Add a new AF_XDP buffer allocation API to the core in order to help
lowering the bar for drivers adopting AF_XDP support. i40e, ice, ixgbe
as well as mlx5 have been moved over to the new API and also gained a
small improvement in performance, from Björn Töpel and Magnus Karlsson.
2) Add getpeername()/getsockname() attach types for BPF sock_addr programs
in order to allow for e.g. reverse translation of load-balancer backend
to service address/port tuple from a connected peer, from Daniel Borkmann.
3) Improve the BPF verifier is_branch_taken() logic to evaluate pointers
being non-NULL, e.g. if after an initial test another non-NULL test on
that pointer follows in a given path, then it can be pruned right away,
from John Fastabend.
4) Larger rework of BPF sockmap selftests to make output easier to understand
and to reduce overall runtime as well as adding new BPF kTLS selftests
that run in combination with sockmap, also from John Fastabend.
5) Batch of misc updates to BPF selftests including fixing up test_align
to match verifier output again and moving it under test_progs, allowing
bpf_iter selftest to compile on machines with older vmlinux.h, and
updating config options for lirc and v6 segment routing helpers, from
Stanislav Fomichev, Andrii Nakryiko and Alan Maguire.
6) Conversion of BPF tracing samples outdated internal BPF loader to use
libbpf API instead, from Daniel T. Lee.
7) Follow-up to BPF kernel test infrastructure in order to fix a flake in
the XDP selftests, from Jesper Dangaard Brouer.
8) Minor improvements to libbpf's internal hashmap implementation, from
Ian Rogers.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
test_lirc_mode2.sh assumes presence of /sys/class/rc/rc0/lirc*/uevent
which will not be present unless CONFIG_LIRC=y
Fixes: 6bdd533cee9a ("bpf: add selftest for lirc_mode2 type program")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1590147389-26482-3-git-send-email-alan.maguire@oracle.com
|
|
test_seg6_loop.o uses the helper bpf_lwt_seg6_adjust_srh();
it will not be present if CONFIG_IPV6_SEG6_BPF is not specified.
Fixes: b061017f8b4d ("selftests/bpf: add realistic loop tests")
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/1590147389-26482-2-git-send-email-alan.maguire@oracle.com
|
|
Getting a clean BPF selftests run involves ensuring latest trunk LLVM/clang
are used, pahole is recent (>=1.16) and config matches the specified
config file as closely as possible. Add to bpf_devel_QA.rst and point
tools/testing/selftests/bpf/README.rst to it.
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/1590146674-25485-1-git-send-email-alan.maguire@oracle.com
|
|
Starting from iputils s20190709 (used in Fedora 31), arping does not
support timeout being specified as a decimal:
$ arping -c 1 -I swp1 -b 192.0.2.66 -q -w 0.1
arping: invalid argument: '0.1'
Previously, such timeouts were rounded to an integer.
Fix this by specifying the timeout as an integer.
Fixes: a5ee171d087e ("selftests: mlxsw: qos_mc_aware: Add a test for UC awareness")
Signed-off-by: Amit Cohen <amitc@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The variable is used by log_test() to check if the test case completely
successfully or not. In case it is not initialized at the start of a
test case, it is possible for the test case to fail despite not
encountering any errors.
Example:
```
...
TEST: Trap group statistics [ OK ]
TEST: Trap policer [FAIL]
Policer drop counter was not incremented
TEST: Trap policer binding [FAIL]
Policer drop counter was not incremented
```
Failure of trap_policer_test() caused trap_policer_bind_test() to fail
as well.
Fix by adding missing initialization of the variable.
Fixes: 5fbff58e27a1 ("selftests: netdevsim: Add test cases for devlink-trap policers")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Daniel Borkmann says:
====================
pull-request: bpf 2020-05-22
The following pull-request contains BPF updates for your *net* tree.
We've added 3 non-merge commits during the last 3 day(s) which contain
a total of 5 files changed, 69 insertions(+), 11 deletions(-).
The main changes are:
1) Fix to reject mmap()'ing read-only array maps as writable since BPF verifier
relies on such map content to be frozen, from Andrii Nakryiko.
2) Fix breaking audit from secid_to_secctx() LSM hook by avoiding to use
call_int_hook() since this hook is not stackable, from KP Singh.
3) Fix BPF flow dissector program ref leak on netns cleanup, from Jakub Sitnicki.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This commit adds ipv4 and ipv6 fdb nexthop api tests to fib_nexthops.sh.
Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The core mask display is wrong in some cases. This is showing more
cpus than the mask has. This is because mask is 64 bit but it used
with BIT() macro to get the presence of CPU which doesn't support
unsigned long long. Added a new macro for BIT_ULL and use that
to get the presence of a CPU.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Increase CPU count so that more than 64 is supported in one request.
For example:
sudo ./intel-speed-select -d --cpu 0-66 core-power assoc -clos 0
The above command stops at 63. With this change, it can support more
CPU numbers from 0-255.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
The 'intel-speed-select -f json perf-profile get-lock-status' command
outputs the package, die, and cpu data as separate fields.
ex)
"package-0": {
"die-0": {
"cpu-0": {
Commit 74062363f855 ("tools/power/x86/intel-speed-select: Avoid duplicate Package strings for json") prettied this output so that it is a single line for
some json output commands and the same should be done for other commands.
Output package, die, and cpu info in a single line when using json output.
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: platform-driver-x86@vger.kernel.org
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
|
|
Adding a printk to test_sk_lookup_kern created the reported failure
where a pointer type is checked twice for NULL. Lets add it to the
progs test test_sk_lookup_kern.c so we test the case from C all the
way into the verifier.
We already have printk's in selftests so seems OK to add another one.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159009170603.6313.1715279795045285176.stgit@john-Precision-5820-Tower
|
|
When we have pointer type that is known to be non-null we only follow
the non-null branch. This adds tests to cover the map_value pointer
returned from a map lookup. To force an error if both branches are
followed we do an ALU op on R10.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/159009168650.6313.7434084136067263554.stgit@john-Precision-5820-Tower
|
|
When we have pointer type that is known to be non-null and comparing
against zero we only follow the non-null branch. This adds tests to
cover this case for reference tracking. Also add the other case when
comparison against a non-zero value and ensure we still fail with
unreleased reference.
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/159009166599.6313.1593680633787453767.stgit@john-Precision-5820-Tower
|
|
While working on commit b5372fe5dc84 ("exec: load_script: Do not exec
truncated interpreter path"), I wrote a series of test scripts to verify
corner cases. However, soon after, commit 6eb3c3d0a52d ("exec: increase
BINPRM_BUF_SIZE to 256") landed, resulting in the tests needing to be
refactored for the larger BINPRM_BUF_SIZE, which got lost on my TODO
list. During the recent exec refactoring work[1], the need for these tests
resurfaced, so I've finished them up for addition to the kernel selftests.
[1] https://lore.kernel.org/lkml/202005191144.E3112135@keescook/
Link: https://lkml.kernel.org/r/202005200204.D07DF079@keescook
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
gcc-10 switched to defaulting to -fno-common, which broke iproute2-5.4.
This was fixed in iproute-5.6, so switch to that. Because we're after a
stable testing surface, we generally don't like to bump these
unnecessarily, but in this case, being able to actually build is a basic
necessity.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As discussed in [0], it's dangerous to allow mapping BPF map, that's meant to
be frozen and is read-only on BPF program side, because that allows user-space
to actually store a writable view to the page even after it is frozen. This is
exacerbated by BPF verifier making a strong assumption that contents of such
frozen map will remain unchanged. To prevent this, disallow mapping
BPF_F_RDONLY_PROG mmap()'able BPF maps as writable, ever.
[0] https://lore.kernel.org/bpf/CAEf4BzYGWYhXdp6BJ7_=9OQPJxQpgug080MMjdSB72i9R+5c6g@mail.gmail.com/
Fixes: fc9702273e2e ("bpf: Add mmap() support for BPF_MAP_TYPE_ARRAY")
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Jann Horn <jannh@google.com>
Link: https://lore.kernel.org/bpf/20200519053824.1089415-1-andriin@fb.com
|
|
Objtool currently only compiles for x86 architectures. This is
fine as it presently does not support tooling for other
architectures. However, we would like to be able to convert other
kernel tools to run as objtool sub commands because they too
process ELF object files. This will allow us to convert tools
such as recordmcount to use objtool's ELF code.
Since much of recordmcount's ELF code is copy-paste code to/from
a variety of other kernel tools (look at modpost for example) this
means that if we can convert recordmcount we can convert more.
We define weak definitions for subcommand entry functions and other weak
definitions for shared functions critical to building existing
subcommands. These return 127 when the command is missing which signify
tools that do not exist on all architectures. In this case the "check"
and "orc" tools do not exist on all architectures so we only add them
for x86. Future changes adding support for "check", to arm64 for
example, can then modify the SUBCMD_CHECK variable when building for
arm64.
Objtool is not currently wired in to KConfig to be built for other
architectures because it's not needed for those architectures and
there are no commands it supports other than those for x86. As more
command support is enabled on various architectures the necessary
KConfig changes can be made (e.g. adding "STACK_VALIDATION") to
trigger building objtool.
[ jpoimboe: remove aliases, add __weak macro, add error messages ]
Cc: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Matt Helsley <mhelsley@vmware.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
|
The objtool_file structure describes the files objtool works on,
is used by the check subcommand, and the check.h header is included
by the orc subcommands so it's presently used by all subcommands.
Since the structure will be useful in all subcommands besides check,
and some subcommands may not want to include check.h to get the
definition, split the structure out into a new header meant for use
by all objtool subcommands.
Signed-off-by: Matt Helsley <mhelsley@vmware.com>
Reviewed-by: Julien Thierry <jthierry@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
|
When the user requests help it's not an error so do not exit with
a non-zero exit code. This is not especially useful for a user but
any script that might wish to check that objtool --help is at least
available can't rely on the exit code to crudely check that, for
example, building an objtool executable succeeds.
Signed-off-by: Matt Helsley <mhelsley@vmware.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
|
|
check_kcov_mode() is called by write_comp_data() and
__sanitizer_cov_trace_pc(), which are already on the uaccess safe list.
It's notrace and doesn't call out to anything else, so add it to the
list too.
This fixes the following warnings:
kernel/kcov.o: warning: objtool: __sanitizer_cov_trace_pc()+0x15: call to check_kcov_mode() with UACCESS enabled
kernel/kcov.o: warning: objtool: write_comp_data()+0x1b: call to check_kcov_mode() with UACCESS enabled
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into HEAD
|
|
b9f4c01f3e0b ("selftest/bpf: Make bpf_iter selftest compilable against old vmlinux.h")
missed the fact that bpf_iter_test_kern{3,4}.c are not just including
bpf_iter_test_kern_common.h and need similar bpf_iter_meta re-definition
explicitly.
Fixes: b9f4c01f3e0b ("selftest/bpf: Make bpf_iter selftest compilable against old vmlinux.h")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200519192341.134360-1-andriin@fb.com
|
|
Add some basic stand alone self tests for HMM.
The test program and shell scripts use the test_hmm.ko driver to exercise
HMM functionality in the kernel.
Link: https://lore.kernel.org/r/20200422195028.3684-3-rcampbell@nvidia.com
Signed-off-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
|
|
It's good to be able to compile bpf_iter selftest even on systems that don't
have the very latest vmlinux.h, e.g., for libbpf tests against older kernels in
Travis CI. To that extent, re-define bpf_iter_meta and corresponding bpf_iter
context structs in each selftest. To avoid type clashes with vmlinux.h, rename
vmlinux.h's definitions to get them out of the way.
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/20200518234516.3915052-1-andriin@fb.com
|
|
Sync tools/include/uapi/linux/bpf.h from include/uapi.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Extend the existing connect_force_port test to assert get{peer,sock}name programs
as well. The workflow for e.g. IPv4 is as follows: i) server binds to concrete
port, ii) client calls getsockname() on server fd which exposes 1.2.3.4:60000 to
client, iii) client connects to service address 1.2.3.4:60000 binds to concrete
local address (127.0.0.1:22222) and remaps service address to a concrete backend
address (127.0.0.1:60123), iv) client then calls getsockname() on its own fd to
verify local address (127.0.0.1:22222) and getpeername() on its own fd which then
publishes service address (1.2.3.4:60000) instead of actual backend. Same workflow
is done for IPv6 just with different address/port tuples.
# ./test_progs -t connect_force_port
#14 connect_force_port:OK
Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/3343da6ad08df81af715a95d61a84fb4a960f2bf.1589841594.git.daniel@iogearbox.net
|
|
Make bpftool aware and add the new get{peer,sock}name attach types to its
cli, documentation and bash completion to allow attachment/detachment of
sock_addr programs there.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/9765b3d03e4c29210c4df56a9cc7e52f5f7bb5ef.1589841594.git.daniel@iogearbox.net
|
|
Trivial patch to add the new get{peer,sock}name attach types to the section
definitions in order to hook them up to sock_addr cgroup program type.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/7fcd4b1e41a8ebb364754a5975c75a7795051bd2.1589841594.git.daniel@iogearbox.net
|
|
As stated in 983695fa6765 ("bpf: fix unconnected udp hooks"), the objective
for the existing cgroup connect/sendmsg/recvmsg/bind BPF hooks is to be
transparent to applications. In Cilium we make use of these hooks [0] in
order to enable E-W load balancing for existing Kubernetes service types
for all Cilium managed nodes in the cluster. Those backends can be local
or remote. The main advantage of this approach is that it operates as close
as possible to the socket, and therefore allows to avoid packet-based NAT
given in connect/sendmsg/recvmsg hooks we only need to xlate sock addresses.
This also allows to expose NodePort services on loopback addresses in the
host namespace, for example. As another advantage, this also efficiently
blocks bind requests for applications in the host namespace for exposed
ports. However, one missing item is that we also need to perform reverse
xlation for inet{,6}_getname() hooks such that we can return the service
IP/port tuple back to the application instead of the remote peer address.
The vast majority of applications does not bother about getpeername(), but
in a few occasions we've seen breakage when validating the peer's address
since it returns unexpectedly the backend tuple instead of the service one.
Therefore, this trivial patch allows to customise and adds a getpeername()
as well as getsockname() BPF cgroup hook for both IPv4 and IPv6 in order
to address this situation.
Simple example:
# ./cilium/cilium service list
ID Frontend Service Type Backend
1 1.2.3.4:80 ClusterIP 1 => 10.0.0.10:80
Before; curl's verbose output example, no getpeername() reverse xlation:
# curl --verbose 1.2.3.4
* Rebuilt URL to: 1.2.3.4/
* Trying 1.2.3.4...
* TCP_NODELAY set
* Connected to 1.2.3.4 (10.0.0.10) port 80 (#0)
> GET / HTTP/1.1
> Host: 1.2.3.4
> User-Agent: curl/7.58.0
> Accept: */*
[...]
After; with getpeername() reverse xlation:
# curl --verbose 1.2.3.4
* Rebuilt URL to: 1.2.3.4/
* Trying 1.2.3.4...
* TCP_NODELAY set
* Connected to 1.2.3.4 (1.2.3.4) port 80 (#0)
> GET / HTTP/1.1
> Host: 1.2.3.4
> User-Agent: curl/7.58.0
> Accept: */*
[...]
Originally, I had both under a BPF_CGROUP_INET{4,6}_GETNAME type and exposed
peer to the context similar as in inet{,6}_getname() fashion, but API-wise
this is suboptimal as it always enforces programs having to test for ctx->peer
which can easily be missed, hence BPF_CGROUP_INET{4,6}_GET{PEER,SOCK}NAME split.
Similarly, the checked return code is on tnum_range(1, 1), but if a use case
comes up in future, it can easily be changed to return an error code instead.
Helper and ctx member access is the same as with connect/sendmsg/etc hooks.
[0] https://github.com/cilium/cilium/blob/master/bpf/bpf_sock.c
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Andrey Ignatov <rdna@fb.com>
Link: https://lore.kernel.org/bpf/61a479d759b2482ae3efb45546490bacd796a220.1589841594.git.daniel@iogearbox.net
|
|
Get the noinstr section and annotation markers to base the RCU parts on.
|
|
semantic conflict
Resolve structural conflict between:
59566b0b622e: ("x86/ftrace: Have ftrace trampolines turn read-only at the end of system boot up")
which introduced a new reference to 'ftrace_epilogue', and:
0298739b7983: ("x86,ftrace: Fix ftrace_regs_caller() unwind")
Which renamed it to 'ftrace_caller_end'. Rename the new usage site in the merge commit.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The 'pref medium' attribute was moved in iproute2 to be near the prefix
which is where it applies versus after the last nexthop. The nexthop
tests were updated to drop the string from route checking, but it crept
in again with the compat tests.
Fixes: 4dddb5be136a ("selftests: net: add new testcases for nexthop API compat mode sysctl")
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It can be derived dynamically from the trap's name, so drop it.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
One blank line is enough.
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pull kvm fixes from Paolo Bonzini:
"A new testcase for guest debugging (gdbstub) that exposed a bunch of
bugs, mostly for AMD processors. And a few other x86 fixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Fix off-by-one error in kvm_vcpu_ioctl_x86_setup_mce
KVM: x86: Fix pkru save/restore when guest CR4.PKE=0, move it to x86.c
KVM: SVM: Disable AVIC before setting V_IRQ
KVM: Introduce kvm_make_all_cpus_request_except()
KVM: VMX: pass correct DR6 for GD userspace exit
KVM: x86, SVM: isolate vcpu->arch.dr6 from vmcb->save.dr6
KVM: SVM: keep DR6 synchronized with vcpu->arch.dr6
KVM: nSVM: trap #DB and #BP to userspace if guest debugging is on
KVM: selftests: Add KVM_SET_GUEST_DEBUG test
KVM: X86: Fix single-step with KVM_SET_GUEST_DEBUG
KVM: X86: Set RTM for DB_VECTOR too for KVM_EXIT_DEBUG
KVM: x86: fix DR6 delivery for various cases of #DB injection
KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly
|
|
Until now we have only had minimal ktls+sockmap testing when being
used with helpers and different sendmsg/sendpage patterns. Add a
pass with ktls here.
To run just ktls tests,
$ ./test_sockmap --whitelist="ktls"
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939736278.15176.5435314315563203761.stgit@john-Precision-5820-Tower
|
|
This adds a blacklist to test_sockmap. For example, now we can run
all apply and cork tests except those with timeouts by doing,
$ ./test_sockmap --whitelist "apply,cork" --blacklist "hang"
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939734350.15176.6643981099665208826.stgit@john-Precision-5820-Tower
|
|
Allow running specific tests with a comma deliminated whitelist. For example
to run all apply and cork tests.
$ ./test_sockmap --whitelist="cork,apply"
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Link: https://lore.kernel.org/bpf/158939732464.15176.1959113294944564542.stgit@john-Precision-5820-Tower
|