Age | Commit message (Collapse) | Author |
|
Newer versions of glibc annotate the poll() function with
__attribute__(access) which triggers a compiler warning inside the
testcase poll_fault.
Avoid this by using a plain NULL which is enough for the testcase.
To avoid potential future warnings also adapt the other EFAULT
testcases, except select_fault as NULL is a valid value for its
argument.
nolibc-test.c: In function ‘run_syscall’:
nolibc-test.c:338:62: warning: ‘poll’ writing 8 bytes into a region of size 0 overflows the destination [-Wstringop-overflow=]
338 | do { if (!(cond)) result(llen, SKIPPED); else ret += expect_syserr2(expr, expret, experr1, experr2, llen); } while (0)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
nolibc-test.c:341:9: note: in expansion of macro ‘EXPECT_SYSER2’
341 | EXPECT_SYSER2(cond, expr, expret, experr, 0)
| ^~~~~~~~~~~~~
nolibc-test.c:905:47: note: in expansion of macro ‘EXPECT_SYSER’
905 | CASE_TEST(poll_fault); EXPECT_SYSER(1, poll((void *)1, 1, 0), -1, EFAULT); break;
| ^~~~~~~~~~~~
cc1: note: destination object is likely at address zero
In file included from /usr/include/poll.h:1,
from nolibc-test.c:33:
/usr/include/sys/poll.h:54:12: note: in a call to function ‘poll’ declared with attribute ‘access (write_only, 1, 2)’
54 | extern int poll (struct pollfd *__fds, nfds_t __nfds, int __timeout)
| ^~~~
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
|
|
This function is only called by memcpy(), there is no real reason to
have this wrapper. Delete this function and move the code to memcpy()
directly.
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Reviewed-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
|
|
This nolibc internal function is not used. Delete it. It was probably
supposed to handle memmove(), but today the memmove() has its own
implementation.
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Reviewed-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
|
|
Simplify memset() on the x86-64 arch.
The x86-64 arch has a 'rep stosb' instruction, which can perform
memset() using only a single instruction, given:
%al = value (just like the second argument of memset())
%rdi = destination
%rcx = length
Before this patch:
```
00000000000010c9 <memset>:
10c9: 48 89 f8 mov %rdi,%rax
10cc: 48 85 d2 test %rdx,%rdx
10cf: 74 0e je 10df <memset+0x16>
10d1: 31 c9 xor %ecx,%ecx
10d3: 40 88 34 08 mov %sil,(%rax,%rcx,1)
10d7: 48 ff c1 inc %rcx
10da: 48 39 ca cmp %rcx,%rdx
10dd: 75 f4 jne 10d3 <memset+0xa>
10df: c3 ret
```
After this patch:
```
0000000000001511 <memset>:
1511: 96 xchg %eax,%esi
1512: 48 89 d1 mov %rdx,%rcx
1515: 57 push %rdi
1516: f3 aa rep stos %al,%es:(%rdi)
1518: 58 pop %rax
1519: c3 ret
```
v2:
- Use pushq %rdi / popq %rax (Alviro).
- Use xchg %eax, %esi (Willy).
Link: https://lore.kernel.org/lkml/ZO9e6h2jjVIMpBJP@1wt.eu
Suggested-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Suggested-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Reviewed-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
|
|
Simplify memcpy() and memmove() on the x86-64 arch.
The x86-64 arch has a 'rep movsb' instruction, which can perform
memcpy() using only a single instruction, given:
%rdi = destination
%rsi = source
%rcx = length
Additionally, it can also handle the overlapping case by setting DF=1
(backward copy), which can be used as the memmove() implementation.
Before this patch:
```
00000000000010ab <memmove>:
10ab: 48 89 f8 mov %rdi,%rax
10ae: 31 c9 xor %ecx,%ecx
10b0: 48 39 f7 cmp %rsi,%rdi
10b3: 48 83 d1 ff adc $0xffffffffffffffff,%rcx
10b7: 48 85 d2 test %rdx,%rdx
10ba: 74 25 je 10e1 <memmove+0x36>
10bc: 48 83 c9 01 or $0x1,%rcx
10c0: 48 39 f0 cmp %rsi,%rax
10c3: 48 c7 c7 ff ff ff ff mov $0xffffffffffffffff,%rdi
10ca: 48 0f 43 fa cmovae %rdx,%rdi
10ce: 48 01 cf add %rcx,%rdi
10d1: 44 8a 04 3e mov (%rsi,%rdi,1),%r8b
10d5: 44 88 04 38 mov %r8b,(%rax,%rdi,1)
10d9: 48 01 cf add %rcx,%rdi
10dc: 48 ff ca dec %rdx
10df: 75 f0 jne 10d1 <memmove+0x26>
10e1: c3 ret
00000000000010e2 <memcpy>:
10e2: 48 89 f8 mov %rdi,%rax
10e5: 48 85 d2 test %rdx,%rdx
10e8: 74 12 je 10fc <memcpy+0x1a>
10ea: 31 c9 xor %ecx,%ecx
10ec: 40 8a 3c 0e mov (%rsi,%rcx,1),%dil
10f0: 40 88 3c 08 mov %dil,(%rax,%rcx,1)
10f4: 48 ff c1 inc %rcx
10f7: 48 39 ca cmp %rcx,%rdx
10fa: 75 f0 jne 10ec <memcpy+0xa>
10fc: c3 ret
```
After this patch:
```
// memmove is an alias for memcpy
000000000040133b <memcpy>:
40133b: 48 89 d1 mov %rdx,%rcx
40133e: 48 89 f8 mov %rdi,%rax
401341: 48 89 fa mov %rdi,%rdx
401344: 48 29 f2 sub %rsi,%rdx
401347: 48 39 ca cmp %rcx,%rdx
40134a: 72 03 jb 40134f <memcpy+0x14>
40134c: f3 a4 rep movsb %ds:(%rsi),%es:(%rdi)
40134e: c3 ret
40134f: 48 8d 7c 0f ff lea -0x1(%rdi,%rcx,1),%rdi
401354: 48 8d 74 0e ff lea -0x1(%rsi,%rcx,1),%rsi
401359: fd std
40135a: f3 a4 rep movsb %ds:(%rsi),%es:(%rdi)
40135c: fc cld
40135d: c3 ret
```
v3:
- Make memmove as an alias for memcpy (Willy).
- Make the forward copy the likely case (Alviro).
v2:
- Fix the broken memmove implementation (David).
Link: https://lore.kernel.org/lkml/20230902062237.GA23141@1wt.eu
Link: https://lore.kernel.org/lkml/5a821292d96a4dbc84c96ccdc6b5b666@AcuMS.aculab.com
Suggested-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
|
|
Avoid any accidental reliance on system includes.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
|
This allows nolic to work with `-nostdinc` avoiding any reliance on
system headers.
The implementation has been lifted from musl libc 1.2.4.
There is already an implementation of stdarg.h in include/linux/stdarg.h
but that is GPL licensed and therefore not suitable for nolibc.
The used compiler builtins have been validated to be at least available
since GCC 4.1.2 and clang 3.0.0.
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Willy Tarreau <w@1wt.eu>
|
|
Otherwise the different instances of _start_c from each compilation unit
will lead to linker errors:
/usr/bin/ld: /tmp/ccSNvRqs.o: in function `_start_c':
nolibc-test-foo.c:(.text.nolibc_memset+0x9): multiple definition of `_start_c'; /tmp/ccG25101.o:nolibc-test.c:(.text+0x1ea3): first defined here
Fixes: 17336755150b ("tools/nolibc: add new crt.h with _start_c")
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/lkml/20231012-nolibc-start_c-multiple-v1-1-fbfc73e0283f@weissschuh.net/
Link: https://lore.kernel.org/lkml/20231012-nolibc-linkage-test-v1-1-315e682768b4@weissschuh.net/
Acked-by: Willy Tarreau <w@1wt.eu>
|
|
The ABI mandates that the %esp register must be a multiple of 16 when
executing a 'call' instruction.
Commit 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c")
simplified the _start function, but it didn't take care of the %esp
alignment, causing SIGSEGV on SSE and AVX programs that use aligned move
instruction (e.g., movdqa, movaps, and vmovdqa).
The 'and $-16, %esp' aligns the %esp at a multiple of 16. Then 'push
%eax' will subtract the %esp by 4; thus, it breaks the 16-byte
alignment. Make sure the %esp is correctly aligned after the push by
subtracting 12 before the push.
Extra:
Add 'add $12, %esp' before the 'and $-16, %esp' to avoid over-estimating
for particular cases as suggested by Willy.
A test program to validate the %esp alignment on _start can be found at:
https://lore.kernel.org/lkml/ZOoindMFj1UKqo+s@biznet-home.integral.gnuweeb.org
[ Thomas: trim Fixes tag commit id ]
Cc: Zhangjin Wu <falcon@tinylab.org>
Fixes: 2ab446336b17 ("tools/nolibc: i386: shrink _start with _start_c")
Reported-by: Nicholas Rosenberg <inori@vnlx.org>
Acked-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Reviewed-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
|
|
Some SoCs have firmware support to notify, if the system can't lower
power limit to a value requested from user space via RAPL constraints.
This test program waits for notification of power floor and prints. This
program can be used to test this feature and also allows other user space
programs to use as a reference.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Extend x86's state to forcefully load *all* host-supported xfeatures by
modifying xstate_bv in the saved state. Stuffing xstate_bv ensures that
the selftest is verifying KVM's full ABI regardless of whether or not the
guest code is successful in getting various xfeatures out of their INIT
state, e.g. see the disaster that is/was MPX.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230928001956.924301-6-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Expand x86's state test to load XSAVE state into a "dummy" vCPU prior to
KVM_SET_CPUID2, and again with an empty guest CPUID model. Except for
off-by-default features, i.e. AMX, KVM's ABI for KVM_SET_XSAVE is that
userspace is allowed to load xfeatures so long as they are supported by
the host. This is a regression test for a combination of KVM bugs where
the state saved by KVM_GET_XSAVE{2} could not be loaded via KVM_SET_XSAVE
if the saved xstate_bv would load guest-unsupported xfeatures.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230928001956.924301-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Modify support XSAVE state in the "state test's" guest code so that saving
and loading state via KVM_{G,S}ET_XSAVE actually does something useful,
i.e. so that xstate_bv in XSAVE state isn't empty.
Punt on BNDCSR for now, it's easier to just stuff that xfeature from the
host side.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230928001956.924301-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
These selftests are written in prog_tests style instead of adding
them to the existing test_sock_addr tests. Migrating the existing
sock addr tests to prog_tests style is left for future work. This
commit adds support for testing bind() sockaddr hooks, even though
there's no unix socket sockaddr hook for bind(). We leave this code
intact for when the INET and INET6 tests are migrated in the future
which do support intercepting bind().
Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Link: https://lore.kernel.org/r/20231011185113.140426-10-daan.j.demeyer@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
The mount directory for the selftests cgroup tree might
not exist so let's make sure it does exist by creating
it ourselves if it doesn't exist.
Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Link: https://lore.kernel.org/r/20231011185113.140426-9-daan.j.demeyer@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Add the necessary plumbing to hook up the new cgroup unix sockaddr
hooks into bpftool.
Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/r/20231011185113.140426-7-daan.j.demeyer@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Add the necessary plumbing to hook up the new cgroup unix sockaddr
hooks into libbpf.
Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Link: https://lore.kernel.org/r/20231011185113.140426-6-daan.j.demeyer@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
These hooks allows intercepting connect(), getsockname(),
getpeername(), sendmsg() and recvmsg() for unix sockets. The unix
socket hooks get write access to the address length because the
address length is not fixed when dealing with unix sockets and
needs to be modified when a unix socket address is modified by
the hook. Because abstract socket unix addresses start with a
NUL byte, we cannot recalculate the socket address in kernelspace
after running the hook by calculating the length of the unix socket
path using strlen().
These hooks can be used when users want to multiplex syscall to a
single unix socket to multiple different processes behind the scenes
by redirecting the connect() and other syscalls to process specific
sockets.
We do not implement support for intercepting bind() because when
using bind() with unix sockets with a pathname address, this creates
an inode in the filesystem which must be cleaned up. If we rewrite
the address, the user might try to clean up the wrong file, leaking
the socket in the filesystem where it is never cleaned up. Until we
figure out a solution for this (and a use case for intercepting bind()),
we opt to not allow rewriting the sockaddr in bind() calls.
We also implement recvmsg() support for connected streams so that
after a connect() that is modified by a sockaddr hook, any corresponding
recmvsg() on the connected socket can also be modified to make the
connected program think it is connected to the "intended" remote.
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Link: https://lore.kernel.org/r/20231011185113.140426-5-daan.j.demeyer@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Jiri added more careful handling of output of the code generator
to avoid wiping out existing files in
commit f65f305ae008 ("tools: ynl-gen: use temporary file for rendering")
Make use of the -o option in the Makefiles, it is already used
by ynl-regen.sh.
Link: https://lore.kernel.org/r/20231010202714.4045168-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These were missed when these hooks were first added so add them now
instead to make sure every sockaddr hook has a matching section name
test.
Signed-off-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Link: https://lore.kernel.org/r/20231011185113.140426-2-daan.j.demeyer@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux
Pull hyperv fixes from Wei Liu:
- fixes for Hyper-V VTL code (Saurabh Sengar and Olaf Hering)
- fix hv_kvp_daemon to support keyfile based connection profile
(Shradha Gupta)
* tag 'hyperv-fixes-signed-20231009' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
hv/hv_kvp_daemon:Support for keyfile based connection profile
hyperv: reduce size of ms_hyperv_info
x86/hyperv: Add common print prefix "Hyper-V" in hv_init
x86/hyperv: Remove hv_vtl_early_init initcall
x86/hyperv: Restrict get_vtl to only VTL platforms
|
|
Add mock_domain_alloc_user() and a new test case for
IOMMU_HWPT_ALLOC_NEST_PARENT.
Link: https://lore.kernel.org/r/20230928071528.26258-6-yi.l.liu@intel.com
Co-developed-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Ifcfg config file support in NetworkManger is deprecated. This patch
provides support for the new keyfile config format for connection
profiles in NetworkManager. The patch modifies the hv_kvp_daemon code
to generate the new network configuration in keyfile
format(.ini-style format) along with a ifcfg format configuration.
The ifcfg format configuration is also retained to support easy
backward compatibility for distro vendors. These configurations are
stored in temp files which are further translated using the
hv_set_ifconfig.sh script. This script is implemented by individual
distros based on the network management commands supported.
For example, RHEL's implementation could be found here:
https://gitlab.com/redhat/centos-stream/src/hyperv-daemons/-/blob/c9s/hv_set_ifconfig.sh
Debian's implementation could be found here:
https://github.com/endlessm/linux/blob/master/debian/cloud-tools/hv_set_ifconfig
The next part of this support is to let the Distro vendors consume
these modified implementations to the new configuration format.
Tested-on: Rhel9(Hyper-V, Azure)(nm and ifcfg files verified)
Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Reviewed-by: Ani Sinha <anisinha@redhat.com>
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Link: https://lore.kernel.org/r/1696847920-31125-1-git-send-email-shradhagupta@linux.microsoft.com
|
|
The code supports dumps with no input attributes currently
thru a combination of special-casing and luck.
Clean up the handling of ops with no inputs. Create empty
Structs, and skip printing of empty types.
This makes dos with no inputs work.
Tested-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Link: https://lore.kernel.org/r/20231006135032.3328523-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch extends the existing fib_lookup test suite by adding two test
cases (for each IP family):
* Test source IP selection from the egressing netdev.
* Test source IP selection when an IP route has a preferred src IP addr.
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Link: https://lore.kernel.org/r/20231007081415.33502-3-m@lambda.lt
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Extend the bpf_fib_lookup() helper by making it to return the source
IPv4/IPv6 address if the BPF_FIB_LOOKUP_SRC flag is set.
For example, the following snippet can be used to derive the desired
source IP address:
struct bpf_fib_lookup p = { .ipv4_dst = ip4->daddr };
ret = bpf_skb_fib_lookup(skb, p, sizeof(p),
BPF_FIB_LOOKUP_SRC | BPF_FIB_LOOKUP_SKIP_NEIGH);
if (ret != BPF_FIB_LKUP_RET_SUCCESS)
return TC_ACT_SHOT;
/* the p.ipv4_src now contains the source address */
The inability to derive the proper source address may cause malfunctions
in BPF-based dataplanes for hosts containing netdevs with more than one
routable IP address or for multi-homed hosts.
For example, Cilium implements packet masquerading in BPF. If an
egressing netdev to which the Cilium's BPF prog is attached has
multiple IP addresses, then only one [hardcoded] IP address can be used for
masquerading. This breaks connectivity if any other IP address should have
been selected instead, for example, when a public and private addresses
are attached to the same egress interface.
The change was tested with Cilium [1].
Nikolay Aleksandrov helped to figure out the IPv6 addr selection.
[1]: https://github.com/cilium/cilium/pull/28283
Signed-off-by: Martynas Pumputis <m@lambda.lt>
Link: https://lore.kernel.org/r/20231007081415.33502-2-m@lambda.lt
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
A previous commit updated the verifier to print an accurate failure
message for when someone specifies a nonzero return value from an async
callback. This adds a testcase for validating that the verifier emits
the correct message in such a case.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20231009161414.235829-2-void@manifault.com
|
|
A C string lacks alignment so use aligned arrays to avoid potential
alignment problems. Switch to using sizeof (less 1 for the \0
terminator) rather than a hardcode size constant.
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20231007044439.25171-2-irogers@google.com
|
|
libbpf accesses the ELF data requiring at least 8 byte alignment,
however, the data is generated into a C string that doesn't guarantee
alignment. Fix this by assigning to an aligned char array. Use sizeof
on the array, less one for the \0 terminator, rather than generating a
constant.
Fixes: a6cc6b34b93e ("bpftool: Provide a helper method for accessing skeleton's embedded ELF data")
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20231007044439.25171-1-irogers@google.com
|
|
Now that we support pinning a BPF timer to the current core, we should
test it with some selftests. This patch adds two new testcases to the
timer suite, which verifies that a BPF timer both with and without
BPF_F_TIMER_ABS, can be pinned to the calling core with BPF_F_TIMER_CPU_PIN.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/bpf/20231004162339.200702-3-void@manifault.com
|
|
BPF supports creating high resolution timers using bpf_timer_* helper
functions. Currently, only the BPF_F_TIMER_ABS flag is supported, which
specifies that the timeout should be interpreted as absolute time. It
would also be useful to be able to pin that timer to a core. For
example, if you wanted to make a subset of cores run without timer
interrupts, and only have the timer be invoked on a single core.
This patch adds support for this with a new BPF_F_TIMER_CPU_PIN flag.
When specified, the HRTIMER_MODE_PINNED flag is passed to
hrtimer_start(). A subsequent patch will update selftests to validate.
Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <song@kernel.org>
Acked-by: Hou Tao <houtao1@huawei.com>
Link: https://lore.kernel.org/bpf/20231004162339.200702-2-void@manifault.com
|
|
Martin reported that on his local dev machine the test_tc_chain_mixed() fails as
"test_tc_chain_mixed:FAIL:seen_tc5 unexpected seen_tc5: actual 1 != expected 0"
and others occasionally, too.
However, when running in a more isolated setup (qemu in particular), it works fine
for him. The reason is that there is a small race-window where seen_tc* could turn
into true for various test cases when there is background traffic, e.g. after the
asserts they often get reset. In such case when subsequent detach takes place,
unrelated background traffic could have already flipped the bool to true beforehand.
Add a small helper tc_skel_reset_all_seen() to reset all bools before we do the ping
test. At this point, everything is set up as expected and therefore no race can occur.
All tc_{opts,links} tests continue to pass after this change.
Reported-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20231006220655.1653-7-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Add a new test case to query on an empty bpf_mprog and pass the revision
directly into expected_revision for attachment to assert that this does
succeed.
./test_progs -t tc_opts
[ 1.406778] tsc: Refined TSC clocksource calibration: 3407.990 MHz
[ 1.408863] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fcaf6eb0, max_idle_ns: 440795321766 ns
[ 1.412419] clocksource: Switched to clocksource tsc
[ 1.428671] bpf_testmod: loading out-of-tree module taints kernel.
[ 1.430260] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
#252 tc_opts_after:OK
#253 tc_opts_append:OK
#254 tc_opts_basic:OK
#255 tc_opts_before:OK
#256 tc_opts_chain_classic:OK
#257 tc_opts_chain_mixed:OK
#258 tc_opts_delete_empty:OK
#259 tc_opts_demixed:OK
#260 tc_opts_detach:OK
#261 tc_opts_detach_after:OK
#262 tc_opts_detach_before:OK
#263 tc_opts_dev_cleanup:OK
#264 tc_opts_invalid:OK
#265 tc_opts_max:OK
#266 tc_opts_mixed:OK
#267 tc_opts_prepend:OK
#268 tc_opts_query:OK
#269 tc_opts_query_attach:OK <--- (new test)
#270 tc_opts_replace:OK
#271 tc_opts_revision:OK
Summary: 20/0 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20231006220655.1653-6-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Simplify __assert_mprog_count() to remove the -ENOENT corner case as the
bpf_prog_query() now returns 0 when no bpf_mprog is attached. This also
allows to convert a few test cases from using raw __assert_mprog_count()
over to plain assert_mprog_count() helper.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20231006220655.1653-5-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Add a new test case which performs double query of the bpf_mprog through
libbpf API, but also via raw bpf(2) syscall. This is testing to gather
first the count and then in a subsequent probe the full information with
the program array without clearing passed structs in between.
# ./vmtest.sh -- ./test_progs -t tc_opts
[...]
./test_progs -t tc_opts
[ 1.398818] tsc: Refined TSC clocksource calibration: 3407.999 MHz
[ 1.400263] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x311fd336761, max_idle_ns: 440795243819 ns
[ 1.402734] clocksource: Switched to clocksource tsc
[ 1.426639] bpf_testmod: loading out-of-tree module taints kernel.
[ 1.428112] bpf_testmod: module verification failed: signature and/or required key missing - tainting kernel
#252 tc_opts_after:OK
#253 tc_opts_append:OK
#254 tc_opts_basic:OK
#255 tc_opts_before:OK
#256 tc_opts_chain_classic:OK
#257 tc_opts_chain_mixed:OK
#258 tc_opts_delete_empty:OK
#259 tc_opts_demixed:OK
#260 tc_opts_detach:OK
#261 tc_opts_detach_after:OK
#262 tc_opts_detach_before:OK
#263 tc_opts_dev_cleanup:OK
#264 tc_opts_invalid:OK
#265 tc_opts_max:OK
#266 tc_opts_mixed:OK
#267 tc_opts_prepend:OK
#268 tc_opts_query:OK <--- (new test)
#269 tc_opts_replace:OK
#270 tc_opts_revision:OK
Summary: 19/0 PASSED, 0 SKIPPED, 0 FAILED
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20231006220655.1653-4-daniel@iogearbox.net
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
These duplicate defines should automatically be picked up from kernel
headers.
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Remove duplicate defines which are already defined in kernel headers and
re-definition isn't required.
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Remove duplicate defines which are already included in kernel headers.
MAX_PID_NS_LEVEL macro is used inside kernel only. It isn't exposed to
userspace. So it is never defined in test application. Remove #ifndef in
this case.
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
These duplicate defines should automatically be picked up from kernel
headers. Use KHDR_INCLUDES to add kernel header files.
Signed-off-by: Muhammad Usama Anjum <usama.anjum@collabora.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Extract duplicate code from these four functions
unix_redir_to_connected()
udp_redir_to_connected()
inet_unix_redir_to_connected()
unix_inet_redir_to_connected()
to generate a new helper pairs_redir_to_connected(). Create the
different socketpairs in these four functions, then pass the
socketpairs info to the new common helper to do the connections.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Link: https://lore.kernel.org/r/54bb28dcf764e7d4227ab160883931d2173f4f3d.1696588133.git.geliang.tang@suse.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
We currently expect up to a three-digit number of tests and subtests, so:
#999/999: some_test/some_subtest: ...
Is the largest test/subtest we can see. If we happen to cross into
1000s, current logic will just truncate everything after 7th character.
This patch fixes this truncate and allows to go way higher (up to 31
characters in total). We still nicely align test numbers:
#60/66 core_reloc_btfgen/type_based___incompat:OK
#60/67 core_reloc_btfgen/type_based___fn_wrong_args:OK
#60/68 core_reloc_btfgen/type_id:OK
#60/69 core_reloc_btfgen/type_id___missing_targets:OK
#60/70 core_reloc_btfgen/enumval:OK
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20231006175744.3136675-3-andrii@kernel.org
|
|
Add support for building selftests with -O2 level of optimization, which
allows more compiler warnings detection (like lots of potentially
uninitialized usage), but also is useful to have a faster-running test
for some CPU-intensive tests.
One can build optimized versions of libbpf and selftests by running:
$ make RELEASE=1
There is a measurable speed up of about 10 seconds for me locally,
though it's mostly capped by non-parallelized serial tests. User CPU
time goes down by total 40 seconds, from 1m10s to 0m28s.
Unoptimized build (-O0)
=======================
Summary: 430/3544 PASSED, 25 SKIPPED, 4 FAILED
real 1m59.937s
user 1m10.877s
sys 3m14.880s
Optimized build (-O2)
=====================
Summary: 425/3543 PASSED, 25 SKIPPED, 9 FAILED
real 1m50.540s
user 0m28.406s
sys 3m13.198s
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20231006175744.3136675-2-andrii@kernel.org
|
|
Fix a bunch of potentially unitialized variable usage warnings that are
reported by GCC in -O2 mode. Also silence overzealous stringop-truncation
class of warnings.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/bpf/20231006175744.3136675-1-andrii@kernel.org
|
|
Chuck points out that we should use the uapi-header property
when generating the guard. Otherwise we may generate the same
guard as another file in the tree.
Tested-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
CONFIG_VSOCKETS is required by BPF selftests, otherwise we get errors
like this:
./test_progs:socket_loopback_reuseport:386: socket:
Address family not supported by protocol
socket_loopback_reuseport:FAIL:386
./test_progs:vsock_unix_redir_connectible:1496:
vsock_socketpair_connectible() failed
vsock_unix_redir_connectible:FAIL:1496
So this patch enables it in tools/testing/selftests/bpf/config.
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Link: https://lore.kernel.org/r/472e73d285db2ea59aca9bbb95eb5d4048327588.1696490003.git.geliang.tang@suse.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
|
|
Zero-initialize the entire test_result structure used by memslot_perf_test
instead of zeroing only the fields used to guard the pr_info() calls.
gcc 13.2.0 is a bit overzealous and incorrectly thinks that rbestslottime's
slot_runtime may be used uninitialized.
In file included from memslot_perf_test.c:25:
memslot_perf_test.c: In function ‘main’:
include/test_util.h:31:22: error: ‘rbestslottime.slot_runtime.tv_nsec’ may be used uninitialized [-Werror=maybe-uninitialized]
31 | #define pr_info(...) printf(__VA_ARGS__)
| ^~~~~~~~~~~~~~~~~~~
memslot_perf_test.c:1127:17: note: in expansion of macro ‘pr_info’
1127 | pr_info("Best slot setup time for the whole test area was %ld.%.9lds\n",
| ^~~~~~~
memslot_perf_test.c:1092:28: note: ‘rbestslottime.slot_runtime.tv_nsec’ was declared here
1092 | struct test_result rbestslottime;
| ^~~~~~~~~~~~~
include/test_util.h:31:22: error: ‘rbestslottime.slot_runtime.tv_sec’ may be used uninitialized [-Werror=maybe-uninitialized]
31 | #define pr_info(...) printf(__VA_ARGS__)
| ^~~~~~~~~~~~~~~~~~~
memslot_perf_test.c:1127:17: note: in expansion of macro ‘pr_info’
1127 | pr_info("Best slot setup time for the whole test area was %ld.%.9lds\n",
| ^~~~~~~
memslot_perf_test.c:1092:28: note: ‘rbestslottime.slot_runtime.tv_sec’ was declared here
1092 | struct test_result rbestslottime;
| ^~~~~~~~~~~~~
That can't actually happen, at least not without the "result" structure in
test_loop() also being used uninitialized, which gcc doesn't complain
about, as writes to rbestslottime are all-or-nothing, i.e. slottimens can't
be non-zero without slot_runtime being written.
if (!data->mem_size &&
(!rbestslottime->slottimens ||
result.slottimens < rbestslottime->slottimens))
*rbestslottime = result;
Zero-initialize the structures to make gcc happy even though this is
likely a compiler bug. The cost to do so is negligible, both in terms of
code and runtime overhead. The only downside is that the compiler won't
warn about legitimate usage of "uninitialized" data, e.g. the test could
end up consuming zeros instead of useful data. However, given that the
test is quite mature and unlikely to see substantial changes, the odds of
introducing such bugs are relatively low, whereas being able to compile
KVM selftests with -Werror detects issues on a regular basis.
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Link: https://lore.kernel.org/r/20231005002954.2887098-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
If one of the symbols processed by read_symbols() happens to have a
.cold variant with a name longer than objtool's MAX_NAME_LEN limit, the
build fails.
Avoid this problem by just using strndup() to copy the parent function's
name, rather than strncpy()ing it onto the stack.
Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Link: https://lore.kernel.org/r/41e94cfea1d9131b758dd637fecdeacd459d4584.1696355111.git.aplattner@nvidia.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
If objtool runs into a problem that causes it to exit early, the overall
tool still returns a status code of 0, which causes the build to
continue as if nothing went wrong.
Note this only affects early errors, as later errors are still ignored
by check().
Fixes: b51277eb9775 ("objtool: Ditch subcommands")
Signed-off-by: Aaron Plattner <aplattner@nvidia.com>
Link: https://lore.kernel.org/r/cb6a28832d24b2ebfafd26da9abb95f874c83045.1696355111.git.aplattner@nvidia.com
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
|
|
Currently the nsleep-lat test does not produce KTAP output but rather a
custom format. This means that we only get a pass/fail for the suite, not
for each individual test that the suite does. Convert to using the standard
kselftest output functions which result in KTAP output being generated.
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Currently the posix_timers test does not produce KTAP output but rather a
custom format. This means that we only get a pass/fail for the suite, not
for each individual test that the suite does. Convert to using the standard
kselftest output functions which result in KTAP output being generated.
As part of this fix the printing of diagnostics in the unlikely event that
the pthread APIs fail, these were using perror() but the API functions
directly return an error code instead of setting errno.
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|