summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2024-11-12selftests/bpf: Add tracing prog private stack testsYonghong Song
Some private stack tests are added including: - main prog only with stack size greater than BPF_PSTACK_MIN_SIZE. - main prog only with stack size smaller than BPF_PSTACK_MIN_SIZE. - prog with one subprog having MAX_BPF_STACK stack size and another subprog having non-zero small stack size. - prog with callback function. - prog with exception in main prog or subprog. - prog with async callback without nesting - prog with async callback with possible nesting Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241112163927.2224750-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12torture: Add --no-affinity parameter to kvm.shPaul E. McKenney
In performance tests, it can be counter-productive to spread torture-test guest OSes across sockets. Plus the experimenter might have ideas about what CPUs individual guest OSes are to run on. This commit therefore adds a --no-affinity parameter to kvm.sh to prevent it from running taskset on its guest OSes. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12selftests/bpf: update send_signal to lower perf evemts frequencyEduard Zingerman
Similar to commit [1] sample perf events less often in test_send_signal_nmi(). This should reduce perf events throttling. [1] 7015843afcaf ("selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT") Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241112110906.3045278-5-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12selftests/bpf: allow send_signal test to timeoutEduard Zingerman
The following invocation: $ t1=send_signal/send_signal_perf_thread_remote \ t2=send_signal/send_signal_nmi_thread_remote \ ./test_progs -t $t1,$t2 Leads to send_signal_nmi_thread_remote to be stuck on a line 180: /* wait for result */ err = read(pipe_c2p[0], buf, 1); In this test case: - perf event PERF_COUNT_HW_CPU_CYCLES is created for parent process; - BPF program is attached to perf event, and sends a signal to child process when event occurs; - parent program burns some CPU in busy loop and calls read() to get notification from child that it received a signal. The perf event is declared with .sample_period = 1. This forces perf to throttle events, and under some unclear conditions the event does not always occur while parent is in busy loop. After parent enters read() system call CPU cycles event won't be generated for parent anymore. Thus, if perf event had not occurred already the test is stuck. This commit updates the parent to wait for notification with a timeout, doing several iterations of busy loop + read_with_timeout(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241112110906.3045278-4-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12selftests/bpf: add read_with_timeout() utility functionEduard Zingerman
int read_with_timeout(int fd, char *buf, size_t count, long usec) As a regular read(2), but allows to specify a timeout in micro-seconds. Returns -EAGAIN on timeout. Implemented using select(). Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241112110906.3045278-3-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12selftests/bpf: watchdog timer for test_progsEduard Zingerman
This commit provides a watchdog timer that sets a limit of how long a single sub-test could run: - if sub-test runs for 10 seconds, the name of the test is printed (currently the name of the test is printed only after it finishes); - if sub-test runs for 120 seconds, the running thread is terminated with SIGSEGV (to trigger crash_handler() and get a stack trace). Specifically: - the timer is armed on each call to run_one_test(); - re-armed at each call to test__start_subtest(); - is stopped when exiting run_one_test(). Default timeout could be overridden using '-w' or '--watchdog-timeout' options. Value 0 can be used to turn the timer off. Here is an example execution: $ ./ssh-exec.sh ./test_progs -w 5 -t \ send_signal/send_signal_perf_thread_remote,send_signal/send_signal_nmi_thread_remote WATCHDOG: test case send_signal/send_signal_nmi_thread_remote executes for 5 seconds, terminating with SIGSEGV Caught signal #11! Stack trace: ./test_progs(crash_handler+0x1f)[0x9049ef] /lib64/libc.so.6(+0x40d00)[0x7f1f1184fd00] /lib64/libc.so.6(read+0x4a)[0x7f1f1191cc4a] ./test_progs[0x720dd3] ./test_progs[0x71ef7a] ./test_progs(test_send_signal+0x1db)[0x71edeb] ./test_progs[0x9066c5] ./test_progs(main+0x5ed)[0x9054ad] /lib64/libc.so.6(+0x2a088)[0x7f1f11839088] /lib64/libc.so.6(__libc_start_main+0x8b)[0x7f1f1183914b] ./test_progs(_start+0x25)[0x527385] #292 send_signal:FAIL test_send_signal_common:PASS:reading pipe 0 nsec test_send_signal_common:PASS:reading pipe error: size 0 0 nsec test_send_signal_common:PASS:incorrect result 0 nsec test_send_signal_common:PASS:pipe_write 0 nsec test_send_signal_common:PASS:setpriority 0 nsec Timer is implemented using timer_{create,start} librt API. Internally librt uses pthreads for SIGEV_THREAD timers, so this change adds a background timer thread to the test process. Because of this a few checks in tests 'bpf_iter' and 'iters' need an update to account for an extra thread. For parallelized scenario the watchdog is also created for each worker fork. If one of the workers gets stuck, it would be terminated by a watchdog. In theory, this might lead to a scenario when all worker threads are exhausted, however this should not be a problem for server_main(), as it would exit with some of the tests not run. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20241112110906.3045278-2-eddyz87@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-11-12Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull kvm fixes from Paolo Bonzini: "x86 and selftests fixes. x86: - When emulating a guest TLB flush for a nested guest, flush vpid01, not vpid02, if L2 is active but VPID is disabled in vmcs12, i.e. if L2 and L1 are sharing VPID '0' (from L1's perspective). - Fix a bug in the SNP initialization flow where KVM would return '0' to userspace instead of -errno on failure. - Move the Intel PT virtualization (i.e. outputting host trace to host buffer and guest trace to guest buffer) behind CONFIG_BROKEN. - Fix memory leak on failure of KVM_SEV_SNP_LAUNCH_START - Fix a bug where KVM fails to inject an interrupt from the IRR after KVM_SET_LAPIC. Selftests: - Increase the timeout for the memslot performance selftest to avoid false failures on arm64 and nested x86 platforms. - Fix a goof in the guest_memfd selftest where a for-loop initialized a bit mask to zero instead of BIT(0). - Disable strict aliasing when building KVM selftests to prevent the compiler from treating things like "u64 *" to "uint64_t *" cases as undefined behavior, which can lead to nasty, hard to debug failures. - Force -march=x86-64-v2 for KVM x86 selftests if and only if the uarch is supported by the compiler. - Fix broken compilation of kvm selftests after a header sync in tools/" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: Bury Intel PT virtualization (guest/host mode) behind CONFIG_BROKEN KVM: x86: Unconditionally set irr_pending when updating APICv state kvm: svm: Fix gctx page leak on invalid inputs KVM: selftests: use X86_MEMTYPE_WB instead of VMX_BASIC_MEM_TYPE_WB KVM: SVM: Propagate error from snp_guest_req_init() to userspace KVM: nVMX: Treat vpid01 as current if L2 is active, but with VPID disabled KVM: selftests: Don't force -march=x86-64-v2 if it's unsupported KVM: selftests: Disable strict aliasing KVM: selftests: fix unintentional noop test in guest_memfd_test.c KVM: selftests: memslot_perf_test: increase guest sync timeout
2024-11-12Merge tag 'kvm-s390-next-6.13-1' of ↵Paolo Bonzini
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - second part of the ucontrol selftest - cpumodel sanity check selftest - gen17 cpumodel changes
2024-11-12selftests: hugetlb_dio: fixup check for initial conditions to skip in the startDonet Tom
This test verifies that a hugepage, used as a user buffer for DIO operations, is correctly freed upon unmapping. To test this, we read the count of free hugepages before and after the mmap, DIO, and munmap operations, then check if the free hugepage count is the same. Reading free hugepages before the test was removed by commit 0268d4579901 ('selftests: hugetlb_dio: check for initial conditions to skip at the start'), causing the test to always fail. This patch adds back reading the free hugepages before starting the test. With this patch, the tests are now passing. Test results without this patch: ./tools/testing/selftests/mm/hugetlb_dio TAP version 13 1..4 # No. Free pages before allocation : 0 # No. Free pages after munmap : 100 not ok 1 : Huge pages not freed! # No. Free pages before allocation : 0 # No. Free pages after munmap : 100 not ok 2 : Huge pages not freed! # No. Free pages before allocation : 0 # No. Free pages after munmap : 100 not ok 3 : Huge pages not freed! # No. Free pages before allocation : 0 # No. Free pages after munmap : 100 not ok 4 : Huge pages not freed! # Totals: pass:0 fail:4 xfail:0 xpass:0 skip:0 error:0 Test results with this patch: /tools/testing/selftests/mm/hugetlb_dio TAP version 13 1..4 # No. Free pages before allocation : 100 # No. Free pages after munmap : 100 ok 1 : Huge pages freed successfully ! # No. Free pages before allocation : 100 # No. Free pages after munmap : 100 ok 2 : Huge pages freed successfully ! # No. Free pages before allocation : 100 # No. Free pages after munmap : 100 ok 3 : Huge pages freed successfully ! # No. Free pages before allocation : 100 # No. Free pages after munmap : 100 ok 4 : Huge pages freed successfully ! # Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0 Link: https://lkml.kernel.org/r/20241110064903.23626-1-donettom@linux.ibm.com Fixes: 0268d4579901 ("selftests: hugetlb_dio: check for initial conditions to skip in the start") Signed-off-by: Donet Tom <donettom@linux.ibm.com> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-12Merge branch 'iommufd/arm-smmuv3-nested' of iommu/linux into iommufd for-nextJason Gunthorpe
Common SMMUv3 patches for the following patches adding nesting, shared branch with the iommu tree. * 'iommufd/arm-smmuv3-nested' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/iommu/linux: iommu/arm-smmu-v3: Expose the arm_smmu_attach interface iommu/arm-smmu-v3: Implement IOMMU_HWPT_ALLOC_NEST_PARENT iommu/arm-smmu-v3: Support IOMMU_GET_HW_INFO via struct arm_smmu_hw_info iommu/arm-smmu-v3: Report IOMMU_CAP_ENFORCE_CACHE_COHERENCY for CANWBS ACPI/IORT: Support CANWBS memory access flag ACPICA: IORT: Update for revision E.f vfio: Remove VFIO_TYPE1_NESTING_IOMMU ... Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12iommufd/selftest: Add vIOMMU coverage for IOMMU_HWPT_INVALIDATE ioctlNicolin Chen
Add a viommu_cache test function to cover vIOMMU invalidations using the updated IOMMU_HWPT_INVALIDATE ioctl, which now allows passing in a vIOMMU via its hwpt_id field. Link: https://patch.msgid.link/r/f317f902041f3d05deaee4ca3fdd8ef4b8297361.1730836308.git.nicolinc@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12iommufd/selftest: Add IOMMU_TEST_OP_DEV_CHECK_CACHE test commandNicolin Chen
Similar to IOMMU_TEST_OP_MD_CHECK_IOTLB verifying a mock_domain's iotlb, IOMMU_TEST_OP_DEV_CHECK_CACHE will be used to verify a mock_dev's cache. Link: https://patch.msgid.link/r/cd4082079d75427bd67ed90c3c825e15b5720a5f.1730836308.git.nicolinc@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12iommufd: Allow hwpt_id to carry viommu_id for IOMMU_HWPT_INVALIDATENicolin Chen
With a vIOMMU object, use space can flush any IOMMU related cache that can be directed via a vIOMMU object. It is similar to the IOMMU_HWPT_INVALIDATE uAPI, but can cover a wider range than IOTLB, e.g. device/desciprtor cache. Allow hwpt_id of the iommu_hwpt_invalidate structure to carry a viommu_id, and reuse the IOMMU_HWPT_INVALIDATE uAPI for vIOMMU invalidations. Drivers can define different structures for vIOMMU invalidations v.s. HWPT ones. Since both the HWPT-based and vIOMMU-based invalidation pathways check own cache invalidation op, remove the WARN_ON_ONCE in the allocator. Update the uAPI, kdoc, and selftest case accordingly. Link: https://patch.msgid.link/r/b411e2245e303b8a964f39f49453a5dff280968f.1730836308.git.nicolinc@nvidia.com Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12iommufd/selftest: Add IOMMU_VDEVICE_ALLOC test coverageNicolin Chen
Add a vdevice_alloc op to the viommu mock_viommu_ops for the coverage of IOMMU_VIOMMU_TYPE_SELFTEST allocations. Then, add a vdevice_alloc TEST_F to cover the IOMMU_VDEVICE_ALLOC ioctl. Link: https://patch.msgid.link/r/4b9607e5b86726c8baa7b89bd48123fb44104a23.1730836308.git.nicolinc@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12iommufd/selftest: Add IOMMU_VIOMMU_ALLOC test coverageNicolin Chen
Add a new iommufd_viommu FIXTURE and setup it up with a vIOMMU object. Any new vIOMMU feature will be added as a TEST_F under that. Link: https://patch.msgid.link/r/abe267c9d004b29cb1712ceba2f378209d4b7e01.1730836219.git.nicolinc@nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2024-11-12kselftest/arm64: Try harder to generate different keys during PAC testsMark Brown
We very intermittently see failures in the single_thread_different_keys PAC test. As noted in the comment in the test the PAC field can be quite narrow so there is a chance of collisions even with different keys with a chance of 5% for 7 bit keys, and the potential for narrower keys. The test tries to avoid this by running repeatedly, but only tries 10 times which even with a 5% chance of collisions isn't enough. Increase the number of times we attempt to look for collisions by a factor of 100, this also affects other tests which are following a similar pattern with running the test repeatedly and either don't care like with pac_instruction_not_nop or potentially have the same issue like exec_sign_all. The PAC tests are very fast, running in a second or two even in emulation, so the 100x increased cost is mildly irritating but not a huge issue. The bulk of the overhead is in the exec_sign_all test which does a fork() and exec() per iteration. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241111-arm64-pac-test-collisions-v1-2-171875f37e44@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-12kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all()Mark Brown
The PAC exec_sign_all() test spawns some child processes, creating pipes to be stdin and stdout for the child. It cleans up most of the file descriptors that are created as part of this but neglects to clean up the parent end of the child stdin and stdout. Add the missing close() calls. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241111-arm64-pac-test-collisions-v1-1-171875f37e44@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-12kselftest/arm64: Corrupt P0 in the irritator when testing SSVEMark Brown
When building for streaming SVE the irritator for SVE skips updates of both P0 and FFR. While FFR is skipped since it might not be present there is no reason to skip corrupting P0 so switch to an instruction valid in streaming mode and move the ifdef. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241107-arm64-fp-stress-irritator-v2-3-c4b9622e36ee@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-12rcutorture: Add light-weight SRCU scenarioPaul E. McKenney
This commit adds an rcutorture scenario that tests light-weight SRCU readers. While in the area, it adjusts the size of the TREE10 scenario. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <bpf@vger.kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.cCatalin Marinas
Compiling the child_cleanup() function results in: gcs-stress.c: In function ‘child_cleanup’: gcs-stress.c:266:75: warning: format ‘%d’ expects a matching ‘int’ argument [-Wformat=] 266 | ksft_print_msg("%s: Exited due to signal %d\n", | ~^ | | | int Add the missing child->exit_signal argument. Fixes: 05e6cfff58c4 ("kselftest/arm64: Add a GCS stress test") Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-12kselftest/arm64: Add FPMR coverage to fp-ptraceMark Brown
Add coverage for FPMR to fp-ptrace. FPMR can be available independently of SVE and SME, if SME is supported then FPMR is cleared by entering and exiting streaming mode. As with other registers we generate random values to load into the register, we restrict these to bitfields which are always defined. We also leave bitfields where the valid values are affected by the set of supported FP8 formats zero to reduce complexity, it is unlikely that specific bitfields will be affected by ptrace issues. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241112-arm64-fp-ptrace-fpmr-v2-3-250b57c61254@kernel.org [catalin.marinas@arm.com: use REG_FPMR instead of FPMR] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-12kselftest/arm64: Expand the set of ZA writes fp-ptrace doesMark Brown
Currently our test for implementable ZA writes is written in a bit of a convoluted fashion which excludes all changes where we clear SVCR.SM even though we can actually support that since changing the vector length resets SVCR. Make the logic more direct, enabling us to actually run these cases. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241112-arm64-fp-ptrace-fpmr-v2-2-250b57c61254@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-12kselftets/arm64: Use flag bits for features in fp-ptrace assembler codeMark Brown
The assembler portions of fp-ptrace are passed feature flags by the C code indicating which architectural features are supported. Currently these use an entire register for each flag which is wasteful and gets cumbersome as new flags are added. Switch to using flag bits in a single register to make things easier to maintain. No functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241112-arm64-fp-ptrace-fpmr-v2-1-250b57c61254@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-12kselftest/arm64: Enable build of PAC tests with LLVM=1Mark Brown
Currently we don't build the PAC selftests when building with LLVM=1 since we attempt to test for PAC support in the toolchain before we've set up the build system to point at LLVM in lib.mk, which has to be one of the last things in the Makefile. Since all versions of LLVM supported for use with the kernel have PAC support we can just sidestep the issue by just assuming PAC is there when doing a LLVM=1 build. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241111-arm64-selftest-pac-clang-v1-1-08599ceee418@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-12kselftest/arm64: Check that SVCR is 0 in signal handlersMark Brown
We don't currently validate that we exit streaming mode and clear ZA when we enter a signal handler. Add simple checks for this in the SSVE and ZA tests. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241106-arm64-fpmr-signal-test-v1-1-31fa34ce58fe@kernel.org [catalin.marinas@arm.com: Use %lx in fprintf() as uint64_t seems to be unsigned long in glibc] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-11libbpf: Stringify errno in log messages in the remaining codeMykyta Yatsenko
Convert numeric error codes into the string representations in log messages in the rest of libbpf source files. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111212919.368971-5-mykyta.yatsenko5@gmail.com
2024-11-11libbpf: Stringify errno in log messages in btf*.cMykyta Yatsenko
Convert numeric error codes into the string representations in log messages in btf.c and btf_dump.c. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111212919.368971-4-mykyta.yatsenko5@gmail.com
2024-11-11libbpf: Stringify errno in log messages in libbpf.cMykyta Yatsenko
Convert numeric error codes into the string representations in log messages in libbpf.c. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111212919.368971-3-mykyta.yatsenko5@gmail.com
2024-11-11libbpf: Introduce errstr() for stringifying errnoMykyta Yatsenko
Add function errstr(int err) that allows converting numeric error codes into string representations. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111212919.368971-2-mykyta.yatsenko5@gmail.com
2024-11-11tools/bpf: Fix the wrong format specifier in bpf_jit_disasmLuo Yifan
There is a static checker warning that the %d in format string is mismatched with the corresponding argument type, which could result in incorrect printed data. This patch fixes it. Signed-off-by: Luo Yifan <luoyifan@cmss.chinamobile.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241111021004.272293-1-luoyifan@cmss.chinamobile.com
2024-11-11net: ipv4: Cache pmtu for all packet paths if multipath enabledVladimir Vdovin
Check number of paths by fib_info_num_path(), and update_or_create_fnhe() for every path. Problem is that pmtu is cached only for the oif that has received icmp message "need to frag", other oifs will still try to use "default" iface mtu. An example topology showing the problem: | host1 +---------+ | dummy0 | 10.179.20.18/32 mtu9000 +---------+ +-----------+----------------+ +---------+ +---------+ | ens17f0 | 10.179.2.141/31 | ens17f1 | 10.179.2.13/31 +---------+ +---------+ | (all here have mtu 9000) | +------+ +------+ | ro1 | 10.179.2.140/31 | ro2 | 10.179.2.12/31 +------+ +------+ | | ---------+------------+-------------------+------ | +-----+ | ro3 | 10.10.10.10 mtu1500 +-----+ | ======================================== some networks ======================================== | +-----+ | eth0| 10.10.30.30 mtu9000 +-----+ | host2 host1 have enabled multipath and sysctl net.ipv4.fib_multipath_hash_policy = 1: default proto static src 10.179.20.18 nexthop via 10.179.2.12 dev ens17f1 weight 1 nexthop via 10.179.2.140 dev ens17f0 weight 1 When host1 tries to do pmtud from 10.179.20.18/32 to host2, host1 receives at ens17f1 iface an icmp packet from ro3 that ro3 mtu=1500. And host1 caches it in nexthop exceptions cache. Problem is that it is cached only for the iface that has received icmp, and there is no way that ro3 will send icmp msg to host1 via another path. Host1 now have this routes to host2: ip r g 10.10.30.30 sport 30000 dport 443 10.10.30.30 via 10.179.2.12 dev ens17f1 src 10.179.20.18 uid 0 cache expires 521sec mtu 1500 ip r g 10.10.30.30 sport 30033 dport 443 10.10.30.30 via 10.179.2.140 dev ens17f0 src 10.179.20.18 uid 0 cache So when host1 tries again to reach host2 with mtu>1500, if packet flow is lucky enough to be hashed with oif=ens17f1 its ok, if oif=ens17f0 it blackholes and still gets icmp msgs from ro3 to ens17f1, until lucky day when ro3 will send it through another flow to ens17f0. Signed-off-by: Vladimir Vdovin <deliran@verdict.gg> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20241108093427.317942-1-deliran@verdict.gg Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11net: netconsole: selftests: Check if netdevsim is availableBreno Leitao
The netconsole selftest relies on the availability of the netdevsim module. To ensure the test can run correctly, we need to check if the netdevsim module is either loaded or built-in before proceeding. Update the netconsole selftest to check for the existence of the /sys/bus/netdevsim/new_device file before running the test. If the file is not found, the test is skipped with an explanation that the CONFIG_NETDEVSIM kernel config option may not be enabled. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20241108-netcon_selftest_deps-v1-1-1789cbf3adcd@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: net: Add busy_poll_testJoe Damato
Add an epoll busy poll test using netdevsim. This test is comprised of: - busy_poller (via busy_poller.c) - busy_poll_test.sh which loads netdevsim, sets up network namespaces, and runs busy_poller to receive data and socat to send data. The selftest tests two different scenarios: - busy poll (the pre-existing version in the kernel) - busy poll with suspend enabled (what this series adds) The data transmit is a 1MiB temporary file generated from /dev/urandom and the test is considered passing if the md5sum of the input file to socat matches the md5sum of the output file from busy_poller. netdevsim was chosen instead of veth due to netdevsim's support for netdev-genl. For now, this test uses the functionality that netdevsim provides. In the future, perhaps netdevsim can be extended to emulate device IRQs to more thoroughly test all pre-existing kernel options (like defer_hard_irqs) and suspend. Signed-off-by: Joe Damato <jdamato@fastly.com> Co-developed-by: Martin Karsten <mkarsten@uwaterloo.ca> Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20241109050245.191288-6-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11net: Add napi_struct parameter irq_suspend_timeoutMartin Karsten
Add a per-NAPI IRQ suspension parameter, which can be get/set with netdev-genl. This patch doesn't change any behavior but prepares the code for other changes in the following commits which use irq_suspend_timeout as a timeout for IRQ suspension. Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca> Co-developed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Joe Damato <jdamato@fastly.com> Tested-by: Joe Damato <jdamato@fastly.com> Tested-by: Martin Karsten <mkarsten@uwaterloo.ca> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Link: https://patch.msgid.link/20241109050245.191288-2-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11Improve consistency of '#error' directive messagesNataniel Farzan
Remove the use of contractions and use proper punctuation in #error directive messages that discourage the direct inclusion of header files. Link: https://lkml.kernel.org/r/20241105032231.28833-1-natanielfarzan@gmail.com Signed-off-by: Nataniel Farzan <natanielfarzan@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11selftests: ncdevmem: Add automated testStanislav Fomichev
Only RX side for now and small message to test the setup. In the future, we can extend it to TX side and to testing both sides with a couple of megs of data. make \ -C tools/testing/selftests \ TARGETS="drivers/hw/net" \ install INSTALL_PATH=~/tmp/ksft scp ~/tmp/ksft ${HOST}: scp ~/tmp/ksft ${PEER}: cfg+="NETIF=${DEV}\n" cfg+="LOCAL_V6=${HOST_IP}\n" cfg+="REMOTE_V6=${PEER_IP}\n" cfg+="REMOTE_TYPE=ssh\n" cfg+="REMOTE_ARGS=root@${PEER}\n" echo -e "$cfg" | ssh root@${HOST} "cat > ksft/drivers/net/net.config" ssh root@${HOST} "cd ksft && ./run_kselftest.sh -t drivers/net:devmem.py" Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20241107181211.3934153-13-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: ncdevmem: Move ncdevmem under drivers/net/hwStanislav Fomichev
This is where all the tests that depend on the HW functionality live in and this is where the automated test is gonna be added in the next patch. Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241107181211.3934153-12-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: ncdevmem: Run selftest when none of the -s or -c has been providedStanislav Fomichev
This will be used as a 'probe' mode in the selftest to check whether the device supports the devmem or not. Use hard-coded queue layout (two last queues) and prevent user from passing custom -q and/or -t. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241107181211.3934153-11-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: ncdevmem: Remove hard-coded queue numbersStanislav Fomichev
Use single last queue of the device and probe it dynamically. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241107181211.3934153-10-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: ncdevmem: Use YNL to enable TCP header splitStanislav Fomichev
In the next patch the hard-coded queue numbers are gonna be removed. So introduce some initial support for ethtool YNL and use it to enable header split. Also, tcp-data-split requires latest ethtool which is unlikely to be present in the distros right now. (ideally, we should not shell out to ethtool at all). Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241107181211.3934153-9-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: ncdevmem: Properly reset flow steeringStanislav Fomichev
ntuple off/on might be not enough to do it on all NICs. Add a bunch of shell crap to explicitly remove the rules. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241107181211.3934153-8-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: ncdevmem: Switch to AF_INET6Stanislav Fomichev
Use dualstack socket to support both v4 and v6. v4-mapped-v6 address can be used to do v4. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241107181211.3934153-7-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: ncdevmem: Remove default argumentsStanislav Fomichev
To make it clear what's required and what's not. Also, some of the values don't seem like a good defaults; for example eth1. Move the invocation comment to the top, add missing -s to the client and cleanup the client invocation a bit to make more readable. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241107181211.3934153-6-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: ncdevmem: Make client_ip optionalStanislav Fomichev
Support 3-tuple filtering by making client_ip optional. When -c is not passed, don't specify src-ip/src-port in the filter. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241107181211.3934153-5-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: ncdevmem: Unify error handlingStanislav Fomichev
There is a bunch of places where error() calls look out of place. Use the same error(1, errno, ...) pattern everywhere. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241107181211.3934153-4-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: ncdevmem: Separate out dmabuf providerStanislav Fomichev
So we can plug the other ones in the future if needed. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241107181211.3934153-3-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: ncdevmem: Redirect all non-payload output to stderrStanislav Fomichev
That should make it possible to do expected payload validation on the caller side. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20241107181211.3934153-2-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests: hsr: Add test for VLANMD Danish Anwar
Add test for VLAN ping for HSR. The test adds vlan interfaces to the hsr interface and then verifies if ping to them works. Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20241106091710.3308519-5-danishanwar@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11ipv6: Fix soft lockups in fib6_select_path under high next hop churnOmid Ehtemam-Haghighi
Soft lockups have been observed on a cluster of Linux-based edge routers located in a highly dynamic environment. Using the `bird` service, these routers continuously update BGP-advertised routes due to frequently changing nexthop destinations, while also managing significant IPv6 traffic. The lockups occur during the traversal of the multipath circular linked-list in the `fib6_select_path` function, particularly while iterating through the siblings in the list. The issue typically arises when the nodes of the linked list are unexpectedly deleted concurrently on a different core—indicated by their 'next' and 'previous' elements pointing back to the node itself and their reference count dropping to zero. This results in an infinite loop, leading to a soft lockup that triggers a system panic via the watchdog timer. Apply RCU primitives in the problematic code sections to resolve the issue. Where necessary, update the references to fib6_siblings to annotate or use the RCU APIs. Include a test script that reproduces the issue. The script periodically updates the routing table while generating a heavy load of outgoing IPv6 traffic through multiple iperf3 clients. It consistently induces infinite soft lockups within a couple of minutes. Kernel log: 0 [ffffbd13003e8d30] machine_kexec at ffffffff8ceaf3eb 1 [ffffbd13003e8d90] __crash_kexec at ffffffff8d0120e3 2 [ffffbd13003e8e58] panic at ffffffff8cef65d4 3 [ffffbd13003e8ed8] watchdog_timer_fn at ffffffff8d05cb03 4 [ffffbd13003e8f08] __hrtimer_run_queues at ffffffff8cfec62f 5 [ffffbd13003e8f70] hrtimer_interrupt at ffffffff8cfed756 6 [ffffbd13003e8fd0] __sysvec_apic_timer_interrupt at ffffffff8cea01af 7 [ffffbd13003e8ff0] sysvec_apic_timer_interrupt at ffffffff8df1b83d -- <IRQ stack> -- 8 [ffffbd13003d3708] asm_sysvec_apic_timer_interrupt at ffffffff8e000ecb [exception RIP: fib6_select_path+299] RIP: ffffffff8ddafe7b RSP: ffffbd13003d37b8 RFLAGS: 00000287 RAX: ffff975850b43600 RBX: ffff975850b40200 RCX: 0000000000000000 RDX: 000000003fffffff RSI: 0000000051d383e4 RDI: ffff975850b43618 RBP: ffffbd13003d3800 R8: 0000000000000000 R9: ffff975850b40200 R10: 0000000000000000 R11: 0000000000000000 R12: ffffbd13003d3830 R13: ffff975850b436a8 R14: ffff975850b43600 R15: 0000000000000007 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 9 [ffffbd13003d3808] ip6_pol_route at ffffffff8ddb030c 10 [ffffbd13003d3888] ip6_pol_route_input at ffffffff8ddb068c 11 [ffffbd13003d3898] fib6_rule_lookup at ffffffff8ddf02b5 12 [ffffbd13003d3928] ip6_route_input at ffffffff8ddb0f47 13 [ffffbd13003d3a18] ip6_rcv_finish_core.constprop.0 at ffffffff8dd950d0 14 [ffffbd13003d3a30] ip6_list_rcv_finish.constprop.0 at ffffffff8dd96274 15 [ffffbd13003d3a98] ip6_sublist_rcv at ffffffff8dd96474 16 [ffffbd13003d3af8] ipv6_list_rcv at ffffffff8dd96615 17 [ffffbd13003d3b60] __netif_receive_skb_list_core at ffffffff8dc16fec 18 [ffffbd13003d3be0] netif_receive_skb_list_internal at ffffffff8dc176b3 19 [ffffbd13003d3c50] napi_gro_receive at ffffffff8dc565b9 20 [ffffbd13003d3c80] ice_receive_skb at ffffffffc087e4f5 [ice] 21 [ffffbd13003d3c90] ice_clean_rx_irq at ffffffffc0881b80 [ice] 22 [ffffbd13003d3d20] ice_napi_poll at ffffffffc088232f [ice] 23 [ffffbd13003d3d80] __napi_poll at ffffffff8dc18000 24 [ffffbd13003d3db8] net_rx_action at ffffffff8dc18581 25 [ffffbd13003d3e40] __do_softirq at ffffffff8df352e9 26 [ffffbd13003d3eb0] run_ksoftirqd at ffffffff8ceffe47 27 [ffffbd13003d3ec0] smpboot_thread_fn at ffffffff8cf36a30 28 [ffffbd13003d3ee8] kthread at ffffffff8cf2b39f 29 [ffffbd13003d3f28] ret_from_fork at ffffffff8ce5fa64 30 [ffffbd13003d3f50] ret_from_fork_asm at ffffffff8ce03cbb Fixes: 66f5d6ce53e6 ("ipv6: replace rwlock with rcu and spinlock in fib6_table") Reported-by: Adrian Oliver <kernel@aoliver.ca> Signed-off-by: Omid Ehtemam-Haghighi <omid.ehtemamhaghighi@menlosecurity.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Ido Schimmel <idosch@idosch.org> Cc: Kuniyuki Iwashima <kuniyu@amazon.com> Cc: Simon Horman <horms@kernel.org> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20241106010236.1239299-1-omid.ehtemamhaghighi@menlosecurity.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11selftests/mm: Fix unused function warning for aarch64_write_signal_pkey()Catalin Marinas
Since commit 49f59573e9e0 ("selftests/mm: Enable pkey_sighandler_tests on arm64"), pkey_sighandler_tests.c (which includes pkey-arm64.h via pkey-helpers.h) ends up compiled for arm64. Since it doesn't use aarch64_write_signal_pkey(), the compiler warns: In file included from pkey-helpers.h:106, from pkey_sighandler_tests.c:31: pkey-arm64.h:130:13: warning: ‘aarch64_write_signal_pkey’ defined but not used [-Wunused-function] 130 | static void aarch64_write_signal_pkey(ucontext_t *uctxt, u64 pkey) | ^~~~~~~~~~~~~~~~~~~~~~~~~ Make the aarch64_write_signal_pkey() a 'static inline void' function to avoid the compiler warning. Fixes: f5b5ea51f78f ("selftests: mm: make protection_keys test work on arm64") Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com> Link: https://lore.kernel.org/r/20241108110549.1185923-1-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>