Age | Commit message (Collapse) | Author |
|
Add the test counts to the JSON output from kunit.py. For example:
...
"git_branch": "kselftest",
"misc":
{
"tests": 2,
"passed": 1.
"failed": 1,
"crashed": 0,
"skipped": 0,
"errors": 0,
}
...
To output the JSON using the following command:
./tools/testing/kunit/kunit.py run example --json
This has been requested by KUnit users. The counts are in a "misc"
field because the JSON output needs to be compliant with the KCIDB
submission guide. There are no counts fields but there is a "misc" field
in the guide.
Link: https://lore.kernel.org/r/20250516201615.1237037-1-rmoar@google.com
Signed-off-by: Rae Moar <rmoar@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Add a test case to verify x86's bus lock exit functionality, which is now
supported on both Intel and AMD. Trigger bus lock exits by performing a
split-lock access, i.e. an atomic access that splits two cache lines.
Verify that the correct number of bus lock exits are generated, and that
the counter is incremented correctly and at the appropriate time based on
the underlying architecture.
Generate bus locks in both L1 and L2 (if nested virtualization is enabled),
as SVM's functionality in particular requires non-trivial logic to do the
right thing when running nested VMs.
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Co-developed-by: Manali Shukla <manali.shukla@amd.com>
Signed-off-by: Manali Shukla <manali.shukla@amd.com>
Link: https://lore.kernel.org/r/20250502050346.14274-6-manali.shukla@amd.com
Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Remove llvm dependencies from binaries that do not use llvm libraries.
Filter out libxml2 from llvm dependencies, as it seems that
it is not actually used. This patch reduced link dependencies
for BPF selftests.
The next line was adding llvm dependencies to every target in the
makefile, while the only targets that require those are test
runnners (test_progs, test_progs-no_alu32,...):
```
$(OUTPUT)/$(TRUNNER_BINARY): LDLIBS += $$(LLVM_LDLIBS)
```
Before this change:
ldd linux/tools/testing/selftests/bpf/veristat
linux-vdso.so.1 (0x00007ffd2c3fd000)
libelf.so.1 => /lib64/libelf.so.1 (0x00007fe1dcf89000)
libz.so.1 => /lib64/libz.so.1 (0x00007fe1dcf6f000)
libm.so.6 => /lib64/libm.so.6 (0x00007fe1dce94000)
libzstd.so.1 => /lib64/libzstd.so.1 (0x00007fe1dcddd000)
libxml2.so.2 => /lib64/libxml2.so.2 (0x00007fe1dcc54000)
libstdc++.so.6 => /lib64/libstdc++.so.6 (0x00007fe1dca00000)
libc.so.6 => /lib64/libc.so.6 (0x00007fe1dc600000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe1dcfb1000)
liblzma.so.5 => /lib64/liblzma.so.5 (0x00007fe1dc9d4000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007fe1dcc38000)
After:
ldd linux/tools/testing/selftests/bpf/veristat
linux-vdso.so.1 (0x00007ffc83370000)
libelf.so.1 => /lib64/libelf.so.1 (0x00007f4b87515000)
libz.so.1 => /lib64/libz.so.1 (0x00007f4b874fb000)
libc.so.6 => /lib64/libc.so.6 (0x00007f4b87200000)
libzstd.so.1 => /lib64/libzstd.so.1 (0x00007f4b87444000)
/lib64/ld-linux-x86-64.so.2 (0x00007f4b8753d000)
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250516195522.311769-1-mykyta.yatsenko5@gmail.com
|
|
Antonio Quartulli says:
====================
ovpn: pull request for net-next: ovpn 2025-05-15
this is a new version of the previous pull request.
These time I have removed the fixes that we are still discussing,
so that we don't hold the entire series back.
There is a new fix though: it's about properly checking the return value
of skb_to_sgvec_nomark(). I spotted the issue while testing pings larger
than the iface's MTU on a TCP VPN connection.
I have added various Closes and Link tags where applicable, so
that we have references to GitHub tickets and other public discussions.
Since I have resent the PR, I have also added Andrew's Reviewed-by to
the first patch.
Please pull or let me know if something should be changed!
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Patchset highlights:
- update MAINTAINERS entry for ovpn
- extend selftest with more cases
- avoid crash in selftest in case of getaddrinfo() failure
- fix ndo_start_xmit return value on error
- set ignore_df flag for IPv6 packets
- drop useless reg_state check in keepalive worker
- retain skb's dst when entering xmit function
- fix check on skb_to_sgvec_nomark() return value
|
|
In the sigpipe test, we expect send() to fail, but we do not check if
send() fails with the errno we expect (EPIPE).
Add this check and repeat the send() in case of EINTR as we do in other
tests.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20250514141927.159456-4-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When the other peer calls shutdown(SHUT_RD), there is a chance that
the send() call could occur before the message carrying the close
information arrives over the transport. In such cases, the send()
might still succeed. To avoid this race, let's retry the send() call
a few times, ensuring the test is more reliable.
Sleep a little before trying again to avoid flooding the other peer
and filling its receive buffer, causing false-negative.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20250514141927.159456-3-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The timeout API uses signals, so we have documented not to use sleep(),
but we can use nanosleep(2) since POSIX.1 explicitly specifies that it
does not interact with signals.
Let's provide timeout_usleep() for that.
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20250514141927.159456-2-sgarzare@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a fairly complete example of rt-link usage. If run without any
arguments it simply lists the interfaces and some of their attrs.
If run with an arg it tries to create and delete a netkit device.
1 # ./tools/net/ynl/samples/rt-link 1
2 Trying to create a Netkit interface
3 Testing error message for policy being bad:
4 Kernel error: 'Provided default xmit policy not supported' (bad attribute: .linkinfo.data(netkit).policy)
5 1: lo: mtu 65536
6 2: wlp0s1: mtu 1500
7 3: enp0s13: mtu 1500
8 4: dummy0: mtu 1500 kind dummy altname one two
9 5: nk0: mtu 1500 kind netkit primary 0 policy forward
10 6: nk1: mtu 1500 kind netkit primary 1 policy blackhole
11 Trying to delete a Netkit interface (ifindex 6)
Sample creates the device first, it sets an invalid value for a netkit
attribute to trigger reverse parsing. Line 4 shows the error with the
attribute path correctly generated by YNL.
Then sample fixes the bad attribute and re-issues the request, with
NLM_F_ECHO set. This flag causes the notification to be looped back
to the initiating socket (our socket). Sample parses this notification
to save the ifindex of the created netkit.
Sample then proceeds to list the devices. Line 8 above shows a dummy
device with two alt names. Lines 9 and 10 show the netkit devices
the sample itself created.
The "primary" and "policy" attrs are from inside the netkit submsg.
The string values are auto-generated for the enums by YNL.
To clean up sample deletes the interface it created (line 11).
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250515231650.1325372-10-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Switch from including Classic netlink families one by one to excluding.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250515231650.1325372-9-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Reverse parsing lets YNL convert bad and missing attr pointers
from extack into a string like "missing attribute nest1.nest2.attr_name".
It's a feature that's unique to YNL C AFAIU (even the Python YNL
can't do nested reverse parsing). Add support for reverse-parsing
of sub-messages.
To simplify the logic and the code annotate the type policies
with extra metadata. Mark the selectors and the messages with
the information we need. We assume that key / selector always
precedes the sub-message while parsing (and also if there are
multiple sub-messages like in rt-link they are interleaved
selector 1 ... submsg 1 ... selector 2 .. submsg 2, not
selector 1 ... selector 2 ... submsg 1 ... submsg 2).
The rt-link sample in a subsequent changes shows reverse parsing
of sub-messages in action.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250515231650.1325372-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Adjust parsing and rendering appropriately to make sub-messages work.
Rendering is pretty trivial, as the submsg -> netlink conversion looks
like rendering a nest in which only one attr was set. Only trick
is that we use the enum value of the sub-message rather than the nest
as the type, and effectively skip one layer of nesting. A real double
nested struct would look like this:
[SELECTOR]
[SUBMSG]
[NEST]
[MSG1-ATTR]
A submsg "is" the nest so by skipping I mean:
[SELECTOR]
[SUBMSG]
[MSG1-ATTR]
There is no extra validation in YNL if caller has set the selector
matching the submsg type (e.g. link type = "macvlan" but the nest
attrs are set to carry "veth"). Let the kernel handle that.
Parsing side is a little more specialized as we need to render and
insert a new kind of function which switches between what to parse
based on the selector. But code isn't too complicated.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250515231650.1325372-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The easiest (or perhaps only sane) way to support submessages in C
is to treat them as if they were nests. Build fake attributes to
that effect in the codegen. Render the submsg as a big nest of all
possible values.
With this in place the main missing part is to hook in the switch
which selects how to parse based on the key.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250515231650.1325372-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Hook in handling of sub-messages, for now treat them as ignored attrs.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250515231650.1325372-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Prepare for constructing Struct() instances which represent
sub-messages rather than nested attributes.
Restructure the code / indentation to more easily insert
a case where nested reference comes from annotation other
than the 'nested-attributes' property. Make sure we don't
construct the Struct() object from scratch in multiple
places as the constructor will soon have more arguments.
This should cause no functional change.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250515231650.1325372-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We're about to add some code here for sub-messages.
Factor out the nest-related logic to make the code readable.
No functional change.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250515231650.1325372-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux
Merge cpupower utility update for 6.16 from Shuah Khan:
"Adds systemd service to run cpupower and changes binding's makefile
to use -lcpupower.
cpupower: add a systemd service to run cpupower
cpupower: do not write DESTDIR to cpupower.service
cpupower: do not call systemctl at install time
cpupower: do not install files to /etc/default/
cpupower: change binding's makefile to use -lcpupower"
* tag 'linux-cpupower-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux:
cpupower: do not install files to /etc/default/
cpupower: do not call systemctl at install time
cpupower: do not write DESTDIR to cpupower.service
cpupower: change binding's makefile to use -lcpupower
cpupower: add a systemd service to run cpupower
|
|
Use MGLRU's debugfs interface to do access tracking instead of
page_idle. The logic to use the page_idle bitmap is left in, as it is
useful for kernels that do not have MGLRU built in.
When MGLRU is enabled, page_idle will report pages as still idle even
after being accessed, as MGLRU doesn't necessarily clear the Idle folio
flag when accessing an idle page, so the test will not attempt to use
page_idle if MGLRU is enabled but otherwise not usable.
Aging pages with MGLRU is much faster than marking pages as idle with
page_idle.
Co-developed-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: James Houghton <jthoughton@google.com>
Link: https://lore.kernel.org/r/20250508184649.2576210-8-jthoughton@google.com
[sean: print parsed features, not raw string]
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
libcgroup.o is built separately from KVM selftests and cgroup selftests,
so different compiler flags used by the different selftests will not
conflict with each other.
Signed-off-by: James Houghton <jthoughton@google.com>
Link: https://lore.kernel.org/r/20250508184649.2576210-7-jthoughton@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add an API in the cgroups library to find the root of a specific
controller. KVM selftests will use the API to find the memory controller.
Search for the controller on both v1 and v2 mounts, as KVM selftests'
usage will be completely oblivious of v1 versus v2.
Signed-off-by: James Houghton <jthoughton@google.com>
Link: https://lore.kernel.org/r/20250508184649.2576210-6-jthoughton@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
KVM selftests will soon need to use some of the cgroup creation and
deletion functionality from cgroup_util.
Suggested-by: David Matlack <dmatlack@google.com>
Signed-off-by: James Houghton <jthoughton@google.com>
Acked-by: Tejun Heo <tj@kernel.org>
Link: https://lore.kernel.org/r/20250508184649.2576210-5-jthoughton@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Move a handful of helpers out of cgroup_util.c and into test_memcontrol.c
that have nothing to with cgroups in general, in anticipation of making
cgroup_util.c a generic library that can be used by other selftests.
Make read_text() and write_text() non-static so test_memcontrol.c can
use them.
Signed-off-by: James Houghton <jthoughton@google.com>
Acked-by: Michal Koutný <mkoutny@suse.com>
Link: https://lore.kernel.org/r/20250508184649.2576210-4-jthoughton@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Add an option to skip sanity check of number of still idle pages,
and set it by default to skip, in case hypervisor or NUMA balancing
is detected.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Co-developed-by: James Houghton <jthoughton@google.com>
Signed-off-by: James Houghton <jthoughton@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Link: https://lore.kernel.org/r/20250508184649.2576210-3-jthoughton@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
Extract the guts of thp_configured() and get_trans_hugepagesz() to
standalone helpers so that the core logic can be reused for other sysfs
files, e.g. to query numa_balancing.
Opportunistically assert that the initial fscanf() read at least one byte,
and add a comment explaining the second call to fscanf().
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: James Houghton <jthoughton@google.com>
Link: https://lore.kernel.org/r/20250508184649.2576210-2-jthoughton@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
|
|
On ARM64, when running with --configs '36*SRCU-P', I noticed that only 1 instance
instead of 36 for starting.
Fix it by checking for Image files, instead of bzImage which ARM does
not seem to have. With this I see all 36 instances running at the same
time in the batch.
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
|
|
Back in the day, rcutorture was about the only thing that tested off-stack
CPU masks, but now any arm64 system with more than 256 CPUs tests it
full time. In fact, it is necessary to hack the kernel to prevent such
a system from testing off-stack CPU masks. This means that there is
no longer much point in rcutorture going out of its way to test this.
And given the differences in how CPUMASK_OFFSTACK is enabled in x86 and
arm64, rcutorture would need to go out of its way.
This commit therefore removes CONFIG_CPUMASK_OFFSTACK=y (and the
CONFIG_MAXSMP=y required to enable it on x86) from TREE01.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
|
|
The TREE01.boot nr_cpus kernel boot parameter has been set to 43 for
more than seven years, but it can cause RCU CPU stall warnings on arm64,
most of the time involving the stop-machine subsystem. This should
not be too surprising, given that this causes 43 vCPUs to spin with
interrupts disabled when there are only eight physical CPUs.
The point of this CPU overcommit is to test the ability of expedited RCU
grace period initialization to handle races with incoming CPUs that have
never previously been online. But limiting to 17 CPUs instead of 43
allows time for this code to be exercised, and eliminates (or at least
greatly reduces) the incidence of RCU CPU stall warnings on arm64.
So this commit therefore sets nr_cpus=17 in TREE01.boot.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
|
|
Different architectures capitalize their splats differently. Who knew?
This commit therefore checks for both arm64 "Call trace:" and x86
"Call Trace:".
Reported-by: Joel Fernandes <joelagnelf@nvidia.com>
Closes: https://lore.kernel.org/all/553c33d8-2b51-4772-8aef-97b0163bc78e@nvidia.com/
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
|
|
This commit adds a --do-rcu-rust parameter to torture.sh, which invokes
a rust_doctests_kernel kunit run. Note that kunit wants a clean source
tree, so this runs "make mrproper", which might come as a surprise to
some users. Should there be a --mrproper parameter to torture.sh to make
the user explicitly ask for it?
Co-developed-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
|
|
Right now, torture.sh runs normal runs unconditionally, which can be slow
and thus annoying when you only want to test --kcsan or --kasan runs.
This commit therefore adds a --do-normal argument so that "--kcsan
--do-no-kasan --do-no-normal" runs only KCSAN runs. Note that specifying
"--do-no-kasan --do-no-kcsan --do-no-normal" gets normal runs, so you
should not try to use this as a synonym for --do-none.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
|
|
When an XDR counted array has a maximum element count, xdrgen adds
a bounds check to the encoder or decoder for that type. But in cases
where the .x provides no maximum element count, such as
struct notify4 {
/* composed from notify_type4 or notify_deviceid_type4 */
bitmap4 notify_mask;
notifylist4 notify_vals;
};
struct CB_NOTIFY4args {
stateid4 cna_stateid;
nfs_fh4 cna_fh;
notify4 cna_changes<>;
};
xdrgen is supposed to omit that bounds check. Some of the Jinja2
templates handle that correctly, but a few are incorrect and leave
the bounds check in place with a maximum of zero, which causes
encoding/decoding of that type to fail unconditionally.
Reported-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
When running 'make' in tools/testing/selftests/arm64/ without explicitly
setting the OUTPUT variable, the build system will creates test
directories (e.g., /bti) in the root filesystem due to OUTPUT defaulting
to an empty string. This causes unintended pollution of the root directory.
This patch adds proper handling for the OUTPUT variable: Sets OUTPUT
to the current directory (.) if not specified
Signed-off-by: tanze <tanze@kylinos.cn>
Link: https://lore.kernel.org/r/20250515051839.3409658-1-tanze@kylinos.cn
Signed-off-by: Will Deacon <will@kernel.org>
|
|
When MTE is supported but MTE_ASYMM is not (ID_AA64PFR1_EL1.MTE == 2)
ID_AA64PFR1_EL1.MTE_frac == 0xF indicates MTE_ASYNC is unsupported
and MTE_frac == 0 indicates it is supported.
As MTE_frac was previously unconditionally read as 0 from the guest
and user-space, check that using SET_ONE_REG to set it to 0 succeeds
but does not change MTE_frac from unsupported (0xF) to supported (0).
This is required as values originating from KVM from user-space must
be accepted to avoid breaking migration.
Also, to allow this MTE field to be tested, enable KVM_ARM_CAP_MTE
for the set_id_regs test. No effect on existing tests is expected.
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Link: https://lore.kernel.org/r/20250512114112.359087-4-ben.horgan@arm.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Explicitly specify LDFLAGS as an argument to CC so that this can be
overridden by the user.
Link: https://lore.kernel.org/all/20250328183858.1417835-3-bmasney@redhat.com/
Signed-off-by: Brian Masney <bmasney@redhat.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
Allow overriding the CFLAGS assignment so that the user can pass in
an outside value.
Link: https://lore.kernel.org/all/20250328183858.1417835-2-bmasney@redhat.com/
Signed-off-by: Brian Masney <bmasney@redhat.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
|
|
'realpath' is not always available, fallback to 'readlink -f' if is not
available. They seem to work equally well in this context.
Link: https://lore.kernel.org/r/20250318160510.3441646-1-yosry.ahmed@linux.dev
Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
TC needs arrays of nests, but just a put for now.
Fairly straightforward addition.
Link: https://patch.msgid.link/20250513222011.844106-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.15-rc7).
Conflicts:
tools/testing/selftests/drivers/net/hw/ncdevmem.c
97c4e094a4b2 ("tests/ncdevmem: Fix double-free of queue array")
2f1a805f32ba ("selftests: ncdevmem: Implement devmem TCP TX")
https://lore.kernel.org/20250514122900.1e77d62d@canb.auug.org.au
Adjacent changes:
net/core/devmem.c
net/core/devmem.h
0afc44d8cdf6 ("net: devmem: fix kernel panic when netlink socket close after module unload")
bd61848900bf ("net: devmem: Implement TX path")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from Bluetooth and wireless.
A few more fixes for the locking changes trickling in. Nothing too
alarming, I suspect those will continue for another release. Other
than that things are slowing down nicely.
Current release - fix to a fix:
- Bluetooth: hci_event: use key encryption size when its known
- tools: ynl-gen: allow multi-attr without nested-attributes again
Current release - regressions:
- locking fixes:
- lock lower level devices when updating features
- eth: bnxt_en: bring back rtnl_lock() in the bnxt_open() path
- devmem: fix panic when Netlink socket closes after module unload
Current release - new code bugs:
- eth: txgbe: fixes for FW communication on new AML devices
Previous releases - always broken:
- sched: flush gso_skb list too during ->change(), avoid potential
null-deref on reconfig
- wifi: mt76: disable NAPI on driver removal
- hv_netvsc: fix error 'nvsp_rndis_pkt_complete error status: 2'"
* tag 'net-6.15-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (44 commits)
net: devmem: fix kernel panic when netlink socket close after module unload
tsnep: fix timestamping with a stacked DSA driver
net/tls: fix kernel panic when alloc_page failed
bnxt_en: bring back rtnl_lock() in the bnxt_open() path
mlxsw: spectrum_router: Fix use-after-free when deleting GRE net devices
wifi: mac80211: Set n_channels after allocating struct cfg80211_scan_request
octeontx2-pf: Do not reallocate all ntuple filters
wifi: mt76: mt7925: fix missing hdr_trans_tlv command for broadcast wtbl
wifi: mt76: disable napi on driver removal
Drivers: hv: vmbus: Remove vmbus_sendpacket_pagebuffer()
hv_netvsc: Remove rmsg_pgcnt
hv_netvsc: Preserve contiguous PFN grouping in the page buffer array
hv_netvsc: Use vmbus_sendpacket_mpb_desc() to send VMBus messages
Drivers: hv: Allow vmbus_sendpacket_mpb_desc() to create multiple ranges
octeontx2-af: Fix CGX Receive counters
net: ethernet: mtk_eth_soc: fix typo for declaration MT7988 ESW capability
net: libwx: Fix FW mailbox unknown command
net: libwx: Fix FW mailbox reply timeout
net: txgbe: Fix to calculate EEPROM checksum for AML devices
octeontx2-pf: macsec: Fix incorrect max transmit size in TX secy
...
|
|
To increase code coverage, extend the ovpn selftests with the following
cases:
* connect UDP peers using a mix of IPv6 and IPv4 at the transport layer
* run full test with tunnel MTU equal to transport MTU (exercising
IP layer fragmentation)
* ping "LAN IP" served by VPN peer ("LAN behind a client" test case)
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
getaddrinfo() may fail with error code different from EAI_FAIL
or EAI_NONAME, however in this case we still try to free the
results object, thus leading to a crash.
Fix this by bailing out on any possible error.
Fixes: 959bc330a439 ("testing/selftests: add test tool and scripts for ovpn module")
Signed-off-by: Antonio Quartulli <antonio@openvpn.net>
|
|
The custom syncookie test expects TCPOPT_WINDOW to be 7 based on the
kernel’s behaviour at the time, but the upcoming series [0] will bump
it to 10.
Let's relax the test to allow any valid TCPOPT_WINDOW value in the
range 1–14.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://lore.kernel.org/netdev/20250513193919.1089692-1-edumazet@google.com/ #[0]
Link: https://patch.msgid.link/20250514214021.85187-1-kuniyu@amazon.com
|
|
Avoid dereferencing bpf_map_skeleton's link field if it's NULL.
If BPF map skeleton is created with the size, that indicates containing
link field, but the field was not actually initialized with valid
bpf_link pointer, libbpf crashes. This may happen when using libbpf-rs
skeleton.
Skeleton loading may still progress, but user needs to attach struct_ops
map separately.
Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250514113220.219095-1-mykyta.yatsenko5@gmail.com
|
|
Most tracepoints in the kernel are created with TRACE_EVENT(). The
TRACE_EVENT() macro (and DECLARE_EVENT_CLASS() and DEFINE_EVENT() where in
reality, TRACE_EVENT() is just a helper macro that calls those other two
macros), will create not only a tracepoint (the function trace_<event>()
used in the kernel), it also exposes the tracepoint to user space along
with defining what fields will be saved by that tracepoint.
There are a few places that tracepoints are created in the kernel that are
not exposed to userspace via tracefs. They can only be accessed from code
within the kernel. These tracepoints are created with DEFINE_TRACE()
Most of these tracepoints end with "_tp". This is useful as when the
developer sees that, they know that the tracepoint is for in-kernel only
(meaning it can only be accessed inside the kernel, either directly by the
kernel or indirectly via modules and BPF programs) and is not exposed to
user space.
Instead of making this only a process to add "_tp", enforce it by making
the DECLARE_TRACE() append the "_tp" suffix to the tracepoint. This
requires adding DECLARE_TRACE_EVENT() macros for the TRACE_EVENT() macro
to use that keeps the original name.
Link: https://lore.kernel.org/all/20250418083351.20a60e64@gandalf.local.home/
Cc: netdev <netdev@vger.kernel.org>
Cc: Jiri Olsa <olsajiri@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Juri Lelli <juri.lelli@gmail.com>
Cc: Breno Leitao <leitao@debian.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/20250510163730.092fad5b@gandalf.local.home
Acked-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
|
|
After elf_update_group_sh_info() was introduced, a prototype version of
"objtool klp diff" went from taking ~1s to several minutes, due to
looping almost endlessly in elf_update_group_sh_info() while creating
thousands of local symbols in a file with thousands of sections.
Dramatically improve the performance by marking all symbols' correlated
SHT_GROUP sections while reading the object. That way there's no need
to search for it every time a symbol gets reindexed.
Fixes: 2cb291596e2c ("objtool: Fix up st_info in COMDAT group section")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Rong Xu <xur@google.com>
Link: https://lkml.kernel.org/r/2a33e583c87e3283706f346f9d59aac20653b7fd.1746662991.git.jpoimboe@kernel.org
|
|
Patch series "eliminate mmap() retry merge, add .mmap_prepare hook", v2.
During the mmap() of a file-backed mapping, we invoke the underlying
driver file's mmap() callback in order to perform driver/file system
initialisation of the underlying VMA.
This has been a source of issues in the past, including a significant
security concern relating to unwinding of error state discovered by Jann
Horn, as fixed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
error path behaviour") which performed the recent, significant, rework of
mmap() as a whole.
However, we have had a fly in the ointment remain - drivers have a great
deal of freedom in the .mmap() hook to manipulate VMA state (as well as
page table state).
This can be problematic, as we can no longer reason sensibly about VMA
state once the call is complete (the ability to do - anything - here does
rather interfere with that).
In addition, callers may choose to do odd or unusual things which might
interfere with subsequent steps in the mmap() process, and it may do so
and then raise an error, requiring very careful unwinding of state about
which we can make no assumptions.
Rather than providing such an open-ended interface, this series provides
an alternative, far more restrictive one - we expose a whitelist of fields
which can be adjusted by the driver, along with immutable state upon which
the driver can make such decisions:
struct vm_area_desc {
/* Immutable state. */
struct mm_struct *mm;
unsigned long start;
unsigned long end;
/* Mutable fields. Populated with initial state. */
pgoff_t pgoff;
struct file *file;
vm_flags_t vm_flags;
pgprot_t page_prot;
/* Write-only fields. */
const struct vm_operations_struct *vm_ops;
void *private_data;
};
The mmap logic then updates the state used to either merge with a VMA or
establish a new VMA based upon this logic.
This is achieved via new file hook .mmap_prepare(), which is, importantly,
invoked very early on in the mmap() process.
If an error arises, we can very simply abort the operation with very
little unwinding of state required.
The existing logic contains another, related, peccadillo - since the
.mmap() callback might do anything, it may also cause a previously
unmergeable VMA to become mergeable with adjacent VMAs.
Right now the logic will retry a merge like this only if the driver
changes VMA flags, and changes them in such a way that a merge might
succeed (that is, the flags are not 'special', that is do not contain any
of the flags specified in VM_SPECIAL).
This has also been the source of a great deal of pain - it's hard to
reason about an .mmap() callback that might do - anything - but it's also
hard to reason about setting up a VMA and writing to the maple tree, only
to do it again utilising a great deal of shared state.
Since .mmap_prepare() sets fields before the first merge is even
attempted, the use of this callback obviates the need for this retry merge
logic.
A driver may only specify .mmap_prepare() or the deprecated .mmap()
callback. In future we may add futher callbacks beyond .mmap_prepare() to
faciliate all use cass as we convert drivers.
In researching this change, I examined every .mmap() callback, and
discovered only a very few that set VMA state in such a way that a. the
VMA flags changed and b. this would be mergeable.
In the majority of cases, it turns out that drivers are mapping kernel
memory and thus ultimately set VM_PFNMAP, VM_MIXEDMAP, or other
unmergeable VM_SPECIAL flags.
Of those that remain I identified a number of cases which are only
applicable in DAX, setting the VM_HUGEPAGE flag:
* dax_mmap()
* erofs_file_mmap()
* ext4_file_mmap()
* xfs_file_mmap()
For this remerge to not occur and to impact users, each of these cases
would require a user to mmap() files using DAX, in parts, immediately
adjacent to one another.
This is a very unlikely usecase and so it does not appear to be worthwhile
to adjust this functionality accordingly.
We can, however, very quickly do so if needed by simply adding an
.mmap_prepare() callback to these as required.
There are two further non-DAX cases I idenitfied:
* orangefs_file_mmap() - Clears VM_RAND_READ if set, replacing with
VM_SEQ_READ.
* usb_stream_hwdep_mmap() - Sets VM_DONTDUMP.
Both of these cases again seem very unlikely to be mmap()'d immediately
adjacent to one another in a fashion that would result in a merge.
Finally, we are left with a viable case:
* secretmem_mmap() - Set VM_LOCKED, VM_DONTDUMP.
This is viable enough that the mm selftests trigger the logic as a matter
of course. Therefore, this series replace the .secretmem_mmap() hook with
.secret_mmap_prepare().
This patch (of 3):
Provide a means by which drivers can specify which fields of those
permitted to be changed should be altered to prior to mmap()'ing a range
(which may either result from a merge or from mapping an entirely new
VMA).
Doing so is substantially safer than the existing .mmap() calback which
provides unrestricted access to the part-constructed VMA and permits
drivers and file systems to do 'creative' things which makes it hard to
reason about the state of the VMA after the function returns.
The existing .mmap() callback's freedom has caused a great deal of issues,
especially in error handling, as unwinding the mmap() state has proven to
be non-trivial and caused significant issues in the past, for instance
those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
error path behaviour").
It also necessitates a second attempt at merge once the .mmap() callback
has completed, which has caused issues in the past, is awkward, adds
overhead and is difficult to reason about.
The .mmap_prepare() callback eliminates this requirement, as we can update
fields prior to even attempting the first merge. It is safer, as we
heavily restrict what can actually be modified, and being invoked very
early in the mmap() process, error handling can be performed safely with
very little unwinding of state required.
The .mmap_prepare() and deprecated .mmap() callbacks are mutually
exclusive, so we permit only one to be invoked at a time.
Update vma userland test stubs to account for changes.
Link: https://lkml.kernel.org/r/cover.1746792520.git.lorenzo.stoakes@oracle.com
Link: https://lkml.kernel.org/r/adb36a7c4affd7393b2fc4b54cc5cfe211e41f71.1746792520.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
test_memcg_protection()
The test_memcg_protection() function is used for the test_memcg_min and
test_memcg_low sub-tests. This function generates a set of parent/child
cgroups like:
parent: memory.min/low = 50M
child 0: memory.min/low = 75M, memory.current = 50M
child 1: memory.min/low = 25M, memory.current = 50M
child 2: memory.min/low = 0, memory.current = 50M
After applying memory pressure, the function expects the following actual
memory usages.
parent: memory.current ~= 50M
child 0: memory.current ~= 29M
child 1: memory.current ~= 21M
child 2: memory.current ~= 0
In reality, the actual memory usages can differ quite a bit from the
expected values. It uses an error tolerance of 10% with the
values_close() helper.
Both the test_memcg_min and test_memcg_low sub-tests can fail sporadically
because the actual memory usage exceeds the 10% error tolerance. Below
are a sample of the usage data of the tests runs that fail.
Child Actual usage Expected usage %err
----- ------------ -------------- ----
1 16990208 22020096 -12.9%
1 17252352 22020096 -12.1%
0 37699584 30408704 +10.7%
1 14368768 22020096 -21.0%
1 16871424 22020096 -13.2%
The current 10% error tolerenace might be right at the time
test_memcontrol.c was first introduced in v4.18 kernel, but memory reclaim
have certainly evolved quite a bit since then which may result in a bit
more run-to-run variation than previously expected.
Increase the error tolerance to 15% for child 0 and 20% for child 1 to
minimize the chance of this type of failure. The tolerance is bigger for
child 1 because an upswing in child 0 corresponds to a smaller %err than a
similar downswing in child 1 due to the way %err is used in
values_close().
Before this patch, a 100 test runs of test_memcontrol produced the
following results:
17 not ok 1 test_memcg_min
22 not ok 2 test_memcg_low
After applying this patch, there were no test failure for test_memcg_min
and test_memcg_low in 100 test runs. However, these tests may still fail
once in a while if the memory usage goes beyond the newly extended range.
Link: https://lkml.kernel.org/r/20250502010443.106022-3-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "memcg: Fix test_memcg_min/low test failures", v8.
The test_memcontrol selftest consistently fails its test_memcg_low
sub-test (with memory_recursiveprot enabled) and sporadically fails its
test_memcg_min sub-test. This patchset fixes the test_memcg_min and
test_memcg_low failures by adjusting the test_memcontrol selftest to fix
these test failures.
This patch (of 8):
The test_memcontrol selftest consistently fails its test_memcg_low
sub-test due to the fact that its 3rd test child cgroup which have a
memmory.low of 0 have low event count. This happens when
memory_recursiveprot mount option is enabled which is the default setting
used by systemd to mount cgroup2 filesystem.
This issue was originally fixed by commit cdc69458a5f3 ("cgroup: account
for memory_recursiveprot in test_memcg_low()"). It was later reverted by
commit 1d09069f5313 ("selftests: memcg: expect no low events in
unprotected sibling") expecting the memory reclaim code would be fixed.
However, it turns out the unprotected cgroup may still have some residual
effective memory.low protection depending on the memory.low settings in
its parent and its siblings. As a result, low events may still be
triggered.
One way to fix the test failure is to revert the revert commit. However,
Michal suggested that it might be better to ignore the low event count
with memory_recursiveprot enabled as low event may or may not happen
depending on the actual test configuration.
Modify the test_memcontrol.c to ignore low event in the 3rd child cgroup
with memory_recursiveprot on.
The 4th child cgroup has no memory usage and so has an effective low of 0.
It has no low event count because the mem_cgroup_below_low() check in
shrink_node_memcgs() is skipped as mem_cgroup_below_min() returns true.
If we ever change mem_cgroup_below_min() in such a way that it no longer
skips the no usage case, we will have to add code to explicitly skip it.
With this patch applied, the test_memcg_low sub-test finishes successfully
without failure in most cases. Though both test_memcg_low and
test_memcg_min sub-tests may still fail occasionally if the memory.current
values fall outside of the expected ranges.
Link: https://lkml.kernel.org/r/20250502010443.106022-1-longman@redhat.com
Link: https://lkml.kernel.org/r/20250502010443.106022-2-longman@redhat.com
Signed-off-by: Waiman Long <longman@redhat.com>
Suggested-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Michal Koutný <mkoutny@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Improve the installation procedure for the systemd service unit
'cpupower.service', to be more distro-agnostic. Do not install the
service unit configuration file to /etc/default/ (a directory that
is used by Debian and Debian-derivatives and only rarely by other
distros).
Also, clarify the role of the configuration file in its own comments.
Link: https://lore.kernel.org/linux-pm/20250509002206.bd2519ba52035d47c3c32aa6@paranoici.org/T/#ma8a3fa80acc4036af6c754e8ecabacc55b288ad1
Link: https://lore.kernel.org/r/20250513163937.61062-5-invernomuto@paranoici.org
Fixes: 9c70b779ad91 ("cpupower: add a systemd service to run cpupower")
Signed-off-by: Francesco Poli (wintermute) <invernomuto@paranoici.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Fix the installation procedure for the systemd service unit
'cpupower.service'. Do not call "systemctl daemon-reload" in the
Makefile, but explain when this command should be manually issued
in the README file.
Link: https://lore.kernel.org/linux-pm/20250509002206.bd2519ba52035d47c3c32aa6@paranoici.org/T/#mfbb938f9c0d5a21173acb92a061eb9205fd0abfe
Link: https://lore.kernel.org/r/20250513163937.61062-4-invernomuto@paranoici.org
Fixes: 9c70b779ad91 ("cpupower: add a systemd service to run cpupower")
Signed-off-by: Francesco Poli (wintermute) <invernomuto@paranoici.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|
|
Fix the use of DESTDIR variable in the Makefile, as far as the
installation of the systemd service unit 'cpupower.service' is
concerned.
This was caused by a misunderstanding about the purpose of the DESTDIR
variable in the Makefile, which is instead meant to support staged
installations: its value should not end up into installed file contents.
Link: https://lore.kernel.org/linux-pm/20250509002206.bd2519ba52035d47c3c32aa6@paranoici.org/T/#mfbb938f9c0d5a21173acb92a061eb9205fd0abfe
Link: https://www.gnu.org/prep/standards/html_node/DESTDIR.html
Link: https://lore.kernel.org/r/20250513163937.61062-3-invernomuto@paranoici.org
Fixes: 9c70b779ad91 ("cpupower: add a systemd service to run cpupower")
Signed-off-by: Francesco Poli (wintermute) <invernomuto@paranoici.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
|