Merge tag 'batadv-net-pullrequest-20250207' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
Here are some batman-adv bugfixes:
- Fix panic during interface removal in BATMAN V, by Andy Strohman
- Cleanup BATMAN V/ELP metric handling, by Sven Eckelmann (2 patches)
- Fix incorrect offset in batadv_tt_tvlv_ogm_handler_v1(),
by Remi Pommarel
* tag 'batadv-net-pullrequest-20250207' of git://git.open-mesh.org/linux-merge:
batman-adv: Fix incorrect offset in batadv_tt_tvlv_ogm_handler_v1()
batman-adv: Drop unmanaged ELP metric worker
batman-adv: Ignore neighbor throughput metrics in error case
batman-adv: fix panic during interface removal
====================
Link: https://patch.msgid.link/20250207095823.26043-1-sw@simonwunderlich.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Thomas Weißschuh says:
====================
ptp: vmclock: bugfixes and cleanups for error handling
Some error handling issues I noticed while looking at the code.
Only compile-tested.
v1: https://lore.kernel.org/r/20250206-vmclock-probe-v1-0-17a3ea07be34@linutronix.de
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
====================
Link: https://patch.msgid.link/20250207-vmclock-probe-v2-0-bc2fce0bdf07@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
vmclock_probe() uses an "out:" label to return from the function on
error. Normally such a label indicates that some cleanup operation is
necessary. However, this label does not do anything, as all resources
are managed through devres, which makes the code slightly harder to read.
Remove the label and just return directly.
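As a generic sketch with placeholder names (not the driver's actual
code), the resulting shape is simply:

  /* Before: error path was 'goto out;' with 'out: return ret;' doing nothing. */
  static int example_probe(struct device *dev)
  {
          int ret;

          ret = devm_example_setup(dev);  /* hypothetical devres-managed setup */
          if (ret)
                  return ret;             /* direct return; devres does cleanup */

          return 0;
  }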
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Most resources owned by the vmclock device are managed through devres.
Only the miscdev and ptp clock are managed manually.
This makes the code slightly harder to understand than necessary.
Switch them over to devres and remove the now unnecessary drvdata.
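As a hedged sketch (not necessarily the exact mechanism used here),
devres can take ownership of such manual teardown via
devm_add_action_or_reset():

  /* Sketch only; function names are illustrative. */
  static void example_ptp_unregister(void *data)
  {
          ptp_clock_unregister(data);
  }

  static int example_register_ptp(struct device *dev, struct ptp_clock *clock)
  {
          /* Teardown now runs automatically on probe failure or detach. */
          return devm_add_action_or_reset(dev, example_ptp_unregister, clock);
  }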
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
vmclock_remove() tries to detect the successful registration of the misc
device based on its minor number.
However, that check is incorrect if the misc device registration was not
attempted in the first place.
Always initialize the minor number, so the check works properly.
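A sketch of the idea, with placeholder names; only the sentinel handling
is shown:

  /* Illustrative sketch; the state struct and field names are placeholders. */
  struct example_state {
          struct miscdevice miscdev;
  };

  static void example_init_state(struct example_state *st)
  {
          /* Set the sentinel even if registration is never attempted. */
          st->miscdev.minor = MISC_DYNAMIC_MINOR;
  }

  static void example_remove(struct example_state *st)
  {
          /* misc_register() assigns a real minor only on success. */
          if (st->miscdev.minor != MISC_DYNAMIC_MINOR)
                  misc_deregister(&st->miscdev);
  }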
Fixes: 205032724226 ("ptp: Add support for the AMZNC10C 'vmclock' device")
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If vmclock_ptp_register() fails during probing, vmclock_remove() is
called to clean up the ptp clock and misc device.
It uses dev_get_drvdata() to access the vmclock state.
However, the driver data is not yet set at this point.
Assign the driver data earlier.
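A sketch of the reordering; vmclock_ptp_register() is named above, while
the state struct and the signature are assumptions:

  /* Ordering sketch; struct/field names and the register signature are assumed. */
  static int example_probe(struct device *dev, struct vmclock_state *st)
  {
          dev_set_drvdata(dev, st);       /* set before anything that can fail */

          st->ptp_clock = vmclock_ptp_register(dev, st);
          if (IS_ERR(st->ptp_clock))
                  return PTR_ERR(st->ptp_clock);  /* remove path reads drvdata */

          return 0;
  }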
Fixes: 205032724226 ("ptp: Add support for the AMZNC10C 'vmclock' device")
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Without the .owner field, the module can be unloaded while /dev/vmclock0
is open, leading to an oops.
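For reference, the usual fix is simply to set .owner in the
file_operations (sketch with an illustrative name):

  /* Holding a module reference while the chardev is open. */
  static const struct file_operations example_fops = {
          .owner = THIS_MODULE,   /* prevents unload while an fd is open */
  };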
Fixes: 205032724226 ("ptp: Add support for the AMZNC10C 'vmclock' device")
Cc: stable@vger.kernel.org
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Merge tag 'linux-can-fixes-for-6.14-20250208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2025-02-08
The first patch is by Reyders Morales and fixes a code example in the
CAN ISO 15765-2 documentation.
The next patch is contributed by Alexander Hölzl and fixes sending of
J1939 messages with zero data length.
Fedor Pchelkin's patch for the ctucanfd driver adds missing handling of
an skb allocation error.
Krzysztof Kozlowski contributes a patch for the c_can driver to fix an
unbalanced runtime PM disable in the error path.
The next patch is by Vincent Mailhol and fixes a NULL pointer
dereference on udev->serial in the etas_es58x driver.
The last patch is by Robin van der Gracht and fixes the handling of an
skb allocation error in the rockchip driver.
* tag 'linux-can-fixes-for-6.14-20250208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
can: rockchip: rkcanfd_handle_rx_fifo_overflow_int(): bail out if skb cannot be allocated
can: etas_es58x: fix potential NULL pointer dereference on udev->serial
can: c_can: fix unbalanced runtime PM disable in error path
can: ctucanfd: handle skb allocation failure
can: j1939: j1939_sk_send_loop(): fix unable to send messages with data length zero
Documentation/networking: fix basic node example document ISO 15765-2
====================
Link: https://patch.msgid.link/20250208115120.237274-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The device is able to checksum plain TCP / UDP packets over IPv4 / IPv6
when the 'ipcs' bit in the send descriptor is set. Advertise support for
the 'NETIF_F_IP{,6}_CSUM' features in net devices registered by the
driver and their VLAN uppers, and set the 'ipcs' bit when the stack
requests Tx checksum offload.
Note that the device also calculates the IPv4 checksum, but it first
zeroes the current checksum so there should not be any difference
compared to the checksum calculated by the kernel.
On SN5600 (Spectrum-4) there is about a 10% improvement in Tx packet rate
with 1400 byte packets when using pktgen.
Tested on Spectrum-{1,2,3,4} with all the combinations of IPv4 / IPv6,
TCP / UDP, with and without VLAN.
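For illustration, the Tx-side decision reduces to something like the
sketch below (the descriptor setter is hypothetical; feature
advertisement would set NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM on the
netdev):

  /* Illustrative sketch; the 'ipcs' descriptor setter is hypothetical. */
  static void example_txhdr_prepare(struct sk_buff *skb, char *txhdr)
  {
          /* The stack sets CHECKSUM_PARTIAL only when Tx csum offload is on. */
          if (skb->ip_summed == CHECKSUM_PARTIAL)
                  example_txhdr_ipcs_set(txhdr, true);    /* hypothetical */
  }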
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/8dc86c95474ce10572a0fa83b8adb0259558e982.1738950446.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Referring to C binaries from Python code is going to be a common
need. Add a helper to convert a path expressed relative to the test.
Meaning, if the test is in the same directory as the binary, the
call would simply be: cfg.rpath("binary").
The helper name "rpath" is not great. I can't think of a better
name that would be accurate yet concise.
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20250207184140.1730466-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We have separate Env classes for local tests and tests with a remote
endpoint. Make it easier to share the code by creating a base class.
Make env loading a method of this class.
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20250207184140.1730466-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ncdevmem doesn't need libmnl, remove the unnecessary include.
Since YNL no longer depends on libmnl either, it's actually
possible to build the selftests without having libmnl installed.
Reviewed-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250207183119.1721424-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Kuniyuki Iwashima says:
====================
fib: rules: Convert RTM_NEWRULE and RTM_DELRULE to per-netns RTNL.
Patches 1 ~ 2 are small cleanups, and patches 3 ~ 8 make fib_nl_newrule()
and fib_nl_delrule() hold per-netns RTNL.
v1: https://lore.kernel.org/20250206084629.16602-1-kuniyu@amazon.com
====================
Link: https://patch.msgid.link/20250207072502.87775-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
fib_nl_delrule() is the doit() handler for RTM_DELRULE but also called
from vrf_newlink() in case something fails in vrf_add_fib_rules().
In the latter case, RTNL is already held and the 4th arg is true.
Let's hold per-netns RTNL in fib_delrule() if rtnl_held is false.
Now we can place ASSERT_RTNL_NET() in call_fib_rule_notifiers().
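A hedged sketch of the locking shape described above, using the
per-netns RTNL helpers; the helper called in the middle is hypothetical:

  /* Illustrative sketch of conditional per-netns RTNL locking. */
  static int example_delrule(struct net *net, bool rtnl_held)
  {
          int err;

          if (!rtnl_held)
                  rtnl_net_lock(net);

          /* deletion + notifiers run here; they may ASSERT_RTNL_NET(net) */
          err = example_do_delrule(net);          /* hypothetical helper */

          if (!rtnl_held)
                  rtnl_net_unlock(net);

          return err;
  }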
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250207072502.87775-9-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We will hold RTNL just before calling fib_nl2rule_rtnl() in
fib_delrule() and release it before kfree(nlrule).
Let's add a new rule to make the following change cleaner.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250207072502.87775-8-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
fib_nl_newrule() is the doit() handler for RTM_NEWRULE but also called
from vrf_newlink().
In the latter case, RTNL is already held and the 4th arg is true.
Let's hold per-netns RTNL in fib_newrule() if rtnl_held is false.
Note that we call fib_rule_get() before releasing per-netns RTNL to call
notify_rule_change() without RTNL and prevent freeing the new rule.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250207072502.87775-7-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
fib_nl_newrule() / fib_nl_delrule() are the doit() handlers for
RTM_NEWRULE / RTM_DELRULE but are also called from vrf_newlink().
Currently, we hold RTNL on both paths, but soon we will not on the former.
Also, vrf_fib_rule() sets skb->sk to dev_net(dev)->rtnl because
fib_nl_newrule() / fib_nl_delrule() fetch net as sock_net(skb->sk).
Let's factorise the two functions and pass net and an rtnl_held flag.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250207072502.87775-6-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The following patch will not set skb->sk from the VRF path.
Let's fetch net from fib_rule->fr_net instead of sock_net(skb->sk)
in fib[46]_rule_configure().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250207072502.87775-5-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We will move RTNL down to fib_nl_newrule() and fib_nl_delrule().
Some operations in fib_nl2rule() require RTNL: fib_default_rule_pref()
and __dev_get_by_name().
Let's split the RTNL parts as fib_nl2rule_rtnl().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250207072502.87775-4-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
skb is not used in fib_nl2rule() other than for sock_net(skb->sk),
which is already available in its callers, fib_nl_newrule() and
fib_nl_delrule().
Let's pass net directly to fib_nl2rule().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250207072502.87775-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
fib_nl_newrule() / fib_nl_delrule() look up struct fib_rules_ops
in sock_net(skb->sk) and call rule_exists() / rule_find(), respectively.
fib_nl_newrule() creates a new rule and links it to the found ops, so
a struct fib_rule never belongs to a different netns's ops->rules_list.
Let's remove the redundant netns check in rule_exists() and rule_find().
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250207072502.87775-2-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Akihiko Odaki says:
====================
tun: Unify vnet implementation
When I implemented virtio's hash-related features to tun/tap [1],
I found tun/tap does not fill the entire region reserved for the virtio
header, leaving some uninitialized hole in the middle of the buffer
after read()/recvmesg().
This series fills the uninitialized hole. More concretely, the
num_buffers field will be initialized with 1, and the other fields will
be inialized with 0. Setting the num_buffers field to 1 is mandated by
virtio 1.0 [2].
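For reference, the header in question is struct virtio_net_hdr_mrg_rxbuf
from the virtio UAPI; a sketch of the fill (the endianness flag and the
zeroing strategy are illustrative):

  /* Illustrative sketch; 'le' and the zero-fill of the hole are assumptions. */
  static void example_fill_vnet_hdr(struct virtio_net_hdr_mrg_rxbuf *hdr, bool le)
  {
          memset(hdr, 0, sizeof(*hdr));              /* no uninitialized hole */
          hdr->num_buffers = cpu_to_virtio16(le, 1); /* mandated by virtio 1.0 */
  }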
The change to the virtio header is preceded by another change that refactors
tun and tap to unify their virtio-related code.
[1]: https://lore.kernel.org/r/20241008-rss-v5-0-f3cf68df005d@daynix.com
[2]: https://lore.kernel.org/r/20241227084256-mutt-send-email-mst@kernel.org/
====================
Link: https://patch.msgid.link/20250207-tun-v6-0-fb49cf8b103e@daynix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tun and tap implement the same vnet-related features, so reuse the code.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250207-tun-v6-7-fb49cf8b103e@daynix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
hdr_len is repeatedly used so keep it in a local variable.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250207-tun-v6-6-fb49cf8b103e@daynix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The vnet handling code will be reused by tap.
Functions are renamed to ensure that their names contain "vnet" to
clarify that they are part of the decoupled vnet handling code.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250207-tun-v6-5-fb49cf8b103e@daynix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Decouple the vnet handling code so that we can reuse it for tap.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250207-tun-v6-4-fb49cf8b103e@daynix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Decouple vnet-related functions from tun_struct so that we can reuse
them for tap in the future.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250207-tun-v6-3-fb49cf8b103e@daynix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
hdr_len is repeatedly used so keep it in a local variable.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250207-tun-v6-2-fb49cf8b103e@daynix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Check IS_ENABLED(CONFIG_TUN_VNET_CROSS_LE) to save some lines and make
future changes easier.
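Shown generically, the pattern replaces an #ifdef block with a plain
branch the compiler can discard (the function below is illustrative, not
tun's exact code):

  /* IS_ENABLED() keeps both branches visible to the compiler, unlike #ifdef. */
  static bool example_is_little_endian(bool flipped)
  {
          if (IS_ENABLED(CONFIG_TUN_VNET_CROSS_LE) && flipped)
                  return true;

          return virtio_legacy_is_little_endian();
  }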
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20250207-tun-v6-1-fb49cf8b103e@daynix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Sean Anderson says:
====================
net: xilinx: axienet: Enable adaptive IRQ coalescing with DIM
To improve performance without sacrificing latency under low load,
enable DIM. While I appreciate not having to write the library myself, I
do think there are many unusual aspects to DIM, as detailed in the last
patch.
====================
Link: https://patch.msgid.link/20250206201036.1516800-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The default RX IRQ coalescing settings of one IRQ per packet can represent
a significant CPU load. However, increasing the coalescing unilaterally
can result in undesirable latency under low load. Adaptive IRQ
coalescing with DIM offers a way to adjust the coalescing settings based
on load.
This device only supports "CQE" mode [1], where each packet resets the
timer. Therefore, an interrupt is fired either when we receive
coalesce_count_rx packets or when the interface is idle for
coalesce_usec_rx. With this in mind, consider the following scenarios:
Link saturated
Here we want to set coalesce_count_rx to a large value, in order to
coalesce more packets and reduce CPU load. coalesce_usec_rx should
be set to at least the time for one packet. Otherwise the link will
be "idle" and we will get an interrupt for each packet anyway.
Bursts of packets
Each burst should be coalesced into a single interrupt, although it
may be prudent to reduce coalesce_count_rx for better latency.
coalesce_usec_rx should be set to at least the time for one packet
so bursts are coalesced. However, additional time beyond the packet
time will just increase latency at the end of a burst.
Sporadic packets
Due to low load, we can set coalesce_count_rx to 1 in order to
reduce latency to the minimum. coalesce_usec_rx does not matter in
this case.
Based on this analysis, I expected the CQE profiles to look something
like
usec = 0, pkts = 1 // Low load
usec = 16, pkts = 4
usec = 16, pkts = 16
usec = 16, pkts = 64
usec = 16, pkts = 256 // High load
Where usec is set to 16 to be a few us greater than the 12.3 us packet
time of a 1500 MTU packet at 1 GBit/s. However, the CQE profile is
instead
usec = 2, pkts = 256 // Low load
usec = 8, pkts = 128
usec = 16, pkts = 64
usec = 32, pkts = 64
usec = 64, pkts = 64 // High load
I found this very surprising. The number of coalesced packets
*decreases* as load increases. But as load increases we have more
opportunities to coalesce packets without affecting latency as much.
Additionally, the profile *increases* the usec as the load increases.
But as load increases, the gaps between packets will tend to become
smaller, making it possible to *decrease* usec for better latency at the
end of a "burst".
I consider the default CQE profile unsuitable for this NIC. Therefore,
we use the first profile outlined in this commit instead.
coalesce_usec_rx is set to 16 by default, but the user can customize it.
This may be necessary if they are using jumbo frames. I think adjusting
the profile times based on the link speed/MTU would be a good improvement
for generic DIM.
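For illustration, a profile of the shape argued for above can be
expressed with the dim library's struct (the array name is illustrative;
the values mirror the list above):

  /* Illustrative per-driver RX CQE profile matching the analysis above. */
  static const struct dim_cq_moder example_rx_cqe_profile[] = {
          { .usec = 0,  .pkts = 1,   },  /* low load: minimum latency */
          { .usec = 16, .pkts = 4,   },
          { .usec = 16, .pkts = 16,  },
          { .usec = 16, .pkts = 64,  },
          { .usec = 16, .pkts = 256, },  /* high load: maximum coalescing */
  };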
In addition to the above profile problems, I noticed the following
additional issues with DIM while testing:
- DIM tends to "wander" when at low load, since the performance gradient
is pretty flat. If you only have 10p/ms anyway then adjusting the
coalescing settings will not affect throughput very much.
- DIM takes a long time to adjust back to low indices when load is
decreased following a period of high load. This is because it only
re-evaluates its settings once every 64 interrupts. However, at low
load, 64 interrupts can take several seconds.
Finally: performance. This patch increases receive throughput with
iperf3 from 840 Mbits/sec to 938 Mbits/sec, decreases interrupts from
69920/sec to 316/sec, and decreases CPU utilization (4x Cortex-A53) from
43% to 9%.
[1] Who names this stuff?
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20250206201036.1516800-5-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The cr variables now contain the same values as the control registers
themselves. Extract/calculate the values from the variables instead of
saving the user-specified values. This allows us to remove some
bookkeeping, and also lets the user know what the actual coalesce
settings are.
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20250206201036.1516800-4-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In preparation for adaptive IRQ coalescing, we first need to support
adjusting the settings at runtime. The existing code doesn't require any
locking because
- dma_start is the only function that modifies rx/tx_dma_cr. It is
always called with IRQs and NAPI disabled, so nothing else is touching
the hardware.
- The IRQs don't race with poll, since the latter is a softirq.
- The IRQs don't race with dma_stop since they both just clear the
control registers.
- dma_stop doesn't race with poll since the former is called with NAPI
disabled.
However, once we introduce another function that modifies rx/tx_dma_cr,
we need to have some locking to prevent races. Introduce two locks to
protect these variables and their registers.
The control register values are now generated where the coalescing
settings are set. Converting coalescing settings to control register
values may require sleeping because of clk_get_rate. However, the
read/modify/write of the control registers themselves can't sleep
because it needs to happen in IRQ context. By pre-calculating the
control register values, we avoid introducing an additional mutex.
Since axienet_dma_start writes the control settings when it runs, we
don't bother updating the CR registers when rx/tx_dma_started is false.
This prevents any issues from writing to the control registers in the
middle of a reset sequence.
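A hedged sketch of that shape, with all names illustrative: the CR value
is computed in sleepable context and applied under a spinlock only while
the DMA is running:

  /* Illustrative sketch; lock, fields and register helper are placeholders. */
  static void example_update_rx_cr(struct example_priv *lp, u32 new_cr)
  {
          unsigned long flags;

          spin_lock_irqsave(&lp->rx_cr_lock, flags);
          lp->rx_dma_cr = new_cr;                 /* precomputed, no sleeping */
          if (lp->rx_dma_started)
                  example_dma_out32(lp, new_cr);  /* hypothetical MMIO write */
          spin_unlock_irqrestore(&lp->rx_cr_lock, flags);
  }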
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20250206201036.1516800-3-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Combine the common parts of the CR calculations for better code reuse.
While we're at it, simplify the code a bit.
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20250206201036.1516800-2-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Merge tag 'wireless-2025-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Kalle Valo says:
====================
wireless fixes for v6.14-rc3
We have only one fix for ath12k and one fix for brcmfmac. Also this
will be my last pull request as I'm stepping down as wireless driver
maintainer.
* tag 'wireless-2025-02-07' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
MAINTAINERS: wifi: remove Kalle
MAINTAINERS: wifi: ath: remove Kalle
wifi: brcmfmac: use random seed flag for BCM4355 and BCM4364 firmware
wifi: ath12k: fix handling of 6 GHz rules
====================
Link: https://patch.msgid.link/20250207182957.23315C4CED1@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Eric Dumazet says:
====================
net: second round to use dev_net_rcu()
dev_net(dev) should either be protected by RTNL or RCU.
There is no LOCKDEP support yet for this helper.
Adding it would trigger too many splats.
This second series fixes some of them.
====================
Link: https://patch.msgid.link/20250207135841.1948589-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
igmp6_send() can be called without RTNL or RCU being held.
Extend RCU protection so that we can safely fetch the net pointer
and avoid a potential UAF.
Note that we can no longer use sock_alloc_send_skb(), because
ipv6.igmp_sk uses GFP_KERNEL allocations, which can sleep.
Instead use alloc_skb() and charge the net->ipv6.igmp_sk
socket under RCU protection.
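A hedged sketch of the resulting pattern (not the function's exact
code): dev_net_rcu() and the allocation both sit inside the RCU section,
and the allocation must not sleep:

  /* Illustrative sketch; 'size' and error handling are simplified. */
  static struct sk_buff *example_igmp6_alloc(struct net_device *dev, int size)
  {
          struct sk_buff *skb;
          struct net *net;

          rcu_read_lock();
          net = dev_net_rcu(dev);            /* safe: inside the RCU section */
          skb = alloc_skb(size, GFP_ATOMIC); /* must not sleep under RCU */
          if (skb)
                  skb_set_owner_w(skb, net->ipv6.igmp_sk); /* charge the socket */
          rcu_read_unlock();

          return skb;
  }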
Fixes: b8ad0cbc58f7 ("[NETNS][IPV6] mcast - handle several network namespace")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250207135841.1948589-9-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ndisc_send_skb() can be called without RTNL or RCU held.
Acquire rcu_read_lock() earlier, so that we can use dev_net_rcu()
and avoid a potential UAF.
Fixes: 1762f7e88eb3 ("[NETNS][IPV6] ndisc - make socket control per namespace")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250207135841.1948589-8-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
l3mdev_l3_out() can be called without RCU being held:
raw_sendmsg()
ip_push_pending_frames()
ip_send_skb()
ip_local_out()
__ip_local_out()
l3mdev_ip_out()
Add rcu_read_lock() / rcu_read_unlock() pair to avoid
a potential UAF.
Fixes: a8e3e1a9f020 ("net: l3mdev: Add hook to output path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250207135841.1948589-7-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ovs_vport_cmd_fill_info() can be called without RTNL or RCU.
Use RCU protection and dev_net_rcu() to avoid potential UAF.
Fixes: 9354d4520342 ("openvswitch: reliable interface indentification in port dumps")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250207135841.1948589-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
arp_xmit() can be called without RTNL or RCU protection.
Use RCU protection to avoid potential UAF.
Fixes: 29a26a568038 ("netfilter: Pass struct net into the netfilter hooks")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250207135841.1948589-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
__neigh_notify() can be called without RTNL or RCU protection.
Use RCU protection to avoid potential UAF.
Fixes: 426b5303eb43 ("[NETNS]: Modify the neighbour table code so it handles multiple network namespaces")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250207135841.1948589-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ndisc_alloc_skb() can be called without RTNL or RCU being held.
Add RCU protection to avoid possible UAF.
Fixes: de09334b9326 ("ndisc: Introduce ndisc_alloc_skb() helper.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250207135841.1948589-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ndisc_send_redirect() is called under RCU protection, not RTNL.
It must use dev_get_by_index_rcu() instead of __dev_get_by_index().
Fixes: 2f17becfbea5 ("vrf: check the original netdevice for generating redirect")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Stephen Suryaputra <ssuryaextr@gmail.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250207135841.1948589-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit df542f669307 ("net: stmmac: Switch to zero-copy in
non-XDP RX path") makes the DMA write the received frame into the buffer
at an offset of NET_SKB_PAD and sets the page pool parameters to sync
from an offset of NET_SKB_PAD. But when Header Payload Split is enabled,
the header is written at an offset of NET_SKB_PAD, while the payload is
written at an offset of zero. The incorrect offset parameter for the
payload breaks DMA coherence [1], since both the CPU and the DMA touch
the page buffer from offset zero, which is not handled by the page pool
sync parameter.
And in case the DMA cannot split the received frame, for example
a large L2 frame, pp_params.max_len should grow to match the tail
of the entire frame.
[1] https://lore.kernel.org/netdev/d465f277-bac7-439f-be1d-9a47dfe2d951@nvidia.com/
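For context, the knobs in question are the offset and max_len members of
struct page_pool_params; a generic sketch of the adjustment (sizes are
illustrative):

  /* Illustrative sketch; buf_size and flag choice are placeholders. */
  static void example_pp_params_init(struct page_pool_params *pp, u32 buf_size)
  {
          memset(pp, 0, sizeof(*pp));
          pp->flags   = PP_FLAG_DMA_MAP | PP_FLAG_DMA_SYNC_DEV;
          pp->offset  = 0;          /* payload may be written at offset 0 */
          pp->max_len = buf_size;   /* must reach the tail of a full frame */
  }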
Reported-by: Jon Hunter <jonathanh@nvidia.com>
Reported-by: Brad Griffis <bgriffis@nvidia.com>
Suggested-by: Ido Schimmel <idosch@idosch.org>
Fixes: df542f669307 ("net: stmmac: Switch to zero-copy in non-XDP RX path")
Signed-off-by: Furong Xu <0x1207@gmail.com>
Tested-by: Jon Hunter <jonathanh@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20250207085639.13580-1-0x1207@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The Dell AW1022z is an RTL8156B based 2.5G Ethernet controller.
Add the vendor and product ID values to the driver. This makes Ethernet
work with the adapter.
Signed-off-by: Aleksander Jan Bajkowski <olek2@wp.pl>
Link: https://patch.msgid.link/20250206224033.980115-1-olek2@wp.pl
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Alexander Lobakin says:
====================
xsk: the lost bits from Chapter III
Before introducing libeth_xdp, we need to add a couple more generic
helpers. Notably:
* 01: add generic loop unrolling hint helpers;
* 04: add helper to get both xdp_desc's DMA address and metadata
pointer in one go, saving several cycles and hotpath object
code size in drivers (especially when unrolling).
Bonus:
* 02, 03: convert two drivers which were using custom macros to
generic unrolled_count() (trivial, no object code changes).
====================
Link: https://patch.msgid.link/20250206182630.3914318-1-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, when your driver supports XSk Tx metadata and you want to
send an XSk frame, you need to do the following:
* call external xsk_buff_raw_get_dma();
* call inline xsk_buff_get_metadata(), which calls external
xsk_buff_raw_get_data() and then does some inline checks.
This effectively means that the following piece:
addr = pool->unaligned ? xp_unaligned_add_offset_to_addr(addr) : addr;
is done twice per frame, plus you have 2 external calls per frame, plus
this:
meta = pool->addrs + addr - pool->tx_metadata_len;
if (unlikely(!xsk_buff_valid_tx_metadata(meta)))
is always inlined, even if there's no meta or it's invalid.
Add xsk_buff_raw_get_ctx() (xp_raw_get_ctx() to be precise) to do that
in one go. It returns a small structure with 2 fields: DMA address,
filled unconditionally, and metadata pointer, non-NULL only if it's
present and valid. The address correction is performed only once and
you also have only 1 external call per XSk frame, which does all the
calculations and checks outside of your hotpath. You only need to
check `if (ctx.meta)` for the metadata presence.
To avoid copying any existing code, split the address correction and the
fetching of the virtual and DMA addresses into small helpers.
bloat-o-meter reports no object code changes for the existing
functionality.
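A usage sketch in the same fragment style as above; the context type
name, the 'dma' field and the driver helpers are assumptions:

  /* Usage sketch; the context type name and 'dma' field are assumed. */
  struct xsk_buff_raw_tx_ctx ctx = xsk_buff_raw_get_ctx(pool, desc->addr);

  example_fill_desc(txq, ctx.dma, desc->len);    /* hypothetical driver code */
  if (ctx.meta)                                  /* valid metadata present */
          example_fill_meta(txq, ctx.meta);      /* hypothetical driver code */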
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250206182630.3914318-5-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ice, same as i40e, has a custom loop unrolling macro for unrolling
Tx descriptors filling on XSk xmit.
Replace the ice defs with generic unrolled_count(), which is also more
convenient as it allows passing defines as its argument, not hardcoded
values, while the loop declaration will still be a usual for-loop.
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://patch.msgid.link/20250206182630.3914318-4-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
i40e, as well as ice, has a custom loop unrolling macro for unrolling
Tx descriptors filling on XSk xmit.
Replace i40e defs with generic unrolled_count(), which is also more
convenient as it allows passing defines as its argument, not hardcoded
values, while the loop declaration will still be a usual for-loop.
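A generic usage sketch of the helper; the batch size, loop body and
function names are illustrative:

  /* Generic usage sketch; the unroll count must be a compile-time constant. */
  #define EXAMPLE_BATCH 8

  static void example_fill_batch(void *descs)
  {
          u32 i;

          unrolled_count(EXAMPLE_BATCH)
          for (i = 0; i < EXAMPLE_BATCH; i++)
                  example_fill_one(descs, i);     /* hypothetical per-desc fill */
  }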
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Acked-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://patch.msgid.link/20250206182630.3914318-3-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|