Age | Commit message (Collapse) | Author |
|
mptcp_get_available_schedulers() needs to iterate over the schedulers'
list only to read the names: it doesn't modify anything there.
In this case, it is enough to hold the RCU read lock, no need to combine
this with the associated spin lock as it was done since its introduction
in commit 73c900aa3660 ("mptcp: add net.mptcp.available_schedulers").
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Geliang Tang <geliang@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241104-net-next-mptcp-sched-unneeded-lock-v2-1-2ccc1e0c750c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Matthieu Baerts says:
====================
mptcp: pm: fix wrong perm and sock kfree
Two small fixes related to the MPTCP path-manager:
- Patch 1: remove an accidental restriction to admin users to list MPTCP
endpoints. A regression from v6.7.
- Patch 2: correctly use sock_kfree_s() instead of kfree() in the
userspace PM. A fix for another fix introduced in v6.4 and
backportable up to v5.19.
====================
Link: https://patch.msgid.link/20241104-net-mptcp-misc-6-12-v1-0-c13f2ff1656f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The local address entries on userspace_pm_local_addr_list are allocated
by sock_kmalloc().
It's then required to use sock_kfree_s() instead of kfree() to free
these entries in order to adjust the allocated size on the sk side.
Fixes: 24430f8bf516 ("mptcp: add address into userspace pm list")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241104-net-mptcp-misc-6-12-v1-2-c13f2ff1656f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
During the switch to YNL, the command to list all endpoints has been
accidentally restricted to users with admin permissions.
It looks like there are no reasons to have this restriction which makes
it harder for a user to quickly check if the endpoint list has been
correctly populated by an automated tool. Best to go back to the
previous behaviour then.
mptcp_pm_gen.c has been modified using ynl-gen-c.py:
$ ./tools/net/ynl/ynl-gen-c.py --mode kernel \
--spec Documentation/netlink/specs/mptcp_pm.yaml --source \
-o net/mptcp/mptcp_pm_gen.c
The header file doesn't need to be regenerated.
Fixes: 1d0507f46843 ("net: mptcp: convert netlink from small_ops to ops")
Cc: stable@vger.kernel.org
Reviewed-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241104-net-mptcp-misc-6-12-v1-1-c13f2ff1656f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Drew Fustini says:
====================
Add the dwmac driver support for T-HEAD TH1520 SoC
This series adds support for dwmac gigabit ethernet in the T-Head TH1520
RISC-V SoC used on boards like BeagleV Ahead and the LicheePi 4A.
The gigabit ethernet on these boards does need pinctrl support to mux
the necessary pads. The pinctrl-th1520 driver, pinctrl binding, and
related dts patches are in linux-next. However, they are not yet in
net-next/main.
Therefore, I am dropping the dts patch for v5 as it will not build on
net-next/main due to the lack of the padctrl0_apsys pin controller node
in next-next/main version th1520.dtsi. It does exist in linux-next [1]
and the two patches in this series allow the ethernet ports to work
correctly on the LPi4A and Ahead when applied to linux-next.
The dwmac-thead driver in this series does not need the pinctrl-th1520
driver to build. Nor does the thead,th1520-gmac.yaml binding need the
pinctrl binding to pass the schema check.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/arch/riscv/boot/dts/thead/th1520.dtsi
====================
Link: https://patch.msgid.link/20241103-th1520-gmac-v7-0-ef094a30169c@tenstorrent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add dwmac glue driver to support the DesignWare-based GMAC controllers
on the T-HEAD TH1520 SoC.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
Link: https://patch.msgid.link/20241103-th1520-gmac-v7-2-ef094a30169c@tenstorrent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add documentation to describe the DesginWare-based GMAC controllers in
the T-HEAD TH1520 SoC.
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Signed-off-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
Link: https://patch.msgid.link/20241103-th1520-gmac-v7-1-ef094a30169c@tenstorrent.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After e31a9fedc7d8 ("Revert "r8169: disable ASPM during NAPI poll"")
these locks aren't needed any longer.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/680f2606-ac7d-4ced-8694-e5033855da9b@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
DP83848 datasheet (section 4.7.2) indicates that the reset pin should be
toggled after the clocks are running. Add the PHY_RST_AFTER_CLK_EN to
make sure that this indication is respected.
In my experience not having this flag enabled would lead to, on some
boots, the wrong MII mode being selected if the PHY was initialized on
the bootloader and was receiving data during Linux boot.
Signed-off-by: Diogo Silva <diogompaissilva@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Fixes: 34e45ad9378c ("net: phy: dp83848: Add TI DP83848 Ethernet PHY")
Link: https://patch.msgid.link/20241102151504.811306-1-paissilva@ld-100007.ds1.internal
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Lothar Rubusch says:
====================
Add support for Synopsis DesignWare version 3.72a
Add compatibility and dt-binding for Synopsis DesignWare version 3.72a.
The dwmac is used on some older Altera/Intel SoCs such as Arria10.
Updating compatibles in the driver and bindings for the DT improves the
binding check coverage for such SoCs.
====================
Link: https://patch.msgid.link/20241102114122.4631-1-l.rubusch@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The hard processor system (HPS) on the Intel/Altera Arria10 provides
three Ethernet Media Access Controller (EMAC) peripherals. Each EMAC
can be used to transmit and receive data at 10/100/1000 Mbps over
ethernet connections in compliance with the IEEE 802.3 specification.
The EMACs on the Arria10 are instances of the Synopsis DesignWare
Universal 10/100/1000 Ethernet MAC, version 3.72a.
Support the Synopsis DesignWare version 3.72a, which is used in Intel's
Arria10 SoC, since it was missing.
Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20241102114122.4631-3-l.rubusch@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The dwmac 3.72a is an ip version that can be found on Intel/Altera Arria10
SoCs. Going by the hardware features "snps,multicast-filter-bins" and
"snps,perfect-filter-entries" shall be supported. Thus add a
compatibility flag, and extend coverage of the driver for the 3.72a.
Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241102114122.4631-2-l.rubusch@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The latter is the preferred way to copy ethtool strings.
Avoids manually incrementing the pointer. Cleans up the code quite well.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20241101220023.290926-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clearing the secpath for internal ports will cause packet drops when
ipsec offload or early SW ipsec decrypt are used. Systems that rely
on these will not be able to actually pass traffic via openvswitch.
There is still an open issue for a flow miss packet - this is because
we drop the extensions during upcall and there is no facility to
restore such data (and it is non-trivial to add such functionality
to the upcall interface). That means that when a flow miss occurs,
there will still be packet drops. With this patch, when a flow is
found then traffic which has an associated xfrm extension will
properly flow.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://patch.msgid.link/20241101204732.183840-1-aconole@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace the register addresses with the names used in r8125/r8126
vendor driver, and consider that RSS_CTRL_8125 is a 32 bit register.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/3bf2f340-b369-4174-97bf-fd38d4217492@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Dr. David Alan Gilbert says:
====================
A pile of sfc deadcode
This is a collection of deadcode removal in the sfc
drivers; the split is vaguely where I found them in
the tree, with some left over.
This has been build tested and booted on an x86 VM,
but I fon't have the hardware to test; however
it's all full function removal.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
====================
Link: https://patch.msgid.link/20241102151625.39535-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
efx_ticks_to_usecs(), efx_reconfigure_port(), efx_ptp_get_mode(), and
efx_tx_get_copy_buffer_limited() are unused.
They seem to be partially due to the later splits to Siena, but
some seem unused for longer.
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/20241102151625.39535-5-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
efx_mcdi_flush_rxqs(), efx_mcdi_rpc_async_quiet(),
efx_mcdi_rpc_finish_quiet(), and efx_mcdi_wol_filter_get_magic()
are unused.
I think these are fall out from the split into Siena
that happened in
commit 4d49e5cd4b09 ("sfc/siena: Rename functions in mcdi headers to avoid
conflicts with sfc")
and
commit d48523cb88e0 ("sfc: Copy shared files needed for Siena (part 2)")
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/20241102151625.39535-4-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
efx_mae_mport_vf() has been unused since
commit 5227adff37af ("sfc: add mport lookup based on driver's mport data")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/20241102151625.39535-3-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ef4_farch_dimension_resources(), ef4_nic_fix_nodesc_drop_stat(),
ef4_ticks_to_usecs() and ef4_tx_get_copy_buffer_limited() were
copied over from efx_ equivalents in 2016 but never used by
commit 5a6681e22c14 ("sfc: separate out SFC4000 ("Falcon") support into new
sfc-falcon driver")
EF4_MAX_FLUSH_TIME is also unused.
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/20241102151625.39535-2-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This commit fix a typographical error in netlink nlmsg_type constants definition in the include/uapi/linux/rtnetlink.h at line 177. The definition is RTM_NEWNVLAN RTM_NEWVLAN instead of RTM_NEWVLAN RTM_NEWVLAN.
Signed-off-by: Maurice Lambert <mauricelambert434@gmail.com>
Fixes: 8dcea187088b ("net: bridge: vlan: add rtm definitions and dump support")
Link: https://patch.msgid.link/20241103223950.230300-1-mauricelambert434@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We can see high contention on ptp_lock while doing RX timestamping
on high packet rates over several queues. Spinlock is not effecient
to protect timecounter for RX timestamps when reads are the most
usual operations and writes are only occasional. It's better to use
seqlock in such cases.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://patch.msgid.link/20241103215108.557531-2-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This hardware can provide only 48 bits of cycle counter. We can leave
only 24 bits in the cache to extend RX timestamps from 32 bits to 48
bits. Lower 8 bits of the cached value will be used to check for
roll-over while extending to full 48 bits.
This change makes cache writes atomic even on 32 bit platforms and we
can simply use READ_ONCE()/WRITE_ONCE() pair and remove spinlock. The
configuration structure will be also reduced by 4 bytes.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://patch.msgid.link/20241103215108.557531-1-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Recently, the net/lib.sh file has been modified to include defer.sh from
net/lib/sh/ directory. The Makefile from net/lib has been modified
accordingly, but not the ones from the sub-targets using net/lib.sh.
Because of that, the new file is not installed as expected when
installing the Forwarding, MPTCP, and Netfilter targets, e.g.
# make -C tools/testing/selftests TARGETS=net/mptcp install \
INSTALL_PATH=/tmp/kself
# cd /tmp/kself/
# ./run_kselftest.sh -c net/mptcp
TAP version 13
1..7
# timeout set to 1800
# selftests: net/mptcp: mptcp_connect.sh
# ./../lib.sh: line 5: /tmp/kself/net/lib/sh/defer.sh: No such file
or directory
# (...)
This can be fixed simply by adding all the .sh files from net/lib/sh
directory to the TEST_INCLUDES variable in the different Makefile's.
Fixes: a6e263f125cd ("selftests: net: lib: Introduce deferred commands")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20241104-net-next-selftests-lib-sh-deps-v1-1-7c9f7d939fc2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The gettimex64() doesn't modify values in timecounter, that's why there
is no need to update sequence counter. Reduce the contention on sequence
lock for multi-thread PHC reading use-case.
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241014170103.2473580-1-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update of stateful object triggers:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:7759 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by nft/3060:
#0: ffff88810f0578c8 (&nft_net->commit_mutex){+.+.}-{4:4}, [..]
... but this list is not protected by the transaction mutex but the
nfnl nftables subsystem mutex.
Switch to nft_obj_type_get which will acquire rcu read lock,
bump refcount, and returns the result.
v3: Dan Carpenter points out nft_obj_type_get returns error pointer, not
NULL, on error.
Fixes: dad3bdeef45f ("netfilter: nf_tables: fix memory leak during stateful obj update").
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
type list
nft shell tests trigger:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:3125 RCU-list traversed in non-reader section!!
1 lock held by nft/2068:
#0: ffff888106c6f8c8 (&nft_net->commit_mutex){+.+.}-{4:4}, at: nf_tables_valid_genid+0x3c/0xf0
But the transaction mutex doesn't protect this list, the nfnl subsystem
mutex would, but we can't acquire it here without risk of ABBA
deadlocks.
Acquire the rcu read lock to avoid this issue.
v3: add a comment that explains the ->inner_ops check implies
expression is builtin and lack of a module owner reference is ok.
Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Like previous patches: iteration is ok if the list cannot be altered in
parallel.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Its not possible to add or delete elements from hash and bitmap sets,
as long as caller is holding the transaction mutex, so its ok to iterate
the list outside of rcu read side critical section.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The transaction mutex prevents concurrent add/delete, its ok to iterate
those lists outside of rcu read side critical sections.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Same as previous patch. All set handling functions here can be called
with transaction mutex held (but not the rcu read lock).
The transaction mutex prevents concurrent add/delete, so this is fine.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
On rule delete we get:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:3420 RCU-list traversed in non-reader section!!
1 lock held by iptables/134:
#0: ffff888008c4fcc8 (&nft_net->commit_mutex){+.+.}-{3:3}, at: nf_tables_valid_genid (include/linux/jiffies.h:101) nf_tables
Code is fine, no other CPU can change the list because we're holding
transaction mutex.
Pass the needed lockdep annotation to the iterator and fix
two comments for functions that are no longer restricted to rcu-only
context.
This is enough to resolve rule delete, but there are several other
missing annotations, added in followup-patches.
Fixes: 28875945ba98 ("rcu: Add support for consolidated-RCU reader checking")
Reported-by: Matthieu Baerts <matttbe@kernel.org>
Tested-by: Matthieu Baerts <matttbe@kernel.org>
Closes: https://lore.kernel.org/netfilter-devel/da27f17f-3145-47af-ad0f-7fd2a823623e@kernel.org/
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Roger Quadros says:
====================
net: ethernet: ti: am65-cpsw: Fixes to multi queue RX feature
On J7 platforms, setting up multiple RX flows was failing
as the RX free descriptor ring 0 is shared among all flows
and we did not allocate enough elements in the RX free descriptor
ring 0 to accommodate for all RX flows. Patch 1 fixes this.
The second patch fixes a warning if there was any error in
am65_cpsw_nuss_init_rx_chns() and am65_cpsw_nuss_cleanup_rx_chns()
was called after that.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
====================
Link: https://patch.msgid.link/20241101-am65-cpsw-multi-rx-j7-fix-v3-0-338fdd6a55da@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
flow->irq is initialized to 0 which is a valid IRQ. Set it to -EINVAL
in error path of am65_cpsw_nuss_init_rx_chns() so we do not try
to free an unallocated IRQ in am65_cpsw_nuss_remove_rx_chns().
If user tried to change number of RX queues and am65_cpsw_nuss_init_rx_chns()
failed due to any reason, the warning will happen if user tries to change
the number of RX queues after the error condition.
root@am62xx-evm:~# ethtool -L eth0 rx 3
[ 40.385293] am65-cpsw-nuss 8000000.ethernet: set new flow-id-base 19
[ 40.393211] am65-cpsw-nuss 8000000.ethernet: Failed to init rx flow2
netlink error: Invalid argument
root@am62xx-evm:~# ethtool -L eth0 rx 2
[ 82.306427] ------------[ cut here ]------------
[ 82.311075] WARNING: CPU: 0 PID: 378 at kernel/irq/devres.c:144 devm_free_irq+0x84/0x90
[ 82.469770] Call trace:
[ 82.472208] devm_free_irq+0x84/0x90
[ 82.475777] am65_cpsw_nuss_remove_rx_chns+0x6c/0xac [ti_am65_cpsw_nuss]
[ 82.482487] am65_cpsw_nuss_update_tx_rx_chns+0x2c/0x9c [ti_am65_cpsw_nuss]
[ 82.489442] am65_cpsw_set_channels+0x30/0x4c [ti_am65_cpsw_nuss]
[ 82.495531] ethnl_set_channels+0x224/0x2dc
[ 82.499713] ethnl_default_set_doit+0xb8/0x1b8
[ 82.504149] genl_family_rcv_msg_doit+0xc0/0x124
[ 82.508757] genl_rcv_msg+0x1f0/0x284
[ 82.512409] netlink_rcv_skb+0x58/0x130
[ 82.516239] genl_rcv+0x38/0x50
[ 82.519374] netlink_unicast+0x1d0/0x2b0
[ 82.523289] netlink_sendmsg+0x180/0x3c4
[ 82.527205] __sys_sendto+0xe4/0x158
[ 82.530779] __arm64_sys_sendto+0x28/0x38
[ 82.534782] invoke_syscall+0x44/0x100
[ 82.538526] el0_svc_common.constprop.0+0xc0/0xe0
[ 82.543221] do_el0_svc+0x1c/0x28
[ 82.546528] el0_svc+0x28/0x98
[ 82.549578] el0t_64_sync_handler+0xc0/0xc4
[ 82.553752] el0t_64_sync+0x190/0x194
[ 82.557407] ---[ end trace 0000000000000000 ]---
Fixes: da70d184a8c3 ("net: ethernet: ti: am65-cpsw: Introduce multi queue Rx")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
On J7 platforms, setting up multiple RX flows was failing
as the RX free descriptor ring 0 is shared among all flows
and we did not allocate enough elements in the RX free descriptor
ring 0 to accommodate for all RX flows.
This issue is not present on AM62 as separate pair of
rings are used for free and completion rings for each flow.
Fix this by allocating enough elements for RX free descriptor
ring 0.
However, we can no longer rely on desc_idx (descriptor based
offsets) to identify the pages in the respective flows as
free descriptor ring includes elements for all flows.
To solve this, introduce a new swdata data structure to store
flow_id and page. This can be used to identify which flow (page_pool)
and page the descriptor belonged to when popped out of the
RX rings.
Fixes: da70d184a8c3 ("net: ethernet: ti: am65-cpsw: Introduce multi queue Rx")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When the driver is uninstalled and the VF is disabled concurrently, a
kernel crash occurs. The reason is that the two actions call function
pci_disable_sriov(). The num_VFs is checked to determine whether to
release the corresponding resources. During the second calling, num_VFs
is not 0 and the resource release function is called. However, the
corresponding resource has been released during the first invoking.
Therefore, the problem occurs:
[15277.839633][T50670] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
...
[15278.131557][T50670] Call trace:
[15278.134686][T50670] klist_put+0x28/0x12c
[15278.138682][T50670] klist_del+0x14/0x20
[15278.142592][T50670] device_del+0xbc/0x3c0
[15278.146676][T50670] pci_remove_bus_device+0x84/0x120
[15278.151714][T50670] pci_stop_and_remove_bus_device+0x6c/0x80
[15278.157447][T50670] pci_iov_remove_virtfn+0xb4/0x12c
[15278.162485][T50670] sriov_disable+0x50/0x11c
[15278.166829][T50670] pci_disable_sriov+0x24/0x30
[15278.171433][T50670] hnae3_unregister_ae_algo_prepare+0x60/0x90 [hnae3]
[15278.178039][T50670] hclge_exit+0x28/0xd0 [hclge]
[15278.182730][T50670] __se_sys_delete_module.isra.0+0x164/0x230
[15278.188550][T50670] __arm64_sys_delete_module+0x1c/0x30
[15278.193848][T50670] invoke_syscall+0x50/0x11c
[15278.198278][T50670] el0_svc_common.constprop.0+0x158/0x164
[15278.203837][T50670] do_el0_svc+0x34/0xcc
[15278.207834][T50670] el0_svc+0x20/0x30
For details, see the following figure.
rmmod hclge disable VFs
----------------------------------------------------
hclge_exit() sriov_numvfs_store()
... device_lock()
pci_disable_sriov() hns3_pci_sriov_configure()
pci_disable_sriov()
sriov_disable()
sriov_disable() if !num_VFs :
if !num_VFs : return;
return; sriov_del_vfs()
sriov_del_vfs() ...
... klist_put()
klist_put() ...
... num_VFs = 0;
num_VFs = 0; device_unlock();
In this patch, when driver is removing, we get the device_lock()
to protect num_VFs, just like sriov_numvfs_store().
Fixes: 0dd8a25f355b ("net: hns3: disable sriov before unload hclge layer")
Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241101091507.3644584-1-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Daniel Machon says:
====================
net: lan969x: add VCAP functionality
== Description:
This series is the third of a multi-part series, that prepares and adds
support for the new lan969x switch driver.
The upstreaming efforts is split into multiple series (might change a
bit as we go along):
1) Prepare the Sparx5 driver for lan969x (merged)
2) Add support for lan969x (same basic features as Sparx5
provides excl. FDMA and VCAP, merged).
--> 3) Add lan969x VCAP functionality.
4) Add RGMII and FDMA functionality.
== VCAP support:
The Versatile Content-Aware Processor (VCAP) is a content-aware packet
processor that allows wirespeed packet inspection for rich
implementation of, for example, advanced VLAN and QoS classification and
manipulations, IP source guarding, longest prefix matching for Layer-3
routing, and security features for wireline and wireless applications.
This is all achieved by programming rules into the VCAP.
When a VCAP is enabled, every frame passing through the switch is
analyzed and multiple keys are created based on the contents of the
frame. The frame is examined to determine the frame type (for example,
IPv4 TCP frame), so that the frame information is extracted according to
the frame type, port-specific configuration, and classification results
from the basic classification. Keys are applied to the VCAP and when
there is a match between a key and a rule in the VCAP, the rule is then
applied to the frame from which the key was extracted.
After this series is applied, the lan969x driver will support the same
VCAP functionality as Sparx5.
== Patch breakdown:
Patch #1 exposes some VCAP symbols for lan969x.
Patch #2 replaces VCAP uses of SPX5_PORTS with n_ports from the match
data.
Patch #3 adds new VCAP constants to match data
Patch #4 removes the is_sparx5() check to now initialize the VCAP API on
lan969x.
Patch #5 adds the auto-generated VCAP data for lan969x.
Patch #6 adds the VCAP configuration data for lan969x.
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
====================
Link: https://patch.msgid.link/20241101-sparx5-lan969x-switch-driver-3-v1-0-3c76f22f4bfa@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add configuration data (for consumption by the VCAP API) for the four
VCAP's that we are going to support. The following VCAP's will be
supported:
- VCAP CLM: (also known as IS0) is part of the analyzer and enables
frame classification using VCAP functionality.
- VCAP IS2: is part of ANA_ACL and enables access control lists, using
VCAP functionality.
- VCAP ES0: is part of the rewriter and enables rewriting of frames
using VCAP functionality.
- VCAP ES2: is part of EACL and enables egress access control lists
using VCAP functionality
The two VCAP's: CLM and IS2 use shared resources from the SUPER VCAP.
The SUPER VCAP is a shared pool of 6 blocks that can be distributed
freely among CLM and IS2. Each block in the pool has 3,072 addresses
with entries, actions, and counters. ES0 and ES2 does not use shared
resources.
In the configuration data for lan969x CLM uses blocks 2-4 with a total
of 6 lookups. IS2 uses blocks 0-1 with a total of 4 lookups.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Platform VCAP data for each VCAP instance is auto-generated using an
internal Microchip tool. The generated VCAP data contains information
about keyfields, keyfield sets, actionfields, actionfield sets and
typegroups, which in combination are used to encode and decode rules in
the VCAP.
Add the auto-generated VCAP file lan969x_vcap_ag_api.c and assign the
two structs: lan969x_vcaps and lan969x_vcap_stats to the match data.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The is_sparx5() check was introduced in an earlier series, to make sure
the sparx5_vcap_init() was not executed on lan969x, as it was not
implemented there yet. Now that it is, remove that check.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In preparation for lan969x VCAP support, add the following three new
VCAP constants to match data:
- vcaps_cfg (contains configuration data for each VCAP).
- vcaps (contains auto-generated information about VCAP keys and
actions).
- vcap_stats: (contains auto-generated string names of all the keys
and actions)
Add these constants to the Sparx5 match data constants and use them to
initialize the VCAP's in sparx5_vcap_init().
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The Sparx5 VCAP implementation uses the SPX5_PORTS symbol to iterate over
the 65 front ports of Sparx5. Replace the use with the n_ports constant
from the match data, which translates to 65 of Sparx5 and 30 on lan969x.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In preparation for lan969x VCAP support, expose the following symbols for
use by the lan969x VCAP implementation:
- The symbols SPARX5_*_LOOKUPS defines the number of lookups in each
VCAP instance. These are the same for lan969x. Move them to the
header file.
- The struct sparx5_vcap_inst encapsulates information about a single
VCAP instance. Move this struct to the header file and declare the
sparx5_vcap_inst_cfg as extern.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Xuan Zhuo says:
====================
virtio_net: enable premapped mode by default
v1:
1. fix some small problems
2. remove commit "virtio_net: introduce vi->mode"
In the last linux version, we disabled this feature to fix the
regress[1].
The patch set is try to fix the problem and re-enable it.
More info: http://lore.kernel.org/all/20240820071913.68004-1-xuanzhuo@linux.alibaba.com
[1]: http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com
====================
Link: https://patch.msgid.link/20241029084615.91049-1-xuanzhuo@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Now, the premapped mode can be enabled unconditionally.
So we can remove the failover code for merge and small mode.
The virtnet_rq_xxx() helper would be only used if the mode is using pre
mapping. A check is added to prevent misusing of these API.
Tested-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently, the virtio core will perform a dma operation for each
buffer. Although, the same page may be operated multiple times.
In premapped mod, we can perform only one dma operation for the pages of
the alloc frag. This is beneficial for the iommu device.
kernel command line: intel_iommu=on iommu.passthrough=0
| strict=0 | strict=1
Before | 775496pps | 428614pps
After | 1109316pps | 742853pps
In the 6.11, we disabled this feature because a regress [1].
Now, we fix the problem and re-enable it.
[1]: http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com
Tested-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The virtio-net big mode did not enable premapped mode,
so we did not need to check the unmap. And the subsequent
commit will remove the failover code for failing enable
premapped for merge and small mode. So we need to remove
the checking do_dma code in the big mode path.
Tested-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When the frag just got a page, then may lead to regression on VM.
Specially if the sysctl net.core.high_order_alloc_disable value is 1,
then the frag always get a page when do refill.
Which could see reliable crashes or scp failure (scp a file 100M in size
to VM).
The issue is that the virtnet_rq_dma takes up 16 bytes at the beginning
of a new frag. When the frag size is larger than PAGE_SIZE,
everything is fine. However, if the frag is only one page and the
total size of the buffer and virtnet_rq_dma is larger than one page, an
overflow may occur.
The commit f9dac92ba908 ("virtio_ring: enable premapped mode whatever
use_dma_api") introduced this problem. And we reverted some commits to
fix this in last linux version. Now we try to enable it and fix this
bug directly.
Here, when the frag size is not enough, we reduce the buffer len to fix
this problem.
Reported-by: "Si-Wei Liu" <si-wei.liu@oracle.com>
Tested-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Vladimir Oltean says:
====================
Fix sparse warnings in dpaa_eth driver
This is a follow-up of the discussion at:
https://lore.kernel.org/oe-kbuild-all/20241028-sticky-refined-lionfish-b06c0c@leitao/
where I said I would take care of the sparse warnings uncovered by
Breno's COMPILE_TEST change for the dpaa_eth driver.
There was one warning that I decided to treat as an actual bug:
https://lore.kernel.org/netdev/20241029163105.44135-1-vladimir.oltean@nxp.com/
and what remains here are those warnings which I consider harmless.
I would like Christophe to ack the entire series to be taken through
netdev. I find it weird that the qbman driver, whose major API consumer
is netdev, is maintained by a different group. In this case, the buggy
qm_sg_entry_get_off() function is defined in qbman but exclusively
called in netdev.
====================
Link: https://patch.msgid.link/20241029164317.50182-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Sparse provides the following output:
warning: cast to restricted __be32
This is a harmless warning due to the fact that we dereference the hash
stored in the FD using an incorrect type annotation. Suppress the
warning by using the correct __be32 type instead of u32. No functional
change.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20241029164317.50182-4-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|