Age | Commit message (Collapse) | Author |
|
DP83848 datasheet (section 4.7.2) indicates that the reset pin should be
toggled after the clocks are running. Add the PHY_RST_AFTER_CLK_EN to
make sure that this indication is respected.
In my experience not having this flag enabled would lead to, on some
boots, the wrong MII mode being selected if the PHY was initialized on
the bootloader and was receiving data during Linux boot.
Signed-off-by: Diogo Silva <diogompaissilva@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Fixes: 34e45ad9378c ("net: phy: dp83848: Add TI DP83848 Ethernet PHY")
Link: https://patch.msgid.link/20241102151504.811306-1-paissilva@ld-100007.ds1.internal
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Lothar Rubusch says:
====================
Add support for Synopsis DesignWare version 3.72a
Add compatibility and dt-binding for Synopsis DesignWare version 3.72a.
The dwmac is used on some older Altera/Intel SoCs such as Arria10.
Updating compatibles in the driver and bindings for the DT improves the
binding check coverage for such SoCs.
====================
Link: https://patch.msgid.link/20241102114122.4631-1-l.rubusch@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The hard processor system (HPS) on the Intel/Altera Arria10 provides
three Ethernet Media Access Controller (EMAC) peripherals. Each EMAC
can be used to transmit and receive data at 10/100/1000 Mbps over
ethernet connections in compliance with the IEEE 802.3 specification.
The EMACs on the Arria10 are instances of the Synopsis DesignWare
Universal 10/100/1000 Ethernet MAC, version 3.72a.
Support the Synopsis DesignWare version 3.72a, which is used in Intel's
Arria10 SoC, since it was missing.
Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com>
Acked-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://patch.msgid.link/20241102114122.4631-3-l.rubusch@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The dwmac 3.72a is an ip version that can be found on Intel/Altera Arria10
SoCs. Going by the hardware features "snps,multicast-filter-bins" and
"snps,perfect-filter-entries" shall be supported. Thus add a
compatibility flag, and extend coverage of the driver for the 3.72a.
Signed-off-by: Lothar Rubusch <l.rubusch@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241102114122.4631-2-l.rubusch@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The latter is the preferred way to copy ethtool strings.
Avoids manually incrementing the pointer. Cleans up the code quite well.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20241101220023.290926-1-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clearing the secpath for internal ports will cause packet drops when
ipsec offload or early SW ipsec decrypt are used. Systems that rely
on these will not be able to actually pass traffic via openvswitch.
There is still an open issue for a flow miss packet - this is because
we drop the extensions during upcall and there is no facility to
restore such data (and it is non-trivial to add such functionality
to the upcall interface). That means that when a flow miss occurs,
there will still be packet drops. With this patch, when a flow is
found then traffic which has an associated xfrm extension will
properly flow.
Signed-off-by: Aaron Conole <aconole@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Link: https://patch.msgid.link/20241101204732.183840-1-aconole@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace the register addresses with the names used in r8125/r8126
vendor driver, and consider that RSS_CTRL_8125 is a 32 bit register.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://patch.msgid.link/3bf2f340-b369-4174-97bf-fd38d4217492@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Dr. David Alan Gilbert says:
====================
A pile of sfc deadcode
This is a collection of deadcode removal in the sfc
drivers; the split is vaguely where I found them in
the tree, with some left over.
This has been build tested and booted on an x86 VM,
but I fon't have the hardware to test; however
it's all full function removal.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
====================
Link: https://patch.msgid.link/20241102151625.39535-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
efx_ticks_to_usecs(), efx_reconfigure_port(), efx_ptp_get_mode(), and
efx_tx_get_copy_buffer_limited() are unused.
They seem to be partially due to the later splits to Siena, but
some seem unused for longer.
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/20241102151625.39535-5-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
efx_mcdi_flush_rxqs(), efx_mcdi_rpc_async_quiet(),
efx_mcdi_rpc_finish_quiet(), and efx_mcdi_wol_filter_get_magic()
are unused.
I think these are fall out from the split into Siena
that happened in
commit 4d49e5cd4b09 ("sfc/siena: Rename functions in mcdi headers to avoid
conflicts with sfc")
and
commit d48523cb88e0 ("sfc: Copy shared files needed for Siena (part 2)")
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/20241102151625.39535-4-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
efx_mae_mport_vf() has been unused since
commit 5227adff37af ("sfc: add mport lookup based on driver's mport data")
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/20241102151625.39535-3-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ef4_farch_dimension_resources(), ef4_nic_fix_nodesc_drop_stat(),
ef4_ticks_to_usecs() and ef4_tx_get_copy_buffer_limited() were
copied over from efx_ equivalents in 2016 but never used by
commit 5a6681e22c14 ("sfc: separate out SFC4000 ("Falcon") support into new
sfc-falcon driver")
EF4_MAX_FLUSH_TIME is also unused.
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/20241102151625.39535-2-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This commit fix a typographical error in netlink nlmsg_type constants definition in the include/uapi/linux/rtnetlink.h at line 177. The definition is RTM_NEWNVLAN RTM_NEWVLAN instead of RTM_NEWVLAN RTM_NEWVLAN.
Signed-off-by: Maurice Lambert <mauricelambert434@gmail.com>
Fixes: 8dcea187088b ("net: bridge: vlan: add rtm definitions and dump support")
Link: https://patch.msgid.link/20241103223950.230300-1-mauricelambert434@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We can see high contention on ptp_lock while doing RX timestamping
on high packet rates over several queues. Spinlock is not effecient
to protect timecounter for RX timestamps when reads are the most
usual operations and writes are only occasional. It's better to use
seqlock in such cases.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://patch.msgid.link/20241103215108.557531-2-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This hardware can provide only 48 bits of cycle counter. We can leave
only 24 bits in the cache to extend RX timestamps from 32 bits to 48
bits. Lower 8 bits of the cached value will be used to check for
roll-over while extending to full 48 bits.
This change makes cache writes atomic even on 32 bit platforms and we
can simply use READ_ONCE()/WRITE_ONCE() pair and remove spinlock. The
configuration structure will be also reduced by 4 bytes.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://patch.msgid.link/20241103215108.557531-1-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Recently, the net/lib.sh file has been modified to include defer.sh from
net/lib/sh/ directory. The Makefile from net/lib has been modified
accordingly, but not the ones from the sub-targets using net/lib.sh.
Because of that, the new file is not installed as expected when
installing the Forwarding, MPTCP, and Netfilter targets, e.g.
# make -C tools/testing/selftests TARGETS=net/mptcp install \
INSTALL_PATH=/tmp/kself
# cd /tmp/kself/
# ./run_kselftest.sh -c net/mptcp
TAP version 13
1..7
# timeout set to 1800
# selftests: net/mptcp: mptcp_connect.sh
# ./../lib.sh: line 5: /tmp/kself/net/lib/sh/defer.sh: No such file
or directory
# (...)
This can be fixed simply by adding all the .sh files from net/lib/sh
directory to the TEST_INCLUDES variable in the different Makefile's.
Fixes: a6e263f125cd ("selftests: net: lib: Introduce deferred commands")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20241104-net-next-selftests-lib-sh-deps-v1-1-7c9f7d939fc2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The gettimex64() doesn't modify values in timecounter, that's why there
is no need to update sequence counter. Reduce the contention on sequence
lock for multi-thread PHC reading use-case.
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241014170103.2473580-1-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update of stateful object triggers:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:7759 RCU-list traversed in non-reader section!!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by nft/3060:
#0: ffff88810f0578c8 (&nft_net->commit_mutex){+.+.}-{4:4}, [..]
... but this list is not protected by the transaction mutex but the
nfnl nftables subsystem mutex.
Switch to nft_obj_type_get which will acquire rcu read lock,
bump refcount, and returns the result.
v3: Dan Carpenter points out nft_obj_type_get returns error pointer, not
NULL, on error.
Fixes: dad3bdeef45f ("netfilter: nf_tables: fix memory leak during stateful obj update").
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
type list
nft shell tests trigger:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:3125 RCU-list traversed in non-reader section!!
1 lock held by nft/2068:
#0: ffff888106c6f8c8 (&nft_net->commit_mutex){+.+.}-{4:4}, at: nf_tables_valid_genid+0x3c/0xf0
But the transaction mutex doesn't protect this list, the nfnl subsystem
mutex would, but we can't acquire it here without risk of ABBA
deadlocks.
Acquire the rcu read lock to avoid this issue.
v3: add a comment that explains the ->inner_ops check implies
expression is builtin and lack of a module owner reference is ok.
Fixes: 3a07327d10a0 ("netfilter: nft_inner: support for inner tunnel header matching")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Like previous patches: iteration is ok if the list cannot be altered in
parallel.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Its not possible to add or delete elements from hash and bitmap sets,
as long as caller is holding the transaction mutex, so its ok to iterate
the list outside of rcu read side critical section.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
The transaction mutex prevents concurrent add/delete, its ok to iterate
those lists outside of rcu read side critical sections.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Same as previous patch. All set handling functions here can be called
with transaction mutex held (but not the rcu read lock).
The transaction mutex prevents concurrent add/delete, so this is fine.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
On rule delete we get:
WARNING: suspicious RCU usage
net/netfilter/nf_tables_api.c:3420 RCU-list traversed in non-reader section!!
1 lock held by iptables/134:
#0: ffff888008c4fcc8 (&nft_net->commit_mutex){+.+.}-{3:3}, at: nf_tables_valid_genid (include/linux/jiffies.h:101) nf_tables
Code is fine, no other CPU can change the list because we're holding
transaction mutex.
Pass the needed lockdep annotation to the iterator and fix
two comments for functions that are no longer restricted to rcu-only
context.
This is enough to resolve rule delete, but there are several other
missing annotations, added in followup-patches.
Fixes: 28875945ba98 ("rcu: Add support for consolidated-RCU reader checking")
Reported-by: Matthieu Baerts <matttbe@kernel.org>
Tested-by: Matthieu Baerts <matttbe@kernel.org>
Closes: https://lore.kernel.org/netfilter-devel/da27f17f-3145-47af-ad0f-7fd2a823623e@kernel.org/
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Roger Quadros says:
====================
net: ethernet: ti: am65-cpsw: Fixes to multi queue RX feature
On J7 platforms, setting up multiple RX flows was failing
as the RX free descriptor ring 0 is shared among all flows
and we did not allocate enough elements in the RX free descriptor
ring 0 to accommodate for all RX flows. Patch 1 fixes this.
The second patch fixes a warning if there was any error in
am65_cpsw_nuss_init_rx_chns() and am65_cpsw_nuss_cleanup_rx_chns()
was called after that.
Signed-off-by: Roger Quadros <rogerq@kernel.org>
====================
Link: https://patch.msgid.link/20241101-am65-cpsw-multi-rx-j7-fix-v3-0-338fdd6a55da@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
flow->irq is initialized to 0 which is a valid IRQ. Set it to -EINVAL
in error path of am65_cpsw_nuss_init_rx_chns() so we do not try
to free an unallocated IRQ in am65_cpsw_nuss_remove_rx_chns().
If user tried to change number of RX queues and am65_cpsw_nuss_init_rx_chns()
failed due to any reason, the warning will happen if user tries to change
the number of RX queues after the error condition.
root@am62xx-evm:~# ethtool -L eth0 rx 3
[ 40.385293] am65-cpsw-nuss 8000000.ethernet: set new flow-id-base 19
[ 40.393211] am65-cpsw-nuss 8000000.ethernet: Failed to init rx flow2
netlink error: Invalid argument
root@am62xx-evm:~# ethtool -L eth0 rx 2
[ 82.306427] ------------[ cut here ]------------
[ 82.311075] WARNING: CPU: 0 PID: 378 at kernel/irq/devres.c:144 devm_free_irq+0x84/0x90
[ 82.469770] Call trace:
[ 82.472208] devm_free_irq+0x84/0x90
[ 82.475777] am65_cpsw_nuss_remove_rx_chns+0x6c/0xac [ti_am65_cpsw_nuss]
[ 82.482487] am65_cpsw_nuss_update_tx_rx_chns+0x2c/0x9c [ti_am65_cpsw_nuss]
[ 82.489442] am65_cpsw_set_channels+0x30/0x4c [ti_am65_cpsw_nuss]
[ 82.495531] ethnl_set_channels+0x224/0x2dc
[ 82.499713] ethnl_default_set_doit+0xb8/0x1b8
[ 82.504149] genl_family_rcv_msg_doit+0xc0/0x124
[ 82.508757] genl_rcv_msg+0x1f0/0x284
[ 82.512409] netlink_rcv_skb+0x58/0x130
[ 82.516239] genl_rcv+0x38/0x50
[ 82.519374] netlink_unicast+0x1d0/0x2b0
[ 82.523289] netlink_sendmsg+0x180/0x3c4
[ 82.527205] __sys_sendto+0xe4/0x158
[ 82.530779] __arm64_sys_sendto+0x28/0x38
[ 82.534782] invoke_syscall+0x44/0x100
[ 82.538526] el0_svc_common.constprop.0+0xc0/0xe0
[ 82.543221] do_el0_svc+0x1c/0x28
[ 82.546528] el0_svc+0x28/0x98
[ 82.549578] el0t_64_sync_handler+0xc0/0xc4
[ 82.553752] el0t_64_sync+0x190/0x194
[ 82.557407] ---[ end trace 0000000000000000 ]---
Fixes: da70d184a8c3 ("net: ethernet: ti: am65-cpsw: Introduce multi queue Rx")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
On J7 platforms, setting up multiple RX flows was failing
as the RX free descriptor ring 0 is shared among all flows
and we did not allocate enough elements in the RX free descriptor
ring 0 to accommodate for all RX flows.
This issue is not present on AM62 as separate pair of
rings are used for free and completion rings for each flow.
Fix this by allocating enough elements for RX free descriptor
ring 0.
However, we can no longer rely on desc_idx (descriptor based
offsets) to identify the pages in the respective flows as
free descriptor ring includes elements for all flows.
To solve this, introduce a new swdata data structure to store
flow_id and page. This can be used to identify which flow (page_pool)
and page the descriptor belonged to when popped out of the
RX rings.
Fixes: da70d184a8c3 ("net: ethernet: ti: am65-cpsw: Introduce multi queue Rx")
Signed-off-by: Roger Quadros <rogerq@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When the driver is uninstalled and the VF is disabled concurrently, a
kernel crash occurs. The reason is that the two actions call function
pci_disable_sriov(). The num_VFs is checked to determine whether to
release the corresponding resources. During the second calling, num_VFs
is not 0 and the resource release function is called. However, the
corresponding resource has been released during the first invoking.
Therefore, the problem occurs:
[15277.839633][T50670] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
...
[15278.131557][T50670] Call trace:
[15278.134686][T50670] klist_put+0x28/0x12c
[15278.138682][T50670] klist_del+0x14/0x20
[15278.142592][T50670] device_del+0xbc/0x3c0
[15278.146676][T50670] pci_remove_bus_device+0x84/0x120
[15278.151714][T50670] pci_stop_and_remove_bus_device+0x6c/0x80
[15278.157447][T50670] pci_iov_remove_virtfn+0xb4/0x12c
[15278.162485][T50670] sriov_disable+0x50/0x11c
[15278.166829][T50670] pci_disable_sriov+0x24/0x30
[15278.171433][T50670] hnae3_unregister_ae_algo_prepare+0x60/0x90 [hnae3]
[15278.178039][T50670] hclge_exit+0x28/0xd0 [hclge]
[15278.182730][T50670] __se_sys_delete_module.isra.0+0x164/0x230
[15278.188550][T50670] __arm64_sys_delete_module+0x1c/0x30
[15278.193848][T50670] invoke_syscall+0x50/0x11c
[15278.198278][T50670] el0_svc_common.constprop.0+0x158/0x164
[15278.203837][T50670] do_el0_svc+0x34/0xcc
[15278.207834][T50670] el0_svc+0x20/0x30
For details, see the following figure.
rmmod hclge disable VFs
----------------------------------------------------
hclge_exit() sriov_numvfs_store()
... device_lock()
pci_disable_sriov() hns3_pci_sriov_configure()
pci_disable_sriov()
sriov_disable()
sriov_disable() if !num_VFs :
if !num_VFs : return;
return; sriov_del_vfs()
sriov_del_vfs() ...
... klist_put()
klist_put() ...
... num_VFs = 0;
num_VFs = 0; device_unlock();
In this patch, when driver is removing, we get the device_lock()
to protect num_VFs, just like sriov_numvfs_store().
Fixes: 0dd8a25f355b ("net: hns3: disable sriov before unload hclge layer")
Signed-off-by: Peiyang Wang <wangpeiyang1@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241101091507.3644584-1-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Daniel Machon says:
====================
net: lan969x: add VCAP functionality
== Description:
This series is the third of a multi-part series, that prepares and adds
support for the new lan969x switch driver.
The upstreaming efforts is split into multiple series (might change a
bit as we go along):
1) Prepare the Sparx5 driver for lan969x (merged)
2) Add support for lan969x (same basic features as Sparx5
provides excl. FDMA and VCAP, merged).
--> 3) Add lan969x VCAP functionality.
4) Add RGMII and FDMA functionality.
== VCAP support:
The Versatile Content-Aware Processor (VCAP) is a content-aware packet
processor that allows wirespeed packet inspection for rich
implementation of, for example, advanced VLAN and QoS classification and
manipulations, IP source guarding, longest prefix matching for Layer-3
routing, and security features for wireline and wireless applications.
This is all achieved by programming rules into the VCAP.
When a VCAP is enabled, every frame passing through the switch is
analyzed and multiple keys are created based on the contents of the
frame. The frame is examined to determine the frame type (for example,
IPv4 TCP frame), so that the frame information is extracted according to
the frame type, port-specific configuration, and classification results
from the basic classification. Keys are applied to the VCAP and when
there is a match between a key and a rule in the VCAP, the rule is then
applied to the frame from which the key was extracted.
After this series is applied, the lan969x driver will support the same
VCAP functionality as Sparx5.
== Patch breakdown:
Patch #1 exposes some VCAP symbols for lan969x.
Patch #2 replaces VCAP uses of SPX5_PORTS with n_ports from the match
data.
Patch #3 adds new VCAP constants to match data
Patch #4 removes the is_sparx5() check to now initialize the VCAP API on
lan969x.
Patch #5 adds the auto-generated VCAP data for lan969x.
Patch #6 adds the VCAP configuration data for lan969x.
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
====================
Link: https://patch.msgid.link/20241101-sparx5-lan969x-switch-driver-3-v1-0-3c76f22f4bfa@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Add configuration data (for consumption by the VCAP API) for the four
VCAP's that we are going to support. The following VCAP's will be
supported:
- VCAP CLM: (also known as IS0) is part of the analyzer and enables
frame classification using VCAP functionality.
- VCAP IS2: is part of ANA_ACL and enables access control lists, using
VCAP functionality.
- VCAP ES0: is part of the rewriter and enables rewriting of frames
using VCAP functionality.
- VCAP ES2: is part of EACL and enables egress access control lists
using VCAP functionality
The two VCAP's: CLM and IS2 use shared resources from the SUPER VCAP.
The SUPER VCAP is a shared pool of 6 blocks that can be distributed
freely among CLM and IS2. Each block in the pool has 3,072 addresses
with entries, actions, and counters. ES0 and ES2 does not use shared
resources.
In the configuration data for lan969x CLM uses blocks 2-4 with a total
of 6 lookups. IS2 uses blocks 0-1 with a total of 4 lookups.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Platform VCAP data for each VCAP instance is auto-generated using an
internal Microchip tool. The generated VCAP data contains information
about keyfields, keyfield sets, actionfields, actionfield sets and
typegroups, which in combination are used to encode and decode rules in
the VCAP.
Add the auto-generated VCAP file lan969x_vcap_ag_api.c and assign the
two structs: lan969x_vcaps and lan969x_vcap_stats to the match data.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The is_sparx5() check was introduced in an earlier series, to make sure
the sparx5_vcap_init() was not executed on lan969x, as it was not
implemented there yet. Now that it is, remove that check.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In preparation for lan969x VCAP support, add the following three new
VCAP constants to match data:
- vcaps_cfg (contains configuration data for each VCAP).
- vcaps (contains auto-generated information about VCAP keys and
actions).
- vcap_stats: (contains auto-generated string names of all the keys
and actions)
Add these constants to the Sparx5 match data constants and use them to
initialize the VCAP's in sparx5_vcap_init().
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The Sparx5 VCAP implementation uses the SPX5_PORTS symbol to iterate over
the 65 front ports of Sparx5. Replace the use with the n_ports constant
from the match data, which translates to 65 of Sparx5 and 30 on lan969x.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In preparation for lan969x VCAP support, expose the following symbols for
use by the lan969x VCAP implementation:
- The symbols SPARX5_*_LOOKUPS defines the number of lookups in each
VCAP instance. These are the same for lan969x. Move them to the
header file.
- The struct sparx5_vcap_inst encapsulates information about a single
VCAP instance. Move this struct to the header file and declare the
sparx5_vcap_inst_cfg as extern.
Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Xuan Zhuo says:
====================
virtio_net: enable premapped mode by default
v1:
1. fix some small problems
2. remove commit "virtio_net: introduce vi->mode"
In the last linux version, we disabled this feature to fix the
regress[1].
The patch set is try to fix the problem and re-enable it.
More info: http://lore.kernel.org/all/20240820071913.68004-1-xuanzhuo@linux.alibaba.com
[1]: http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com
====================
Link: https://patch.msgid.link/20241029084615.91049-1-xuanzhuo@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Now, the premapped mode can be enabled unconditionally.
So we can remove the failover code for merge and small mode.
The virtnet_rq_xxx() helper would be only used if the mode is using pre
mapping. A check is added to prevent misusing of these API.
Tested-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Currently, the virtio core will perform a dma operation for each
buffer. Although, the same page may be operated multiple times.
In premapped mod, we can perform only one dma operation for the pages of
the alloc frag. This is beneficial for the iommu device.
kernel command line: intel_iommu=on iommu.passthrough=0
| strict=0 | strict=1
Before | 775496pps | 428614pps
After | 1109316pps | 742853pps
In the 6.11, we disabled this feature because a regress [1].
Now, we fix the problem and re-enable it.
[1]: http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com
Tested-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The virtio-net big mode did not enable premapped mode,
so we did not need to check the unmap. And the subsequent
commit will remove the failover code for failing enable
premapped for merge and small mode. So we need to remove
the checking do_dma code in the big mode path.
Tested-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
When the frag just got a page, then may lead to regression on VM.
Specially if the sysctl net.core.high_order_alloc_disable value is 1,
then the frag always get a page when do refill.
Which could see reliable crashes or scp failure (scp a file 100M in size
to VM).
The issue is that the virtnet_rq_dma takes up 16 bytes at the beginning
of a new frag. When the frag size is larger than PAGE_SIZE,
everything is fine. However, if the frag is only one page and the
total size of the buffer and virtnet_rq_dma is larger than one page, an
overflow may occur.
The commit f9dac92ba908 ("virtio_ring: enable premapped mode whatever
use_dma_api") introduced this problem. And we reverted some commits to
fix this in last linux version. Now we try to enable it and fix this
bug directly.
Here, when the frag size is not enough, we reduce the buffer len to fix
this problem.
Reported-by: "Si-Wei Liu" <si-wei.liu@oracle.com>
Tested-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Vladimir Oltean says:
====================
Fix sparse warnings in dpaa_eth driver
This is a follow-up of the discussion at:
https://lore.kernel.org/oe-kbuild-all/20241028-sticky-refined-lionfish-b06c0c@leitao/
where I said I would take care of the sparse warnings uncovered by
Breno's COMPILE_TEST change for the dpaa_eth driver.
There was one warning that I decided to treat as an actual bug:
https://lore.kernel.org/netdev/20241029163105.44135-1-vladimir.oltean@nxp.com/
and what remains here are those warnings which I consider harmless.
I would like Christophe to ack the entire series to be taken through
netdev. I find it weird that the qbman driver, whose major API consumer
is netdev, is maintained by a different group. In this case, the buggy
qm_sg_entry_get_off() function is defined in qbman but exclusively
called in netdev.
====================
Link: https://patch.msgid.link/20241029164317.50182-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Sparse provides the following output:
warning: cast to restricted __be32
This is a harmless warning due to the fact that we dereference the hash
stored in the FD using an incorrect type annotation. Suppress the
warning by using the correct __be32 type instead of u32. No functional
change.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20241029164317.50182-4-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Multi-buffer frame descriptors (FDs) point to a buffer holding a
scatter/gather table (SGT), which is a finite array of fixed-size
entries, the last of which has qm_sg_entry_is_final(&sgt[i]) == true.
Each SGT entry points to a buffer holding pieces of the frame.
DPAARM.pdf explains in the figure called "Internal and External Margins,
Scatter/Gather Frame Format" that the SGT table is located within its
buffer at the same offset as the frame data start is located within the
first packet buffer.
+------------------------+
Scatter/Gather Buffer | First Buffer | Last Buffer
^ +------------+ ^ +-|---->^ +------------+ +->+------------+
| | | | ICEOF | | | | | |////////////|
| +------------+ v | | | | | |////////////|
BSM | |/ part of //| | |BSM | | | |////////////|
| |/ Internal /| | | | | | |////////////|
| |/ Context //| | | | | | |// Frame ///|
| +------------+ | | | | | ... |/ content //|
| | | | | | | | |////////////|
| | | | | | | | |////////////|
v +------------+ | | v +------------+ |////////////|
| Scatter/ //| sgt[0]--+ | |// Frame ///| |////////////|
| Gather List| ... | |/ content //| +------------+ ^
|////////////| sgt[N]----+ |////////////| | | | BEM
|////////////| |////////////| | | |
+------------+ +------------+ +------------+ v
BSM = Buffer Start Margin, BEM = Buffer End Margin, both are configured
by dpaa_eth_init_rx_port() for the RX FMan port relevant here.
sg_fd_to_skb() runs in the calling context of rx_default_dqrr() -
the NAPI receive callback - which only expects to receive contiguous
(qm_fd_contig) or scatter/gather (qm_fd_sg) frame descriptors.
Everything else is irrelevant codewise.
The processing done by sg_fd_to_skb() is weird because it does not
conform to the expectations laid out by the aforementioned figure.
Namely, it parses the OFFSET field only for SGT entries with i != 0
(codewise, skb != NULL). In those cases, OFFSET should always be 0.
Also, it does not parse the OFFSET field for the sgt[0] case, the only
case where the buffer offset is meaningful in this context. There, it
uses the fd_off, aka the offset to the Scatter/Gather List in the
Scatter/Gather Buffer from the figure. By equivalence, they should both
be equal to the BSM (in turn, equal to priv->rx_headroom).
This can actually be explained due to the bug which we had in
qm_sg_entry_get_off() until the previous change:
- qm_sg_entry_get_off() did not actually _work_ for sgt[0]. It returned
zero even with a non-zero offset, so fd_off had to be used as a fill-in.
- qm_sg_entry_get_off() always returned zero for sgt[i>0], and that
resulted in no user-visible bug, because the buffer offset _was
supposed_ to be zero for those buffers. So remove it from calculations.
Add assertions about the OFFSET field in both cases (first or subsequent
SGT entries) to make it absolutely obvious when something is not well
handled.
Similar logic can be seen in the driver for the architecturally similar
DPAA2, where dpaa2_eth_build_frag_skb() calls dpaa2_sg_get_offset() only
for i == 0. For the rest, there is even a comment stating the same thing:
* Data in subsequent SG entries is stored from the
* beginning of the buffer, so we don't need to add the
* sg_offset.
Tested on LS1046A.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20241029164317.50182-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
struct qm_sg_entry :: offset is a 13-bit field, declared as __be16.
When using be32_to_cpu(), a wrong value will be calculated on little
endian systems (Arm), because type promotion from 16-bit to 32-bit,
which is done before the byte swap and always in the CPU native
endianness, changes the value of the scatter/gather list entry offset in
big-endian interpretation (adds two zero bytes in the LSB interpretation).
The result of the byte swap is ANDed with GENMASK(12, 0), so the result
is always zero, because only those bytes added by type promotion remain
after the application of the bit mask.
The impact of the bug is that scatter/gather frames with a non-zero
offset into the buffer are treated by the driver as if they had a zero
offset. This is all in theory, because in practice, qm_sg_entry_get_off()
has a single caller, where the bug is inconsequential, because at that
call site the buffer offset will always be zero, as will be explained in
the subsequent change.
Flagged by sparse:
warning: cast to restricted __be32
warning: cast from restricted __be16
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20241029164317.50182-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There's no need for devm bloat here. In addition, these are freed right
before the function exits.
Also swapped kcalloc order for consistency.
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Shay Agroskin <shayagr@amazon.com>
Link: https://patch.msgid.link/20241101214828.289752-2-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The VMCLOCK device gives support for accurate timekeeping even across
live migration, unlike the KVM PTP clock. To help ensure that users can
always use ptp_vmclock where it's available in preference to ptp_kvm,
set it to 'default PTP_1588_CLOCK_VMCLOCK' instead of 'default y'.
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://patch.msgid.link/89955b74d225129d6e3d79b53aa8d81d1b50560f.camel@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ena_com_get_dev_basic_stats() has been unused since 2017's
commit d81db2405613 ("net/ena: refactor ena_get_stats64 to be atomic
context safe")
ena_com_get_offload_settings() has been unused since the original
commit of ENA back in 2016 in
commit 1738cd3ed342 ("net: ena: Add a driver for Amazon Elastic
Network Adapters (ENA)")
Remove them.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: David Arinzon <darinzon@amazon.com>
Link: https://patch.msgid.link/20241102220142.80285-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This manually reverts
commit a4e262cde3cd ("net: ena: allow automatic fallback to polling mode")
which is unused.
(I did it manually because there are other minor comment
and function changes surrounding it).
Build tested only.
Suggested-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://patch.msgid.link/20241103194149.293456-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Sometimes the names of the enum entries are self-explanatory
or come from standards. Forcing authors to write trivial kdoc
for each of such entries seems unreasonable, but kdoc would
complain about undocumented entries.
Detect enums which only have documentation for the entire
type and no documentation for entries. Render their doc
as a plain comment.
Link: https://patch.msgid.link/20241103165314.1631237-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit d5fed5addb2b ("tcp: reorganize tcp_sock fast path variables")
moved the fields around and misplaced the documentation for "lsndtime".
So, let's replace it in the proper place.
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241104070041.64302-1-dongml2@chinatelecom.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|