2024-10-22  selftests: TBF: Use defer for test cleanup  (Petr Machata)

Use the defer framework to schedule cleanups as soon as the command is executed.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  selftests: RED: Use defer for test cleanup  (Petr Machata)

Instead of having a suite of dedicated cleanup functions, use the defer framework to schedule cleanups right as their setup functions are run.

The sleep after stop_traffic() in mlxsw selftests is necessary, but scheduling it as "defer sleep; defer stop_traffic" is silly. Instead, add a local helper to stop traffic and sleep afterwards.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  selftests: forwarding: lib: Allow passing PID to stop_traffic()  (Petr Machata)

Now that it is possible to schedule a deferral of stop_traffic() right after the traffic is started, we do not have to rely on the %% magic to kill the background process that was started last. Instead we can just give the PID explicitly.

This makes it possible to start other background processes after the traffic is started without confusing the cleanup.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
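For illustration, a minimal sketch of the intended call pattern, assuming start_traffic() backgrounds the traffic generator so that $! holds its PID (the addresses and arguments are illustrative only, not taken from any particular test):

    start_traffic $h1 192.0.2.1 198.51.100.1 $h2mac
    defer stop_traffic $!    # stop this specific generator on cleanup

    # Later background jobs no longer confuse the cleanup:
    tcpdump -i $h1 -w capture.pcap &
    defer kill $!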
2024-10-22  selftests: forwarding: Add a fallback cleanup()  (Petr Machata)

Consistent use of defers obviates the need for a separate test-specific cleanup function -- everything is just taken care of in defers. So in this patch, introduce a cleanup() helper in the forwarding lib.sh, which calls just pre_cleanup() and defer_scopes_cleanup(). Selftests are obviously still free to override the function.

Since pre_cleanup() is too entangled with forwarding-specific minutia, the function cannot currently be in net/lib.sh.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
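As described, the fallback is essentially the following (a sketch; the actual lib.sh may differ in details):

    cleanup()
    {
        pre_cleanup
        defer_scopes_cleanup
    }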
2024-10-22  selftests: net: lib: Introduce deferred commands  (Petr Machata)

In commit 8510801a9dbd ("selftests: drv-net: add ability to schedule cleanup with defer()"), a defer helper was added to Python selftests. The idea is to keep cleanup commands close to their dirtying counterparts, thereby making it more transparent what is cleaning up what, making it harder to miss a cleanup, and making the whole cleanup business exception safe. All these benefits are applicable to bash as well; exception safety can be interpreted in terms of safety vs. a SIGINT.

This patch therefore introduces a framework of several helpers that serve to schedule cleanups in bash selftests:

- defer_scope_push(), defer_scope_pop(): Deferred statements can be batched together in scopes. When a scope is popped, the deferred commands scheduled in that scope are executed in the order opposite to the order of their scheduling.

- defer(): Schedules a defer to the most recently pushed scope (or the default scope if none was pushed).

- defer_prio(): Schedules a defer on the priority track. The priority defer queue is run before the default defer queue when a scope is popped.

  The issue that this is addressing is specifically the one of restoring the devlink shared buffer threshold type. When setting up static thresholds, one has to first change the threshold type to static, then override the individual thresholds. When cleaning up, it would be natural to reset the threshold values first, then change the threshold type. But the values that are valid for dynamic thresholds are generally invalid for static thresholds and vice versa, so attempts to restore the values first would be bounced. Thus one has to first reset the threshold type, then adjust the thresholds. (You could argue that the shared buffer threshold type API is broken, and you would be right, but here we are.)

  This cannot be solved by pure defers easily. I considered making it possible to disable an existing defer, so that one could then schedule a new defer and disable the original. But this forward-shifting of the defer job would have to take place after every threshold-adjusting command, which would make it very awkward to schedule these jobs.

- defer_scopes_cleanup(): Pops any unpopped scopes, including the default one. The selftests that use defer should run this in their exit trap. This is important to get cleanups of interrupted scripts.

- in_defer_scope(): Sometimes a function would like to introduce a new defer scope, then run whatever it is that it wants to run, and then pop the scope to run the deferred cleanups. The helper in_defer_scope() can be used to run another command within such an environment, such that any scheduled defers run after the command finishes.

The framework is added as a separate file lib/sh/defer.sh so that it can be used by all bash selftests, including those that do not currently use lib.sh. lib.sh however includes the file by default, because ideally all tests would use these helpers instead of hand-rolling their cleanups.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
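A brief usage sketch under the semantics described above; the ip, tc, and devlink invocations are illustrative only, not taken from any particular test:

    #!/bin/bash
    source lib.sh                     # lib.sh includes lib/sh/defer.sh by default

    trap defer_scopes_cleanup EXIT    # run pending cleanups even on SIGINT

    ip link add name dummy1 type dummy
    defer ip link del dev dummy1      # cleanup sits next to its dirtying command

    defer_scope_push
        tc qdisc add dev dummy1 root handle 1: tbf rate 100mbit burst 16kb latency 50ms
        defer tc qdisc del dev dummy1 root
        # ... test body ...
    defer_scope_pop                   # this scope's defers run here, in LIFO order

    # Priority track: reset the shared buffer threshold type before the
    # individual threshold restores scheduled with plain defer are run.
    devlink sb pool set pci/0000:03:00.0 pool 0 size 16384 thtype static
    defer_prio devlink sb pool set pci/0000:03:00.0 pool 0 size 16384 thtype dynamic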
2024-10-22  net: phy: marvell: Add mdix status reporting  (Paul Davey)

Report MDI-X resolved state after link up.

Tested on Linkstreet 88E6193X internal PHYs.

Signed-off-by: Paul Davey <paul.davey@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241017015026.255224-1-paul.davey@alliedtelesis.co.nz
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  net: stmmac: Programming sequence for VLAN packets with split header  (Abhishek Chauhan)

Currently, the reset state configuration of split header works fine for non-tagged packets, and we see no corruption in the payload of any size. We need an additional programming sequence with the reset configuration to handle VLAN tagged packets and avoid corruption in the payload for packets of size greater than 256 bytes.

Without this change, the ping application complains about corruption in the payload when the size of the VLAN packet exceeds 256 bytes. With this change, tagged and non-tagged packets of any size work fine and there is no corruption seen.

Current configuration which has the issue for VLAN packets
----------------------------------------------------------

Split happens at the position at the Layer 3 header:

|MAC-DA|MAC-SA|Vlan Tag|Ether type|IP header|IP data|Rest of the payload|
                2 bytes ^
                        |

With the fix, we make sure that the split now happens at Layer 2, which is the end of the ethernet header and the start of the IP payload.

IP traffic split
----------------

The bits which take care of this are SPLM and SPLOFST:

SPLM    = Split mode, set to Layer 2.
SPLOFST = These bits indicate the value of the offset from the beginning
          of the Length/Type field at which the header split should take
          place when the appropriate SPLM is selected. Reset value is 2 bytes.

Un-tagged data (without VLAN):

|MAC-DA|MAC-SA|Ether type|IP header|IP data|Rest of the payload|
       2 bytes ^
               |

Tagged data (with VLAN):

|MAC-DA|MAC-SA|VLAN Tag|Ether type|IP header|IP data|Rest of the payload|
                2 bytes ^
                        |

Non-IP traffic split, such as an AV packet
------------------------------------------

The bits which take care of this are:

SAVE = Split AV Enable
SAVO = Split AV Offset, similar to SPLOFST but for AVTP packets.

|Preamble|MAC-DA|MAC-SA|VLAN tag|Ether type|IEEE 1722 payload|CRC|
                         2 bytes ^
                                 |

Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241016234313.3992214-1-quic_abchauha@quicinc.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
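A hedged illustration of what such a programming sequence typically looks like; every register and field name below is a hypothetical placeholder, not an actual stmmac macro:

    #define SPLIT_CTRL(ch)   (0x1100 + 0x80 * (ch)) /* hypothetical register */
    #define SPLM_MASK        GENMASK(9, 8)          /* hypothetical: split mode */
    #define SPLM_L2          1                      /* split at end of L2 header */
    #define SPLOFST_MASK     GENMASK(6, 0)          /* hypothetical: split offset */

    static void split_hdr_program(void __iomem *ioaddr, unsigned int ch)
    {
            u32 val = readl(ioaddr + SPLIT_CTRL(ch));

            val &= ~(SPLM_MASK | SPLOFST_MASK);
            val |= FIELD_PREP(SPLM_MASK, SPLM_L2);  /* Layer 2 split mode */
            val |= FIELD_PREP(SPLOFST_MASK, 2);     /* 2 bytes from Length/Type */
            writel(val, ioaddr + SPLIT_CTRL(ch));
    }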
2024-10-22  Merge branch 'rtnetlink-refactor-rtnl_-new-del-set-link-for-per-netns-rtnl'  (Paolo Abeni)

Kuniyuki Iwashima says:

====================
rtnetlink: Refactor rtnl_{new,del,set}link() for per-netns RTNL.

This is a prep for the next series where we will push RTNL down to rtnl_{new,del,set}link(). That means, for example, __rtnl_newlink() is always under RTNL, but rtnl_newlink() has a non-RTNL section.

As a prerequisite for per-netns RTNL, we will move netns validation (and RTNL-independent validations if possible) to that section. rtnl_link_ops and rtnl_af_ops will be protected with SRCU so as not to depend on RTNL.

Changes:
v2:
  * Add Eric's Reviewed-by to patch 1-4,6,8-11 (no tag on 5,7,12-14)
  * Patch 7
    * Handle error of init_srcu_struct().
    * Call cleanup_srcu_struct() after synchronize_srcu().
  * Patch 12
    * Move put_net() before errorout label
  * Patch 13
    * Newly added as prep for patch 14
  * Patch 14
    * Handle error of init_srcu_struct().
    * Call cleanup_srcu_struct() after synchronize_srcu().

v1: https://lore.kernel.org/netdev/20241009231656.57830-1-kuniyu@amazon.com/
====================

Link: https://patch.msgid.link/20241016185357.83849-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  rtnetlink: Protect struct rtnl_af_ops with SRCU.  (Kuniyuki Iwashima)

Once RTNL is replaced with rtnl_net_lock(), we need a mechanism to guarantee that rtnl_af_ops is alive during inflight RTM_SETLINK even when its module is being unloaded.

Let's use SRCU to protect ops. rtnl_af_lookup() now iterates rtnl_af_ops under RCU and returns an SRCU-protected ops pointer. The caller must call rtnl_af_put() to release the pointer after use.

Also, rtnl_af_unregister() unlinks the ops first and calls synchronize_srcu() to wait for inflight RTM_SETLINK requests to complete.

Note that rtnl_af_ops needs to be protected by its dedicated lock when RTNL is removed.

Note also that the BUG_ON() in do_setlink() is changed to normal error handling, as a different af_ops might be found after validate_linkmsg().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
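A sketch of the caller-side pattern being described; the exact signatures are assumptions inferred from the changelog, not verified API:

    /* Sketch: lookup returns an SRCU-protected pointer, released via rtnl_af_put(). */
    static int set_af_sketch(struct net_device *dev, const struct nlattr *af_attr,
                             int family, struct netlink_ext_ack *extack)
    {
            struct rtnl_af_ops *af_ops;
            int err = -EAFNOSUPPORT;
            int srcu_idx;

            af_ops = rtnl_af_lookup(family, &srcu_idx);
            if (af_ops) {
                    err = af_ops->set_link_af(dev, af_attr, extack);
                    rtnl_af_put(af_ops, srcu_idx);  /* release after use */
            }
            return err;
    }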
2024-10-22  rtnetlink: Return int from rtnl_af_register().  (Kuniyuki Iwashima)

The next patch will add init_srcu_struct() in rtnl_af_register(), then we need to handle its error. Let's add the error handling in advance to make the following patch cleaner.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Matt Johnston <matt@codeconstruct.com.au>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
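The shape of the resulting registration path, sketched; the ops-embedded srcu_struct only arrives with the next patch, so its field name here is an assumption:

    int rtnl_af_register(struct rtnl_af_ops *ops)
    {
            int err;

            err = init_srcu_struct(&ops->srcu);     /* added by the next patch */
            if (err)
                    return err;

            rtnl_lock();
            list_add_tail_rcu(&ops->list, &rtnl_af_ops);
            rtnl_unlock();

            return 0;
    }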
2024-10-22  rtnetlink: Call rtnl_link_get_net_capable() in do_setlink().  (Kuniyuki Iwashima)

We will push RTNL down to rtnl_setlink().

RTM_SETLINK could call rtnl_link_get_net_capable() in do_setlink() to move a dev to a new netns, but the netns needs to be fetched before holding rtnl_net_lock(). Let's move it to rtnl_setlink() and pass the netns to do_setlink().

Now, the RTM_NEWLINK paths (rtnl_changelink() and rtnl_group_changelink()) can pass the prefetched netns to do_setlink().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
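A sketch of the ordering constraint being established; rtnl_net_lock()/rtnl_net_unlock() are the upcoming per-netns primitives, and the surrounding code is illustrative only:

    /* Illustrative: fetch the target netns before taking the per-netns lock. */
    static int rtnl_setlink_sketch(struct sk_buff *skb, struct nlattr **tb,
                                   struct netlink_ext_ack *extack)
    {
            struct net *tgt_net;
            int err;

            tgt_net = rtnl_link_get_net_capable(skb, sock_net(skb->sk), tb,
                                                CAP_NET_ADMIN);
            if (IS_ERR(tgt_net))
                    return PTR_ERR(tgt_net);

            rtnl_net_lock(tgt_net);         /* taken only after the netns fetch */
            /* ... do_setlink(..., tgt_net, ...) runs under the lock ... */
            err = 0;
            rtnl_net_unlock(tgt_net);

            put_net(tgt_net);
            return err;
    }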
2024-10-22  rtnetlink: Clean up rtnl_setlink().  (Kuniyuki Iwashima)

We will push RTNL down to rtnl_setlink(). Let's unify the error path to make it easy to place rtnl_net_lock().

While at it, keep the variables in reverse xmas order.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  rtnetlink: Clean up rtnl_dellink().  (Kuniyuki Iwashima)

We will push RTNL down to rtnl_dellink(). Let's unify the error path to make it easy to place rtnl_net_lock().

While at it, keep the variables in reverse xmas order.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  rtnetlink: Fetch IFLA_LINK_NETNSID in rtnl_newlink().  (Kuniyuki Iwashima)

Another netns option for RTM_NEWLINK is IFLA_LINK_NETNSID, and it is fetched in rtnl_newlink_create(). This must be done before holding rtnl_net_lock().

Let's move the IFLA_LINK_NETNSID processing to rtnl_newlink().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  rtnetlink: Call rtnl_link_get_net_capable() in rtnl_newlink().  (Kuniyuki Iwashima)

As a prerequisite of per-netns RTNL, we must fetch the netns before looking up dev or moving it to another netns.

rtnl_link_get_net_capable() is called in rtnl_newlink_create() and do_setlink(), but both of them need to be moved to the RTNL-independent region, which will be rtnl_newlink().

Let's call rtnl_link_get_net_capable() in rtnl_newlink() and pass the netns down to where needed.

Note that the latter two have not passed the nets to do_setlink() yet but will do so after the remaining rtnl_link_get_net_capable() is moved to rtnl_setlink() later.

While at it, dest_net is renamed to tgt_net in rtnl_newlink_create() to align with rtnl_{del,set}link().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  rtnetlink: Protect struct rtnl_link_ops with SRCU.  (Kuniyuki Iwashima)

Once RTNL is replaced with rtnl_net_lock(), we need a mechanism to guarantee that rtnl_link_ops is alive during inflight RTM_NEWLINK even when its module is being unloaded.

Let's use SRCU to protect ops. rtnl_link_ops_get() now iterates link_ops under RCU and returns an SRCU-protected ops pointer. The caller must call rtnl_link_ops_put() to release the pointer after use.

Also, __rtnl_link_unregister() unlinks the ops first and calls synchronize_srcu() to wait for inflight RTM_NEWLINK requests to complete.

Note that link_ops needs to be protected by its dedicated lock when RTNL is removed.

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
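The teardown side, sketched; field names are assumed, and per the v2 changelog above, cleanup_srcu_struct() follows synchronize_srcu():

    /* Sketch: unlink first so no new lookups see the ops, then wait for readers. */
    static void __rtnl_link_unregister_sketch(struct rtnl_link_ops *ops)
    {
            list_del_rcu(&ops->list);        /* no new rtnl_link_ops_get() hits */
            synchronize_srcu(&ops->srcu);    /* drain inflight RTM_NEWLINK users */
            cleanup_srcu_struct(&ops->srcu);
    }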
2024-10-22  rtnetlink: Move ops->validate to rtnl_newlink().  (Kuniyuki Iwashima)

ops->validate() does not require RTNL. Let's move it to rtnl_newlink().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  rtnetlink: Move rtnl_link_ops_get() and retry to rtnl_newlink().  (Kuniyuki Iwashima)

Currently, if neither dev nor rtnl_link_ops is found in __rtnl_newlink(), we release RTNL and redo the whole process after request_module(), which complicates the logic.

The ops will be RTNL-independent later. Let's move the ops lookup to rtnl_newlink() and do the retry earlier.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  rtnetlink: Move simple validation from __rtnl_newlink() to rtnl_newlink().  (Kuniyuki Iwashima)

We will push RTNL down to rtnl_newlink(). Let's move the RTNL-independent validation to rtnl_newlink().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  rtnetlink: Factorise do_setlink() path from __rtnl_newlink().  (Kuniyuki Iwashima)

__rtnl_newlink() got too long to maintain. For example, netdev_master_upper_dev_get()->rtnl_link_ops is fetched even when IFLA_INFO_SLAVE_DATA is not specified.

Let's factorise the single-dev do_setlink() path to a separate function.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  rtnetlink: Call validate_linkmsg() in do_setlink().  (Kuniyuki Iwashima)

There are 3 paths that finally call do_setlink(), and validate_linkmsg() is called in each path.

  1. RTM_NEWLINK
    1-1. dev is found in __rtnl_newlink()
    1-2. dev isn't found, but IFLA_GROUP is specified in rtnl_group_changelink()
  2. RTM_SETLINK

The next patch factorises 1-1 to a separate function. As a preparation, let's move the validate_linkmsg() calls to do_setlink().

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-22  rtnetlink: Allocate linkinfo[] as struct rtnl_newlink_tbs.  (Kuniyuki Iwashima)

We will move linkinfo to rtnl_newlink() and pass it down to other functions. Let's pack it into rtnl_newlink_tbs.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  Merge branch 'net-mlx5-refactor-esw-qos-to-support-generalized-operations'  (Paolo Abeni)

Tariq Toukan says:

====================
net/mlx5: Refactor esw QoS to support generalized operations

This patch series from the mlx5 core driver team consists of one main QoS part followed by small misc patches.

The main part (patches 1 to 11) by Carolina refactors the QoS handling to generalize operations on scheduling groups and vports. These changes are necessary to support new features that will extend group functionality, introduce new group types, and support deeper hierarchies. Additionally, this refactor updates the terminology from "group" to "node" to better reflect the hardware's rate hierarchy and its use of scheduling element nodes.

Simplify group scheduling element creation:
- net/mlx5: Refactor QoS group scheduling element creation

Refactor to support generalized operations for QoS:
- net/mlx5: Introduce node type to rate group structure
- net/mlx5: Add parent group support in rate group structure
- net/mlx5: Restrict domain list insertion to root TSAR ancestors
- net/mlx5: Rename vport QoS group reference to parent
- net/mlx5: Introduce node struct and rename group terminology to node
- net/mlx5: Refactor vport scheduling element creation function
- net/mlx5: Refactor vport QoS to use scheduling node structure
- net/mlx5: Remove vport QoS enabled flag

Support generalized operations for QoS elements:
- net/mlx5: Simplify QoS scheduling element configuration
- net/mlx5: Generalize QoS operations for nodes and vports

On top, patch 12 by Moshe handles the FW request to move to drop mode.

In patch 13, Benjamin Poirier removes an empty eswitch flow table when not used, which improves packet processing performance.

Patches 14 and 15 by Moshe are small field renamings as preparation for future field additions to these structures.

Series generated against:
commit c531f2269a53 ("net: bcmasp: enable SW timestamping")
====================

Link: https://patch.msgid.link/20241016173617.217736-1-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: fs, rename modify header struct member action  (Moshe Shemesh)

As preparation for HW Steering support, rename the modify header struct member "action" to fs_dr_action, to distinguish it from fs_hws_action, which will be added. Add a pointer where needed to keep code lines shorter and more readable.

Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: fs, rename packet reformat struct member action  (Moshe Shemesh)

As preparation for HW Steering support, rename the packet reformat struct member "action" to fs_dr_action, to distinguish it from fs_hws_action, which will be added. Add a pointer where needed to keep code lines shorter and more readable.

Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: Only create VEPA flow table when in VEPA mode  (Benjamin Poirier)

Currently, when VFs are created, two flow tables are added for the eswitch: the "fdb" table, which contains rules for each VF, and the "vepa_fdb" table. In the default VEB mode, the vepa_fdb table is empty. When switching to VEPA mode, flow steering rules are added to vepa_fdb. Even though the vepa_fdb table is empty in VEB mode, its presence adds some cost to packet processing. In some workloads, this leads to drops, which are reported by the rx_discards_phy ethtool counter.

In order to improve performance, only create vepa_fdb when in VEPA mode.

Tests were done on a ConnectX-6 Lx adapter forwarding 64B packets between both ports using dpdk-testpmd. Numbers are Rx-pps for each port, as reported by testpmd.

Without changes:
traffic to unknown mac
  testpmd on PF
    numvfs=0,0                     35257998,35264499
    numvfs=1,1                     24590124,24590888
  testpmd on VF with numvfs=1,1    20434338,20434887
traffic to VF mac
  testpmd on VF with numvfs=1,1    30341014,30340749

With changes:
traffic to unknown mac
  testpmd on PF
    numvfs=0,0                     35404361,35383378
    numvfs=1,1                     29801247,29790757
  testpmd on VF with numvfs=1,1    24310435,24309084
traffic to VF mac
  testpmd on VF with numvfs=1,1    34811436,34781706

Signed-off-by: Benjamin Poirier <bpoirier@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: Add sync reset drop mode support  (Moshe Shemesh)

On the sync reset flow, firmware may request a PF which has already acknowledged the unload event to move to drop mode. Drop mode means that this PF will reduce its polling frequency, as this PF is not going to have another active part in the reset, but will only reload back after the reset.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: Generalize QoS operations for nodes and vports  (Carolina Jubran)

Refactor the QoS normalization and rate calculation functions to operate on mlx5_esw_sched_node, allowing for generalized handling of both vports and nodes.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: Simplify QoS scheduling element configuration  (Carolina Jubran)

Simplify the configuration of QoS scheduling elements by removing the separate functions `esw_qos_node_config` and `esw_qos_vport_config`. Instead, directly use the existing `esw_qos_sched_elem_config` function for both nodes and vports.

This unification helps in generalizing operations on scheduling element nodes.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: Remove vport QoS enabled flag  (Carolina Jubran)

Remove the `enabled` flag from the `vport->qos` struct, as QoS now relies solely on the `sched_node` pointer to determine whether QoS features are in use.

Currently, the vport `qos` struct consists only of the `sched_node`, introducing an unnecessary two-level reference. However, the qos struct is retained, as it will be extended in future patches to support new QoS features.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
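In effect, the check becomes pointer-based, roughly as follows (a sketch; names follow the description above, not verified code):

    /* Sketch: vport QoS is considered enabled iff a scheduling node is attached. */
    static bool esw_vport_qos_enabled(const struct mlx5_vport *vport)
    {
            return !!vport->qos.sched_node;
    }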
2024-10-21  net/mlx5: Refactor vport QoS to use scheduling node structure  (Carolina Jubran)

Refactor the vport QoS structure by moving group membership and scheduling details into the `mlx5_esw_sched_node` structure.

This change consolidates the vport into the rate hierarchy by unifying the handling of different types of scheduling element nodes.

In addition, add a direct reference to the mlx5_vport within the mlx5_esw_sched_node structure, to ensure that the vport is easily accessible when a scheduling node is associated with a vport.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: Refactor vport scheduling element creation function  (Carolina Jubran)

Modify the vport scheduling element creation function to get the parent node directly, aligning it with the group creation function.

This ensures a consistent flow for scheduling element creation, as the parent nodes already contain the device and parent element index.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: Introduce node struct and rename group terminology to node  (Carolina Jubran)

Introduce the `mlx5_esw_sched_node` struct, consolidating all rate hierarchy related details, including membership and scheduling parameters.

Since the group concept aligns with the `mlx5_esw_sched_node`, replace the `mlx5_esw_rate_group` struct with it and rename the "group" terminology to "node" throughout the rate hierarchy.

All relevant code paths and structures have been updated to use the "node" terminology accordingly, laying the groundwork for future patches that will unify the handling of different types of members within the rate hierarchy.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: Rename vport QoS group reference to parent  (Carolina Jubran)

Rename the `group` field in the `mlx5_vport` structure to `parent` to clarify the vport's role as a member of a parent group and distinguish it from the concept of a general group.

Additionally, rename `group_entry` to `parent_entry` to reflect this update.

This distinction will be important for handling more complex group structures and scheduling elements.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: Restrict domain list insertion to root TSAR ancestors  (Carolina Jubran)

Update the logic for adding rate groups to the E-Switch domain list, ensuring that only groups with the root Transmit Scheduling Arbiter as their parent are included.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: Add parent group support in rate group structure  (Carolina Jubran)

Introduce a `parent` field in the `mlx5_esw_rate_group` structure to support hierarchical group relationships. The `parent` can reference another group or be set to `NULL`, indicating the group is connected to the root TSAR.

This change enables the ability to manage groups in a hierarchical structure for future enhancements.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net/mlx5: Introduce node type to rate group structure  (Carolina Jubran)

Introduce the `sched_node_type` enum to represent both the group and its members as scheduling nodes in the rate hierarchy.

Add the `type` field to the rate group structure to specify the type of the node's membership in the rate hierarchy, and generalize comments to reflect this flexibility within the rate group structure.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
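Taken together with the parent-pointer patch listed above, the group structure grows roughly as follows (a sketch; only the new fields are shown, and the enumerator names are inferred from the text, not verified):

    enum sched_node_type {
            SCHED_NODE_TYPE_VPORTS_TSAR,    /* a group of vports */
            SCHED_NODE_TYPE_VPORT,          /* a single vport member */
    };

    struct mlx5_esw_rate_group {
            enum sched_node_type type;              /* role in the rate hierarchy */
            struct mlx5_esw_rate_group *parent;     /* NULL => under the root TSAR */
            /* existing scheduling fields elided */
    };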
2024-10-21  net/mlx5: Refactor QoS group scheduling element creation  (Carolina Jubran)

Introduce `esw_qos_create_group_sched_elem` to handle the creation of group scheduling elements for E-Switch QoS, i.e. the Transmit Scheduling Arbiter (TSAR). This reduces duplication and simplifies the code for TSAR setup.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  Merge branch 'add-support-of-hibmcge-ethernet-driver'  (Paolo Abeni)

Jijie Shao says:

====================
Add support of HIBMCGE Ethernet Driver

This patch set adds support for the Hisilicon BMC Gigabit Ethernet Driver. It includes basic Rx/Tx functionality, as well as the registration and interrupt code. This work provides the initial support for the HIBMCGE and would incrementally add features or enhancements.
====================

Link: https://patch.msgid.link/20241015123516.4035035-1-shaojijie@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net: hibmcge: Add maintainer for hibmcge  (Jijie Shao)

Add myself as the maintainer for the hibmcge ethernet driver.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net: hibmcge: Add a Makefile and update Kconfig for hibmcge  (Jijie Shao)

Add a Makefile and update Kconfig to build the hibmcge driver.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net: hibmcge: Implement some ethtool_ops functions  (Jijie Shao)

Implement the .get_drvinfo, .get_link and .get_link_ksettings functions to get the basic information and working status of the driver. Implement .set_link_ksettings to modify the rate, duplex, and auto-negotiation status.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
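The wiring this implies, sketched; the hbg_* names are hypothetical placeholders rather than the driver's actual symbols, while the phylib/ethtool helpers shown are the standard kernel ones:

    static void hbg_get_drvinfo(struct net_device *ndev,
                                struct ethtool_drvinfo *info)
    {
            strscpy(info->driver, "hibmcge", sizeof(info->driver));
    }

    static const struct ethtool_ops hbg_ethtool_ops_sketch = {
            .get_drvinfo        = hbg_get_drvinfo,
            .get_link           = ethtool_op_get_link,
            .get_link_ksettings = phy_ethtool_get_link_ksettings,
            .set_link_ksettings = phy_ethtool_set_link_ksettings,
    };

    /* assigned as netdev->ethtool_ops = &hbg_ethtool_ops_sketch; during probe */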
2024-10-21  net: hibmcge: Implement rx_poll function to receive packets  (Jijie Shao)

Implement the rx_poll function to read the rx descriptor after receiving the rx interrupt, and adjust the skb based on the descriptor to complete the reception of the packet.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
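The usual NAPI shape of such a poll function, as a hedged sketch; struct hbg_ring and all hbg_* helpers here are hypothetical, not the driver's actual code:

    struct hbg_ring {
            struct napi_struct napi;
            struct net_device *netdev;
    };

    static int hbg_rx_poll_sketch(struct napi_struct *napi, int budget)
    {
            struct hbg_ring *ring = container_of(napi, struct hbg_ring, napi);
            int done = 0;

            while (done < budget) {
                    /* hypothetical: build an skb from the next rx descriptor */
                    struct sk_buff *skb = hbg_next_rx_skb(ring);

                    if (!skb)
                            break;
                    skb->protocol = eth_type_trans(skb, ring->netdev);
                    napi_gro_receive(napi, skb);
                    done++;
            }

            if (done < budget && napi_complete_done(napi, done))
                    hbg_irq_enable_rx(ring);        /* hypothetical: re-arm rx irq */

            return done;
    }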
2024-10-21  net: hibmcge: Implement .ndo_start_xmit function  (Jijie Shao)

Implement the .ndo_start_xmit function to fill the information of the packet to be transmitted into the tx descriptor; the hardware then transmits the packet using the information in the tx descriptor.

In addition, implement the tx_handler function to enable tx descriptors to be reused, and the .ndo_tx_timeout function to print some information when the hardware is busy.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
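Schematically, the tx path has this shape (the hbg_* helpers are hypothetical; the real descriptor layout is hardware-specific):

    static netdev_tx_t hbg_start_xmit_sketch(struct sk_buff *skb,
                                             struct net_device *ndev)
    {
            if (!hbg_tx_ring_has_space(ndev)) {     /* hypothetical */
                    netif_stop_queue(ndev);
                    return NETDEV_TX_BUSY;
            }

            hbg_fill_tx_desc(ndev, skb);    /* hypothetical: map, fill descriptor */
            hbg_doorbell(ndev);             /* hypothetical: hand the desc to hw */

            return NETDEV_TX_OK;
    }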
2024-10-21  net: hibmcge: Implement some .ndo functions  (Jijie Shao)

Implement the .ndo_open(), .ndo_stop(), .ndo_set_mac_address() and .ndo_change_mtu() functions. The .ndo_validate_addr callback calls the eth_validate_addr() function directly.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net: hibmcge: Add interrupt supported in this module  (Jijie Shao)

The driver supports four interrupts: the TX interrupt, RX interrupt, mdio interrupt, and error interrupt. The driver does not actually use the mdio interrupt, and therefore does not request it.

The error interrupt distinguishes different error information by using different masks. To distinguish different errors, a statistics count is added for each error. To keep the code path consistent, masks are added for the TX interrupt and RX interrupt as well.

This patch implements the interrupt request and provides a unified entry for the interrupt handler functions. However, the specific handler of each interrupt is not implemented yet. Because of pcim_enable_device(), the interrupt vectors are already device-managed and do not need to be freed explicitly.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
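A sketch of the managed request pattern described; the vector indices, names, and hbg_* symbols are illustrative only:

    static irqreturn_t hbg_irq_handler(int irq, void *data)
    {
            /* unified entry; per-source handling is dispatched from here */
            return IRQ_HANDLED;
    }

    static int hbg_request_irqs_sketch(struct pci_dev *pdev, void *priv)
    {
            int err;

            /* with pcim_enable_device(), the vectors are device-managed */
            err = devm_request_irq(&pdev->dev, pci_irq_vector(pdev, 0),
                                   hbg_irq_handler, 0, "hibmcge-tx", priv);
            if (err)
                    return err;

            /* rx and error vectors follow the same pattern; the mdio
             * interrupt is not used and therefore not requested
             */
            return 0;
    }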
2024-10-21  net: hibmcge: Add mdio and hardware configuration supported in this module  (Jijie Shao)

Implement the C22 read and write PHY register interfaces. Some hardware interfaces related to the PHY are also implemented in this patch.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net: hibmcge: Add read/write registers supported through the bar space  (Jijie Shao)

Add support for reading and writing registers through the PCI bar space.

Some driver parameters, such as mac_id, are determined by the board form. Therefore, these parameters are initialized from registers as device specifications. The device specification registers are initialized and written by the BMC; the driver reads these registers when loading.

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-21  net: hibmcge: Add pci table supported in this module  (Jijie Shao)

Add the PCI table supported in this module, and implement the pci_driver functions to initialize this driver.

hibmcge is a passthrough network device. Its software runs on the host side, and the MAC hardware runs on the BMC side to reduce the host CPU area. The software interacts with the MAC hardware through PCIe.

    ┌─────────────────────────┐
    │ HOST CPU network device │
    │    ┌──────────────┐     │
    │    │hibmcge driver│     │
    │    └─────┬─┬──────┘     │
    │          │ │            │
    │HOST  ┌───┴─┴───┐        │
    │      │ PCIE RC │        │
    └──────┴───┬─┬───┴────────┘
               │ │ PCIE
    ┌──────┬───┴─┴───┬────────┐
    │      │ PCIE EP │        │
    │BMC   └───┬─┬───┘        │
    │          │ │            │
    │  ┌───────┴─┴─────────┐  │
    │  │        GE         │  │
    │  │ ┌─────┐   ┌─────┐ │  │
    │  │ │ MAC │   │ MAC │ │  │
    └──┴─┼─────┼───┼─────┼──┴─┘
         │ PHY │   │ PHY │
         └─────┘   └─────┘

Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
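The canonical shape of such a table and driver registration, sketched; 0x19e5 is the Huawei PCI vendor ID, but the device ID below is a placeholder, not the real hibmcge ID:

    static const struct pci_device_id hbg_pci_tbl_sketch[] = {
            { PCI_DEVICE(0x19e5, 0x3730) },         /* device ID illustrative */
            { }
    };
    MODULE_DEVICE_TABLE(pci, hbg_pci_tbl_sketch);

    static struct pci_driver hbg_driver_sketch = {
            .name     = "hibmcge",
            .id_table = hbg_pci_tbl_sketch,
            /* .probe / .remove elided in this sketch */
    };
    module_pci_driver(hbg_driver_sketch);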
2024-10-21  net: sfp: change quirks for Alcatel Lucent G-010S-P  (Shengyu Qu)

It seems the Alcatel Lucent G-010S-P has the same problem, in that it uses the TX_FAULT pin for the SOC UART. So apply sfp_fixup_ignore_tx_fault to it as well.

Signed-off-by: Shengyu Qu <wiagn233@outlook.com>
Link: https://patch.msgid.link/TYCPR01MB84373677E45A7BFA5A28232C98792@TYCPR01MB8437.jpnprd01.prod.outlook.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>