summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw/hfi1/ipoib.h
AgeCommit message (Collapse)Author
2021-04-07IB/hfi1: Remove indirect call to hfi1_ipoib_send_dma()Mike Marciniszyn
hfi1_ipoib_send() directly calls hfi1_ipoib_send_dma() with no value add. Fix by renaming hfi1_ipoib_send_dma() to hfi1_ipoib_send(). Link: https://lore.kernel.org/r/1617026056-50483-6-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07IB/hfi1: Correct oversized ring allocationMike Marciniszyn
The completion ring for tx is using the wrong size to size the ring, oversizing the ring by two orders of magniture. Correct the allocation size and use kcalloc_node() to allocate the ring. Fix mistaken GFP defines in similar allocations. Link: https://lore.kernel.org/r/1617026056-50483-4-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07IB/{ipoib,hfi1}: Add a timeout handler for rdma_netdevMike Marciniszyn
The current rdma_netdev handling in ipoib hooks the tx_timeout handler, but prints out a totally useless message that prevents effective debugging especially when multiple transmit queues are being used. Add a tx_timeout rdma_netdev hook and implement the callback in the hfi1 to print additional information. The existing non-helpful message is avoided when the driver has presented a callback. Link: https://lore.kernel.org/r/1617026056-50483-3-git-send-email-dennis.dalessandro@cornelisnetworks.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26RDMA: Support more than 255 rdma portsMark Bloch
Current code uses many different types when dealing with a port of a RDMA device: u8, unsigned int and u32. Switch to u32 to clean up the logic. This allows us to make (at least) the core view consistent and use the same type. Unfortunately not all places can be converted. Many uverbs functions expect port to be u8 so keep those places in order not to break UAPIs. HW/Spec defined values must also not be changed. With the switch to u32 we now can support devices with more than 255 ports. U32_MAX is reserved to make control logic a bit easier to deal with. As a device with U32_MAX ports probably isn't going to happen any time soon this seems like a non issue. When a device with more than 255 ports is created uverbs will report the RDMA device as having 255 ports as this is the max currently supported. The verbs interface is not changed yet because the IBTA spec limits the port size in too many places to be u8 and all applications that relies in verbs won't be able to cope with this change. At this stage, we are extending the interfaces that are using vendor channel solely Once the limitation is lifted mlx5 in switchdev mode will be able to have thousands of SFs created by the device. As the only instance of an RDMA device that reports more than 255 ports will be a representor device and it exposes itself as a RAW Ethernet only device CM/MAD/IPoIB and other ULPs aren't effected by this change and their sysfs/interfaces that are exposes to userspace can remain unchanged. While here cleanup some alignment issues and remove unneeded sanity checks (mainly in rdmavt), Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-11-12IB/hfi1: switch to core handling of rx/tx byte/packet countersHeiner Kallweit
Use netdev->tstats instead of a member of hfi1_ipoib_dev_priv for storing a pointer to the per-cpu counters. This allows us to use core functionality for statistics handling. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2020-06-24IB/hfi1: Add atomic triggered sleep/wakeupMike Marciniszyn
When running iperf in a two host configuration the following trace can occur: [ 319.728730] NETDEV WATCHDOG: ib0 (hfi1): transmit queue 0 timed out The issue happens because the current implementation relies on the netif txq being stopped to control the flushing of the tx list. There are two resources that the transmit logic can wait on and stop the txq: - SDMA descriptors - Ring space to hold completions The ring space is tested on the sending side and relieved when the ring is consumed in the napi tx reaping. Unfortunately, that reaping can run conncurrently with the workqueue flushing of the txlist. If the txq is started just before the workitem executes, the txlist will never be flushed, leading to the txq being stuck. Fix by: - Adding sleep/wakeup wrappers * Use an atomic to control the call to the netif routines inside the wrappers - Use another atomic to record ring space exhaustion * Only wakeup when the a ring space exhaustion has happened and it relieved Add additional wrappers to clarify the ring space resource handling. Fixes: d99dc602e2a5 ("IB/hfi1: Add functions to transmit datagram ipoib packets") Link: https://lore.kernel.org/r/20200623204327.108092.4024.stgit@awfm-01.aw.intel.com Reviewed-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-05-21IB/hfi1: Add rx functions for dummy netdevGrzegorz Andrejczuk
This patch adds the rx functions for the dummy netdev: - Functions to allocate/free the dummy netdev. - Functions to allocate/free receiving contexts for the netdev. - Functions to initialize/de-initialize the receive queue. - Functions to enable/disable the receive queue. Link: https://lore.kernel.org/r/20200511160643.173205.75087.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Add functions to receive accelerated ipoib packetsKaike Wan
Ipoib netdev will share receive contexts with existing VNIC netdev. To achieve that, a dummy netdev is allocated with hfi1_devdata to own the receive contexts, and ipoib and VNIC netdevs will be put on top of it. Each receive context is associated with a single NAPI object. This patch adds the functions to receive incoming packets for accelerated ipoib. Link: https://lore.kernel.org/r/20200511160631.173205.54184.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Grzegorz Andrejczuk <grzegorz.andrejczuk@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Add the transmit side of a datagram ipoib RDMA netdevGary Leshner
This implements the transmit side of the multiple transmit queue RDMA netdev used to accelerate ipoib. The receive side remains the ipoib internal implementation. The init/unint/open/stop netdev operations are saved off and called by the versions within the hfi1 netdev in order to initialize the connected mode resources present in ipoib thus allowing us to switch modes between datagram and connected. The datagram queue pair instantiated by the ipoib ulp is used by this implementation for its queue pair number and to register with multicast. The above queue pair is not used on transmit other than its qpn as the verbs layer is skipped and packets are directly submitted to the sdma engines. Link: https://lore.kernel.org/r/20200511160554.173205.1369.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21IB/hfi1: Add functions to transmit datagram ipoib packetsGary Leshner
This patch implements the mechanism to accelerate the transmit side of a multiple transmit queue RDMA netdev by submitting the packets to the SDMA engine directly instead of sending through the verbs layer. This patch also changes the UD/SEND_ONLY op to output the entropy value in byte 0 of deth[1]. UD/SEND_ONLY_WITH_IMMEDIATE uses the previous behavior with no entropy value being output. The code in the ipoib rdma netdev which submits tx requests upon successful submission will call trace_sdma_output_ibhdr to output the ibhdr to the trace buffer. Link: https://lore.kernel.org/r/20200511160548.173205.45616.stgit@awfm-01.aw.intel.com Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Gary Leshner <Gary.S.Leshner@intel.com> Signed-off-by: Kaike Wan <kaike.wan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>