summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw/mlx5/mlx5_ib.h
AgeCommit message (Collapse)Author
2019-06-24RDMA/mlx5: Use PA mapping for PI handoverMax Gurtovoy
If possibe, avoid doing a UMR operation to register data and protection buffers (via MTT/KLM mkeys). Instead, use the local DMA key and map the SG lists using PA access. This is safe, since the internal key for data and protection never exposed to the remote server (only signature key might be exposed). If PA mappings are not possible, perform mapping using MTT/KLM descriptors. The setup of the tested benchmark (using iSER ULP): - 2 servers with 24 cores (1 initiator and 1 target) - ConnectX-4/ConnectX-5 adapters - 24 target sessions with 1 LUN each - ramdisk backstore - PI active Performance results running fio (24 jobs, 128 iodepth) using write_generate=1 and read_verify=1 (w/w.o patch): bs IOPS(read) IOPS(write) ---- ---------- ---------- 512 1266.4K/1262.4K 1720.1K/1732.1K 4k 793139/570902 1129.6K/773982 32k 72660/72086 97229/96164 Using write_generate=0 and read_verify=0 (w/w.o patch): bs IOPS(read) IOPS(write) ---- ---------- ---------- 512 1590.2K/1600.1K 1828.2K/1830.3K 4k 1078.1K/937272 1142.1K/815304 32k 77012/77369 98125/97435 Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Suggested-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-24RDMA/mlx5: Improve PI handover performanceIsrael Rukshin
In some loads, there is performance degradation when using KLM mkey instead of MTT mkey. This is because KLM descriptor access is via indirection that might require more HW resources and cycles. Using KLM descriptor is not necessary when there are no gaps at the data/metadata sg lists. As an optimization, use MTT mkey whenever it is possible. For that matter, allocate internal MTT mkey and choose the effective pi_mr for in transaction according to the required mapping scheme. The setup of the tested benchmark (using iSER ULP): - 2 servers with 24 cores (1 initiator and 1 target) - ConnectX-4/ConnectX-5 adapters - 24 target sessions with 1 LUN each - ramdisk backstore - PI active Performance results running fio (24 jobs, 128 iodepth) using write_generate=1 and read_verify=1 (w/w.o/baseline): bs IOPS(read) IOPS(write) ---- ---------- ---------- 512 1262.4K/1243.3K/1147.1K 1732.1K/1725.1K/1423.8K 4k 570902/571233/457874 773982/743293/642080 32k 72086/72388/71933 96164/71789/93249 Using write_generate=0 and read_verify=0 (w/w.o patch): bs IOPS(read) IOPS(write) ---- ---------- ---------- 512 1600.1K/1572.1K/1393.3K 1830.3K/1823.5K/1557.2K 4k 937272/921992/762934 815304/753772/646071 32k 77369/75052/72058 97435/73180/94612 Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Suggested-by: Max Gurtovoy <maxg@mellanox.com> Suggested-by: Idan Burstein <idanb@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-24RDMA/core: Validate integrity handover device capMax Gurtovoy
Protect the case that a ULP tries to allocate a QP with signature enabled flag while the LLD doesn't support this feature. While we're here, also move integrity_en attribute from mlx5_qp to ib_qp as a preparation for adding new integrity API to the rw-API (that is part of ib_core module). Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-24RDMA/core: Rename signature qp create flag and signature device capabilityIsrael Rukshin
Rename IB_QP_CREATE_SIGNATURE_EN to IB_QP_CREATE_INTEGRITY_EN and IB_DEVICE_SIGNATURE_HANDOVER to IB_DEVICE_INTEGRITY_HANDOVER. Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-24RDMA/mlx5: Implement mlx5_ib_map_mr_sg_pi and mlx5_ib_alloc_mr_integrityMax Gurtovoy
mlx5_ib_map_mr_sg_pi() will map the PI and data dma mapped SG lists to the mlx5 memory region prior to the registration operation. In the new API, the mlx5 driver will allocate an internal memory region for the UMR operation to register both PI and data SG lists. The internal MR will use KLM mode in order to map 2 (possibly non-contiguous/non-align) SG lists using 1 memory key. In the new API, each ULP will use 1 memory region for the signature operation (instead of 3 in the old API). This memory region will have a key that will be exposed to remote server to perform RDMA operation. The internal memory key that will map the SG lists will stay private. Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Israel Rukshin <israelr@mellanox.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-20RDMA: Convert destroy_wq to be voidLeon Romanovsky
All callers of destroy WQ are always success and there is no need to check their return value, so convert destroy_wq to be void. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-18Merge remote-tracking branch 'mlx5-next/mlx5-next' into HEADDoug Ledford
Take mlx5-next so we can take a dependent two patch series next. Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-13net/mlx5: Change interrupt handler to call chain notifierYuval Avnery
Multiple EQs may share the same IRQ in subsequent patches. Instead of calling the IRQ handler directly, the EQ will register to an atomic chain notfier. The Linux built-in shared IRQ is not used because it forces the caller to disable the IRQ and clear affinity before free_irq() can be called. This patch is the first step in the separation of IRQ and EQ logic. Signed-off-by: Yuval Avnery <yuvalav@mellanox.com> Reviewed-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2019-06-11RDMA: Convert CQ allocations to be under core responsibilityLeon Romanovsky
Ensure that CQ is allocated and freed by IB/core and not by drivers. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Tested-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-11RDMA: Clean destroy CQ in drivers do not return errorsLeon Romanovsky
Like all other destroy commands, .destroy_cq() call is not supposed to fail. In all flows, the attempt to return earlier caused to memory leaks. This patch converts .destroy_cq() to do not return any errors. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Gal Pressman <galpress@amazon.com> Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-05-06IB/mlx5: Add steering SW ICM device memory typeAriel Levkovich
This patch adds support for allocating, deallocating and registering a new device memory type, STEERING_SW_ICM. This memory can be allocated and used by a privileged user for direct rule insertion and management of the device's steering tables. The type is provided by the user via the dedicated attribute in the alloc_dm ioctl command. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-06IB/mlx5: Support device memory type attributeAriel Levkovich
This patch intoruduces a new mlx5_ib driver attribute to the DM allocation method - the DM type. In order to allow addition of new types in downstream patches this patch also refactors the allocation, deallocation and registration handlers to consider the requested type and perform the necessary actions according to it. Since not all future device memory types will be such that are mapped to user memory, the mandatory page index output attribute is modified to be optional. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Eli Cohen <eli@mellanox.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-22RDMA/mlx5: Access the prio bypass inside the FDB flow table namespaceMark Bloch
Now that we have a specific prio inside the FDB namespace allow retrieving it from the RDMA side. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10RDMA/mlx5: Remove VF representor profileMark Bloch
Now that we have a single IB device with multiple ports we can remove the VF representor profile. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10RDMA/mlx5: Move to single device multiport ports in switchdev modeMark Bloch
Move from IB device (representor) per virtual function to single IB device with port per virtual function (port 1 represents the uplink). As number of ports is a static property of an IB device, declare the IB device with as many port as the possible according to the PCI bus. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10RDMA/mlx5: Move rep into port structMark Bloch
In preparation of moving into a model of single IB device multiple ports move rep to be part of the port structure. We mark a representor device by setting is_rep, no functional change with this patch. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-10RDMA/mlx5: Move netdev info into the port structMark Bloch
Netdev info is stored in a separate array and holds data relevant on a per port basis, move it to be part of the port struct. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08RDMA: Handle SRQ allocations by IB/coreLeon Romanovsky
Convert SRQ allocation from drivers to be in the IB/core Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08RDMA: Handle AH allocations by IB/coreLeon Romanovsky
Simplify drivers by ensuring lifetime of ib_ah object. The changes in .create_ah() go hand in hand with relevant update in .destroy_ah(). We will use this opportunity and convert .destroy_ah() to don't fail, as it was suggested a long time ago, because there is nothing to do in case of failure during destroy. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08IB/mlx5: Remove references to uboject->contextJason Gunthorpe
These should all go through udata now. Add mlx5_udata_to_mdev to convert a udata into the struct mlx5_ib_dev as these call sites require. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01IB: Pass only ib_udata in function prototypesShamir Rabinovitch
Now when ib_udata is passed to all the driver's object create/destroy APIs the ib_udata will carry the ib_ucontext for every user command. There is no need to also pass the ib_ucontext via the functions prototypes. Make ib_udata the only argument psssed. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01IB: Pass uverbs_attr_bundle down ib_x destroy pathShamir Rabinovitch
The uverbs_attr_bundle with the ucontext is sent down to the drivers ib_x destroy path as ib_udata. The next patch will use the ib_udata to free the drivers destroy path from the dependency in 'uobject->context' as we already did for the create path. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-21IB/mlx5: Protect against prefetch of invalid MRMoni Shoua
When deferring a prefetch request we need to protect against MR or PD being destroyed while the request is still enqueued. The first step is to validate that PD owns the lkey that describes the MR and that the MR that the lkey refers to is owned by that PD. The second step is to dequeue all requests when MR is destroyed. Since PD can't be destroyed while it owns MRs it is guaranteed that when a worker wakes up the request it refers to is still valid. Now, it is possible to refrain from taking a reference on the device since it is assured to be present as pd. While that, replace the dedicated ordered workqueue with the system unbound workqueue to reuse an existing resource and improve performance. This will also fix a bug of queueing to the wrong workqueue. Fixes: 813e90b1aeaa ("IB/mlx5: Add advise_mr() support") Reported-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04IB/mlx5: Let read user wqe also from SRQ bufferMoni Shoua
Reading a WQE from SRQ is almost identical to reading from regular RQ. The differences are the size of the queue, the size of a WQE and buffer location. Make necessary changes to mlx5_ib_read_user_wqe() to let it read a WQE from a SRQ or RQ by caller choice. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-29Merge branch 'devx-async' into k.o/for-nextJason Gunthorpe
Yishai Hadas says: Enable DEVX asynchronous query commands This series enables querying a DEVX object in an asynchronous mode. The userspace application won't block when calling the firmware and it will be able to get the response back once that it will be ready. To enable the above functionality: - DEVX asynchronous command completion FD object was introduced. - The applicable file operations were implemented to enable using it by the user application. - Query asynchronous method was added to the DEVX object, it will call the firmware asynchronously and manages the response on the given input FD. - Hot unplug support was added for the FD to work properly upon unbind/disassociate. - mlx5 core fence for asynchronous commands was implemented and used to prevent racing upon unbind/disassociate. This branch is based on mlx5-next & v5.0-rc2 due to dependencies, from git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux * branch 'devx-async': IB/mlx5: Implement DEVX hot unplug for async command FD IB/mlx5: Implement the file ops of DEVX async command FD IB/mlx5: Introduce async DEVX obj query API IB/mlx5: Introduce MLX5_IB_OBJECT_DEVX_ASYNC_CMD_FD Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-24infiniband: mlx5: no need to check return value of debugfs_create functionsGreg Kroah-Hartman
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-24net/mlx5: Make mlx5_cmd_exec_cb() a safe APIJason Gunthorpe
APIs that have deferred callbacks should have some kind of cleanup function that callers can use to fence the callbacks. Otherwise things like module unloading can lead to dangling function pointers, or worse. The IB MR code is the only place that calls this function and had a really poor attempt at creating this fence. Provide a good version in the core code as future patches will add more places that need this fence. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2019-01-21IB/mlx5: Manage indirection mkey upon DEVX flow for ODPYishai Hadas
Manage indirection mkey upon DEVX flow to support ODP. To support a page fault event on the indirection mkey it needs to be part of the device mkey radix tree. Both the creation and the deletion flows for a DEVX object which is indirection mkey were adapted to handle that. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-15RDMA/mad: Reduce MAD scope to mlx5_ib onlyLeon Romanovsky
Management Datagram Interface (MAD) is applicable only when physical port is Infiniband. It makes MAD command logic to be completely unrelated to eth/core parts of mlx5. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Acked-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-10IB/{core,hw}: Have ib_umem_get extract the ib_ucontext from ib_udataJason Gunthorpe
ib_umem_get() can only be called in a method callback, which always has a udata parameter. This allows ib_umem_get() to derive the ucontext pointer directly from the udata without requiring the drivers to find it in some way or another. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
2019-01-08RDMA/mlx5: Introduce and reuse helper to identify ODP MRLeon Romanovsky
Consolidate various checks if MR is ODP backed to one simple helper and update call sites to use it. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-08RDMA: Clean structures from CONFIG_INFINIBAND_ON_DEMAND_PAGINGLeon Romanovsky
CONFIG_INFINIBAND_ON_DEMAND_PAGING is used in general structures to micro-optimize the memory footprint. Remove it, so it will allow us to simplify various ODP device flows. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-28Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "This has been a fairly typical cycle, with the usual sorts of driver updates. Several series continue to come through which improve and modernize various parts of the core code, and we finally are starting to get the uAPI command interface cleaned up. - Various driver fixes for bnxt_re, cxgb3/4, hfi1, hns, i40iw, mlx4, mlx5, qib, rxe, usnic - Rework the entire syscall flow for uverbs to be able to run over ioctl(). Finally getting past the historic bad choice to use write() for command execution - More functional coverage with the mlx5 'devx' user API - Start of the HFI1 series for 'TID RDMA' - SRQ support in the hns driver - Support for new IBTA defined 2x lane widths - A big series to consolidate all the driver function pointers into a big struct and have drivers provide a 'static const' version of the struct instead of open coding initialization - New 'advise_mr' uAPI to control device caching/loading of page tables - Support for inline data in SRPT - Modernize how umad uses the driver core and creates cdev's and sysfs files - First steps toward removing 'uobject' from the view of the drivers" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (193 commits) RDMA/srpt: Use kmem_cache_free() instead of kfree() RDMA/mlx5: Signedness bug in UVERBS_HANDLER() IB/uverbs: Signedness bug in UVERBS_HANDLER() IB/mlx5: Allocate the per-port Q counter shared when DEVX is supported IB/umad: Start using dev_groups of class IB/umad: Use class_groups and let core create class file IB/umad: Refactor code to use cdev_device_add() IB/umad: Avoid destroying device while it is accessed IB/umad: Simplify and avoid dynamic allocation of class IB/mlx5: Fix wrong error unwind IB/mlx4: Remove set but not used variable 'pd' RDMA/iwcm: Don't copy past the end of dev_name() string IB/mlx5: Fix long EEH recover time with NVMe offloads IB/mlx5: Simplify netdev unbinding IB/core: Move query port to ioctl RDMA/nldev: Expose port_cap_flags2 IB/core: uverbs copy to struct or zero helper IB/rxe: Reuse code which sets port state IB/rxe: Make counters thread safe IB/mlx5: Use the correct commands for UMEM and UCTX allocation ...
2018-12-20Merge branch 'mlx5-next' into rdma.gitJason Gunthorpe
From git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux mlx5 updates taken for dependencies on following patches. * branche 'mlx5-next': (23 commits) IB/mlx5: Introduce uid as part of alloc/dealloc transport domain net/mlx5: Add shared Q counter bits net/mlx5: Continue driver initialization despite debugfs failure net/mlx5: Fold the modify lag code into function net/mlx5: Add lag affinity info to log net/mlx5: Split the activate lag function into two routines net/mlx5: E-Switch, Introduce flow counter affinity IB/mlx5: Unify e-switch representors load approach between uplink and VFs net/mlx5: Use lowercase 'X' for hex values net/mlx5: Remove duplicated include from eswitch.c net/mlx5: Remove the get protocol device interface entry net/mlx5: Support extended destination format in flow steering command net/mlx5: E-Switch, Change vhca id valid bool field to bit flag net/mlx5: Introduce extended destination fields net/mlx5: Revise gre and nvgre key formats net/mlx5: Add monitor commands layout and event data net/mlx5: Add support for plugged-disabled cable status in PME net/mlx5: Add support for PCIe power slot exceeded error in PME net/mlx5: Rework handling of port module events net/mlx5: Move flow counters data structures from flow steering header ...
2018-12-19RDMA: Mark if destroy address handle is in a sleepable contextGal Pressman
Introduce a 'flags' field to destroy address handle callback and add a flag that marks whether the callback is executed in an atomic context or not. This will allow drivers to wait for completion instead of polling for it when it is allowed. Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19RDMA: Mark if create address handle is in a sleepable contextGal Pressman
Introduce a 'flags' field to create address handle callback and add a flag that marks whether the callback is executed in an atomic context or not. This will allow drivers to wait for completion instead of polling for it when it is allowed. Signed-off-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19IB/mlx5: Fix compile issue when ODP disabledDoug Ledford
When CONFIG_INFINIBAND_ON_DEMAND_PAGING is not enabled, we were getting build failures for defined but not used code. Fix that. Fixes: 813e90b1aeaa ("IB/mlx5: Add advise_mr() support") Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-18IB/mlx5: Add advise_mr() supportMoni Shoua
The verb advise_mr() is used to give advice to the kernel about an address range that belongs to a MR. Implement the verb and register it on the device. The current implementation supports the only known advice to date, prefetch. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-18RDMA/mlx5: Delete unreachable handle_atomic code by simplifying SW completionLeon Romanovsky
Handle atomic was left as unimplemented from 2013, remove the code till this part will be developed. Remove the dead code by simplifying SW completion logic which is supposed to be the same for send and receive paths. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> # compile tested Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-14net/mlx5: Make RoCE and SR-IOV LAG modes explicitAviv Heller
With the introduction of SR-IOV LAG, checking whether LAG is active is no longer good enough, since RoCE and SR-IOV LAG each entails different behavior by both the core and infiniband drivers. This patch introduces facilities to discern LAG type, in addition to mlx5_lag_is_active(). These are implemented in such a way as to allow more complex mode combinations in the future. Signed-off-by: Aviv Heller <avivh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-14IB/mlx5: Unify e-switch representors load approach between uplink and VFsMark Bloch
When in switchdev mode and the add function is called by the core level driver, make sure we only register the callbacks, but don't create the mlx5 IB device or initialize anything. With this change all the IB devices in switchdev mode are created only once the load callback is invoked by the e-switch core sub-module. This follows the design paradigm under which the all the Eth representors must be loaded before any of IB reprs is loaded. Signed-off-by: Mark Bloch <markb@mellanox.com> Acked-by: Or Gerlitz <ogerlitz@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-11IB/mlx5: Remove duplicated include from mlx5_ib.hYueHaibing
Remove duplicated include. Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-07Merge branch 'mlx5-packet-credit-fc' into rdma.gitJason Gunthorpe
Danit Goldberg says: Packet based credit mode Packet based credit mode is an alternative end-to-end credit mode for QPs set during their creation. Credits are transported from the responder to the requester to optimize the use of its receive resources. In packet-based credit mode, credits are issued on a per packet basis. The advantage of this feature comes while sending large RDMA messages through switches that are short in memory. The first commit exposes QP creation flag and the HCA capability. The second commit adds support for a new DV QP creation flag. The last commit report packet based credit mode capability via the MLX5DV device capabilities. * branch 'mlx5-packet-credit-fc': IB/mlx5: Report packet based credit mode device capability IB/mlx5: Add packet based credit mode support net/mlx5: Expose packet based credit mode Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-07IB/mlx5: Add packet based credit mode supportDanit Goldberg
The device can support two credit modes, message based (default) and packet based. In order to enable packet based mode, the QP should be created with special flag that indicates this. This patch adds support for the new DV QP creation flag that can be used for RC QPs in order to change the credit mode. Signed-off-by: Danit Goldberg <danitg@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-04IB/mlx5: Allow XRC usage via verbs in DEVX contextYishai Hadas
Allows XRC usage from the verbs flow in a DEVX context. As XRCD is some shared kernel resource between processes it should be created with UID=0 to point on that. As a result once XRC QP/SRQ are created they must be used as well with UID=0 so that firmware will allow the XRCD usage. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-04IB/mlx5: Enforce DEVX privilege by firmwareYishai Hadas
Enforce DEVX privilege by firmware, this enables future device functionality without the need to make driver changes unless a new privilege type will be introduced. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-04Merge 'mlx5-next' into mlx5-devxDoug Ledford
The enhanced devx support series needs commit: 9d43faac02e3 ("net/mlx5: Update mlx5_ifc with DEVX UCTX capabilities bits") Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-12-04RDMA/mlx5: Initialize SRQ tables on mlx5_ibLeon Romanovsky
Transfer initialization and cleanup from mlx5_priv struct of mlx5_core_dev to be part of mlx5_ib_dev. This completes removal of SRQ from mlx5_core. Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-12-04RDMA/mlx5: Use stages for callback to setup and release DEVXLeon Romanovsky
Reuse existing infrastructure to initialize and release DEVX uid. The DevX interface is intended for user space access, so it is supposed to be initialized before ib_register_device(). Also it isn't supported in switchdev mode and don't need to initialize it in that mode. Fixes: 76dc5a8406bf ("IB/mlx5: Manage device uid for DEVX white list commands") Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-11-29IB/mlx5: Use the new mlx5 core notifier APISaeed Mahameed
Remove the deprecated mlx5_interface->event mlx5_ib callback and use new mlx5 notifier API to subscribe for mlx5 events. For native mlx5_ib devices profiles pf_profile/nic_rep_profile register the notifier callback mlx5_ib_handle_event which treats the notifier context as mlx5_ib_dev. For vport repesentors, don't register any notifier, same as before, they didn't receive any mlx5 events. For slave port (mlx5_ib_multiport_info) register a different notifier callback mlx5_ib_event_slave_port, which knows that the event is coming for mlx5_ib_multiport_info and prepares the event job accordingly. Before this on the event handler work we had to ask mlx5_core if this is a slave port mlx5_core_is_mp_slave(work->dev), now it is not needed anymore. mlx5_ib_multiport_info notifier registration is done on mlx5_ib_bind_slave_port and de-registration is done on mlx5_ib_unbind_slave_port. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>