summaryrefslogtreecommitdiff
path: root/drivers/infiniband/hw/mlx5/qp.c
AgeCommit message (Collapse)Author
2019-02-07IB/mlx5: Simplify WQE count power of two checkGal Pressman
Use is_power_of_2() instead of hard coding it in the driver. While at it, fix the meaningless error print. Signed-off-by: Gal Pressman <galpress@amazon.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-05IB/mlx5: Do not use hw_access_flags for be and CPU dataBart Van Assche
Avoid that sparse reports the following for the mlx5 driver: drivers/infiniband/hw/mlx5/qp.c:2671:34: warning: invalid assignment: |= drivers/infiniband/hw/mlx5/qp.c:2671:34: left side has type restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2671:34: right side has type int drivers/infiniband/hw/mlx5/qp.c:2679:34: warning: invalid assignment: |= drivers/infiniband/hw/mlx5/qp.c:2679:34: left side has type restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2679:34: right side has type int drivers/infiniband/hw/mlx5/qp.c:2680:34: warning: invalid assignment: |= drivers/infiniband/hw/mlx5/qp.c:2680:34: left side has type restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2680:34: right side has type int drivers/infiniband/hw/mlx5/qp.c:2684:34: warning: invalid assignment: |= drivers/infiniband/hw/mlx5/qp.c:2684:34: left side has type restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2684:34: right side has type int drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: incorrect type in argument 1 (different base types) drivers/infiniband/hw/mlx5/qp.c:2686:28: expected unsigned int [usertype] val drivers/infiniband/hw/mlx5/qp.c:2686:28: got restricted __be32 [usertype] drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 drivers/infiniband/hw/mlx5/qp.c:2686:28: warning: cast from restricted __be32 This patch does not change any functionality. Fixes: a60109dc9a95 ("IB/mlx5: Add support for extended atomic operations") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04Merge tag 'v5.0-rc5' into rdma.git for-nextJason Gunthorpe
Linux 5.0-rc5 Needed to merge the include/uapi changes so we have an up to date single-tree for these files. Patches already posted are also expected to need this for dependencies.
2019-02-04IB/mlx5: Let read user wqe also from SRQ bufferMoni Shoua
Reading a WQE from SRQ is almost identical to reading from regular RQ. The differences are the size of the queue, the size of a WQE and buffer location. Make necessary changes to mlx5_ib_read_user_wqe() to let it read a WQE from a SRQ or RQ by caller choice. Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-21RDMA/mlx5: Fix check for supported user flags when creating a QPMark Bloch
When the flags verification was added two flags were missed from the check: * MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_UC * MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_MC This causes user applications that were using these flags to break. Fixes: 2e43bb31b8df ("IB/mlx5: Verify that driver supports user flags") Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-10IB/{core,hw}: Have ib_umem_get extract the ib_ucontext from ib_udataJason Gunthorpe
ib_umem_get() can only be called in a method callback, which always has a udata parameter. This allows ib_umem_get() to derive the ucontext pointer directly from the udata without requiring the drivers to find it in some way or another. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
2019-01-02IB/mlx5: Allow XRC INI usage via verbs in DEVX contextYishai Hadas
From device point of view both XRC target and initiator are XRC transport type. Fix to use the expected UID as was handled for the XRC target case to allow its usage via verbs in DEVX context. Fixes: 5aa3771ded54 ("IB/mlx5: Allow XRC usage via verbs in DEVX context") Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-28Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma updates from Jason Gunthorpe: "This has been a fairly typical cycle, with the usual sorts of driver updates. Several series continue to come through which improve and modernize various parts of the core code, and we finally are starting to get the uAPI command interface cleaned up. - Various driver fixes for bnxt_re, cxgb3/4, hfi1, hns, i40iw, mlx4, mlx5, qib, rxe, usnic - Rework the entire syscall flow for uverbs to be able to run over ioctl(). Finally getting past the historic bad choice to use write() for command execution - More functional coverage with the mlx5 'devx' user API - Start of the HFI1 series for 'TID RDMA' - SRQ support in the hns driver - Support for new IBTA defined 2x lane widths - A big series to consolidate all the driver function pointers into a big struct and have drivers provide a 'static const' version of the struct instead of open coding initialization - New 'advise_mr' uAPI to control device caching/loading of page tables - Support for inline data in SRPT - Modernize how umad uses the driver core and creates cdev's and sysfs files - First steps toward removing 'uobject' from the view of the drivers" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (193 commits) RDMA/srpt: Use kmem_cache_free() instead of kfree() RDMA/mlx5: Signedness bug in UVERBS_HANDLER() IB/uverbs: Signedness bug in UVERBS_HANDLER() IB/mlx5: Allocate the per-port Q counter shared when DEVX is supported IB/umad: Start using dev_groups of class IB/umad: Use class_groups and let core create class file IB/umad: Refactor code to use cdev_device_add() IB/umad: Avoid destroying device while it is accessed IB/umad: Simplify and avoid dynamic allocation of class IB/mlx5: Fix wrong error unwind IB/mlx4: Remove set but not used variable 'pd' RDMA/iwcm: Don't copy past the end of dev_name() string IB/mlx5: Fix long EEH recover time with NVMe offloads IB/mlx5: Simplify netdev unbinding IB/core: Move query port to ioctl RDMA/nldev: Expose port_cap_flags2 IB/core: uverbs copy to struct or zero helper IB/rxe: Reuse code which sets port state IB/rxe: Make counters thread safe IB/mlx5: Use the correct commands for UMEM and UCTX allocation ...
2018-12-18RDMA: Cleanup undesired pd->uobject usageShamir Rabinovitch
Drivers should be using udata to determine if a method is invoked from user space or kernel space. A pd does not necessarily say a different objects is kernel or user. Transforming the tests to use udata eliminates a large number of uobject references from the drivers. Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-18RDMA/mlx5: Delete unreachable handle_atomic code by simplifying SW completionLeon Romanovsky
Handle atomic was left as unimplemented from 2013, remove the code till this part will be developed. Remove the dead code by simplifying SW completion logic which is supposed to be the same for send and receive paths. Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Tested-by: Stephen Rothwell <sfr@canb.auug.org.au> # compile tested Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-14net/mlx5: Make RoCE and SR-IOV LAG modes explicitAviv Heller
With the introduction of SR-IOV LAG, checking whether LAG is active is no longer good enough, since RoCE and SR-IOV LAG each entails different behavior by both the core and infiniband drivers. This patch introduces facilities to discern LAG type, in addition to mlx5_lag_is_active(). These are implemented in such a way as to allow more complex mode combinations in the future. Signed-off-by: Aviv Heller <avivh@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2018-12-11Merge tag 'v4.20-rc6' into rdma.git for-nextJason Gunthorpe
For dependencies in following patches.
2018-12-11IB/core: Add new IB ratesMichael Guralnik
Add the new rates that were added to Infiniband spec as part of HDR and 2x support. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-07Merge branch 'mlx5-packet-credit-fc' into rdma.gitJason Gunthorpe
Danit Goldberg says: Packet based credit mode Packet based credit mode is an alternative end-to-end credit mode for QPs set during their creation. Credits are transported from the responder to the requester to optimize the use of its receive resources. In packet-based credit mode, credits are issued on a per packet basis. The advantage of this feature comes while sending large RDMA messages through switches that are short in memory. The first commit exposes QP creation flag and the HCA capability. The second commit adds support for a new DV QP creation flag. The last commit report packet based credit mode capability via the MLX5DV device capabilities. * branch 'mlx5-packet-credit-fc': IB/mlx5: Report packet based credit mode device capability IB/mlx5: Add packet based credit mode support net/mlx5: Expose packet based credit mode Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-07IB/mlx5: Add packet based credit mode supportDanit Goldberg
The device can support two credit modes, message based (default) and packet based. In order to enable packet based mode, the QP should be created with special flag that indicates this. This patch adds support for the new DV QP creation flag that can be used for RC QPs in order to change the credit mode. Signed-off-by: Danit Goldberg <danitg@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-04IB/mlx5: Allow XRC usage via verbs in DEVX contextYishai Hadas
Allows XRC usage from the verbs flow in a DEVX context. As XRCD is some shared kernel resource between processes it should be created with UID=0 to point on that. As a result once XRC QP/SRQ are created they must be used as well with UID=0 so that firmware will allow the XRCD usage. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-11-29IB/mlx5: Use fragmented QP's buffer for in-kernel usersGuy Levi
The current implementation of create QP requires contiguous memory, such a requirement is problematic once the memory is fragmented or the system is low in memory, it causes failures in dma_zalloc_coherent(). This patch takes advantage of the new mlx5_core API which allocates a fragmented buffer. This makes the QP creation much more resilient to memory fragmentation. Data-path code was adapted to the fact that WQEs can cross buffers. We also use the opportunity to fix some cosmetic legacy coding convention errors which were in the feature scope. Signed-off-by: Guy Levi <guyle@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-21IB/mlx5: Allow modify AV in DCI QP to RTRArtemy Kovalyov
This is required so the user can set the SL on the DC QP. Signed-off-by: Artemy Kovalyov <artemyko@mellanox.com> Reviewed-by: Yossi Itigin <yosefe@mellanox.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-21IB/mlx5: Fix XRC QP support after introducing extended atomicYonatan Cohen
Extended atomics are supported with RC and XRC QP types, but the commit citied in the Fixes line added an unneeded check to to_mlx5_access_flags. This broke XRC QPs. The following ib_atomic_bw invocation over XRC reproduces the issue: ib_atomic_bw -d mlx5_1 --connection=XRC --atomic_type=FETCH_AND_ADD It is safe to remove such checks because the QP type was already checked in ib_modify_qp_is_ok(), which was previously called from mlx5_ib_modify_qp. Fixes: a60109dc9a95 ("IB/mlx5: Add support for extended atomic operations") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-21RDMA/mlx5: Fix fence type for IB_WR_LOCAL_INV WRMajd Dibbiny
Currently, for IB_WR_LOCAL_INV WR, when the next fence is None, the current fence will be SMALL instead of Normal Fence. Without this patch krping doesn't work on CX-5 devices and throws following error: The error messages are from CX5 driver are: (from server side) [ 710.434014] mlx5_0:dump_cqe:278:(pid 2712): dump error cqe [ 710.434016] 00000000 00000000 00000000 00000000 [ 710.434016] 00000000 00000000 00000000 00000000 [ 710.434017] 00000000 00000000 00000000 00000000 [ 710.434018] 00000000 93003204 100000b8 000524d2 [ 710.434019] krping: cq completion failed with wr_id 0 status 4 opcode 128 vender_err 32 Fixed the logic to set the correct fence type. Fixes: 6e8484c5cf07 ("RDMA/mlx5: set UMR wqe fence according to HCA cap") Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-17IB/mlx5: Add support for extended atomic operationsYonatan Cohen
Extended atomic operations cmp&swp and fetch&add is a Mellanox feature extending the standard atomic operation to use, varied operand sizes, as apposed to normal atomic operation that use an 8 byte operand only. Extended atomics allows masking the results and arguments. This patch configures QP to support extended atomic operation with the maximum size possible, as exposed by HCA capabilities. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-17IB/mlx5: Allow scatter to CQE without global signaled WRsYonatan Cohen
Requester scatter to CQE is restricted to QPs configured to signal all WRs. This patch adds ability to enable scatter to cqe (force enable) in the requester without sig_all, for users who do not want all WRs signaled but rather just the ones whose data found in the CQE. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-17IB/mlx5: Verify that driver supports user flagsYonatan Cohen
Flags sent down from user might not be supported by running driver. This might lead to unwanted bugs. To solve this, added macro to test for unsupported flags. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-17IB/mlx5: Support scatter to CQE for DC transport typeYonatan Cohen
Scatter to CQE is a HW offload that saves PCI writes by scattering the payload to the CQE. This patch extends already existing functionality to support DC transport type. Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Guy Levi <guyle@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-16RDMA/mlx5: Remove extraneous error checkGal Pressman
Remove double error check from create user RQ error flow. Fixes: 79b20a6c3014 ("IB/mlx5: Add receive Work Queue verbs") Signed-off-by: Gal Pressman <pressmangal@gmail.com> Reviewed-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-03RDMA: Remove unused parameter from ib_modify_qp_is_ok()Kamal Heib
The ll parameter is not used in ib_modify_qp_is_ok(), so remove it. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-27IB/mlx5: Expose RAW QP device handles to user spaceYishai Hadas
Expose RAW QP device handles to user space by extending the UHW part of mlx5_ib_create_qp_resp. This data is returned only when DEVX context is used where it may be applicable. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-26RDMA/drivers: Use dev_err/dbg/etc instead of pr_* + ibdev->nameJason Gunthorpe
Kernel convention is that a driver for a subsystem will print using dev_* on the subsystem's struct device, or with dev_* on the physical device. Drivers should rarely use a pr_* function. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25IB/mlx5: Set uid as part of XRCD commandsYishai Hadas
Set uid as part of XRCD commands so that the firmware can manage the XRCD object in a secured way. That will enable using an XRCD that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25IB/mlx5: Set uid as part of RQT commandsYishai Hadas
Set uid as part of RQT commands so that the firmware can manage the RQT object in a secured way. That will enable using an RQT that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25IB/mlx5: Set uid as part of TIS commandsYishai Hadas
Set uid as part of TIS commands so that the firmware can manage the TIS object in a secured way. That will enable using a TIS that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25IB/mlx5: Set uid as part of TIR commandsYishai Hadas
Set uid as part of TIR commands so that the firmware can manage the TIR object in a secured way. That will enable using a TIR that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25IB/mlx5: Set uid as part of DCT commandsYishai Hadas
Set uid as part of DCT create command so that the firmware can manage the DCT object in a secured way. The uid for the destroy and drain commands are set by mlx5_core. That will enable using a DCT that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25IB/mlx5: Set uid as part of SQ commandsYishai Hadas
Set uid as part of SQ commands so that the firmware can manage the SQ object in a secured way. The uid for the destroy command is set by mlx5_core. This will enable using an SQ that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25IB/mlx5: Set uid as part of RQ commandsYishai Hadas
Set uid as part of RQ commands so that the firmware can manage the RQ object in a secured way. The uid for the destroy command is set by mlx5_core. This will enable using an RQ that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-25IB/mlx5: Set uid as part of QP creationYishai Hadas
Set uid as part of QP creation so that the firmware can manage the QP object in a secured way. The uid for the destroy and the modify commands is set by mlx5_core. This will enable using a QP that was created by verbs application to be used by the DEVX flow in case the uid is equal. Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-21Merge branch 'mlx5-vport-loopback' into rdma.getDoug Ledford
For dependencies, branch based on 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux.git mlx5 mcast/ucast loopback control enhancements from Leon Romanovsky: ==================== This is short series from Mark which extends handling of loopback traffic. Originally mlx5 IB dynamically enabled/disabled both unicast and multicast based on number of users. However RAW ethernet QPs need more granular access. ==================== Fixed failed automerge in mlx5_ib.h (minor context conflict issue) mlx5-vport-loopback branch: RDMA/mlx5: Enable vport loopback when user context or QP mandate RDMA/mlx5: Allow creating RAW ethernet QP with loopback support RDMA/mlx5: Refactor transport domain bookkeeping logic net/mlx5: Rename incorrect naming in IFC file Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21RDMA/mlx5: Enable vport loopback when user context or QP mandateMark Bloch
A user can create a QP which can accept loopback traffic, but that's not enough. We need to enable loopback on the vport as well. Currently vport loopback is enabled only when more than 1 users are using the IB device, update the logic to consider whatever a QP which supports loopback was created, if so enable vport loopback even if there is only a single user. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-21RDMA/mlx5: Allow creating RAW ethernet QP with loopback supportMark Bloch
Expose two new flags: MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_UC MLX5_QP_FLAG_TIR_ALLOW_SELF_LB_MC Those flags can be used at creation time in order to allow a QP to be able to receive loopback traffic (unicast and multicast). We store the state in the QP to be used on the destroy path to indicate with which flags the QP was created with. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-09-22net/mlx5: Rename incorrect naming in IFC fileMark Bloch
Remove a trailing underscore from the multicast/unicast names. Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2018-09-12IB/mlx5: Allow transition of DCI QP to resetMoni Shoua
The transition is allowed from any state and the atrribute mask must be IB_QP_STATE. Fixes: c32a4f296e1d ("IB/mlx5: Add support for DC Initiator QP") Signed-off-by: Moni Shoua <monis@mellanox.com> Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-06IB/mlx5: Don't hold spin lock while checking device stateParav Pandit
mdev->state device state is not protected by the QP for which WRs are being processed. Therefore, there is no need to hold spin lock while checking mdev state. Given that device fatal error is unlikely situation, wrap the condition check with unlikely(). Additionally, kernel QP1 is also a kernel ULP for which soft CQEs needs to be generated. Therefore, check for device fatal error before processing QP1 work requests. Fixes: 89ea94a7b6c4 ("IB/mlx5: Reset flow support for IB kernel ULPs") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-04IB/mlx5: Change TX affinity assignment in RoCE LAG modeMajd Dibbiny
In the current code, the TX affinity is per RoCE device, which can cause unfairness between different contexts. e.g. if we open two contexts, and each open 10 QPs concurrently, all of the QPs of the first context might end up on the first port instead of distributed on the two ports as expected To overcome this unfairness between processes, we maintain per device TX affinity, and per process TX affinity. The allocation algorithm is as follow: 1. Hold two tx_port_affinity atomic variables, one per RoCE device and one per ucontext. Both initialized to 0. 2. In mlx5_ib_alloc_ucontext do: 2.1. ucontext.tx_port_affinity = device.tx_port_affinity 2.2. device.tx_port_affinity += 1 3. In modify QP INIT2RST: 3.1. qp.tx_port_affinity = ucontext.tx_port_affinity % MLX5_PORT_NUM 3.2. ucontext.tx_port_affinity += 1 Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-14IB/mlx5: Fix leaking stack memory to userspaceJason Gunthorpe
mlx5_ib_create_qp_resp was never initialized and only the first 4 bytes were written. Fixes: 41d902cb7c32 ("RDMA/mlx5: Fix definition of mlx5_ib_create_qp_resp") Cc: <stable@vger.kernel.org> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-08RDMA/mlx5: Fix shift overflow in mlx5_ib_create_wqLeon Romanovsky
[ 61.182439] UBSAN: Undefined behaviour in drivers/infiniband/hw/mlx5/qp.c:5366:34 [ 61.183673] shift exponent 4294967288 is too large for 32-bit type 'unsigned int' [ 61.185530] CPU: 0 PID: 639 Comm: qp Not tainted 4.18.0-rc1-00037-g4aa1d69a9c60-dirty #96 [ 61.186981] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014 [ 61.188315] Call Trace: [ 61.188661] dump_stack+0xc7/0x13b [ 61.190427] ubsan_epilogue+0x9/0x49 [ 61.190899] __ubsan_handle_shift_out_of_bounds+0x1ea/0x22f [ 61.197040] mlx5_ib_create_wq+0x1c99/0x1d50 [ 61.206632] ib_uverbs_ex_create_wq+0x499/0x820 [ 61.213892] ib_uverbs_write+0x77e/0xae0 [ 61.248018] vfs_write+0x121/0x3b0 [ 61.249831] ksys_write+0xa1/0x120 [ 61.254024] do_syscall_64+0x7c/0x2a0 [ 61.256178] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 61.259211] RIP: 0033:0x7f54bab70e99 [ 61.262125] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 [ 61.268678] RSP: 002b:00007ffe1541c318 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 61.271076] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f54bab70e99 [ 61.273795] RDX: 0000000000000070 RSI: 0000000020000240 RDI: 0000000000000003 [ 61.276982] RBP: 00007ffe1541c330 R08: 00000000200078e0 R09: 0000000000000002 [ 61.280035] R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004005c0 [ 61.283279] R13: 00007ffe1541c420 R14: 0000000000000000 R15: 0000000000000000 Cc: <stable@vger.kernel.org> # 4.7 Fixes: 79b20a6c3014 ("IB/mlx5: Add receive Work Queue verbs") Cc: syzkaller <syzkaller@googlegroups.com> Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-30RDMA, core and ULPs: Declare ib_post_send() and ib_post_recv() arguments constBart Van Assche
Since neither ib_post_send() nor ib_post_recv() modify the data structure their second argument points at, declare that argument const. This change makes it necessary to declare the 'bad_wr' argument const too and also to modify all ULPs that call ib_post_send(), ib_post_recv() or ib_post_srq_recv(). This patch does not change any functionality but makes it possible for the compiler to verify whether the ib_post_(send|recv|srq_recv) really do not modify the posted work request. To make this possible, only one cast had to be introduce that casts away constness, namely in rpcrdma_post_recvs(). The only way I can think of to avoid that cast is to introduce an additional loop in that function or to change the data type of bad_wr from struct ib_recv_wr ** into int (an index that refers to an element in the work request list). However, both approaches would require even more extensive changes than this patch. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-30IB/mlx5, ib_post_send(), IB_WR_REG_SIG_MR: Do not modify the 'wr' argumentBart Van Assche
Since the next patch will constify the wr pointer, do not modify the data that pointer points at. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Cc: Saeed Mahameed <saeedm@mellanox.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-30RDMA: Constify the argument of the work request conversion functionsBart Van Assche
When posting a send work request, the work request that is posted is not modified by any of the RDMA drivers. Make this explicit by constifying most ib_send_wr pointers in RDMA transport drivers. Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-13RDMA/mlx5: Check that supplied blue flame index doesn't overflowLeon Romanovsky
User's supplied index is checked again total number of system pages, but this number already includes num_static_sys_pages, so addition of that value to supplied index causes to below error while trying to access sys_pages[]. BUG: KASAN: slab-out-of-bounds in bfregn_to_uar_index+0x34f/0x400 Read of size 4 at addr ffff880065561904 by task syz-executor446/314 CPU: 0 PID: 314 Comm: syz-executor446 Not tainted 4.18.0-rc1+ #256 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014 Call Trace: dump_stack+0xef/0x17e print_address_description+0x83/0x3b0 kasan_report+0x18d/0x4d0 bfregn_to_uar_index+0x34f/0x400 create_user_qp+0x272/0x227d create_qp_common+0x32eb/0x43e0 mlx5_ib_create_qp+0x379/0x1ca0 create_qp.isra.5+0xc94/0x22d0 ib_uverbs_create_qp+0x21b/0x2a0 ib_uverbs_write+0xc2c/0x1010 vfs_write+0x1b0/0x550 ksys_write+0xc6/0x1a0 do_syscall_64+0xa7/0x590 entry_SYSCALL_64_after_hwframe+0x49/0xbe RIP: 0033:0x433679 Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 91 fd ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007fff2b3d8e48 EFLAGS: 00000217 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 00000000004002f8 RCX: 0000000000433679 RDX: 0000000000000040 RSI: 0000000020000240 RDI: 0000000000000003 RBP: 00000000006d4018 R08: 00000000004002f8 R09: 00000000004002f8 R10: 00000000004002f8 R11: 0000000000000217 R12: 0000000000000000 R13: 000000000040cb00 R14: 000000000040cb90 R15: 0000000000000006 Allocated by task 314: kasan_kmalloc+0xa0/0xd0 __kmalloc+0x1a9/0x510 mlx5_ib_alloc_ucontext+0x966/0x2620 ib_uverbs_get_context+0x23f/0xa60 ib_uverbs_write+0xc2c/0x1010 __vfs_write+0x10d/0x720 vfs_write+0x1b0/0x550 ksys_write+0xc6/0x1a0 do_syscall_64+0xa7/0x590 entry_SYSCALL_64_after_hwframe+0x49/0xbe Freed by task 1: __kasan_slab_free+0x12e/0x180 kfree+0x159/0x630 kvfree+0x37/0x50 single_release+0x8e/0xf0 __fput+0x2d8/0x900 task_work_run+0x102/0x1f0 exit_to_usermode_loop+0x159/0x1c0 do_syscall_64+0x408/0x590 entry_SYSCALL_64_after_hwframe+0x49/0xbe The buggy address belongs to the object at ffff880065561100 which belongs to the cache kmalloc-4096 of size 4096 The buggy address is located 2052 bytes inside of 4096-byte region [ffff880065561100, ffff880065562100) The buggy address belongs to the page: page:ffffea0001955800 count:1 mapcount:0 mapping:ffff88006c402480 index:0x0 compound_mapcount: 0 flags: 0x4000000000008100(slab|head) raw: 4000000000008100 ffffea0001a7c000 0000000200000002 ffff88006c402480 raw: 0000000000000000 0000000080070007 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff880065561800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff880065561880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 >ffff880065561900: 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff880065561980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff880065561a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc Cc: <stable@vger.kernel.org> # 4.15 Fixes: 1ee47ab3e8d8 ("IB/mlx5: Enable QP creation with a given blue flame index") Reported-by: Noa Osherovich <noaos@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-07-13RDMA/mlx5: Melt consecutive calls to alloc_bfreg() in one callLeon Romanovsky
There is no need for three consecutive calls to alloc_bfreg(). It can be implemented with one function. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>