summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-07-20RDMA/siw: Remove the query_pkey callbackKamal Heib
Now that the query_pkey() isn't mandatory by the RDMA core for iwarp providers, this callback can be removed. Link: https://lore.kernel.org/r/20200714183414.61069-5-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Acked-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-20RDMA/core: Remove query_pkey from the mandatory opsKamal Heib
The query_pkey() isn't mandatory for the iwarp providers, so remove this requirement from the RDMA core. Link: https://lore.kernel.org/r/20200714183414.61069-4-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-20RDMA/core: Allocate the pkey cache only if the pkey_tbl_len is setKamal Heib
Allocate the pkey cache only if the pkey_tbl_len is set by the provider, also add checks to avoid accessing the pkey cache when it not initialized. Link: https://lore.kernel.org/r/20200714183414.61069-3-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-20RDMA/core: Expose pkeys sysfs files only if pkey_tbl_len is setKamal Heib
Expose the pkeys sysfs files only if the pkey_tbl_len is set by the providers. Link: https://lore.kernel.org/r/20200714183414.61069-2-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/rxe: Prevent access to wr->next ptr afrer wr is posted to send queueMikhail Malygin
rxe_post_send_kernel() iterates over linked list of wr's, until the wr->next ptr is NULL. However if we've got an interrupt after last wr is posted, control may be returned to the code after send completion callback is executed and wr memory is freed. As a result, wr->next pointer may contain incorrect value leading to panic. Store the wr->next on the stack before posting it. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200716190340.23453-1-m.malygin@yadro.com Signed-off-by: Mikhail Malygin <m.malygin@yadro.com> Signed-off-by: Sergey Kojushev <s.kojushev@yadro.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/qedr: Add EDPM max size to alloc ucontext responseMichal Kalderon
User space should receive the maximum edpm size from kernel driver, similar to other edpm/ldpm related limits. Add an additional parameter to the alloc_ucontext_resp structure for the edpm maximum size. In addition, pass an indication from user-space to kernel (and not just kernel to user) that the DPM sizes are supported. This is for supporting backward-forward compatibility between driver and lib for everything related to DPM transaction and limit sizes. This should have been part of commit mentioned in Fixes tag. Link: https://lore.kernel.org/r/20200707063100.3811-3-michal.kalderon@marvell.com Fixes: 93a3d05f9d68 ("RDMA/qedr: Add kernel capability flags for dpm enabled mode") Signed-off-by: Ariel Elior <ariel.elior@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/qedr: Add EDPM mode type for user-fw compatibilityMichal Kalderon
In older FW versions the completion flag was treated as the ack flag in edpm messages. commit ff937b916eb6 ("qed: Add EDPM mode type for user-fw compatibility") exposed the FW option of setting which mode the QP is in by adding a flag to the qedr <-> qed API. This patch adds the qedr <-> libqedr interface so that the libqedr can set the flag appropriately and qedr can pass it down to FW. Flag is added for backward compatibility with libqedr. For older libs, this flag didn't exist and therefore set to zero. Fixes: ac1b36e55a51 ("qedr: Add support for user context verbs") Link: https://lore.kernel.org/r/20200707063100.3811-2-michal.kalderon@marvell.com Signed-off-by: Yuval Bason <yuval.bason@marvell.com> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/usnic: switch from 'pci_' to 'dma_' APIChristophe JAILLET
The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script bellow. It has been compile tested. When memory is allocated, GFP_ATOMIC should be used to be consistent with the surrounding code. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Link: https://lore.kernel.org/r/20200711073120.249146-1-christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16IB/hfi1: Remove unnecessary fall-through markingsGustavo A. R. Silva
Reorganize the code a bit in a more standard way[1] and remove unnecessary fall-through markings. [1] https://lore.kernel.org/lkml/20200708054703.GR207186@unreal/ Link: https://lore.kernel.org/r/20200709235250.GA26678@embeddedor Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/qedr: SRQ's bug fixesYuval Basson
QP's with the same SRQ, working on different CQs and running in parallel on different CPUs could lead to a race when maintaining the SRQ consumer count, and leads to FW running out of SRQs. Update the consumer atomically. Make sure the wqe_prod is updated after the sge_prod due to FW requirements. Fixes: 3491c9e799fb ("qedr: Add support for kernel mode SRQ's") Link: https://lore.kernel.org/r/20200708195526.31040-1-ybason@marvell.com Signed-off-by: Michal Kalderon <mkalderon@marvell.com> Signed-off-by: Yuval Basson <ybason@marvell.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16IB/isert: allocate RW ctxs according to max IO sizeMax Gurtovoy
Current iSER target code allocates MR pool budget based on queue size. Since there is no handshake between iSER initiator and target on max IO size, we'll set the iSER target to support upto 16MiB IO operations and allocate the correct number of RDMA ctxs according to the factor of MR's per IO operation. This would guarantee sufficient size of the MR pool for the required IO queue depth and IO size. Link: https://lore.kernel.org/r/20200708091908.162263-1-maxg@mellanox.com Reported-by: Krishnamraju Eraparaju <krishna2@chelsio.com> Tested-by: Krishnamraju Eraparaju <krishna2@chelsio.com> Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Acked-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/mlx5: Init dest_type when create flowDaria Velikovsky
When using action drop dest_type was never assigned to any value. Add initialization of dest_type to -1 since 0 is valid. Fixes: f29de9eee782 ("RDMA/mlx5: Add support for drop action in DV steering") Link: https://lore.kernel.org/r/20200707110259.882276-1-leon@kernel.org Signed-off-by: Daria Velikovsky <daria@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/rxe: Remove rxe_link_layer()Kamal Heib
Instead of returning IB_LINK_LAYER_ETHERNET from rxe_link_layer, return it directly from get_link_layer callback and remove rxe_link_layer(). Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200705104313.283034-5-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/rxe: Return void from rxe_mem_init_dma()Kamal Heib
The return value from rxe_mem_init_dma() is always 0 - change it to be void and fix the callers accordingly. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200705104313.283034-4-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/rxe: Return void from rxe_init_port_param()Kamal Heib
The return value from rxe_init_port_param() is always 0 - change it to be void. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200705104313.283034-3-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/rxe: Drop pointless checks in rxe_init_portsKamal Heib
Both pkey_tbl_len and gid_tbl_len are set in rxe_init_port_param() - so no need to check if they aren't set. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200705104313.283034-2-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-10RDMA/counter: Allow manually bind QPs with different pids to same counterMark Zhang
In manual mode allow bind user QPs with different pids to same counter, since this is allowed in auto mode. Bind kernel QPs and user QPs to the same counter are not allowed. Fixes: 1bd8e0a9d0fd ("RDMA/counter: Allow manual mode configuration support") Link: https://lore.kernel.org/r/20200702082933.424537-4-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-10RDMA/counter: Only bind user QPs in auto modeMark Zhang
In auto mode only bind user QPs to a dynamic counter, since this feature is mainly used for system statistic and diagnostic purpose, while there's no need to counter kernel QPs so far. Fixes: 99fa331dc862 ("RDMA/counter: Add "auto" configuration mode support") Link: https://lore.kernel.org/r/20200702082933.424537-3-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-10RDMA/counter: Add PID category support in auto modeMark Zhang
With the "PID" category QPs have same PID will be bound to same counter; If this category is not set then QPs have different PIDs will be bound to same counter. This is implemented for 2 reasons: 1. The counter is a limited resource, while there may be dozens of applications, each of which creates several types of QPs, which means it may doesn't have enough counter. 2. The system administrator needs all QPs created by all applications with same type bound to one counter. The counter name and PID is only make sense when "PID" category are configured. This category can also be used in combine with others, e.g. QP type. Link: https://lore.kernel.org/r/20200702082933.424537-2-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-10RDMA/mlx5: Remove unused to_mibmr functionGal Pressman
The to_mibmr function is unused, remove it. Link: https://lore.kernel.org/r/20200705141143.47303-1-galpress@amazon.com Signed-off-by: Gal Pressman <galpress@amazon.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07RDMA/mlx5: Delete one-time used functionsLeon Romanovsky
Merge them into their callers, usually the only thing the caller did was to call the one function, so this is clearer. Link: https://lore.kernel.org/r/20200702081809.423482-7-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07RDMA/mlx5: Cleanup DEVX initialization flowLeon Romanovsky
Move DEVX initialization and cleanup flows to the devx.c instead of having almost empty functions in main.c Link: https://lore.kernel.org/r/20200702081809.423482-6-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07RDMA/mlx5: Separate flow steering logic from main.cLeon Romanovsky
Move flow steering logic to be in separate file and rename flow.c to be fs.c because it is better describe the content. Link: https://lore.kernel.org/r/20200702081809.423482-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07RDMA/mlx5: Separate counters from main.cLeon Romanovsky
There are number of counters types supported in mlx5_ib: HW counters, congestion counters, Q-counters and flow counters. Almost all supporting code was placed in main.c that made almost impossible to maintain the code anymore. Let's create separate code namespace for the counters to easy future generalization effort. Link: https://lore.kernel.org/r/20200702081809.423482-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07RDMA/mlx5: Separate restrack callbacks initialization from main.cLeon Romanovsky
The restrack code has separate .c, so move callbacks initialization to that file to improve code locality. Link: https://lore.kernel.org/r/20200702081809.423482-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07RDMA/mlx5: Limit the scope of mlx5_ib_enable_driver functionLeon Romanovsky
The mlx5_ib_enable_driver() is local function and doesn't need to be shared in mlx5_ib, so change it's signature to have static keyword in it. Link: https://lore.kernel.org/r/20200702081809.423482-2-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07RDMA/hns: Optimize MTR level-0 addressing to access huge pageXi Wang
If hns ROCEE is set to level-0 addressing, the length of the entire buffer can be used as the page size. The driver needn't to split the buffer into small units because all pages are continuous. Link: https://lore.kernel.org/r/1593525696-12570-1-git-send-email-liweihang@huawei.com Signed-off-by: Xi Wang <wangxi11@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07RDMA/rxe: Skip dgid check in loopback modeZhu Yanjun
In the loopback tests, the following call trace occurs. Call Trace: __rxe_do_task+0x1a/0x30 [rdma_rxe] rxe_qp_destroy+0x61/0xa0 [rdma_rxe] rxe_destroy_qp+0x20/0x60 [rdma_rxe] ib_destroy_qp_user+0xcc/0x220 [ib_core] uverbs_free_qp+0x3c/0xc0 [ib_uverbs] destroy_hw_idr_uobject+0x24/0x70 [ib_uverbs] uverbs_destroy_uobject+0x43/0x1b0 [ib_uverbs] uobj_destroy+0x41/0x70 [ib_uverbs] __uobj_get_destroy+0x39/0x70 [ib_uverbs] ib_uverbs_destroy_qp+0x88/0xc0 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xb9/0xf0 [ib_uverbs] ib_uverbs_cmd_verbs+0xb16/0xc30 [ib_uverbs] The root cause is that the actual RDMA connection is not created in the loopback tests and the rxe_match_dgid will fail randomly. To fix this call trace which appear in the loopback tests, skip check of the dgid. Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200630123605.446959-1-leon@kernel.org Signed-off-by: Zhu Yanjun <yanjunz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA: Move XRCD to be under ib_core responsibilityLeon Romanovsky
Update the code to allocate and free ib_xrcd structure in the ib_core instead of inside drivers. Link: https://lore.kernel.org/r/20200630101855.368895-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/core: Create and destroy counters in the ib_coreLeon Romanovsky
Move allocation and destruction of counters under ib_core responsibility Link: https://lore.kernel.org/r/20200630101855.368895-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06IB/uverbs: Expose UAPI to query MRYishai Hadas
Expose UAPI to query MR, this will let user space application that didn't allocate the MR but has access to by owning the matching command FD to retrieve its information. Link: https://lore.kernel.org/r/20200630093916.332097-8-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/mlx5: Introduce UAPI to query PD attributesYishai Hadas
Introduce UAPI to query PD attributes, this can be used to retrieve PD attributes by having the PD handle of the created one and owning the command FD for the ucontxet. Link: https://lore.kernel.org/r/20200630093916.332097-7-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/mlx5: Implement the query ucontext functionalityYishai Hadas
Implement the query ucontext functionality by returning the original ucontext data as part of an extra mlx5 attribute that holds the driver UAPI response. Link: https://lore.kernel.org/r/20200630093916.332097-6-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/mlx5: Refactor mlx5_ib_alloc_ucontext() responseYishai Hadas
Refactor mlx5_ib_alloc_ucontext() to set its response fields in a cleaner way. It includes, - Move the relevant code to a self contained function. - Calculate the response length once and drop redundant code all around. - Reuse previously set ucontext fields once preparing the response. The self contained function will be used in next patch as part of implementing the query ucontext functionality. Link: https://lore.kernel.org/r/20200630093916.332097-5-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06IB/uverbs: Expose UAPI to query ucontextYishai Hadas
Expose UAPI to query ucontext, this will let user space application that didn't allocate the ucontext but has access to by owning the matching command FD to retrieve the ucontext information. Link: https://lore.kernel.org/r/20200630093916.332097-4-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06IB/uverbs: Set IOVA on IB MR in uverbs layerYishai Hadas
Set IOVA on IB MR in uverbs layer to let all drivers have it, this includes both reg/rereg MR flows. As part of this change cleaned-up this setting from the drivers that already did it by themselves in their user flows. Fixes: e6f0330106f4 ("mlx4_ib: set user mr attributes in struct ib_mr") Link: https://lore.kernel.org/r/20200630093916.332097-3-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06IB/uverbs: Enable CQ ioctl commands by defaultYishai Hadas
Enable CQ ioctl commands by default, this functionality is fully mature to be used over ioctl, no reason to maintain any more the EXP KCONFIG entry to enable it. Link: https://lore.kernel.org/r/20200630093916.332097-2-leon@kernel.org Signed-off-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/core: Optimize XRC target lookupMaor Gottlieb
Replace the mutex with read write semaphore and use xarray instead of linked list for XRC target QPs. This will give faster XRC target lookup. In addition, when QP is closed, don't insert it back to the xarray if the destroy command failed. Link: https://lore.kernel.org/r/20200706122716.647338-4-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/core: Clean ib_alloc_xrcd() and reuse it to allocate XRC domainMaor Gottlieb
ib_alloc_xrcd() already does the required initialization, so move the uverbs to call it and save code duplication, while cleaning the function argument lists of that function. Link: https://lore.kernel.org/r/20200706122716.647338-3-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/mlx5: Get XRCD number directly for the internal useLeon Romanovsky
The mlx5_ib creates XRC domain and uses for creating internal SRQ. However all that is needed is XRCD number and not full blown ib_xrcd objects. Update the code to get and store the number only. Link: https://lore.kernel.org/r/20200706122716.647338-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA: Remove the udata parameter from alloc_mr callbackGal Pressman
Allocating an MR flow can only be initiated by kernel users, and not from userspace so a udata parameter is redundant. Link: https://lore.kernel.org/r/20200706120343.10816-4-galpress@amazon.com Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/core: Remove ib_alloc_mr_user functionGal Pressman
Allocating an MR flow can only be initiated by kernel users, and not from userspace. As a result, the udata parameter is always being passed as NULL. Rename ib_alloc_mr_user function to ib_alloc_mr and remove the udata parameter. Link: https://lore.kernel.org/r/20200706120343.10816-3-galpress@amazon.com Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/core: Check for error instead of success in alloc MR functionGal Pressman
The common kernel pattern is to check for error, not success. Flip the if statement accordingly and keep the main flow unindented. Link: https://lore.kernel.org/r/20200706120343.10816-2-galpress@amazon.com Signed-off-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/core: Clean up tracepoint headersChuck Lever
There's no need for core/trace.c to include rdma/ib_verbs.h twice. Link: https://lore.kernel.org/r/20200702141946.3775.51943.stgit@klimt.1015granger.net Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06Merge branch 'mlx5_ipoib_qpn' into rdma.git for-nextJason Gunthorpe
Michael Guralnik says: ==================== This series handles IPoIB child interface creation with setting interface's HW address. In current implementation, lladdr requested by user is ignored and overwritten. Child interface gets the same GID as the parent interface and a QP number which is assigned by the underlying drivers. In this series we fix this behavior so that user's requested address is assigned to the newly created interface. As specific QP number request is not supported for all vendors, QP number requested by user will still be overwritten when this is not supported. Behavior of creation of child interfaces through the sysfs mechanism or without specifying a requested address, stays the same. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux due to dependencies. * branch 'mlx5_ipoib_qpn': RDMA/ipoib: Handle user-supplied address when creating child net/mlx5: Enable QP number request when creating IPoIB underlay QP Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06RDMA/ipoib: Handle user-supplied address when creating childMichael Guralnik
Use the address supplied by user when creating a child interface. Previously, the address requested by the user was ignored and overridden with parent's GID and the random QP number assigned to the child. Link: https://lore.kernel.org/r/20200623110105.1225750-3-leon@kernel.org Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Feras Daoud <ferasda@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-03net/mlx5: Enable QP number request when creating IPoIB underlay QPMichael Guralnik
If in the process of creating the underlay QP for an IPoIB interface the user has set the address and specifically the 1st-3rd bytes representing the QP number, use the requested QP number when creating the underlay QP. For a user to be able to request a QP number on QP creation, the MKEY_BY_NAME NVCONFIG should be set. As mkey_by_name and qp_by_name are coupled in FW. This requires driver to query the mkey_by_name max cap during initialization and set the current cap if it was enabled in FW. Signed-off-by: Michael Guralnik <michaelgur@mellanox.com> Reviewed-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-07-03RDMA/mlx5: Introduce ODP prefetch counterMaor Gottlieb
For debugging purpose it will be easier to understand if prefetch works okay if it has its own counter. Introduce ODP prefetch counter and count per MR the total number of prefetched pages. In addition remove comment which is not relevant anymore and anyway not in the correct place. Link: https://lore.kernel.org/r/20200621104147.53795-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-02RDMA/core: Fix bogus WARN_ON during ib_unregister_device_queued()Jason Gunthorpe
ib_unregister_device_queued() can only be used by drivers using the new dealloc_device callback flow, and it has a safety WARN_ON to ensure drivers are using it properly. However, if unregister and register are raced there is a special destruction path that maintains the uniform error handling semantic of 'caller does ib_dealloc_device() on failure'. This requires disabling the dealloc_device callback which triggers the WARN_ON. Instead of using NULL to disable the callback use a special function pointer so the WARN_ON does not trigger. Fixes: d0899892edd0 ("RDMA/device: Provide APIs from the core code to help unregistration") Link: https://lore.kernel.org/r/0-v1-a36d512e0a99+762-syz_dealloc_driver_jgg@nvidia.com Reported-by: syzbot+4088ed905e4ae2b0e13b@syzkaller.appspotmail.com Suggested-by: Hillf Danton <hdanton@sina.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-02RDMA/ipoib: Fix ABBA deadlock with ipoib_reap_ah()Jason Gunthorpe
ipoib_mcast_carrier_on_task() insanely open codes a rtnl_lock() such that the only time flush_workqueue() can be called is if it also clears IPOIB_FLAG_OPER_UP. Thus the flush inside ipoib_flush_ah() will deadlock if it gets unlucky enough, and lockdep doesn't help us to find it early: CPU0 CPU1 CPU2 __ipoib_ib_dev_flush() down_read(vlan_rwsem) ipoib_vlan_add() rtnl_trylock() down_write(vlan_rwsem) ipoib_mcast_carrier_on_task() while (!rtnl_trylock()) msleep(20); ipoib_flush_ah() flush_workqueue(priv->wq) Clean up the ah_reaper related functions and lifecycle to make sense: - Start/Stop of the reaper should only be done in open/stop NDOs, not in any other places - cancel and flush of the reaper should only happen in the stop NDO. cancel is only functional when combined with IPOIB_STOP_REAPER. - Non-stop places were flushing the AH's just need to flush out dead AH's synchronously and ignore the background task completely. It is fully locked and harmless to leave running. Which ultimately fixes the ABBA deadlock by removing the unnecessary flush_workqueue() from the problematic place under the vlan_rwsem. Fixes: efc82eeeae4e ("IB/ipoib: No longer use flush as a parameter") Link: https://lore.kernel.org/r/20200625174219.290842-1-kamalheib1@gmail.com Reported-by: Kamal Heib <kheib@redhat.com> Tested-by: Kamal Heib <kheib@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>