path: root/drivers/infiniband/hw/mlx5/qp.c
Age  Commit message  (Author)
2020-09-29 RDMA/drivers: Remove udata check from special QP (Leon Romanovsky)
GSI QP can't be created from the user space, hence the udata check is always false (udata == NULL). Remove that check and simplify the flow. Link: https://lore.kernel.org/r/20200926102450.2966017-9-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-29 RDMA/mlx5: Change GSI QP to have same creation flow like other QPs (Leon Romanovsky)
There is no reason to have a separate create flow for the GSI QP when the general create_qp routine has all the needed checks and the ability to allocate and free the proper struct mlx5_ib_qp. Link: https://lore.kernel.org/r/20200926102450.2966017-4-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-29 RDMA/mlx5: Embed GSI QP into general mlx5_ib QP (Leon Romanovsky)
The GSI QPs have different create flow from the regular QPs, but it is not really needed. Update the code to use mlx5_ib_qp as a storage class for all outside of GSI calls. Link: https://lore.kernel.org/r/20200926102450.2966017-2-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-18 Merge branch 'mlx_sw_owner_v2' into rdma.git for-next (Jason Gunthorpe)
Leon Romanovsky says: ==================== This series from Alex extends software steering interface to support devices with extra capability "sw_owner_2" which will replace existing "sw_owner". ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux due to dependencies. * branch 'mlx5_sw_owner_v2: RDMA/mlx5: Expose TIR and QP ICM address for sw_owner_v2 devices RDMA/mlx5: Allow DM allocation for sw_owner_v2 enabled devices RDMA/mlx5: Add sw_owner_v2 bit capability
2020-09-18 RDMA/mlx5: Expose TIR and QP ICM address for sw_owner_v2 devices (Alex Vesker)
Expose the ICM address to access TIR and QP. This will allow sw_owner_v2 devices to steer traffic to TIRs and QPs in the same way as with the sw_owner capability. Link: https://lore.kernel.org/r/20200903073857.1129166-4-leon@kernel.org Signed-off-by: Alex Vesker <valex@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-17 RDMA: Convert RWQ table logic to ib_core allocation scheme (Leon Romanovsky)
Move struct ib_rwq_ind_table allocation to ib_core. Link: https://lore.kernel.org/r/20200902081623.746359-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 RDMA: Restore ability to return error for destroy WQ (Leon Romanovsky)
Make this interface symmetrical to other destroy paths. Fixes: a49b1dc7ae44 ("RDMA: Convert destroy_wq to be void") Link: https://lore.kernel.org/r/20200907120921.476363-9-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 RDMA: Change XRCD destroy return value (Leon Romanovsky)
Update XRCD destroy flow to allow command failure. Fixes: 28ad5f65c314 ("RDMA: Move XRCD to be under ib_core responsibility") Link: https://lore.kernel.org/r/20200907120921.476363-8-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-31 Merge tag 'v5.9-rc3' into rdma.git for-next (Jason Gunthorpe)
Required due to dependencies in following patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27 IB/mlx5: Add DCT RoCE LAG support (Mark Zhang)
When DCT QPs work in RoCE LAG mode: 1. DCT creation is allowed only when it is supported 2. The "port" of a DCT QP is assigned in a round-robin way Link: https://lore.kernel.org/r/20200818115245.700581-3-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
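The round-robin port assignment mentioned in this commit can be pictured with a small sketch. The structure and function names below are hypothetical and are not taken from the mlx5 driver; the sketch only assumes a shared counter and a known number of LAG ports.

```c
#include <stdatomic.h>

/* Hypothetical per-device state; not the real mlx5 structure. */
struct demo_dev {
	atomic_uint next_port;   /* shared round-robin counter */
	unsigned int num_ports;  /* number of LAG ports, assumed >= 1 */
};

/* Returns a 1-based port number, cycling 1..num_ports across calls. */
unsigned int assign_dct_port(struct demo_dev *dev)
{
	unsigned int n = atomic_fetch_add(&dev->next_port, 1);

	return n % dev->num_ports + 1;
}
```

With two ports, successive calls would yield 1, 2, 1, and so on. The atomic counter is a design assumption that keeps concurrent callers from handing out the same slot twice in a row.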
2020-08-27 IB/mlx5: Add tx_affinity support for DCI QP (Mark Zhang)
DCI QP supports tx_affinity as well. Link: https://lore.kernel.org/r/20200818115245.700581-2-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-23 treewide: Use fallthrough pseudo-keyword (Gustavo A. R. Silva)
Replace the existing /* fall through */ comments and their variants with the new pseudo-keyword macro fallthrough [1]. Also, remove unnecessary fall-through markings where applicable. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
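As an illustration of the conversion described above, the following sketch defines a stand-in for the fallthrough macro and uses it in a switch. The macro definition here is an approximation for user-space compilers, and the enum and function are hypothetical; the kernel's own definition lives in its compiler attribute headers.

```c
/* Stand-in for the kernel macro (an assumption for user-space builds). */
#if defined(__has_attribute)
# if __has_attribute(__fallthrough__)
#  define fallthrough __attribute__((__fallthrough__))
# endif
#endif
#ifndef fallthrough
# define fallthrough do {} while (0) /* fallthrough */
#endif

enum demo_kind { KIND_A, KIND_B, KIND_C }; /* hypothetical */

/* Counts how many case bodies run for a given kind. */
int count_steps(enum demo_kind kind)
{
	int steps = 0;

	switch (kind) {
	case KIND_A:
		steps++;
		fallthrough; /* previously: a fall-through comment */
	case KIND_B:
		steps++;
		break;
	case KIND_C:
		break;
	}
	return steps;
}
```

Unlike a comment, the statement form is visible to the compiler, so implicit fall-through warnings can be enabled without false positives on the intentional cases.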
2020-08-18 RDMA/mlx5: Add new IB rates support (Mark Zhang)
Support 56, 25, 100, 200 and 50Gbps IB rates in mlx5 driver. Link: https://lore.kernel.org/r/20200802081712.1993490-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-18 RDMA/mlx5: Replace open-coded offsetofend() macro (Leon Romanovsky)
Clean mlx5_ib from open-coded implementations of offsetofend(). Link: https://lore.kernel.org/r/20200730081235.1581127-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
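For readers unfamiliar with offsetofend(), the sketch below shows what such a macro computes and a typical use: checking that a user-supplied buffer is long enough to contain a given field. The macro body is a user-space approximation, and struct demo_cmd and has_comp_mask() are hypothetical names, not mlx5 code.

```c
#include <stddef.h>

/* Offset of the first byte past MEMBER (user-space approximation). */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical user command layout, not a real mlx5 structure. */
struct demo_cmd {
	unsigned int flags;
	unsigned int comp_mask;
	unsigned int ece_options; /* field appended in a later ABI revision */
};

/* Returns 1 if a buffer of inlen bytes fully contains comp_mask. */
int has_comp_mask(size_t inlen)
{
	return inlen >= offsetofend(struct demo_cmd, comp_mask);
}
```

The open-coded equivalent would spell out offsetof(...) + sizeof(...) at every call site, which is the pattern the commit removes.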
2020-08-06 Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma (Linus Torvalds)
Pull rdma updates from Jason Gunthorpe: "A quiet cycle after the larger 5.8 effort. Substantially cleanup and driver work with a few smaller features this time. - Driver updates for hfi1, rxe, mlx5, hns, qedr, usnic, bnxt_re - Removal of dead or redundant code across the drivers - RAW resource tracker dumps to include a device specific data blob for device objects to aide device debugging - Further advance the IOCTL interface, remove the ability to turn it off. Add QUERY_CONTEXT, QUERY_MR, and QUERY_PD commands - Remove stubs related to devices with no pkey table - A shared CQ scheme to allow multiple ULPs to share the CQ rings of a device to give higher performance - Several more static checker, syzkaller and rare crashers fixed" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (121 commits) RDMA/mlx5: Fix flow destination setting for RDMA TX flow table RDMA/rxe: Remove pkey table RDMA/umem: Add a schedule point in ib_umem_get() RDMA/hns: Fix the unneeded process when getting a general type of CQE error RDMA/hns: Fix error during modify qp RTS2RTS RDMA/hns: Delete unnecessary memset when allocating VF resource RDMA/hns: Remove redundant parameters in set_rc_wqe() RDMA/hns: Remove support for HIP08_A RDMA/hns: Refactor hns_roce_v2_set_hem() RDMA/hns: Remove redundant hardware opcode definitions RDMA/netlink: Remove CAP_NET_RAW check when dump a raw QP RDMA/include: Replace license text with SPDX tags RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting RDMA/cma: Execute rdma_cm destruction from a handler properly RDMA/cma: Remove unneeded locking for req paths RDMA/cma: Using the standard locking pattern when delivering the removal event RDMA/cma: Simplify DEVICE_REMOVAL for internal_id RDMA/efa: Add EFA 0xefa1 PCI ID RDMA/efa: User/kernel compatibility handshake mechanism ...
2020-07-30 RDMA/mlx5: Initialize QP mutex for the debug kernels (Leon Romanovsky)
In DCT and RSS RAW QP creation flows, the QP mutex wasn't initialized and the magic field inside lock was missing. This caused to the following kernel warning for kernels build with CONFIG_DEBUG_MUTEXES. DEBUG_LOCKS_WARN_ON(lock->magic != lock) WARNING: CPU: 3 PID: 16261 at kernel/locking/mutex.c:938 __mutex_lock+0x60e/0x940 Modules linked in: bonding nf_tables ipip tunnel4 geneve ip6_udp_tunnel udp_tunnel ip6_gre ip6_tunnel tunnel6 ip_gre gre ip_tunnel mlx5_ib mlx5_core mlxfw ptp pps_core rdma_ucm ib_uverbs ib_ipoib ib_umad openvswitch nsh xt_MASQUERADE nf_conntrack_netlink nfnetlink iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter overlay ib_srp scsi_transport_srp rpcrdma ib_iser libiscsi scsi_transport_iscsi rdma_cm iw_cm ib_cm ib_core [last unloaded: mlxfw] CPU: 3 PID: 16261 Comm: ib_send_bw Not tainted 5.8.0-rc4_for_upstream_min_debug_2020_07_08_22_04 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:__mutex_lock+0x60e/0x940 Code: c0 0f 84 6d fa ff ff 44 8b 15 4e 9d ba 00 45 85 d2 0f 85 5d fa ff ff 48 c7 c6 f2 de 2b 82 48 c7 c7 f1 8a 2b 82 e8 d2 4d 72 ff <0f> 0b 4c 8b 4d 88 e9 3f fa ff ff f6 c2 04 0f 84 37 fe ff ff 48 89 RSP: 0018:ffff88810bb8b870 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: ffff88829f1dd880 RSI: 0000000000000000 RDI: ffffffff81192afa RBP: ffff88810bb8b910 R08: 0000000000000000 R09: 0000000000000028 R10: 0000000000000000 R11: 0000000000003f85 R12: 0000000000000002 R13: ffff88827d8d3ce0 R14: ffffffffa059f615 R15: ffff8882a4d02610 FS: 00007f3f6988e740(0000) GS:ffff8882f5b80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000556556158000 CR3: 000000010a63c005 CR4: 0000000000360ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? 
cmd_exec+0x947/0xe60 [mlx5_core] ? __mutex_lock+0x76/0x940 ? mlx5_ib_qp_set_counter+0x25/0xa0 [mlx5_ib] mlx5_ib_qp_set_counter+0x25/0xa0 [mlx5_ib] mlx5_ib_counter_bind_qp+0x9b/0xe0 [mlx5_ib] __rdma_counter_bind_qp+0x6b/0xa0 [ib_core] rdma_counter_bind_qp_auto+0x363/0x520 [ib_core] _ib_modify_qp+0x316/0x580 [ib_core] ib_modify_qp_with_udata+0x19/0x30 [ib_core] modify_qp+0x4c4/0x600 [ib_uverbs] ib_uverbs_ex_modify_qp+0x87/0xe0 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x129/0x1c0 [ib_uverbs] ib_uverbs_cmd_verbs.isra.5+0x5d5/0x11f0 [ib_uverbs] ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x120/0x120 [ib_uverbs] ? lock_acquire+0xb9/0x3a0 ? ib_uverbs_ioctl+0xd0/0x210 [ib_uverbs] ? ib_uverbs_ioctl+0x175/0x210 [ib_uverbs] ib_uverbs_ioctl+0x14b/0x210 [ib_uverbs] ? ib_uverbs_ioctl+0xd0/0x210 [ib_uverbs] ksys_ioctl+0x234/0x7d0 ? exc_page_fault+0x202/0x640 ? do_syscall_64+0x1f/0x2e0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x59/0x2e0 ? asm_exc_page_fault+0x8/0x30 ? rcu_read_lock_sched_held+0x52/0x60 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: b4aaa1f0b415 ("IB/mlx5: Handle type IB_QPT_DRIVER when creating a QP") Link: https://lore.kernel.org/r/20200730082719.1582397-2-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-29 RDMA/mlx5: Allow providing extra scatter CQE QP flag (Leon Romanovsky)
Scatter CQE feature relies on two flags MLX5_QP_FLAG_SCATTER_CQE and MLX5_QP_FLAG_ALLOW_SCATTER_CQE, both of them can be provided without relation to device capability. Relax global validity check to allow MLX5_QP_FLAG_ALLOW_SCATTER_CQE QP flag. Existing user applications are failing on this new validity check. Fixes: 90ecb37a751b ("RDMA/mlx5: Change scatter CQE flag to be set like other vendor flags") Fixes: 37518fa49f76 ("RDMA/mlx5: Process all vendor flags in one place") Link: https://lore.kernel.org/r/20200728120255.805733-1-leon@kernel.org Reviewed-by: Artemy Kovalyov <artemyko@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-28 RDMA/mlx5: Delete unreachable code (Leon Romanovsky)
Delete two occurrences of unreachable code discovered by Coverity. Link: https://lore.kernel.org/r/20200727095746.495915-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-24 RDMA/mlx5: Allow SQ modification (Maor Gottlieb)
Currently the SQ is set to a ready state when the RAW QP is modified to INIT. When the TIS is modified, e.g. to change the lag_tx_affinity, then SQs which are already in the ready state will not be affected. Open a window to modify the SQ behavior by setting the SQ as ready only when QP was modified to RTS. Link: https://lore.kernel.org/r/20200716105416.1423826-1-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-08 RDMA/mlx5: Set PD pointers for the error flow unwind (Leon Romanovsky)
ib_pd is accessed internally during destroy of the TIR/TIS, but PD can be not set yet. This leading to the following kernel panic. BUG: kernel NULL pointer dereference, address: 0000000000000074 PGD 8000000079eaa067 P4D 8000000079eaa067 PUD 7ae81067 PMD 0 Oops: 0000 [#1] SMP PTI CPU: 1 PID: 709 Comm: syz-executor.0 Not tainted 5.8.0-rc3 #41 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:destroy_raw_packet_qp_tis drivers/infiniband/hw/mlx5/qp.c:1189 [inline] RIP: 0010:destroy_raw_packet_qp drivers/infiniband/hw/mlx5/qp.c:1527 [inline] RIP: 0010:destroy_qp_common+0x2ca/0x4f0 drivers/infiniband/hw/mlx5/qp.c:2397 Code: 00 85 c0 74 2e e8 56 18 55 ff 48 8d b3 28 01 00 00 48 89 ef e8 d7 d3 ff ff 48 8b 43 08 8b b3 c0 01 00 00 48 8b bd a8 0a 00 00 <0f> b7 50 74 e8 0d 6a fe ff e8 28 18 55 ff 49 8d 55 50 4c 89 f1 48 RSP: 0018:ffffc900007bbac8 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff88807949e800 RCX: 0000000000000998 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff88807c180140 RBP: ffff88807b50c000 R08: 000000000002d379 R09: ffffc900007bba00 R10: 0000000000000001 R11: 000000000002d358 R12: ffff888076f37000 R13: ffff88807949e9c8 R14: ffffc900007bbe08 R15: ffff888076f37000 FS: 00000000019bf940(0000) GS:ffff88807dd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000074 CR3: 0000000076d68004 CR4: 0000000000360ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: mlx5_ib_create_qp+0xf36/0xf90 drivers/infiniband/hw/mlx5/qp.c:3014 _ib_create_qp drivers/infiniband/core/core_priv.h:333 [inline] create_qp+0x57f/0xd20 drivers/infiniband/core/uverbs_cmd.c:1443 ib_uverbs_create_qp+0xcf/0x100 drivers/infiniband/core/uverbs_cmd.c:1564 ib_uverbs_write+0x5fa/0x780 drivers/infiniband/core/uverbs_main.c:664 __vfs_write+0x3f/0x90 fs/read_write.c:495 
vfs_write+0xc7/0x1f0 fs/read_write.c:559 ksys_write+0x5e/0x110 fs/read_write.c:612 do_syscall_64+0x3e/0x70 arch/x86/entry/common.c:359 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x466479 Code: Bad RIP value. RSP: 002b:00007ffd057b62b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 000000000073bf00 RCX: 0000000000466479 RDX: 0000000000000070 RSI: 0000000020000240 RDI: 0000000000000003 RBP: 00000000019bf8fc R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff R13: 0000000000000bf6 R14: 00000000004cb859 R15: 00000000006fefc0 Fixes: 6c41965d647a ("RDMA/mlx5: Don't access ib_qp fields in internal destroy QP path") Link: https://lore.kernel.org/r/20200707110612.882962-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-07 RDMA/mlx5: Separate counters from main.c (Leon Romanovsky)
There are a number of counter types supported in mlx5_ib: HW counters, congestion counters, Q-counters and flow counters. Almost all of the supporting code was placed in main.c, which made the code almost impossible to maintain. Create a separate code namespace for the counters to ease future generalization efforts. Link: https://lore.kernel.org/r/20200702081809.423482-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06 RDMA: Move XRCD to be under ib_core responsibility (Leon Romanovsky)
Update the code to allocate and free ib_xrcd structure in the ib_core instead of inside drivers. Link: https://lore.kernel.org/r/20200630101855.368895-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06 RDMA/mlx5: Get XRCD number directly for the internal use (Leon Romanovsky)
mlx5_ib creates an XRC domain and uses it to create an internal SRQ. However, all that is needed is the XRCD number, not full-blown ib_xrcd objects. Update the code to get and store the number only. Link: https://lore.kernel.org/r/20200706122716.647338-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-02 RDMA/mlx5: Fix legacy IPoIB QP initialization (Leon Romanovsky)
Legacy IPoIB sets IB_QP_CREATE_NETIF_QP QP create flag and because mlx5 doesn't use this flag, the process_create_flags() failed to create IPoIB QPs. Fixes: 2978975ce7f1 ("RDMA/mlx5: Process create QP flags in one place") Link: https://lore.kernel.org/r/20200630122147.445847-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-06-22 RDMA/mlx5: Protect from kernel crash if XRC_TGT doesn't have udata (Leon Romanovsky)
Don't deref udata if it is NULL BUG: kernel NULL pointer dereference, address: 0000000000000030 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 SMP PTI CPU: 2 PID: 1592 Comm: python3 Not tainted 5.7.0-rc6+ #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 RIP: 0010:create_qp+0x39e/0xae0 [mlx5_ib] Code: c0 0d 00 00 bf 10 01 00 00 e8 be a9 e4 e0 48 85 c0 49 89 c2 0f 84 0c 07 00 00 41 8b 85 74 63 01 00 0f c8 a9 00 00 00 10 74 0a <41> 8b 46 30 0f c8 41 89 42 14 41 8b 52 18 41 0f b6 4a 1c 0f ca 89 RSP: 0018:ffffc9000067f8b0 EFLAGS: 00010206 RAX: 0000000010170000 RBX: ffff888441313000 RCX: 0000000000000000 RDX: 0000000000000200 RSI: 0000000000000000 RDI: ffff88845b1d4400 RBP: ffffc9000067fa60 R08: 0000000000000200 R09: ffff88845b1d4200 R10: ffff88845b1d4200 R11: ffff888441313000 R12: ffffc9000067f950 R13: ffff88846ac00140 R14: 0000000000000000 R15: ffff88846c2bc000 FS: 00007faa1a3c0540(0000) GS:ffff88846fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000030 CR3: 0000000446dca003 CR4: 0000000000760ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: ? __switch_to_asm+0x40/0x70 ? __switch_to_asm+0x34/0x70 mlx5_ib_create_qp+0x897/0xfa0 [mlx5_ib] ib_create_qp+0x9e/0x300 [ib_core] create_qp+0x92d/0xb20 [ib_uverbs] ? ib_uverbs_cq_event_handler+0x30/0x30 [ib_uverbs] ? release_resource+0x30/0x30 ib_uverbs_create_qp+0xc4/0xe0 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xc8/0xf0 [ib_uverbs] ib_uverbs_run_method+0x223/0x770 [ib_uverbs] ? track_pfn_remap+0xa7/0x100 ? uverbs_disassociate_api+0xd0/0xd0 [ib_uverbs] ? remap_pfn_range+0x358/0x490 ib_uverbs_cmd_verbs.isra.6+0x19b/0x370 [ib_uverbs] ? rdma_umap_priv_init+0x82/0xe0 [ib_core] ? 
vm_mmap_pgoff+0xec/0x120 ib_uverbs_ioctl+0xc0/0x120 [ib_uverbs] ksys_ioctl+0x92/0xb0 __x64_sys_ioctl+0x16/0x20 do_syscall_64+0x48/0x130 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: e383085c2425 ("RDMA/mlx5: Set ECE options during QP create") Link: https://lore.kernel.org/r/20200621115959.60126-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-18 RDMA/mlx5: Fix integrity enabled QP creation (Max Gurtovoy)
The create_flags checks were refactored, which broke the creation of integrity-enabled QPs and, in turn, the NVMe/RDMA and iSER ULPs on mlx5-driven devices. Fixes: 2978975ce7f1 ("RDMA/mlx5: Process create QP flags in one place") Link: https://lore.kernel.org/r/20200617130230.2846915-1-leon@kernel.org Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-18 RDMA/mlx5: Remove ECE limitation from the RAW_PACKET QPs (Leon Romanovsky)
Like any other QP type, rely on FW for the RAW_PACKET QPs to decide if ECE is supported or not. This fixes an inability to create RAW_PACKET QPs with latest rdma-core with the ECE support. Fixes: e383085c2425 ("RDMA/mlx5: Set ECE options during QP create") Link: https://lore.kernel.org/r/20200618112507.3453496-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-18 RDMA/mlx5: Fix remote gid value in query QP (Maor Gottlieb)
Remote gid is not copied to the right address. Fix it by using rdma_ah_set_dgid_raw to copy the remote gid value from the QP context on query QP. Fixes: 70bd7fb87625 ("RDMA/mlx5: Remove manually crafted QP context the query call") Link: https://lore.kernel.org/r/20200618112507.3453496-3-leon@kernel.org Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-18 RDMA/mlx5: Don't access ib_qp fields in internal destroy QP path (Leon Romanovsky)
destroy_qp_common is called for flows where QP is already created by HW. While it is called from IB/core, the ibqp.* fields will be fully initialized, but it is not the case if this function is called during QP creation. Don't rely on ibqp fields as much as possible and initialize send_cq/recv_cq as temporal solution till all drivers will be converted to IB/core QP allocation scheme. refcount_t: underflow; use-after-free. WARNING: CPU: 1 PID: 5372 at lib/refcount.c:28 refcount_warn_saturate+0xfe/0x1a0 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 5372 Comm: syz-executor.2 Not tainted 5.5.0-rc5 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014 Call Trace: mlx5_core_put_rsc+0x70/0x80 destroy_resource_common+0x8e/0xb0 mlx5_core_destroy_qp+0xaf/0x1d0 mlx5_ib_destroy_qp+0xeb0/0x1460 ib_destroy_qp_user+0x2d5/0x7d0 create_qp+0xed3/0x2130 ib_uverbs_create_qp+0x13e/0x190 ? ib_uverbs_ex_create_qp ib_uverbs_write+0xaa5/0xdf0 __vfs_write+0x7c/0x100 ksys_write+0xc8/0x200 do_syscall_64+0x9c/0x390 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Fixes: 08d53976609a ("RDMA/mlx5: Copy response to the user in one place") Link: https://lore.kernel.org/r/20200617130148.2846643-1-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-15 RDMA/mlx5: Fix -Wformat warning in check_ucmd_data() (Tom Seewald)
Variables of type size_t should use %zu rather than %lu [1]. The variables "inlen", "ucmd", "last", and "size" are all size_t, so use the correct format specifiers. [1] https://www.kernel.org/doc/html/latest/core-api/printk-formats.html Fixes: e383085c2425 ("RDMA/mlx5: Set ECE options during QP create") Link: https://lore.kernel.org/r/20200605023012.9527-1-tseewald@gmail.com Signed-off-by: Tom Seewald <tseewald@gmail.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
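A minimal user-space sketch of the format-specifier rule follows. The helper name and message text are hypothetical; the point is only that %zu is the portable conversion for size_t, whereas %lu is correct only on ABIs where size_t happens to be unsigned long.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: formats a length-mismatch message into buf. */
int format_len_error(char *buf, size_t buflen, size_t inlen, size_t expected)
{
	/* %zu matches size_t regardless of the platform's word size. */
	return snprintf(buf, buflen, "inlen %zu, expected %zu", inlen, expected);
}
```

Building with -Wformat on a 32-bit target is the usual way such mismatches surface, since size_t is typically unsigned int there.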
2020-06-15 RDMA/mlx5: Remove duplicated assignment to resp.response_length (Colin Ian King)
The assignment to resp.response_length is never read since it is being updated again on the next statement. The assignment is redundant, so remove it. Fixes: a645a89d9a78 ("RDMA/mlx5: Return ECE DC support") Link: https://lore.kernel.org/r/20200604143902.56021-1-colin.king@canonical.com Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-03 RDMA/mlx5: Return ECE DC support (Leon Romanovsky)
DC QPs are a many-to-one QP type, which means that the first connection establishes the ECE options that subsequent connections should follow. Due to this property, the ECE code was removed between the first [1] and second [2] ECE submissions. This patch restores the dropped code, because ECE is a property of a connection and, as with any other connection, users need to manage this data. Allow them to set the ECE parameter for DC too, which avoids the need for a compatibility flag for DC ECE. [1] https://lore.kernel.org/linux-rdma/20200523132243.817936-1-leon@kernel.org/ [2] https://lore.kernel.org/linux-rdma/20200525174401.71152-1-leon@kernel.org/ Link: https://lore.kernel.org/r/20200602125548.172654-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-03 RDMA/mlx5: Don't rely on FW to set zeros in ECE response (Leon Romanovsky)
The FW returns zeros in case feature is not enabled, but it is better to have the capability check and ensure that returned result is cleared. Fixes: 3e09a427ae7a ("RDMA/mlx5: Get ECE options from FW during create QP") Link: https://lore.kernel.org/r/20200602125548.172654-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-03 RDMA/mlx5: Return an error if copy_to_user fails (Leon Romanovsky)
In theory, ib_copy_to_udata() can fail, so return an -EFAULT error to the user so that they destroy the QP. Fixes: 50aec2c3135e ("RDMA/mlx5: Return ECE data after modify QP") Link: https://lore.kernel.org/r/20200602125548.172654-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-29 RDMA/mlx5: Support TX port affinity for VF drivers in LAG mode (Mark Zhang)
The mlx5 VF driver doesn't set the QP tx port affinity because it doesn't know whether the lag is active or not, since "lag_active" works only for PF interfaces. In this case, only one lag is used for VF interfaces, which causes a performance issue. Add a lag_tx_port_affinity CAP bit; when it is enabled and "num_lag_ports > 1", the driver always sets the QP tx affinity, regardless of the lag state. Link: https://lore.kernel.org/r/20200527055014.355093-1-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
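The decision rule in this commit message can be written as a small predicate. This is only a sketch: struct demo_caps and should_set_tx_affinity() are hypothetical, and the fallback to lag_active when the capability is absent is an assumption about the earlier behaviour rather than something the message states outright.

```c
#include <stdbool.h>

/* Hypothetical view of the relevant state; not the real mlx5 caps layout. */
struct demo_caps {
	bool lag_active;            /* meaningful only on PF interfaces */
	bool lag_tx_port_affinity;  /* the new capability bit */
	unsigned int num_lag_ports;
};

/* Decide whether the QP tx affinity should be set. */
bool should_set_tx_affinity(const struct demo_caps *caps)
{
	/* New rule: the capability bit plus multiple ports is sufficient. */
	if (caps->lag_tx_port_affinity && caps->num_lag_ports > 1)
		return true;

	/* Assumed earlier behaviour: rely on the PF-only lag state. */
	return caps->lag_active;
}
```

A VF would typically see lag_active as false, so without the capability bit the predicate returns false, which matches the problem the commit describes.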
2020-05-27 RDMA/mlx5: Return ECE data after modify QP (Leon Romanovsky)
After the user sets the ECE option, FW returns the agreed/supported bits through the output structures of the modify QP stages for regular QPs, or through create QP for the DCT. Link: https://lore.kernel.org/r/20200526115440.205922-9-leon@kernel.org Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 RDMA/mlx5: Set ECE options during modify QP (Leon Romanovsky)
The most common way to set ECE option will be during modify QP command in INIT2RTR, RTR2RTS and RTS2RTS stages, so update mlx5 to support it. The new bit in the comp_mask is needed to mark that kernel supports ECE and can receive data instead of "reserved" field in the struct mlx5_ib_modify_qp. Link: https://lore.kernel.org/r/20200526115440.205922-8-leon@kernel.org Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 RDMA/mlx5: Convert modify QP to use MLX5_SET macros (Leon Romanovsky)
Instead of hand crafted mlx5_qp_context and mlx5_qp_path use common MLX5_SET() macros. Link: https://lore.kernel.org/r/20200526115440.205922-7-leon@kernel.org Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 RDMA/mlx5: Remove manually crafted QP context the query call (Leon Romanovsky)
In preparation for removing the hand-crafted mlx5_qp_context, convert query_qp_attr() to use the proper MLX5_GET() macros. Link: https://lore.kernel.org/r/20200526115440.205922-6-leon@kernel.org Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 RDMA/mlx5: Use direct modify QP implementation (Leon Romanovsky)
In preparation for removing the hand-crafted mlx5_qp_context, convert the counter code to use mlx5_cmd_exec_in() directly. Link: https://lore.kernel.org/r/20200526115440.205922-5-leon@kernel.org Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 RDMA/mlx5: Set ECE options during QP create (Leon Romanovsky)
Allow users to request the creation of QPs with specific ECE options. Setting them this early, even before the RDMA-CM connection is established, is useful if the user knows exactly which option is needed. Link: https://lore.kernel.org/r/20200526115440.205922-4-leon@kernel.org Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-27 RDMA/mlx5: Get ECE options from FW during create QP (Leon Romanovsky)
Supported ECE options are returned from FW in the create_qp phase and zero means that field is not valid. Such default value allows us to reuse reserved field without worries about comp_mask. Update create QP API to return ECE options. Link: https://lore.kernel.org/r/20200526115440.205922-3-leon@kernel.org Reviewed-by: Mark Zhang <markz@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-21 Merge tag 'v5.7-rc6' into rdma.git for-next (Jason Gunthorpe)
Linux 5.7-rc6 Conflict in drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c resolved by deleting dr_cq_event, matching how netdev resolved it. Required for dependencies in the following patches. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-07 RDMA/mlx5: Remove duplicated assignment to variable rcqe_sz (Colin Ian King)
The variable rcqe_sz is being unnecessarily assigned twice, fix this by removing one of the duplicates. Fixes: 8bde2c509e40 ("RDMA/mlx5: Update all DRIVER QP places to use QP subtype") Link: https://lore.kernel.org/r/20200507151610.52636-1-colin.king@canonical.com Addresses-Coverity: ("Evaluation order violation") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 RDMA/mlx5: Allow only raw Ethernet QPs when RoCE isn't enabled (Mark Bloch)
When operating in switchdev mode or using devlink to disable RoCE, only raw Ethernet QPs are allowed to be created. When in switchdev mode, this can lead to passing an invalid port number as part of the modify QP firmware command, which results in a syndrome reported back to the user, such as: * mlx5_cmd_check:803:(pid 50148): RST2INIT_QP(0x502) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x177405). An internal UD QP might be used to test for write combining support (even if externally we report RoCE as disabled); check for that specific flag and allow it specifically. Fixes: b5ca15ad7e61 ("IB/mlx5: Add proper representors support") Link: https://lore.kernel.org/r/20200506071602.7177-3-leon@kernel.org Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 RDMA/mlx5: Move all WR logic from qp.c to separate file (Leon Romanovsky)
Split qp.c by removing all WR logic to separate file. Link: https://lore.kernel.org/r/20200506065513.4668-4-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 RDMA/mlx5: Refactor mlx5_post_send() to improve readability (Max Gurtovoy)
Add small helpers in order to avoid code duplication and improve code readability. Decrease the amount of code in the gigantic post_send function and divide it to readable methods that will help in code maintenance in the future. Link: https://lore.kernel.org/r/20200506065513.4668-3-leon@kernel.org Signed-off-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-06 RDMA/mlx5: Define RoCEv2 udp source port when set path (Mark Zhang)
Calculate and set UDP source port based on the flow label. If flow label is not defined in GRH then calculate it based on lqpn/rqpn. Link: https://lore.kernel.org/r/20200504051935.269708-4-leon@kernel.org Signed-off-by: Mark Zhang <markz@mellanox.com> Reviewed-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
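The two-step derivation described above (use the GRH flow label if present, otherwise derive one from the QP numbers, then map it to a UDP source port) might look roughly like the sketch below. All names and constants here are illustrative assumptions: the 20-bit flow-label mask, the 0xC000 lower bound for the port range, and the particular mixing and folding steps are not quoted from the patch.

```c
#include <stdint.h>

#define FLOW_LABEL_MASK   0x000FFFFFu /* flow label is 20 bits wide */
#define ROCE_UDP_PORT_MIN 0xC000u     /* assumed lower bound of the sport range */

/* Derive a 20-bit flow label from the local and remote QP numbers. */
uint32_t calc_flow_label(uint32_t lqpn, uint32_t rqpn)
{
	uint64_t v = (uint64_t)lqpn * rqpn;

	v ^= v >> 20;
	v ^= v >> 40;
	return (uint32_t)(v & FLOW_LABEL_MASK);
}

/* Fold the 20-bit label into 14 bits and place it in 0xC000..0xFFFF. */
uint16_t flow_label_to_udp_sport(uint32_t fl)
{
	uint32_t fl_low = fl & 0x03FFFu;
	uint32_t fl_high = fl & 0xFC000u;

	fl_low ^= fl_high >> 14;
	return (uint16_t)(fl_low | ROCE_UDP_PORT_MIN);
}

/* Use the GRH flow label when set, else fall back to the QP numbers. */
uint16_t pick_udp_sport(uint32_t grh_flow_label, uint32_t lqpn, uint32_t rqpn)
{
	uint32_t fl = grh_flow_label ? grh_flow_label
				     : calc_flow_label(lqpn, rqpn);

	return flow_label_to_udp_sport(fl);
}
```

Deriving the port from per-connection values spreads flows across source ports, which gives ECMP hashing in the network something to distinguish connections by.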
2020-05-02 RDMA/mlx5: Set lag tx affinity according to slave (Maor Gottlieb)
The patch sets the lag tx affinity of the data QPs and the GSI QPs according to the LAG xmit slave. For GSI QPs, in case the link layer is Ethernet (RoCE), we create two GSI QPs, one for each physical port. When the driver selects the GSI QP, it will consider the port affinity result. For connected QPs, the driver sets the affinity of the xmit slave. The above ensures that an RC QP and its corresponding GSI QP will transmit from the same physical port. Link: https://lore.kernel.org/r/20200430192146.12863-17-maorg@mellanox.com Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-02 RDMA/mlx5: Refactor affinity related code (Maor Gottlieb)
Move the affinity-related code in modify QP to a function. This prepares for the next patch, which extends the affinity calculation to consider the xmit slave. Link: https://lore.kernel.org/r/20200430192146.12863-16-maorg@mellanox.com Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>