Age | Commit message (Collapse) | Author |
|
Now that the SRCU stuff has been removed the entire MR destroy logic can
be made a lot simpler. Currently there are many different ways to destroy a
MR and it makes it really hard to do this task correctly. Route all
destruction through mlx5_ib_dereg_mr() and make it work for all
situations.
Since it turns out all the different MR types do basically the same thing
this removes a lot of knowledge of MR internals from ODP and leaves ODP
just exporting an operation to clean up children.
This fixes a few weird corner cases bugs and firmly uses the correct
ordering of the MR destruction:
- Stop parallel access to the mkey via the ODP xarray
- Stop DMA
- Release the umem
- Clean up ODP children
- Free/Recycle the MR
Link: https://lore.kernel.org/r/20210304120745.1090751-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The struct mlx5_ib_mr can be used for three different things, but only one
at a time:
- In the user MR cache
- As a kernel MR
- As a user MR
Overlay the three things into a single union with the following rules:
- If the mr is found on the cache_ent->head list then it is a cache MR
and umem == NULL. The entire union is zero after the MR is removed from
the cache.
- If umem != NULL or type == IB_MR_TYPE_USER then it is a user MR.
- If umem == NULL then it is a kernel MR
This reduces the size of struct mlx5_ib_mr to 552 bytes from 702.
The only place the three flows overlap in the code is during dereg, so add
a few extra checks along there.
Link: https://lore.kernel.org/r/20210304120745.1090751-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
All of the ODP code assumes when it calls mlx5_mr_cache_alloc() the ODP
related fields are zero'd. This is true if the MR was just allocated, but
if the MR is recycled through the cache then the values are never zero'd.
This causes a bug in the odp_stats, they don't reset when the MR is
reallocated, also is_odp_implicit is never 0'd.
So we can use memset on a block of the mlx5_ib_mr reorganize the structure
to put all the data that can be zero'd by the cache at the end.
It is organized as an anonymous struct because the next patch will make
this a union.
Delete the unused smr_info. Don't set the kernel only desc_size on the
user path. No longer any need to zero mr->parent before freeing it, the
memset() will get it now.
Fixes: a3de94e3d61e ("IB/mlx5: Introduce ODP diagnostic counters")
Link: https://lore.kernel.org/r/20210304120745.1090751-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The HIP09 supports XRC transport service, it greatly saves the number of
QPs required to connect all processes in a large cluster.
Link: https://lore.kernel.org/r/1614826558-35423-1-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Set mkey_index to the correct place when preparing the destroy_mkey inbox,
this will fix the following syndrome:
mlx5_core 0000:08:00.0: mlx5_cmd_check:782:(pid 980716): DEALLOC_PD(0x801)
op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x31b635)
Link: https://lore.kernel.org/r/20210311142024.1282004-1-leon@kernel.org
Fixes: 1368ead04c36 ("RDMA/mlx5: Use strict get/set operations for obj_id")
Reviewed-by: Ido Kalir <idok@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
1. Don't set the ts_format bit to default when it reserved - device is
running in the old mode (free running).
2. XRC doesn't have a CQ therefore the ts format in the QP
context should be default / free running.
3. Set ts_format to WQ.
Fixes: 2fe8d4b87802 ("RDMA/mlx5: Fail QP creation if the device can not support the CQE TS")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
HIP09 uses new address space to map SQ doorbell registers, the doorbell of
each QP is isolated based on the size of 64KB, which can improve the
performance in concurrency scenarios.
Link: https://lore.kernel.org/r/1614082833-23130-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
The W=1 allmodconfig build produces the following warning:
drivers/infiniband/hw/mlx5/odp.c:1086: warning: wrong kernel-doc identifier on line:
* Parse a series of data segments for page fault handling.
Fix it by changing /** to be /* as it is written in kernel-doc
documentation.
Fixes: 5e769e444d26 ("RDMA/hw/mlx5/odp: Fix formatting and add missing descriptions in 'pagefault_data_segments()'")
Link: https://lore.kernel.org/r/20210302074214.1054299-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Set err to -ENOMEM if kzalloc fails instead of 0.
Fixes: 759738537142 ("IB/mlx5: Enable subscription for device events over DEVX")
Link: https://lore.kernel.org/r/20210222122343.19720-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Linux 5.11
Merged to resolve conflicts with RDMA rc commits
- drivers/infiniband/sw/rxe/rxe_net.c
The final logic is to call rxe_get_dev_from_net() again with the master
netdev if the packet was rx'd on a vlan. To keep the elimination of the
local variables requires a trivial edit to the code in -rc
Link: https://lore.kernel.org/r/20210210131542.215ea67c@canb.auug.org.au
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Leon Romanovsky says:
====================
Add an extra timestamp format for mlx5_ib device.
====================
Based on the mlx5-next branch at
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
* branch 'mlx5_timestamp':
RDMA/mlx5: Fail QP creation if the device can not support the CQE TS
net/mlx5: Add new timestamp mode bits
|
|
In ConnectX6Dx device, HW can work in real time timestamp mode according
to the device capabilities per RQ/SQ/QP.
When the flag IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION is set, the user
expect to get TS on the CQEs in free running format, so we need to fail
the QP creation if the current mode of the device doesn't support it.
Link: https://lore.kernel.org/r/20210209131107.698833-3-leon@kernel.org
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The traditional DevX CQ creation flow goes through mlx5_core_create_cq()
which checks that the given EQN corresponds to an existing EQ and attaches
a devx handler to the EQN for the CQ.
In some cases the EQ will not be a kernel EQ, but will be controlled by
modify CQ, don't block creating these just because the EQN can't be found
in the kernel.
Link: https://lore.kernel.org/r/20210211085549.1277674-1-leon@kernel.org
Signed-off-by: Tal Gilboa <talgi@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Fix the following coccicheck warning:
./drivers/infiniband/hw/qedr/qedr.h:629:9-10: WARNING: return of 0/1 in
function 'qedr_qp_has_rq' with return type bool.
./drivers/infiniband/hw/qedr/qedr.h:620:9-10: WARNING: return of 0/1 in
function 'qedr_qp_has_sq' with return type bool.
Link: https://lore.kernel.org/r/1612949901-109873-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot<abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Michal KalderonĀ <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
FRMR is not well-supported on HIP08, it is re-designed for HIP09 and the
position of related fields is changed. Then the ULPs should be forbidden
to use FRMR on older hardwares.
Link: https://lore.kernel.org/r/1612924424-28217-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Simplify __hns_roce_cmq_send() then remove the redundant variables.
Link: https://lore.kernel.org/r/1612688143-28226-6-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The register 0x07014 is actually the head pointer of CMDQ, and 0x07010
means tail pointer. Current definitions are confusing, so rename them and
related variables.
The next_to_use of structure hns_roce_v2_cmq_ring has the same semantics
as head, merge them into one member. The next_to_clean of structure
hns_roce_v2_cmq_ring has the same semantics as tail. After deleting
next_to_clean, tail should also be deleted.
Link: https://lore.kernel.org/r/1612688143-28226-5-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
CMDQ works serially, after each successful transmission, the head and tail
pointers will be equal, so there is no need to check whether the queue is
full. At the same time, since the descriptor of each transmission is new,
there is no need to perform a cleanup operation. Then, the field named
next_to_clean in structure hns_roce_v2_cmq_ring is redundant.
Fixes: a04ff739f2a9 ("RDMA/hns: Add command queue support for hip08 RoCE driver")
Link: https://lore.kernel.org/r/1612688143-28226-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
When posting a multi-descriptors command, the error code of previous
failed descriptors may be rewrote to 0 by a later successful descriptor.
Fixes: a04ff739f2a9 ("RDMA/hns: Add command queue support for hip08 RoCE driver")
Link: https://lore.kernel.org/r/1612688143-28226-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
last_status of structure hns_roce_v2_cmq has never been used, and the
variable named 'complete' in __hns_roce_cmq_send() is meaningless.
Fixes: a04ff739f2a9 ("RDMA/hns: Add command queue support for hip08 RoCE driver")
Link: https://lore.kernel.org/r/1612688143-28226-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Support 400Gbps IB rate in mlx5 driver.
Link: https://lore.kernel.org/r/20210209130429.698237-1-leon@kernel.org
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
qedr_gsi_post_send() has a debug output which prints the return value of
in_irq() and irqs_disabled().
The result of the in_irq(), even if invoked from an interrupt handler, is
subject to change depending on the `threadirqs' command line switch. The
result of irqs_disabled() is always be 1 because the function acquires
spinlock_t with spin_lock_irqsave().
Remove in_irq() and irqs_disabled() from the debug output because it
provides little value.
Link: https://lore.kernel.org/r/20210208193347.383254-1-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Cleanup the synchronize_srcu() from the ODP flow as it was found to be a
very heavy time consumer as part of dereg_mr.
For example de-registration of 10000 ODP MRs each with size of 2M hugepage
took 19.6 sec comparing de-registration of same number of non ODP MRs that
took 172 ms.
The new locking scheme uses the wait_event() mechanism which follows the
use count of the MR instead of using synchronize_srcu().
By that change, the time required for the above test took 95 ms which is
even better than the non ODP flow.
Once fully dropped the srcu usage, had to come with a lock to protect the
XA access.
As part of using the above mechanism we could also clean the
num_deferred_work stuff and follow the use count instead.
Link: https://lore.kernel.org/r/20210202071309.2057998-1-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
There is no need to use a for loop to assign values for an array of cmd
descriptors which has only two elements.
Link: https://lore.kernel.org/r/1612517974-31867-13-git-send-email-liweihang@huawei.com
Signed-off-by: Xinhao Liu <liuxinhao5@hisilicon.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The hns driver wrap around the consumer index of AEQ and CEQ when they
reach to two times of queue entries number for owner mechanism, actually,
it is unnecessary to wrap around since the hardware itself will mask it
before use.
Link: https://lore.kernel.org/r/1612517974-31867-12-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
All fields of WQE will be rewrote, so the memset is unnecessary. And when
SQ is working in OWNER mode, the pipeline may prefetch the WQEs beyond PI,
the memset operation may flip the owner bit too early, then the pipeline
may get a wrong WQ.
Link: https://lore.kernel.org/r/1612517974-31867-11-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Use macros instead of magic numbers to represent shift of dma_handle_wqe,
dma_handle_idx and UDP destination port number of RoCEv2.
Link: https://lore.kernel.org/r/1612517974-31867-10-git-send-email-liweihang@huawei.com
Signed-off-by: Xinhao Liu <liuxinhao5@hisilicon.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
hns_roce_device.h is not specific to hardware, some definitions are only
used for HIP06, they should be moved into hns_roce_hw_v1.h.
Link: https://lore.kernel.org/r/1612517974-31867-9-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Currently, the driver updates doorbell looks like this:
post()
{
wqe.field = 0x111;
wmb();
update_wq_db();
}
update_wq_db()
{
db.field = 0x222;
__raw_writeq(db, db_reg);
}
writeq() is a better choice than __raw_writeq() because it calls dma_wmb()
to barrier in ARM64, and dma_wmb() is better than wmb() for ROCEE device.
This patch removes all wmb() before updating doorbell of SQ/RQ/CQ/SRQ by
replacing __raw_writeq() with writeq() to improve performence. The new
process looks like this:
post()
{
wqe.field = 0x111;
update_wq_db();
}
update_wq_db()
{
db.field = 0x222;
writeq(db, db_reg);
}
Link: https://lore.kernel.org/r/1612517974-31867-8-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Since HIP09 does not require this function, it should be masked.
Link: https://lore.kernel.org/r/1612517974-31867-7-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This feature should only be enabled by querying capability from firmware.
Fixes: ba6bb7e97421 ("RDMA/hns: Add interfaces to get pf capabilities from firmware")
Link: https://lore.kernel.org/r/1612517974-31867-5-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Add the mapped page count checking flow to avoid invalid page size when
creating MTR.
Fixes: 38389eaa4db1 ("RDMA/hns: Add mtr support for mixed multihop addressing")
Link: https://lore.kernel.org/r/1612517974-31867-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This bit should be in type of enum ib_sig_type, or there will be a sparse
warning.
Fixes: bfe860351e31 ("RDMA/hns: Fix cast from or to restricted __le32 for driver")
Link: https://lore.kernel.org/r/1612517974-31867-3-git-send-email-liweihang@huawei.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
ULP usually set IB(V)_QP_AV when trying to modify QP to RTR if they want
to record sgid index into QPC. For UD QPs, it is useless because it will
be included in WQE. For RC QPs, it will be filled in
hns_roce_set_path(). So sgid index shouldn't be filled by default. Then
hns_get_gid_index() is moved to hns_roce_hw_v1.c because it is only called
in it.
Fixes: 926a01dc000d ("RDMA/hns: Add QP operations support for hip08 SoC")
Link: https://lore.kernel.org/r/1612517974-31867-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Direct wqe is a mechanism to fill wqe directly into the hardware. In the
case of light load, the wqe will be filled into pcie bar space of the
hardware, this will reduce one memory access operation and therefore
reduce the latency.
Link: https://lore.kernel.org/r/1611997513-27107-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The post_recv only supports QP types of RC, GSI and UD.
Link: https://lore.kernel.org/r/1611997090-48820-13-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The SRQ in the hns driver consists of the following four parts:
* wqe buf: the buffer to store WQE.
* wqe_idx buf: the cqe of SRQ may be not generated in the order of wqe, so
the wqe_idx corresponding to the idle WQE needs to be pushed into the
index queue which is a FIFO, then it instructs the hardware to obtain
the corresponding WQE.
* bitmap: bitmap is used to generate and release wqe_idx. When the user
has a new WR, the driver finds the idx of the idle wqe in bitmap. When
the CQE of wqe is generated, the driver will release the idx.
* wr_id buf: wr_id buf is used to store the user's wr_id, then return it
to the user when poll_cq verb is invoked.
The process of post SRQ recv is refactored to make preceding code clearer.
Link: https://lore.kernel.org/r/1611997090-48820-12-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The HIP09 requires the driver to clear the unused data segments in wqe
buffer to make the hns ROCEE stop reading the remaining invalid sges for
RQ.
Link: https://lore.kernel.org/r/1611997090-48820-11-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Refactor post recv flow by removing unnecessary checking and removing
duplicated code.
Link: https://lore.kernel.org/r/1611997090-48820-10-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Use new register operation interfaces to simplify the process of write SRQ
Context.
Link: https://lore.kernel.org/r/1611997090-48820-9-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Reduce parameter numbers of write_srqc() and move some related code into
it from alloc_srqc().
Link: https://lore.kernel.org/r/1611997090-48820-8-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Split the SRQ creation process into multiple steps and encapsulate them
into functions.
Link: https://lore.kernel.org/r/1611997090-48820-7-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Each SRQs contain an reserved WQE, it is inappropriate and should be
removed.
Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Link: https://lore.kernel.org/r/1611997090-48820-6-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
When an error occurs, the qp_table must be cleared, regardless of whether
the SRQ feature is enabled.
Fixes: 5c1f167af112 ("RDMA/hns: Init SRQ table for hip08")
Link: https://lore.kernel.org/r/1611997090-48820-5-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
According to the IB Specification, srq_limit shouldn't be configured
during SRQ creation. If a user set srq_limit at this time, the driver
should forced it to zero, or the result of creating SRQ will conflict with
the result of querying SRQ.
Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Link: https://lore.kernel.org/r/1611997090-48820-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
If a user posts WR by wr_list, the head pointer of idx_queue won't be
updated until all wqes are filled, so the judgment of whether head equals
to tail will get a wrong result. Fix above issue and move the head and
tail pointer from the srq structure into the idx_queue structure. After
idx_queue is filled with wqe idx, the head pointer of it will increase.
Fixes: c7bcb13442e1 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Link: https://lore.kernel.org/r/1611997090-48820-3-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The RQ/SRQ of HIP08 needs one special sge to stop receive reliably. So the
driver needs to allocate at least one SGE when creating RQ/SRQ and ensure
that at least one SGE is filled with the special value during post_recv.
Besides, the kernel driver should only do this for kernel ULP. For
userspace ULP, the userspace driver will allocate the reserved SGE in
buffer, and the kernel driver just needs to pin the corresponding size of
memory based on the userspace driver's requirements.
Link: https://lore.kernel.org/r/1611997090-48820-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Instead of open coding the loop for port iteration, use rdma_for_each_port
macro provided by core.
To use such macro, early initialization of phys_port_cnt is needed.
Hence, initialize such constant early enough in the init stage.
Whichever functions are called with port using rdma_for_each_port(),
convert their port type from u8 to unsigned int to match the core API.
Link: https://lore.kernel.org/r/20210203130133.4057329-6-leon@kernel.org
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Currently mlx5 driver caches port GID table length for 2 ports. It is
also cached by IB core as port immutable data.
When mlx5 representor ports are present, which are usually more than 2,
invalid access to port_caps array can happen while validating the GID
table length which is only for 2 ports.
To avoid this, take help of the IB cores port immutable data by exposing
an API to read the port immutable fields.
Remove mlx5 driver's internal cache, thereby reduce code and data.
Link: https://lore.kernel.org/r/20210203130133.4057329-5-leon@kernel.org
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|