Age | Commit message (Collapse) | Author |
|
Pull block fixes from Jens Axboe:
- dasd spelling fixes (Bhaskar)
- Limit bio max size on multi-page bvecs to the hardware limit, to
avoid overly large bio's (and hence latencies). Originally queued for
the merge window, but needed a fix and was dropped from the initial
pull (Changheun)
- NVMe pull request (Christoph):
- reset the bdev to ns head when failover (Daniel Wagner)
- remove unsupported command noise (Keith Busch)
- misc passthrough improvements (Kanchan Joshi)
- fix controller ioctl through ns_head (Minwoo Im)
- fix controller timeouts during reset (Tao Chiu)
- rnbd fixes/cleanups (Gioh, Md, Dima)
- Fix iov_iter re-expansion (yangerkun)
* tag 'block-5.13-2021-05-07' of git://git.kernel.dk/linux-block:
block: reexpand iov_iter after read/write
nvmet: remove unsupported command noise
nvme-multipath: reset bdev to ns head when failover
nvme-pci: fix controller reset hang when racing with nvme_timeout
nvme: move the fabrics queue ready check routines to core
nvme: avoid memset for passthrough requests
nvme: add nvme_get_ns helper
nvme: fix controller ioctl through ns_head
bio: limit bio max size
RDMA/rtrs: fix uninitialized symbol 'cnt'
s390: dasd: Mundane spelling fixes
block/rnbd: Remove all likely and unlikely
block/rnbd-clt: Check the return value of the function rtrs_clt_query
block/rnbd: Fix style issues
block/rnbd-clt: Change queue_depth type in rnbd_clt_session to size_t
|
|
rtrs_clt_rdma_cq_direct returns an ninitialized value in cnt
if there is no session. This patch makes rtrs_clt_rdma_cq_direct
returns a negative value for block layer not to try again.
Fixes: 2958a995edc94 ("block/rnbd-clt: Support polling mode for IO latency optimization")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Link: https://lore.kernel.org/r/20210429092741.266533-1-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Pull rdma updates from Jason Gunthorpe:
"This is significantly bug fixes and general cleanups. The noteworthy
new features are fairly small:
- XRC support for HNS and improves RQ operations
- Bug fixes and updates for hns, mlx5, bnxt_re, hfi1, i40iw, rxe, siw
and qib
- Quite a few general cleanups on spelling, error handling, static
checker detections, etc
- Increase the number of device ports supported beyond 255. High port
count software switches now exist
- Several bug fixes for rtrs
- mlx5 Device Memory support for host controlled atomics
- Report SRQ tables through to rdma-tool"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (145 commits)
IB/qib: Remove redundant assignment to ret
RDMA/nldev: Add copy-on-fork attribute to get sys command
RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
RDMA/siw: Fix a use after free in siw_alloc_mr
IB/hfi1: Remove redundant variable rcd
RDMA/nldev: Add QP numbers to SRQ information
RDMA/nldev: Return SRQ information
RDMA/restrack: Add support to get resource tracking for SRQ
RDMA/nldev: Return context information
RDMA/core: Add CM to restrack after successful attachment to a device
RDMA/cma: Skip device which doesn't support CM
RDMA/rxe: Fix a bug in rxe_fill_ip_info()
RDMA/mlx5: Expose private query port
RDMA/mlx4: Remove an unused variable
RDMA/mlx5: Fix type assignment for ICM DM
IB/mlx5: Set right RoCE l3 type and roce version while deleting GID
RDMA/i40iw: Fix error unwinding when i40iw_hmc_sd_one fails
RDMA/cxgb4: add missing qpid increment
IB/ipoib: Remove unnecessary struct declaration
RDMA/bnxt_re: Get rid of custom module reference counting
...
|
|
Pull block driver updates from Jens Axboe:
- MD changes via Song:
- raid5 POWER fix
- raid1 failure fix
- UAF fix for md cluster
- mddev_find_or_alloc() clean up
- Fix NULL pointer deref with external bitmap
- Performance improvement for raid10 discard requests
- Fix missing information of /proc/mdstat
- rsxx const qualifier removal (Arnd)
- Expose allocated brd pages (Calvin)
- rnbd via Gioh Kim:
- Change maintainer
- Change domain address of maintainers' email
- Add polling IO mode and document update
- Fix memory leak and some bug detected by static code analysis
tools
- Code refactoring
- Series of floppy cleanups/fixes (Denis)
- s390 dasd fixes (Julian)
- kerneldoc fixes (Lee)
- null_blk double free (Lv)
- null_blk virtual boundary addition (Max)
- Remove xsysace driver (Michal)
- umem driver removal (Davidlohr)
- ataflop fixes (Dan)
- Revalidate disk removal (Christoph)
- Bounce buffer cleanups (Christoph)
- Mark lightnvm as deprecated (Christoph)
- mtip32xx init cleanups (Shixin)
- Various fixes (Tian, Gustavo, Coly, Yang, Zhang, Zhiqiang)
* tag 'for-5.13/drivers-2021-04-27' of git://git.kernel.dk/linux-block: (143 commits)
async_xor: increase src_offs when dropping destination page
drivers/block/null_blk/main: Fix a double free in null_init.
md/raid1: properly indicate failure when ending a failed write request
md-cluster: fix use-after-free issue when removing rdev
nvme: introduce generic per-namespace chardev
nvme: cleanup nvme_configure_apst
nvme: do not try to reconfigure APST when the controller is not live
nvme: add 'kato' sysfs attribute
nvme: sanitize KATO setting
nvmet: avoid queuing keep-alive timer if it is disabled
brd: expose number of allocated pages in debugfs
ataflop: fix off by one in ataflop_probe()
ataflop: potential out of bounds in do_format()
drbd: Fix fall-through warnings for Clang
block/rnbd: Use strscpy instead of strlcpy
block/rnbd-clt-sysfs: Remove copy buffer overlap in rnbd_clt_get_path_name
block/rnbd-clt: Remove max_segment_size
block/rnbd-clt: Generate kobject_uevent when the rnbd device state changes
block/rnbd-srv: Remove unused arguments of rnbd_srv_rdma_ev
Documentation/ABI/rnbd-clt: Add description for nr_poll_queues
...
|
|
We always map with SZ_4K, so do not need max_segment_size.
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-18-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
struct rtrs_srv is not used when handling rnbd_srv_rdma_ev messages, so
cleaned up
rdma_ev function pointer in rtrs_srv_ops also is changed.
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Aleksei Marov <aleksei.marov@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-16-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
RNBD can make double-queues for irq-mode and poll-mode.
For example, on 4-CPU system 8 request-queues are created,
4 for irq-mode and 4 for poll-mode.
If the IO has HIPRI flag, the block-layer will call .poll function
of RNBD. Then IO is sent to the poll-mode queue.
Add optional nr_poll_queues argument for map_devices interface.
To support polling of RNBD, RTRS client creates connections
for both of irq-mode and direct-poll-mode.
For example, on 4-CPU system it could've create 5 connections:
con[0] => user message (softirq cq)
con[1:4] => softirq cq
After this patch, it can create 9 connections:
con[0] => user message (softirq cq)
con[1:4] => softirq cq
con[5:8] => DIRECT-POLL cq
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-14-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
They are defined with the same value and similar meaning, let's remove
one of them, then we can remove {WAIT,NOWAIT}.
Also change the type of 'wait' from 'int' to 'enum wait_type' to make
it clear.
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: linux-rdma@vger.kernel.org
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20210419073722.15351-9-gi-oh.kim@ionos.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Two error messages are only different message but have common
code to generate the path string.
Link: https://lore.kernel.org/r/20210406123639.202899-4-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
It does not help to debug if it only print error message
without any debugging information which session and connection
the error happened.
Link: https://lore.kernel.org/r/20210406123639.202899-3-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Client prints only error value and it is not enough for debugging.
1. When client receives an error from server: the client does not only
print the error value but also more information of server connection.
2. When client failes to send IO: the client gets an error from RDMA
layer. It also print more information of server connection.
Link: https://lore.kernel.org/r/20210406123639.202899-2-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
It shows the latest latency that the client checked when sending the
heart-beat.
Link: https://lore.kernel.org/r/20210407113444.150961-3-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This patch adds new multipath policy: min-latency. Client checks the
latency of each path when it sends the heart-beat. And it sends IO to the
path with the minimum latency.
Link: https://lore.kernel.org/r/20210407113444.150961-2-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
A session can be removed dynamically by sysfs interface "remove_path" that
eventually calls rtrs_clt_remove_path_from_sysfs function. The current
rtrs_clt_remove_path_from_sysfs first removes the sysfs interfaces and
frees sess->stats object. Second it removes the session from the active
list.
Therefore some functions could access non-connected session and access the
freed sess->stats object even-if they check the session status before
accessing the session.
For instance rtrs_clt_request and get_next_path_min_inflight check the
session status and try to send IO to the session. The session status
could be changed when they are trying to send IO but they could not catch
the change and update the statistics information in sess->stats object,
and generate use-after-free problem.
(see: "RDMA/rtrs-clt: Check state of the rtrs_clt_sess before reading its
stats")
This patch changes the rtrs_clt_remove_path_from_sysfs to remove the
session from the active session list and then destroy the sysfs
interfaces.
Each function still should check the session status because closing or
error recovery paths can change the status.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20210412084002.33582-1-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Reviewed-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Max io size is limited by both remote buffer size and the max fr pages per
mr.
Link: https://lore.kernel.org/r/20210325153308.1214057-20-gi-oh.kim@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Link: https://lore.kernel.org/r/20210325153308.1214057-18-gi-oh.kim@ionos.com
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Before receiving the session name, the error message cannot include any
information about which connection generates the error.
This patch stores the addresses of source and target in the sessname field
to show which generates the error. That field will be over-written
when receiving the session name from client.
Link: https://lore.kernel.org/r/20210325153308.1214057-17-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
There is common code converting addresses of source machine and
destination machine to a string. We already have a struct rtrs_addr to
store two addresses. This patch introduces a new function that converts
two addresses into one string with struct rtrs_addr.
Link: https://lore.kernel.org/r/20210325153308.1214057-14-gi-oh.kim@ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
KASAN detected the following BUG:
BUG: KASAN: use-after-free in rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client]
Read of size 8 at addr ffff88bf2fb4adc0 by task swapper/0/0
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G O 5.4.84-pserver #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10
Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00 09/04/2012
Call Trace:
<IRQ>
dump_stack+0x96/0xe0
print_address_description.constprop.4+0x1f/0x300
? irq_work_claim+0x2e/0x50
__kasan_report.cold.8+0x78/0x92
? rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client]
kasan_report+0x10/0x20
rtrs_clt_update_wc_stats+0x41/0x100 [rtrs_client]
rtrs_clt_rdma_done+0xb1/0x760 [rtrs_client]
? lockdep_hardirqs_on+0x1a8/0x290
? process_io_rsp+0xb0/0xb0 [rtrs_client]
? mlx4_ib_destroy_cq+0x100/0x100 [mlx4_ib]
? add_interrupt_randomness+0x1a2/0x340
__ib_process_cq+0x97/0x100 [ib_core]
ib_poll_handler+0x41/0xb0 [ib_core]
irq_poll_softirq+0xe0/0x260
__do_softirq+0x127/0x672
irq_exit+0xd1/0xe0
do_IRQ+0xa3/0x1d0
common_interrupt+0xf/0xf
</IRQ>
RIP: 0010:cpuidle_enter_state+0xea/0x780
Code: 31 ff e8 99 48 47 ff 80 7c 24 08 00 74 12 9c 58 f6 c4 02 0f 85 53 05 00 00 31 ff e8 b0 6f 53 ff e8 ab 4f 5e ff fb 8b 44 24 04 <85> c0 0f 89 f3 01 00 00 48 8d 7b 14 e8 65 1e 77 ff c7 43 14 00 00
RSP: 0018:ffffffffab007d58 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffca
RAX: 0000000000000002 RBX: ffff88b803d69800 RCX: ffffffffa91a8298
RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffffffffab021414
RBP: ffffffffab6329e0 R08: 0000000000000002 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
R13: 000000bf39d82466 R14: ffffffffab632aa0 R15: ffffffffab632ae0
? lockdep_hardirqs_on+0x1a8/0x290
? cpuidle_enter_state+0xe5/0x780
cpuidle_enter+0x3c/0x60
do_idle+0x2fb/0x390
? arch_cpu_idle_exit+0x40/0x40
? schedule+0x94/0x120
cpu_startup_entry+0x19/0x1b
start_kernel+0x5da/0x61b
? thread_stack_cache_init+0x6/0x6
? load_ucode_amd_bsp+0x6f/0xc4
? init_amd_microcode+0xa6/0xa6
? x86_family+0x5/0x20
? load_ucode_bsp+0x182/0x1fd
secondary_startup_64+0xa4/0xb0
Allocated by task 5730:
save_stack+0x19/0x80
__kasan_kmalloc.constprop.9+0xc1/0xd0
kmem_cache_alloc_trace+0x15b/0x350
alloc_sess+0xf4/0x570 [rtrs_client]
rtrs_clt_open+0x3b4/0x780 [rtrs_client]
find_and_get_or_create_sess+0x649/0x9d0 [rnbd_client]
rnbd_clt_map_device+0xd7/0xf50 [rnbd_client]
rnbd_clt_map_device_store+0x4ee/0x970 [rnbd_client]
kernfs_fop_write+0x141/0x240
vfs_write+0xf3/0x280
ksys_write+0xba/0x150
do_syscall_64+0x68/0x270
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Freed by task 5822:
save_stack+0x19/0x80
__kasan_slab_free+0x125/0x170
kfree+0xe7/0x3f0
kobject_put+0xd3/0x240
rtrs_clt_destroy_sess_files+0x3f/0x60 [rtrs_client]
rtrs_clt_close+0x3c/0x80 [rtrs_client]
close_rtrs+0x45/0x80 [rnbd_client]
rnbd_client_exit+0x10f/0x2bd [rnbd_client]
__x64_sys_delete_module+0x27b/0x340
do_syscall_64+0x68/0x270
entry_SYSCALL_64_after_hwframe+0x49/0xbe
When rtrs_clt_close is triggered, it iterates over all the present
rtrs_clt_sess and triggers close on them. However, the call to
rtrs_clt_destroy_sess_files is done before the rtrs_clt_close_conns. This
is incorrect since during the initialization phase we allocate
rtrs_clt_sess first, and then we go ahead and create rtrs_clt_con for it.
If we free the rtrs_clt_sess structure before closing the rtrs_clt_con, it
may so happen that an inflight IO completion would trigger the function
rtrs_clt_rdma_done, which would lead to the above UAF case.
Hence close the rtrs_clt_con connections first, and then trigger the
destruction of session files.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20210325153308.1214057-12-gi-oh.kim@ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Let these cases share the same path since all of them need to close
session.
Link: https://lore.kernel.org/r/20210325153308.1214057-11-gi-oh.kim@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The two members are not used in the code, so remove them.
Link: https://lore.kernel.org/r/20210325153308.1214057-10-gi-oh.kim@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
We can remove the label after move put_device to the right place.
Link: https://lore.kernel.org/r/20210325153308.1214057-9-gi-oh.kim@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
There is no need to dereference 's' from 'sess', since we have "sess =
to_clt_sess(s)" before.
And we can deference 'dev' from 's' earlier.
Link: https://lore.kernel.org/r/20210325153308.1214057-8-gi-oh.kim@ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@ionos.com>
Reviewed-by: Danil Kipnis <danil.kipnis@ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
It's easier to understand a string instead of enum.
Link: https://lore.kernel.org/r/20210222141551.54345-2-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Have the driver use shared CQs which provids a ~10%-20% improvement during
test.
Instead of opening a CQ for each QP per connection, a CQ for each QP will
be provided by the RDMA core driver that will be shared between the QPs on
that core reducing interrupt overhead.
Link: https://lore.kernel.org/r/20210222141551.54345-1-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
smatch gives the warning:
drivers/infiniband/ulp/rtrs/rtrs-srv.c:1805 rtrs_rdma_connect() warn: passing zero to 'PTR_ERR'
Which is trying to say smatch has shown that srv is not an error pointer
and thus cannot be passed to PTR_ERR.
The solution is to move the list_add() down after full initilization of
rtrs_srv. To avoid holding the srv_mutex too long, only hold it during the
list operation as suggested by Leon.
Fixes: 03e9b33a0fd6 ("RDMA/rtrs: Only allow addition of path to an already established session")
Link: https://lore.kernel.org/r/20210216143807.65923-1-jinpu.wang@cloud.ionos.com
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
put_device() decreases the ref-count and then the device will be
cleaned-up, while at is also add missing put_device in
rtrs_srv_create_once_sysfs_root_folders
This patch solves a kmemleak error as below:
unreferenced object 0xffff88809a7a0710 (size 8):
comm "kworker/4:1H", pid 113, jiffies 4295833049 (age 6212.380s)
hex dump (first 8 bytes):
62 6c 61 00 6b 6b 6b a5 bla.kkk.
backtrace:
[<0000000054413611>] kstrdup+0x2e/0x60
[<0000000078e3120a>] kobject_set_name_vargs+0x2f/0xb0
[<00000000f1a17a6b>] dev_set_name+0xab/0xe0
[<00000000d5502e32>] rtrs_srv_create_sess_files+0x2fb/0x314 [rtrs_server]
[<00000000ed11a1ef>] rtrs_srv_info_req_done+0x631/0x800 [rtrs_server]
[<000000008fc5aa8f>] __ib_process_cq+0x94/0x100 [ib_core]
[<00000000a9599cb4>] ib_cq_poll_work+0x32/0xc0 [ib_core]
[<00000000cfc376be>] process_one_work+0x4bc/0x980
[<0000000016e5c96a>] worker_thread+0x78/0x5c0
[<00000000c20b8be0>] kthread+0x191/0x1e0
[<000000006c9c0003>] ret_from_fork+0x3a/0x50
Fixes: baa5b28b7a47 ("RDMA/rtrs-srv: Replace device_register with device_initialize and device_add")
Link: https://lore.kernel.org/r/20210212134525.103456-5-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
kmemleak reported an error as below:
unreferenced object 0xffff8880674b7640 (size 64):
comm "kworker/4:1H", pid 113, jiffies 4296403507 (age 507.840s)
hex dump (first 32 bytes):
69 70 3a 31 39 32 2e 31 36 38 2e 31 32 32 2e 31 ip:192.168.122.1
31 30 40 69 70 3a 31 39 32 2e 31 36 38 2e 31 32 10@ip:192.168.12
backtrace:
[<0000000054413611>] kstrdup+0x2e/0x60
[<0000000078e3120a>] kobject_set_name_vargs+0x2f/0xb0
[<00000000ca2be3ee>] kobject_init_and_add+0xb0/0x120
[<0000000062ba5e78>] rtrs_srv_create_sess_files+0x14c/0x314 [rtrs_server]
[<00000000b45b7217>] rtrs_srv_info_req_done+0x5b1/0x800 [rtrs_server]
[<000000008fc5aa8f>] __ib_process_cq+0x94/0x100 [ib_core]
[<00000000a9599cb4>] ib_cq_poll_work+0x32/0xc0 [ib_core]
[<00000000cfc376be>] process_one_work+0x4bc/0x980
[<0000000016e5c96a>] worker_thread+0x78/0x5c0
[<00000000c20b8be0>] kthread+0x191/0x1e0
[<000000006c9c0003>] ret_from_fork+0x3a/0x50
It is caused by the not-freed kobject of rtrs_srv_sess. The kobject
embedded in rtrs_srv_sess has ref-counter 2 after calling
process_info_req(). Therefore it must call kobject_put twice. Currently
it calls kobject_put only once at rtrs_srv_destroy_sess_files because
kobject_del removes the state_in_sysfs flag and then kobject_put in
free_sess() is not called.
This patch moves kobject_del() into free_sess() so that the kobject of
rtrs_srv_sess can be freed. And also this patch adds the missing call of
sysfs_remove_group() to clean-up the sysfs directory.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20210212134525.103456-4-jinpu.wang@cloud.ionos.com
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
While adding a path from the client side to an already established
session, it was possible to provide the destination IP to a different
server. This is dangerous.
This commit adds an extra member to the rtrs_msg_conn_req structure, named
first_conn; which is supposed to notify if the connection request is the
first for that session or not.
On the server side, if a session does not exist but the first_conn
received inside the rtrs_msg_conn_req structure is 1, the connection
request is failed. This signifies that the connection request is for an
already existing session, and since the server did not find one, it is an
wrong connection request.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20210212134525.103456-3-jinpu.wang@cloud.ionos.com
Signed-off-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
BUG: KASAN: stack-out-of-bounds in _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
Read of size 4 at addr ffff8880d5a7f980 by task kworker/0:1H/565
CPU: 0 PID: 565 Comm: kworker/0:1H Tainted: G O 5.4.84-storage #5.4.84-1+feature+linux+5.4.y+dbg+20201216.1319+b6b887b~deb10
Hardware name: Supermicro H8QG6/H8QG6, BIOS 3.00 09/04/2012
Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
Call Trace:
dump_stack+0x96/0xe0
print_address_description.constprop.4+0x1f/0x300
? irq_work_claim+0x2e/0x50
__kasan_report.cold.8+0x78/0x92
? _mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
kasan_report+0x10/0x20
_mlx4_ib_post_send+0x1bd2/0x2770 [mlx4_ib]
? check_chain_key+0x1d7/0x2e0
? _mlx4_ib_post_recv+0x630/0x630 [mlx4_ib]
? lockdep_hardirqs_on+0x1a8/0x290
? stack_depot_save+0x218/0x56e
? do_profile_hits.isra.6.cold.13+0x1d/0x1d
? check_chain_key+0x1d7/0x2e0
? save_stack+0x4d/0x80
? save_stack+0x19/0x80
? __kasan_slab_free+0x125/0x170
? kfree+0xe7/0x3b0
rdma_write_sg+0x5b0/0x950 [rtrs_server]
The problem is when we send imm_wr, the type should be ib_rdma_wr, so hw
driver like mlx4 can do rdma_wr(wr), so fix it by use the ib_rdma_wr as
type for imm_wr.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20210212134525.103456-2-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
When KASAN is enabled, we notice warning below:
[ 483.436975] ==================================================================
[ 483.437234] BUG: KASAN: stack-out-of-bounds in _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[ 483.437430] Read of size 4 at addr ffff88a195fd7d30 by task kworker/1:3/6954
[ 483.437731] CPU: 1 PID: 6954 Comm: kworker/1:3 Kdump: loaded Tainted: G O 5.4.82-pserver #5.4.82-1+feature+linux+5.4.y+dbg+20201210.1532+987e7a6~deb10
[ 483.437976] Hardware name: Supermicro Super Server/X11DDW-L, BIOS 3.3 02/21/2020
[ 483.438168] Workqueue: rtrs_server_wq hb_work [rtrs_core]
[ 483.438323] Call Trace:
[ 483.438486] dump_stack+0x96/0xe0
[ 483.438646] ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[ 483.438802] print_address_description.constprop.6+0x1b/0x220
[ 483.438966] ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[ 483.439133] ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[ 483.439285] __kasan_report.cold.9+0x1a/0x32
[ 483.439444] ? _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[ 483.439597] kasan_report+0x10/0x20
[ 483.439752] _mlx5_ib_post_send+0x188a/0x2560 [mlx5_ib]
[ 483.439910] ? update_sd_lb_stats+0xfb1/0xfc0
[ 483.440073] ? set_reg_wr+0x520/0x520 [mlx5_ib]
[ 483.440222] ? update_group_capacity+0x340/0x340
[ 483.440377] ? find_busiest_group+0x314/0x870
[ 483.440526] ? update_sd_lb_stats+0xfc0/0xfc0
[ 483.440683] ? __bitmap_and+0x6f/0x100
[ 483.440832] ? __lock_acquire+0xa2/0x2150
[ 483.440979] ? __lock_acquire+0xa2/0x2150
[ 483.441128] ? __lock_acquire+0xa2/0x2150
[ 483.441279] ? debug_lockdep_rcu_enabled+0x23/0x60
[ 483.441430] ? lock_downgrade+0x390/0x390
[ 483.441582] ? __lock_acquire+0xa2/0x2150
[ 483.441729] ? __lock_acquire+0xa2/0x2150
[ 483.441876] ? newidle_balance+0x425/0x8f0
[ 483.442024] ? __lock_acquire+0xa2/0x2150
[ 483.442172] ? debug_lockdep_rcu_enabled+0x23/0x60
[ 483.442330] hb_work+0x15d/0x1d0 [rtrs_core]
[ 483.442479] ? schedule_hb+0x50/0x50 [rtrs_core]
[ 483.442627] ? lock_downgrade+0x390/0x390
[ 483.442781] ? process_one_work+0x40d/0xa50
[ 483.442931] process_one_work+0x4ee/0xa50
[ 483.443082] ? pwq_dec_nr_in_flight+0x110/0x110
[ 483.443231] ? do_raw_spin_lock+0x119/0x1d0
[ 483.443383] worker_thread+0x65/0x5c0
[ 483.443532] ? process_one_work+0xa50/0xa50
[ 483.451839] kthread+0x1e2/0x200
[ 483.451983] ? kthread_create_on_node+0xc0/0xc0
[ 483.452139] ret_from_fork+0x3a/0x50
The problem is we use wrong type when send wr, hw driver expect the type
of IB_WR_RDMA_WRITE_WITH_IMM wr should be ib_rdma_wr, and doing
container_of to access member. The fix is simple use ib_rdma_wr instread
of ib_send_wr.
Fixes: c0894b3ea69d ("RDMA/rtrs: core: lib functions shared between client and server modules")
Link: https://lore.kernel.org/r/20201217141915.56989-20-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Fix up wr_avail accounting. if wr_cnt is 0, then we do SIGNAL for first
wr, in completion we add queue_depth back, which is not right in the
sense of tracking for available wr.
So fix it by init wr_cnt to 1.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-19-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
We do not need to wait for REG_MR completion, so remove the
SIGNAL flag.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-18-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
We may want to add new flags, so it's better to use bitmask to check flags.
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-17-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
For HB, there is no need to generate signal for completion.
Also remove a comment accordingly.
Fixes: c0894b3ea69d ("RDMA/rtrs: core: lib functions shared between client and server modules")
Link: https://lore.kernel.org/r/20201217141915.56989-16-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reported-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Make all failure cases go to the common path to avoid duplicate code.
And some issued existed before.
1. clt need to be freed to avoid memory leak.
2. return ERR_PTR(-ENOMEM) if kobject_create_and_add fails, because
rtrs_clt_open checks the return value of by call "IS_ERR(clt)".
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-15-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
We had a few places wr_cqe is not set, which could lead to NULL pointer
deref or GPF in error case.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-14-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Let's rename it to rtrs_clt_change_state since the previous one is
killed.
Also update the comment to make it more clear.
Link: https://lore.kernel.org/r/20201217141915.56989-13-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
It is just a wrapper of rtrs_clt_change_state_get_old, and we can reuse
rtrs_clt_change_state_get_old with add the checking of 'old_state' is
valid or not.
Link: https://lore.kernel.org/r/20201217141915.56989-12-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
This is not needed since the label is just after the place.
Link: https://lore.kernel.org/r/20201217141915.56989-11-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Let's wait the inflight permits before free it.
Link: https://lore.kernel.org/r/20201217141915.56989-10-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Since the two functions are called together, let's consolidate them in
a new function rtrs_clt_destroy_sysfs_root.
Link: https://lore.kernel.org/r/20201217141915.56989-9-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Per the comment of kobject_init_and_add, we need to free the memory
by call kobject_put.
Fixes: 215378b838df ("RDMA/rtrs: client: sysfs interface functions")
Fixes: 91b11610af8d ("RDMA/rtrs: server: sysfs interface functions")
Link: https://lore.kernel.org/r/20201217141915.56989-8-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
The rtrs_iu_free is called in rtrs_iu_alloc if memory is limited, so we
don't need to free the same iu again.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-7-jinpu.wang@cloud.ionos.com
Signed-off-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
Reviewed-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Currently rtrs when create_qp use a coarse numbers (bigger in general),
which leads to hardware create more resources which only waste memory
with no benefits.
- SERVICE con,
For max_send_wr/max_recv_wr, it's 2 times SERVICE_CON_QUEUE_DEPTH + 2
- IO con
For max_send_wr/max_recv_wr, it's sess->queue_depth * 3 + 1
Fixes: 6a98d71daea1 ("RDMA/rtrs: client: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-6-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Remove self first to avoid deadlock, we don't want to
use close_work to remove sess sysfs.
Fixes: 91b11610af8d ("RDMA/rtrs: server: sysfs interface functions")
Link: https://lore.kernel.org/r/20201217141915.56989-5-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Tested-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
In this error case, we don't need hold mutex to call close_sess.
Fixes: 9cb837480424 ("RDMA/rtrs: server: main functionality")
Link: https://lore.kernel.org/r/20201217141915.56989-4-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Tested-by: Lutz Pogrell <lutz.pogrell@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
rtrs does not have same limit for both max_send_wr and max_recv_wr,
To allow client and server set different values, export in a separate
parameter for rtrs_cq_qp_create.
Also fix the type accordingly, u32 should be used instead of u16.
Fixes: c0894b3ea69d ("RDMA/rtrs: core: lib functions shared between client and server modules")
Link: https://lore.kernel.org/r/20201217141915.56989-2-jinpu.wang@cloud.ionos.com
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Reviewed-by: Md Haris Iqbal <haris.iqbal@cloud.ionos.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|
|
Previously the rnbd client requested the rtrs to allocate rnbd_iu
just after the rtrs_iu. So the rnbd client passes the size of
rnbd_iu for rtrs_clt_open() and rtrs creates an array of
rnbd_iu and rtrs_iu.
For IO handling, rnbd_iu exists after the request because we pass
the size of rnbd_iu when setting the tag-set. Therefore we do not
use the rnbd_iu allocated by rtrs for IO handling.
We only use the rnbd_iu allocated by rtrs when doing session
initialization. Almost all rnbd_iu allocated by rtrs are wasted.
By this patch the rnbd client does not request rnbd_iu allocation
to rtrs but allocate it for itself when doing session initialization.
Also remove unused rtrs_permit_to_pdu from rtrs.
Signed-off-by: Gioh Kim <gi-oh.kim@cloud.ionos.com>
Signed-off-by: Jack Wang <jinpu.wang@cloud.ionos.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
From https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
The rc RDMA branch is needed due to dependencies on the next patches.
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
|