summaryrefslogtreecommitdiff
path: root/drivers/infiniband/sw/rxe/rxe_net.c
AgeCommit message (Collapse)Author
2020-10-26RDMA/rxe: Fix small problem in network_type patchBob Pearson
The patch referenced below has a typo that results in using the wrong L2 header size for outbound traffic. (V4 <-> V6). It also breaks kernel-side RC traffic because they use AVs that use RDMA_NETWORK_XXX enums instead of RXE_NETWORK_TYPE_XXX enums. Fix this by transcoding between these enum types. Fixes: e0d696d201dd ("RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI") Link: https://lore.kernel.org/r/20201016211343.22906-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-16RDMA/rxe: Move the definitions for rxe_av.network_type to uAPIJason Gunthorpe
RXE was wrongly using an internal kernel enum as part of its uAPI, split this out into a dedicated uAPI enum just for RXE. It only uses the IPv4 and IPv6 values. This was exposed by changing the internal kernel enum definition which broke RXE. Fixes: 1c15b4f2a42f ("RDMA/core: Modify enum ib_gid_type and enum rdma_network_type") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-31RDMA/rxe: Add SPDX hdrs to rxe source filesBob Pearson
Add SPDX headers to all rxe .c and .h files. Link: https://lore.kernel.org/r/20200827145439.2273-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27RDMA/rxe: Fix style warningsBob Pearson
Fixed several minor checkpatch warnings in existing rxe source. Link: https://lore.kernel.org/r/20200820224638.3212-3-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-16RDMA/rxe: Remove rxe_link_layer()Kamal Heib
Instead of returning IB_LINK_LAYER_ETHERNET from rxe_link_layer, return it directly from get_link_layer callback and remove rxe_link_layer(). Fixes: 8700e3e7c485 ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200705104313.283034-5-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2019-12-04net: ipv6_stub: use ip6_dst_lookup_flow instead of ip6_dst_lookupSabrina Dubroca
ipv6_stub uses the ip6_dst_lookup function to allow other modules to perform IPv6 lookups. However, this function skips the XFRM layer entirely. All users of ipv6_stub->ip6_dst_lookup use ip_route_output_flow (via the ip_route_output_key and ip_route_output helpers) for their IPv4 lookups, which calls xfrm_lookup_route(). This patch fixes this inconsistent behavior by switching the stub to ip6_dst_lookup_flow, which also calls xfrm_lookup_route(). This requires some changes in all the callers, as these two functions take different arguments and have different return types. Fixes: 5f81bd2e5d80 ("ipv6: export a stub for IPv6 symbols used by vxlan") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-05-03RDMA/rxe: Use rdma_read_gid_attr_ndev_rcu to access netdevParav Pandit
Use rdma_read_gid_attr_ndev_rcu() to access netdevice attached to GID entry under rcu lock. This ensures that while working on the netdevice of the GID, it doesn't get freed. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-03RDMA/rxe: Consider skb reserve space based on netdev of GIDParav Pandit
Always consider the skb reserve space based on netdevice of the GID attribute, regardless of vlan or non vlan netdevice. Fixes: 43c9fc509fa5 ("rdma_rxe: make rxe work over 802.1q VLAN devices") Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-27IB/rxe: Replace av->network_type with skb->protocolZhu Yanjun
In the function rxe_init_packet, based on av->network_type, skb->protocol is set to ipv4 or ipv6. The functions rxe_prepare and rxe_send are called after the functin rxe_init_packet. So in these functions, av->network_type can be replaced with skb->protocol. The functions are in the xmit fast path. So with skb->protocol, the performance will be better. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19rdma_rxe: Use netlink messages to add/delete linksSteve Wise
Add support for the RDMA_NLDEV_CMD_NEWLINK/DELLINK messages which allow dynamically adding new RXE links. Deprecate the old module options for now. Cc: Moni Shoua <monis@mellanox.com> Reviewed-by: Yanjun Zhu <yanjun.zhu@oracle.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19RDMA/rxe: Close a race after ib_register_deviceJason Gunthorpe
Since rxe allows unregistration from other threads the rxe pointer can become invalid any moment after ib_register_driver returns. This could cause a user triggered use after free. Add another driver callback to be called right after the device becomes registered to complete any device setup required post-registration. This callback has enough core locking to prevent the device from becoming unregistered. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19RDMA/rxe: Add ib_device_get_by_name() and use it in rxeJason Gunthorpe
rxe has an open coded version of this that is not as safe as the core version. This lets us eliminate the internal device list entirely from rxe. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19RDMA/rxe: Use driver_unregister and new unregistration APIJason Gunthorpe
rxe does not have correct locking for its registration/unregistration paths, use the core code to handle it instead. In this mode ib_unregister_device will also do the dealloc, so rxe is required to do clean up from a callback. The core code ensures that unregistration is done only once, and generally takes care of locking and concurrency problems for rxe. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-19RDMA/rxe: Use ib_device_get_by_netdev() instead of open codingJason Gunthorpe
The core API handles the locking correctly and is faster if there are multiple devices. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-04RDMA/rxe: Improve loopback markingKamal Heib
Currently a packet is marked for loopback only if the source and destination addresses equals. This is not enough when multiple gids are present in rxe device's gid table and the traffic is from one gid to another. Fix it by marking the packet for loopback if the destination MAC address is equal to the source MAC address. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Tested-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-30RDMA: Provide safe ib_alloc_device() functionLeon Romanovsky
All callers to ib_alloc_device() provide a larger size than struct ib_device and rely on the fact that struct ib_device is embedded in their driver specific structure as the first member. Provide a safer variant of ib_alloc_device() that checks and enforces this approach to make sure the drivers are using it right. Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-20IB/rxe: Reuse code which sets port stateYuval Shaia
Same code is executed in both rxe_param_set_add and rxe_notify functions. Make one function and call it from both places. Since both callers already have a rxe object use it directly instead of deriving it from the net device. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com>  Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-08RDMA/rxe: Add link_down, rdma_sends, rdma_recvs stats countersAndrew Boyer
link_down is self-explanatory. rdma_sends and rdma_recvs count the number of RDMA Send and RDMA Receive operations completed successfully. This is different from the existing sent_pkts and rcvd_pkts counters because the existing counters measure packets, not RDMA operations. ack_deffered is renamed to ack_deferred to fix the spelling. out_of_sequence is renamed to out_of_seq_request to make clear that it is counting only requests and not other packets which can be out of sequence. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-08RDMA/rxe: Distinguish between down links and disabled linksAndrew Boyer
In ib_query_port(), use the netdev's IFF_UP flag to determine phys_state (flag set = down = POLLING, flag clear = disabled = DISABLED). Callers can then use the phys_state field to distinguish between links which have a dead partner, cable missing, etc., from links which are turned off on the local node. This is useful for HA and supportability. Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-10-03RDMA/rxe: Remove unused addr_same()Kamal Heib
This function is not in use - delete it. Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-09-26RDMA/drivers: Use dev_name instead of ibdev->nameJason Gunthorpe
These return the same thing but dev_name is a more conventional use of the kernel API. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2018-09-26RDMA/drivers: Use dev_err/dbg/etc instead of pr_* + ibdev->nameJason Gunthorpe
Kernel convention is that a driver for a subsystem will print using dev_* on the subsystem's struct device, or with dev_* on the physical device. Drivers should rarely use a pr_* function. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-30IB/rxe: Simplify rxe_find_route() to avoid GID query for netdevParav Pandit
rxe_prepare() is called on an skb which has ndev already initialized by rxe_init_packet(). Therefore avoid querying the GID attribute again and use the available netdevice from the skb->dev. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Tested-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-08-30IB/rxe: vary the source udp port for receive scalingVijay Immanuel
Select the source udp port number for a QP based on the source QPN. This provides a better spread of traffic across NIC RX queues for RC/UC QPs. Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-06-29IB/rxe: don't clear the tx queue on every transferVijay Immanuel
Do not call sk_dst_set() on every packet transfer because that calls sk_tx_queue_clear(), which clears the tx queue. A QP must stay on the same tx queue to maintain packet order. Signed-off-by: Vijay Immanuel <vijayi@attalasystems.com> Acked-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-06-18IB/rxe: Use rdma GID APIParav Pandit
rxe_netdev_from_av can now be done by the core code directly from the gid_attrs, no need for a helper in the driver. ib_find_cached_gid_by_port can be switched to use the rdma version here as well. Signed-off-by: Parav Pandit <parav@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-28IB/rxe: avoid unnecessary exportZhu Yanjun
The function rxe_remove_all is only used in this modules. There is no other modules that call this function. So it is not necessary to export it. Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-05-01IB/cxgb4: use skb_put_zero()/__skb_put_zeroYueHaibing
Use the recently introduced helper to replace the pattern of skb_put_zero/__skb_put() && memset(). Signed-off-by: YueHaibing <yuehaibing@huawei.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-27IB/rxe: Change rxe_rcv to return voidYuval Shaia
It always returns 0. Change return type to void. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-19IB/rxe: make rxe_release_udp_tunnel staticZhu Yanjun
The function rxe_release_udp_tunnel is only used in rxe_net.c. So it is necessary to make this function as static. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-04-17IB/rxe: make the variable staticZhu Yanjun
The variable rxe_net_notifier is only used in the file rxe_net.c. So remove it from rxe_net.h file and make it static in the file rxe_net.c. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Bart Van Assche <bart.vanassche@wdc.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-03-14rdma_rxe: make rxe work over 802.1q VLAN devicesMartin Wilck
This patch fixes RDMA/rxe over 802.1q VLAN devices. Without it, I observed the following behavior: a) adding a VLAN device to RXE via rxe_net_add() creates a non-functional RDMA device. This is caused by the logic in enum_all_gids_of_dev_cb() / is_eth_port_of_netdev(), which only considers networks connected to "upper devices" of the configured network device, resulting in an empty set of gids for a VLAN interface that is an "upper device" itself. Later attempts to connect via this rdma device fail in cma_acuire_dev() because no gids can be resolved. b) adding the master device of the VLAN device instead seems to work initially, target addresses via VLAN devices are resolved successfully. But the connection times out because no 802.1q VLAN headers are inserted in the ethernet packets, which are therefore never received. This happens because the RXE layer sends the packets via the master device rather than the VLAN device. The problem could be solved by changing either a) or b). My thinking was that the logic in a) was created deliberately, thus I decided to work on b). It turns out that the information about the VLAN interface for the gid at hand is available in the AV information. My patch converts the RXE code to use this netdev instead of rxe->ndev. With this change, RXE over vlan works on my test system. Signed-off-by: Martin Wilck <mwilck@suse.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-03-07IB/rxe: remove unnecessary rxe in rxe_sendZhu Yanjun
In the function rxe_send, the variable rxe is not used in it. So it should be removed. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-08IB/rxe: remove unnecessary skb_clone in xmitZhu Yanjun
In xmit, there is a skb_clone. This function copies the struct sk_buff. And some parameters are changed to the new skb. Then the new skb is sent while the old skb is freed. While the function skb_clone is removed, the parameter changes are made on the old skb, then the old skb is sent. It can also work well. The following tests are made. server client --------- --------- |1.1.1.1|<----rxe-channel--->|1.1.1.2| --------- --------- On server: rping -s -a 1.1.1.1 -v -C 1000 -S 512 On client: rping -c -a 1.1.1.1 -v -C 1000 -S 512 The kernel config CONFIG_DEBUG_KMEMLEAK is enabled on both server and client. This test runs for several hours. There is no memory leak and the whole system can work well. As the above network, the following tests are made. Server: ibv_rc_pingpong -d rxe0 -g 1 Client: ibv_rc_pingpong -d rxe0 -g 1 1.1.1.1 The result on Server. Before: 8192000 bytes in 0.88 seconds = 74.36 Mbit/sec 1000 iters in 0.88 seconds = 881.30 usec/iter After: 8192000 bytes in 0.81 seconds = 81.15 Mbit/sec 1000 iters in 0.81 seconds = 807.62 usec/iter The throughput is enhanced and the latency is reduced. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2018-01-08IB/rxe: add the static type to the variableZhu Yanjun
The variable recv_sockets is only used in the file rxe_net.c. So it is better to add static type to it. CC: Srinivas Eeda <srinivas.eeda@oracle.com> CC: Joe Jin <joe.jin@oracle.com> CC: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Handle NETDEV_CHANGE eventsAndrew Boyer
Without this fix, ports configured on top of ixgbe miss link up notifications. ibv_query_port() will continue to return IBV_PORT_DOWN even though the port is up and working. Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Remove unneeded initialization in prepare6()Andrew Boyer
Fixes: 4ed6ad1eb30e ("IB/rxe: Cache dst in QP instead of getting it...") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Add dst_clone() in prepare_ipv6_hdr()Andrew Boyer
Otherwise the reference count goes negative as IPv6 packets complete. Fixes: 4ed6ad1eb30e ("IB/rxe: Cache dst in QP instead of getting it...") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Fix destination cache for IPv6Andrew Boyer
To successfully match an IPv6 path, the path cookie must match. Store it in the QP so that the IPv6 path can be reused. Replace open-coded version of dst_check() with the actual call, fixing the logic. The open-coded version skips the check call if dst->obsolete is 0 (DST_OBSOLETE_NONE), proceeding to replace the route. DST_OBSOLETE_NONE means that the route may continue to be used, though. Fixes: 4ed6ad1eb30e ("IB/rxe: Cache dst in QP instead of getting it...") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-08-28IB/rxe: Move refcounting earlier in rxe_send()Andrew Boyer
The network stack will call nskb's destructor, rxe_skb_tx_dtor(), if the packet gets dropped by ip_local_out()/ip6_local_out(). Thus we need to add the QP ref before output to avoid extra dereferences during network congestion. This could lead to unwanted destruction of the QP. Fix up the skb_out accounting, too. Fixes: fda85ce91240 ("IB/rxe: Fix kernel panic from skb destructor") Signed-off-by: Andrew Boyer <andrew.boyer@dell.com> Acked-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-07-17IB/rxe: Fix kernel panic from skb destructorYonatan Cohen
In the time between rxe_send has finished and skb destructor called, the QP's ref count might be 0, leading to a possible QP destruction. This will lead to a kernel panic when the destructor dereferences the QP. The operation of incrementing QP ref count at rxe_send and decrementing from skb destructor will prevent this crash. BUG: unable to handle kernel NULL pointer dereference at 000000000000072c IP: [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe] PGD 0 [16240.211178] Oops: 0002 [#1] SMP CPU: 3 PID: 0 Comm: swapper/3 Tainted: G OE 4.9.0-mlnx #1 Hardware name: Red Hat KVM, BIOS Bochs 01/01/2011 task: ffff88042d6b1480 task.stack: ffffc90001904000 RIP: 0010:[<ffffffffa05df765>] [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe] RSP: 0018:ffff88043fcc3df0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff880429684700 RCX: ffff88042d248200 RDX: 00000000ffffffff RSI: 00000000fffffe01 RDI: ffff880429684700 RBP: ffff88043fcc3e00 R08: ffff88043fcda240 R09: 00000000ff2d1de6 R10: 0000000000000000 R11: 00000000f49cf6fe R12: ffff880429684700 R13: ffffffff81893f96 R14: ffffffff817d66f0 R15: ffff880427f74200 FS: 0000000000000000(0000) GS:ffff88043fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000072c CR3: 000000041d3df000 CR4: 00000000000006e0 Stack: ffffffff817b29cf ffff880429684700 ffff88043fcc3e18 ffffffff817b42c2 ffff880429684700 ffff88043fcc3e40 ffffffff817b4332 ffff880429684700 ffff880427f74238 ffff880427f74228 ffff88043fcc3e58 ffffffff81893f96 Call Trace: <IRQ> [16240.336345] [<ffffffff817b29cf>] ? skb_release_head_state+0x4f/0xb0 [<ffffffff817b42c2>] skb_release_all+0x12/0x30 [<ffffffff817b4332>] kfree_skb+0x32/0x90 [<ffffffff81893f96>] ndisc_error_report+0x36/0x40 [<ffffffff817d4de1>] neigh_invalidate+0x81/0xf0 [<ffffffff817d68f7>] neigh_timer_handler+0x207/0x2b0 [<ffffffff81109295>] call_timer_fn+0x35/0x120 [<ffffffff81109db7>] run_timer_softirq+0x1d7/0x460 [<ffffffff8106155e>] ? kvm_sched_clock_read+0x1e/0x30 [<ffffffff810366b9>] ? sched_clock+0x9/0x10 [<ffffffff810cfed2>] ? sched_clock_cpu+0x72/0xa0 [<ffffffff818dd537>] __do_softirq+0xd7/0x289 [<ffffffff810a6c95>] irq_exit+0xb5/0xc0 [<ffffffff818dd372>] smp_apic_timer_interrupt+0x42/0x50 [<ffffffff818dc682>] apic_timer_interrupt+0x82/0x90 <EOI> [16240.395776] [<ffffffff818da156>] ? native_safe_halt+0x6/0x10 [<ffffffff818d9e6e>] default_idle+0x1e/0xd0 [<ffffffff8103797f>] arch_cpu_idle+0xf/0x20 [<ffffffff818da2c5>] default_idle_call+0x35/0x40 [<ffffffff810e3eb5>] cpu_startup_entry+0x185/0x210 [<ffffffff81050433>] start_secondary+0x103/0x130 RIP [<ffffffffa05df765>] rxe_skb_tx_dtor+0x15/0x50 [rdma_rxe] Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-25{net,IB}/{rxe,usnic}: Utilize generic mac to eui32 functionYuval Shaia
This logic seems to be duplicated in (at least) three separate files. Move it to one place so code can be re-use. Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
2017-04-21IB/rxe: Cache dst in QP instead of getting it for each sendyonatanc
In RC QP there is no need to resolve the outgoing interface for each packet, as this does not change during QP life cycle. Instead cache the interface on the socket and use that one. This improves performance by 12% by sparing redundant calls to rxe_find_route. ib_send_bw -d rxe0 -x 1 -n 9000 -e -s $((1024 * 1024 )) -l 100 ---------------------------------------------------------------------------------------- | | bytes | iterations | BW peak[MB/sec] | BW average[MB/sec] | MsgRate[Mpps] | ---------------------------------------------------------------------------------------- | before | 1048576 | 9000 | inf | 551.21 | 0.000551 | | after | 1048576 | 9000 | inf | 615.54 | 0.000616 | ---------------------------------------------------------------------------------------- Fixes: 8700e3e7c485 ("Soft RoCE driver") Signed-off-by: Yonatan Cohen <yonatanc@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-23Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "First set of updates for 4.11 kernel merge window - Add new Broadcom bnxt_re RoCE driver - rxe driver updates - ioctl cleanups - ETH_P_IBOE declaration cleanup - IPoIB changes - Add port state cache - Allow srpt driver to accept guids as port names in config - Update to hfi1 driver - Update to srp driver - Lots of misc minor changes all over" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (114 commits) RDMA/bnxt_re: fix for "bnxt_en: Update to firmware interface spec 1.7.0." rdma_cm: fail iwarp accepts w/o connection params IB/srp: Drain the send queue before destroying a QP IB/core: Add support for draining IB_POLL_DIRECT completion queues IB/srp: Improve an error path IB/srp: Make a diagnostic message more informative IB/srp: Document locking conventions IB/srp: Fix race conditions related to task management IB/srp: Avoid that duplicate responses trigger a kernel bug IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS RDMA/qedr: Fix some error handling RDMA/bnxt_re: add DCB dependency IB/hns: include linux/module.h IB/vmw_pvrdma: Expose vendor error to ULPs vmw_pvrdma: switch to pci_alloc_irq_vectors IB/hfi1: use size_t for passing array length IB/ipoib: Remove redudant label IB/ipoib: remove the unnecessary memory free IB/mthca: switch to pci_alloc_irq_vectors IB/hfi1: Code reuse with memdup_copy ...
2017-02-19Merge branch 'k.o/for-4.10-rc' into HEADDoug Ledford
2017-02-06net-next: treewide use is_vlan_dev() helper function.Parav Pandit
This patch makes use of is_vlan_dev() function instead of flag comparison which is exactly done by is_vlan_dev() helper function. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jon Maxwell <jmaxwell37@gmail.com> Acked-by: Johannes Thumshirn <jth@kernel.org> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-24IB/rxe: Fix rxe dev insertion to rxe_dev_listMaor Gottlieb
The first argument of list_add_tail is the new item and the second is the head of the list. Fix the code to pass arguments in the right order, otherwise not all the rxe devices will be removed during teardown. Fixes: 8700e3e7c4857 ('Soft RoCE driver') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-10IB/rxe: Remove a pointless indirection layerBart Van Assche
Neither rxe->ifc_ops nor any of the function pointers in struct struct rxe_ifc_ops ever change. Hence remove the rxe->ifc_ops indirection mechanism. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-10IB/rxe: Suppress sparse warningsBart Van Assche
Avoid that sparse complains about using 0 as a pointer, about missing function declarations and also avoid that sparse complains about endianness. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Reviewed-by: Andrew Boyer <andrew.boyer@dell.com> Cc: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-23Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma fixes from Doug Ledford: "First round of -rc fixes for 4.10 kernel: - a series of qedr fixes - a series of rxe fixes - one i40iw fix - one cma fix - one cxgb4 fix" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: IB/rxe: Don't check for null ptr in send() IB/rxe: Drop future atomic/read packets rather than retrying IB/rxe: Use BTH_PSN_MASK when ACKing duplicate sends qedr: Always notify the verb consumer of flushed CQEs qedr: clear the vendor error field in the work completion qedr: post_send/recv according to QP state qedr: ignore inline flag in read verbs qedr: modify QP state to error when destroying it qedr: return correct value on modify qp qedr: return error if destroy CQ failed qedr: configure the number of CQEs on CQ creation i40iw: Set 128B as the only supported RQ WQE size IB/cma: Fix a race condition in iboe_addr_get_sgid() IB/rxe: Fix a memory leak in rxe_qp_cleanup() iw_cxgb4: set correct FetchBurstMax for QPs