summaryrefslogtreecommitdiff
path: root/drivers/infiniband/core/cma.c
AgeCommit message (Collapse)Author
2017-02-23Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "First set of updates for 4.11 kernel merge window - Add new Broadcom bnxt_re RoCE driver - rxe driver updates - ioctl cleanups - ETH_P_IBOE declaration cleanup - IPoIB changes - Add port state cache - Allow srpt driver to accept guids as port names in config - Update to hfi1 driver - Update to srp driver - Lots of misc minor changes all over" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (114 commits) RDMA/bnxt_re: fix for "bnxt_en: Update to firmware interface spec 1.7.0." rdma_cm: fail iwarp accepts w/o connection params IB/srp: Drain the send queue before destroying a QP IB/core: Add support for draining IB_POLL_DIRECT completion queues IB/srp: Improve an error path IB/srp: Make a diagnostic message more informative IB/srp: Document locking conventions IB/srp: Fix race conditions related to task management IB/srp: Avoid that duplicate responses trigger a kernel bug IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS RDMA/qedr: Fix some error handling RDMA/bnxt_re: add DCB dependency IB/hns: include linux/module.h IB/vmw_pvrdma: Expose vendor error to ULPs vmw_pvrdma: switch to pci_alloc_irq_vectors IB/hfi1: use size_t for passing array length IB/ipoib: Remove redudant label IB/ipoib: remove the unnecessary memory free IB/mthca: switch to pci_alloc_irq_vectors IB/hfi1: Code reuse with memdup_copy ...
2017-02-22rdma_cm: fail iwarp accepts w/o connection paramsSteve Wise
cma_accept_iw() needs to return an error if conn_params is NULL. Since this is coming from user space, we can crash. Reported-by: Shaobo He <shaobo@cs.utah.edu> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Cc: stable@vger.kernel.org Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-19Merge branch 'k.o/for-4.10-rc' into HEADDoug Ledford
2017-02-15IB/cma: Destination and source addr families must matchMoni Shoua
The destination address in a listening rdma_id does not have an address family. Since address family in both sides of a connection must be the same in rdma_bind_addr() we set the address family of the destination to the address family of the source. This patch serves the logic in cma_port_is_unique() which requires to know if destination address that is associated with a rdma_id is any address (cma_zero_addr() and cma_loopback_addr()). This can happen when port reuse is checked for a port number that is being listened to. Fixes: 19b752a19dce ("IB/cma: Allow port reuse for rdma_id") Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-15IB/cma: Add default RoCE TOS to CMA configfsMajd Dibbiny
Add new entry to the RDMA-CM configfs that allows users to select default TOS for RDMA-CM QPs. This is useful for users that want to control the TOS for legacy applications without changing their code. Application that sets the TOS explicitly using the rdma_set_option API will continue to work as expected, meaning overriding the configfs value. CC: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Majd Dibbiny <majd@mellanox.com> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-02-06net-next: treewide use is_vlan_dev() helper function.Parav Pandit
This patch makes use of is_vlan_dev() function instead of flag comparison which is exactly done by is_vlan_dev() helper function. Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Daniel Jurgens <danielj@mellanox.com> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Jon Maxwell <jmaxwell37@gmail.com> Acked-by: Johannes Thumshirn <jth@kernel.org> Acked-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-01-27IB/core: Add inline function to validate portYuval Shaia
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-27IB/cma: Fix reversed testChristophe Jaillet
This test looks reverted. We should log an error message only if 'ib_attach_mcast()' fails. Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-27RDMA/cma: Fix unknown symbol when CONFIG_IPV6 is not enabledJack Morgenstein
If IPV6 has not been enabled in the underlying kernel, we must avoid calling IPV6 procedures in rdma_cm.ko. This requires using "IS_ENABLED(CONFIG_IPV6)" in "if" statements surrounding any code which calls external IPV6 procedures. In the instance fixed here, procedure cma_bind_addr() called ipv6_addr_type() -- which resulted in calling external procedure __ipv6_addr_type(). Fixes: 6c26a77124ff ("RDMA/cma: fix IPv6 address resolution") Cc: <stable@vger.kernel.org> # v4.2+ Cc: Spencer Baugh <sbaugh@catern.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Reviewed-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24IB/cma: Allow port reuse for rdma_idMoni Shoua
When allocating a port number for binding to a rdma_id, assuming the allocation is not for a specific port, the rule is to allow only ports that were not in use before by any other rdma_id. This condition is too strong to achieve the goal of a unique 5 tuple rdma_id. Instead, we can compare current rdma_id with other rdma_id for difference in one of destination port, source address and destination address to allow port reuse. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-24IB/cma: Add debug messages to error flowsMoni Shoua
Print debug messages to the kernel log to add more information about RDMA_CM events that indicate an error. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-12RDMA/cma: use cached port state when bind loopbackJack Wang
Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Michael Wang <yun.wang@profitbricks.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-01-12RDMA/cma: resolve to first active ib portJack Wang
When we try to resolve a dest addr, if we don't give src addr, cma core will try to resolve to our source ib device automatically. The current logic only checks if a given port has the same subnet_prefix as our dest, which is not enough if we use default well known subnet_prefix on our active port, as it will be the same as the subnet_prefix on inactive ports and we might match against an inactive port by accident. To resolve this, we should also check if port is active before we resolve it as a suitable src address for a given dest. Signed-off-by: Jack Wang <jinpu.wang@profitbricks.com> Reviewed-by: Michael Wang <yun.wang@profitbricks.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-15Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "This is the complete update for the rdma stack for this release cycle. Most of it is typical driver and core updates, but there is the entirely new VMWare pvrdma driver. You may have noticed that there were changes in DaveM's pull request to the bnxt Ethernet driver to support a RoCE RDMA driver. The bnxt_re driver was tentatively set to be pulled in this release cycle, but it simply wasn't ready in time and was dropped (a few review comments still to address, and some multi-arch build issues like prefetch() not working across all arches). Summary: - shared mlx5 updates with net stack (will drop out on merge if Dave's tree has already been merged) - driver updates: cxgb4, hfi1, hns-roce, i40iw, mlx4, mlx5, qedr, rxe - debug cleanups - new connection rejection helpers - SRP updates - various misc fixes - new paravirt driver from vmware" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (210 commits) IB: Add vmw_pvrdma driver IB/mlx4: fix improper return value IB/ocrdma: fix bad initialization infiniband: nes: return value of skb_linearize should be handled MAINTAINERS: Update Intel RDMA RNIC driver maintainers MAINTAINERS: Remove Mitesh Ahuja from emulex maintainers IB/core: fix unmap_sg argument qede: fix general protection fault may occur on probe IB/mthca: Replace pci_pool_alloc by pci_pool_zalloc mlx5, calc_sq_size(): Make a debug message more informative mlx5: Remove a set-but-not-used variable mlx5: Use { } instead of { 0 } to init struct IB/srp: Make writing the add_target sysfs attr interruptible IB/srp: Make mapping failures easier to debug IB/srp: Make login failures easier to debug IB/srp: Introduce a local variable in srp_add_one() IB/srp: Fix CONFIG_DYNAMIC_DEBUG=n build IB/multicast: Check ib_find_pkey() return value IPoIB: Avoid reading an uninitialized member variable IB/mad: Fix an array index check ...
2016-12-14rdma_cm: add rdma_consumer_reject_data helper functionSteve Wise
rdma_consumer_reject_data() will return the private data pointer and length if any is available. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14rdma_cm: add rdma_is_consumer_reject() helper functionSteve Wise
Return true if the peer consumer application rejected the connection attempt. Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-12-14rdma_cm: add rdma_reject_msg() helper functionSteve Wise
rdma_reject_msg() returns a pointer to a string message associated with the transport reject reason codes. Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
All conflicts were simple overlapping changes except perhaps for the Thunder driver. That driver has a change_mtu method explicitly for sending a message to the hardware. If that fails it returns an error. Normally a driver doesn't need an ndo_change_mtu method becuase those are usually just range changes, which are now handled generically. But since this extra operation is needed in the Thunder driver, it has to stay. However, if the message send fails we have to restore the original MTU before the change because the entire call chain expects that if an error is thrown by ndo_change_mtu then the MTU did not change. Therefore code is added to nicvf_change_mtu to remember the original MTU, and to restore it upon nicvf_update_hw_max_frs() failue. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-18netns: make struct pernet_operations::id unsigned intAlexey Dobriyan
Make struct pernet_operations::id unsigned. There are 2 reasons to do so: 1) This field is really an index into an zero based array and thus is unsigned entity. Using negative value is out-of-bound access by definition. 2) On x86_64 unsigned 32-bit data which are mixed with pointers via array indexing or offsets added or subtracted to pointers are preffered to signed 32-bit data. "int" being used as an array index needs to be sign-extended to 64-bit before being used. void f(long *p, int i) { g(p[i]); } roughly translates to movsx rsi, esi mov rdi, [rsi+...] call g MOVSX is 3 byte instruction which isn't necessary if the variable is unsigned because x86_64 is zero extending by default. Now, there is net_generic() function which, you guessed it right, uses "int" as an array index: static inline void *net_generic(const struct net *net, int id) { ... ptr = ng->ptr[id - 1]; ... } And this function is used a lot, so those sign extensions add up. Patch snipes ~1730 bytes on allyesconfig kernel (without all junk messing with code generation): add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) Unfortunately some functions actually grow bigger. This is a semmingly random artefact of code generation with register allocator being used differently. gcc decides that some variable needs to live in new r8+ registers and every access now requires REX prefix. Or it is shifted into r12, so [r12+0] addressing mode has to be used which is longer than [r8] However, overall balance is in negative direction: add/remove: 0/0 grow/shrink: 70/598 up/down: 396/-2126 (-1730) function old new delta nfsd4_lock 3886 3959 +73 tipc_link_build_proto_msg 1096 1140 +44 mac80211_hwsim_new_radio 2776 2808 +32 tipc_mon_rcv 1032 1058 +26 svcauth_gss_legacy_init 1413 1429 +16 tipc_bcbase_select_primary 379 392 +13 nfsd4_exchange_id 1247 1260 +13 nfsd4_setclientid_confirm 782 793 +11 ... put_client_renew_locked 494 480 -14 ip_set_sockfn_get 730 716 -14 geneve_sock_add 829 813 -16 nfsd4_sequence_done 721 703 -18 nlmclnt_lookup_host 708 686 -22 nfsd4_lockt 1085 1063 -22 nfs_get_client 1077 1050 -27 tcf_bpf_init 1106 1076 -30 nfsd4_encode_fattr 5997 5930 -67 Total: Before=154856051, After=154854321, chg -0.00% Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-11-17Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rmda fixes from Doug Ledford. "First round of -rc fixes. Due to various issues, I've been away and couldn't send a pull request for about three weeks. There were a number of -rc patches that built up in the meantime (some where there already from the early -rc stages). Obviously, there were way too many to send now, so I tried to pare the list down to the more important patches for the -rc cycle. Most of the code has had plenty of soak time at the various vendor's testing setups, so I doubt there will be another -rc pull request this cycle. I also tried to limit the patches to those with smaller footprints, so even though a shortlog is longer than I would like, the actual diffstat is mostly very small with the exception of just three files that had more changes, and a couple files with pure removals. Summary: - Misc Intel hfi1 fixes - Misc Mellanox mlx4, mlx5, and rxe fixes - A couple cxgb4 fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (34 commits) iw_cxgb4: invalidate the mr when posting a read_w_inv wr iw_cxgb4: set *bad_wr for post_send/post_recv errors IB/rxe: Update qp state for user query IB/rxe: Clear queue buffer when modifying QP to reset IB/rxe: Fix handling of erroneous WR IB/rxe: Fix kernel panic in UDP tunnel with GRO and RX checksum IB/mlx4: Fix create CQ error flow IB/mlx4: Check gid_index return value IB/mlx5: Fix NULL pointer dereference on debug print IB/mlx5: Fix fatal error dispatching IB/mlx5: Resolve soft lock on massive reg MRs IB/mlx5: Use cache line size to select CQE stride IB/mlx5: Validate requested RQT size IB/mlx5: Fix memory leak in query device IB/core: Avoid unsigned int overflow in sg_alloc_table IB/core: Add missing check for addr_resolve callback return value IB/core: Set routable RoCE gid type for ipv4/ipv6 networks IB/cm: Mark stale CM id's whenever the mad agent was unregistered IB/uverbs: Fix leak of XRC target QPs IB/hfi1: Remove incorrect IS_ERR check ...
2016-11-16IB/core: Set routable RoCE gid type for ipv4/ipv6 networksLeon Romanovsky
On Thu, Oct 27, 2016 at 04:36:28PM +0300, Leon Romanovsky wrote: > From: Mark Bloch <markb@mellanox.com> > > If the underlying netowrk type is ipv4 or ipv6 and the device supports > routable RoCE, prefer it so the traffic could cross subnets. > > Signed-off-by: Mark Bloch <markb@mellanox.com> > Signed-off-by: Maor Gottlieb <maorg@mellanox.com> > Signed-off-by: Leon Romanovsky <leon@kernel.org> > --- Hi Doug, Please take the following v1 of this patch where I fixed spelling error from "netowrk" to be "network". Thanks. >From 09f96ba3e9b4442cfb44dca04c6726e55525c9c3 Mon Sep 17 00:00:00 2001 From: Mark Bloch <markb@mellanox.com> Date: Sun, 11 Sep 2016 06:25:10 +0000 Subject: [PATCH rdma-rc v1 3/6] IB/core: Set routable RoCE gid type for ipv4/ipv6 networks If the underlying network type is ipv4 or ipv6 and the device supports routable RoCE, prefer it so the traffic could cross subnets. Signed-off-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-11-11infiniband: shut up a maybe-uninitialized warningArnd Bergmann
Some configurations produce this harmless warning when built with gcc -Wmaybe-uninitialized: infiniband/core/cma.c: In function 'cma_get_net_dev': infiniband/core/cma.c:1242:12: warning: 'src_addr_storage.sin_addr.s_addr' may be used uninitialized in this function [-Wmaybe-uninitialized] I previously reported this for the powerpc64 defconfig, but have now reproduced the same thing for x86 as well, using gcc-5 or higher. The code looks correct to me, and this change just rearranges it by making sure we alway initialize the entire address structure to make the warning disappear. My first approach added an initialization at the time of the declaration, which Doug commented may be too costly, so I hope this version doesn't add overhead. Link: http://arm-soc.lixom.net/buildlogs/mainline/v4.7-rc6/buildall.powerpc.ppc64_defconfig.log.passed Link: https://patchwork.kernel.org/patch/9212825/ Acked-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-09Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull main rdma updates from Doug Ledford: "This is the main pull request for the rdma stack this release. The code has been through 0day and I had it tagged for linux-next testing for a couple days. Summary: - updates to mlx5 - updates to mlx4 (two conflicts, both minor and easily resolved) - updates to iw_cxgb4 (one conflict, not so obvious to resolve, proper resolution is to keep the code in cxgb4_main.c as it is in Linus' tree as attach_uld was refactored and moved into cxgb4_uld.c) - improvements to uAPI (moved vendor specific API elements to uAPI area) - add hns-roce driver and hns and hns-roce ACPI reset support - conversion of all rdma code away from deprecated create_singlethread_workqueue - security improvement: remove unsafe ib_get_dma_mr (breaks lustre in staging)" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (75 commits) staging/lustre: Disable InfiniBand support iw_cxgb4: add fast-path for small REG_MR operations cxgb4: advertise support for FR_NSMR_TPTE_WR IB/core: correctly handle rdma_rw_init_mrs() failure IB/srp: Fix infinite loop when FMR sg[0].offset != 0 IB/srp: Remove an unused argument IB/core: Improve ib_map_mr_sg() documentation IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets IB/mthca: Move user vendor structures IB/nes: Move user vendor structures IB/ocrdma: Move user vendor structures IB/mlx4: Move user vendor structures IB/cxgb4: Move user vendor structures IB/cxgb3: Move user vendor structures IB/mlx5: Move and decouple user vendor structures IB/{core,hw}: Add constant for node_desc ipoib: Make ipoib_warn ratelimited IB/mlx4/alias_GUID: Remove deprecated create_singlethread_workqueue IB/ipoib_verbs: Remove deprecated create_singlethread_workqueue IB/ipoib: Remove deprecated create_singlethread_workqueue ...
2016-10-07IB/cma: Remove deprecated create_singlethread_workqueueBhaktipriya Shridhar
alloc_ordered_workqueue() with WQ_MEM_RECLAIM set, replaces deprecated create_singlethread_workqueue(). This is the identity conversion. The workqueue "cma_wq" queues work item cma_work_handler. It has been identity converted. WQ_MEM_RECLAIM has been set to ensure forward progress under memory pressure. Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-22IB/core: Fix possible memory leak in cma_resolve_iboe_route()Wei Yongjun
'work' and 'route->path_rec' are malloced in cma_resolve_iboe_route() and should be freed before leaving from the error handling cases, otherwise it will cause memory leak. Fixes: 200298326b27 ('IB/core: Validate route when we init ah') Signed-off-by: Wei Yongjun <weiyj.lk@gmail.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-08-04Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull base rdma updates from Doug Ledford: "Round one of 4.8 code: while this is mostly normal, there is a new driver in here (the driver was hosted outside the kernel for several years and is actually a fairly mature and well coded driver). It amounts to 13,000 of the 16,000 lines of added code in here. Summary: - Updates/fixes for iw_cxgb4 driver - Updates/fixes for mlx5 driver - Add flow steering and RSS API - Add hardware stats to mlx4 and mlx5 drivers - Add firmware version API for RDMA driver use - Add the rxe driver (this is a software RoCE driver that makes any Ethernet device a RoCE device) - Fixes for i40iw driver - Support for send only multicast joins in the cma layer - Other minor fixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (72 commits) Soft RoCE driver IB/core: Support for CMA multicast join flags IB/sa: Add cached attribute containing SM information to SA port IB/uverbs: Fix race between uverbs_close and remove_one IB/mthca: Clean up error unwind flow in mthca_reset() IB/mthca: NULL arg to pci_dev_put is OK IB/hfi1: NULL arg to sc_return_credits is OK IB/mlx4: Add diagnostic hardware counters net/mlx4: Query performance and diagnostics counters net/mlx4: Add diagnostic counters capability bit Use smaller 512 byte messages for portmapper messages IB/ipoib: Report SG feature regardless of HW UD CSUM capability IB/mlx4: Don't use GFP_ATOMIC for CQ resize struct IB/hfi1: Disable by default IB/rdmavt: Disable by default IB/mlx5: Fix port counter ID association to QP offset IB/mlx5: Fix iteration overrun in GSI qps i40iw: Add NULL check for puda buffer i40iw: Change dup_ack_thresh to u8 i40iw: Remove unnecessary check for moving CQ head ...
2016-08-03IB/core: Support for CMA multicast join flagsAlex Vesker
Added UCMA and CMA support for multicast join flags. Flags are passed using UCMA CM join command previously reserved fields. Currently supporting two join flags indicating two different multicast JoinStates: 1. Full Member: The initiator creates the Multicast group(MCG) if it wasn't previously created, can send Multicast messages to the group and receive messages from the MCG. 2. Send Only Full Member: The initiator creates the Multicast group(MCG) if it wasn't previously created, can send Multicast messages to the group but doesn't receive any messages from the MCG. IB: Send Only Full Member requires a query of ClassPortInfo to determine if SM/SA supports this option. If SM/SA doesn't support Send-Only there will be no join request sent and an error will be returned. ETH: When Send Only Full Member is requested no IGMP join will be sent. Signed-off-by: Alex Vesker <valex@mellanox.com> Reviewed by: Hal Rosenstock <hal@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-23Merge branches '4.7-rc-misc', 'hfi1-fixes', 'i40iw-rc-fixes' and ↵Doug Ledford
'mellanox-rc-fixes' into k.o/for-4.7-rc
2016-06-23IB/core: Fix RoCE v1 multicast join logic issueAlex Vesker
During multicast join of RoCEv1, IGMP join state and max hop limit were updated incorrectly. IGMP join should be sent and marked as joined only on RoCEv2 after a successful join. Max hops should be updated to the hop limit on RoCEv2 regardless of the join state. Fixes: bee3c3c91865 ('IB/cma: Join and leave multicast groups...') Signed-off-by: Alex Vesker <valex@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-06-17IB/cma: Make the code easier to verifyBart Van Assche
Static source code analysis tools like smatch cannot handle functions that lock or not lock a mutex depending on the value of the arguments. Hence inline the function cma_disable_callback(). Additionally, this patch realizes a small performance optimization by reducing the number of mutex_lock() and mutex_unlock() calls in the modified functions. With this patch applied smatch no longer complains about source file cma.c. Without this patch smatch reports the following for this source file: drivers/infiniband/core/cma.c:1959: cma_req_handler() warn: inconsistent returns 'mutex:&listen_id->handler_mutex'. Locked on: line 1880 line 1959 Unlocked on: line 1941 drivers/infiniband/core/cma.c:2112: iw_conn_req_handler() warn: inconsistent returns 'mutex:&listen_id->handler_mutex'. Locked on: line 2048 Unlocked on: line 2112 Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Leon Romanovsky <leonro@mellanox.com> Acked-by: Sean Hefty <sean.hefty@intel.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-13Merge branches 'cxgb4-2', 'i40iw-2', 'ipoib', 'misc-4.7' and 'mlx5-fcs' into ↵Doug Ledford
k.o/for-4.7
2016-05-13IB/core: Fix a potential array overrun in CMA and SA agentMark Bloch
Fix array overrun when going over callback table. In declaration of callback table, the max size isn't provided and in registration phase, it is provided. There is potential scenario where a new operation is added and it is not supported by current client. The acceptance of such operation by ib_netlink will cause to array overrun. Fixes: 809d5fc9bf65 ("infiniband: pass rdma_cm module to netlink_dump_start") Fixes: b493d91d333e ("iwcm: common code for port mapper") Fixes: 2ca546b92a02 ("IB/sa: Route SA pathrecord query through netlink") Signed-off-by: Mark Bloch <markb@mellanox.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-05-12IB/cma: pass the port number to ib_create_qpChristoph Hellwig
The new RW API will need this. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Tested-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-16Merge branches 'mlx4', 'mlx5' and 'ocrdma' into k.o/for-4.6Doug Ledford
2016-03-03IB/core: trivial prink cleanup.Parav Pandit
1. Replaced printk with appropriate pr_warn, pr_err, pr_info. 2. Removed unnecessary prints around memory allocation failure which are not required, as reported by the checkpatch script. Signed-off-by: Parav Pandit <pandit.parav@gmail.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-03-01IB/cma: Print warning on different inner and header P_KeysHaggai Eran
Commit 4c21b5bcef73 ("IB/cma: Add net_dev and private data checks to RDMA CM") added checks for incoming RDMA CM requests that they can be matched to a netdev based on the P_Key in the BTH of the request. This behavior was reverted in commit ab3964ad2acf ("IB/cma: Use inner P_Key to determine netdev"), since the mlx5 and ipath drivers didn't send the correct value in the BTH P_Key. Since the ipath driver was removed, and the mlx5 driver can now send GSI packets on different P_Keys, we could revert the patch to let the rdma_cm module look on the BTH P_Key when deciding to what netdev a packet belongs. However, that still breaks compatibility with the older drivers. Change the behavior to print a warning when receiving a request that has a different BTH P_Key and inner payload P_Key. In the future, after users have seen the warnings and upgraded their setups, remove the warning and block these requests. Signed-off-by: Haggai Eran <haggaie@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-23Merge tag 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma Pull rdma updates from Doug Ledford: "Initial roundup of 4.5 merge window patches - Remove usage of ib_query_device and instead store attributes in ib_device struct - Move iopoll out of block and into lib, rename to irqpoll, and use in several places in the rdma stack as our new completion queue polling library mechanism. Update the other block drivers that already used iopoll to use the new mechanism too. - Replace the per-entry GID table locks with a single GID table lock - IPoIB multicast cleanup - Cleanups to the IB MR facility - Add support for 64bit extended IB counters - Fix for netlink oops while parsing RDMA nl messages - RoCEv2 support for the core IB code - mlx4 RoCEv2 support - mlx5 RoCEv2 support - Cross Channel support for mlx5 - Timestamp support for mlx5 - Atomic support for mlx5 - Raw QP support for mlx5 - MAINTAINERS update for mlx4/mlx5 - Misc ocrdma, qib, nes, usNIC, cxgb3, cxgb4, mlx4, mlx5 updates - Add support for remote invalidate to the iSER driver (pushed through the RDMA tree due to dependencies, acknowledged by nab) - Update to NFSoRDMA (pushed through the RDMA tree due to dependencies, acknowledged by Bruce)" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma: (169 commits) IB/mlx5: Unify CQ create flags check IB/mlx5: Expose Raw Packet QP to user space consumers {IB, net}/mlx5: Move the modify QP operation table to mlx5_ib IB/mlx5: Support setting Ethernet priority for Raw Packet QPs IB/mlx5: Add Raw Packet QP query functionality IB/mlx5: Add create and destroy functionality for Raw Packet QP IB/mlx5: Refactor mlx5_ib_qp to accommodate other QP types IB/mlx5: Allocate a Transport Domain for each ucontext net/mlx5_core: Warn on unsupported events of QP/RQ/SQ net/mlx5_core: Add RQ and SQ event handling net/mlx5_core: Export transport objects IB/mlx5: Expose CQE version to user-space IB/mlx5: Add CQE version 1 support to user QPs and SRQs IB/mlx5: Fix data validation in mlx5_ib_alloc_ucontext IB/sa: Fix netlink local service GFP crash IB/srpt: Remove redundant wc array IB/qib: Improve ipoib UD performance IB/mlx4: Advertise RoCE v2 support IB/mlx4: Create and use another QP1 for RoCEv2 IB/mlx4: Enable send of RoCE QP1 packets with IP/UDP headers ...
2016-01-19IB/core: Use hop-limit from IP stack for RoCEMatan Barak
Previously, IPV6_DEFAULT_HOPLIMIT was used as the hop limit value for RoCE. Fixing that by taking ip4_dst_hoplimit and ip6_dst_hoplimit as hop limit values. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2016-01-19IB/cma: Fix RDMA port validation for iWarpMatan Barak
cma_validate_port wrongly assumed that Ethernet devices are RoCE devices and thus their ndev should be matched in the GID table. This broke the iWarp support. Fixing that matching the ndev only if we work on a RoCE port. Cc: <stable@vger.kernel.org> # 4.4.x- Fixes: abae1b71dd37 ('IB/cma: cma_validate_port should verify the port and netdevice') Reported-by: Hariprasad Shenai <hariprasad@chelsio.com> Tested-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/cma: Join and leave multicast groups with IGMPMoni Shoua
Since RoCEv2 is a protocol over IP header it is required to send IGMP join and leave requests to the network when joining and leaving multicast groups. Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/cma: Add configfs for rdma_cmMatan Barak
Users would like to control the behaviour of rdma_cm. For example, old applications which don't set the required RoCE gid type could be executed on RoCE V2 network types. In order to support this configuration, we implement a configfs for rdma_cm. In order to use the configfs, one needs to mount it and mkdir <IB device name> inside rdma_cm directory. The patch adds support for a single configuration file, default_roce_mode. The mode can either be "IB/RoCE v1" or "RoCE v2". Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/rdma_cm: Add wrapper for cma reference countMatan Barak
Currently, cma users can't increase or decrease the cma reference count. This is necassary when setting cma attributes (like the default GID type) in order to avoid use-after-free errors. Adding cma_ref_dev and cma_deref_dev APIs. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Validate route when we init ahMatan Barak
In order to make sure API users don't try to use SGIDs which don't conform to the routing table, validate the route before searching the RoCE GID table. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Add rdma_network_type to wcSomnath Kotur
Providers should tell IB core the wc's network type. This is used in order to search for the proper GID in the GID table. When using HCAs that can't provide this info, IB core tries to deep examine the packet and extract the GID type by itself. We choose sgid_index and type from all the matching entries in RDMA-CM based on hint from the IP stack and we set hop_limit for the IP packet based on above hint from IP stack. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Somnath Kotur <Somnath.Kotur@Avagotech.Com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/cm: Use the source GID index typeMatan Barak
Previosuly, cm and cma modules supported only IB and RoCE v1 GID type. In order to support multiple GID types, the gid_type is passed to cm_init_av_by_path and stored in the path record. The rdma cm client would use a default GID type that will be saved in rdma_id_private. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-23IB/core: Add gid_type to gid attributeMatan Barak
In order to support multiple GID types, we need to store the gid_type with each GID. This is also aligned with the RoCE v2 annex "RoCEv2 PORT GID table entries shall have a "GID type" attribute that denotes the L3 Address type". The currently supported GID is IB_GID_TYPE_IB which is also RoCE v1 GID type. This implies that gid_type should be added to roce_gid_table meta-data. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-22IB/cma: cma_match_net_dev needs to take into account port_numMatan Barak
Previously, cma_match_net_dev called cma_protocol_roce which tried to verify that the IB device uses RoCE protocol. However, if rdma_id wasn't bound to a port, then the check would occur against the first port of the device without regard to whether that port was even of the same type as the type of port the incoming packet was received on. Fix this by passing the port of the request and only checking against the same port of the device. Reported-by: Or Gerlitz <gerlitz.or@gmail.com> Fixes: b8cab5dab15f ('IB/cma: Accept connection without a valid netdev on RoCE') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-22IB/core: Avoid calling ib_query_deviceOr Gerlitz
Use the cached copy of the attributes present on the device, except for the case of a query originating from user-space, where we have to invoke the driver query_device entry, so they can fill in their udata. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-12-08IB/cma: Add a missing rcu_read_unlock()Bart Van Assche
Ensure that validate_ipv4_net_dev() calls rcu_read_unlock() if fib_lookup() fails. Detected by sparse. Compile-tested only. Fixes: "IB/cma: Validate routing of incoming requests" (commit f887f2ac87c2). Cc: Haggai Eran <haggaie@mellanox.com> Cc: stable <stable@vger.kernel.org> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Jason Gunthorpe <jgunthorpe@obsidianresearch.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2015-10-30IB/core, cma: Make __attribute_const__ declarations sparse-friendlyBart Van Assche
Move the __attribute_const__ declarations such that sparse understands that these apply to the function itself and not to the return type. This avoids that sparse reports error messages like the following: drivers/infiniband/core/verbs.c:73:12: error: symbol 'ib_event_msg' redeclared with different type (originally declared at include/rdma/ib_verbs.h:470) - different modifiers Fixes: 2b1b5b601230 ("IB/core, cma: Nice log-friendly string helpers") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Sagi Grimberg <sagig@mellanox.com> Reviewed-by: Sagi Grimberg <sagig@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>