path: root/drivers/infiniband
Age | Commit message | Author
2017-04-20 | IB/opa-vnic: VNIC Ethernet Management Agent (VEMA) interface | Vishwanathapura, Niranjana
OPA VNIC EMA interface functions are the management interfaces to the OPA VNIC netdev. Add support to add and remove VNIC ports. Implement the required GET/SET management interface functions and processing of new management information. Add support to send trap notifications upon various events like interface status change, unicast/multicast mac list update and mac address change. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 | IB/opa-vnic: VNIC MAC table support | Vishwanathapura, Niranjana
OPA VNIC MAC table contains the MAC address to DLID mappings provided by the Ethernet manager. During transmission, the MAC table provides the MAC address to DLID translation. Implement MAC table using simple hash list. Also provide support to update/query the MAC table by Ethernet manager. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
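The MAC-to-DLID translation described in the MAC table commit above can be pictured with a small chained hash table. This is only a user-space sketch, not the driver's code: the names `mac_table`, `mac_entry`, `mac_hash`, `mac_table_update` and `mac_table_lookup`, the bucket count, and the hash function are all assumptions made for illustration.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAC_LEN      6
#define MAC_BUCKETS  16   /* hypothetical bucket count */

/* One MAC-to-DLID mapping, chained per bucket. */
struct mac_entry {
    uint8_t mac[MAC_LEN];
    uint32_t dlid;
    struct mac_entry *next;
};

struct mac_table {
    struct mac_entry *bucket[MAC_BUCKETS];
};

/* Hypothetical hash: map the last byte of the MAC to a bucket index. */
unsigned int mac_hash(const uint8_t *mac)
{
    return mac[MAC_LEN - 1] % MAC_BUCKETS;
}

/* Insert or update a mapping; returns 0 on success, -1 if allocation fails. */
int mac_table_update(struct mac_table *t, const uint8_t *mac, uint32_t dlid)
{
    unsigned int h = mac_hash(mac);
    struct mac_entry *e;

    for (e = t->bucket[h]; e; e = e->next) {
        if (memcmp(e->mac, mac, MAC_LEN) == 0) {
            e->dlid = dlid;          /* existing MAC: overwrite its DLID */
            return 0;
        }
    }
    e = malloc(sizeof(*e));
    if (!e)
        return -1;
    memcpy(e->mac, mac, MAC_LEN);
    e->dlid = dlid;
    e->next = t->bucket[h];          /* push onto the head of the chain */
    t->bucket[h] = e;
    return 0;
}

/* Transmit-path lookup; returns 1 and fills *dlid when the MAC is known. */
int mac_table_lookup(const struct mac_table *t, const uint8_t *mac, uint32_t *dlid)
{
    const struct mac_entry *e;

    for (e = t->bucket[mac_hash(mac)]; e; e = e->next) {
        if (memcmp(e->mac, mac, MAC_LEN) == 0) {
            *dlid = e->dlid;
            return 1;
        }
    }
    return 0;
}
```

A zero-initialized `struct mac_table` is an empty table. The sketch leaves out locking and entry removal, which a real driver would need.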
2017-04-20 | IB/opa-vnic: VNIC statistics support | Vishwanathapura, Niranjana
OPA VNIC driver statistics support maintains various counters including standard netdev counters and the Ethernet manager defined counters. Add the Ethtool hook to read the counters. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 | IB/opa-vnic: VNIC Ethernet Management (EM) structure definitions | Vishwanathapura, Niranjana
Define VNIC EM MAD structures and the associated macros. These structures are used for information exchange between VNIC EM agent (EMA) on the host and the Ethernet manager. These include the virtual ethernet switch (vesw) port information, vesw port mac table, summary and error counters, vesw port interface mac lists and the EMA trap. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 | IB/opa-vnic: Virtual Network Interface Controller (VNIC) netdev | Vishwanathapura, Niranjana
OPA VNIC netdev function supports Ethernet functionality over Omni-Path fabric by encapsulating Ethernet packets inside an Omni-Path packet header. It allocates an rdma netdev device and interfaces with the network stack to provide standard Ethernet network interfaces. It overrides the HFI1 device's netdev operations where required. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Sadanand Warrier <sadanand.warrier@intel.com> Signed-off-by: Sudeep Dutt <sudeep.dutt@intel.com> Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 | Merge branch 'k.o/for-4.12' into k.o/for-4.12-rdma-netdevice | Doug Ledford
2017-04-20 | IB/core: Rename uverbs event file structure | Matan Barak
Previously, ib_uverbs_event_file was suffixed by _file as it contained the actual file information. Since it's now only used as base struct for ib_uverbs_async_event_file and ib_uverbs_completion_event_file, we change its name to ib_uverbs_event_queue. This represents its logical role better. Fixes: 1e7710f3f656 ('IB/core: Change completion channel to use the reworked objects schema') Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 | IB/core: Don't use is_async in event files to infer events size | Matan Barak
Previously, we inferred the events size in ib_uverbs_event_read by using the is_async flag. Instead of that, we pass the event size directly. Fixes: 1e7710f3f656 ('IB/core: Change completion channel to use the reworked objects schema') Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 | IB/core: A small refactor in destroy WQ handler | Matan Barak
Instead of having uverbs_uobject_put both in the error flow and the good flow, we unite them. Fixes: fd3c7904db6e ('IB/core: Change idr objects to use the new schema') Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 | IB/core: Nullify ib_uobject during allocation | Matan Barak
Currently, we initialize all fields of ib_uobject straight after allocation. Therefore, a kmalloc was sufficient. Since ib_uobject could be embedded in a type specific structure, we nullify it to spare programmer errors. Fixes: 3832125624b7 ('IB/core: Add support for idr types') Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 | IB/core: Don't pass the lock state to _rdma_remove_commit_uobject | Matan Barak
The only scenario where this function was called while the lock is already taken is in the context cleanup scenario. Thus, in order not to pass the lock state to this function, we just call the remove logic straight from the cleanup context function. Fixes: 3832125624b7 ('IB/core: Add support for idr types') Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-20 | IB/core: Rename write flag to exclusive in rdma_core | Matan Barak
We rename the "write" flags to "exclusive", as it's used for both WRITE and DESTROY actions. Fixes: 3832125624b7 ('IB/core: Add support for idr types') Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-17 | hw/mlx5: Add New bit to check over QP creation | Erez Shitrit
Add check for bit IB_QP_CREATE_NETIF_QP while creating QP. Signed-off-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-17 | net/mlx5e: IPoIB, Xmit flow | Saeed Mahameed
Implement mlx5e's IPoIB SKB transmit using the helper functions provided by mlx5e ethernet tx flow, the only difference in the code between mlx5e_xmit and mlx5i_xmit is that IPoIB has some extra fields to fill (UD datagram segment) in the TX descriptor (WQE) and it doesn't need to have any vlan handling. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Erez Shitrit <erezsh@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-15 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | David S. Miller
Conflicts were simply overlapping changes. In the net/ipv4/route.c case the code had simply moved around a little bit and the same fix was made in both 'net' and 'net-next'. In the net/sched/sch_generic.c case a fix in 'net' happened at the same time that a new argument was added to qdisc_hash_add(). Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 | netlink: pass extended ACK struct to parsing functions | Johannes Berg
Pass the new extended ACK reporting struct to all of the generic netlink parsing functions. For now, pass NULL in almost all callers (except for some in the core.) Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-13 | netlink: extended ACK reporting | Johannes Berg
Add the base infrastructure and UAPI for netlink extended ACK reporting. All "manual" calls to netlink_ack() pass NULL for now and thus don't get extended ACK reporting. Big thanks goes to Pablo Neira Ayuso for not only bringing up the whole topic at netconf (again) but also coming up with the nlattr passing trick and various other ideas. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-11 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending | Linus Torvalds
Pull SCSI target fixes from Nicholas Bellinger:
"There has been work in a number of different areas over the last weeks, including:
- Fix target-core-user (TCMU) back-end bi-directional handling (Xiubo Li + Mike Christie + Ilias Tsitsimpis)
- Fix iscsi-target TMR reference leak during session shutdown (Rob Millner + Chu Yuan Lin)
- Fix target_core_fabric_configfs.c race between LUN shutdown + mapped LUN creation (James Shen)
- Fix target-core unknown fabric callback queue-full errors (Potnuri Bharat Teja)
- Fix iscsi-target + iser-target queue-full handling in order to support iw_cxgb4 RNICs. (Potnuri Bharat Teja + Sagi Grimberg)
- Fix ALUA transition state race between multiple initiators (Mike Christie)
- Drop work-around for legacy GlobalSAN initiator, to allow QLogic 57840S + 579xx offload HBAs to work out-of-the-box in MSFT environments. (Martin Svec + Arun Easi)
Note that a number are CC'ed for stable, and although the queue-full bug-fixes required for iser-target to work with iw_cxgb4 aren't CC'ed here, they'll be posted to Greg-KH separately"
* git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  tcmu: Skip Data-Out blocks before gathering Data-In buffer for BIDI case
  iscsi-target: Drop work-around for legacy GlobalSAN initiator
  target: Fix ALUA transition state race between multiple initiators
  iser-target: avoid posting a recv buffer twice
  iser-target: Fix queue-full response handling
  iscsi-target: Propigate queue_data_in + queue_status errors
  target: Fix unknown fabric callback queue-full errors
  tcmu: Fix wrongly calculating of the base_command_size
  tcmu: Fix possible overwrite of t_data_sg's last iov[]
  target: Avoid mappedlun symlink creation during lun shutdown
  iscsi-target: Fix TMR reference leak during session shutdown
  usb: gadget: Correct usb EP argument for BOT status request
  tcmu: Allow cmd_time_out to be set to zero (disabled)
2017-04-06 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net | David S. Miller
Mostly simple cases of overlapping changes (adding code nearby, a function whose name changes, for example). Signed-off-by: David S. Miller <davem@davemloft.net>
2017-04-05 | IB/hfi1: Eliminate synchronize_rcu() in mr delete | Mike Marciniszyn
The synchronize_rcu() call can be eliminated to improve memory deregistration performance. There are two key fields involved:
- The rcu pointer itself
- The lkey_published field
To close the window between the rcu read of the mregion pointer and the reference count, the code should:
1. In lkey/rkey validation (reader): Read the rcu pointer. If the pointer is non-NULL, get a reference. In addition to the current validation tests, use a READ_ONCE() on lkey_published. Upon any failure, release the reference.
2. In the remove logic (delete): Ensure lkey_published is zeroed prior to setting the pointer to NULL. This requires using rcu_assign_pointer() to ensure lkey_published is written prior to the NULL.
3. In the insert logic (add): Ensure lkey_published is set, then use rcu_assign_pointer() to ensure the pointer is written after all MR fields.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
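The three ordering rules in the commit above can be approximated in user space with C11 atomics. This is a rough analogy and not the hfi1 code: a release store stands in for rcu_assign_pointer(), an acquire load stands in for the rcu read, and the types `mregion` and `mr_slot` plus the functions `mr_insert`, `mr_remove` and `mr_validate` are hypothetical names. It models only the order of the stores and loads, not RCU grace periods or memory lifetime.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct mregion {
    atomic_int refcount;
    atomic_bool lkey_published;
};

/* Slot standing in for the RCU-protected table entry. */
struct mr_slot {
    _Atomic(struct mregion *) mr;
};

/* Insert (add): set the flag, then publish the pointer with release order. */
void mr_insert(struct mr_slot *slot, struct mregion *mr)
{
    atomic_store_explicit(&mr->lkey_published, true, memory_order_relaxed);
    atomic_store_explicit(&slot->mr, mr, memory_order_release);
}

/* Remove (delete): clear the flag first, then set the pointer to NULL. */
void mr_remove(struct mr_slot *slot)
{
    struct mregion *mr = atomic_load_explicit(&slot->mr, memory_order_relaxed);

    if (mr)
        atomic_store_explicit(&mr->lkey_published, false, memory_order_relaxed);
    atomic_store_explicit(&slot->mr, (struct mregion *)NULL, memory_order_release);
}

/* Validation (reader): take a reference, then re-check the flag. */
struct mregion *mr_validate(struct mr_slot *slot)
{
    struct mregion *mr = atomic_load_explicit(&slot->mr, memory_order_acquire);

    if (!mr)
        return NULL;
    atomic_fetch_add(&mr->refcount, 1);
    if (!atomic_load_explicit(&mr->lkey_published, memory_order_relaxed)) {
        atomic_fetch_sub(&mr->refcount, 1);   /* failure: drop the reference */
        return NULL;
    }
    return mr;
}
```

The reader's second check is what lets the delete path skip waiting for readers: a reader that fetched the pointer just before removal still sees the cleared flag and backs out.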
2017-04-05 | IB/hfi1: Add transmit fault injection feature | Don Hiatt
Add ability to fault packets on transmit by opcode. Dropping by packet can be achieved by setting the mask to 0. In order to drop non-verbs traffic we set PbcInsertHrc to NONE (0x2). The packet will still be delivered to the receiving node but a KHdrHCRCErr (KDETH packet with a bad HCRC) will be triggered and the packet will not be delivered to the correct context. In order to drop regular verbs traffic we set the PbcTestEbp flag. The packet will still be delivered to the receiving node but a 'late ebp error' will be triggered and the packet will be dropped. A global toggle (/sys/kernel/debug/hfi1/hfi1_X/fault_suppress_err) has been added to suppress the error messages on the receive node when a packet was faulted on the sending node. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/hfi1: Add receive fault injection feature | Don Hiatt
Add fault injection capability:
- Drop packets unconditionally (fault_by_packet)
- Drop packets based on opcode (fault_by_opcode)
This feature reacts to the global FAULT_INJECTION config flag. The faulting traces have been added:
- misc/fault_opcode
- misc/fault_packet
See 'Documentation/fault-injection/fault-injection.txt' for details.
Examples:
- Dropping packets by opcode: /sys/kernel/debug/hfi1/hfi1_X/fault_opcode
    # Enable fault
    echo Y > fault_by_opcode
    # Set probability of dropping (0-100%)
    echo 25 > probability
    # Set opcode
    echo 0x64 > opcode
    # Number of times to fault
    echo 3 > times
    # An optional mask allows you to fault
    # a range of opcodes
    echo 0xf0 > mask
  /sys/kernel/debug/hfi1/hfi1_X/fault_stats contains a value in parentheses to indicate number of each opcode dropped.
- Dropping packets unconditionally: /sys/kernel/debug/hfi1/hfi1_X/fault_packet
    # Enable fault
    echo Y > fault_by_packet
  /sys/kernel/debug/hfi1/hfi1_X/fault_packet/fault_stats contains the number of packets dropped.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
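One plausible reading of how the opcode, mask and times settings in the commit above interact is sketched below. The matching rule (compare only the bits selected by the mask) is an assumption, chosen because it is consistent with the transmit-side commit's remark that a mask of 0 drops every packet; the struct `fault_opcode_cfg` and both functions are hypothetical and are not taken from the driver. The probability knob is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the debugfs knobs named in the commit message. */
struct fault_opcode_cfg {
    bool enabled;        /* fault_by_opcode */
    uint8_t opcode;      /* opcode */
    uint8_t mask;        /* mask */
    unsigned int times;  /* remaining number of faults */
};

/* Assumed rule: a packet matches when it agrees with cfg->opcode on every masked bit. */
bool fault_opcode_matches(const struct fault_opcode_cfg *cfg, uint8_t pkt_opcode)
{
    return ((pkt_opcode ^ cfg->opcode) & cfg->mask) == 0;
}

/* Returns true when the packet should be dropped; consumes one "times" credit. */
bool should_fault(struct fault_opcode_cfg *cfg, uint8_t pkt_opcode)
{
    if (!cfg->enabled || cfg->times == 0)
        return false;
    if (!fault_opcode_matches(cfg, pkt_opcode))
        return false;
    cfg->times--;
    return true;
}
```

Under this assumption, opcode 0x64 with mask 0xf0 selects every opcode from 0x60 through 0x6f, and a mask of 0 selects all opcodes.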
2017-04-05 | IB/hfi1: Ensure VL index is within bounds | Michael J. Ruhl
Improve the safety of the code and ensure the array cannot be indexed out of bounds when picking the CPU for a given SDMA engine. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/rdmavt: Avoid reseting wqe send_flags in unreserve | Mike Marciniszyn
The wqe should be read only and in fact the superfluous reset of the RVT_SEND_RESERVE_USED flag causes an issue where reserved operations elicit a bad completion to the ULP. The maintenance of the flag is now entirely within rvt_post_one_wr() where a reserved operation will set the flag and a non-reserved operation will ensure the operation that is about to be posted has the flag reset. Fixes: Commit 856cc4c237ad ("IB/hfi1: Add the capability for reserved operations") Reviewed-by: Don Hiatt <don.hiatt@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/rdmavt, IB/hfi1: Fix timer migration regressions | Sebastian Sanchez
RC timeout counter isn't getting incremented. Increment counter and add the trace for it. Fixes: 87c23b4ab018 ("IB/rdmavt: Adding timer logic to rdmavt") Reviewed-by: Brian Welty <brian.welty@intel.com> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/hfi1: Add a patch value to the firmware version string | Michael J. Ruhl
The HFI firmware now includes a patch level in its version. Update the necessary code to include the patch version in the firmware string. Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/hfi1: Check for QSFP presence before attempting reads | Easwar Hariharan
Attempting to read the status of a QSFP cable creates noise in the logs and misses out on setting an appropriate Offline/Disabled Reason if the cable is not plugged in. Check for this prior to attempting the read and attendant retries. Fixes: 673b975f1fba ("IB/hfi1: Add QSFP sanity pre-check") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/hfi1: Protect the global dev_cntr_names and port_cntr_names | Tadeusz Struk
Protect the global dev_cntr_names and port_cntr_names with the global mutex as they are allocated and freed in a function called per device. Otherwise there is a danger of double free and memory leaks. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/hfi1: Check device id early during init | Tadeusz Struk
If there is a wrong device passed to the driver it should fail early, without trying to initialize the device only to find out that it has an invalid device later during the init. Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Tadeusz Struk <tadeusz.struk@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/rdmavt: Add swqe completion trace | Mike Marciniszyn
The following fields are available for filter/trace:
- wqe
- wr_id
- qpn
- qpt
- length
- idx
- ssn
- (wr)opcode
- (wr)send_flags
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/rdmavt: Add tracing for cq entry and poll | Mike Marciniszyn
The following fields are defined for filtering and triggering:
- wr_id
- status
- opcode
- qpn
- length
- idx
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/rdmavt: Add additional fields to post send trace | Mike Marciniszyn
This fix is to get additional debugging information. The following fields are added:
- wqe
- qpt
- num_sge
- ssn
- pid
- send_flags
These additional fields provide for more focused filtering and triggering. The patch also moves the trace to just before the wqe is posted to get the most accurate information and future proofs the code to trace all possible reserved opcodes.
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/rdmavt, IB/hfi1, IB/qib: Make wc opcode translation driver dependent | Mike Marciniszyn
The work to create a completion helper moved the translation of send wqe operations to completion opcodes to rdmavt. This precludes having driver dependent operations. Make the translation driver dependent by doing the translation in the driver prior to the rvt_qp_swqe_complete() call using restored translation tables. Fixes: Commit f2dc9cdce83c ("IB/rdmavt: Add a send completion helper") Fixes: Commit 0771da5a6e9d ("IB/hfi1,IB/qib: Use new send completion helper") Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/hfi1: NULL pointer dereference when freeing rhashtable | Sebastian Sanchez
A NULL pointer dereference occurs when the driver is unloaded, and the SDMA rhashtable is freed if the rhashtable_init() function has not been called. Prevent this by changing sdma_rht to be a pointer to a dynamically allocated hash table. A non-NULL pointer indicates that the hash table was initialized and that it needs to be destroyed. Fixes: 0cb2aa690c7e ("IB/hfi1: Add sysfs interface for affinity setup") Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
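The idea in the commit above, using the pointer itself as the "was it initialized" marker, can be shown with a small user-space sketch. The `struct table` type is a stand-in for the kernel's rhashtable, and `sdma_table`, `table_setup`, `table_teardown` and the `destroy_calls` counter are hypothetical names added only for illustration.

```c
#include <stdlib.h>

/* Stand-in for the hash table; the commit refers to an rhashtable. */
struct table {
    int nbuckets;
};

/* Stays NULL until table_setup() succeeds. */
struct table *sdma_table;

/* Illustration only: counts how many real teardowns happened. */
int destroy_calls;

/* Allocate and initialize the table; returns 0 on success, -1 on failure. */
int table_setup(void)
{
    struct table *t = calloc(1, sizeof(*t));

    if (!t)
        return -1;
    t->nbuckets = 16;
    sdma_table = t;      /* publish only after initialization is complete */
    return 0;
}

/* Safe to call whether or not table_setup() ever ran. */
void table_teardown(void)
{
    if (!sdma_table)
        return;          /* never initialized, or already destroyed */
    free(sdma_table);
    sdma_table = NULL;
    destroy_calls++;
}
```

With an embedded (non-pointer) table there is no cheap way to tell an initialized table from a zeroed one, which is the situation the commit describes on the unload path.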
2017-04-05 | IB/hfi1: Cache registers during state change | Michael J. Ruhl
When the LCB is going offline, inopportune port queries can cause benign error messages to be logged. To deal with this, cache the registers just before setting the LCB to offline, allowing queries to return without eliciting the error. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/hfi1: Race hazard avoidance in user SDMA driver | Michael J. Ruhl
Set the errcode before the state and add the smp_wmb() to avoid a potential race condition with the user. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Michael Ruhl <michael.j.ruhl@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
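The ordering rule in the commit above (write the error code, then make the state visible) can be illustrated in portable C11. This sketch swaps the kernel's smp_wmb() for a release store paired with an acquire load, which gives the same "errcode is visible before state" guarantee in user-space code; the `comp_entry` layout, the state constants and both functions are hypothetical.

```c
#include <stdatomic.h>

/* Hypothetical completion states. */
enum { STATE_QUEUED = 0, STATE_COMPLETE = 1, STATE_ERROR = 2 };

struct comp_entry {
    int errcode;
    atomic_int state;
};

/* Writer: fill in errcode first, then publish the state with release order. */
void comp_set(struct comp_entry *c, int state, int errcode)
{
    c->errcode = errcode;
    atomic_store_explicit(&c->state, state, memory_order_release);
}

/*
 * Reader: once a non-queued state is observed with acquire order,
 * errcode is guaranteed to hold the value written before it.
 * Returns 1 and fills *errcode when the entry is finished, else 0.
 */
int comp_poll(struct comp_entry *c, int *errcode)
{
    int s = atomic_load_explicit(&c->state, memory_order_acquire);

    if (s == STATE_QUEUED)
        return 0;
    *errcode = c->errcode;
    return 1;
}
```

If the two writes in `comp_set` were reordered, a reader could see the final state paired with a stale error code, which is the race the commit guards against.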
2017-04-05 | IB/hfi1: Force logical link down | Dean Luick
If the logical link state does not read as down when the physical link state is offline, force it to down. Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Reviewed-by: Jakub Byczkowski <jakub.byczkowski@intel.com> Signed-off-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Easwar Hariharan <easwar.hariharan@intel.com> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/IPoIB: ibX: failed to create mcg debug file | Shamir Rabinovitch
When udev renames the netdev devices, ipoib debugfs entries do not get renamed. As a result, if a subsequent probe of an ipoib device reuses the name, creating a debugfs entry for the new device fails. Also, move ipoib_create_debug_files and ipoib_delete_debug_files into ipoib event handling in order to avoid any race condition between them. Fixes: 1732b0ef3b3a ([IPoIB] add path record information in debugfs) Cc: stable@vger.kernel.org # 2.6.15+ Signed-off-by: Vijay Kumar <vijay.ac.kumar@oracle.com> Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com> Reviewed-by: Mark Bloch <markb@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/hns: Explicitly include linux/of.h | Mark Brown
hns_roce_hw_v1.c uses DT interfaces but relies on implicit inclusion of linux/of.h which means that changes in other headers could break the build, as happened in -next for arm64 today. Add an explicit include. Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/core: Change completion channel to use the reworked objects schema | Matan Barak
This patch adds the standard fd based type - completion_channel. The completion_channel is now prefixed with ib_uobject, similarly to the rest of the uobjects. This requires a few changes: (1) We define a new completion channel fd based object type. (2) completion_event and async_event are now two different types. This means they use different fops. (3) We release the completion_channel exactly as we release other idr based objects. (4) Since ib_uobjects are already kref-ed, we only add the kref to the async event. A fd object requires filling out several parameters. Its op pointer should point to uverbs_fd_ops and its size should be at least the size of ib_uobject. We use a macro to make the type declaration easier. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/core: Add support for fd objects | Matan Barak
The completion channel we use in verbs infrastructure is FD based. Previously, we had a separate way to manage this object. Since we strive for a single way to manage any kind of object in this infrastructure, we conceptually treat all objects as subclasses of ib_uobject. This commit adds the necessary mechanism to support FD based objects like their IDR counterparts. FD object release needs to be synchronized with context release. We use the cleanup_mutex on the uverbs_file for that. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/core: Add lock to multicast handlers | Matan Barak
When two handlers used the same object in the old schema, we blocked the process in the kernel. The new schema just returns -EBUSY. This could lead to different behaviour in applications between the old schema and the new schema. In most cases, using such handlers concurrently could lead to crashing the process. For example, if thread A destroys a QP and thread B modifies it, we could have the destruction happen before the modification. In this case, we are accessing freed memory which could lead to crashing the process. This is true for most cases. However, attaching and detaching a multicast address from QP concurrently is safe. Therefore, we preserve the original behaviour by adding a lock there. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/core: Change idr objects to use the new schema | Matan Barak
This changes only the handlers which deal with idr based objects to use the new idr allocation, fetching and destruction schema. This patch consists of the following changes: (1) Allocation, fetching and destruction is done via idr ops. (2) Context initializing and release is done through uverbs_initialize_ucontext and uverbs_cleanup_ucontext. (3) Ditching the live flag. Mostly, this is pretty straightforward. The only place that is a bit trickier is in ib_uverbs_open_qp. Commit [1] added code to check whether the uobject is already live and initialized. This mostly happens because of a race between open_qp and events. We delayed assigning the uobject's pointer in order to eliminate this race without using the live variable. [1] commit a040f95dc819 ("IB/core: Fix XRC race condition in ib_uverbs_open_qp") Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/core: Add idr based standard types | Matan Barak
This patch adds the standard idr based types. These types are used in downstream patches in order to initialize, destroy and lookup IB standard objects which are based on idr objects. An idr object requires filling out several parameters. Its op pointer should point to uverbs_idr_ops and its size should be at least the size of ib_uobject. We add a macro to make the type declaration easier. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-04-05 | IB/core: Add support for idr types | Matan Barak
The new ioctl infrastructure supports driver specific objects. Each such object type has a hot unplug function, allocation size and an order of destruction. When a ucontext is created, a new list is created in this ib_ucontext. This list contains all objects created under this ib_ucontext. When an ib_ucontext is destroyed, we traverse this list several times, destroying the various objects by the order mentioned in the object type description. If several object types have the same destruction order, they are destroyed in an order opposite to their creation. Adding an object is done in two parts. First, an object is allocated and added to idr tree. Then, the command's handlers (in downstream patches) could work on this object and fill in its required details. After a successful command, the commit part is called and the user objects become ucontext visible. If the handler failed, alloc_abort should be called. Removing a uobject is done by calling lookup_get with the write flag and finalizing it with destroy_commit. A major change from the previous code is that we actually destroy the kernel object itself in destroy_commit (rather than just the uobject). We should make sure idr (per-uverbs-file) and list (per-ucontext) could be accessed concurrently without corrupting them. Signed-off-by: Matan Barak <matanb@mellanox.com> Reviewed-by: Yishai Hadas <yishaih@mellanox.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
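The cleanup rule described in the commit above (one pass per destruction order, newest object first within a pass) can be sketched with a plain array. This is not the uverbs code: `struct uobj`, `cleanup_context`, the `id` field, and the convention that lower order values are destroyed first are assumptions made for the example.

```c
#include <stddef.h>

struct uobj {
    int id;              /* hypothetical identifier for the example */
    int destroy_order;   /* assumption: lower values are destroyed first */
    int alive;
};

/*
 * objs[] is in creation order. For each destruction order from 0 to
 * max_order, walk the array from newest to oldest and record the ids
 * of the objects destroyed in out[]. Returns the number destroyed.
 */
size_t cleanup_context(struct uobj *objs, size_t n, int max_order, int *out)
{
    size_t done = 0;

    for (int order = 0; order <= max_order; order++) {
        for (size_t i = n; i-- > 0; ) {
            if (objs[i].alive && objs[i].destroy_order == order) {
                objs[i].alive = 0;
                out[done++] = objs[i].id;
            }
        }
    }
    return done;
}
```

For example, objects created as 1, 2, 3, 4 with orders 1, 0, 1, 0 would be destroyed as 4, 2, 3, 1: the order-0 objects go first, and within each order the most recently created object is destroyed first.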
2017-04-05 | IB/core: Refactor idr to be per uverbs_file | Matan Barak
The current code creates an idr per type. Since types are currently common for all drivers and known in advance, this was good enough. However, the proposed ioctl based infrastructure allows each driver to declare only some of the common types and declare its own specific types. Thus, we decided to implement idr to be per uverbs_file. Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Haggai Eran <haggaie@mellanox.com> Reviewed-by: Sean Hefty <sean.hefty@intel.com> Signed-off-by: Doug Ledford <dledford@redhat.com>
2017-03-30 | iser-target: avoid posting a recv buffer twice | Sagi Grimberg
We pre-allocate our send-queues and might overflow them in case we have multi work-request operations which tend to occur for large RDMA transfers over devices with limited allowed sg elements. When we get to a queue-full condition we might retry again later, so track our receive buffers so we don't repost them for a retry case. Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com> Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
2017-03-30 | iser-target: Fix queue-full response handling | Nicholas Bellinger
This patch addresses two queue-full handling bugs in iser-target. The first is propagating isert_rdma_rw_ctx_post() return back to target-core via isert_put_datain() + isert_get_dataout() callbacks, in order to trigger queue-full logic in target-core. Note target-core expects -EAGAIN or -ENOMEM error to signal RDMA WRITE/READ data-transfer callbacks should be retried, after queue-full logic has been invoked. Other types of errors propagated up from RDMA RW API will result in target-core generating internal CHECK_CONDITION status, avoiding subsequent isert_put_datain() and isert_get_dataout() iscsit_transport callback retry attempts. The second is to use transport_generic_request_failure() during T10-PI hw-offload errors in isert_rdma_write_done() and isert_rdma_read_done(), so CHECK_CONDITION queue-full is handled internally by target-core. Also add isert_put_response() T10-PI failure case fixme in isert_rdma_write_done(), which is currently not internally retried or released until session reinstatement. Reported-by: Potnuri Bharat Teja <bharat@chelsio.com> Reviewed-by: Potnuri Bharat Teja <bharat@chelsio.com> Tested-by: Potnuri Bharat Teja <bharat@chelsio.com> Cc: Potnuri Bharat Teja <bharat@chelsio.com> Reported-by: Steve Wise <swise@opengridcomputing.com> Cc: Steve Wise <swise@opengridcomputing.com> Cc: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Nicholas Bellinger <nab@linux-iscsi.org>
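The return-code convention stated in the commit above (-EAGAIN or -ENOMEM means retry after queue-full handling, any other error ends in CHECK_CONDITION) can be summarized as a small classifier. The enum `xfer_action` and the function `classify_xfer_result` are hypothetical names for illustration and do not exist in target-core.

```c
#include <errno.h>

/* Hypothetical outcome labels for a data-transfer callback result. */
enum xfer_action {
    XFER_DONE,             /* callback succeeded */
    XFER_RETRY,            /* queue-full: retry the callback later */
    XFER_CHECK_CONDITION   /* hard failure: report status, do not retry */
};

/* Map a callback return value to the action described in the commit message. */
enum xfer_action classify_xfer_result(int ret)
{
    if (ret == 0)
        return XFER_DONE;
    if (ret == -EAGAIN || ret == -ENOMEM)
        return XFER_RETRY;
    return XFER_CHECK_CONDITION;
}
```

Keeping the retryable set this narrow means an unexpected error from the RDMA layer cannot put a command into an endless retry loop.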
2017-03-30 | drivers: add explicit interrupt.h includes | Florian Westphal
These files all use functions declared in interrupt.h, but currently rely on implicit inclusion of this file (via netns/xfrm.h). That won't work anymore when the flow cache is removed so include that header where needed. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-03-27 | Merge 4.11-rc4 into char-misc-next | Greg Kroah-Hartman
We want the char-misc fixes in here as well. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>