linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2015-12-12	ixgbe: Reorder search to work from the top down instead of bottom up	Alexander Duyck
	This patch is meant to reduce the complexity of the search function used for finding a VLVF entry associated with a given VLAN ID. The previous code was searching from bottom to top. I reordered it to search from top to bottom. In addition I pulled an AND statement out of the loop and instead replaced it with an OR statement outside the loop. This should help to reduce the overall size and complexity of the function. There was also some formatting I cleaned up in regards to whitespace and such. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12	ixgbe: Add support for adding/removing VLAN on PF bypassing the VLVF	Alexander Duyck
	This patch adds support for bypassing the VLVF entry creation when the PF is adding a new VLAN. The advantage to doing this is that we can then save the VLVF entries for the VFs which must have them in order to function, versus the PF which can fall back on the default pool entry. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12	ixgbe: Simplify configuration of setting VLVF and VLVFB	Alexander Duyck
	This patch addresses several issues within the VLVF and VLVFB configuration First was the fact that code was overly complicated with multiple conditional paths depending on if we adding or removing and which bit we were going to add or remove. Instead of messing with all that I have simplified it by using (vid / 32) and (1 - vid / 32) to identify our register and the other vlvfb register. Second was the fact that we were likely leaking a few packets into the PF in cases where we were deleting an entry and the VFTA filter for that entry as the ordering was such that we deleted the pool and then the VLAN filter instead of the other way around. I have updated that by adding a check for no bits being set and if that occurs we clear things up in the proper order. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12	ixgbe: Reduce VT code indent in set_vfta by introducing jump label	Alexander Duyck
	In order to clear the way for upcoming work I thought it best to drop the level of indent in the ixgbe_set_vfta_generic function. Most of the code is held in the virtualization specific section. So the easiest approach is to just add a jump label and jump past the bulk of the code if it is not enabled. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12	ixgbe: Simplify definitions for regidx and bit in set_vfta	Alexander Duyck
	This patch simplifies the logic for setting the VFTA register by removing the number of conditional checks needed. Instead we just use some boolean logic to generate vfta_delta, and if that is set then we xor the vfta by that value and write it back. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12	ixgbe: Fix SR-IOV VLAN pool configuration	Alexander Duyck
	The code for checking the PF bit in ixgbe_set_vf_vlan_msg was using the wrong offset and as a result it was pulling the VLAN off of the PF even if there were VFs numbered greater than 40 that still had the VLAN enabled. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12	ixgbe: Return error on failure to allocate mac_table	Alexander Duyck
	Add a check to make certain mac_table was actually allocated and is not NULL. If it is NULL return -ENOMEM and allow the probe routine to fail rather then causing a NULL pointer dereference further down the line. Signed-off-by: Alexander Duyck <aduyck@mirantis.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2015-12-12	Merge branch 'mlx5-flow-steering'	David S. Miller
	Saeed Mahameed says: ==================== mlx5 improved flow steering management First two patches fixes some minor issues in recently introduced SRIOV code. The other seven patches modifies the driver's code that manages flow steering rules with Connectx-4 devices. Basic introduction: The flow steering device specification model is composed of the following entities: Destination (either a TIR/Flow table/vport), where TIR is RSS end-point, vport is the VF eSwitch port in SRIOV. Flow table entry (FTE) - the values used by the flow specification Flow table group (FG) - the masks used by the flow specification Flow table (FT) - groups several FGs and can serve as destination The flow steering software entities: In addition to the device objects, the software have two more objects: Priorities - group several FTs. Handles order of packet matching. Namespaces - group several priorities. Namespace are used in order to isolate different usages of steering (for example, add two separate namespaces, one for the NIC driver and one for E-Switch FDB). The base data structure for the flow steering management is a tree and all the flow steering objects such as (Namespace/Flow table/Flow Group/FTE/etc.) are represented as a node in the tree, e.g.: Priority-0 -> FT1 -> FG -> FTE -> TIR (destination) Priority-1 -> FT2 -> FG-> FTE -> TIR (destination) Matching begins in FT1 flow rules and if there is a miss on all the FTEs then matching continues on the FTEs in FT2. The new implementation solves/improves the following issues in the current code: 1) The new impl. supports multiple destinations, the search for existing rule with the same matching value is performed by the flow steering management. In the current impl. the E-switch FDB management code needs to search for existing rules before calling to the add rule function. 2) The new impl. manages the flow table level, in the current implementation the consumer states the flow table level when new flow table is created without any knowledge about the levels of other flow tables. 3) In the current impl. the consumer can't create or destroy flow groups dynamically, the flow groups are passed as argument to the create flow table API. The new impl. exposes API for create/destroy flow group. The series is built as follows: Patch #1 add flow steering API firmware commands. Patch #2 add tree operation of the flow steering tree: add/remove node, initialize node and take reference count on a node. Patch #3 add essential algorithms for managing the flow steering. Patch #4 Initialize the flow steering tree, flow steering initialization is based on static tree which illustrates the flow steering tree when the driver is loaded. Patch #5 is the main patch of the series. It introduce the flow steering API. Patch #6 Expose the new flow steering API and remove the old one. The Ethernet flow steering follows the existing implementation, but uses the new steering API. Patch #7 Rename en_flow_table.c to en_fs.c in order to be aligned with the new flow steering files. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12	net/mlx5e: Rename en_flow_table.c to en_fs.c	Maor Gottlieb
	Rename en_flow_table.c to en_fs.c in order to be aligned with the new flow steering files. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12	net/mlx5: Use flow steering infrastructure for mlx5_en	Maor Gottlieb
	Expose the new flow steering API and remove the old one. Few changes are required: 1. The Ethernet flow steering follows the existing implementation, but uses the new steering API. The old flow steering implementation is removed. 2. Move the E-switch FDB management to use the new API. 3. When driver is loaded call to mlx5_init_fs which initialize the flow steering tree structure, open namespaces for NIC receive and for E-switch FDB. 4. Call to mlx5_cleanup_fs when the driver is unloaded. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12	net/mlx5_core: Flow steering tree initialization	Maor Gottlieb
	Flow steering initialization is based on static tree which illustrates the flow steering tree when the driver is loaded. The initialization considers the max supported flow table level of the device, a minimum of 2 kernel flow tables(vlan and mac) are required to have kernel flow table functionality. The tree structures when the driver is loaded: root_namespace(receive nic) \| priority-0 (kernel priority) \| namespace(kernel namespace) \| priority-0 (flow tables priority) In the following patches, When the EN driver will use the flow steering API, it create two flow tables and their flow groups under priority-0(flow tables priority). Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12	net/mlx5_core: Introduce flow steering API	Maor Gottlieb
	Introducing the following objects: mlx5_flow_root_namespace: represent the root of specific flow table type tree(e.g NIC receive, FDB, etc..) mlx5_flow_group: define the mask of the flow specification. fs_fte(flow steering flow table entry): defines the value of the flow specification. The following describes the relationships between the tree objects: root_namespace --> priorities -->namespaces --> priorities -->flow-tables --> flow-groups --> flow-entries --> destinations When we create new object(flow table/flow group/flow table entry), we call to the FW command and then we add the related sw object to the tree. When we destroy object, e.g. call to mlx5_destroy_flow_table, we use the tree node destructor for destroying the FW object and remove the node from the tree. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12	net/mlx5_core: Add flow steering lookup algorithms	Maor Gottlieb
	Introduce the flow steering mlx5_flow_namespace (Namespace) and fs_prio (Flow Steering Priority) tree nodes. Namespaces are used in order to isolate different usages or types of steering (for example, downstream patches will add a different namespaces for the NIC driver and for E-Switch FDB usages). Flow Steering Priorities are objects that describes priorities ranges between different flow objects under the same namespace. Example, entries in priority i are matched before entries in priority i+1. This patch adds the following algorithms: 1) Calculate level: Each flow table has level(the priority between the flow tables). When we initialize the flow steering tree, we assign range of levels to each priority, therefore the level for new flow table is the location within the priority related to the range of the priority. 2) Match between match criteria. This function is used for searching flow group when new flow rule is added. 3) Match between match values. This function is used for searching flow table entry when new flow rule is added. 4) Add essential macros for traversing on a node's children. E.g. traversing on all the flow table of some priority Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12	net/mlx5_core: Add flow steering base data structures	Maor Gottlieb
	Introducing the base data structure and its operations that are going to represent ConnectX-4 Flow Steering, this data structure is basically a tree and all Flow steering objects such as (Flow Table/Flow Group/FTE/etc ..) are represented as fs_node(s). fs_node is the base object which describes a basic tree node, with the following extra info: type: describes the runtime type of the node (Object). lock: lock this node sub-tree. ref_count: number of children + current references. remove_func: a generic destructor. fs_node types will be used and explained once the usage is added in the following patches. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12	net/mlx5_core: Introduce flow steering firmware commands	Maor Gottlieb
	Introduce new Flow Steering (FS) firmware commands, in-order to support the new flow steering infrastructure. Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Signed-off-by: Moni Shoua <monis@mellanox.com> Signed-off-by: Matan Barak <matanb@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12	net/mlx5e: Assign random MAC address if needed	Saeed Mahameed
	Under SRIOV there might be a case where VFs are loaded without pre-assigned MAC address. In this case, the VF will randomize its own MAC. This will address the case of administrator not assigning MAC to the VF through the PF OS APIs and keep udev happy. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-12	net/mlx5: Fix query E-Switch capabilities	Saeed Mahameed
	E-Switch capabilities should be queried only if E-Switch flow table is supported and not only when vport group manager. Fixes: d6666753c6e8 ("net/mlx5: E-Switch, Introduce HCA cap and E-Switch vport context") Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	Merge branch 'thunderx-pass2'	David S. Miller
	Sunil Goutham says: ==================== net: thunderx: Support for pass-2 hw features This patch set adds support for new features added in pass-2 revision of hardware like TSO and count based interrupt coalescing. Changes from v1: - Addressed comments received regarding boolean bit field changes by excluding them from this patch. Will submit a seperate patch along with cleanup of unsed field. - Got rid of new macro 'VNIC_NAPI_WEIGHT' introduced in count threshold interrupt patch. ==================== Reviewed-by: Pavel Fedin <p.fedin@samsung.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	net: thunderx: Enable CQE count threshold interrupt	Sunil Goutham
	This feature is introduced in pass-2 chip and with this CQ interrupt coalescing will work based on both timer and count. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	net: thunderx: HW TSO support for pass-2 hardware	Sunil Goutham
	This adds support for offloading TCP segmentation to HW in pass-2 revision of hardware. Both driver level SW TSO for pass1.x chips and HW TSO for pass-2 chip will co-exist. Modified SQ descriptor structures to reflect pass-2 hw implementation. Signed-off-by: Sunil Goutham <sgoutham@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	Doc: Micrel-ksz90x1.txt: Document deprecated MAC OF properties	Andrew Lunn
	Phy properties are expected to be found in the PHY OF node. However this Micrel driver also allows them to be placed into the MAC OF node. This is deprecated. Document it as such, and remove the example using the deprecated method to prevent people copying it into new device tree files. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	Merge branch 'mvneta-rss-xps'	David S. Miller
	Gregory CLEMENT says: ==================== mvneta: Introduce RSS support and XPS configuration this series is the first step add RSS support on mvneta. It will allow associating an ethernet interface to a given CPU through RSS by using "ethtool -X ethX weight". Indeed, currently I only enable one entry in the RSS lookup table. Even if it is not really RSS, it allows to get back the irq affinity feature we lost by using the percpu interrupt. The main change compared to the second version is the setup for the XPS instead of using specific hack inside the driver in the forth patch. Th first patch make the default queue associate to each port and no more a global variable. The second patch really associates the RX queues with the CPUs instead of masking the percpu interrupts for doing it. All the RX queues are enabled and are statically associated with the CPUs by using a modulo of the number of present CPUs. But at this stage only one RX queue will receive the stream. The third patch introduces a first level of RSS support through the ethtool functions. As explained in the introduction there is only one entry in the RSS lookup table which permits at the end to associate an mvneta port to a CPU through the RX queues because the mapping is static. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	net: mvneta: Configure XPS support	Gregory CLEMENT
	With this patch each CPU is associated with its own set of TX queues. It also setup the XPS with an initial configuration which set the affinity matching the hardware configuration. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	net: mvneta: Add naive RSS support	Gregory CLEMENT
	This patch adds the support for the RSS related ethtool function. Currently it only uses one entry in the indirection table which allows associating an mvneta interface to a given CPU. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Tested-by: Marcin Wojtas <mw@semihalf.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	net: mvneta: Associate RX queues with each CPU	Gregory CLEMENT
	We enable the percpu interrupt for all the CPU and we just associate a CPU to a few queue at the neta level. The mapping between the CPUs and the queues is static. The queues are associated to the CPU module the number of CPUs. However currently we only use on RX queue for a given Ethernet port. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	net: mvneta: Make the default queue related for each port	Gregory CLEMENT
	Instead of using the same default queue for all the port. Move it in the port struct. It will allow have a different default queue for each port. Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	mpls_iptunnel: add static qualifier to mpls_output	Roopa Prabhu
	This gets rid of the following compile warn: net/mpls/mpls_iptunnel.c:40:5: warning: no previous prototype for mpls_output [-Wmissing-prototypes] Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com> Acked-by: Robert Shearman <rshearma@brocade.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	cxgb4: Handle clip return values	Hariprasad Shenai
	Add a warn message when clip table overflows. If clip table isn't allocated, return from cxgb4_clip_release() to avoid panic. Disable offload if clip isn't enabled in the hardware. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	mlxsw: core: remove an unneeded condition	Dan Carpenter
	We already know "err" is zero so there is no need to check. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	mlxsw: spectrum: fix some error handling	Dan Carpenter
	The "err = " assignment is missing here. Fixes: 0d65fc13042f ('mlxsw: spectrum: Implement LAG port join/leave') Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	netcp: add more __le32 annotations	Arnd Bergmann
	The handling of epib and psdata remains a bit unclear in the driver, as we access the same fields both as CPU-endian and through DMA from the device. Sparse warns about this: ti/netcp_core.c:1147:21: warning: incorrect type in assignment (different base types) ti/netcp_core.c:1147:21: expected unsigned int [usertype] [assigned] epib ti/netcp_core.c:1147:21: got restricted __le32 <noident> This uses __le32 types in a few places and uses __force where the code looks fishy. The previous patch should really have produced the correct behavior, but this second patch is needed to shut up the warnings about it. Ideally it would be slightly rewritten to not need those casts, but I don't dare do that without access to the hardware for proper testing. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-11	netcp: try to reduce type confusion in descriptors	Arnd Bergmann
	The netcp driver produces tons of warnings when CONFIG_LPAE is enabled on ARM: drivers/net/ethernet/ti/netcp_core.c: In function 'netcp_tx_map_skb': drivers/net/ethernet/ti/netcp_core.c:1084:13: warning: passing argument 1 of 'set_words' from incompatible pointer type [-Wincompatible-pointer-types] This is the result of trying to pass a pointer to a dma_addr_t to a function that expects a u32 pointer to copy that into a DMA descriptor. Looking at that code in more detail to fix the warnings, I see multiple related problems: * The conversion functions are not endian-safe, as the DMA descriptors are almost certainly fixed-endian, but the CPU is not. * On 64-bit machines, passing a pointer through a u32 variable is a bug, accessing an indirect pointer as a u32 pointer even more so. * The handling of epib and psdata mixes native-endian and device-endian data. In this patch, I try to sort out the types for most accesses here, adding le32_to_cpu/cpu_to_le32 where appropriate, and passing pointers through two 32-bit words in the descriptor padding, to make it plausible that the driver does the right thing if compiled for big-endian or 64-bit systems. Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-09	cgroup: fix sock_cgroup_data initialization on earlier compilers	Tejun Heo
	sock_cgroup_data is a struct containing an anonymous union. sock_cgroup_set_prioidx() and sock_cgroup_set_classid() were initializing a field inside the anonymous union as follows. struct sock_ccgroup_data skcd_buf = { .val = VAL }; While this is fine on more recent compilers, gcc-4.4.7 triggers the following errors. include/linux/cgroup-defs.h: In function ‘sock_cgroup_set_prioidx’: include/linux/cgroup-defs.h:619: error: unknown field ‘val’ specified in initializer include/linux/cgroup-defs.h:619: warning: missing braces around initializer include/linux/cgroup-defs.h:619: warning: (near initialization for ‘skcd_buf.<anonymous>’) This is because .val belongs to the anonymous union nested inside the struct but the initializer is missing the nesting. Fix it by adding an extra pair of braces. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Alaa Hleihel <alaa@dev.mellanox.co.il> Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup") Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	chelsio: constify cmac_ops structures	Julia Lawall
	The cmac_ops structures are never modified, so declare them as const. Done with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	bnx2x: remove rx_pkt/rx_calls	Eric Dumazet
	These fields are updated but never read. Remove the overhead. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	bnx2x: avoid soft lockup in bnx2x_poll()	Eric Dumazet
	Under heavy TX load, bnx2x_poll() can loop forever and trigger soft lockup bugs. A napi poll handler must yield after one TX completion round, risk of livelock is too high otherwise. Bug is very easy to trigger using a debug build, and udp flood, because of added cpu cycles in TX completion, and we do not receive enough packets to break the loop. Reported-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Ariel Elior <ariel.elior@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	rhashtable: Remove unnecessary wmb for future_tbl	Herbert Xu
	The patch 9497df88ab5567daa001829051c5f87161a81ff0 ("rhashtable: Fix reader/rehash race") added a pair of barriers. In fact the wmb is superfluous because every subsequent write to the old or new hash table uses rcu_assign_pointer, which itself carriers a full barrier prior to the assignment. Therefore we may remove the explicit wmb. Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Thomas Graf <tgraf@suug.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	Merge branch 'cxgb4-update-kconfig-and-fixes'	David S. Miller
	Hariprasad Shenai says: ==================== Update Kconfig and some fixes for cxgb4 This series update Kconfig to add description for Chelsio's next generation T6 family of adapters, also fixes ethtool stats alignment and prevents simultaneous execution of service_ofldq thread, deals with queue wrap around and adds some fl counters for debugging purpose and device ID for new T5 adapters. This patch series has been created against net-next tree and includes patches on cxgb4 driver. We have included all the maintainers of respective drivers. Kindly review the change and let us know in case of any review comments. Thanks V2: Declare 'service_ofldq_running' as bool in Patch 4/7 ("cxgb4: prevent simultaneous execution of service_ofldq()") based on review comment by David Miller ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	cxgb4: Adds PCI device id for new T5 adapters	Hariprasad Shenai
	Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	cxgb4: Add FL DMA mapping error and low counter	Hariprasad Shenai
	Add Free List DMA Mapping Errors to SGE Queue info for Free Lists. Add Free List "Low" counter to count the number of times we see the number of pointers that we _think_ the hardware sees in the Free List below the Egress Threshold. Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	cxgb4: Deal with wrap-around of queue for Work request	Hariprasad Shenai
	The WR headers may not fit within one descriptor. So we need to deal with wrap-around here. Based on original patch by Pranjal Joshi <pjoshi@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	cxgb4: prevent simultaneous execution of service_ofldq()	Hariprasad Shenai
	Change mutual exclusion mechanism to prevent multiple threads of execution from running in service_ofldq() at the same time. The old mechanism used an implicit guard on the down-call path and none on the restart path and wasn't working. This checking makes the mechanism explicit and is much easier to understand as a result. Based on original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	cxgb4: Use ACCES_ONCE macro to read queue's consumer index	Hariprasad Shenai
	Use helper macro ACCESS_ONCE() to load from the SGE status page to prevent the compiler loading multiple times. Based on original work by Mike Werner <werner@chelsio.com> Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	cxgb4/cxgb4vf: update Kconfig file to include T6 adapter	Hariprasad Shenai
	Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	cxgb4: Align rest of the ethtool get stats	Hariprasad Shenai
	Signed-off-by: Hariprasad Shenai <hariprasad@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	net: hns: optimize XGE capability by reducing cpu usage	yankejian
	here is the patch raising the performance of XGE by: 1)changes the way page management method for enet momery, and 2)reduces the count of rmb, and 3)adds Memory prefetching Signed-off-by: Kejian Yan <yankejian@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	sock, cgroup: add sock->sk_cgroup	Tejun Heo
	In cgroup v1, dealing with cgroup membership was difficult because the number of membership associations was unbound. As a result, cgroup v1 grew several controllers whose primary purpose is either tagging membership or pull in configuration knobs from other subsystems so that cgroup membership test can be avoided. net_cls and net_prio controllers are examples of the latter. They allow configuring network-specific attributes from cgroup side so that network subsystem can avoid testing cgroup membership; unfortunately, these are not only cumbersome but also problematic. Both net_cls and net_prio aren't properly hierarchical. Both inherit configuration from the parent on creation but there's no interaction afterwards. An ancestor doesn't restrict the behavior in its subtree in anyway and configuration changes aren't propagated downwards. Especially when combined with cgroup delegation, this is problematic because delegatees can mess up whatever network configuration implemented at the system level. net_prio would allow the delegatees to set whatever priority value regardless of CAP_NET_ADMIN and net_cls the same for classid. While it is possible to solve these issues from controller side by implementing hierarchical allowable ranges in both controllers, it would involve quite a bit of complexity in the controllers and further obfuscate network configuration as it becomes even more difficult to tell what's actually being configured looking from the network side. While not much can be done for v1 at this point, as membership handling is sane on cgroup v2, it'd be better to make cgroup matching behave like other network matches and classifiers than introducing further complications. In preparation, this patch updates sock->sk_cgrp_data handling so that it points to the v2 cgroup that sock was created in until either net_prio or net_cls is used. Once either of the two is used, sock->sk_cgrp_data reverts to its previous role of carrying prioidx and classid. This is to avoid adding yet another cgroup related field to struct sock. As the mode switching can happen at most once per boot, the switching mechanism is aimed at lowering hot path overhead. It may leak a finite, likely small, number of cgroup refs and report spurious prioidx or classid on switching; however, dynamic updates of prioidx and classid have always been racy and lossy - socks between creation and fd installation are never updated, config changes don't update existing sockets at all, and prioidx may index with dead and recycled cgroup IDs. Non-critical inaccuracies from small race windows won't make any noticeable difference. This patch doesn't make use of the pointer yet. The following patch will implement netfilter match for cgroup2 membership. v2: Use sock_cgroup_data to avoid inflating struct sock w/ another cgroup specific field. v3: Add comments explaining why sock_data_prioidx() and sock_data_classid() use different fallback values. Signed-off-by: Tejun Heo <tj@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Daniel Wagner <daniel.wagner@bmw-carit.de> CC: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	net: wrap sock->sk_cgrp_prioidx and ->sk_classid inside a struct	Tejun Heo
	Introduce sock->sk_cgrp_data which is a struct sock_cgroup_data. ->sk_cgroup_prioidx and ->sk_classid are moved into it. The struct and its accessors are defined in cgroup-defs.h. This is to prepare for overloading the fields with a cgroup pointer. This patch mostly performs equivalent conversions but the followings are noteworthy. * Equality test before updating classid is removed from sock_update_classid(). This shouldn't make any noticeable difference and a similar test will be implemented on the helper side later. * sock_update_netprioidx() now takes struct sock_cgroup_data and can be moved to netprio_cgroup.h without causing include dependency loop. Moved. * The dummy version of sock_update_netprioidx() converted to a static inline function while at it. Signed-off-by: Tejun Heo <tj@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	netprio_cgroup: limit the maximum css->id to USHRT_MAX	Tejun Heo
	netprio builds per-netdev contiguous priomap array which is indexed by css->id. The array is allocated using kzalloc() effectively limiting the maximum ID supported to some thousand range. This patch caps the maximum supported css->id to USHRT_MAX which should be way above what is actually useable. This allows reducing sock->sk_cgrp_prioidx to u16 from u32. The freed up part will be used to overload the cgroup related fields. sock->sk_cgrp_prioidx's position is swapped with sk_mark so that the two cgroup related fields are adjacent. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Cc: Daniel Borkmann <daniel@iogearbox.net> CC: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2015-12-08	Merge branch 'for-4.5-ancestor-test' of ↵	David S. Miller
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Preparatory changes for some new socket cgroup infrastructure and netfilter targets. Signed-off-by: David S. Miller <davem@davemloft.net>