summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-10-02rtlwifi: Fix Kconfig for RTL8192EELarry Finger
The driver needs btcoexist, but Kconfig fails to select it. This omission could cause build errors for some configurations. Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02ath9k: Fix flushing in MCC modeSujith Manoharan
When we are attempting to switch to a new channel context, the TX queues are flushed, but the mac80211 queues are not stopped and traffic can still come down to the driver. This patch fixes it by stopping the queues assigned to the current context/vif before trying to flush. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02ath9k: Fix queue handling for channel contextsSujith Manoharan
When a full chip reset is done, all the queues across all VIFs are stopped, but if MCC is enabled, only the queues of the current context is awakened, when we complete the reset. This results in unfairness for the inactive context. Since frames are queued internally in the driver if there is a context mismatch, we can awaken all the queues when coming out of a reset. The VIF-specific queues are still used in flow control, to ensure fairness when traffic is high. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02ath9k: Add ath9k_chanctx_stop_queues()Sujith Manoharan
This can be used when the queues of a context needs to be stopped. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02ath9k: Pass context to ath9k_chanctx_wake_queues()Sujith Manoharan
Change the ath9k_chanctx_wake_queues() API so that we can pass the channel context that needs its queues to be stopped. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02ath9k: Fix queue handling in flush()Sujith Manoharan
When draining of the TX queues fails, a full HW reset is done. ath_reset() makes sure that the queues in mac80211 are restarted, so there is no need to wake them up again. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02ath9k: Remove duplicate codeSujith Manoharan
ath9k_has_tx_pending() can be used to check if there are pending frames instead of having duplicate code. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02ath9k: Fix pending frame checkSujith Manoharan
Checking for the queue depth outside of the TX queue lock is incorrect and in this case, is not required since it is done inside ath9k_has_pending_frames(). Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02ath9k: Check pending frames properlySujith Manoharan
There is no need to check if the current channel context has active ACs queued up if the TX queue is not empty. Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02ath9k: Print RoC expirationSujith Manoharan
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/usb/r8152.c net/netfilter/nfnetlink.c Both r8152 and nfnetlink conflicts were simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-02mwifiex: add support for SD8887 chipsetAvinash Patil
This patch adds SD8887 support to mwifiex. SD8887 is Marvell's 1x1 11ac solution. The corresponding firmware image file is located at: "mrvl/sd8887_uapsta.bin" Signed-off-by: Avinash Patil <patila@marvell.com> Signed-off-by: Cathy Luo <cluo@marvell.com> Signed-off-by: Frank Huang <frankh@marvell.com> Signed-off-by: Nishant Sarmukadam <nishants@marvell.com> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02mwifiex: few more register offset entries for sdio card structureAvinash Patil
This patch adds some more defitions to card specific register structure and removes static defines for these registers. Signed-off-by: Avinash Patil <patila@marvell.com> Signed-off-by: Cathy Luo <cluo@marvell.com> Signed-off-by: Amitkumar Karwar <akarwar@marvell.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02wil6210: atomic I/O for the card memoryVladimir Kondratiev
Introduce netdev IOCTLs, to be used by the debug tools. Allows to read/write single dword value or memory block, aligned to dword Different address modes supported: - BAR offset - Firmware "linker" address - target's AHB bus Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02wil6210: manual FW error recovery modeVladimir Kondratiev
Introduce manual FW recovery mode. It is activated if module parameter @no_fw_recovery set to true. May be changed at runtime. Recovery information provided by new "recovery" debugfs file. It prints: mode = [auto|manual] state = [idle|pending|running] In manual mode, after FW error, recovery won't start automatically. Instead, after notification to user space, recovery waits in "pending" state, as indicated by the "recovery" debugfs file. User space tools may perform data collection and allow to continue recovery by writing "run" to the "recovery" debugfs file. Alternatively, recovery pending may be canceled by stopping network interface i.e. 'ifconfig wlan0 down' Signed-off-by: Vladimir Kondratiev <qca_vkondrat@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02ath: Add support for tracingSujith Manoharan
Signed-off-by: Sujith Manoharan <c_manoha@qca.qualcomm.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-02Merge branch 'for-upstream' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next
2014-10-02ipvs: Clean up comment style in ip_vs.hSimon Horman
* Consistently use the multi-line comment style for networking code: /* This * That * The other thing */ * Use single-line comment style for comments with only one line of text. * In general follow the leading '*' of each line of a comment with a single space and then text. * Add missing line break between functions, remove double line break, align comments to previous lines whenever possible. Reported-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-02netfilter: explicit module dependency between br_netfilter and physdevPablo Neira Ayuso
You can use physdev to match the physical interface enslaved to the bridge device. This information is stored in skb->nf_bridge and it is set up by br_netfilter. So, this is only available when iptables is used from the bridge netfilter path. Since 34666d4 ("netfilter: bridge: move br_netfilter out of the core"), the br_netfilter code is modular. To reduce the impact of this change, we can autoload the br_netfilter if the physdev match is used since we assume that the users need br_netfilter in place. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-02netfilter: nf_tables: allow to filter from prerouting and postroutingPablo Neira Ayuso
This allows us to emulate the NAT table in ebtables, which is actually a plain filter chain that hooks at prerouting, output and postrouting. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-02netfilter: nft_compat: remove incomplete 32/64 bits arch compat codePablo Neira Ayuso
This code was based on the wrong asumption that you can probe based on the match/target private size that we get from userspace. This doesn't work at all when you have to dump the info back to userspace since you don't know what word size the userspace utility is using. Currently, the extensions that require arch compat are limit match and the ebt_mark match/target. The standard targets are not used by the nft-xt compat layer, so they are not affected. We can work around this limitation with a new revision that uses arch agnostic types. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-02netfilter: nf_tables: wait for call_rcu completion on module removalPablo Neira Ayuso
Make sure the objects have been released before the nf_tables modules is removed. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-02netfilter: use IS_ENABLED(CONFIG_BRIDGE_NETFILTER)Pablo Neira Ayuso
In 34666d4 ("netfilter: bridge: move br_netfilter out of the core"), the bridge netfilter code has been modularized. Use IS_ENABLED instead of ifdef to cover the module case. Fixes: 34666d4 ("netfilter: bridge: move br_netfilter out of the core") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-02netfilter: move nf_send_resetX() code to nf_reject_ipvX modulesPablo Neira Ayuso
Move nf_send_reset() and nf_send_reset6() to nf_reject_ipv4 and nf_reject_ipv6 respectively. This code is shared by x_tables and nf_tables. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-02netfilter: nft_reject: introduce icmp code abstraction for inet and bridgePablo Neira Ayuso
This patch introduces the NFT_REJECT_ICMPX_UNREACH type which provides an abstraction to the ICMP and ICMPv6 codes that you can use from the inet and bridge tables, they are: * NFT_REJECT_ICMPX_NO_ROUTE: no route to host - network unreachable * NFT_REJECT_ICMPX_PORT_UNREACH: port unreachable * NFT_REJECT_ICMPX_HOST_UNREACH: host unreachable * NFT_REJECT_ICMPX_ADMIN_PROHIBITED: administratevely prohibited You can still use the specific codes when restricting the rule to match the corresponding layer 3 protocol. I decided to not overload the existing NFT_REJECT_ICMP_UNREACH to have different semantics depending on the table family and to allow the user to specify ICMP family specific codes if they restrict it to the corresponding family. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-02Bluetooth: 6lowpan: Check transmit errors for multicast packetsJukka Rissanen
We did not return error if multicast packet transmit failed. This might not be desired so return error also in this case. If there are multiple 6lowpan devices where the multicast packet is sent, then return error even if sending to only one of them fails. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-10-02Bluetooth: 6lowpan: Return EAGAIN error also for multicast packetsJukka Rissanen
Make sure that we are able to return EAGAIN from l2cap_chan_send() even for multicast packets. The error code was ignored unncessarily. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-10-02Bluetooth: 6lowpan: Avoid memory leak if memory allocation failsJukka Rissanen
If skb_unshare() returns NULL, then we leak the original skb. Solution is to use temp variable to hold the new skb. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-10-02Bluetooth: 6lowpan: Memory leak as the skb is not freedJukka Rissanen
The earlier multicast commit 36b3dd250dde ("Bluetooth: 6lowpan: Ensure header compression does not corrupt IPv6 header") lost one skb free which then caused memory leak. Signed-off-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
2014-10-02igb: bump version to 5.2.15Todd Fujinaka
Bump version Signed-off-by: Todd Fujinaka <todd.fujinaka@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02i40e/igb: Convert to dev_consume_skb_any()Rick Jones
Convert two more Intel NIC drivers to dev_consume_skb_any() to help make dropped packet profiling sane. Signed-off-by: Rick Jones <rick.jones2@hp.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Tested-by: Jim Young <jamesx.m.young@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02igb: remove blocking phy read from inside spinlockBernhard Kaindl
Remove a source of latency spikes (in my case up to 10ms) by not calling code that uses mdelay() for feeding a phy statistic (rx errors for idle symbols - not data -> idle_errors) while being called with a spinlock held. As idle_errors isn't read, this patch only removes unused code and data. Later, more complicated changes may be applied to address the spinlock and allow for some PHY diagnostics by harvesting this PHY stats register fully. This patch is designed to fix the issue and be safe for longterm/stable. For the Intel e1000e driver, the same change was applied in 2008 with commit 23033fad5be0 ("e1000e: remove phy read from inside spinlock"). The mdelay is triggered by HW/SW semaphores, thus it depends on the HW. I've HW that triggers it even when idle. Others may trigger it only e.g. when Ethernet ports aquire or loose the link or on ifconfig up / down. We've noticed this first from delays in frame rx/tx due to the mdelay(). Example command for checking if the issue is triggered: cyclictest -Smp1 (Look for occasional "Max:" values > 4000 or use -b 4000 to stop if greater) It was observed with I350 ports connected to other I350 ports, but not if driver and EEPROM was modified to run the I350 in EEPROM-less mode. phy_stats.idle_errors and .receive_errors (isn't touched) occupy 64 not used bits in the adapter struct: Their allocation may be removed as well. Cc: Carolyn Wyborny <carolyn.wyborny@intel.com> Cc: Todd Fujinaka <todd.fujinaka@intel.com> Fixes: 12dcd86b75d5 ("igb: fix stats handling") (this added the spin_lock) Signed-off-by: Bernhard Kaindl <bk-linux@use.startmail.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02ixgbe: delete one duplicate marcro definition of IXGBE_MAX_L2A_QUEUESEthan Zhao
There is typo in ixgbe.h, two marcro definition of IXGBE_MAX_L2A_QUEUES to 4, delete one, clear the compiler warning. Signed-off-by: Ethan Zhao <ethan.zhao@oracle.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02ixgbe: fix setting of TXDCTL.WTRHESH when ITR is set to 0 and no BQLEmil Tantilov
This patch consolidates the logic behind dynamically setting TXDCTL.WTHRESH depending on interrupt throttle rate (ITR) setting regardless of BQL. Previously TXDCTL.WTHRESH was dynamically being set only with BQL being enabled, but we have to set it regardless of BQL when ITR is low to avoid Tx stalls/hangs. CC: John Greene <jogreene@redhat.com> Reported by: Masayuki Gouji <gouji.masayuki@jp.fujitsu.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02ixgbe: remove wait loop on autoneg for copper devicesEmil Tantilov
This patch removes couple of wait loops on autoneg that are not needed. During validation we noticed that the loops always time out, so there should be no user impact. Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02ixgbe: Convert the normal transmit complete path to dev_consume_skb_any()Rick Jones
Convert the normal packet completion path to dev_consume_skb_any() so packet drop profiling via dropwatch or perf top -G -e skb_kfree_skb is not cluttered with false hits. Compile tested only. There is a dev_kfree_skb_any() in the routine ixgbe_ptp_tx_hwtstamp() in ixgbe_ptp.c that looks like a conversion candidate but I wasn't familiar enough with the code to pull the trigger. Signed-off-by: Rick Jones <rick.jones2@hp.com> Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-02Bluetooth: Fix lockdep warning with l2cap_chan_connectJohan Hedberg
The L2CAP connection's channel list lock (conn->chan_lock) must never be taken while already holding a channel lock (chan->lock) in order to avoid lock-inversion and lockdep warnings. So far the l2cap_chan_connect function has acquired the chan->lock early in the function and then later called l2cap_chan_add(conn, chan) which will try to take the conn->chan_lock. This violates the correct order of taking the locks and may lead to the following type of lockdep warnings: -> #1 (&conn->chan_lock){+.+...}: [<c109324d>] lock_acquire+0x9d/0x140 [<c188459c>] mutex_lock_nested+0x6c/0x420 [<d0aab48e>] l2cap_chan_add+0x1e/0x40 [bluetooth] [<d0aac618>] l2cap_chan_connect+0x348/0x8f0 [bluetooth] [<d0cc9a91>] lowpan_control_write+0x221/0x2d0 [bluetooth_6lowpan] -> #0 (&chan->lock){+.+.+.}: [<c10928d8>] __lock_acquire+0x1a18/0x1d20 [<c109324d>] lock_acquire+0x9d/0x140 [<c188459c>] mutex_lock_nested+0x6c/0x420 [<d0ab05fd>] l2cap_connect_cfm+0x1dd/0x3f0 [bluetooth] [<d0a909c4>] hci_le_meta_evt+0x11a4/0x1260 [bluetooth] [<d0a910eb>] hci_event_packet+0x3ab/0x3120 [bluetooth] [<d0a7cb08>] hci_rx_work+0x208/0x4a0 [bluetooth] CPU0 CPU1 ---- ---- lock(&conn->chan_lock); lock(&chan->lock); lock(&conn->chan_lock); lock(&chan->lock); Before calling l2cap_chan_add() the channel is not part of the conn->chan_l list, and can therefore only be accessed by the L2CAP user (such as l2cap_sock.c). We can therefore assume that it is the responsibility of the user to handle mutual exclusion until this point (which we can see is already true in l2cap_sock.c by it in many places touching chan members without holding chan->lock). Since the hci_conn and by exctension l2cap_conn creation in the l2cap_chan_connect() function depend on chan details we cannot simply add a mutex_lock(&conn->chan_lock) in the beginning of the function (since the conn object doesn't yet exist there). What we can do however is move the chan->lock taking later into the function where we already have the conn object and can that way take conn->chan_lock first. This patch implements the above strategy and does some other necessary changes such as using __l2cap_chan_add() which assumes conn->chan_lock is held, as well as adding a second needed label so the unlocking happens as it should. Reported-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Tested-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Acked-by: Jukka Rissanen <jukka.rissanen@linux.intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2014-10-01fm10k: Correctly set the number of Tx queuesAlexander Duyck
The number of Tx queues was not being updated due to some issues when generating the patches. This change makes sure to add the lines necessary to update the number of Tx queues correctly. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-01fm10k: Reduce buffer size when pages are larger than 4KAlexander Duyck
This change reduces the buffer size to 2K for all page sizes. The basic idea is that since most frames only have a 1500 MTU supporting a buffer size larger than this is somewhat wasteful. As such I have reduced the size to 2K for all page sizes which will allow for more uses per page. Signed-off-by: Alexander Duyck <alexander.h.duyck@intel.com> Tested-by: Aaron Brown <aaron.f.brown@intel.com> Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
2014-10-01Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Don't halt the firmware in r8152 driver, from Hayes Wang. 2) Handle full sized 802.1ad frames in bnx2 and tg3 drivers properly, from Vlad Yasevich. 3) Don't sleep while holding tx_clean_lock in netxen driver, fix from Manish Chopra. 4) Certain kinds of ipv6 routes can end up endlessly failing the route validation test, causing it to be re-looked up over and over again. This particularly kills input route caching in TCP sockets. Fix from Hannes Frederic Sowa. 5) netvsc_start_xmit() has a use-after-free access to skb->len, fix from K Y Srinivasan. 6) Fix matching of inverted containers in ematch module, from Ignacy Gawędzki. 7) Aggregation of GRO frames via SKB ->frag_list for linear skbs isn't handled properly, regression fix from Eric Dumazet. 8) Don't test return value of ipv4_neigh_lookup(), which returns an error pointer, against NULL. From WANG Cong. 9) Fix an old regression where we mistakenly allow a double add of the same tunnel. Fixes from Steffen Klassert. 10) macvtap device delete and open can run in parallel and corrupt lists etc., fix from Vlad Yasevich. 11) Fix build error with IPV6=m NETFILTER_XT_TARGET_TPROXY=y, from Pablo Neira Ayuso. 12) rhashtable_destroy() triggers lockdep splats, fix also from Pablo. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (32 commits) bna: Update Maintainer Email r8152: disable power cut for RTL8153 r8152: remove clearing bp bnx2: Correctly receive full sized 802.1ad fragmes tg3: Allow for recieve of full-size 8021AD frames r8152: fix setting RTL8152_UNPLUG netxen: Fix bug in Tx completion path. netxen: Fix BUG "sleeping function called from invalid context" ipv6: remove rt6i_genid hyperv: Fix a bug in netvsc_start_xmit() net: stmmac: fix stmmac_pci_probe failed when CONFIG_HAVE_CLK is selected ematch: Fix matching of inverted containers. gro: fix aggregation for skb using frag_list neigh: check error pointer instead of NULL for ipv4_neigh_lookup() ip6_gre: Return an error when adding an existing tunnel. ip6_vti: Return an error when adding an existing tunnel. ip6_tunnel: Return an error when adding an existing tunnel. ip6gre: add a rtnl link alias for ip6gretap net/mlx4_core: Allow not to specify probe_vf in SRIOV IB mode r8152: fix the carrier off when autoresuming ...
2014-10-01bna: Update Maintainer EmailRasesh Mody
Update the maintainer email for BNA driver. Signed-off-by: Rasesh Mody <rasesh.mody@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01net: phy: add BCM7425 and BCM7429 PHYsPetri Gynther
Signed-off-by: Petri Gynther <pgynther@google.com> Acked-by: Florian Fainelli <f.fainelli@gmai.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01net: bcmgenet: fix bcmgenet_put_tx_csum()Petri Gynther
bcmgenet_put_tx_csum() needs to return skb pointer back to the caller because it reallocates a new one in case of lack of skb headroom. Signed-off-by: Petri Gynther <pgynther@google.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01net: pktgen: packet bursting via skb->xmit_moreAlexei Starovoitov
This patch demonstrates the effect of delaying update of HW tailptr. (based on earlier patch by Jesper) burst=1 is the default. It sends one packet with xmit_more=false burst=2 sends one packet with xmit_more=true and 2nd copy of the same packet with xmit_more=false burst=3 sends two copies of the same packet with xmit_more=true and 3rd copy with xmit_more=false Performance with ixgbe (usec 30): burst=1 tx:9.2 Mpps burst=2 tx:13.5 Mpps burst=3 tx:14.5 Mpps full 10G line rate Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01net: bridge: add a br_set_state helper functionFlorian Fainelli
In preparation for being able to propagate port states to e.g: notifiers or other kernel parts, do not manipulate the port state directly, but instead use a helper function which will allow us to do a bit more than just setting the state. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01net_sched: avoid calling tcf_unbind_filter() in call_rcu callbackWANG Cong
This fixes the following crash: [ 63.976822] general protection fault: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 63.980094] CPU: 1 PID: 15 Comm: ksoftirqd/1 Not tainted 3.17.0-rc6+ #648 [ 63.980094] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 63.980094] task: ffff880117dea690 ti: ffff880117dfc000 task.ti: ffff880117dfc000 [ 63.980094] RIP: 0010:[<ffffffff817e6d07>] [<ffffffff817e6d07>] u32_destroy_key+0x27/0x6d [ 63.980094] RSP: 0018:ffff880117dffcc0 EFLAGS: 00010202 [ 63.980094] RAX: ffff880117dea690 RBX: ffff8800d02e0820 RCX: 0000000000000000 [ 63.980094] RDX: 0000000000000001 RSI: 0000000000000002 RDI: 6b6b6b6b6b6b6b6b [ 63.980094] RBP: ffff880117dffcd0 R08: 0000000000000000 R09: 0000000000000000 [ 63.980094] R10: 00006c0900006ba8 R11: 00006ba100006b9d R12: 0000000000000001 [ 63.980094] R13: ffff8800d02e0898 R14: ffffffff817e6d4d R15: ffff880117387a30 [ 63.980094] FS: 0000000000000000(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000 [ 63.980094] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 63.980094] CR2: 00007f07e6732fed CR3: 000000011665b000 CR4: 00000000000006e0 [ 63.980094] Stack: [ 63.980094] ffff88011a9cd300 ffffffff82051ac0 ffff880117dffce0 ffffffff817e6d68 [ 63.980094] ffff880117dffd70 ffffffff810cb4c7 ffffffff810cb3cd ffff880117dfffd8 [ 63.980094] ffff880117dea690 ffff880117dea690 ffff880117dfffd8 000000000000000a [ 63.980094] Call Trace: [ 63.980094] [<ffffffff817e6d68>] u32_delete_key_freepf_rcu+0x1b/0x1d [ 63.980094] [<ffffffff810cb4c7>] rcu_process_callbacks+0x3bb/0x691 [ 63.980094] [<ffffffff810cb3cd>] ? rcu_process_callbacks+0x2c1/0x691 [ 63.980094] [<ffffffff817e6d4d>] ? u32_destroy_key+0x6d/0x6d [ 63.980094] [<ffffffff810780a4>] __do_softirq+0x142/0x323 [ 63.980094] [<ffffffff810782a8>] run_ksoftirqd+0x23/0x53 [ 63.980094] [<ffffffff81092126>] smpboot_thread_fn+0x203/0x221 [ 63.980094] [<ffffffff81091f23>] ? smpboot_unpark_thread+0x33/0x33 [ 63.980094] [<ffffffff8108e44d>] kthread+0xc9/0xd1 [ 63.980094] [<ffffffff819e00ea>] ? do_wait_for_common+0xf8/0x125 [ 63.980094] [<ffffffff8108e384>] ? __kthread_parkme+0x61/0x61 [ 63.980094] [<ffffffff819e43ec>] ret_from_fork+0x7c/0xb0 [ 63.980094] [<ffffffff8108e384>] ? __kthread_parkme+0x61/0x61 tp could be freed in call_rcu callback too, the order is not guaranteed. John Fastabend says: ==================== Its worth noting why this is safe. Any running schedulers will either read the valid class field or it will be zeroed. All schedulers today when the class is 0 do a lookup using the same call used by the tcf_exts_bind(). So even if we have a running classifier hit the null class pointer it will do a lookup and get to the same result. This is particularly fragile at the moment because the only way to verify this is to audit the schedulers call sites. ==================== Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01net_sched: fix another crash in cls_tcindexWANG Cong
This patch fixes the following crash: [ 166.670795] BUG: unable to handle kernel NULL pointer dereference at (null) [ 166.674230] IP: [<ffffffff814b739f>] __list_del_entry+0x5c/0x98 [ 166.674230] PGD d0ea5067 PUD ce7fc067 PMD 0 [ 166.674230] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC [ 166.674230] CPU: 1 PID: 775 Comm: tc Not tainted 3.17.0-rc6+ #642 [ 166.674230] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 [ 166.674230] task: ffff8800d03c4d20 ti: ffff8800cae7c000 task.ti: ffff8800cae7c000 [ 166.674230] RIP: 0010:[<ffffffff814b739f>] [<ffffffff814b739f>] __list_del_entry+0x5c/0x98 [ 166.674230] RSP: 0018:ffff8800cae7f7d0 EFLAGS: 00010207 [ 166.674230] RAX: 0000000000000000 RBX: ffff8800cba8d700 RCX: ffff8800cba8d700 [ 166.674230] RDX: 0000000000000000 RSI: dead000000200200 RDI: ffff8800cba8d700 [ 166.674230] RBP: ffff8800cae7f7d0 R08: 0000000000000001 R09: 0000000000000001 [ 166.674230] R10: 0000000000000000 R11: 000000000000859a R12: ffffffffffffffe8 [ 166.674230] R13: ffff8800cba8c5b8 R14: 0000000000000001 R15: ffff8800cba8d700 [ 166.674230] FS: 00007fdb5f04a740(0000) GS:ffff88011a800000(0000) knlGS:0000000000000000 [ 166.674230] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 166.674230] CR2: 0000000000000000 CR3: 00000000cf929000 CR4: 00000000000006e0 [ 166.674230] Stack: [ 166.674230] ffff8800cae7f7e8 ffffffff814b73e8 ffff8800cba8d6e8 ffff8800cae7f828 [ 166.674230] ffffffff817caeec 0000000000000046 ffff8800cba8c5b0 ffff8800cba8c5b8 [ 166.674230] 0000000000000000 0000000000000001 ffff8800cf8e33e8 ffff8800cae7f848 [ 166.674230] Call Trace: [ 166.674230] [<ffffffff814b73e8>] list_del+0xd/0x2b [ 166.674230] [<ffffffff817caeec>] tcf_action_destroy+0x4c/0x71 [ 166.674230] [<ffffffff817ca0ce>] tcf_exts_destroy+0x20/0x2d [ 166.674230] [<ffffffff817ec2b5>] tcindex_delete+0x196/0x1b7 struct list_head can not be simply copied and we should always init it. Cc: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: John Fastabend <john.r.fastabend@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01Merge branch 'udp_gso'David S. Miller
Tom Herbert says: ==================== udp: Generalize GSO for UDP tunnels This patch set generalizes the UDP tunnel segmentation functions so that they can work with various protocol encapsulations. The primary change is to set the inner_protocol field in the skbuff when creating the encapsulated packet, and then in skb_udp_tunnel_segment this data is used to determine the function for segmenting the encapsulated packet. The inner_protocol field is overloaded to take either an Ethertype or IP protocol. The inner_protocol is set on transmit using skb_set_inner_ipproto or skb_set_inner_protocol functions. VXLAN and IP tunnels (for fou GSO) were modified to call these. Notes: - GSO for GRE/UDP where GRE checksum is enabled does not work. Handling this will require some special case code. - Software GSO now supports many varieties of encapsulation with SKB_GSO_UDP_TUNNEL{_CSUM}. We still need a mechanism to query for device support of particular combinations (I intend to add ndo_gso_check for that). - MPLS seems to be the only previous user of inner_protocol. I don't believe these patches can affect that. For supporting GSO with MPLS over UDP, the inner_protocol should be set using the helper functions in this patch. - GSO for L2TP/UDP should also be straightforward now. v2: - Respin for Eric's restructuring of skbuff. Tested GRE, IPIP, and SIT over fou as well as VLXAN. This was done using 200 TCP_STREAMs in netperf. GRE IPv4, FOU, UDP checksum enabled TCP_STREAM TSO enabled on tun interface 14.04% TX CPU utilization 13.17% RX CPU utilization 9211 Mbps TCP_STREAM TSO disabled on tun interface 27.82% TX CPU utilization 25.41% RX CPU utilization 9336 Mbps IPv4, FOU, UDP checksum disabled TCP_STREAM TSO enabled on tun interface 13.14% TX CPU utilization 23.18% RX CPU utilization 9277 Mbps TCP_STREAM TSO disabled on tun interface 30.00% TX CPU utilization 31.28% RX CPU utilization 9327 Mbps IPIP FOU, UDP checksum enabled TCP_STREAM TSO enabled on tun interface 15.28% TX CPU utilization 13.92% RX CPU utilization 9342 Mbps TCP_STREAM TSO disabled on tun interface 27.82% TX CPU utilization 25.41% RX CPU utilization 9336 Mbps FOU, UDP checksum disabled TCP_STREAM TSO enabled on tun interface 15.08% TX CPU utilization 24.64% RX CPU utilization 9226 Mbps TCP_STREAM TSO disabled on tun interface 30.00% TX CPU utilization 31.28% RX CPU utilization 9327 Mbps SIT FOU, UDP checksum enabled TCP_STREAM TSO enabled on tun interface 14.47% TX CPU utilization 14.58% RX CPU utilization 9106 Mbps TCP_STREAM TSO disabled on tun interface 31.82% TX CPU utilization 30.82% RX CPU utilization 9204 Mbps FOU, UDP checksum disabled TCP_STREAM TSO enabled on tun interface 15.70% TX CPU utilization 27.93% RX CPU utilization 9097 Mbps TCP_STREAM TSO disabled on tun interface 33.48% TX CPU utilization 37.36% RX CPU utilization 9197 Mbps VXLAN TCP_STREAM TSO enabled on tun interface 16.42% TX CPU utilization 23.66% RX CPU utilization 9081 Mbps TCP_STREAM TSO disabled on tun interface 30.32% TX CPU utilization 30.55% RX CPU utilization 9185 Mbps Baseline (no encp, TSO and LRO enabled) TCP_STREAM 11.85% TX CPU utilization 15.13% RX CPU utilization 9452 Mbps ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01vxlan: Set inner protocol before transmitTom Herbert
Call skb_set_inner_protocol to set inner Ethernet protocol to ETH_P_TEB before transmit. This is needed for GSO with UDP tunnels. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-01gre: Set inner protocol in v4 and v6 GRE transmitTom Herbert
Call skb_set_inner_protocol to set inner Ethernet protocol to protocol being encapsulation by GRE before tunnel_xmit. This is needed for GSO if UDP encapsulation (fou) is being done. Signed-off-by: Tom Herbert <therbert@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>