summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2016-04-24Linux 4.6-rc5v4.6-rc5Linus Torvalds
2016-04-24Merge branch 'mlx5-fixes'David S. Miller
Saeed Mahameed says: ==================== mlx5 driver updates and fixes Changes from V0: - Dropped: ("net/mlx5e: Reset link modes upon setting speed to zero") - Fixed compilation issue introduced to mlx5_ib driver. - Rebased to df637193906a ('Revert "Prevent NUll pointer dereference with two PHYs on cpsw"') This series has few bug fixes for mlx5 core and ethernet driver. Eli fixed a wrong static local variable declaration in flow steering API. Majd added the support of ConnectX-5 PF and VF and added the support for kernel shutdown pci callback for more robust reboot procedures. Maor fixed a soft lockup in flow steering. Rana fixed a wrog speed define in mlx5 EN driver. I also had the chance to introduce some bug fixes in mlx5 EN mtu reporting and handling. For -stable: net/mlx5_core: Fix soft lockup in steering error flow net/mlx5e: Device's mtu field is u16 and not int net/mlx5e: Fix minimum MTU net/mlx5e: Use vport MTU rather than physical port MTU ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24net/mlx5: Add pci shutdown callbackMajd Dibbiny
This patch introduces kexec support for mlx5. When switching kernels, kexec() calls shutdown, which unloads the driver and cleans its resources. In addition, remove unregister netdev from shutdown flow. This will allow a clean shutdown, even if some netdev clients did not release their reference from this netdev. Releasing The HW resources only is enough as the kernel is shutting down Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: Haggai Abramovsky <hagaya@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24net/mlx5_core: Remove static from local variableEli Cohen
The static is not required and breaks re-entrancy if it will be required. Fixes: 2530236303d9 ("net/mlx5_core: Flow steering tree initialization") Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24net/mlx5e: Use vport MTU rather than physical port MTUSaeed Mahameed
Set and report vport MTU rather than physical MTU, Driver will set both vport and physical port mtu and will rely on the query of vport mtu. SRIOV VFs have to report their MTU to their vport manager (PF), and this will allow them to work with any MTU they need without failing the request. Also for some cases where the PF is not a port owner, PF can work with MTU less than the physical port mtu if set physical port mtu didn't take effect. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24net/mlx5e: Fix minimum MTUSaeed Mahameed
Minimum MTU that can be set in Connectx4 device is 68. This fixes the case where a user wants to set invalid MTU, the driver will fail to satisfy this request and the interface will stay down. It is better to report an error and continue working with old mtu. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24net/mlx5e: Device's mtu field is u16 and not intSaeed Mahameed
For set/query MTU port firmware commands the MTU field is 16 bits, here I changed all the "int mtu" parameters of the functions wrapping those firmware commands to be u16. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24net/mlx5_core: Add ConnectX-5 to list of supported devicesMajd Dibbiny
Add the upcoming ConnectX-5 devices (PF and VF) to the list of supported devices by the mlx5 driver. Signed-off-by: Majd Dibbiny <majd@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24net/mlx5e: Fix MLX5E_100BASE_T defineRana Shahout
Bit 25 of eth_proto_capability in PTYS register is 1000Base-TT and not 100Base-T. Fixes: f62b8bb8f2d3 ('net/mlx5: Extend mlx5_core to support ConnectX-4 Ethernet functionality') Signed-off-by: Rana Shahout <ranas@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24net/mlx5_core: Fix soft lockup in steering error flowMaor Gottlieb
In the error flow of adding flow rule to auto-grouped flow table, we call to tree_remove_node. tree_remove_node locks the node's parent, however the node's parent is already locked by mlx5_add_flow_rule and this causes a deadlock. After this patch, if we failed to add the flow rule, we unlock the flow table before calling to tree_remove_node. fixes: f0d22d187473 ('net/mlx5_core: Introduce flow steering autogrouped flow table') Signed-off-by: Maor Gottlieb <maorg@mellanox.com> Reported-by: Amir Vadai <amir@vadai.me> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24qlcnic: Update version to 5.3.64Manish Chopra
Just updating the version as many fixes got accumulated over 5.3.63 Signed-off-by: Manish Chopra <manish.chopra@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24tcp-tso: do not split TSO packets at retransmit timeEric Dumazet
Linux TCP stack painfully segments all TSO/GSO packets before retransmits. This was fine back in the days when TSO/GSO were emerging, with their bugs, but we believe the dark age is over. Keeping big packets in write queues, but also in stack traversal has a lot of benefits. - Less memory overhead, because write queues have less skbs - Less cpu overhead at ACK processing. - Better SACK processing, as lot of studies mentioned how awful linux was at this ;) - Less cpu overhead to send the rtx packets (IP stack traversal, netfilter traversal, drivers...) - Better latencies in presence of losses. - Smaller spikes in fq like packet schedulers, as retransmits are not constrained by TCP Small Queues. 1 % packet losses are common today, and at 100Gbit speeds, this translates to ~80,000 losses per second. Losses are often correlated, and we see many retransmit events leading to 1-MSS train of packets, at the time hosts are already under stress. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24net: stmmac: socfpga: Remove re-registration of reset controllerMarek Vasut
Both socfpga_dwmac_parse_data() in dwmac-socfpga.c and stmmac_dvr_probe() in stmmac_main.c functions call devm_reset_control_get() to register an reset controller for the stmmac. This results in an attempt to register two reset controllers for the same non-shared reset line. The first attempt to register the reset controller works fine. The second attempt fails with warning from the reset controller core, see below. The warning is produced because the reset line is non-shared and thus it is allowed to have only up-to one reset controller associated with that reset line, not two or more. The solution has multiple parts. First, the original socfpga_dwmac_init() is tweaked to use reset controller pointer from the stmmac_priv (private data of the stmmac core) instead of the local instance, which was used before. The local re-registration of the reset controller is removed. Next, the socfpga_dwmac_init() is moved after stmmac_dvr_probe() in the probe function. This order is legal according to Altera and it makes the code much easier, since there is no need to temporarily register and unregister the reset controller ; the reset controller is already registered by the stmmac_dvr_probe(). Finally, plat_dat->exit and socfpga_dwmac_exit() is no longer necessary, since the functionality is already performed by the stmmac core. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1 at drivers/reset/core.c:187 __of_reset_control_get+0x218/0x270 Modules linked in: CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.6.0-rc4-next-20160419-00015-gabb2477-dirty #4 Hardware name: Altera SOCFPGA [<c010f290>] (unwind_backtrace) from [<c010b82c>] (show_stack+0x10/0x14) [<c010b82c>] (show_stack) from [<c0373da4>] (dump_stack+0x94/0xa8) [<c0373da4>] (dump_stack) from [<c011bcc0>] (__warn+0xec/0x104) [<c011bcc0>] (__warn) from [<c011bd88>] (warn_slowpath_null+0x20/0x28) [<c011bd88>] (warn_slowpath_null) from [<c03a6eb4>] (__of_reset_control_get+0x218/0x270) [<c03a6eb4>] (__of_reset_control_get) from [<c03a701c>] (__devm_reset_control_get+0x54/0x90) [<c03a701c>] (__devm_reset_control_get) from [<c041fa30>] (stmmac_dvr_probe+0x1b4/0x8e8) [<c041fa30>] (stmmac_dvr_probe) from [<c04298c8>] (socfpga_dwmac_probe+0x1b8/0x28c) [<c04298c8>] (socfpga_dwmac_probe) from [<c03d6ffc>] (platform_drv_probe+0x4c/0xb0) [<c03d6ffc>] (platform_drv_probe) from [<c03d54ec>] (driver_probe_device+0x224/0x2bc) [<c03d54ec>] (driver_probe_device) from [<c03d5630>] (__driver_attach+0xac/0xb0) [<c03d5630>] (__driver_attach) from [<c03d382c>] (bus_for_each_dev+0x6c/0xa0) [<c03d382c>] (bus_for_each_dev) from [<c03d4ad4>] (bus_add_driver+0x1a4/0x21c) [<c03d4ad4>] (bus_add_driver) from [<c03d60ac>] (driver_register+0x78/0xf8) [<c03d60ac>] (driver_register) from [<c0101760>] (do_one_initcall+0x40/0x170) [<c0101760>] (do_one_initcall) from [<c0800e38>] (kernel_init_freeable+0x1dc/0x27c) [<c0800e38>] (kernel_init_freeable) from [<c05d1bd4>] (kernel_init+0x8/0x114) [<c05d1bd4>] (kernel_init) from [<c01076f8>] (ret_from_fork+0x14/0x3c) ---[ end trace 059d2fbe87608fa9 ]--- Signed-off-by: Marek Vasut <marex@denx.de> Cc: Matthew Gerlach <mgerlach@opensource.altera.com> Cc: Dinh Nguyen <dinguyen@opensource.altera.com> Cc: David S. Miller <davem@davemloft.net> Tested-by: Dinh Nguyen <dinguyen@opensource.altera.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24tipc: fix stale links after re-enabling bearerParthasarathy Bhuvaragan
Commit 42b18f605fea ("tipc: refactor function tipc_link_timeout()"), introduced a bug which prevents sending of probe messages during link synchronization phase. This leads to hanging links, if the bearer is disabled/enabled after links are up. In this commit, we send the probe messages correctly. Fixes: 42b18f605fea ("tipc: refactor function tipc_link_timeout()") Acked-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: Parthasarathy Bhuvaragan <parthasarathy.bhuvaragan@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24Merge branch 'macsec-fixes'David S. Miller
Sabrina Dubroca says: ==================== macsec: a few fixes Some small fixes for the macsec driver: - possible NULL pointer dereferences - netlink dumps fixes: RTNL locking, consistent dumps - a reference counting bug - wrong name for uapi constant - a few memory leaks Patches 1 to 5 are the same as in v1, patches 6 to 9 are new. Patch 6 fixes the memleak that Lance spotted in v1. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24macsec: fix netlink attribute validationSabrina Dubroca
macsec_validate_attr should check IFLA_MACSEC_REPLAY_PROTECT (not IFLA_MACSEC_PROTECT) to verify that the replay protection and replay window arguments are correct. Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24macsec: add missing macsec prefix in uapiSabrina Dubroca
I accidentally forgot some MACSEC_ prefixes in if_macsec.h. Fixes: dece8d2b78d1 ("uapi: add MACsec bits") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24macsec: fix SA leak if initialization failsSabrina Dubroca
Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Reported-by: Lance Richardson <lrichard@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24macsec: fix memory leaks around rx_handler (un)registrationSabrina Dubroca
We leak a struct macsec_rxh_data when we unregister the rx_handler in macsec_dellink. We also leak a struct macsec_rxh_data in register_macsec_dev if we fail to register the rx_handler. Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24macsec: add consistency check to netlink dumpsSabrina Dubroca
Use genl_dump_check_consistent in dump_secy. Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Suggested-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24macsec: fix rx_sa refcounting with decrypt callbackSabrina Dubroca
The decrypt callback macsec_decrypt_done needs a reference on the rx_sa and releases it before returning, but macsec_handle_frame already put that reference after macsec_decrypt returned NULL. Set rx_sa to NULL when the decrypt callback runs so that macsec_handle_frame knows it must not release the reference. Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24macsec: don't put a NULL rxsaSabrina Dubroca
The "deliver:" path of macsec_handle_frame can be called with rx_sa == NULL. Check rx_sa != NULL before calling macsec_rxsa_put(). Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24macsec: take rtnl lock before for_each_netdevSabrina Dubroca
Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Reported-by: Johannes Berg <johannes@sipsolutions.net> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24macsec: add missing NULL check after kmallocSabrina Dubroca
Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24Merge branch 'bridge-mdb-fixes'David S. Miller
Jiri Pirko says: ==================== bridge: mdb: Couple of fixes Elad says: This patchset fixes two problems reported by Nikolay Aleksandrov. The first problem is that the MDB offload flag might be accesed without helding the multicast_lock. The second problem is that the switchdev mdb offload is deferred and the offload bit was marked regardless if the operation succeeded or not. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24bridge: mdb: Marking port-group as offloadedElad Raz
There is a race-condition when updating the mdb offload flag without using the mulicast_lock. This reverts commit 9e8430f8d60d98 ("bridge: mdb: Passing the port-group pointer to br_mdb module"). This patch marks offloaded MDB entry as "offload" by changing the port- group flags and marks it as MDB_PG_FLAGS_OFFLOAD. When switchdev PORT_MDB succeeded and adds a multicast group, a completion callback is been invoked "br_mdb_complete". The completion function locks the multicast_lock and finds the right net_bridge_port_group and marks it as offloaded. Fixes: 9e8430f8d60d98 ("bridge: mdb: Passing the port-group pointer to br_mdb module") Reported-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24bridge: mdb: Common function for mdb entry translationElad Raz
There is duplicate code that translates br_mdb_entry to br_ip let's wrap it in a common function. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24switchdev: Adding complete operation to deferred switchdev opsElad Raz
When using switchdev deferred operation (SWITCHDEV_F_DEFER), the operation is executed in different context and the application doesn't have any way to get the operation real status. Adding a completion callback fixes that. This patch adds fields to switchdev_attr and switchdev_obj "complete_priv" field which is used by the "complete" callback. Application can set a complete function which will be called once the operation executed. Signed-off-by: Elad Raz <eladr@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24MAINTAINERS: net: add entry for TI Ethernet Switch driversGrygorii Strashko
Add record for TI Ethernet Switch Driver CPSW/CPDMA/MDIO HW (am33/am43/am57/dr7/davinci) to ensure that related patches will go through dedicated linux-omap list. Also add Mugunthan as maintainer and myself as the reviewer. Cc: "David S. Miller" <davem@davemloft.net> Cc: Tony Lindgren <tony@atomide.com> Cc: Mugunthan V N <mugunthanvnm@ti.com> Cc: Richard Cochran <richardcochran@gmail.com> Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com> Acked-by: Mugunthan V N <mugunthanvnm@ti.com> Acked-by: Tony Lindgren <tony@atomide.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24Merge branch 'tcp-tcstamp_ack-frag-coalesce'David S. Miller
Martin KaFai Lau says: ==================== tcp: Handle txstamp_ack when fragmenting/coalescing skbs This patchset is to handle the txstamp-ack bit when fragmenting/coalescing skbs. The second patch depends on the recently posted series for the net branch: "tcp: Merge timestamp info when coalescing skbs" A BPF prog is used to kprobe to sock_queue_err_skb() and print out the value of serr->ee.ee_data. The BPF prog (run-able from bcc) is attached here: BPF prog used for testing: ~~~~~ from __future__ import print_function from bcc import BPF bpf_text = """ int trace_err_skb(struct pt_regs *ctx) { struct sk_buff *skb = (struct sk_buff *)ctx->si; struct sock *sk = (struct sock *)ctx->di; struct sock_exterr_skb *serr; u32 ee_data = 0; if (!sk || !skb) return 0; serr = SKB_EXT_ERR(skb); bpf_probe_read(&ee_data, sizeof(ee_data), &serr->ee.ee_data); bpf_trace_printk("ee_data:%u\\n", ee_data); return 0; }; """ b = BPF(text=bpf_text) b.attach_kprobe(event="sock_queue_err_skb", fn_name="trace_err_skb") print("Attached to kprobe") b.trace_print() ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24tcp: Merge txstamp_ack in tcp_skb_collapse_tstampMartin KaFai Lau
When collapsing skbs, txstamp_ack also needs to be merged. Retrans Collapse Test: ~~~~~~ 0.200 accept(3, ..., ...) = 4 +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0 0.200 write(4, ..., 730) = 730 +0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0 0.200 write(4, ..., 730) = 730 +0 setsockopt(4, SOL_SOCKET, 37, [2176], 4) = 0 0.200 write(4, ..., 11680) = 11680 0.200 > P. 1:731(730) ack 1 0.200 > P. 731:1461(730) ack 1 0.200 > . 1461:8761(7300) ack 1 0.200 > P. 8761:13141(4380) ack 1 0.300 < . 1:1(0) ack 1 win 257 <sack 1461:2921,nop,nop> 0.300 < . 1:1(0) ack 1 win 257 <sack 1461:4381,nop,nop> 0.300 < . 1:1(0) ack 1 win 257 <sack 1461:5841,nop,nop> 0.300 > P. 1:1461(1460) ack 1 0.400 < . 1:1(0) ack 13141 win 257 BPF Output Before: ~~~~~ <No output due to missing SCM_TSTAMP_ACK timestamp> BPF Output After: ~~~~~ <...>-2027 [007] d.s. 79.765921: : ee_data:1459 Sacks Collapse Test: ~~~~~ 0.200 accept(3, ..., ...) = 4 +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0 0.200 write(4, ..., 1460) = 1460 +0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0 0.200 write(4, ..., 13140) = 13140 +0 setsockopt(4, SOL_SOCKET, 37, [2176], 4) = 0 0.200 > P. 1:1461(1460) ack 1 0.200 > . 1461:8761(7300) ack 1 0.200 > P. 8761:14601(5840) ack 1 0.300 < . 1:1(0) ack 1 win 257 <sack 1461:14601,nop,nop> 0.300 > P. 1:1461(1460) ack 1 0.400 < . 1:1(0) ack 14601 win 257 BPF Output Before: ~~~~~ <No output due to missing SCM_TSTAMP_ACK timestamp> BPF Output After: ~~~~~ <...>-2049 [007] d.s. 89.185538: : ee_data:14599 Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Tested-by: Soheil Hassas Yeganeh <soheil@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24tcp: Carry txstamp_ack in tcp_fragment_tstampMartin KaFai Lau
When a tcp skb is sliced into two smaller skbs (e.g. in tcp_fragment() and tso_fragment()), it does not carry the txstamp_ack bit to the newly created skb if it is needed. The end result is a timestamping event (SCM_TSTAMP_ACK) will be missing from the sk->sk_error_queue. This patch carries this bit to the new skb2 in tcp_fragment_tstamp(). BPF Output Before: ~~~~~~ <No output due to missing SCM_TSTAMP_ACK timestamp> BPF Output After: ~~~~~~ <...>-2050 [000] d.s. 100.928763: : ee_data:14599 Packetdrill Script: ~~~~~~ +0 `sysctl -q -w net.ipv4.tcp_min_tso_segs=10` +0 `sysctl -q -w net.ipv4.tcp_no_metrics_save=1` +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 0.100 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 7> 0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7> 0.200 < . 1:1(0) ack 1 win 257 0.200 accept(3, ..., ...) = 4 +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0 +0 setsockopt(4, SOL_SOCKET, 37, [2688], 4) = 0 0.200 write(4, ..., 14600) = 14600 +0 setsockopt(4, SOL_SOCKET, 37, [2176], 4) = 0 0.200 > . 1:7301(7300) ack 1 0.200 > P. 7301:14601(7300) ack 1 0.300 < . 1:1(0) ack 14601 win 257 0.300 close(4) = 0 0.300 > F. 14601:14601(0) ack 1 0.400 < F. 1:1(0) ack 16062 win 257 0.400 > . 14602:14602(0) ack 2 Signed-off-by: Martin KaFai Lau <kafai@fb.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Soheil Hassas Yeganeh <soheil@google.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Tested-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-nextDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following patchset contains Netfilter updates for your net-next tree, mostly from Florian Westphal to sort out the lack of sufficient validation in x_tables and connlabel preparation patches to add nf_tables support. They are: 1) Ensure we don't go over the ruleset blob boundaries in mark_source_chains(). 2) Validate that target jumps land on an existing xt_entry. This extra sanitization comes with a performance penalty when loading the ruleset. 3) Introduce xt_check_entry_offsets() and use it from {arp,ip,ip6}tables. 4) Get rid of the smallish check_entry() functions in {arp,ip,ip6}tables. 5) Make sure the minimal possible target size in x_tables. 6) Similar to #3, add xt_compat_check_entry_offsets() for compat code. 7) Check that standard target size is valid. 8) More sanitization to ensure that the target_offset field is correct. 9) Add xt_check_entry_match() to validate that matches are well-formed. 10-12) Three patch to reduce the number of parameters in translate_compat_table() for {arp,ip,ip6}tables by using a container structure. 13) No need to return value from xt_compat_match_from_user(), so make it void. 14) Consolidate translate_table() so it can be used by compat code too. 15) Remove obsolete check for compat code, so we keep consistent with what was already removed in the native layout code (back in 2007). 16) Get rid of target jump validation from mark_source_chains(), obsoleted by #2. 17) Introduce xt_copy_counters_from_user() to consolidate counter copying, and use it from {arp,ip,ip6}tables. 18,22) Get rid of unnecessary explicit inlining in ctnetlink for dump functions. 19) Move nf_connlabel_match() to xt_connlabel. 20) Skip event notification if connlabel did not change. 21) Update of nf_connlabels_get() to make the upcoming nft connlabel support easier. 23) Remove spinlock to read protocol state field in conntrack. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23Merge branch 'fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal Pull thermal fixes from Eduardo Valentin: "Specifics in this pull request: - Fixes in mediatek and OF thermal drivers - Fixes in power_allocator governor - More fixes of unsigned to int type change in thermal_core.c. These change have been CI tested using KernelCI bot. \o/" * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal: thermal: fix Mediatek thermal controller build thermal: consistently use int for trip temp thermal: fix mtk_thermal build dependency thermal: minor mtk_thermal.c cleanups thermal: power_allocator: req_range multiplication should be a 64 bit type thermal: of: add __init attribute
2016-04-23Merge branch 'nla_align-more'David S. Miller
Nicolas Dichtel says: ==================== netlink: align attributes when needed (patchset #1) This is the continuation of the work done to align netlink attributes when these attributes contain some 64-bit fields. David, if the third patch is too big (or maybe the series), I can split it. Just tell me what you prefer. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23taskstats: use the libnl API to align nlattr on 64-bitNicolas Dichtel
Goal of this patch is to use the new libnl API to align netlink attribute when needed. The layout of the netlink message will be a bit different after the patch, because the padattr (TASKSTATS_TYPE_STATS) will be inside the nested attribute instead of before it. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23xfrm: align nlattr properly when neededNicolas Dichtel
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: add nla_put_u64_64bit() helperNicolas Dichtel
With this function, nla_data() is aligned on a 64-bit area. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: nla_put_msecs(): align on a 64-bit areaNicolas Dichtel
nla_data() is now aligned on a 64-bit area. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: nla_put_s64(): align on a 64-bit areaNicolas Dichtel
nla_data() is now aligned on a 64-bit area. In fact, there is no user of this function. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: nla_put_net64(): align on a 64-bit areaNicolas Dichtel
nla_data() is now aligned on a 64-bit area. The temporary function nla_put_be64_32bit() is removed in this patch. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: nla_put_be64(): align on a 64-bit areaNicolas Dichtel
nla_data() is now aligned on a 64-bit area. A temporary version (nla_put_be64_32bit()) is added for nla_put_net64(). This function is removed in the next patch. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: nla_put_le64(): align on a 64-bit areaNicolas Dichtel
nla_data() is now aligned on a 64-bit area. Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23libnl: fix help of _64bit functionsNicolas Dichtel
Fix typo and describe 'padattr'. Fixes: 089bf1a6a924 ("libnl: add more helpers to align attributes on 64-bit") Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts were two cases of simple overlapping changes, nothing serious. In the UDP case, we need to add a hlist_add_tail_rcu() to linux/rculist.h, because we've moved UDP socket handling away from using nulls lists. Signed-off-by: David S. Miller <davem@davemloft.net>
2016-04-23Merge tag 'asm-generic-4.6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull asm-generic update from Arnd Bergmann: "Here is one patch to wire up the preadv/pwritev system calls in the generic system call table, which is required for all architectures that were merged in the last few years, including arm64. Usually these get merged along with the syscall implementation or one of the architecture trees, but this time that did not happen. Andre and Christoph both sent a version of this patch, I picked the one I got first" * tag 'asm-generic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: generic syscalls: wire up preadv2 and pwritev2 syscalls
2016-04-23generic syscalls: wire up preadv2 and pwritev2 syscallsAndre Przywara
These new syscalls are implemented as generic code, so enable them for architectures like arm64 which use the generic syscall table. Signed-off-by: Andre Przywara <andre.przywara@arm.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-04-23Merge tag 'qcom-fixes-for-4.6-rc2' of ↵Arnd Bergmann
git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux into fixes Merge "Qualcomm Fixes for v4.6-rc2" from Andy Gross: * Revert BAM usage on MSM8974 boards * tag 'qcom-fixes-for-4.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/agross/linux: Revert "dts: msm8974: Add dma channels for blsp2_i2c1 node" Revert "dts: msm8974: Add blsp2_bam dma node"
2016-04-23arm64: dts: uniphier: fix I2C nodes of PH1-LD20Masahiro Yamada
The I2C hardware blocks on this SoC are connected as follows: I2C0: external connection I2C1: external connection I2C2: internal connection I2C3: external connection I2C4: external connection I2C5: internal connection I2C6: no connection (not accessible) Delete pinctrl from Ch2, add pinctrl to Ch4, and remove the Ch6 node. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2016-04-23Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes: two EDAC driver fixes, a Xen crash fix, a HyperV log spam fix and a documentation fix" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86 EDAC, sb_edac.c: Take account of channel hashing when needed x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address x86/mm/xen: Suppress hugetlbfs in PV guests x86/doc: Correct limits in Documentation/x86/x86_64/mm.txt x86/hyperv: Avoid reporting bogus NMI status for Gen2 instances