summaryrefslogtreecommitdiff
path: root/drivers/net/bonding/bond_main.c
AgeCommit message (Collapse)Author
2014-03-02bonding: send arp requests even if there's no route to themVeaceslav Falico
Currently we're only sending arp requests if we have a route to the target (and, thus, can find out the source ip address). There are some use cases, however, where we don't want/need to set an ip address (or set up a specific route) for bonding to use arp monitoring *for traffic generation*. We can easily send arp probes (arp requests with src ip == 0) to generate arp broadcast responses from the target ip and use them for determining if the target is up. This, obviously, won't work with arp validation - because we don't have the ip address set and, thus, will filter out the responses. So in that case - print a warning. CC: François CACHEREUL <f.cachereul@alphalink.fr> CC: Zhenjie Chen <zhchen@redhat.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26bonding: disallow enslaving a bond to itselfJiri Bohac
Enslaving a bond to itself leads to an endless loop and hangs the kernel. Signed-off-by: Jiri Bohac <jbohac@suse.cz> Tested-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26bonding: fix a div error caused by the slave release pathNikolay Aleksandrov
There's a bug in the slave release function which leads the transmit functions which use the bond->slave_cnt to a div by 0 because we might just have released our last slave and made slave_cnt == 0 but at the same time we may have a transmitter after the check for an empty list which will fetch it and use it in the slave id calculation. Fix it by moving the slave_cnt after synchronize_rcu so if this was our last slave any new transmitters will see an empty slave list which is checked after rcu lock but before calling the mode transmit functions which rely on bond->slave_cnt. Fixes: 278b208375 ("bonding: initial RCU conversion") CC: Veaceslav Falico <vfalico@redhat.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: Jay Vosburgh <fubar@us.ibm.com> CC: David S. Miller <davem@davemloft.net> Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for ab arp monitordingtianhong
Veaceslav has reported and fix this problem by commit f2ebd477f141bc0 (bonding: restructure locking of bond_ab_arp_probe()). According Jay's opinion, the current solution is not very well, because the notification is to indicate that the interface has actually changed state in a meaningful way, but these calls in the ab ARP monitor are internal settings of the flags to allow the ARP monitor to search for a slave to become active when there are no active slaves. The flag setting to active or backup is to permit the ARP monitor's response logic to do the right thing when deciding if the test slave (current_arp_slave) is up or not. So the best way to fix the problem is that we should not send a notification when the slave is in testing state, and check the state at the end of the monitor, if the slave's state recover, avoid to send pointless notification twice. And RTNL is really a big lock, hold it regardless the slave's state changed or not when the current_active_slave is null will loss performance (every 100ms), so we should hold it only when the slave's state changed and need to notify. I revert the old commit and add new modifications. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-26bonding: Fix RTNL: assertion failed at net/core/rtnetlink.c for 802.3ad modedingtianhong
The problem was introduced by the commit 1d3ee88ae0d (bonding: add netlink attributes to slave link dev). The bond_set_active_slave() and bond_set_backup_slave() will use rtmsg_ifinfo to send slave's states, so these two functions should be called in RTNL. In 802.3ad mode, acquiring RTNL for the __enable_port and __disable_port cases is difficult, as those calls generally already hold the state machine lock, and cannot unconditionally call rtnl_lock because either they already hold RTNL (for calls via bond_3ad_unbind_slave) or due to the potential for deadlock with bond_3ad_adapter_speed_changed, bond_3ad_adapter_duplex_changed, bond_3ad_link_change, or bond_3ad_update_lacp_rate. All four of those are called with RTNL held, and acquire the state machine lock second. The calling contexts for __enable_port and __disable_port already hold the state machine lock, and may or may not need RTNL. According to the Jay's opinion, I don't think it is a problem that the slave don't send notify message synchronously when the status changed, normally the state machine is running every 100 ms, send the notify message at the end of the state machine if the slave's state changed should be better. I fix the problem through these steps: 1). add a new function bond_set_slave_state() which could change the slave's state and call rtmsg_ifinfo() according to the input parameters called notify. 2). Add a new slave parameter which called should_notify, if the slave's state changed and don't notify yet, the parameter will be set to 1, and then if the slave's state changed again, the param will be set to 0, it indicate that the slave's state has been restored, no need to notify any one. 3). the __enable_port and __disable_port should not call rtmsg_ifinfo in the state machine lock, any change in the state of slave could set a flag in the slave, it will indicated that an rtmsg_ifinfo should be called at the end of the state machine. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24bonding: remove no longer needed lock for bond_xxx_info_query()dingtianhong
The bond_xxx_info_query() was already in RTNL, so no need to use bond lock to protect the bond slave list, so remove it. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-24bonding: netpoll: remove unwanted slave_dev_support_netpoll()dingtianhong
The __netpoll_setup() will check the slave's flag and ndo_poll_controller just like the slave_dev_support_netpoll() does, and slave_dev_support_netpoll() was not used by any place, so remove it. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-20bonding: fix bond_arp_rcv() race of curr_active_slaveVeaceslav Falico
bond->curr_active_slave can be changed between its deferences, even to NULL, and thus we might panic. We're always holding the rcu (rx_handler->bond_handle_frame()->bond_arp_rcv()) so fix this by rcu_dereferencing() it and using the saved. Reported-by: Ding Tianhong <dingtianhong@huawei.com> Fixes: aeea64a ("bonding: don't trust arp requests unless active slave really works") CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-19bonding: More use of ether_addr_copyJoe Perches
It's smaller and faster for some architectures. Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
Conflicts: drivers/net/bonding/bond_3ad.h drivers/net/bonding/bond_main.c Two minor conflicts in bonding, both of which were overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18bonding: rename last_arp_rx to last_rxVeaceslav Falico
To reflect the new meaning. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18bonding: trivial: rename slave->jiffies to ->last_link_upVeaceslav Falico
slave->jiffies is updated every time the slave becomes active, which, for bonding, means that its link is 'up'. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18bonding: remove useless updating of slave->dev->last_rxVeaceslav Falico
Now that all the logic is handled via last_arp_rx, we don't need to use last_rx. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> CC: "David S. Miller" <davem@davemloft.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18bonding: use last_arp_rx in bond_loadbalance_arp_mon()Veaceslav Falico
Now that last_arp_rx correctly show the last time we've received an ARP, we can use it safely instead of slave->dev->last_rx. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18bonding: use the new options to correctly set last_arp_rxVeaceslav Falico
Now that the options are in place - arp_validate can be set to receive all the traffic or only arp packets to verify if the slave is up, when the slave isn't validated. CC: Rob Landley <rob@landley.net> CC: "David S. Miller" <davem@davemloft.net> CC: Nikolay Aleksandrov <nikolay@redhat.com> CC: Ding Tianhong <dingtianhong@huawei.com> CC: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18bonding: always set recv_probe to bond_arp_rcv in arp monitorVeaceslav Falico
Currently we only set bond_arp_rcv() if we're using arp_validate, however this makes us skip updating last_arp_rx if we're not validating incoming ARPs - thus, if arp_validate is off, last_arp_rx will never be updated. Fix this by always setting up recv_probe = bond_arp_rcv, even if we're not using arp_validate. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18bonding: always update last_arp_rx on packet recieveVeaceslav Falico
Currently we're updating the last_arp_rx only when we've validate the packet, however afterwards we use it as 'ANY last packet received', but not only validated ARPs. Fix this by updating it in case of any packet received. It won't break the arp_validation=0 because we, anyway, return the correct slave->dev->last_rx in slave_last_rx(). CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18bonding: permit using arp_validate with non-ab modesVeaceslav Falico
Currently it's disabled because it's sometimes hard, in typical configs, to make it work - because of the nature how the loadbalance modes work - as it's hard to deliver valid arp replies to correct slaves by the switch. However we still can use arp_validation in loadbalance with several other configs, per example with arp_validate == 2 for backup with one broadcast domain, without the switch(es) doing any balancing - this way we'd be (a bit more) sure that the slave is up. So, enable it to let users decide which one works/suits them best. Also correct the mode limitation from BOND_OPT_ARP_VALIDATE. CC: Nikolay Aleksandrov <nikolay@redhat.com> CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-18bonding: remove bond->lock from bond_arp_rcvVeaceslav Falico
We're always called with rcu_read_lock() held (bond_arp_rcv() is only called from bond_handle_frame(), which is rx_handler and always called under rcu from __netif_receive_skb_core() ). The slave active/passive and/or bonding params can change in-flight, however we don't really care about that - we only modify the last time packet was received, which is harmless. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Acked-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17netdevice: add queue selection fallback handler for ndo_select_queueDaniel Borkmann
Add a new argument for ndo_select_queue() callback that passes a fallback handler. This gets invoked through netdev_pick_tx(); fallback handler is currently __netdev_pick_tx() as most drivers invoke this function within their customized implementation in case for skbs that don't need any special handling. This fallback handler can then be replaced on other call-sites with different queue selection methods (e.g. in packet sockets, pktgen etc). This also has the nice side-effect that __netdev_pick_tx() is then only invoked from netdev_pick_tx() and export of that function to modules can be undone. Suggested-by: David S. Miller <davem@davemloft.net> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17bonding: Convert memcpy(foo, bar, ETH_ALEN) to ether_addr_copy(foo, bar)Joe Perches
ether_addr_copy is smaller and faster for some architectures. This relies on a stack frame being at least __aligned(2) for one use of an Ethernet address on the stack. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17bonding: Neaten pr_<level>Joe Perches
Add missing terminating newlines. Convert uses of pr_info to pr_cont in bond_check_params. Standardize upper/lower case styles. Typo fixes, remove unnecessary parentheses and periods. Alignment neatening. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-17bonding: Convert pr_warning to pr_warn, neateningJoe Perches
Use more current logging style. Coalesce formats, realign arguments, drop unnecessary periods. Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13bonding: remove the redundant judgements for bond_set_mac_address()dingtianhong
The dev_set_mac_address() will check the dev->netdev_ops->ndo_set_mac_address, so no need to check it in bond_set_mac_address(). Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Reviewed-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-13bonding: Fix deadlock in bonding driver when using netpolldingtianhong
The bonding driver take write locks and spin locks that are shared by the tx path in enslave processing and notification processing, If the netconsole is in use, the bonding can call printk which puts us in the netpoll tx path, if the netconsole is attached to the bonding driver, result in deadlock. So add protection for these place, by checking the netpoll_block_tx state, we can defer the sending of the netconsole frames until a later time using the retransmit feature of netpoll_send_skb that is triggered on the return code NETDEV_TX_BUSY. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-10bonding: remove unwanted bond lock for enslave processingdingtianhong
The bond enslave processing don't hold bond->lock anymore, so release an unlocked rw lock will cause warning message, remove the unwanted read_unlock(&bond->lock). Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Acked-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-04bonding: fail_over_mac should only affect AB mode in bond_set_mac_address()dingtianhong
The fail_over_mac could be set to active or follow in any time for all modes, so if the fail_over_mac is not none and the current mode is not active-backup, the bond_set_mac_address() could not change the master and slave's MAC address. In bond_set_mac_address(), the fail_over_mac should only affect AB mode, so modify to check the mode in addition to fail_over_mac when setting bond's MAC address. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-02-04bonding: fail_over_mac should only affect AB mode at enslave and removal ↵dingtianhong
processing According to bonding.txt, the fail_over_ma should only affect active-backup mode, but I found that the fail_over_mac could be set to active or follow in all modes, this will cause new slave could not be set to bond's MAC address at enslave processing and restore its own MAC address at removal processing. The correct way to fix the problem is that we should not add restrictions when setting options, just need to modify the bond enslave and removal processing to check the mode in addition to fail_over_mac when setting a slave's MAC during enslavement. The change active slave processing already only calls the fail_over_mac function when in active-backup mode. Thanks for Jay's suggestion. The patch also modify the pr_warning() to pr_warn(). Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-28bonding: fix locking in bond_loadbalance_arp_mon()Ding Tianhong
The commit 1d3ee88ae0d605629bf369 (bonding: add netlink attributes to slave link dev) has add rtmsg_ifinfo() in bond_set_active_slave() and bond_set_backup_slave(), so the two function need to called in RTNL lock, but bond_loadbalance_arp_mon() only calling these functions in RCU, warning message will occurs. fix this by add a new function bond_slave_state_change(), which will reset the slave's state after slave link check, so remove the bond_set_xxx_slave() from the cycle and only record the slave_state_changed, this will call the new function to set all slaves to new state in RTNL later. Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Ding Tianhong <dingtianhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27bonding: restructure locking of bond_ab_arp_probe()Veaceslav Falico
Currently we're calling it from under RCU context, however we're using some functions that require rtnl to be held. Fix this by restructuring the locking - don't call it under any locks, aquire rcu_read_lock() if we're sending _only_ (i.e. we have the active slave present), and use rtnl locking otherwise - if we need to modify (in)active flags of a slave. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-27bonding: RCUify bond_ab_arp_probeVeaceslav Falico
Currently bond_ab_arp_probe() is always called under rcu_read_lock(), however to work with curr_active_slave we're still holding the curr_slave_lock. To remove that curr_slave_lock - rcu_dereference the bond's curr_active_slave and use it further - so that we're sure the slave won't go away, and we don't care if it will change in the meanwhile. CC: Jay Vosburgh <fubar@us.ibm.com> CC: Andy Gospodarek <andy@greyhouse.net> Signed-off-by: Veaceslav Falico <vfalico@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-25Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: 1) BPF debugger and asm tool by Daniel Borkmann. 2) Speed up create/bind in AF_PACKET, also from Daniel Borkmann. 3) Correct reciprocal_divide and update users, from Hannes Frederic Sowa and Daniel Borkmann. 4) Currently we only have a "set" operation for the hw timestamp socket ioctl, add a "get" operation to match. From Ben Hutchings. 5) Add better trace events for debugging driver datapath problems, also from Ben Hutchings. 6) Implement auto corking in TCP, from Eric Dumazet. Basically, if we have a small send and a previous packet is already in the qdisc or device queue, defer until TX completion or we get more data. 7) Allow userspace to manage ipv6 temporary addresses, from Jiri Pirko. 8) Add a qdisc bypass option for AF_PACKET sockets, from Daniel Borkmann. 9) Share IP header compression code between Bluetooth and IEEE802154 layers, from Jukka Rissanen. 10) Fix ipv6 router reachability probing, from Jiri Benc. 11) Allow packets to be captured on macvtap devices, from Vlad Yasevich. 12) Support tunneling in GRO layer, from Jerry Chu. 13) Allow bonding to be configured fully using netlink, from Scott Feldman. 14) Allow AF_PACKET users to obtain the VLAN TPID, just like they can already get the TCI. From Atzm Watanabe. 15) New "Heavy Hitter" qdisc, from Terry Lam. 16) Significantly improve the IPSEC support in pktgen, from Fan Du. 17) Allow ipv4 tunnels to cache routes, just like sockets. From Tom Herbert. 18) Add Proportional Integral Enhanced packet scheduler, from Vijay Subramanian. 19) Allow openvswitch to mmap'd netlink, from Thomas Graf. 20) Key TCP metrics blobs also by source address, not just destination address. From Christoph Paasch. 21) Support 10G in generic phylib. From Andy Fleming. 22) Try to short-circuit GRO flow compares using device provided RX hash, if provided. From Tom Herbert. The wireless and netfilter folks have been busy little bees too. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (2064 commits) net/cxgb4: Fix referencing freed adapter ipv6: reallocate addrconf router for ipv6 address when lo device up fib_frontend: fix possible NULL pointer dereference rtnetlink: remove IFLA_BOND_SLAVE definition rtnetlink: remove check for fill_slave_info in rtnl_have_link_slave_info qlcnic: update version to 5.3.55 qlcnic: Enhance logic to calculate msix vectors. qlcnic: Refactor interrupt coalescing code for all adapters. qlcnic: Update poll controller code path qlcnic: Interrupt code cleanup qlcnic: Enhance Tx timeout debugging. qlcnic: Use bool for rx_mac_learn. bonding: fix u64 division rtnetlink: add missing IFLA_BOND_AD_INFO_UNSPEC sfc: Use the correct maximum TX DMA ring size for SFC9100 Add Shradha Shah as the sfc driver maintainer. net/vxlan: Share RX skb de-marking and checksum checks with ovs tulip: cleanup by using ARRAY_SIZE() ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called net/cxgb4: Don't retrieve stats during recovery ...
2014-01-22bonding: Don't allow bond devices to change network namespaces.Weilong Chen
Like bridge, bonding as netdevice doesn't cross netns boundaries. Bonding ports and bonding itself live in same netns. Signed-off-by: Weilong Chen <chenweilong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert netlink to use slave data info apiJiri Pirko
Signed-off-by: Jiri Pirko <jiri@resnulli.us> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert active_slave to use the new option APINikolay Aleksandrov
This patch adds the necessary changes so active_slave would use the new bonding option API. Also some trivial/style fixes. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert primary_reselect to use the new option APINikolay Aleksandrov
This patch adds the necessary changes so primary_reselect would use the new bonding option API. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert miimon to use the new option APINikolay Aleksandrov
This patch adds the necessary changes so miimon would use the new bonding option API. The "default" definition has been removed as it was 0. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert ad_select to use the new option APINikolay Aleksandrov
This patch adds the necessary changes so ad_select would use the new bonding option API. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert lacp_rate to use the new option APINikolay Aleksandrov
This patch adds the necessary changes so lacp_rate would use the new bonding option API. Also some trivial/style error fixes. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert arp_interval to use the new option APINikolay Aleksandrov
This patch adds the necessary changes so arp_interval would use the new bonding option API. The "default" definition has been removed as it was 0. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert fail_over_mac to use the new option APINikolay Aleksandrov
This patch adds the necessary changes so fail_over_mac would use the new bonding option API. Also fixes a trivial copy/paste error in bond_check_params where the wrong variable was used for the error msg. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert arp_all_targets to use the new option APINikolay Aleksandrov
This patch adds the necessary changes so arp_all_targets would use the new bonding option API. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert arp_validate to use the new option APINikolay Aleksandrov
This patch adds the necessary changes so arp_validate would use the new bonding option API. Also fix some trivial/style errors. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert xmit_hash_policy to use the new option APINikolay Aleksandrov
This patch adds the necessary changes so xmit_hash_policy would use the new bonding option API. Also fix some trivial/style errors. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert packets_per_slave to use the new option APINikolay Aleksandrov
This patch adds the necessary changes so packets_per_slave would use the new bonding option API. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-22bonding: convert mode setting to use the new option APINikolay Aleksandrov
This patch makes the bond's mode setting use the new option API and adds support for dependency printing which relies on having an entry for the mode option in the bond_opts[] array. Also add the ability to print the mode name when mode dependency fails and fix some trivial/style errors. Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-21reciprocal_divide: update/correction of the algorithmHannes Frederic Sowa
Jakub Zawadzki noticed that some divisions by reciprocal_divide() were not correct [1][2], which he could also show with BPF code after divisions are transformed into reciprocal_value() for runtime invariance which can be passed to reciprocal_divide() later on; reverse in BPF dump ended up with a different, off-by-one K in some situations. This has been fixed by Eric Dumazet in commit aee636c4809fa5 ("bpf: do not use reciprocal divide"). This follow-up patch improves reciprocal_value() and reciprocal_divide() to work in all cases by using Granlund and Montgomery method, so that also future use is safe and without any non-obvious side-effects. Known problems with the old implementation were that division by 1 always returned 0 and some off-by-ones when the dividend and divisor where very large. This seemed to not be problematic with its current users, as far as we can tell. Eric Dumazet checked for the slab usage, we cannot surely say so in the case of flex_array. Still, in order to fix that, we propose an extension from the original implementation from commit 6a2d7a955d8d resp. [3][4], by using the algorithm proposed in "Division by Invariant Integers Using Multiplication" [5], Torbjörn Granlund and Peter L. Montgomery, that is, pseudocode for q = n/d where q, n, d is in u32 universe: 1) Initialization: int l = ceil(log_2 d) uword m' = floor((1<<32)*((1<<l)-d)/d)+1 int sh_1 = min(l,1) int sh_2 = max(l-1,0) 2) For q = n/d, all uword: uword t = (n*m')>>32 q = (t+((n-t)>>sh_1))>>sh_2 The assembler implementation from Agner Fog [6] also helped a lot while implementing. We have tested the implementation on x86_64, ppc64, i686, s390x; on x86_64/haswell we're still half the latency compared to normal divide. Joint work with Daniel Borkmann. [1] http://www.wireshark.org/~darkjames/reciprocal-buggy.c [2] http://www.wireshark.org/~darkjames/set-and-dump-filter-k-bug.c [3] https://gmplib.org/~tege/division-paper.pdf [4] http://homepage.cs.uiowa.edu/~jones/bcd/divide.html [5] http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1.2556 [6] http://www.agner.org/optimize/asmlib.zip Reported-by: Jakub Zawadzki <darkjames-ws@darkjames.pl> Cc: Eric Dumazet <eric.dumazet@gmail.com> Cc: Austin S Hemmelgarn <ahferroin7@gmail.com> Cc: linux-kernel@vger.kernel.org Cc: Jesse Gross <jesse@nicira.com> Cc: Jamal Hadi Salim <jhs@mojatatu.com> Cc: Stephen Hemminger <stephen@networkplumber.org> Cc: Matt Mackall <mpm@selenic.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: Christoph Lameter <cl@linux-foundation.org> Cc: Andy Gospodarek <andy@greyhouse.net> Cc: Veaceslav Falico <vfalico@redhat.com> Cc: Jay Vosburgh <fubar@us.ibm.com> Cc: Jakub Zawadzki <darkjames-ws@darkjames.pl> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-20Merge branch 'core-rcu-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull RCU updates from Ingo Molnar: - add RCU torture scripts/tooling - static analysis improvements - update RCU documentation - miscellaneous fixes * 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits) rcu: Remove "extern" from function declarations in kernel/rcu/rcu.h rcu: Remove "extern" from function declarations in include/linux/*rcu*.h rcu/torture: Dynamically allocate SRCU output buffer to avoid overflow rcu: Don't activate RCU core on NO_HZ_FULL CPUs rcu: Warn on allegedly impossible rcu_read_unlock_special() from irq rcu: Add an RCU_INITIALIZER for global RCU-protected pointers rcu: Make rcu_assign_pointer's assignment volatile and type-safe bonding: Use RCU_INIT_POINTER() for better overhead and for sparse rcu: Add comment on evaluate-once properties of rcu_assign_pointer(). rcu: Provide better diagnostics for blocking in RCU callback functions rcu: Improve SRCU's grace-period comments rcu: Fix CONFIG_RCU_FANOUT_EXACT for odd fanout/leaf values rcu: Fix coccinelle warnings rcutorture: Stop tracking FSF's postal address rcutorture: Move checkarg to functions.sh rcutorture: Flag errors and warnings with color coding rcutorture: Record results from repeated runs of the same test scenario rcutorture: Test summary at end of run with less chattiness rcutorture: Update comment in kvm.sh listing typical RCU trace events rcutorture: Add tracing-enabled version of TREE08 ...
2014-01-17bonding: add netlink attributes to slave link devsfeldma@cumulusnetworks.com
If link is IFF_SLAVE, extend link dev netlink attributes to include slave attributes with new IFLA_SLAVE nest. Add netlink notification (RTM_NEWLINK) when slave status changes from backup to active, or visa-versa. Adds new ndo_get_slave op to net_device_ops to fill skb with IFLA_SLAVE attributes. Currently only used by bonding driver, but could be used by other aggregating devices with slaves. Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-17bonding: add sysfs /slave dir for bond slave devices.sfeldma@cumulusnetworks.com
Add sub-directory under /sys/class/net/<interface>/slave with read-only attributes for slave. Directory only appears when <interface> is a slave. $ tree /sys/class/net/eth2/slave/ /sys/class/net/eth2/slave/ ├── ad_aggregator_id ├── link_failure_count ├── mii_status ├── perm_hwaddr ├── queue_id └── state $ cat /sys/class/net/eth2/slave/* 2 0 up 40:02:10:ef:06:01 0 active Signed-off-by: Scott Feldman <sfeldma@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>