summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-03-31ipv6: move ip6_dst_ops first in netns_ipv6Eric Dumazet
ip6_dst_ops have cache line alignement. Moving it at beginning of netns_ipv6 removes a 48 byte hole, and shrinks netns_ipv6 from 12 to 11 cache lines. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31ipv6: convert elligible sysctls to u8Eric Dumazet
Convert most sysctls that can fit in a byte. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31tcp: convert tcp_comp_sack_nr sysctl to u8Eric Dumazet
tcp_comp_sack_nr max value was already 255. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31ipv4: convert igmp_link_local_mcast_reports sysctl to u8Eric Dumazet
This sysctl is a bool, can use less storage. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31ipv4: convert fib_multipath_{use_neigh|hash_policy} sysctls to u8Eric Dumazet
Make room for better packing of netns_ipv4 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31ipv4: convert udp_l3mdev_accept sysctl to u8Eric Dumazet
Reduce footprint of sysctls. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31ipv4: convert fib_notify_on_flag_change sysctl to u8Eric Dumazet
Reduce footprint of sysctls. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31inet: shrink netns_ipv4 by another cache lineEric Dumazet
By shuffling around some fields to remove 8 bytes of hole, we can save one cache line. pahole result before/after the patch : /* size: 768, cachelines: 12, members: 139 */ /* sum members: 673, holes: 11, sum holes: 39 */ /* padding: 56 */ /* paddings: 2, sum paddings: 7 */ /* forced alignments: 1 */ -> /* size: 704, cachelines: 11, members: 139 */ /* sum members: 673, holes: 10, sum holes: 31 */ /* paddings: 2, sum paddings: 7 */ /* forced alignments: 1 */ Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31inet: shrink inet_timewait_death_row by 48 bytesEric Dumazet
struct inet_timewait_death_row uses two cache lines, because we want tw_count to use a full cache line to avoid false sharing. Rework its definition and placement in netns_ipv4 so that: 1) We add 60 bytes of padding after tw_count to avoid false sharing, knowing that tcp_death_row will have ____cacheline_aligned_in_smp attribute. 2) We do not risk padding before tcp_death_row, because we move it at the beginning of netns_ipv4, even if new fields are added later. 3) We do not waste 48 bytes of padding after it. Note that I have not changed dccp. pahole result for struct netns_ipv4 before/after the patch : /* size: 832, cachelines: 13, members: 139 */ /* sum members: 721, holes: 12, sum holes: 95 */ /* padding: 16 */ /* paddings: 2, sum paddings: 55 */ -> /* size: 768, cachelines: 12, members: 139 */ /* sum members: 673, holes: 11, sum holes: 39 */ /* padding: 56 */ /* paddings: 2, sum paddings: 7 */ /* forced alignments: 1 */ Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31Merge branch 'net-coding-style'David S. Miller
Weihang Li says: ==================== net: fix some coding style issues Do some cleanups according to the coding style of kernel, including wrong print type, redundant and missing spaces and so on. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31net: lpc_eth: fix format warnings of block commentsYangyang Li
Fix the following format warning: 1. Block comments use * on subsequent lines 2. Block comments use a trailing */ on a separate line Signed-off-by: Yangyang Li <liyangyang20@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31net: toshiba: fix the trailing format of some block commentsYixing Liu
Use a trailling */ on a separate line for block comments. Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31net: ocelot: fix a trailling format issue with block commentsYixing Liu
Use a tralling */ on a separate line for block comments. Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31net: amd: correct some format issuesYixing Liu
There should be a blank line after declarations. Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31net: amd8111e: fix inappropriate spacesYixing Liu
Delete unncecessary spaces and add some reasonable spaces according to the coding-style of kernel. Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31net: ena: remove extra words from commentsYixing Liu
Remove the redundant "for" from the commment. Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31net: ena: fix inaccurate print typeYixing Liu
Use "%u" to replace "hu%". Signed-off-by: Yixing Liu <liuyixing1@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31qrtr: Convert qrtr_ports from IDR to XArrayMatthew Wilcox (Oracle)
The XArray interface is easier for this driver to use. Also fixes a bug reported by the improper use of GFP_ATOMIC. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31net: ethernet: stmicro: Remove duplicate struct declarationWan Jiabing
struct stmmac_safety_stats is declared twice. One has been declared at 29th line. Remove the duplicate. Signed-off-by: Wan Jiabing <wanjiabing@vivo.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31ice: Correct comment block styleTony Nguyen
The following is reported by checkpatch, correct it. ----------------------------------------------- drivers/net/ethernet/intel/ice/ice_adminq_cmd.h ----------------------------------------------- WARNING:NETWORKING_BLOCK_COMMENT_STYLE: networking block comments don't use an empty /* line, use /* Comment... FILE: drivers/net/ethernet/intel/ice/ice_adminq_cmd.h:1428: +/* + * Send to PF command (indirect 0x0801) ID is only used by PF Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com>
2021-03-31ice: cleanup style issuesBruce Allan
A few style issues reported by checkpatch have snuck into the code; resolve the style issues. COMPLEX_MACRO: Macros with complex values should be enclosed in parentheses Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: Consolidate VSI state and flagsAnirudh Venkataramanan
struct ice_vsi has two fields, state and flags which seem to be serving the same purpose. Consolidate them into one field 'state'. enum ice_state is used to represent state information of the PF. While some of these enum values can be use to represent VSI state, it makes more sense to represent VSI state with its own enum. So derive a new enum ice_vsi_state from ice_vsi_flags and ice_state and use it. Also rename enum ice_state to ice_pf_state for clarity. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: Refactor ice_set/get_rss into LUT and key specific functionsBrett Creeley
Currently ice_set/get_rss are used to set/get the RSS LUT and/or RSS key. However nearly everywhere these functions are called only the LUT or key are set/get. Also, making this change reduces how many things ice_set/get_rss are doing. Fix this by adding ice_set/get_rss_lut and ice_set/get_rss_key functions. Also, consolidate all calls for setting/getting the RSS LUT and RSS Key to use ice_set/get_rss_lut() and ice_set/get_rss_key(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: Refactor get/set RSS LUT to use struct parameterBrett Creeley
Update ice_aq_get_rss_lut() and ice_aq_set_rss_lut() to take a new structure ice_aq_get_set_rss_params instead of passing individual parameters. This is done for 2 reasons: 1. Reduce the number of parameters passed to the functions. 2. Reduce the amount of change required if the arguments ever need to be updated in the future. Also, reduce duplicate code that was checking for an invalid vsi_handle and lut parameter by moving the checks to the lower level __ice_aq_get_set_rss_lut(). Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: Change ice_vsi_setup_q_map() to not depend on RSSBrett Creeley
Currently, ice_vsi_setup_q_map() depends on the VSI's rss_size. However, the Rx Queue Mapping section of the VSI context has no dependency on RSS. Instead, limit the maximum number of Rx queues per TC based on the Rx Queue mapping section of the VSI context, which currently allows for up to 256 Rx queues per TC. Signed-off-by: Brett Creeley <brett.creeley@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: rename ptype bitmapQi Zhang
Align all ptype bitmap to follow ice_ptypes_xxx prefix. Signed-off-by: Qi Zhang <qi.z.zhang@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: correct memory allocation callBruce Allan
Use *malloc() instead of *calloc() when allocating only a single object as opposed to an array of objects. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: Check for bail out condition earlyAnirudh Venkataramanan
Check for bail out condition before calling ice_aq_sff_eeprom Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: remove unnecessary duplicated AQ command flag settingBruce Allan
Commit a012dca9f7a2 ("ice: add ethtool -m support for reading i2c eeprom modules") unnecessarily added the ICE_AQ_FLAG_BUF flag to the descriptor when sending the indirect Read/Write SFF EEPROM AQ command. The flag is already added later in the code flow for all indirect AQ commands, i.e. commands that provide an additional data buffer. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: change link misconfiguration messagePaul Greenwalt
Change link misconfiguration message since the configuration could be intended by the user. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: handle increasing Tx or Rx ring sizesPaul M Stillwell Jr
There is an issue when the Tx or Rx ring size increases using 'ethtool -L ...' where the new rings don't get the correct ITR values because when we rebuild the VSI we don't know that some of the rings may be new. Fix this by looking at the original number of rings and determining if the rings in ice_vsi_rebuild_set_coalesce() were not present in the original rings received in ice_vsi_rebuild_get_coalesce(). Also change the code to return an error if we can't allocate memory for the coalesce data in ice_vsi_rebuild(). Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: Update to use package info from ice segmentDan Nowlin
There are two package versions in the package binary. Today, these two version numbers are the same. However, in the future that may change. Update code to use the package info from the ice segment metadata section, which is the package information that is actually downloaded to the firmware during the download package process. Signed-off-by: Dan Nowlin <dan.nowlin@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: Delay netdev registrationAnirudh Venkataramanan
Once a netdev is registered, the corresponding network interface can be immediately used by userspace utilities (like say NetworkManager). This can be problematic if the driver technically isn't fully up yet. Move netdev registration to the end of probe, as by this time the driver data structures and device will be initialized as expected. However, delaying netdev registration causes a failure in the aRFS flow where netdev->reg_state == NETREG_REGISTERED condition is checked. It's not clear why this check was added to begin with, so remove it. Local testing didn't indicate any issues with this change. The state bit check in ice_open was put in as a stop-gap measure to prevent a premature interface up operation. This is no longer needed, so remove it. Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ice: Add Support for XPSBenita Bose
Enable and configure XPS. The driver code implemented sets up the Transmit Packet Steering Map, which in turn will be used by the kernel in queue selection during Tx. Signed-off-by: Benita Bose <benita.bose@intel.com> Tested-by: Tony Brelinski <tonyx.brelinski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-03-31ip6_tunnel: sit: proper dev_{hold|put} in ndo_[un]init methodsEric Dumazet
Same reasons than for the previous commits : 6289a98f0817 ("sit: proper dev_{hold|put} in ndo_[un]init methods") 40cb881b5aaa ("ip6_vti: proper dev_{hold|put} in ndo_[un]init methods") 7f700334be9a ("ip6_gre: proper dev_{hold|put} in ndo_[un]init methods") After adopting CONFIG_PCPU_DEV_REFCNT=n option, syzbot was able to trigger a warning [1] Issue here is that: - all dev_put() should be paired with a corresponding prior dev_hold(). - A driver doing a dev_put() in its ndo_uninit() MUST also do a dev_hold() in its ndo_init(), only when ndo_init() is returning 0. Otherwise, register_netdevice() would call ndo_uninit() in its error path and release a refcount too soon. [1] WARNING: CPU: 1 PID: 21059 at lib/refcount.c:31 refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31 Modules linked in: CPU: 1 PID: 21059 Comm: syz-executor.4 Not tainted 5.12.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:refcount_warn_saturate+0xbf/0x1e0 lib/refcount.c:31 Code: 1d 6a 5a e8 09 31 ff 89 de e8 8d 1a ab fd 84 db 75 e0 e8 d4 13 ab fd 48 c7 c7 a0 e1 c1 89 c6 05 4a 5a e8 09 01 e8 2e 36 fb 04 <0f> 0b eb c4 e8 b8 13 ab fd 0f b6 1d 39 5a e8 09 31 ff 89 de e8 58 RSP: 0018:ffffc900025aefe8 EFLAGS: 00010282 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000040000 RSI: ffffffff815c51f5 RDI: fffff520004b5def RBP: 0000000000000004 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815bdf8e R11: 0000000000000000 R12: ffff888023488568 R13: ffff8880254e9000 R14: 00000000dfd82cfd R15: ffff88802ee2d7c0 FS: 00007f13bc590700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f0943e74000 CR3: 0000000025273000 CR4: 00000000001506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __refcount_dec include/linux/refcount.h:344 [inline] refcount_dec include/linux/refcount.h:359 [inline] dev_put include/linux/netdevice.h:4135 [inline] ip6_tnl_dev_uninit+0x370/0x3d0 net/ipv6/ip6_tunnel.c:387 register_netdevice+0xadf/0x1500 net/core/dev.c:10308 ip6_tnl_create2+0x1b5/0x400 net/ipv6/ip6_tunnel.c:263 ip6_tnl_newlink+0x312/0x580 net/ipv6/ip6_tunnel.c:2052 __rtnl_newlink+0x1062/0x1710 net/core/rtnetlink.c:3443 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3491 rtnetlink_rcv_msg+0x44e/0xad0 net/core/rtnetlink.c:5553 netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2502 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: 919067cc845f ("net: add CONFIG_PCPU_DEV_REFCNT") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31Merge branch 'ethtool-fec-netlink'David S. Miller
Jakub Kicinski says: ==================== ethtool: support FEC configuration over netlink This series adds support for the equivalents of ETHTOOL_GFECPARAM and ETHTOOL_SFECPARAM over netlink. As a reminder - this is an API which allows user to query current FEC mode, as well as set FEC manually if autoneg is disabled. It does not configure anything if autoneg is enabled (that said few/no drivers currently reject .set_fecparam calls while autoneg is disabled, hopefully FW will just ignore the settings). The existing functionality is mostly preserved in the new API. The ioctl interface uses a set of flags, and link modes to tell user which modes are supported. Here is how the flags translate to the new interface (skipping descriptions for actual FEC modes): ioctl flag | description | new API ================================================================ ETHTOOL_FEC_OFF | disabled (supported) | \ ETHTOOL_FEC_RS | | ` link mode bitset ETHTOOL_FEC_BASER | | / .._A_FEC_MODES ETHTOOL_FEC_LLRS | | / ETHTOOL_FEC_AUTO | pick based on cable | bool .._A_FEC_AUTO ETHTOOL_FEC_NONE | not supported | no bit, no AUTO reported Since link modes are already depended on (although somewhat implicitly) for expressing supported modes - the new interface uses them for the manual configuration, as well as uses link mode bit number to communicate the active mode. Use of link modes allows us to define any number of FEC modes we want, and reuse the strset we already have defined. Separating AUTO as its own attribute is the biggest changed compared to the ioctl. It means drivers can no longer report AUTO as the active FEC mode because there is no link mode for AUTO. active_fec == AUTO makes little sense in the first place IMHO, active_fec should be the actual mode, so hopefully this is fine. The other minor departure is that None is no longer explicitly expressed in the API. But drivers are reasonable in handling of this somewhat pointless bit, so I'm not expecting any issues there. One extension which could be considered would be moving active FEC to ETHTOOL_MSG_LINKMODE_*, but then why not move all of FEC into link modes? I don't know where to draw the line. netdevsim support and a simple self test are included. Next step is adding stats similar to the ones added for pause. ==================== Signed-off-by: David S. Miller <davem@davemloft.net> ,
2021-03-31selftests: ethtool: add a netdevsim FEC testJakub Kicinski
Test FEC settings, iterate over configs. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31netdevsim: add FEC settings supportJakub Kicinski
Add support for ethtool FEC and some ethtool error injection. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31ethtool: support FEC settings over netlinkJakub Kicinski
Add FEC API to netlink. This is not a 1-to-1 conversion. FEC settings already depend on link modes to tell user which modes are supported. Take this further an use link modes for manual configuration. Old struct ethtool_fecparam is still used to talk to the drivers, so we need to translate back and forth. We can revisit the internal API if number of FEC encodings starts to grow. Enforce only one active FEC bit (by using a bit position rather than another mask). Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-31tools/resolve_btfids: Fix warningsStanislav Fomichev
* make eprintf static, used only in main.c * initialize ret in eprintf * remove unused *tmp v3: * remove another err (Song Liu) v2: * remove unused 'int err = -1' Signed-off-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Song Liu <songliubraving@fb.com> Link: https://lore.kernel.org/bpf/20210329223143.3659983-1-sdf@google.com
2021-03-30selftests/bpf: Add an option for a debug shell in vmtest.shKP Singh
The newly introduced -s command line option starts an interactive shell. If a command is specified, the shell is started after the command finishes executing. It's useful to have a shell especially when debugging failing tests or developing new tests. Since the user may terminate the VM forcefully, an extra "sync" is added after the execution of the command to persist any logs from the command into the log file. Signed-off-by: KP Singh <kpsingh@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20210323014752.3198283-1-kpsingh@kernel.org
2021-03-30net: ethernet: Fix typo of 'network' in commentEric Lin
Signed-off-by: Eric Lin <dslin1010@gmail.com> Reported-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30mlxsw: spectrum_router: Only perform atomic nexthop bucket replacement when ↵Ido Schimmel
requested When cleared, the 'force' parameter in nexthop bucket replacement notifications indicates that a driver should try to perform an atomic replacement. Meaning, only update the contents of the bucket if it is inactive. Since mlxsw only queries buckets' activity once every second, there is no point in trying an atomic replacement if the idle timer interval is smaller than 1 second. Currently, mlxsw ignores the original value of 'force' and will always try an atomic replacement if the idle timer is not smaller than 1 second. Fix this by taking the original value of 'force' into account and never promoting a non-atomic replacement to an atomic one. Fixes: 617a77f044ed ("mlxsw: spectrum_router: Add nexthop bucket replacement support") Reported-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30Merge branch 'mptcp-subflow-disconnected'David S. Miller
Mat Martineau says: ==================== MPTCP: Allow initial subflow to be disconnected An MPTCP connection is aggregated from multiple TCP subflows, and can involve multiple IP addresses on either peer. The addresses used in the initial subflow connection are assigned address id 0 on each side of the link. More addresses can be added and shared with the peer using address IDs of 1 or larger. MPTCP in Linux shares non-zero address IDs across all MPTCP connections in a net namespace, which allows userspace to manage subflow connections across a number of sockets. However, this makes the address with id 0 a special case, since the IP address associated with id 0 is potentially different for each socket. This patch set allows the initial subflow to be disconnected when userspace specifies an address to remove using both id 0 and an IP address, or when the peer sends an RM_ADDR for id 0. Patches 1 and 3 implement the change for requests from the peer and userspace, respectively. Patch 2 consolidates some code for disconnecting subflows. Patches 4-6 update the self tests to cover removal of subflows using address id 0. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30selftests: mptcp: remove id 0 address testcasesGeliang Tang
This patch added the testcases for removing the id 0 subflow and the id 0 address. In do_transfer, use the removing addresses number '9' for deleting the id 0 address. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30selftests: mptcp: add addr argument for del_addrGeliang Tang
For the id 0 address, different MPTCP connections could be using different IP addresses for id 0. This patch added an extra argument IP address for del_addr when using id 0. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30selftests: mptcp: avoid calling pm_nl_ctl with bad IDsMatthieu Baerts
IDs are supposed to be between 0 and 255. In pm_nl_ctl, for both the 'add' and 'get' instruction, the ID is casted in a u_int8_t. So if we give 256, we will delete ID 0. Obviously, the goal is not to delete this ID by giving 256. We could modify pm_nl_ctl and stop if the ID is negative or higher than 255 but probably better not to increase the number of lines for such things in this tool which is only used in selftests. Instead, we use it within the limits. This modification also means that we will no longer add a new ID for the 2nd entry. That's why we removed an expected entry from the dump and introduced with commit dc8eb10e95a8 ("selftests: mptcp: add testcases for setting the address ID"). So now we delete ID 9 like before and we add entries for IDs 10 to 255 that are deleted just after. Note that this could be seen as a fix but it was not really an issue so far: we were simply playing with ID 0/1 once again. With the following commit ("selftests: mptcp: add addr argument for del_addr"), it will be different because ID 0 is going to required an address. We don't want errors when trying to delete ID 0 without the address argument. Acked-and-tested-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30mptcp: remove id 0 addressGeliang Tang
This patch added a new function mptcp_nl_remove_id_zero_address to remove the id 0 address. In this function, traverse all the existing msk sockets to find the msk matched the input IP address. Then fill the removing list with id 0, and pass it to mptcp_pm_remove_addr and mptcp_pm_remove_subflow. Suggested-by: Paolo Abeni <pabeni@redhat.com> Suggested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30mptcp: unify RM_ADDR and RM_SUBFLOW receivingGeliang Tang
There are some duplicate code in mptcp_pm_nl_rm_addr_received and mptcp_pm_nl_rm_subflow_received. This patch unifies them into a new function named mptcp_pm_nl_rm_addr_or_subflow. In it, use the input parameter rm_type to identify it's now removing an address or a subflow. Suggested-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-03-30mptcp: remove all subflows involving id 0 addressGeliang Tang
There's only one subflow involving the non-zero id address, but there may be multi subflows involving the id 0 address. Here's an example: local_id=0, remote_id=0 local_id=1, remote_id=0 local_id=0, remote_id=1 If the removing address id is 0, all the subflows involving the id 0 address need to be removed. In mptcp_pm_nl_rm_addr_received/mptcp_pm_nl_rm_subflow_received, the "break" prevents the iteration to the next subflow, so this patch dropped them. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: Geliang Tang <geliangtang@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>