summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-07-24mm: initialize return of vm_insert_pagesTom Rix
clang static analysis reports a garbage return In file included from mm/memory.c:84: mm/memory.c:1612:2: warning: Undefined or garbage value returned to caller [core.uninitialized.UndefReturn] return err; ^~~~~~~~~~ The setting of err depends on a loop executing. So initialize err. Signed-off-by: Tom Rix <trix@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20200703155354.29132-1-trix@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-24vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right wayChengguang Xu
After commit fdc85222d58e ("kernfs: kvmalloc xattr value instead of kmalloc"), simple xattr entry is allocated with kvmalloc() instead of kmalloc(), so we should release it with kvfree() instead of kfree(). Fixes: fdc85222d58e ("kernfs: kvmalloc xattr value instead of kmalloc") Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Hugh Dickins <hughd@google.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Daniel Xu <dxu@dxuuu.xyz> Cc: Chris Down <chris@chrisdown.name> Cc: Andreas Dilger <adilger@dilger.ca> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> [5.7] Link: http://lkml.kernel.org/r/20200704051608.15043-1-cgxu519@mykernel.net Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-24mm/mmap.c: close race between munmap() and expand_upwards()/downwards()Kirill A. Shutemov
VMA with VM_GROWSDOWN or VM_GROWSUP flag set can change their size under mmap_read_lock(). It can lead to race with __do_munmap(): Thread A Thread B __do_munmap() detach_vmas_to_be_unmapped() mmap_write_downgrade() expand_downwards() vma->vm_start = address; // The VMA now overlaps with // VMAs detached by the Thread A // page fault populates expanded part // of the VMA unmap_region() // Zaps pagetables partly // populated by Thread B Similar race exists for expand_upwards(). The fix is to avoid downgrading mmap_lock in __do_munmap() if detached VMAs are next to VM_GROWSDOWN or VM_GROWSUP VMA. [akpm@linux-foundation.org: s/mmap_sem/mmap_lock/ in comment] Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Yang Shi <yang.shi@linux.alibaba.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: <stable@vger.kernel.org> [4.20+] Link: http://lkml.kernel.org/r/20200709105309.42495-1-kirill.shutemov@linux.intel.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-24uprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix ↵Oleg Nesterov
GDB regression If a tracee is uprobed and it hits int3 inserted by debugger, handle_swbp() does send_sig(SIGTRAP, current, 0) which means si_code == SI_USER. This used to work when this code was written, but then GDB started to validate si_code and now it simply can't use breakpoints if the tracee has an active uprobe: # cat test.c void unused_func(void) { } int main(void) { return 0; } # gcc -g test.c -o test # perf probe -x ./test -a unused_func # perf record -e probe_test:unused_func gdb ./test -ex run GNU gdb (GDB) 10.0.50.20200714-git ... Program received signal SIGTRAP, Trace/breakpoint trap. 0x00007ffff7ddf909 in dl_main () from /lib64/ld-linux-x86-64.so.2 (gdb) The tracee hits the internal breakpoint inserted by GDB to monitor shared library events but GDB misinterprets this SIGTRAP and reports a signal. Change handle_swbp() to use force_sig(SIGTRAP), this matches do_int3_user() and fixes the problem. This is the minimal fix for -stable, arch/x86/kernel/uprobes.c is equally wrong; it should use send_sigtrap(TRAP_TRACE) instead of send_sig(SIGTRAP), but this doesn't confuse GDB and needs another x86-specific patch. Reported-by: Aaron Merey <amerey@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200723154420.GA32043@redhat.com
2020-07-24sched: Warn if garbage is passed to default_wake_function()Chris Wilson
Since the default_wake_function() passes its flags onto try_to_wake_up(), warn if those flags collide with internal values. Given that the supplied flags are garbage, no repair can be done but at least alert the user to the damage they are causing. In the belief that these errors should be picked up during testing, the warning is only compiled in under CONFIG_SCHED_DEBUG. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: https://lore.kernel.org/r/20200723201042.18861-1-chris@chris-wilson.co.uk
2020-07-23vrf: Handle CONFIG_SYSCTL not setDavid Ahern
Randy reported compile failure when CONFIG_SYSCTL is not set/enabled: ERROR: modpost: "sysctl_vals" [drivers/net/vrf.ko] undefined! Fix by splitting out the sysctl init and cleanup into helpers that can be set to do nothing when CONFIG_SYSCTL is disabled. In addition, move vrf_strict_mode and vrf_strict_mode_change to above vrf_shared_table_handler (code move only) and wrap all of it in the ifdef CONFIG_SYSCTL. Update the strict mode tests to check for the existence of the /proc/sys entry. Fixes: 33306f1aaf82 ("vrf: add sysctl parameter for strict mode") Cc: Andrea Mayer <andrea.mayer@uniroma2.it> Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: David Ahern <dsahern@kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter/IPVS fixes for net The following patchset contains Netfilter/IPVS fixes for net: 1) Fix NAT hook deletion when table is dormant, from Florian Westphal. 2) Fix IPVS sync stalls, from guodeqing. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23ice: add 1G SGMII PHY typePaul M Stillwell Jr
There isn't a case for 1G SGMII in ice_get_media_type() so add the handling for it. Also handle the special case where some direct attach cables may report that they support 1G SGMII, but that is erroneous since SGMII is supposed to be a backplane media type (between a MAC and a PHY). If the driver doesn't handle this special case then a user could see the 'Port' in ethtool change from 'Direct attach Copper' to 'Backplane' when they have forced the speed to 1G, but the cable hasn't changed. Lastly, change ice_aq_get_phy_caps() to save the module_type info if the function was called with ICE_AQC_REPORT_TOPO_CAP. This call uses the media information to populate the module_type. If no media is present then the values in module_type will be 0. Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23ice: Report AOC PHY Types as FiberDoug Dziggel
Report AOC types as fiber instead of unknown. Signed-off-by: Doug Dziggel <douglas.a.dziggel@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23ice: add AQC get link topology handle supportPaul Greenwalt
Add AQC get link topology handle support. This is needed to determine Direct Attach (DA) or backplane media type for PHY types that support either. Get link topology handle cage node type request can be used to determine if a cage is present or not. If a cage is present for PHY types that supports both DA and backplane media type, then the media type is DA, else the media type is backplane. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23ice: Rename low_power_ctrlLev Faerman
Rename the low_power_ctrl field to low_power_ctrl_an to be properly descriptive of it being an autoneg field. Signed-off-by: Lev Faerman <lev.faerman@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23ice: update reporting of autoneg capabilitiesPaul Greenwalt
Firmware now reports AN28, AN32, and AN73. Add a helper and check these new values and report PHY autoneg capability. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23ice: add ice_aq_get_phy_caps() debug logsPaul Greenwalt
Add debug logs for ice_aq_get_phy_caps(), and format ice_aq_set_phy_cfg() and ice_aq_get_link_info() debug logs to make them more readable. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23ice: support Total Port Shutdown on devices that support itBruce Allan
When the Port Disable bit is set in the Link Default Override Mask TLV PFA module in the NVM, Total Port Shutdown mode is supported and enabled. In this mode, the driver should act as if the link-down-on-close ethtool private flag is always enabled and dis-allow any change to that flag. Signed-off-by: Bruce Allan <bruce.w.allan@intel.com> Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23ice: add link lenient and default override supportPaul Greenwalt
Adds functions to check for link override firmware support and get the override settings for a port. The previously supported/default link mode was strict mode. In strict mode link is configured based on get PHY capabilities PHY types with media. Lenient mode is now the default link mode. In lenient mode the link is configured based on get PHY capabilities PHY types without media. This allows the user to configure link that the media does not report. Limit the minimum supported link mode to 25G for devices that support 100G, and 1G for devices that support less than 100G. Default override is only supported in lenient mode. If default override is supported and enabled, then default override values are used for configuring speed and FEC. Default override provide persistent link settings in the NVM. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Evan Swanson <evan.swanson@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23geneve: fix an uninitialized value in geneve_changelink()Cong Wang
geneve_nl2info() sets 'df' conditionally, so we have to initialize it by copying the value from existing geneve device in geneve_changelink(). Fixes: 56c09de347e4 ("geneve: allow changing DF behavior after creation") Reported-by: syzbot+7ebc2e088af5e4c0c9fa@syzkaller.appspotmail.com Cc: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23bonding: check return value of register_netdevice() in bond_newlink()Cong Wang
Very similar to commit 544f287b8495 ("bonding: check error value of register_netdevice() immediately"), we should immediately check the return value of register_netdevice() before doing anything else. Fixes: 005db31d5f5f ("bonding: set carrier off for devices created through netlink") Reported-and-tested-by: syzbot+bbc3a11c4da63c1b74d6@syzkaller.appspotmail.com Cc: Beniamino Galvani <bgalvani@redhat.com> Cc: Taehee Yoo <ap420073@gmail.com> Cc: Jay Vosburgh <j.vosburgh@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23ice: restore PHY settings on media insertionPaul Greenwalt
After the transition from no media to media FW will clear the set-phy-cfg data set by the user. Save initial PHY settings and any settings later requested by the user and use that data to restore PHY settings on media insertion. Since PHY configuration is now being stored, replace calls that were calling FW to get the configuration with the saved copy. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Signed-off-by: Paul M Stillwell Jr <paul.m.stillwell.jr@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23net: dsa: stop overriding master's ndo_get_phys_port_nameVladimir Oltean
The purpose of this override is to give the user an indication of what the number of the CPU port is (in DSA, the CPU port is a hardware implementation detail and not a network interface capable of traffic). However, it has always failed (by design) at providing this information to the user in a reliable fashion. Prior to commit 3369afba1e46 ("net: Call into DSA netdevice_ops wrappers"), the behavior was to only override this callback if it was not provided by the DSA master. That was its first failure: if the DSA master itself was a DSA port or a switchdev, then the user would not see the number of the CPU port in /sys/class/net/eth0/phys_port_name, but the number of the DSA master port within its respective physical switch. But that was actually ok in a way. The commit mentioned above changed that behavior, and now overrides the master's ndo_get_phys_port_name unconditionally. That comes with problems of its own, which are worse in a way. The idea is that it's typical for switchdev users to have udev rules for consistent interface naming. These are based, among other things, on the phys_port_name attribute. If we let the DSA switch at the bottom to start randomly overriding ndo_get_phys_port_name with its own CPU port, we basically lose any predictability in interface naming, or even uniqueness, for that matter. So, there are reasons to let DSA override the master's callback (to provide a consistent interface, a number which has a clear meaning and must not be interpreted according to context), and there are reasons to not let DSA override it (it breaks udev matching for the DSA master). But, there is an alternative method for users to retrieve the number of the CPU port of each DSA switch in the system: $ devlink port pci/0000:00:00.5/0: type eth netdev swp0 flavour physical port 0 pci/0000:00:00.5/2: type eth netdev swp2 flavour physical port 2 pci/0000:00:00.5/4: type notset flavour cpu port 4 spi/spi2.0/0: type eth netdev sw0p0 flavour physical port 0 spi/spi2.0/1: type eth netdev sw0p1 flavour physical port 1 spi/spi2.0/2: type eth netdev sw0p2 flavour physical port 2 spi/spi2.0/4: type notset flavour cpu port 4 spi/spi2.1/0: type eth netdev sw1p0 flavour physical port 0 spi/spi2.1/1: type eth netdev sw1p1 flavour physical port 1 spi/spi2.1/2: type eth netdev sw1p2 flavour physical port 2 spi/spi2.1/3: type eth netdev sw1p3 flavour physical port 3 spi/spi2.1/4: type notset flavour cpu port 4 So remove this duplicated, unreliable and troublesome method. From this patch on, the phys_port_name attribute of the DSA master will only contain information about itself (if at all). If the users need reliable information about the CPU port they're probably using devlink anyway. Signed-off-by: Vladimir Oltean <olteanv@gmail.com> Acked-by: florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23ice: move auto FEC checks into ice_cfg_phy_fec()Paul Greenwalt
The call to ice_cfg_phy_fec() requires the caller to perform certain actions before calling it. Instead of imposing these preconditions move the operations into the function and perform them ourselves. Also, fix some style issues in nearby touched code. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Signed-off-by: Chinh T Cao <chinh.t.cao@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23ice: refactor FC functionsPaul Greenwalt
Create a helper function for configuring requested flow control so that it can be utilized by other functions looking to configure flow control settings. Utilize the existing helper ice_copy_phy_caps_to_cfg() to copy a PHY capability to configuration instead duplicating the code for it. Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23ice: Add advanced power mgmt for WoLAkeem G Abodunrin
Add callbacks needed to support advanced power management for Wake on LAN. Also make ice_pf_state_is_nominal function available for all configurations not just CONFIG_PCI_IOV. Signed-off-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23ice: split ice_discover_caps into two functionsJacob Keller
Using the new ice_aq_list_caps and ice_parse_(dev|func)_caps functions, replace ice_discover_caps with two functions that each take a pointer to the dev_caps and func_caps structures respectively. This makes the side effect of updating the hw->dev_caps and hw->func_caps obvious from reading the implementation of the function. Additionally, it opens the way for enabling reading of device capabilities outside of the initialization flow. By passing in a pointer, another caller will be able to read the capabilities without modifying the HW capabilities structures. As there are no other callers, it is safe to now remove ice_aq_discover_caps and ice_parse_caps. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23ice: split ice_parse_caps into separate functionsJacob Keller
The ice_parse_caps function is used to convert the capability block data coming from firmware into a structured format used by other parts of the code. The current implementation directly updates the hw->func_caps and hw->dev_caps structures. It is directly called from within ice_aq_discover_caps. This causes the discover_caps function to have the side effect of modifying the HW capability structures, which is not intuitive. Split this function into ice_parse_dev_caps and ice_parse_func_caps. These functions will take a pointer to the dev_caps and func_caps respectively. Also create an ice_parse_common_caps for sharing the capability logic that is common to device and function. Doing so enables a future refactor to allow reading and parsing capabilities into a local caps structure instead of modifying the members of the HW structure directly. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23ice: refactor ice_discover_caps to avoid need to retryJacob Keller
The ice_discover_caps function is used to read the device and function capabilities, updating the hardware capabilities structures with relevant data. The exact number of capabilities returned by the hardware is unknown ahead of time. The AdminQ command will report the total number of capabilities in the return buffer. The current implementation involves requesting capabilities once, reading this returned size, and then re-requested with that size. This isn't really necessary. The firmware interface has a maximum size of ICE_AQ_MAX_BUF_LEN. Firmware can never return more than ICE_AQ_MAX_BUF_LEN / sizeof(struct ice_aqc_list_caps_elem) capabilities. Avoid the retry loop by simply allocating a buffer of size ICE_AQ_MAX_BUF_LEN. This is significantly simpler than retrying. The extra allocation isn't a big deal, as it will be released after we finish parsing the capabilities. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Andrew Bowers <andrewx.bowers@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2020-07-23Revert "cifs: Fix the target file was deleted when rename failed."Steve French
This reverts commit 9ffad9263b467efd8f8dc7ae1941a0a655a2bab2. Upon additional testing with older servers, it was found that the original commit introduced a regression when using the old SMB1 dialect and rsyncing over an existing file. The patch will need to be respun to address this, likely including a larger refactoring of the SMB1 and SMB3 rename code paths to make it less confusing and also to address some additional rename error cases that SMB3 may be able to workaround. Signed-off-by: Steve French <stfrench@microsoft.com> Reported-by: Patrick Fernie <patrick.fernie@gmail.com> CC: Stable <stable@vger.kernel.org> Acked-by: Ronnie Sahlberg <lsahlber@redhat.com> Acked-by: Pavel Shilovsky <pshilov@microsoft.com> Acked-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
2020-07-23Merge tag 's390-5.8-6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux into master Pull s390 fixes from Heiko Carstens: - Change cpum_cf/perf counter name from DFLT_CCERROR to DFLT_CCFINISH to reflect reality and avoid further confusion. This is a user space visible change therefore the commit has also a stable tag for 5.7, where this counter was introduced. - Add Matthew Rosato as s390 IOMMU maintainer. * tag 's390-5.8-6' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: MAINTAINERS: add Matthew for s390 IOMMU s390/cpum_cf,perf: change DFLT_CCERROR counter name
2020-07-23i2c: i2c-qcom-geni: Fix DMA transfer raceDouglas Anderson
When I have KASAN enabled on my kernel and I start stressing the touchscreen my system tends to hang. The touchscreen is one of the only things that does a lot of big i2c transfers and ends up hitting the DMA paths in the geni i2c driver. It appears that KASAN adds enough delay in my system to tickle a race condition in the DMA setup code. When the system hangs, I found that it was running the geni_i2c_irq() over and over again. It had these: m_stat = 0x04000080 rx_st = 0x30000011 dm_tx_st = 0x00000000 dm_rx_st = 0x00000000 dma = 0x00000001 Notably we're in DMA mode but are getting M_RX_IRQ_EN and M_RX_FIFO_WATERMARK_EN over and over again. Putting some traces in geni_i2c_rx_one_msg() showed that when we failed we were getting to the start of geni_i2c_rx_one_msg() but were never executing geni_se_rx_dma_prep(). I believe that the problem here is that we are starting the geni command before we run geni_se_rx_dma_prep(). If a transfer makes it far enough before we do that then we get into the state I have observed. Let's change the order, which seems to work fine. Although problems were seen on the RX path, code inspection suggests that the TX should be changed too. Change it as well. Fixes: 37692de5d523 ("i2c: i2c-qcom-geni: Add bus driver for the Qualcomm GENI I2C controller") Signed-off-by: Douglas Anderson <dianders@chromium.org> Tested-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org> Reviewed-by: Akash Asthana <akashast@codeaurora.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Reviewed-by: Mukesh Kumar Savaliya <msavaliy@codeaurora.org> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-07-23i2c: rcar: always clear ICSAR to avoid side effectsWolfram Sang
On R-Car Gen2, we get a timeout when reading from the address set in ICSAR, even though the slave interface is disabled. Clearing it fixes this situation. Note that Gen3 is not affected. To reproduce: bind and undbind an I2C slave on some bus, run 'i2cdetect' on that bus. Fixes: de20d1857dd6 ("i2c: rcar: add slave support") Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-07-23tcp: allow at most one TLP probe per flightYuchung Cheng
Previously TLP may send multiple probes of new data in one flight. This happens when the sender is cwnd limited. After the initial TLP containing new data is sent, the sender receives another ACK that acks partial inflight. It may re-arm another TLP timer to send more, if no further ACK returns before the next TLP timeout (PTO) expires. The sender may send in theory a large amount of TLP until send queue is depleted. This only happens if the sender sees such irregular uncommon ACK pattern. But it is generally undesirable behavior during congestion especially. The original TLP design restrict only one TLP probe per inflight as published in "Reducing Web Latency: the Virtue of Gentle Aggression", SIGCOMM 2013. This patch changes TLP to send at most one probe per inflight. Note that if the sender is app-limited, TLP retransmits old data and did not have this issue. Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23AX.25: Prevent integer overflows in connect and sendmsgDan Carpenter
We recently added some bounds checking in ax25_connect() and ax25_sendmsg() and we so we removed the AX25_MAX_DIGIS checks because they were no longer required. Unfortunately, I believe they are required to prevent integer overflows so I have added them back. Fixes: 8885bb0621f0 ("AX.25: Prevent out-of-bounds read in ax25_sendmsg()") Fixes: 2f2a7ffad5c6 ("AX.25: Fix out-of-bounds read in ax25_connect()") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23cxgb4: add loopback ethtool self-testVishal Kulkarni
In this test, loopback pkt is created and sent on default queue. The packet goes until the Multi Port Switch (MPS) just before the MAC and based on the specified channel number, it either goes outside the wire on one of the physical ports or looped back to Rx path by MPS. In this case, we're specifying loopback channel, instead of physical ports, so the packet gets looped back to Rx path, instead of getting transmitted on the wire. v3: - Modify commit message to include test details. v2: - Add only loopback self-test. Signed-off-by: Vishal Kulkarni <vishal@chelsio.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23Merge branch 'l2tp-further-checkpatch-pl-cleanups'David S. Miller
Tom Parkin says: ==================== l2tp: further checkpatch.pl cleanups l2tp hasn't been kept up to date with the static analysis checks offered by checkpatch.pl. This patchset builds on the series "l2tp: cleanup checkpatch.pl warnings". It includes small refactoring changes which improve code quality and resolve a subset of the checkpatch warnings for the l2tp codebase. ==================== Reviewed-by: James Chapman <jchapman@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23l2tp: cleanup kzalloc callsTom Parkin
Passing "sizeof(struct blah)" in kzalloc calls is less readable, potentially prone to future bugs if the type of the pointer is changed, and triggers checkpatch warnings. Tweak the kzalloc calls in l2tp which use this form to avoid the warning. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23l2tp: cleanup netlink tunnel create address handlingTom Parkin
When creating an L2TP tunnel using the netlink API, userspace must either pass a socket FD for the tunnel to use (for managed tunnels), or specify the tunnel source/destination address (for unmanaged tunnels). Since source/destination addresses may be AF_INET or AF_INET6, the l2tp netlink code has conditionally compiled blocks to support IPv6. Rather than embedding these directly into l2tp_nl_cmd_tunnel_create (where it makes the code difficult to read and confuses checkpatch to boot) split the handling of address-related attributes into a separate function. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23l2tp: cleanup netlink send of tunnel address informationTom Parkin
l2tp_nl_tunnel_send has conditionally compiled code to support AF_INET6, which makes the code difficult to follow and triggers checkpatch warnings. Split the code out into functions to handle the AF_INET v.s. AF_INET6 cases, which both improves readability and resolves the checkpatch warnings. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23l2tp: check socket address type in l2tp_dfs_seq_tunnel_showTom Parkin
checkpatch warns about indentation and brace balancing around the conditionally compiled code for AF_INET6 support in l2tp_dfs_seq_tunnel_show. By adding another check on the socket address type we can make the code more readable while removing the checkpatch warning. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23l2tp: cleanup unnecessary braces in if statementsTom Parkin
These checks are all simple and don't benefit from extra braces to clarify intent. Remove them for easier-reading code. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23l2tp: cleanup comparisons to NULLTom Parkin
checkpatch warns about comparisons to NULL, e.g. CHECK: Comparison to NULL could be written "!rt" #474: FILE: net/l2tp/l2tp_ip.c:474: + if (rt == NULL) { These sort of comparisons are generally clearer and more readable the way checkpatch suggests, so update l2tp accordingly. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23net/ncsi: use eth_zero_addr() to clear mac addressMiaohe Lin
Use eth_zero_addr() to clear mac address insetad of memset(). Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23cxgb4: use eth_zero_addr() to clear mac addressMiaohe Lin
Use eth_zero_addr() to clear mac address insetad of memset(). Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23Merge branch 'mptcp-non-backup-subflows-pre-reqs'David S. Miller
Paolo Abeni says: ==================== mptcp: non backup subflows pre-reqs This series contains a bunch of MPTCP improvements loosely related to concurrent subflows xmit usage, currently under development. The first 3 patches are actually bugfixes for issues that will become apparent as soon as we will enable the above feature. The later patches improve the handling of incoming additional subflows, improving significantly the performances in stress tests based on a high new connection rate. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23subflow: introduce and use mptcp_can_accept_new_subflow()Paolo Abeni
So that we can easily perform some basic PM-related adimission checks before creating the child socket. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23subflow: use rsk_ops->send_reset()Paolo Abeni
tcp_send_active_reset() is more prone to transient errors (memory allocation or xmit queue full): in stress conditions the kernel may drop the egress packet, and the client will be stuck. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23subflow: explicitly check for plain tcp rskPaolo Abeni
When syncookie are in use, the TCP stack may feed into subflow_syn_recv_sock() plain TCP request sockets. We can't access mptcp_subflow_request_sock-specific fields on such sockets. Explicitly check the rsk ops to do safe accesses. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23mptcp: cleanup subflow_finish_connect()Paolo Abeni
The mentioned function has several unneeded branches, handle each case - MP_CAPABLE, MP_JOIN, fallback - under a single conditional and drop quite a bit of duplicate code. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23mptcp: explicitly track the fully established statusPaolo Abeni
Currently accepted msk sockets become established only after accept() returns the new sk to user-space. As MP_JOIN request are refused as per RFC spec on non fully established socket, the above causes mp_join self-tests instabilities. This change lets the msk entering the established status as soon as it receives the 3rd ack and propagates the first subflow fully established status on the msk socket. Finally we can change the subflow acceptance condition to take in account both the sock state and the msk fully established flag. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23mptcp: mark as fallback even early onesPaolo Abeni
In the unlikely event of a failure at connect time, we currently clear the request_mptcp flag - so that the MPC handshake is not started at all, but the msk is not explicitly marked as fallback. This would lead to later insertion of wrong DSS options in the xmitted packets, in violation of RFC specs and possibly fooling the peer. Fixes: e1ff9e82e2ea ("net: mptcp: improve fallback to TCP") Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23mptcp: avoid data corruption on reinsertPaolo Abeni
When updating a partially acked data fragment, we actually corrupt it. This is irrelevant till we send data on a single subflow, as retransmitted data, if any are discarded by the peer as duplicate, but it will cause data corruption as soon as we will start creating non backup subflows. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-23subflow: always init 'rel_write_seq'Paolo Abeni
Currently we do not init the subflow write sequence for MP_JOIN subflows. This will cause bad mapping being generated as soon as we will use non backup subflow. Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Tested-by: Christoph Paasch <cpaasch@apple.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>