summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-12-19net: phy: marvell: use existing clause 37 definitionsRussell King
Use existing clause 37 advertising/link partner definitions rather than private ones for the advertisement registers. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-19net: phy: marvell: consolidate phy status readingRussell King
marvell_read_status_page_an() always reads the PHY status register, but marvell_update_link() has already done this. Rather than wastefully reading the register twice in quick succession, read it once in marvell_read_status_page() and use the result for both. This makes marvell_update_link() rather pointless, so move it into marvell_read_status_page(). Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-19net: phy: marvell: use positive logic for link stateRussell King
Rather than using negative logic: if (there is no link) set link = 0 else set link = 1 use the more natural positive logic: if (there is link) set link = 1 else set link = 0 Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-19net: phy: marvell: initialise link partner state earlierRussell King
Move the initialisation of the link partner state earlier, inside marvell_read_status_page(), so we don't have the same initialisation scattered amongst the other files. This is in a similar place to the genphy implementation, so would result in the same behaviour if a PHY read error occurs. This allows us to get rid of marvell_read_status_page_fixed(), which became a pointless wrapper around genphy_read_status_fixed(). Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-19net: phy: marvell: rearrange to use genphy_read_lpa()Russell King
Rearrange the Marvell PHY driver to use genphy_read_lpa() rather than open-coding this functionality. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-19net: phy: provide and use genphy_read_status_fixed()Russell King
There are two drivers and generic code which contain exactly the same code to read the status of a PHY operating without autonegotiation enabled. Rather than duplicate this code, provide a helper to read this information. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-19net: phy: add genphy_check_and_restart_aneg()Russell King
Add a helper for restarting autonegotiation(), similar to the clause 45 variant. Use it in __genphy_config_aneg() Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-19net: phy: use phy_resolve_aneg_pause()Russell King
Several drivers code their own version of this, working from the LPA register, after setting the ethtool link partner advertisement bitmask. Use the generic function instead. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-19net: phy: remove redundant .aneg_done initialisersRussell King
Remove initialisers that set .aneg_done to genphy_aneg_done - this is the default for clause 22 PHYs, so the initialiser is redundant. Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-19Merge ath-next from git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/ath.gitKalle Valo
ath.git patches for v5.6. Major changes: wil6210 * support set_multicast_to_unicast cfg80211 operation * support set_cqm_rssi_config cfg80211 operation wcn36xx * disable HW_CONNECTION_MONITOR as firmware is buggy
2019-12-19ath11k: Use sizeof_field() instead of FIELD_SIZEOF()Kees Cook
The FIELD_SIZEOF() macro was redundant, and is being removed from the kernel. Since commit c593642c8be0 ("treewide: Use sizeof_field() macro") this is one of the last users of the old macro, so replace it. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-19net, sysctl: Fix compiler warning when only cBPF is presentAlexander Lobakin
proc_dointvec_minmax_bpf_restricted() has been firstly introduced in commit 2e4a30983b0f ("bpf: restrict access to core bpf sysctls") under CONFIG_HAVE_EBPF_JIT. Then, this ifdef has been removed in ede95a63b5e8 ("bpf: add bpf_jit_limit knob to restrict unpriv allocations"), because a new sysctl, bpf_jit_limit, made use of it. Finally, this parameter has become long instead of integer with fdadd04931c2 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K") and thus, a new proc_dolongvec_minmax_bpf_restricted() has been added. With this last change, we got back to that proc_dointvec_minmax_bpf_restricted() is used only under CONFIG_HAVE_EBPF_JIT, but the corresponding ifdef has not been brought back. So, in configurations like CONFIG_BPF_JIT=y && CONFIG_HAVE_EBPF_JIT=n since v4.20 we have: CC net/core/sysctl_net_core.o net/core/sysctl_net_core.c:292:1: warning: ‘proc_dointvec_minmax_bpf_restricted’ defined but not used [-Wunused-function] 292 | proc_dointvec_minmax_bpf_restricted(struct ctl_table *table, int write, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Suppress this by guarding it with CONFIG_HAVE_EBPF_JIT again. Fixes: fdadd04931c2 ("bpf: fix bpf_jit_limit knob for PAGE_SIZE >= 64K") Signed-off-by: Alexander Lobakin <alobakin@dlink.ru> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191218091821.7080-1-alobakin@dlink.ru
2019-12-19ath11k: explicitly cast wmi commands to their correct struct typeJohn Crispin
Three of the WMI command handlers were not casting to the right data type. Lets make the code consistent with the other handlers. Signed-off-by: John Crispin <john@phrozen.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-19wil6210: add support for set_cqm_rssi_configDedy Lansky
set_cqm_rssi_config() is used by the kernel to configure connection quality monitor RSSI threshold. wil6210 uses WMI_SET_LINK_MONITOR_CMDID to set the RSSI threshold to FW which in turn reports RSSI threshold changes with WMI_LINK_MONITOR_EVENTID. Signed-off-by: Dedy Lansky <dlansky@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-19wil6210: support set_multicast_to_unicast cfg80211 operationAhmad Masri
Wil6210 AP has a separate ring for transmitting multicast packets, multicast packets are transmitted without an ack from the receiver side. Therefore, 802.11 spec defines some low MCS rates for multicat packets. However, there is no guarantee that these packets were really received and handled on the client side. Some applications that rely on multicast packets, may prefer to transmit these packets as a unicast to ensure reliability, and also to ensure better performance with high MCS rates. multicast to unicast is done by duplicating multicast packets to all clients and changing the DA (multicast) to the MAC address of the client. see NL80211_CMD_SET_MULTICAST_TO_UNICAST for more info. Signed-off-by: Ahmad Masri <amasri@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-19wil6210: fix MID valid bits in Rx status messageDedy Lansky
Fix incorrect definitions of MAC ID bits inside Rx status message. Signed-off-by: Dedy Lansky <dlansky@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-19wil6210: reduce ucode_debug memory regionDedy Lansky
ucode_debug memory region defined as 4K bytes. Fix this according to Talyn device memory map. Signed-off-by: Dedy Lansky <dlansky@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-19wil6210: add verification for cid upper boundAlexei Avshalom Lazar
max_assoc_sta can receive values (from the user or from the FW) that are higher than WIL6210_MAX_CID. Verify that cid doesn't exceed the upper bound of WIL6210_MAX_CID. Signed-off-by: Alexei Avshalom Lazar <ailizaro@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-19wil6210: take mem_lock for writing in crash dump collectionAlexei Avshalom Lazar
On some crash dump cases mem_lock is already taken, error returns and crash dump copy fails. In this case wait until mem_lock available instead of failing the operation. Also take the mem_lock for writing to prevent other threads from altering the state of the device while collecting crash dump. Signed-off-by: Alexei Avshalom Lazar <ailizaro@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-19wil6210: minimize the time that mem_lock is heldAlexei Avshalom Lazar
mem_lock is taken for the entire wil_reset(). Optimize this by taking mem_lock just before device is being reset and release the lock after FW download. Signed-off-by: Alexei Avshalom Lazar <ailizaro@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-19wil6210: dump Rx status message on errorsAhmad Masri
Dump all the Rx status message on different errors to allow more visibility of the case. Signed-off-by: Ahmad Masri <amasri@codeaurora.org> Signed-off-by: Maya Erez <merez@codeaurora.org> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2019-12-19Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge fixes from Andrew Morton: "6 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: lib/Kconfig.debug: fix some messed up configurations mm: vmscan: protect shrinker idr replace with CONFIG_MEMCG kasan: don't assume percpu shadow allocations will succeed kasan: use apply_to_existing_page_range() for releasing vmalloc shadow mm/memory.c: add apply_to_existing_page_range() helper kasan: fix crashes on access to memory mapped by vm_map_ram()
2019-12-19Merge tag 'pm-5.5-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management fix from Rafael Wysocki: "Fix a problem related to CPU offline/online and cpufreq governors that in some system configurations may lead to a system-wide deadlock during CPU online" * tag 'pm-5.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: cpufreq: Avoid leaving stale IRQ work items during CPU offline
2019-12-19xfs: don't commit sunit/swidth updates to disk if that would cause repair ↵Darrick J. Wong
failures Alex Lyakas reported[1] that mounting an xfs filesystem with new sunit and swidth values could cause xfs_repair to fail loudly. The problem here is that repair calculates the where mkfs should have allocated the root inode, based on the superblock geometry. The allocation decisions depend on sunit, which means that we really can't go updating sunit if it would lead to a subsequent repair failure on an otherwise correct filesystem. Port from xfs_repair some code that computes the location of the root inode and teach mount to skip the ondisk update if it would cause problems for repair. Along the way we'll update the documentation, provide a function for computing the minimum AGFL size instead of open-coding it, and cut down some indenting in the mount code. Note that we allow the mount to proceed (and new allocations will reflect this new geometry) because we've never screened this kind of thing before. We'll have to wait for a new future incompat feature to enforce correct behavior, alas. Note that the geometry reporting always uses the superblock values, not the incore ones, so that is what xfs_info and xfs_growfs will report. [1] https://lore.kernel.org/linux-xfs/20191125130744.GA44777@bfoster/T/#m00f9594b511e076e2fcdd489d78bc30216d72a7d Reported-by: Alex Lyakas <alex@zadara.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-19xfs: split the sunit parameter update into two partsDarrick J. Wong
If the administrator provided a sunit= mount option, we need to validate the raw parameter, convert the mount option units (512b blocks) into the internal unit (fs blocks), and then validate that the (now cooked) parameter doesn't screw anything up on disk. The incore inode geometry computation can depend on the new sunit option, but a subsequent patch will make validating the cooked value depends on the computed inode geometry, so break the sunit update into two steps. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-19xfs: refactor agfl length computation functionDarrick J. Wong
Refactor xfs_alloc_min_freelist to accept a NULL @pag argument, in which case it returns the largest possible minimum length. This will be used in an upcoming patch to compute the length of the AGFL at mkfs time. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Brian Foster <bfoster@redhat.com>
2019-12-19libxfs: resync with the userspace libxfsDarrick J. Wong
Prepare to resync the userspace libxfs with the kernel libxfs. There were a few things I missed -- a couple of static inline directory functions that have to be exported for xfs_repair; a couple of directory naming functions that make porting much easier if they're /not/ static inline; and a u16 usage that should have been uint16_t. None of these things are bugs in their own right; this just makes porting xfsprogs easier. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Eric Sandeen <sandeen@redhat.com>
2019-12-19xfs: use bitops interface for buf log item AIL flag checkBrian Foster
The xfs_log_item flags were converted to atomic bitops as of commit 22525c17ed ("xfs: log item flags are racy"). The assert check for AIL presence in xfs_buf_item_relse() still uses the old value based check. This likely went unnoticed as XFS_LI_IN_AIL evaluates to 0 and causes the assert to unconditionally pass. Fix up the check. Signed-off-by: Brian Foster <bfoster@redhat.com> Fixes: 22525c17ed ("xfs: log item flags are racy") Reviewed-by: Eric Sandeen <sandeen@redhat.com> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-12-19Merge branch 'bpf-fix-xsk-wakeup'Daniel Borkmann
Maxim Mikityanskiy says: ==================== This series addresses the issue described in the commit message of the first patch: lack of synchronization between XSK wakeup and destroying the resources used by XSK wakeup. The idea is similar to napi_synchronize. The series contains fixes for the drivers that implement XSK. v2 incorporates changes suggested by Björn: 1. Call synchronize_rcu in Intel drivers only if the XDP program is being unloaded. 2. Don't forget rcu_read_lock when wakeup is called from xsk_poll. 3. Use xs->zc as the condition to call ndo_xsk_wakeup. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-12-19net/ixgbe: Fix concurrency issues between config flow and XSKMaxim Mikityanskiy
Use synchronize_rcu to wait until the XSK wakeup function finishes before destroying the resources it uses: 1. ixgbe_down already calls synchronize_rcu after setting __IXGBE_DOWN. 2. After switching the XDP program, call synchronize_rcu to let ixgbe_xsk_wakeup exit before the XDP program is freed. 3. Changing the number of channels brings the interface down. 4. Disabling UMEM sets __IXGBE_TX_DISABLED before closing hardware resources and resetting xsk_umem. Check that bit in ixgbe_xsk_wakeup to avoid using the XDP ring when it's already destroyed. synchronize_rcu is called from ixgbe_txrx_ring_disable. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191217162023.16011-5-maximmi@mellanox.com
2019-12-19net/i40e: Fix concurrency issues between config flow and XSKMaxim Mikityanskiy
Use synchronize_rcu to wait until the XSK wakeup function finishes before destroying the resources it uses: 1. i40e_down already calls synchronize_rcu. On i40e_down either __I40E_VSI_DOWN or __I40E_CONFIG_BUSY is set. Check the latter in i40e_xsk_wakeup (the former is already checked there). 2. After switching the XDP program, call synchronize_rcu to let i40e_xsk_wakeup exit before the XDP program is freed. 3. Changing the number of channels brings the interface down (see i40e_prep_for_reset and i40e_pf_quiesce_all_vsi). 4. Disabling UMEM sets __I40E_CONFIG_BUSY, too. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191217162023.16011-4-maximmi@mellanox.com
2019-12-19net/mlx5e: Fix concurrency issues between config flow and XSKMaxim Mikityanskiy
After disabling resources necessary for XSK (the XDP program, channels, XSK queues), use synchronize_rcu to wait until the XSK wakeup function finishes, before freeing the resources. Suspend XSK wakeups during switching channels. If the XDP program is being removed, synchronize_rcu before closing the old channels to allow XSK wakeup to complete. Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191217162023.16011-3-maximmi@mellanox.com
2019-12-19xsk: Add rcu_read_lock around the XSK wakeupMaxim Mikityanskiy
The XSK wakeup callback in drivers makes some sanity checks before triggering NAPI. However, some configuration changes may occur during this function that affect the result of those checks. For example, the interface can go down, and all the resources will be destroyed after the checks in the wakeup function, but before it attempts to use these resources. Wrap this callback in rcu_read_lock to allow driver to synchronize_rcu before actually destroying the resources. xsk_wakeup is a new function that encapsulates calling ndo_xsk_wakeup wrapped into the RCU lock. After this commit, xsk_poll starts using xsk_wakeup and checks xs->zc instead of ndo_xsk_wakeup != NULL to decide ndo_xsk_wakeup should be called. It also fixes a bug introduced with the need_wakeup feature: a non-zero-copy socket may be used with a driver supporting zero-copy, and in this case ndo_xsk_wakeup should not be called, so the xs->zc check is the correct one. Fixes: 77cd0d7b3f25 ("xsk: add support for need_wakeup flag in AF_XDP rings") Signed-off-by: Maxim Mikityanskiy <maximmi@mellanox.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20191217162023.16011-2-maximmi@mellanox.com
2019-12-19Merge branch 'pm-cpufreq'Rafael J. Wysocki
* pm-cpufreq: cpufreq: Avoid leaving stale IRQ work items during CPU offline
2019-12-19mmc: sdhci-of-esdhc: re-implement erratum A-009204 workaroundYangbo Lu
The erratum A-009204 workaround patch was reverted because of incorrect implementation. 8b6dc6b mmc: sdhci-of-esdhc: Revert "mmc: sdhci-of-esdhc: add erratum A-009204 support" This patch is to re-implement the workaround (add a 5 ms delay before setting SYSCTL[RSTD] to make sure all the DMA transfers are finished). Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com> Link: https://lore.kernel.org/r/20191219032335.26528-1-yangbo.lu@nxp.com Fixes: 5dd195522562 ("mmc: sdhci-of-esdhc: add erratum A-009204 support") Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2019-12-18clk: qcom: Avoid SMMU/cx gdsc corner casesJeffrey Hugo
Mark the msm8998 cpu CX gdsc as votable and use the hw control to avoid corner cases with SMMU per hardware documentation. Fixes: 3f7df5baa259 ("clk: qcom: Add MSM8998 GPU Clock Controller (GPUCC) driver") Signed-off-by: Jeffrey Hugo <jeffrey.l.hugo@gmail.com> Link: https://lkml.kernel.org/r/20191217171905.5619-1-jeffrey.l.hugo@gmail.com Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-12-18clk: qcom: gcc-sc7180: Fix setting flag for votable GDSCsMatthias Kaehlcke
Commit 17269568f7267 ("clk: qcom: Add Global Clock controller (GCC) driver for SC7180") sets the VOTABLE flag in .pwrsts, but it needs to be set in .flags, fix this. Fixes: 17269568f7267 ("clk: qcom: Add Global Clock controller (GCC) driver for SC7180") Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Link: https://lkml.kernel.org/r/20191204120341.1.I9971817e83ee890d1096c43c5a6ce6ced53d5bd3@changeid Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-12-18Merge tag 'tpmdd-next-20191219' of git://git.infradead.org/users/jjs/linux-tpmddLinus Torvalds
Pull tpm fixes from Jarkko Sakkinen: "Bunch of fixes for rc3" * tag 'tpmdd-next-20191219' of git://git.infradead.org/users/jjs/linux-tpmdd: tpm/tpm_ftpm_tee: add shutdown call back tpm: selftest: cleanup after unseal with wrong auth/policy test tpm: selftest: add test covering async mode tpm: fix invalid locking in NONBLOCKING mode security: keys: trusted: fix lost handle flush tpm_tis: reserve chip for duration of tpm_tis_core_init KEYS: asymmetric: return ENOMEM if akcipher_request_alloc() fails KEYS: remove CONFIG_KEYS_COMPAT
2019-12-19tpm/tpm_ftpm_tee: add shutdown call backPavel Tatashin
Add shutdown call back to close existing session with fTPM TA to support kexec scenario. Add parentheses to function names in comments as specified in kdoc. Signed-off-by: Thirupathaiah Annapureddy <thiruan@microsoft.com> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Tested-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
2019-12-19drm/exynos: gsc: add missed component_delChuhong Yuan
The driver forgets to call component_del in remove to match component_add in probe. Add the missed call to fix it. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: Inki Dae <inki.dae@samsung.net>
2019-12-18net: stmmac: tc: Fix TAPRIO division operationJose Abreu
For ARCHs that don't support 64 bits division we need to use the helpers. Fixes: b60189e0392f ("net: stmmac: Integrate EST with TAPRIO scheduler API") Signed-off-by: Jose Abreu <Jose.Abreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-18s390/ftrace: save traced function callerVasily Gorbik
A typical backtrace acquired from ftraced function currently looks like the following (e.g. for "path_openat"): arch_stack_walk+0x15c/0x2d8 stack_trace_save+0x50/0x68 stack_trace_call+0x15a/0x3b8 ftrace_graph_caller+0x0/0x1c 0x3e0007e3c98 <- ftraced function caller (should be do_filp_open+0x7c/0xe8) do_open_execat+0x70/0x1b8 __do_execve_file.isra.0+0x7d8/0x860 __s390x_sys_execve+0x56/0x68 system_call+0xdc/0x2d8 Note random "0x3e0007e3c98" stack value as ftraced function caller. This value causes either imprecise unwinder result or unwinding failure. That "0x3e0007e3c98" comes from r14 of ftraced function stack frame, which it haven't had a chance to initialize since the very first instruction calls ftrace code ("ftrace_caller"). (ftraced function might never save r14 as well). Nevertheless according to s390 ABI any function is called with stack frame allocated for it and r14 contains return address. "ftrace_caller" itself is called with "brasl %r0,ftrace_caller". So, to fix this issue simply always save traced function caller onto ftraced function stack frame. Reported-by: Sven Schnelle <svens@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-12-18s390/unwind: stop gracefully at user mode pt_regs in irq stackVasily Gorbik
Consider reaching user mode pt_regs at the bottom of irq stack graceful unwinder termination. This is the case when irq/mcck/ext interrupt arrives while in user mode. Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-12-18s390/purgatory: do not build purgatory with kcov, kasan and friendsChristian Borntraeger
the purgatory must not rely on functions from the "old" kernel, so we must disable kasan and friends. We also need to have a separate copy of string.c as the default does not build memcmp with KASAN. Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-12-18s390/purgatory: Make sure we fail the build if purgatory has missing symbolsHans de Goede
Since we link purgatory with -r aka we enable "incremental linking" no checks for unresolved symbols are done while linking the purgatory. This commit adds an extra check for unresolved symbols by calling ld without -r before running objcopy to generate purgatory.ro. This will help us catch missing symbols in the purgatory sooner. Note this commit also removes --no-undefined from LDFLAGS_purgatory as that has no effect. Signed-off-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/lkml/20191212205304.191610-1-hdegoede@redhat.com Tested-by: Philipp Rudo <prudo@linux.ibm.com> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-12-18s390/ftrace: fix endless recursion in function_graph tracerSven Schnelle
The following sequence triggers a kernel stack overflow on s390x: mount -t tracefs tracefs /sys/kernel/tracing cd /sys/kernel/tracing echo function_graph > current_tracer [crash] This is because preempt_count_{add,sub} are in the list of traced functions, which can be demonstrated by: echo preempt_count_add >set_ftrace_filter echo function_graph > current_tracer [crash] The stack overflow happens because get_tod_clock_monotonic() gets called by ftrace but itself calls preempt_{disable,enable}(), which leads to a endless recursion. Fix this by using preempt_{disable,enable}_notrace(). Fixes: 011620688a71 ("s390/time: ensure get_clock_monotonic() returns monotonic values") Signed-off-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-12-18Merge branch 'ETS-qdisc'David S. Miller
Petr Machata says: ==================== Add a new Qdisc, ETS The IEEE standard 802.1Qaz (and 802.1Q-2014) specifies four principal transmission selection algorithms: strict priority, credit-based shaper, ETS (bandwidth sharing), and vendor-specific. All these have their corresponding knobs in DCB. But DCB does not have interfaces to configure RED and ECN, unlike Qdiscs. In the Qdisc land, strict priority is implemented by PRIO. Credit-based transmission selection algorithm can then be modeled by having e.g. TBF or CBS Qdisc below some of the PRIO bands. ETS would then be modeled by placing a DRR Qdisc under the last PRIO band. The problem with this approach is that DRR on its own, as well as the combination of PRIO and DRR, are tricky to configure and tricky to offload to 802.1Qaz-compliant hardware. This is due to several reasons: - As any classful Qdisc, DRR supports adding classifiers to decide in which class to enqueue packets. Unlike PRIO, there's however no fallback in the form of priomap. A way to achieve classification based on packet priority is e.g. like this: # tc filter add dev swp1 root handle 1: \ basic match 'meta(priority eq 0)' flowid 1:10 Expressing the priomap in this manner however forces drivers to deep dive into the classifier block to parse the individual rules. A possible solution would be to extend the classes with a "defmap" a la split / defmap mechanism of CBQ, and introduce this as a last resort classification. However, unlike priomap, this doesn't have the guarantee of covering all priorities. Traffic whose priority is not covered is dropped by DRR as unclassified. But ASICs tend to implement dropping in the ACL block, not in scheduling pipelines. The need to treat these configurations correctly (if only to decide to not offload at all) complicates a driver. It's not clear how to retrofit priomap with all its benefits to DRR without changing it beyond recognition. - The interplay between PRIO and DRR is also causing problems. 802.1Qaz has all ETS TCs as a last resort. Switch ASICs that support ETS at all are likely to handle ETS traffic this way as well. However, the Linux model is more generic, allowing the DRR block in any band. Drivers would need to be careful to handle this case correctly, otherwise the offloaded model might not match the slow-path one. In a similar vein, PRIO and DRR need to agree on the list of priorities assigned to DRR. This is doubly problematic--the user needs to take care to keep the two in sync, and the driver needs to watch for any holes in DRR coverage and treat the traffic correctly, as discussed above. Note that at the time that DRR Qdisc is added, it has no classes, and thus any priorities assigned to that PRIO band are not covered. Thus this case is surprisingly rather common, and needs to be handled gracefully by the driver. - Similarly due to DRR flexibility, when a Qdisc (such as RED) is attached below it, it is not immediately clear which TC the class represents. This is unlike PRIO with its straightforward classid scheme. When DRR is combined with PRIO, the relationship between classes and TCs gets even more murky. This is a problem for users as well: the TC mapping is rather important for (devlink) shared buffer configuration and (ethtool) counters. So instead, this patch set introduces a new Qdisc, which is based on 802.1Qaz wording. It is PRIO-like in how it is configured, meaning one needs to specify how many bands there are, how many are strict and how many are ETS, quanta for the latter, and priomap. The new Qdisc operates like the PRIO / DRR combo would when configured as per the standard. The strict classes, if any, are tried for traffic first. When there's no traffic in any of the strict queues, the ETS ones (if any) are treated in the same way as in DRR. The chosen interface makes the overall system both reasonably easy to configure, and reasonably easy to offload. The extra code to support ETS in mlxsw (which already supports PRIO) is about 150 lines, of which perhaps 20 lines is bona fide new business logic. Credit-based shaping transmission selection algorithm can be configured by adding a CBS Qdisc under one of the strict bands (e.g. TBF can be used to a similar effect as well). As a non-work-conserving Qdisc, CBS can't be hooked under the ETS bands. This is detected and handled identically to DRR Qdisc at runtime. Note that offloading CBS is not subject of this patchset. The patchset proceeds in four stages: - Patches #1-#3 are cleanups. - Patches #4 and #5 contain the new Qdisc. - Patches #6 and #7 update mlxsw to offload the new Qdisc. - Patches #8-#10 add selftests for ETS. Examples: - Add a Qdisc with 6 bands, 3 strict and 3 ETS with 45%-30%-25% weights: # tc qdisc add dev swp1 root handle 1: \ ets strict 3 quanta 4500 3000 2500 priomap 0 1 1 1 2 3 4 5 # tc qdisc sh dev swp1 qdisc ets 1: root refcnt 2 bands 6 strict 3 quanta 4500 3000 2500 priomap 0 1 1 1 2 3 4 5 5 5 5 5 5 5 5 5 - Tweak quantum of one of the classes of the previous Qdisc: # tc class ch dev swp1 classid 1:4 ets quantum 1000 # tc qdisc sh dev swp1 qdisc ets 1: root refcnt 2 bands 6 strict 3 quanta 1000 3000 2500 priomap 0 1 1 1 2 3 4 5 5 5 5 5 5 5 5 5 # tc class ch dev swp1 classid 1:3 ets quantum 1000 Error: Strict bands do not have a configurable quantum. - Purely strict Qdisc with 1:1 mapping between priorities and TCs: # tc qdisc add dev swp1 root handle 1: \ ets strict 8 priomap 7 6 5 4 3 2 1 0 # tc qdisc sh dev swp1 qdisc ets 1: root refcnt 2 bands 8 strict 8 priomap 7 6 5 4 3 2 1 0 7 7 7 7 7 7 7 7 - Use "bands" to specify number of bands explicitly. Underspecified bands are implicitly ETS and their quantum is taken from MTU. The following thus gives each band the same weight: # tc qdisc add dev swp1 root handle 1: \ ets bands 8 priomap 7 6 5 4 3 2 1 0 # tc qdisc sh dev swp1 qdisc ets 1: root refcnt 2 bands 8 quanta 1514 1514 1514 1514 1514 1514 1514 1514 priomap 7 6 5 4 3 2 1 0 7 7 7 7 7 7 7 7 v2: - This addresses points raised by David Miller. - Patch #4: - sch_ets.c: Add a comment with description of the Qdisc and the dequeuing algorithm. - Kconfig: Add a high-level description to the help blurb. v1: - No changes, first upstream submission after RFC. v3 (internal): - This addresses review from Jiri Pirko. - Patch #3: - Rename to _HR_ instead of to _HIERARCHY_. - Patch #4: - pkt_sched.h: Keep all the TCA_ETS_ constants in one enum. - pkt_sched.h: Rename TCA_ETS_BANDS to _NBANDS, _STRICT to _NSTRICT, _BAND_QUANTUM to _QUANTA_BAND and _PMAP_BAND to _PRIOMAP_BAND. - sch_ets.c: Update to reflect the above changes. Add a new policy, ets_class_policy, which is used when parsing class changes. Currently that policy is the same as the quanta policy, but that might change. - sch_ets.c: Move MTU handling from ets_quantum_parse() to the one caller that makes use of it. - sch_ets.c: ets_qdisc_priomap_parse(): WARN_ON_ONCE on invalid attribute instead of returning an extack. - Patch #6: - __mlxsw_sp_qdisc_ets_replace(): Pass the weights argument to this function in this patch already. Drop the weight computation. - mlxsw_sp_qdisc_prio_replace(): Rename "quanta" to "zeroes" and pass for the abovementioned "weights". - mlxsw_sp_qdisc_prio_graft(): Convert to a wrapper around __mlxsw_sp_qdisc_ets_graft(), instead of invoking the latter directly from mlxsw_sp_setup_tc_prio(). - Update to follow the _HIERARCHY_ -> _HR_ renaming. - Patch #7: - __mlxsw_sp_qdisc_ets_replace(): The "weights" argument passing and weight computation removal are now done in a previous patch. - mlxsw_sp_setup_tc_ets(): Drop case TC_ETS_REPLACE, which is handled earlier in the function. - Patch #3 (iproute2): - Add an example output to the commit message. - tc-ets.8: Fix output of two examples. - tc-ets.8: Describe default values of "bands", "quanta". - q_ets.c: A number of fixes in error messages. - q_ets.c: Comment formatting: /*padding*/ -> /* padding */ - q_ets.c: parse_nbands: Move duplicate checking to callers. - q_ets.c: Don't accept both "quantum" and "quanta" as equivalent. v2 (internal): - This addresses review from Ido Schimmel and comments from Alexander Kushnarov. - Patch #2: - s/coment/comment in the commit message. - Patch #4: - sch_ets: ets_class_is_strict(), ets_class_id(): Constify an argument - ets_class_find(): RXTify - Patch #3 (iproute2): - tc-ets.8: some spelling fixes - tc-ets.8: add another example - tc.8: add an ETS to "CLASSFUL QDISCS" section v1 (internal): - This addresses RFC reviews from Ido Schimmel and Roman Mashak, bugs found by Alexander Petrovskiy and myself, and other improvements. - Patch #2: - Expand the explanation with an explicit example. - Patch #4: - Kconfig: s/sch_drr/sch_ets/ - sch_ets: Reorder includes to be in alphabetical order - sch_ets: ets_quantum_parse(): Rename the return-pointer argument from pquantum to quantum, and use it directly, not going through a local temporary. - sch_ets: ets_qdisc_quanta_parse(): Convert syntax of function argument "quanta" from an array to a pointer. - sch_ets: ets_qdisc_priomap_parse(): Likewise with "priomap". - sch_ets: ets_qdisc_quanta_parse(), ets_qdisc_priomap_parse(): Invoke __nla_validate_nested directly instead of nl80211_validate_nested(). - sch_ets: ets_qdisc_quanta_parse(): WARN_ON_ONCE on invalid attribute instead of returning an extack. - sch_ets: ets_qdisc_change(): Make the last band the default one for unmentioned priomap priorities. - sch_ets: Fix a panic when an offloaded child in a bandwidth-sharing band notified its ETS parent. - sch_ets: When ungrafting, add the newly-created invisible FIFO to the Qdisc hash - Patch #5: - pkt_cls.h: Note that quantum=0 signifies a strict band. - Fix error path handling when ets_offload_dump() fails. - Patch #6: - __mlxsw_sp_qdisc_ets_replace(): Convert syntax of function arguments "quanta" and "priomap" from arrays to pointers. - Patch #7: - __mlxsw_sp_qdisc_ets_replace(): Convert syntax of function argument "weights" from an array to a pointer. - Patch #9: - mlxsw/sch_ets.sh: Add a comment explaining packet prioritization. - Adjust the whole suite to allow testing of traffic classifiers in addition to testing priomap. - Patch #10: - Add a number of new tests to test default priomap band, overlarge number of bands, zeroes in quanta, and altogether missing quanta. - Patch #1 (iproute2): - State motivation for inclusion of this patch in the patcheset in the commit message. - Patch #3 (iproute2): - tc-ets.8: it is now December - tc-ets.8: explain inactivity WRT using non-WC Qdiscs under ETS band - tc-ets.8: s/flow/band in explanation of quantum - tc-ets.8: explain what happens with priorities not covered by priomap - tc-ets.8: default priomap band is now the last one - q_ets.c: ets_parse_opt(): Remove unnecessary initialization of priomap and quanta. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-18selftests: qdiscs: Add test coverage for ETS QdiscPetr Machata
Add TDC coverage for the new ETS Qdisc. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-18selftests: forwarding: sch_ets: Add test coverage for ETS QdiscPetr Machata
This tests the newly-added ETS Qdisc. It runs two to three streams of traffic, each with a different priority. ETS Qdisc is supposed to allocate bandwidth according to the DRR algorithm and given weights. After running the traffic for a while, counters are compared for each stream to check that the expected ratio is in fact observed. In order for the DRR process to kick in, a traffic bottleneck must exist in the first place. In slow path, such bottleneck can be implemented by wrapping the ETS Qdisc inside a TBF or other shaper. This might however make the configuration unoffloadable. Instead, on HW datapath, the bottleneck would be set up by lowering port speed and configuring shared buffer suitably. Therefore the test is structured as a core component that implements the testing, with two wrapper scripts that implement the details of slow path resp. fast path configuration. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-12-18selftests: forwarding: Move start_/stop_traffic from mlxsw to lib.shPetr Machata
These two functions are used for starting several streams of traffic, and then stopping them later. They will be handy for the test coverage of ETS Qdisc. Move them from mlxsw-specific qos_lib.sh to the generic lib.sh. Signed-off-by: Petr Machata <petrm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>