summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-08-18bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd failsVasundhara Volam
If FW returns FRAG_ERR in response error code, driver is resending the command only when HWRM command returns success. Fix the code to resend NVM_INSTALL_UPDATE command with DEFRAG install flags, if FW returns FRAG_ERR in its response error code. Fixes: cb4d1d626145 ("bnxt_en: Retry failed NVM_INSTALL_UPDATE with defragmentation flag enabled.") Signed-off-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18bnxt_en: Improve RX doorbell sequence.Michael Chan
When both RX buffers and RX aggregation buffers have to be replenished at the end of NAPI, post the RX aggregation buffers first before RX buffers. Otherwise, we may run into a situation where there are only RX buffers without RX aggregation buffers for a split second. This will cause the hardware to abort the RX packet and report buffer errors, which will cause unnecessary cleanup by the driver. Ringing the Aggregation ring doorbell first before the RX ring doorbell will prevent some of these buffer errors. Use the same sequence during ring initialization as well. Fixes: 697197e5a173 ("bnxt_en: Re-structure doorbells.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18bnxt_en: Fix VNIC clearing logic for 57500 chips.Michael Chan
During device shutdown, the VNIC clearing sequence needs to be modified to free the VNIC first before freeing the RSS contexts. The current code is doing the reverse and we can get mis-directed RX completions to CP ring ID 0 when the RSS contexts are freed and zeroed. The clearing of RSS contexts is not required with the new sequence. Refactor the VNIC clearing logic into a new function bnxt_clear_vnic() and do the chip specific VNIC clearing sequence. Fixes: 7b3af4f75b81 ("bnxt_en: Add RSS support for 57500 chips.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18net: kalmia: fix memory leaksWenwen Wang
In kalmia_init_and_get_ethernet_addr(), 'usb_buf' is allocated through kmalloc(). In the following execution, if the 'status' returned by kalmia_send_init_packet() is not 0, 'usb_buf' is not deallocated, leading to memory leaks. To fix this issue, add the 'out' label to free 'usb_buf'. Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18cx82310_eth: fix a memory leak bugWenwen Wang
In cx82310_bind(), 'dev->partial_data' is allocated through kmalloc(). Then, the execution waits for the firmware to become ready. If the firmware is not ready in time, the execution is terminated. However, the allocated 'dev->partial_data' is not deallocated on this path, leading to a memory leak bug. To fix this issue, free 'dev->partial_data' before returning the error. Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18Merge branch 'hns3-next'David S. Miller
Huazhong Tan says: ==================== net: hns3: add some cleanups & bugfix This patch-set includes cleanups and bugfix for the HNS3 ethernet controller driver. [patch 01/06 - 03/06] adds some cleanups. [patch 04/06] changes the print level of RAS. [patch 05/06] fixes a bug related to MAC TNL. [patch 06/06] adds phy_attached_info(). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18net: hns3: add phy_attached_info() to the hns3 driverYonglong Liu
This patch adds the call to phy_attached_info() to the hns3 driver to identify which exact PHY drivers and models is in use. Signed-off-by: Yonglong Liu <liuyonglong@huawei.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18net: hns3: prevent unnecessary MAC TNL interruptHuazhong Tan
MAC TNL interrupt is used to collect statistic info about link status changing suddenly when netdev is running. But when stopping netdev, the enabled MAC TNL interrupt is unnecessary, and may add some noises to the statistic info. So this patch disables it before stopping MAC. Fixes: a63457878b12 ("net: hns3: Add handling of MAC tunnel interruption") Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18net: hns3: change print level of RAS error log from warning to errorXiaofei Tan
This patch changes print level of RAS error log from warning to error. Because RAS error and its recovery process could cause application failure. Also uses %u instead of %d when the parameter is unsigned. Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18net: hns3: fix error and incorrect formatGuojia Liao
The pointer type parameter should be declare as const for preventing from its pointed value being unexpected modified. The uninitialized variable can not be return directly. The default return value is 0 if no abnormal result. This patch fixes the preceding two errors, deletes redundant declaration of a function and align one parameter. Signed-off-by: Guojia Liao <liaoguojia@huawei.com> Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Jian Shen <shenjian15@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18net: hns3: modify redundant initialization of variableGuojia Liao
Some temporary variables do not need to be initialized that they will be set before used, so this patch deletes the initialization value of these temporary variables. Signed-off-by: Guojia Liao <liaoguojia@huawei.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huzhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18net: hns3: add or modify commentsGuojia Liao
To explain some code, this patch adds some comments, and modifies or merges some comments to make them more neat. Signed-off-by: Guojia Liao <liaoguojia@huawei.com> Signed-off-by: Zhongzhu Liu <liuzhongzhu@huawei.com> Signed-off-by: Weihang Li <liweihang@hisilicon.com> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com> Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18Merge tag 'fixes-for-5.3-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux Pull MTD fix from Richard Weinberger: "A single fix for MTD to correctly set the spi-nor WP pin" * tag 'fixes-for-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux: mtd: spi-nor: Fix the disabling of write protection at init
2019-08-18bnx2x: Fix VF's VLAN reconfiguration in reload.Manish Chopra
Commit 04f05230c5c13 ("bnx2x: Remove configured vlans as part of unload sequence."), introduced a regression in driver that as a part of VF's reload flow, VLANs created on the VF doesn't get re-configured in hardware as vlan metadata/info was not getting cleared for the VFs which causes vlan PING to stop. This patch clears the vlan metadata/info so that VLANs gets re-configured back in the hardware in VF's reload flow and PING/traffic continues for VLANs created over the VFs. Fixes: 04f05230c5c13 ("bnx2x: Remove configured vlans as part of unload sequence.") Signed-off-by: Manish Chopra <manishc@marvell.com> Signed-off-by: Sudarsana Kalluru <skalluru@marvell.com> Signed-off-by: Shahed Shaikh <shshaikh@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-18Merge tag 'for-5.3-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: "Two fixes that popped up during testing: - fix for sysfs-related code that adds/removes block groups, warnings appear during several fstests in connection with sysfs updates in 5.3, the fix essentially replaces a workaround with scope NOFS and applies to 5.2-based branch too - add sanity check of trim range" * tag 'for-5.3-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: trim: Check the range passed into to prevent overflow Btrfs: fix sysfs warning and missing raid sysfs directories
2019-08-18Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "A set of fixes for x86: - Fix the inconsistent error handling in the umwait init code - Rework the boot param zeroing so gcc9 stops complaining about out of bound memset. The resulting source code is actually more sane to read than the smart solution we had - Maintainers update so Tony gets involved when Intel models are added - Some more fallthrough fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Save fields explicitly, zero out everything else MAINTAINERS, x86/CPU: Tony Luck will maintain asm/intel-family.h x86/fpu/math-emu: Address fallthrough warnings x86/apic/32: Fix yet another implicit fallthrough warning x86/umwait: Fix error handling in umwait_init()
2019-08-18Merge branch 'efi-urgent-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull EFI fix from Thomas Gleixner: "A single fix for a EFI mixed mode regression caused by recent rework which did not take the firmware bitwidth into account" * 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: efi-stub: Fix get_efi_config_table on mixed-mode setups
2019-08-18Merge tag 'spdx-5.3-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx Pull SPDX fixes from Greg KH: "Here are four small SPDX fixes for 5.3-rc5. A few style fixes for some SPDX comments, added an SPDX tag for one file, and fix up some GPL boilerplate for another file. All of these have been in linux-next for a few weeks with no reported issues (they are comment changes only, so that's to be expected...)" * tag 'spdx-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/spdx: i2c: stm32: Use the correct style for SPDX License Identifier intel_th: Use the correct style for SPDX License Identifier coccinelle: api/atomic_as_refcounter: add SPDX License Identifier kernel/configs: Replace GPL boilerplate code with SPDX identifier
2019-08-18Merge tag 'char-misc-5.3-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small char and misc driver fixes for 5.3-rc5. These are two different subsystems needing some fixes, the habanalabs driver which is has some more big endian fixes for problems found. The other are some small soundwire fixes, including some Kconfig dependencies needed to resolve reported build errors. All of these have been in linux-next this week with no reported issues" * tag 'char-misc-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: misc: xilinx-sdfec: fix dependency and build error habanalabs: fix device IRQ unmasking for BE host habanalabs: fix endianness handling for internal QMAN submission habanalabs: fix completion queue handling when host is BE habanalabs: fix endianness handling for packets from user habanalabs: fix DRAM usage accounting on context tear down habanalabs: Avoid double free in error flow soundwire: fix regmap dependencies and align with other serial links soundwire: cadence_master: fix definitions for INTSTAT0/1 soundwire: cadence_master: fix register definition for SLAVE_STATE
2019-08-18Merge tag 'staging-5.3-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging/IIO fixes from Greg KH: "Here are four small staging and iio driver fixes for 5.3-rc5 Two are for the dt3000 comedi driver for some reported problems found in that codebase, and two are some small iio fixes. All of these have been in linux-next this week with no reported issues" * tag 'staging-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: comedi: dt3000: Fix rounding up of timer divisor staging: comedi: dt3000: Fix signed integer overflow 'divider * base' iio: adc: max9611: Fix temperature reading in probe iio: frequency: adf4371: Fix output frequency setting
2019-08-18Merge tag 'usb-5.3-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are number of small USB fixes for 5.3-rc5. Syzbot has been on a tear recently now that it has some good USB debugging hooks integrated, so there's a number of fixes in here found by those tools for some _very_ old bugs. Also a handful of gadget driver fixes for reported issues, some hopefully-final dma fixes for host controller drivers, and some new USB serial gadget driver ids. All of these have been in linux-next this week with no reported issues (the usb-serial ones were in linux-next in its own branch, but merged into mine on Friday)" * tag 'usb-5.3-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: usb: add a hcd_uses_dma helper usb: don't create dma pools for HCDs with a localmem_pool usb: chipidea: imx: fix EPROBE_DEFER support during driver probe usb: host: fotg2: restart hcd after port reset USB: CDC: fix sanity checks in CDC union parser usb: cdc-acm: make sure a refcount is taken early enough USB: serial: option: add the BroadMobi BM818 card USB: serial: option: Add Motorola modem UARTs USB: core: Fix races in character device registration and deregistraion usb: gadget: mass_storage: Fix races between fsg_disable and fsg_set_alt usb: gadget: composite: Clear "suspended" on reset/disconnect usb: gadget: udc: renesas_usb3: Fix sysfs interface of "role" USB: serial: option: add D-Link DWM-222 device ID USB: serial: option: Add support for ZTE MF871A
2019-08-17Merge tag 'for-linus-2019-08-17' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: "A collection of fixes that should go into this series. This contains: - Revert of the REQ_NOWAIT_INLINE and associated dio changes. There were still corner cases there, and even though I had a solution for it, it's too involved for this stage. (me) - Set of NVMe fixes (via Sagi) - io_uring fix for fixed buffers (Anthony) - io_uring defer issue fix (Jackie) - Regression fix for queue sync at exit time (zhengbin) - xen blk-back memory leak fix (Wenwen)" * tag 'for-linus-2019-08-17' of git://git.kernel.dk/linux-block: io_uring: fix an issue when IOSQE_IO_LINK is inserted into defer list block: remove REQ_NOWAIT_INLINE io_uring: fix manual setup of iov_iter for fixed buffers xen/blkback: fix memory leaks blk-mq: move cancel of requeue_work to the front of blk_exit_queue nvme-pci: Fix async probe remove race nvme: fix controller removal race with scan work nvme-rdma: fix possible use-after-free in connect error flow nvme: fix a possible deadlock when passthru commands sent to a multipath device nvme-core: Fix extra device_put() call on error path nvmet-file: fix nvmet_file_flush() always returning an error nvmet-loop: Flush nvme_delete_wq when removing the port nvmet: Fix use-after-free bug when a port is removed nvme-multipath: revalidate nvme_ns_head gendisk in nvme_validate_ns
2019-08-17Merge tag 'hyperv-fixes-signed' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull Hyper-V fixes from Sasha Levin: - A few fixes for the userspace hyper-v tools from Adrian Vladu. - A fix for the hyper-v MAINTAINERs entry from Lan Tianyu. - Fix for SPDX license identifier in the userspace tools from Nishad Kamdar. * tag 'hyperv-fixes-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: MAINTAINERS: Fix Hyperv vIOMMU driver file name tools: hv: Use the correct style for SPDX License Identifier tools: hv: fix typos in toolchain tools: hv: fix KVP and VSS daemons exit code tools: hv: fixed Python pep8/flake8 warnings for lsvmbus
2019-08-17Merge branch 'stmmac-next'David S. Miller
Jose Abreu says: ==================== net: stmmac: Improvements for -next Couple of improvements for -next tree. More info in commit logs. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17net: stmmac: selftests: Add selftest for VLAN TX OffloadJose Abreu
Add 2 new selftests for VLAN Insertion offloading. Tests are for inner and outer VLAN offloading. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17net: stmmac: Add support for VLAN Insertion OffloadJose Abreu
Adds the logic to insert a given VLAN ID in a packet. This is offloaded to HW and its descriptor based. For now, only XGMAC implements the necessary callbacks. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17net: stmmac: xgmac: Add EEE supportJose Abreu
Add support for EEE in XGMAC cores by implementing the necessary callbacks. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17net: stmmac: selftests: Add tests for SA Insertion/ReplacementJose Abreu
Add 4 new tests: - SA Insertion (register based) - SA Insertion (descriptor based) - SA Replacament (register based) - SA Replacement (descriptor based) Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17net: stmmac: Add support for SA Insertion/Replacement in XGMAC coresJose Abreu
Add the support for Source Address Insertion and Replacement in XGMAC cores. Two methods are supported: Descriptor based and register based. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17net: stmmac: Add ethtool register dump for XGMAC coresJose Abreu
Add the ethtool interface to dump the register map in XGMAC cores. Changes from v2: - Remove uneeded memset (Jakub) Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17net: stmmac: dwxgmac: Add Flexible PPS supportJose Abreu
Add the support for Flexible PPS in XGMAC cores. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17net: stmmac: Add a counter for Split Header packetsJose Abreu
Add a counter that increments each time a packet with split header is received. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17net: stmmac: Add Split Header support and enable it in XGMAC coresJose Abreu
Add the support for Split Header feature in the RX path and enable it in XGMAC cores. This does not impact neither beneficts bandwidth but it does reduces CPU usage because without the feature all the entire packet is memcpy'ed, while that with the feature only the header is. With Split Header disabled 'perf stat -d' gives: 86870.624945 task-clock (msec) # 0.429 CPUs utilized 1073352 context-switches # 0.012 M/sec 1 cpu-migrations # 0.000 K/sec 213 page-faults # 0.002 K/sec 327113872376 cycles # 3.766 GHz (62.53%) 56618161216 instructions # 0.17 insn per cycle (75.06%) 10742205071 branches # 123.658 M/sec (75.36%) 584309242 branch-misses # 5.44% of all branches (75.19%) 17594787965 L1-dcache-loads # 202.540 M/sec (74.88%) 4003773131 L1-dcache-load-misses # 22.76% of all L1-dcache hits (74.89%) 1313301468 LLC-loads # 15.118 M/sec (49.75%) 355906510 LLC-load-misses # 27.10% of all LL-cache hits (49.92%) With Split Header enabled 'perf stat -d' gives: 49324.456539 task-clock (msec) # 0.245 CPUs utilized 2542387 context-switches # 0.052 M/sec 1 cpu-migrations # 0.000 K/sec 213 page-faults # 0.004 K/sec 177092791469 cycles # 3.590 GHz (62.30%) 68555756017 instructions # 0.39 insn per cycle (75.16%) 12697019382 branches # 257.418 M/sec (74.81%) 442081897 branch-misses # 3.48% of all branches (74.79%) 20337958358 L1-dcache-loads # 412.330 M/sec (75.46%) 3820210140 L1-dcache-load-misses # 18.78% of all L1-dcache hits (75.35%) 1257719198 LLC-loads # 25.499 M/sec (49.73%) 685543923 LLC-load-misses # 54.51% of all LL-cache hits (49.86%) Changes from v2: - Reword commit message (Jakub) Changes from v1: - Add performance info (David) - Add misssing dma_sync_single_for_device() Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17net: stmmac: xgmac: Correctly return that RX descriptor is not last oneJose Abreu
Return the correct value when RX descriptor is not the last one. Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17net: stmmac: Prepare to add Split Header supportJose Abreu
In order to add Split Header support, stmmac_rx() needs to take into account that packet may be split accross multiple descriptors. Refactor the logic of this function in order to support this scenario. Changes from v2: - Fixup if condition detection (Jakub) - Don't stop NAPI with unfinished packet (Jakub) - Use napi_alloc_skb() (Jakub) Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17net: stmmac: Get correct timestamp values from XGMACJose Abreu
TX Timestamp in XGMAC comes from MAC instead of descriptors. Implement this in a new callback. Also, RX Timestamp in XGMAC must be cheked against corruption and we need a barrier to make sure that descriptor fields are read correctly. Changes from v2: - Rework return code check (Jakub) Changes from v1: - Rework the get timestamp function (David) Signed-off-by: Jose Abreu <joabreu@synopsys.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17Merge branch 'drop_monitor-for-offloaded-paths'David S. Miller
Ido Schimmel says: ==================== Add drop monitor for offloaded data paths Users have several ways to debug the kernel and understand why a packet was dropped. For example, using drop monitor and perf. Both utilities trace kfree_skb(), which is the function called when a packet is freed as part of a failure. The information provided by these tools is invaluable when trying to understand the cause of a packet loss. In recent years, large portions of the kernel data path were offloaded to capable devices. Today, it is possible to perform L2 and L3 forwarding in hardware, as well as tunneling (IP-in-IP and VXLAN). Different TC classifiers and actions are also offloaded to capable devices, at both ingress and egress. However, when the data path is offloaded it is not possible to achieve the same level of introspection since packets are dropped by the underlying device and never reach the kernel. This patchset aims to solve this by allowing users to monitor packets that the underlying device decided to drop along with relevant metadata such as the drop reason and ingress port. The above is achieved by exposing a fundamental capability of devices capable of data path offloading - packet trapping. In much the same way as drop monitor registers its probe function with the kfree_skb() tracepoint, the device is instructed to pass to the CPU (trap) packets that it decided to drop in various places in the pipeline. The configuration of the device to pass such packets to the CPU is performed using devlink, as it is not specific to a port, but rather to a device. In the future, we plan to control the policing of such packets using devlink, in order not to overwhelm the CPU. While devlink is used as the control path, the dropped packets are passed along with metadata to drop monitor, which reports them to userspace as netlink events. This allows users to use the same interface for the monitoring of both software and hardware drops. Logically, the solution looks as follows: Netlink event: Packet w/ metadata Or a summary of recent drops ^ | Userspace | +---------------------------------------------------+ Kernel | | +-------+--------+ | | | drop_monitor | | | +-------^--------+ | | | +----+----+ | | Kernel's Rx path | devlink | (non-drop traps) | | +----^----+ ^ | | +-----------+ | +-------+-------+ | | | Device driver | | | +-------^-------+ Kernel | +---------------------------------------------------+ Hardware | | Trapped packet | +--+---+ | | | ASIC | | | +------+ In order to reduce the patch count, this patchset only includes integration with netdevsim. A follow-up patchset will add devlink-trap support in mlxsw. Patches #1-#7 extend drop monitor to also monitor hardware originated drops. Patches #8-#10 add the devlink-trap infrastructure. Patches #11-#12 add devlink-trap support in netdevsim. Patches #13-#16 add tests for the generic infrastructure over netdevsim. Example ======= Instantiate netdevsim --------------------- List supported traps -------------------- netdevsim/netdevsim10: name source_mac_is_multicast type drop generic true action drop group l2_drops name vlan_tag_mismatch type drop generic true action drop group l2_drops name ingress_vlan_filter type drop generic true action drop group l2_drops name ingress_spanning_tree_filter type drop generic true action drop group l2_drops name port_list_is_empty type drop generic true action drop group l2_drops name port_loopback_filter type drop generic true action drop group l2_drops name fid_miss type exception generic false action trap group l2_drops name blackhole_route type drop generic true action drop group l3_drops name ttl_value_is_too_small type exception generic true action trap group l3_drops name tail_drop type drop generic true action drop group buffer_drops Enable a trap ------------- Query statistics ---------------- netdevsim/netdevsim10: name blackhole_route type drop generic true action trap group l3_drops stats: rx: bytes 7384 packets 52 Monitor dropped packets ----------------------- dropwatch> set alertmode packet Setting alert mode Alert mode successfully set dropwatch> set sw true setting software drops monitoring to 1 dropwatch> set hw true setting hardware drops monitoring to 1 dropwatch> start Enabling monitoring... Kernel monitoring activated. Issue Ctrl-C to stop monitoring drop at: ttl_value_is_too_small (l3_drops) origin: hardware input port ifindex: 55 input port name: eth0 timestamp: Mon Aug 12 10:52:20 2019 445911505 nsec protocol: 0x800 length: 142 original length: 142 drop at: ip6_mc_input+0x8b8/0xef8 (0xffffffff9e2bb0e8) origin: software input port ifindex: 4 timestamp: Mon Aug 12 10:53:37 2019 024444587 nsec protocol: 0x86dd length: 110 original length: 110 Future plans ============ * Provide more drop reasons as well as more metadata * Add dropmon support to libpcap, so that tcpdump/tshark could specifically listen on dropmon traffic, instead of capturing all netlink packets via nlmon interface Changes in v3: * Place test with the rest of the netdevsim tests * Fix test to load netdevsim module * Move devlink helpers from the test to devlink_lib.sh. Will be used by mlxsw tests * Re-order netdevsim includes in alphabetical order * Fix reverse xmas tree in netdevsim * Remove double include in netdevsim Changes in v2: * Use drop monitor to report dropped packets instead of devlink * Add drop monitor patches * Add test cases ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17Documentation: Add a section for devlink-trap testingIdo Schimmel
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17selftests: devlink_trap: Add test cases for devlink-trapIdo Schimmel
Add test cases for devlink-trap on top of the netdevsim implementation. The tests focus on the devlink-trap core infrastructure and user space API. They test both good and bad flows and also dismantle of the netdev and devlink device used to report trapped packets. This allows device drivers to focus their tests on device-specific functionality. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17selftests: forwarding: devlink_lib: Add devlink-trap helpersIdo Schimmel
Add helpers to interact with devlink-trap, such as setting the action of a trap and retrieving statistics. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17selftests: forwarding: devlink_lib: Allow tests to define devlink deviceIdo Schimmel
For tests that create their network interfaces dynamically or do not use interfaces at all (as with netdevsim) it is useful to define their own devlink device instead of deriving it from the first network interface. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17Documentation: Add description of netdevsim trapsIdo Schimmel
Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17netdevsim: Add devlink-trap supportIdo Schimmel
Have netdevsim register its trap groups and traps with devlink during initialization and periodically report trapped packets to devlink core. Since netdevsim is not a real device, the trapped packets are emulated using a workqueue that periodically reports a UDP packet with a random 5-tuple from each active packet trap and from each running netdev. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17Documentation: Add devlink-trap documentationIdo Schimmel
Add initial documentation of the devlink-trap mechanism, explaining the background, motivation and the semantics of the interface. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17devlink: Add generic packet traps and groupsIdo Schimmel
Add generic packet traps and groups that can report dropped packets as well as exceptions such as TTL error. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17devlink: Add packet trap infrastructureIdo Schimmel
Add the basic packet trap infrastructure that allows device drivers to register their supported packet traps and trap groups with devlink. Each driver is expected to provide basic information about each supported trap, such as name and ID, but also the supported metadata types that will accompany each packet trapped via the trap. The currently supported metadata type is just the input port, but more will be added in the future. For example, output port and traffic class. Trap groups allow users to set the action of all member traps. In addition, users can retrieve per-group statistics in case per-trap statistics are too narrow. In the future, the trap group object can be extended with more attributes, such as policer settings which will limit the amount of traffic generated by member traps towards the CPU. Beside registering their packet traps with devlink, drivers are also expected to report trapped packets to devlink along with relevant metadata. devlink will maintain packets and bytes statistics for each packet trap and will potentially report the trapped packet with its metadata to user space via drop monitor netlink channel. The interface towards the drivers is simple and allows devlink to set the action of the trap. Currently, only two actions are supported: 'trap' and 'drop'. When set to 'trap', the device is expected to provide the sole copy of the packet to the driver which will pass it to devlink. When set to 'drop', the device is expected to drop the packet and not send a copy to the driver. In the future, more actions can be added, such as 'mirror'. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17drop_monitor: Allow user to start monitoring hardware dropsIdo Schimmel
Drop monitor has start and stop commands, but so far these were only used to start and stop monitoring of software drops. Now that drop monitor can also monitor hardware drops, we should allow the user to control these as well. Do that by adding SW and HW flags to these commands. If no flag is specified, then only start / stop monitoring software drops. This is done in order to maintain backward-compatibility with existing user space applications. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17drop_monitor: Add support for summary alert mode for hardware dropsIdo Schimmel
In summary alert mode a notification is sent with a list of recent drop reasons and a count of how many packets were dropped due to this reason. To avoid expensive operations in the context in which packets are dropped, each CPU holds an array whose number of entries is the maximum number of drop reasons that can be encoded in the netlink notification. Each entry stores the drop reason and a count. When a packet is dropped the array is traversed and a new entry is created or the count of an existing entry is incremented. Later, in process context, the array is replaced with a newly allocated copy and the old array is encoded in a netlink notification. To avoid breaking user space, the notification includes the ancillary header, which is 'struct net_dm_alert_msg' with number of entries set to '0'. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17drop_monitor: Add support for packet alert mode for hardware dropsIdo Schimmel
In a similar fashion to software drops, extend drop monitor to send netlink events when packets are dropped by the underlying hardware. The main difference is that instead of encoding the program counter (PC) from which kfree_skb() was called in the netlink message, we encode the hardware trap name. The two are mostly equivalent since they should both help the user understand why the packet was dropped. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2019-08-17drop_monitor: Consider all monitoring states before performing configurationIdo Schimmel
The drop monitor configuration (e.g., alert mode) is global, but user will be able to enable monitoring of only software or hardware drops. Therefore, ensure that monitoring of both software and hardware drops are disabled before allowing drop monitor configuration to take place. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>