summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2020-07-10Merge tag 'mlx5-updates-2020-07-09' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2020-07-09 This series provides updates to mlx5 CT (connection tracking) offloads For more information please see tag log below. Please pull and let me know if there is any problem. The following conflict is expected when net is merged into net-next: to resolve just use the hunks from net-next. <<<<<<< HEAD (net-next) mlx5_tc_ct_del_ft_entry(ct_priv, entry); kfree(entry); ======= (net) mlx5_tc_ct_entry_del_rules(ct_priv, entry); kfree(entry); >>>>>>> b1a7d5bdfe54c98eca46e2c997d4e3b1484a49af mlx5 connection tracking offloads updates: 1) Restore CT state from lookup in zone instead of tupleid On a miss, Use this zone + 5 tuple taken from the skb, to lookup the CT entry and restore it, instead of the driver allocated tuple id. This improves flow insertion rate by avoiding the allocation of a header rewrite context to maintain the tupleid. 2) Re-use modify header HW objects for identical modify actions. 3) Expand tunnel register mappings Reg_c1 is 32 bits wide. Before this patchset, 24 bit were allocated for the tuple_id, 6 bits for tunnel mapping and 2 bits for tunnel options mappings. Restoring the ct state from zone lookup instead of tuple id requires reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel mappings. Expand tunnel and tunnel options register mappings to 12 bit each. 4) Trivial cleanup and fixes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfDavid S. Miller
Alexei Starovoitov says: ==================== pull-request: bpf 2020-07-09 The following pull-request contains BPF updates for your *net* tree. We've added 4 non-merge commits during the last 1 day(s) which contain a total of 4 files changed, 26 insertions(+), 15 deletions(-). The main changes are: 1) fix crash in libbpf on 32-bit archs, from Jakub and Andrii. 2) fix crash when l2tp and bpf_sk_reuseport conflict, from Martin. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge tag 'mlx5-fixes-2020-07-02' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2020-07-02 This series introduces some fixes to mlx5 driver. V1->v2: - Drop "ip -s" patch and mirred device hold reference patch. - Will revise them in a later submission. Please pull and let me know if there is any problem. For -stable v5.2 ('net/mlx5: Fix eeprom support for SFP module') For -stable v5.4 ('net/mlx5e: Fix 50G per lane indication') For -stable v5.5 ('net/mlx5e: Fix CPU mapping after function reload to avoid aRFS RX crash') ('net/mlx5e: Fix VXLAN configuration restore after function reload') For -stable v5.7 ('net/mlx5e: CT: Fix memory leak in cleanup') ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge branch 'udp_tunnel-add-NIC-RX-port-offload-infrastructure'David S. Miller
Jakub Kicinski says: ==================== udp_tunnel: add NIC RX port offload infrastructure Kernel has a facility to notify drivers about the UDP tunnel ports so that devices can recognize tunneled packets. This is important mostly for RX - devices which don't support CHECKSUM_COMPLETE can report checksums of inner packets, and compute RSS over inner headers. Some drivers also match the UDP tunnel ports also for TX, although doing so may lead to false positives and negatives. Unfortunately the user experience when trying to take adavantage of these facilities is suboptimal. First of all there is no way for users to check which ports are offloaded. Many drivers resort to printing messages to aid debugging, other use debugfs. Even worse the availability of the RX features (NETIF_F_RX_UDP_TUNNEL_PORT) is established purely on the basis of the driver having the ndos installed. For most drivers, however, the ability to perform offloads is contingent on device capabilities (driver support multiple device and firmware versions). Unless driver resorts to hackish clearing of features set incorrectly by the core - users are left guessing whether their device really supports UDP tunnel port offload or not. There is currently no way to indicate or configure whether RX features include just the checksum offload or checksum and using inner headers for RSS. Many drivers default to not using inner headers for RSS because most implementations populate the source port with entropy from the inner headers. This, however, is not always the case, for example certain switches are only able to use a fixed source port during encapsulation. We have also seen many driver authors get the intricacies of UDP tunnel port offloads wrong. Most commonly the drivers forget to perform reference counting, or take sleeping locks in the callbacks. This work tries to improve the situation by pulling the UDP tunnel port table maintenance out of the drivers. It turns out that almost all drivers maintain a fixed size table of ports (in most cases one per tunnel type), so we can take care of all the refcounting in the core, and let the driver specify if they need to sleep in the callbacks or not. The new common implementation will also support replacing ports - when a port is removed from a full table it will try to find a previously missing port to take its place. This patch only implements the core functionality along with a few drivers I was hoping to test manually [1] along with a test based on a netdevsim implementation. Following patches will convert all the drivers. Once that's complete we can remove the ndos, and rely directly on the new infrastrucutre. Then after RSS (RXFH) is converted to netlink we can add the ability to configure the use of inner RSS headers for UDP tunnels. [1] Unfortunately I wasn't able to, turns out 2 of the devices I had access to were older generation or had old FW, and they did not actually support UDP tunnel port notifications (see the second paragraph). The thrid device appears to program the UDP ports correctly but it generates bad UDP checksums with or without these patches. Long story short - I'd appreciate reviews and testing here.. v4: - better build fix (hopefully this one does it..) v3: - fix build issue; - improve bnxt changes. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10mlx4: convert to new udp_tunnel_nic infraJakub Kicinski
Convert to new infra, make use of the ability to sleep in the callback. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Tariq Toukan <tariqt@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10bnxt: convert to new udp_tunnel_nic infraJakub Kicinski
Convert to new infra, taking advantage of sleeping in callbacks. v2: - use bp->*_fw_dst_port_id != INVALID_HW_RING_ID as indication that the offload is active. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10ixgbe: convert to new udp_tunnel_nic infraJakub Kicinski
Make use of new common udp_tunnel_nic infra. ixgbe supports IPv4 only, and only single VxLAN and Geneve ports (one each). v2: - split out the RXCSUM feature handling to separate change; - declare structs separately; - use ti.type instead of assuming table 0 is VxLAN; - move setting netdev->udp_tunnel_nic_info to its own switch. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10ixgbe: don't clear UDP tunnel ports when RXCSUM is disabledJakub Kicinski
It appears the clearing of UDP tunnel ports when RXCSUM is disabled is unnecessary. Driver will not pay attention to checksum bits if RXCSUM is not set, so we can let the hardware parse the packets. Note that the UDP tunnel port NDO handlers don't pay attention to the state of RXCSUM, so the ports could had been re-programmed, anyway. This cleanup simplifies later conversion patch. v2: - break this out of the following patch. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10selftests: net: add a test for UDP tunnel info infraJakub Kicinski
Add validating the UDP tunnel infra works. $ ./udp_tunnel_nic.sh PASSED all 383 checks Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10netdevsim: add UDP tunnel port offload supportJakub Kicinski
Add UDP tunnel port handlers to our fake driver so we can test the core infra. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10ethtool: add tunnel info interfaceJakub Kicinski
Add an interface to report offloaded UDP ports via ethtool netlink. Now that core takes care of tracking which UDP tunnel ports the NICs are aware of we can quite easily export this information out to user space. The responsibility of writing the netlink dumps is split between ethtool code and udp_tunnel_nic.c - since udp_tunnel module may not always be loaded, yet we should always report the capabilities of the NIC. $ ethtool --show-tunnels eth0 Tunnel information for eth0: UDP port table 0: Size: 4 Types: vxlan No entries UDP port table 1: Size: 4 Types: geneve, vxlan-gpe Entries (1): port 1230, vxlan-gpe v4: - back to v2, build fix is now directly in udp_tunnel.h v3: - don't compile ETHTOOL_MSG_TUNNEL_INFO_GET in if CONFIG_INET not set. v2: - fix string set count, - reorder enums in the uAPI, - fix type of ETHTOOL_A_TUNNEL_UDP_TABLE_TYPES to bitset in docs and comments. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10udp_tunnel: add central NIC RX port offload infrastructureJakub Kicinski
Cater to devices which: (a) may want to sleep in the callbacks; (b) only have IPv4 support; (c) need all the programming to happen while the netdev is up. Drivers attach UDP tunnel offload info struct to their netdevs, where they declare how many UDP ports of various tunnel types they support. Core takes care of tracking which ports to offload. Use a fixed-size array since this matches what almost all drivers do, and avoids a complexity and uncertainty around memory allocations in an atomic context. Make sure that tunnel drivers don't try to replay the ports when new NIC netdev is registered. Automatic replays would mess up reference counting, and will be removed completely once all drivers are converted. v4: - use a #define NULL to avoid build issues with CONFIG_INET=n. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10udp_tunnel: re-number the offload tunnel typesJakub Kicinski
Make it possible to use tunnel types as flags more easily. There doesn't appear to be any user using the type as an array index, so this should make no difference. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10debugfs: make sure we can remove u32_array files cleanlyJakub Kicinski
debugfs_create_u32_array() allocates a small structure to wrap the data and size information about the array. If users ever try to remove the file this leads to a leak since nothing ever frees this wrapper. That said there are no upstream users of debugfs_create_u32_array() that'd remove a u32 array file (we only have one u32 array user in CMA), so there is no real bug here. Make callers pass a wrapper they allocated. This way the lifetime management of the wrapper is on the caller, and we can avoid the potential leak in debugfs. CC: Chucheng Luo <luochucheng@vivo.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: "Small update, a few more merge window bugs and normal driver bug fixes: - Two merge window regressions in mlx5: a error path bug found by syzkaller and some lost code during a rework preventing ipoib from working in some configurations - Silence clang compilation warning in OPA related code - Fix a long standing race condition in ib_nl for ACM - Resolve when the HFI1 is shutdown" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/mlx5: Set PD pointers for the error flow unwind IB/mlx5: Fix 50G per lane indication RDMA/siw: Fix reporting vendor_part_id IB/sa: Resolv use-after-free in ib_nl_make_request() IB/hfi1: Do not destroy link_wq when the device is shut down IB/hfi1: Do not destroy hfi1_wq when the device is shut down RDMA/mlx5: Fix legacy IPoIB QP initialization IB/hfi1: Add explicit cast OPA_MTU_8192 to 'enum ib_mtu'
2020-07-10Merge tag 'linux-kselftest-fixes-5.8-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest fixes from Shuah Khan: "TPM2 test changes to run on python3 and kselftest framework fix to incorrect return type" * tag 'linux-kselftest-fixes-5.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kselftest: ksft_test_num return type should be unsigned selftests: tpm: upgrade TPM2 tests from Python 2 to Python 3
2020-07-10Merge tag 'io_uring-5.8-2020-07-10' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: - Fix memleak for error path in registered files (Yang) - Export CQ overflow state in flags, necessary to fix a case where liburing doesn't know if it needs to enter the kernel (Xiaoguang) - Fix for a regression in when user memory is accounted freed, causing issues with back-to-back ring exit + init if the ulimit -l setting is very tight. * tag 'io_uring-5.8-2020-07-10' of git://git.kernel.dk/linux-block: io_uring: account user memory freed when exit has been queued io_uring: fix memleak in io_sqe_files_register() io_uring: fix memleak in __io_sqe_files_update() io_uring: export cq overflow status to userspace
2020-07-10Merge tag 'block-5.8-2020-07-10' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - Fix for inflight accounting, which affects only dm (Ming) - Fix documentation error for bfq (Yufen) - Fix memory leak for nbd (Zheng) * tag 'block-5.8-2020-07-10' of git://git.kernel.dk/linux-block: nbd: Fix memory leak in nbd_add_socket blk-mq: consider non-idle request as "inflight" in blk_mq_rq_inflight() docs: block: update and fix tiny error for bfq
2020-07-10Merge tag 'cleanup-kernel_read_write' of git://git.infradead.org/users/hch/miscLinus Torvalds
Pull in-kernel read and write op cleanups from Christoph Hellwig: "Cleanup in-kernel read and write operations Reshuffle the (__)kernel_read and (__)kernel_write helpers, and ensure all users of in-kernel file I/O use them if they don't use iov_iter based methods already. The new WARN_ONs in combination with syzcaller already found a missing input validation in 9p. The fix should be on your way through the maintainer ASAP". [ This is prep-work for the real changes coming 5.9 ] * tag 'cleanup-kernel_read_write' of git://git.infradead.org/users/hch/misc: fs: remove __vfs_read fs: implement kernel_read using __kernel_read integrity/ima: switch to using __kernel_read fs: add a __kernel_read helper fs: remove __vfs_write fs: implement kernel_write using __kernel_write fs: check FMODE_WRITE in __kernel_write fs: unexport __kernel_write bpfilter: switch to kernel_write autofs: switch to kernel_write cachefiles: switch to kernel_write
2020-07-10Merge tag 'dma-mapping-5.8-5' of git://git.infradead.org/users/hch/dma-mappingLinus Torvalds
Pull dma-mapping fixes from Christoph Hellwig: - add a warning when the atomic pool is depleted (David Rientjes) - protect the parameters of the new scatterlist helper macros (Marek Szyprowski ) * tag 'dma-mapping-5.8-5' of git://git.infradead.org/users/hch/dma-mapping: scatterlist: protect parameters of the sg_table related macros dma-mapping: warn when coherent pool is depleted
2020-07-10Merge tag 'pinctrl-v5.8-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl Pull pin control fixes from Linus Walleij: - Fix an issue in the AMD driver for the UART0 group - Fix a glitch issue in the Baytrail pin controller * tag 'pinctrl-v5.8-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl: pinctrl: baytrail: Fix pin being driven low for a while on gpiod_get(..., GPIOD_OUT_HIGH) pinctrl: amd: fix npins for uart0 in kerncz_groups
2020-07-10Merge tag 'gpio-v5.8-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fixes from Linus Walleij: "Some GPIO fixes, most of them for the PCA953x that Andy worked hard to fix up. - Fix two runtime PM errorpath problems in the Arizona GPIO driver. - Fix three interrupt issues in the PCA953x driver. - Fix the automatic address increment handling in the PCA953x driver again. - Add a quirk to the PCA953x that fixes a problem in the Intel Galileo Gen 2" * tag 'gpio-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: pca953x: Fix GPIO resource leak on Intel Galileo Gen 2 gpio: pca953x: disable regmap locking for automatic address incrementing gpio: pca953x: Fix direction setting when configure an IRQ gpio: pca953x: Override IRQ for one of the expanders on Galileo Gen 2 gpio: pca953x: Synchronize interrupt handler properly gpio: arizona: put pm_runtime in case of failure gpio: arizona: handle pm_runtime_get_sync failure case
2020-07-10Merge tag 'gfs2-v5.8-rc4.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 fixes from Andreas Gruenbacher: "Fix gfs2 readahead deadlocks by adding a IOCB_NOIO flag that allows gfs2 to use the generic fiel read iterator functions without having to worry about being called back while holding locks". * tag 'gfs2-v5.8-rc4.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: gfs2: Rework read and page fault locking fs: Add IOCB_NOIO flag for generic_file_read_iter
2020-07-10Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "An unfortunately large collection of arm64 fixes for -rc5. Some of this is absolutely trivial, but the alternatives, vDSO and CPU errata workaround fixes are significant. At least people are finding and fixing these things, I suppose. - Fix workaround for CPU erratum #1418040 to disable the compat vDSO - Fix Oops when single-stepping with KGDB - Fix memory attributes for hypervisor device mappings at EL2 - Fix memory leak in PSCI and remove useless variable assignment - Fix up some comments and asm labels in our entry code - Fix broken register table formatting in our generated html docs - Fix missing NULL sentinel in CPU errata workaround list - Fix patching of branches in alternative instruction sections" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/alternatives: don't patch up internal branches arm64: Add missing sentinel to erratum_1463225 arm64: Documentation: Fix broken table in generated HTML arm64: kgdb: Fix single-step exception handling oops arm64: entry: Tidy up block comments and label numbers arm64: Rework ARM_ERRATUM_1414080 handling arm64: arch_timer: Disable the compat vdso for cores affected by ARM64_WORKAROUND_1418040 arm64: arch_timer: Allow an workaround descriptor to disable compat vdso arm64: Introduce a way to disable the 32bit vdso arm64: entry: Fix the typo in the comment of el1_dbg() drivers/firmware/psci: Assign @err directly in hotplug_tests() drivers/firmware/psci: Fix memory leakage in alloc_init_cpu_groups() KVM: arm64: Fix definition of PAGE_HYP_DEVICE
2020-07-10Merge tag 's390-5.8-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Heiko Carstens: "This is mainly due to the fact that Gerald Schaefer's and also my old email addresses currently do not work any longer. Therefore we decided to switch to new email addresses and reflect that in the MAINTAINERS file. - Update email addresses in MAINTAINERS file and add .mailmap entries for Gerald Schaefer and Heiko Carstens. - Fix huge pte soft dirty copying" * tag 's390-5.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: MAINTAINERS: update email address for Gerald Schaefer MAINTAINERS: update email address for Heiko Carstens s390/mm: fix huge pte soft dirty copying
2020-07-10Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull vkm fixes from Paolo Bonzini: "Two simple but important bugfixes" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: MIPS: Fix build errors for 32bit kernel KVM: nVMX: fixes for preemption timer migration
2020-07-10Merge tag 'mmc-v5.8-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: - Override DLL_CONFIG only with valid values in sdhci-msm - Get rid of of_match_ptr() macro to fix warning in owl-mmc - Limit segments to 1 to fix meson-gx G12A/G12B SoCs * tag 'mmc-v5.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-msm: Override DLL_CONFIG only if the valid value is supplied mmc: owl-mmc: Get rid of of_match_ptr() macro mmc: meson-gx: limit segments to 1 when dram-access-quirk is needed
2020-07-10io_uring: account user memory freed when exit has been queuedJens Axboe
We currently account the memory after the exit work has been run, but that leaves a gap where a process has closed its ring and until the memory has been accounted as freed. If the memlocked ulimit is borderline, then that can introduce spurious setup errors returning -ENOMEM because the free work hasn't been run yet. Account this as freed when we close the ring, as not to expose a tiny gap where setting up a new ring can fail. Fixes: 85faa7b8346e ("io_uring: punt final io_ring_ctx wait-and-free to workqueue") Cc: stable@vger.kernel.org # v5.7 Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-10libbpf: Fix memory leak and optimize BTF sanitizationAndrii Nakryiko
Coverity's static analysis helpfully reported a memory leak introduced by 0f0e55d8247c ("libbpf: Improve BTF sanitization handling"). While fixing it, I realized that btf__new() already creates a memory copy, so there is no need to do this. So this patch also fixes misleading btf__new() signature to make data into a `const void *` input parameter. And it avoids unnecessary memory allocation and copy in BTF sanitization code altogether. Fixes: 0f0e55d8247c ("libbpf: Improve BTF sanitization handling") Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20200710011023.1655008-1-andriin@fb.com
2020-07-10io_uring: fix memleak in io_sqe_files_register()Yang Yingliang
I got a memleak report when doing some fuzz test: BUG: memory leak unreferenced object 0x607eeac06e78 (size 8): comm "test", pid 295, jiffies 4294735835 (age 31.745s) hex dump (first 8 bytes): 00 00 00 00 00 00 00 00 ........ backtrace: [<00000000932632e6>] percpu_ref_init+0x2a/0x1b0 [<0000000092ddb796>] __io_uring_register+0x111d/0x22a0 [<00000000eadd6c77>] __x64_sys_io_uring_register+0x17b/0x480 [<00000000591b89a6>] do_syscall_64+0x56/0xa0 [<00000000864a281d>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Call percpu_ref_exit() on error path to avoid refcount memleak. Fixes: 05f3fb3c5397 ("io_uring: avoid ring quiesce for fixed file set unregister and update") Cc: stable@vger.kernel.org Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-10MAINTAINERS: update email address for Gerald SchaeferGerald Schaefer
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-10MAINTAINERS: update email address for Heiko CarstensHeiko Carstens
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2020-07-10KVM: MIPS: Fix build errors for 32bit kernelHuacai Chen
Commit dc6d95b153e78ed70b1b2c04a ("KVM: MIPS: Add more MMIO load/store instructions emulation") introduced some 64bit load/store instructions emulation which are unavailable on 32bit platform, and it causes build errors: arch/mips/kvm/emulate.c: In function 'kvm_mips_emulate_store': arch/mips/kvm/emulate.c:1734:6: error: right shift count >= width of type [-Werror] ((vcpu->arch.gprs[rt] >> 56) & 0xff); ^ arch/mips/kvm/emulate.c:1738:6: error: right shift count >= width of type [-Werror] ((vcpu->arch.gprs[rt] >> 48) & 0xffff); ^ arch/mips/kvm/emulate.c:1742:6: error: right shift count >= width of type [-Werror] ((vcpu->arch.gprs[rt] >> 40) & 0xffffff); ^ arch/mips/kvm/emulate.c:1746:6: error: right shift count >= width of type [-Werror] ((vcpu->arch.gprs[rt] >> 32) & 0xffffffff); ^ arch/mips/kvm/emulate.c:1796:6: error: left shift count >= width of type [-Werror] (vcpu->arch.gprs[rt] << 32); ^ arch/mips/kvm/emulate.c:1800:6: error: left shift count >= width of type [-Werror] (vcpu->arch.gprs[rt] << 40); ^ arch/mips/kvm/emulate.c:1804:6: error: left shift count >= width of type [-Werror] (vcpu->arch.gprs[rt] << 48); ^ arch/mips/kvm/emulate.c:1808:6: error: left shift count >= width of type [-Werror] (vcpu->arch.gprs[rt] << 56); ^ cc1: all warnings being treated as errors make[3]: *** [arch/mips/kvm/emulate.o] Error 1 So, use #if defined(CONFIG_64BIT) && defined(CONFIG_KVM_MIPS_VZ) to guard the 64bit load/store instructions emulation. Reported-by: kernel test robot <lkp@intel.com> Fixes: dc6d95b153e78ed70b1b2c04a ("KVM: MIPS: Add more MMIO load/store instructions emulation") Signed-off-by: Huacai Chen <chenhc@lemote.com> Message-Id: <1594365797-536-1-git-send-email-chenhc@lemote.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10KVM: nVMX: fixes for preemption timer migrationPaolo Bonzini
Commit 850448f35aaf ("KVM: nVMX: Fix VMX preemption timer migration", 2020-06-01) accidentally broke nVMX live migration from older version by changing the userspace ABI. Restore it and, while at it, ensure that vmx->nested.has_preemption_timer_deadline is always initialized according to the KVM_STATE_VMX_PREEMPTION_TIMER_DEADLINE flag. Cc: Makarand Sonare <makarandsonare@google.com> Fixes: 850448f35aaf ("KVM: nVMX: Fix VMX preemption timer migration") Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-09net/mlx5e: CT: Fix releasing ft entriesRoi Dayan
Before this commit, on ft flush, ft entries were not removed from the ct_tuple hashtables. Fix it. Fixes: ac991b48d43c ("net/mlx5e: CT: Offload established flows") Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Eli Britstein <elibr@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Remove unused function paramSaeed Mahameed
"flow" parameter is not used in __mlx5_tc_ct_flow_offload_clear(), remove it. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-07-09net/mlx5e: CT: Return err_ptr from internal functionsSaeed Mahameed
Instead of having to deal with converting between int and ERR_PTR for return values in mlx5_tc_ct_flow_offload(), make the internal helper functions return a ptr to mlx5_flow_handle instead of passing it as output param, this will also avoid gcc confusion and false alarms, thus we remove the redundant ERR_PTR rule initialization. Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Suggested-by: Jason Gunthorpe <jgg@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com>
2020-07-09net/mlx5e: CT: Expand tunnel register mappingsPaul Blakey
Reg_c1 is 32 bits wide. Originally, 24 bit were allocated for the tuple_id, 6 bits for tunnel mapping and 2 bits for tunnel options mappings. Restoring the ct state from zone lookup instead of tuple id requires reg_c1 to store 8 bits mapping the ct zone, leaving 24 bits for tunnel mappings. Expand tunnel and tunnel options register mappings to 12 bit each. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Use mapping for zone restore registerPaul Blakey
Use a single byte mapping for zone restore register (zone matching remains 16 bit). This makes room for using the freed 8 bits on register C1 for mapping more tunnels and tunnel options. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Re-use tuple modify headers for identical modify actionsPaul Blakey
After removing the tupleid register which changed per tuple, tuple modify headers set the ct_state, zone, mark, and label registers. For non-natted tuples going through the same tc rules path, their values will be the same, and all their modify headers will be the same. Re-use tuple modify header when possible, by adding each new modify header to an hahstable, and looking up identical ones before creating a new one. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: Export sharing of mod headers to a new filePaul Blakey
Refactor sharing of mod headers to new file and while there, remove spin lock and flows list, as this is only used for warn on. Use the generic API in the next patch to re-use tuple modify headers for identical modify actions, Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Restore ct state from lookup in zone instead of tupleidPaul Blakey
Remove tupleid, and replace it with zone_restore, which is the zone an established tuple sets after match. On miss, Use this zone + tuple taken from the skb, to lookup the ct entry and restore it. This improves flow insertion rate by avoiding the allocation of a header rewrite context. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Don't offload tuple rewrites for established tuplesPaul Blakey
Next patches will remove the tupleid registers that is used to restore the ct state on miss, and instead use the tuple on the missed packet to lookup which state to restore. Disable tuple rewrites after connection tracking. For tuple rewrites, inject a ct_state=-trk match so it won't change the tuple for established flows (+trk) that passed connection tracking, and instead miss to software. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: Use netdev_info instead of pr_infoOz Shlomo
The next patch will pass the mlx5e_priv struct to the modify_header_match_supported method. Use this opportunity to refactor the existing pr_info call to a netdev_info call. Signed-off-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Allow header rewrite of 5-tuple and ct clear actionPaul Blakey
With ct clear we don't jump to the ct tables, so header rewrite of 5-tuple can be done in place (and not moved to after the CT action). Check for ct clear action, and if so, allow 5-tuple header rewrite. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: CT: Save ct entries tuples in hashtablesPaul Blakey
Save original tuple and natted tuple in two new hashtables. This is a pre-step for restoring ct state after hw miss by performing a 5-tuple lookup on the hash tables. Signed-off-by: Paul Blakey <paulb@mellanox.com> Reviewed-by: Oz Shlomo <ozsh@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5: E-switch, When eswitch is unsupported, return -EOPNOTSUPPParav Pandit
When eswitch is unsupported, currently -EPERM error code is returned instead of -EOPNOTSUPP. Due to this VF device's devlink virtual port is not enumerated because port_function_get() callback returned -EPERM instead of -EOPNOTSUPP. Hence, return the error code -EOPNOTSUPP when eswitch is unsupported. Fixes: bd93975353d5 ("net/mlx5: E-switch, Introduce and use eswitch support check helper") Signed-off-by: Parav Pandit <parav@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09libbpf: Fix libbpf hashmap on (I)LP32 architecturesJakub Bogusz
On ILP32, 64-bit result was shifted by value calculated for 32-bit long type and returned value was much outside hashmap capacity. As advised by Andrii Nakryiko, this patch uses different hashing variant for architectures with size_t shorter than long long. Fixes: e3b924224028 ("libbpf: add resizable non-thread safe internal hashmap") Signed-off-by: Jakub Bogusz <qboosh@pld-linux.org> Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200709225723.1069937-1-andriin@fb.com
2020-07-09net/mlx5e: CT: Fix memory leak in cleanupEli Britstein
CT entries are deleted via a workqueue from netfilter. If removing the module before that, the rules are cleaned by the driver itself, but the memory entries for them are not freed. Fix that. Fixes: ac991b48d43c ("net/mlx5e: CT: Offload established flows") Signed-off-by: Eli Britstein <elibr@mellanox.com> Reviewed-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
2020-07-09net/mlx5e: Fix port buffers cell size valueEran Ben Elisha
Device unit for port buffers size, xoff_threshold and xon_threshold is cells. Fix a bug in driver where cell unit size was hard-coded to 128 bytes. This hard-coded value is buggy, as it is wrong for some hardware versions. Driver to read cell size from SBCAM register and translate bytes to cell units accordingly. In order to fix the bug, this patch exposes SBCAM (Shared buffer capabilities mask) layout and defines. If SBCAM.cap_cell_size is valid, use it for all bytes to cells calculations. If not valid, fallback to 128. Cell size do not change on the fly per device. Instead of issuing SBCAM access reg command every time such translation is needed, cache it in mlx5e_dcbx as part of mlx5e_dcbnl_initialize(). Pass dcbx.port_buff_cell_sz as a param to every function that needs bytes to cells translation. While fixing the bug, move MLX5E_BUFFER_CELL_SHIFT macro to en_dcbnl.c, as it is only used by that file. Fixes: 0696d60853d5 ("net/mlx5e: Receive buffer configuration") Signed-off-by: Eran Ben Elisha <eranbe@mellanox.com> Reviewed-by: Huy Nguyen <huyn@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>