summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-05-04tcp: dynamically allocate the perturb table used by source portsWilly Tarreau
We'll need to further increase the size of this table and it's likely that at some point its size will not be suitable anymore for a static table. Let's allocate it on boot from inet_hashinfo2_init(), which is called from tcp_init(). Cc: Moshe Kol <moshe.kol@mail.huji.ac.il> Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il> Cc: Amit Klein <aksecurity@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04tcp: add small random increments to the source portWilly Tarreau
Here we're randomly adding between 0 and 7 random increments to the selected source port in order to add some noise in the source port selection that will make the next port less predictable. With the default port range of 32768-60999 this means a worst case reuse scenario of 14116/8=1764 connections between two consecutive uses of the same port, with an average of 14116/4.5=3137. This code was stressed at more than 800000 connections per second to a fixed target with all connections closed by the client using RSTs (worst condition) and only 2 connections failed among 13 billion, despite the hash being reseeded every 10 seconds, indicating a perfectly safe situation. Cc: Moshe Kol <moshe.kol@mail.huji.ac.il> Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il> Cc: Amit Klein <aksecurity@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04tcp: resalt the secret every 10 secondsEric Dumazet
In order to limit the ability for an observer to recognize the source ports sequence used to contact a set of destinations, we should periodically shuffle the secret. 10 seconds looks effective enough without causing particular issues. Cc: Moshe Kol <moshe.kol@mail.huji.ac.il> Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il> Cc: Amit Klein <aksecurity@gmail.com> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Tested-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04tcp: use different parts of the port_offset for index and offsetWilly Tarreau
Amit Klein suggests that we use different parts of port_offset for the table's index and the port offset so that there is no direct relation between them. Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Moshe Kol <moshe.kol@mail.huji.ac.il> Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il> Cc: Amit Klein <aksecurity@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04secure_seq: use the 64 bits of the siphash for port offset calculationWilly Tarreau
SipHash replaced MD5 in secure_ipv{4,6}_port_ephemeral() via commit 7cd23e5300c1 ("secure_seq: use SipHash in place of MD5"), but the output remained truncated to 32-bit only. In order to exploit more bits from the hash, let's make the functions return the full 64-bit of siphash_3u32(). We also make sure the port offset calculation in __inet_hash_connect() remains done on 32-bit to avoid the need for div_u64_rem() and an extra cost on 32-bit systems. Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Moshe Kol <moshe.kol@mail.huji.ac.il> Cc: Yossi Gilad <yossi.gilad@mail.huji.ac.il> Cc: Amit Klein <aksecurity@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Willy Tarreau <w@1wt.eu> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04Merge branch 'wireguard-patches-for-5-18-rc6'Jakub Kicinski
Jason A. Donenfeld says: ==================== wireguard patches for 5.18-rc6 In working on some other problems, I wound up leaning on the WireGuard CI more than usual and uncovered a few small issues with reliability. These are fairly low key changes, since they don't impact kernel code itself. One change does stick out in particular, though, which is the "make routing loop test non-fatal" commit. I'm not thrilled about doing this, but currently [1] remains unsolved, and I'm still working on a real solution to that (hopefully for 5.19 or 5.20 if I can come up with a good idea...), so for now that test just prints a big red warning instead. [1] https://lore.kernel.org/netdev/YmszSXueTxYOC41G@zx2c4.com/ ==================== Link: https://lore.kernel.org/r/20220504202920.72908-1-Jason@zx2c4.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04wireguard: selftests: set panic_on_warn=1 from cmdlineJason A. Donenfeld
Rather than setting this once init is running, set panic_on_warn from the kernel command line, so that it catches splats from WireGuard initialization code and the various crypto selftests. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04wireguard: selftests: bump package depsJason A. Donenfeld
Use newer, more reliable package dependencies. These should hopefully reduce flakes. However, we keep the old iputils package, as it accumulated bugs after resulting in flakes on slow machines. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04wireguard: selftests: restore support for ccacheJason A. Donenfeld
When moving to non-system toolchains, we inadvertantly killed the ability to use ccache. So instead, build ccache support into the test harness directly. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04wireguard: selftests: use newer toolchains to fill out architecturesJason A. Donenfeld
Rather than relying on the system to have cross toolchains available, simply download musl.cc's ones and use that libc.so, and then we use it to fill in a few missing platforms, such as riscv64, riscv64, powerpc64, and s390x. Since riscv doesn't have a second serial port in its device description, we have to use virtio's vport. This is actually the same situation on ARM, but we were previously hacking QEMU up to work around this, which required a custom QEMU. Instead just do the vport trick on ARM too. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04wireguard: selftests: limit parallelism to $(nproc) tests at onceJason A. Donenfeld
The parallel tests were added to catch queueing issues from multiple cores. But what happens in reality when testing tons of processes is that these separate threads wind up fighting with the scheduler, and we wind up with contention in places we don't care about that decrease the chances of hitting a bug. So just do a test with the number of CPU cores, rather than trying to scale up arbitrarily. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04wireguard: selftests: make routing loop test non-fatalJason A. Donenfeld
I hate to do this, but I still do not have a good solution to actually fix this bug across architectures. So just disable it for now, so that the CI can still deliver actionable results. This commit adds a large red warning, so that at least the failure isn't lost forever, and hopefully this can be revisited down the line. Link: https://lore.kernel.org/netdev/CAHmME9pv1x6C4TNdL6648HydD8r+txpV4hTUXOBVkrapBXH4QQ@mail.gmail.com/ Link: https://lore.kernel.org/netdev/YmszSXueTxYOC41G@zx2c4.com/ Link: https://lore.kernel.org/wireguard/CAHmME9rNnBiNvBstb7MPwK-7AmAN0sOfnhdR=eeLrowWcKxaaQ@mail.gmail.com/ Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-04RDMA/rxe: Change mcg_lock to a _bh lockBob Pearson
rxe_mcast.c currently uses _irqsave spinlocks for rxe->mcg_lock while rxe_recv.c uses _bh spinlocks for the same lock. As there is no case where the mcg_lock can be taken from an IRQ, change these all to bh locks so we don't have confusing mismatched lock types on the same spinlock. Fixes: 6090a0c4c7c6 ("RDMA/rxe: Cleanup rxe_mcast.c") Link: https://lore.kernel.org/r/20220504202817.98247-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-04RDMA/rxe: Do not call dev_mc_add/del() under a spinlockBob Pearson
These routines were not intended to be called under a spinlock and will throw debugging warnings: raw_local_irq_restore() called with IRQs enabled WARNING: CPU: 13 PID: 3107 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x2f/0x50 CPU: 13 PID: 3107 Comm: python3 Tainted: G E 5.18.0-rc1+ #7 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 RIP: 0010:warn_bogus_irq_restore+0x2f/0x50 Call Trace: <TASK> _raw_spin_unlock_irqrestore+0x75/0x80 rxe_attach_mcast+0x304/0x480 [rdma_rxe] ib_attach_mcast+0x88/0xa0 [ib_core] ib_uverbs_attach_mcast+0x186/0x1e0 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xcd/0x140 [ib_uverbs] ib_uverbs_cmd_verbs+0xdb0/0xea0 [ib_uverbs] ib_uverbs_ioctl+0xd2/0x160 [ib_uverbs] do_syscall_64+0x5c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Move them out of the spinlock, it is OK if there is some races setting up the MC reception at the ethernet layer with rbtree lookups. Fixes: 6090a0c4c7c6 ("RDMA/rxe: Cleanup rxe_mcast.c") Link: https://lore.kernel.org/r/20220504202817.98247-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-04RDMA/siw: Fix a condition race issue in MPA request processingCheng Xu
The calling of siw_cm_upcall and detaching new_cep with its listen_cep should be atomistic semantics. Otherwise siw_reject may be called in a temporary state, e,g, siw_cm_upcall is called but the new_cep->listen_cep has not being cleared. This fixes a WARN: WARNING: CPU: 7 PID: 201 at drivers/infiniband/sw/siw/siw_cm.c:255 siw_cep_put+0x125/0x130 [siw] CPU: 2 PID: 201 Comm: kworker/u16:22 Kdump: loaded Tainted: G E 5.17.0-rc7 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: iw_cm_wq cm_work_handler [iw_cm] RIP: 0010:siw_cep_put+0x125/0x130 [siw] Call Trace: <TASK> siw_reject+0xac/0x180 [siw] iw_cm_reject+0x68/0xc0 [iw_cm] cm_work_handler+0x59d/0xe20 [iw_cm] process_one_work+0x1e2/0x3b0 worker_thread+0x50/0x3a0 ? rescuer_thread+0x390/0x390 kthread+0xe5/0x110 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x1f/0x30 </TASK> Fixes: 6c52fdc244b5 ("rdma/siw: connection management") Link: https://lore.kernel.org/r/d528d83466c44687f3872eadcb8c184528b2e2d4.1650526554.git.chengyou@linux.alibaba.com Reported-by: Luis Chamberlain <mcgrof@kernel.org> Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-05-04dt-bindings: pci: apple,pcie: Drop max-link-speed from exampleHector Martin
We no longer use these since 111659c2a570 (and they never worked anyway); drop them from the example to avoid confusion. Fixes: 111659c2a570 ("arm64: dts: apple: t8103: Remove PCIe max-link-speed properties") Signed-off-by: Hector Martin <marcan@marcan.st> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220502091308.28233-1-marcan@marcan.st
2022-05-04dt-bindings: Drop redundant 'maxItems/minItems' in if/then schemasRob Herring
Another round of removing redundant minItems/maxItems when 'items' list is specified. This time it is in if/then schemas as the meta-schema was failing to check this case. If a property has an 'items' list, then a 'minItems' or 'maxItems' with the same size as the list is redundant and can be dropped. Note that is DT schema specific behavior and not standard json-schema behavior. The tooling will fixup the final schema adding any unspecified minItems/maxItems. Signed-off-by: Rob Herring <robh@kernel.org> Acked-By: Vinod Koul <vkoul@kernel.org> Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> Acked-by: Mark Brown <broonie@kernel.org> Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # For MMC Acked-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> #for IIO Link: https://lore.kernel.org/r/20220503162738.3827041-1-robh@kernel.org
2022-05-04dt-bindings: pinctrl: Allow values for drive-push-pull and drive-open-drainRob Herring
A few platforms, at91 and tegra, use drive-push-pull and drive-open-drain with a 0 or 1 value. There's not really a need for values as '1' should be equivalent to no value (it wasn't treated that way) and drive-push-pull disabled is equivalent to drive-open-drain. So dropping the value can't be done without breaking existing OSs. As we don't want new cases, mark the case with values as deprecated. Cc: Arnd Bergmann <arnd@arndb.de> Cc: Thierry Reding <thierry.reding@gmail.com> Cc: Jonathan Hunter <jonathanh@nvidia.com> Cc: Nicolas Ferre <nicolas.ferre@microchip.com> Cc: Claudiu Beznea <claudiu.beznea@microchip.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20220429194610.2741437-1-robh@kernel.org
2022-05-04Merge tag 'iomm-fixes-v5.18-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu Pull iommu fixes from Joerg Roedel: "IOMMU core: - Fix for a regression which could cause NULL-ptr dereferences Arm SMMU: - Fix off-by-one in SMMUv3 SVA TLB invalidation - Disable large mappings to workaround nvidia erratum Intel VT-d: - Handle PCI stop marker messages in IOMMU driver to meet the requirement of I/O page fault handling framework. - Calculate a feasible mask for non-aligned page-selective IOTLB invalidation. Apple DART IOMMU: - Fix potential NULL-ptr dereference - Set module owner" * tag 'iomm-fixes-v5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: iommu: Make sysfs robust for non-API groups iommu/dart: Add missing module owner to ops structure iommu/dart: check return value after calling platform_get_resource() iommu/vt-d: Drop stop marker messages iommu/vt-d: Calculate mask for non-aligned flushes iommu: arm-smmu: disable large page mappings for Nvidia arm-smmu iommu/arm-smmu-v3: Fix size calculation in arm_smmu_mm_invalidate_range()
2022-05-04Merge tag 'for-linus-5.17-2' of https://github.com/cminyard/linux-ipmiLinus Torvalds
Pull IPMI fixes from Corey Minyard: "Fix some issues that were reported. This has been in for-next for a bit (longer than the times would indicate, I had to rebase to add some text to the headers) and these are fixes that need to go in" * tag 'for-linus-5.17-2' of https://github.com/cminyard/linux-ipmi: ipmi:ipmi_ipmb: Fix null-ptr-deref in ipmi_unregister_smi() ipmi: When handling send message responses, don't process the message
2022-05-04drm/amd/display: Avoid reading audio pattern past AUDIO_CHANNELS_COUNTHarry Wentland
A faulty receiver might report an erroneous channel count. We should guard against reading beyond AUDIO_CHANNELS_COUNT as that would overflow the dpcd_pattern_period array. Signed-off-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-05-04drm/amdgpu: do not use passthrough mode in Xen dom0Marek Marczykowski-Górecki
While technically Xen dom0 is a virtual machine too, it does have access to most of the hardware so it doesn't need to be considered a "passthrough". Commit b818a5d37454 ("drm/amdgpu/gmc: use PCI BARs for APUs in passthrough") changed how FB is accessed based on passthrough mode. This breaks amdgpu in Xen dom0 with message like this: [drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3 While the reason for this failure is unclear, the passthrough mode is not really necessary in Xen dom0 anyway. So, to unbreak booting affected kernels, disable passthrough mode in this case. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/1985 Fixes: b818a5d37454 ("drm/amdgpu/gmc: use PCI BARs for APUs in passthrough") Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2022-05-04iommu: Make sysfs robust for non-API groupsRobin Murphy
Groups created by VFIO backends outside the core IOMMU API should never be passed directly into the API itself, however they still expose their standard sysfs attributes, so we can still stumble across them that way. Take care to consider those cases before jumping into our normal assumptions of a fully-initialised core API group. Fixes: 3f6634d997db ("iommu: Use right way to retrieve iommu_ops") Reported-by: Jan Stancek <jstancek@redhat.com> Tested-by: Jan Stancek <jstancek@redhat.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/86ada41986988511a8424e84746dfe9ba7f87573.1651667683.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04mmc: sdhci-msm: Reset GCC_SDCC_BCR register for SDHCShaik Sajida Bhanu
Reset GCC_SDCC_BCR register before every fresh initilazation. This will reset whole SDHC-msm controller, clears the previous power control states and avoids, software reset timeout issues as below. [ 5.458061][ T262] mmc1: Reset 0x1 never completed. [ 5.462454][ T262] mmc1: sdhci: ============ SDHCI REGISTER DUMP =========== [ 5.469065][ T262] mmc1: sdhci: Sys addr: 0x00000000 | Version: 0x00007202 [ 5.475688][ T262] mmc1: sdhci: Blk size: 0x00000000 | Blk cnt: 0x00000000 [ 5.482315][ T262] mmc1: sdhci: Argument: 0x00000000 | Trn mode: 0x00000000 [ 5.488927][ T262] mmc1: sdhci: Present: 0x01f800f0 | Host ctl: 0x00000000 [ 5.495539][ T262] mmc1: sdhci: Power: 0x00000000 | Blk gap: 0x00000000 [ 5.502162][ T262] mmc1: sdhci: Wake-up: 0x00000000 | Clock: 0x00000003 [ 5.508768][ T262] mmc1: sdhci: Timeout: 0x00000000 | Int stat: 0x00000000 [ 5.515381][ T262] mmc1: sdhci: Int enab: 0x00000000 | Sig enab: 0x00000000 [ 5.521996][ T262] mmc1: sdhci: ACmd stat: 0x00000000 | Slot int: 0x00000000 [ 5.528607][ T262] mmc1: sdhci: Caps: 0x362dc8b2 | Caps_1: 0x0000808f [ 5.535227][ T262] mmc1: sdhci: Cmd: 0x00000000 | Max curr: 0x00000000 [ 5.541841][ T262] mmc1: sdhci: Resp[0]: 0x00000000 | Resp[1]: 0x00000000 [ 5.548454][ T262] mmc1: sdhci: Resp[2]: 0x00000000 | Resp[3]: 0x00000000 [ 5.555079][ T262] mmc1: sdhci: Host ctl2: 0x00000000 [ 5.559651][ T262] mmc1: sdhci_msm: ----------- VENDOR REGISTER DUMP----------- [ 5.566621][ T262] mmc1: sdhci_msm: DLL sts: 0x00000000 | DLL cfg: 0x6000642c | DLL cfg2: 0x0020a000 [ 5.575465][ T262] mmc1: sdhci_msm: DLL cfg3: 0x00000000 | DLL usr ctl: 0x00010800 | DDR cfg: 0x80040873 [ 5.584658][ T262] mmc1: sdhci_msm: Vndr func: 0x00018a9c | Vndr func2 : 0xf88218a8 Vndr func3: 0x02626040 Fixes: 0eb0d9f4de34 ("mmc: sdhci-msm: Initial support for Qualcomm chipsets") Signed-off-by: Shaik Sajida Bhanu <quic_c_sbhanu@quicinc.com> Acked-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Tested-by: Konrad Dybcio <konrad.dybcio@somainline.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1650816153-23797-1-git-send-email-quic_c_sbhanu@quicinc.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-05-04Merge tag 'mlx5-fixes-2022-05-03' of git://git.kernel.org/pub/scm/linux/kernel/gDavid S. Miller
it/saeed/linux Saeed Mahameed says: ==================== mlx5 fixes 2022-05-03 This series provides bug fixes to mlx5 driver. Please pull and let me know if there is any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04mmc: sunxi-mmc: Fix DMA descriptors allocated above 32 bitsSamuel Holland
Newer variants of the MMC controller support a 34-bit physical address space by using word addresses instead of byte addresses. However, the code truncates the DMA descriptor address to 32 bits before applying the shift. This breaks DMA for descriptors allocated above the 32-bit limit. Fixes: 3536b82e5853 ("mmc: sunxi: add support for A100 mmc controller") Signed-off-by: Samuel Holland <samuel@sholland.org> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Jernej Skrabec <jernej.skrabec@gmail.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220424231751.32053-1-samuel@sholland.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
2022-05-04iommu/dart: Add missing module owner to ops structureHector Martin
This is required to make loading this as a module work. Signed-off-by: Hector Martin <marcan@marcan.st> Fixes: 46d1fb072e76 ("iommu/dart: Add DART iommu driver") Reviewed-by: Sven Peter <sven@svenpeter.dev> Link: https://lore.kernel.org/r/20220502092238.30486-1-marcan@marcan.st Signed-off-by: Joerg Roedel <jroedel@suse.de>
2022-05-04drm/bridge: ite-it6505: add missing Kconfig option selectFabien Parent
The IT6505 is using functions provided by the DRM_DP_HELPER driver. In order to avoid having the bridge enabled but the helper disabled, let's add a select in order to be sure that the DP helper functions are always available. Fixes: b5c84a9edcd4 ("drm/bridge: add it6505 driver") Signed-off-by: Fabien Parent <fparent@baylibre.com> Reviewed-by: Neil Armstrong <narmstrong@baylibre.com> Signed-off-by: Neil Armstrong <narmstrong@baylibre.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220426141536.274727-1-fparent@baylibre.com
2022-05-04net/mlx5: Fix matching on inner TTCMark Bloch
The cited commits didn't use proper matching on inner TTC as a result distribution of encapsulated packets wasn't symmetric between the physical ports. Fixes: 4c71ce50d2fe ("net/mlx5: Support partial TTC rules") Fixes: 8e25a2bc6687 ("net/mlx5: Lag, add support to create TTC tables for LAG port selection") Signed-off-by: Mark Bloch <mbloch@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5: Avoid double clear or set of sync reset requestedMoshe Shemesh
Double clear of reset requested state can lead to NULL pointer as it will try to delete the timer twice. This can happen for example on a race between abort from FW and pci error or reset. Avoid such case using test_and_clear_bit() to verify only one time reset requested state clear flow. Similarly use test_and_set_bit() to verify only one time reset requested state set flow. Fixes: 7dd6df329d4c ("net/mlx5: Handle sync reset abort event") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Reviewed-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5: Fix deadlock in sync reset flowMoshe Shemesh
The sync reset flow can lead to the following deadlock when poll_sync_reset() is called by timer softirq and waiting on del_timer_sync() for the same timer. Fix that by moving the part of the flow that waits for the timer to reset_reload_work. It fixes the following kernel Trace: RIP: 0010:del_timer_sync+0x32/0x40 ... Call Trace: <IRQ> mlx5_sync_reset_clear_reset_requested+0x26/0x50 [mlx5_core] poll_sync_reset.cold+0x36/0x52 [mlx5_core] call_timer_fn+0x32/0x130 __run_timers.part.0+0x180/0x280 ? tick_sched_handle+0x33/0x60 ? tick_sched_timer+0x3d/0x80 ? ktime_get+0x3e/0xa0 run_timer_softirq+0x2a/0x50 __do_softirq+0xe1/0x2d6 ? hrtimer_interrupt+0x136/0x220 irq_exit+0xae/0xb0 smp_apic_timer_interrupt+0x7b/0x140 apic_timer_interrupt+0xf/0x20 </IRQ> Fixes: 3c5193a87b0f ("net/mlx5: Use del_timer_sync in fw reset flow of halting poll") Signed-off-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Maher Sanalla <msanalla@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5e: Fix trust state reset in reloadMoshe Tal
Setting dscp2prio during the driver reload can cause dcb ieee app list to be not empty after the reload finish and as a result to a conflict between the priority trust state reported by the app and the state in the device register. Reset the dcb ieee app list on initialization in case this is conflicting with the register status. Fixes: 2a5e7a1344f4 ("net/mlx5e: Add dcbnl dscp to priority support") Signed-off-by: Moshe Tal <moshet@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5e: Avoid checking offload capability in post_parse actionAriel Levkovich
During TC action parsing, the can_offload callback is called before calling the action's main parsing callback. Later on, the can_offload callback is called again before handling the action's post_parse callback if exists. Since the main parsing callback might have changed and set parsing params for the rule, following can_offload checks might fail because some parsing params were already set. Specifically, the ct action main parsing sets the ct param in the parsing status structure and when the second can_offload for ct action is called, before handling the ct post parsing, it will return an error since it checks this ct param to indicate multiple ct actions which are not supported. Therefore, the can_offload call is removed from the post parsing handling to prevent such cases. This is allowed since the first can_offload call will ensure that the action can be offloaded and the fact the code reached the post parsing handling already means that the action can be offloaded. Fixes: 8300f225268b ("net/mlx5e: Create new flow attr for multi table actions") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5e: CT: Fix queued up restore put() executing after relevant ft releasePaul Blakey
__mlx5_tc_ct_entry_put() queues release of tuple related to some ct FT, if that is the last reference to that tuple, the actual deletion of the tuple can happen after the FT is already destroyed and freed. Flush the used workqueue before destroying the ct FT. Fixes: a2173131526d ("net/mlx5e: CT: manage the lifetime of the ct entry object") Reviewed-by: Oz Shlomo <ozsh@nvidia.com> Signed-off-by: Paul Blakey <paulb@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5e: TC, fix decap fallback to uplink when int port not supportedAriel Levkovich
When resolving the decap route device for a tunnel decap rule, the result may be an OVS internal port device. Prior to adding the support for internal port offload, such case would result in using the uplink as the default decap route device which allowed devices that can't support internal port offload to offload this decap rule. This behavior got broken by adding the internal port offload which will fail in case the device can't support internal port offload. To restore the old behavior, use the uplink device as the decap route as before when internal port offload is not supported. Fixes: b16eb3c81fe2 ("net/mlx5: Support internal port as decap route device") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5e: TC, Fix ct_clear overwriting ct action metadataAriel Levkovich
ct_clear action is translated to clearing reg_c metadata which holds ct state and zone information using mod header actions. These actions are allocated during the actions parsing, as part of the flow attributes main mod header action list. If ct action exists in the rule, the flow's main mod header is used only in the post action table rule, after the ct tables which set the ct info in the reg_c as part of the ct actions. Therefore, if the original rule has a ct_clear action followed by a ct action, the ct action reg_c setting will be done first and will be followed by the ct_clear resetting reg_c and overwriting the ct info. Fix this by moving the ct_clear mod header actions allocation from the ct action parsing stage to the ct action post parsing stage where it is already known if ct_clear is followed by a ct action. In such case, we skip the mod header actions allocation for the ct clear since the ct action will write to reg_c anyway after clearing it. Fixes: 806401c20a0f ("net/mlx5e: CT, Fix multiple allocations and memleak of mod acts") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Paul Blakey <paulb@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5e: Lag, Don't skip fib events on current dstVlad Buslov
Referenced change added check to skip updating fib when new fib instance has same or lower priority. However, new fib instance can be an update on same dst address as existing one even though the structure is another instance that has different address. Ignoring events on such instances causes multipath LAG state to not be correctly updated. Track 'dst' and 'dst_len' fields of fib event fib_entry_notifier_info structure and don't skip events that have the same value of that fields. Fixes: ad11c4f1d8fd ("net/mlx5e: Lag, Only handle events from highest priority multipath entry") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5e: Lag, Fix fib_info pointer assignmentVlad Buslov
Referenced change incorrectly sets single path fib_info even when LAG is not active. Fix it by moving call to mlx5_lag_fib_set() into conditional that verifies LAG state. Fixes: ad11c4f1d8fd ("net/mlx5e: Lag, Only handle events from highest priority multipath entry") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5e: Lag, Fix use-after-free in fib event handlerVlad Buslov
Recent commit that modified fib route event handler to handle events according to their priority introduced use-after-free[0] in mp->mfi pointer usage. The pointer now is not just cached in order to be compared to following fib_info instances, but is also dereferenced to obtain fib_priority. However, since mlx5 lag code doesn't hold the reference to fin_info during whole mp->mfi lifetime, it could be used after fib_info instance has already been freed be kernel infrastructure code. Don't ever dereference mp->mfi pointer. Refactor it to be 'const void*' type and cache fib_info priority in dedicated integer. Group fib_info-related data into dedicated 'fib' structure that will be further extended by following patches in the series. [0]: [ 203.588029] ================================================================== [ 203.590161] BUG: KASAN: use-after-free in mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.592386] Read of size 4 at addr ffff888144df2050 by task kworker/u20:4/138 [ 203.594766] CPU: 3 PID: 138 Comm: kworker/u20:4 Tainted: G B 5.17.0-rc7+ #6 [ 203.596751] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 [ 203.598813] Workqueue: mlx5_lag_mp mlx5_lag_fib_update [mlx5_core] [ 203.600053] Call Trace: [ 203.600608] <TASK> [ 203.601110] dump_stack_lvl+0x48/0x5e [ 203.601860] print_address_description.constprop.0+0x1f/0x160 [ 203.602950] ? mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.604073] ? mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.605177] kasan_report.cold+0x83/0xdf [ 203.605969] ? mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.607102] mlx5_lag_fib_update+0xabd/0xd60 [mlx5_core] [ 203.608199] ? mlx5_lag_init_fib_work+0x1c0/0x1c0 [mlx5_core] [ 203.609382] ? read_word_at_a_time+0xe/0x20 [ 203.610463] ? strscpy+0xa0/0x2a0 [ 203.611463] process_one_work+0x722/0x1270 [ 203.612344] worker_thread+0x540/0x11e0 [ 203.613136] ? rescuer_thread+0xd50/0xd50 [ 203.613949] kthread+0x26e/0x300 [ 203.614627] ? kthread_complete_and_exit+0x20/0x20 [ 203.615542] ret_from_fork+0x1f/0x30 [ 203.616273] </TASK> [ 203.617174] Allocated by task 3746: [ 203.617874] kasan_save_stack+0x1e/0x40 [ 203.618644] __kasan_kmalloc+0x81/0xa0 [ 203.619394] fib_create_info+0xb41/0x3c50 [ 203.620213] fib_table_insert+0x190/0x1ff0 [ 203.621020] fib_magic.isra.0+0x246/0x2e0 [ 203.621803] fib_add_ifaddr+0x19f/0x670 [ 203.622563] fib_inetaddr_event+0x13f/0x270 [ 203.623377] blocking_notifier_call_chain+0xd4/0x130 [ 203.624355] __inet_insert_ifa+0x641/0xb20 [ 203.625185] inet_rtm_newaddr+0xc3d/0x16a0 [ 203.626009] rtnetlink_rcv_msg+0x309/0x880 [ 203.626826] netlink_rcv_skb+0x11d/0x340 [ 203.627626] netlink_unicast+0x4cc/0x790 [ 203.628430] netlink_sendmsg+0x762/0xc00 [ 203.629230] sock_sendmsg+0xb2/0xe0 [ 203.629955] ____sys_sendmsg+0x58a/0x770 [ 203.630756] ___sys_sendmsg+0xd8/0x160 [ 203.631523] __sys_sendmsg+0xb7/0x140 [ 203.632294] do_syscall_64+0x35/0x80 [ 203.633045] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 203.634427] Freed by task 0: [ 203.635063] kasan_save_stack+0x1e/0x40 [ 203.635844] kasan_set_track+0x21/0x30 [ 203.636618] kasan_set_free_info+0x20/0x30 [ 203.637450] __kasan_slab_free+0xfc/0x140 [ 203.638271] kfree+0x94/0x3b0 [ 203.638903] rcu_core+0x5e4/0x1990 [ 203.639640] __do_softirq+0x1ba/0x5d3 [ 203.640828] Last potentially related work creation: [ 203.641785] kasan_save_stack+0x1e/0x40 [ 203.642571] __kasan_record_aux_stack+0x9f/0xb0 [ 203.643478] call_rcu+0x88/0x9c0 [ 203.644178] fib_release_info+0x539/0x750 [ 203.644997] fib_table_delete+0x659/0xb80 [ 203.645809] fib_magic.isra.0+0x1a3/0x2e0 [ 203.646617] fib_del_ifaddr+0x93f/0x1300 [ 203.647415] fib_inetaddr_event+0x9f/0x270 [ 203.648251] blocking_notifier_call_chain+0xd4/0x130 [ 203.649225] __inet_del_ifa+0x474/0xc10 [ 203.650016] devinet_ioctl+0x781/0x17f0 [ 203.650788] inet_ioctl+0x1ad/0x290 [ 203.651533] sock_do_ioctl+0xce/0x1c0 [ 203.652315] sock_ioctl+0x27b/0x4f0 [ 203.653058] __x64_sys_ioctl+0x124/0x190 [ 203.653850] do_syscall_64+0x35/0x80 [ 203.654608] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 203.666952] The buggy address belongs to the object at ffff888144df2000 which belongs to the cache kmalloc-256 of size 256 [ 203.669250] The buggy address is located 80 bytes inside of 256-byte region [ffff888144df2000, ffff888144df2100) [ 203.671332] The buggy address belongs to the page: [ 203.672273] page:00000000bf6c9314 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x144df0 [ 203.674009] head:00000000bf6c9314 order:2 compound_mapcount:0 compound_pincount:0 [ 203.675422] flags: 0x2ffff800010200(slab|head|node=0|zone=2|lastcpupid=0x1ffff) [ 203.676819] raw: 002ffff800010200 0000000000000000 dead000000000122 ffff888100042b40 [ 203.678384] raw: 0000000000000000 0000000080200020 00000001ffffffff 0000000000000000 [ 203.679928] page dumped because: kasan: bad access detected [ 203.681455] Memory state around the buggy address: [ 203.682421] ffff888144df1f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 203.683863] ffff888144df1f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 203.685310] >ffff888144df2000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 203.686701] ^ [ 203.687820] ffff888144df2080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 203.689226] ffff888144df2100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 203.690620] ================================================================== Fixes: ad11c4f1d8fd ("net/mlx5e: Lag, Only handle events from highest priority multipath entry") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5e: Fix the calling of update_buffer_lossy() APIMark Zhang
The arguments of update_buffer_lossy() is in a wrong order. Fix it. Fixes: 88b3d5c90e96 ("net/mlx5e: Fix port buffers cell size value") Signed-off-by: Mark Zhang <markzhang@nvidia.com> Reviewed-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5e: Don't match double-vlan packets if cvlan is not setVlad Buslov
Currently, match VLAN rule also matches packets that have multiple VLAN headers. This behavior is similar to buggy flower classifier behavior that has recently been fixed. Fix the issue by matching on outer_second_cvlan_tag with value 0 which will cause the HW to verify the packet doesn't contain second vlan header. Fixes: 699e96ddf47f ("net/mlx5e: Support offloading tc double vlan headers match") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5: Fix slab-out-of-bounds while reading resource dump menuAya Levin
Resource dump menu may span over more than a single page, support it. Otherwise, menu read may result in a memory access violation: reading outside of the allocated page. Note that page format of the first menu page contains menu headers while the proceeding menu pages contain only records. The KASAN logs are as follows: BUG: KASAN: slab-out-of-bounds in strcmp+0x9b/0xb0 Read of size 1 at addr ffff88812b2e1fd0 by task systemd-udevd/496 CPU: 5 PID: 496 Comm: systemd-udevd Tainted: G B 5.16.0_for_upstream_debug_2022_01_10_23_12 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x57/0x7d print_address_description.constprop.0+0x1f/0x140 ? strcmp+0x9b/0xb0 ? strcmp+0x9b/0xb0 kasan_report.cold+0x83/0xdf ? strcmp+0x9b/0xb0 strcmp+0x9b/0xb0 mlx5_rsc_dump_init+0x4ab/0x780 [mlx5_core] ? mlx5_rsc_dump_destroy+0x80/0x80 [mlx5_core] ? lockdep_hardirqs_on_prepare+0x286/0x400 ? raw_spin_unlock_irqrestore+0x47/0x50 ? aomic_notifier_chain_register+0x32/0x40 mlx5_load+0x104/0x2e0 [mlx5_core] mlx5_init_one+0x41b/0x610 [mlx5_core] .... The buggy address belongs to the object at ffff88812b2e0000 which belongs to the cache kmalloc-4k of size 4096 The buggy address is located 4048 bytes to the right of 4096-byte region [ffff88812b2e0000, ffff88812b2e1000) The buggy address belongs to the page: page:000000009d69807a refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88812b2e6000 pfn:0x12b2e0 head:000000009d69807a order:3 compound_mapcount:0 compound_pincount:0 flags: 0x8000000000010200(slab|head|zone=2) raw: 8000000000010200 0000000000000000 dead000000000001 ffff888100043040 raw: ffff88812b2e6000 0000000080040000 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected Memory state around the buggy address: ffff88812b2e1e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88812b2e1f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff88812b2e1f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ^ ffff88812b2e2000: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88812b2e2080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== Fixes: 12206b17235a ("net/mlx5: Add support for resource dump") Signed-off-by: Aya Levin <ayal@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-04net/mlx5e: Fix wrong source vport matching on tunnel ruleAriel Levkovich
When OVS internal port is the vtep device, the first decap rule is matching on the internal port's vport metadata value and then changes the metadata to be the uplink's value. Therefore, following rules on the tunnel, in chain > 0, should avoid matching on internal port metadata and use the uplink vport metadata instead. Select the uplink's metadata value for the source vport match in case the rule is in chain greater than zero, even if the tunnel route device is internal port. Fixes: 166f431ec6be ("net/mlx5e: Add indirect tc offload of ovs internal port") Signed-off-by: Ariel Levkovich <lariel@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2022-05-03Merge branch 'bnxt_en-bug-fixes'Jakub Kicinski
Michael Chan says: ==================== bnxt_en: Bug fixes This patch series includes 3 fixes: - Fix an occasional VF open failure. - Fix a PTP spinlock usage before initialization - Fix unnecesary RX packet drops under high TX traffic load. ==================== Link: https://lore.kernel.org/r/1651540392-2260-1-git-send-email-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03bnxt_en: Fix unnecessary dropping of RX packetsMichael Chan
In bnxt_poll_p5(), we first check cpr->has_more_work. If it is true, we are in NAPI polling mode and we will call __bnxt_poll_cqs() to continue polling. It is possible to exhanust the budget again when __bnxt_poll_cqs() returns. We then enter the main while loop to check for new entries in the NQ. If we had previously exhausted the NAPI budget, we may call __bnxt_poll_work() to process an RX entry with zero budget. This will cause packets to be dropped unnecessarily, thinking that we are in the netpoll path. Fix it by breaking out of the while loop if we need to process an RX NQ entry with no budget left. We will then exit NAPI and stay in polling mode. Fixes: 389a877a3b20 ("bnxt_en: Process the NQ under NAPI continuous polling.") Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03bnxt_en: Initiallize bp->ptp_lock first before using itMichael Chan
bnxt_ptp_init() calls bnxt_ptp_init_rtc() which will acquire the ptp_lock spinlock. The spinlock is not initialized until later. Move the bnxt_ptp_init_rtc() call after the spinlock is initialized. Fixes: 24ac1ecd5240 ("bnxt_en: Add driver support to use Real Time Counter for PTP") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03bnxt_en: Fix possible bnxt_open() failure caused by wrong RFS flagSomnath Kotur
bnxt_open() can fail in this code path, especially on a VF when it fails to reserve default rings: bnxt_open() __bnxt_open_nic() bnxt_clear_int_mode() bnxt_init_dflt_ring_mode() RX rings would be set to 0 when we hit this error path. It is possible for a subsequent bnxt_open() call to potentially succeed with a code path like this: bnxt_open() bnxt_hwrm_if_change() bnxt_fw_init_one() bnxt_fw_init_one_p3() bnxt_set_dflt_rfs() bnxt_rfs_capable() bnxt_hwrm_reserve_rings() On older chips, RFS is capable if we can reserve the number of vnics that is equal to RX rings + 1. But since RX rings is still set to 0 in this code path, we may mistakenly think that RFS is supported for 0 RX rings. Later, when the default RX rings are reserved and we try to enable RFS, it would fail and cause bnxt_open() to fail unnecessarily. We fix this in 2 places. bnxt_rfs_capable() will always return false if RX rings is not yet set. bnxt_init_dflt_ring_mode() will call bnxt_set_dflt_rfs() which will always clear the RFS flags if RFS is not supported. Fixes: 20d7d1c5c9b1 ("bnxt_en: reliably allocate IRQ table on reset to avoid crash") Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03smsc911x: allow using IRQ0Sergey Shtylyov
The AlphaProject AP-SH4A-3A/AP-SH4AD-0A SH boards use IRQ0 for their SMSC LAN911x Ethernet chip, so the networking on them must have been broken by commit 965b2aa78fbc ("net/smsc911x: fix irq resource allocation failure") which filtered out 0 as well as the negative error codes -- it was kinda correct at the time, as platform_get_irq() could return 0 on of_irq_get() failure and on the actual 0 in an IRQ resource. This issue was fixed by me (back in 2016!), so we should be able to fix this driver to allow IRQ0 usage again... When merging this to the stable kernels, make sure you also merge commit e330b9a6bb35 ("platform: don't return 0 from platform_get_irq[_byname]() on error") -- that's my fix to platform_get_irq() for the DT platforms... Fixes: 965b2aa78fbc ("net/smsc911x: fix irq resource allocation failure") Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Link: https://lore.kernel.org/r/656036e4-6387-38df-b8a7-6ba683b16e63@omp.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03net: sfp: Add tx-fault workaround for Huawei MA5671A SFP ONTMatthew Hagan
As noted elsewhere, various GPON SFP modules exhibit non-standard TX-fault behaviour. In the tested case, the Huawei MA5671A, when used in combination with a Marvell mv88e6085 switch, was found to persistently assert TX-fault, resulting in the module being disabled. This patch adds a quirk to ignore the SFP_F_TX_FAULT state, allowing the module to function. Change from v1: removal of erroneous return statment (Andrew Lunn) Signed-off-by: Matthew Hagan <mnhagan88@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/20220502223315.1973376-1-mnhagan88@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-03Merge tag 'seccomp-v5.18-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull seccomp selftest fix from Kees Cook: - Avoid using stdin for read syscall testing (Jann Horn) * tag 'seccomp-v5.18-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: selftests/seccomp: Don't call read() on TTY from background pgrp