summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-02-14Linux 5.15.94v5.15.94Greg Kroah-Hartman
Link: https://lore.kernel.org/r/20230213144732.336342050@linuxfoundation.org Tested-by: Florian Fainelli <f.fainelli@gmail.com> Tested-by: Allen Pais <apais@linux.microsoft.com> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14Documentation/hw-vuln: Add documentation for Cross-Thread Return PredictionsTom Lendacky
commit 493a2c2d23ca91afba96ac32b6cbafb54382c2a3 upstream. Add the admin guide for the Cross-Thread Return Predictions vulnerability. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <60f9c0b4396956ce70499ae180cb548720b25c7e.1675956146.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14KVM: x86: Mitigate the cross-thread return address predictions bugTom Lendacky
commit 6f0f2d5ef895d66a3f2b32dd05189ec34afa5a55 upstream. By default, KVM/SVM will intercept attempts by the guest to transition out of C0. However, the KVM_CAP_X86_DISABLE_EXITS capability can be used by a VMM to change this behavior. To mitigate the cross-thread return address predictions bug (X86_BUG_SMT_RSB), a VMM must not be allowed to override the default behavior to intercept C0 transitions. Use a module parameter to control the mitigation on processors that are vulnerable to X86_BUG_SMT_RSB. If the processor is vulnerable to the X86_BUG_SMT_RSB bug and the module parameter is set to mitigate the bug, KVM will not allow the disabling of the HLT, MWAIT and CSTATE exits. Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <4019348b5e07148eb4d593380a5f6713b93c9a16.1675956146.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14x86/speculation: Identify processors vulnerable to SMT RSB predictionsTom Lendacky
commit be8de49bea505e7777a69ef63d60e02ac1712683 upstream. Certain AMD processors are vulnerable to a cross-thread return address predictions bug. When running in SMT mode and one of the sibling threads transitions out of C0 state, the other sibling thread could use return target predictions from the sibling thread that transitioned out of C0. The Spectre v2 mitigations cover the Linux kernel, as it fills the RSB when context switching to the idle thread. However, KVM allows a VMM to prevent exiting guest mode when transitioning out of C0. A guest could act maliciously in this situation, so create a new x86 BUG that can be used to detect if the processor is vulnerable. Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Message-Id: <91cec885656ca1fcd4f0185ce403a53dd9edecb7.1675956146.git.thomas.lendacky@amd.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14drm/i915: Fix VBT DSI DVO port handlingVille Syrjälä
commit 6a7ff131f17f44c593173c5ee30e2c03ef211685 upstream. Turns out modern (icl+) VBTs still declare their DSI ports as MIPI-A and MIPI-C despite the PHYs now being A and B. Remap appropriately to allow the panels declared as MIPI-C to work. Cc: stable@vger.kernel.org Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8016 Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230207064337.18697-2-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 118b5c136c04da705b274b0d39982bb8b7430fc5) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14drm/i915: Initialize the obj flags for shmem objectsAravind Iddamsetty
commit 44e4c5684fcc82d8f099656c4ea39d9571e2a8ac upstream. Obj flags for shmem objects is not being set correctly. Fixes in setting BO_ALLOC_USER flag which applies to shmem objs as well. v2: Add fixes tag (Tvrtko, Matt A) Fixes: 13d29c823738 ("drm/i915/ehl: unconditionally flush the pages on acquire") Cc: <stable@vger.kernel.org> # v5.15+ Cc: Matthew Auld <matthew.auld@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> [tursulin: Grouped all tags together.] Link: https://patchwork.freedesktop.org/patch/msgid/20230203135205.4051149-1-aravind.iddamsetty@intel.com (cherry picked from commit bca0d1d3ceeb07be45a51c0fa4d57a0ce31b6aed) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14drm/amdgpu/fence: Fix oops due to non-matching drm_sched init/finiGuilherme G. Piccoli
commit 5ad7bbf3dba5c4a684338df1f285080f2588b535 upstream. Currently amdgpu calls drm_sched_fini() from the fence driver sw fini routine - such function is expected to be called only after the respective init function - drm_sched_init() - was executed successfully. Happens that we faced a driver probe failure in the Steam Deck recently, and the function drm_sched_fini() was called even without its counter-part had been previously called, causing the following oops: amdgpu: probe of 0000:04:00.0 failed with error -110 BUG: kernel NULL pointer dereference, address: 0000000000000090 PGD 0 P4D 0 Oops: 0002 [#1] PREEMPT SMP NOPTI CPU: 0 PID: 609 Comm: systemd-udevd Not tainted 6.2.0-rc3-gpiccoli #338 Hardware name: Valve Jupiter/Jupiter, BIOS F7A0113 11/04/2022 RIP: 0010:drm_sched_fini+0x84/0xa0 [gpu_sched] [...] Call Trace: <TASK> amdgpu_fence_driver_sw_fini+0xc8/0xd0 [amdgpu] amdgpu_device_fini_sw+0x2b/0x3b0 [amdgpu] amdgpu_driver_release_kms+0x16/0x30 [amdgpu] devm_drm_dev_init_release+0x49/0x70 [...] To prevent that, check if the drm_sched was properly initialized for a given ring before calling its fini counter-part. Notice ideally we'd use sched.ready for that; such field is set as the latest thing on drm_sched_init(). But amdgpu seems to "override" the meaning of such field - in the above oops for example, it was a GFX ring causing the crash, and the sched.ready field was set to true in the ring init routine, regardless of the state of the DRM scheduler. Hence, we ended-up using sched.ops as per Christian's suggestion [0], and also removed the no_scheduler check [1]. [0] https://lore.kernel.org/amd-gfx/984ee981-2906-0eaf-ccec-9f80975cb136@amd.com/ [1] https://lore.kernel.org/amd-gfx/cd0e2994-f85f-d837-609f-7056d5fb7231@amd.com/ Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)") Suggested-by: Christian König <christian.koenig@amd.com> Cc: Guchun Chen <guchun.chen@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14Fix page corruption caused by racy check in __free_pagesDavid Chen
commit 462a8e08e0e6287e5ce13187257edbf24213ed03 upstream. When we upgraded our kernel, we started seeing some page corruption like the following consistently: BUG: Bad page state in process ganesha.nfsd pfn:1304ca page:0000000022261c55 refcount:0 mapcount:-128 mapping:0000000000000000 index:0x0 pfn:0x1304ca flags: 0x17ffffc0000000() raw: 0017ffffc0000000 ffff8a513ffd4c98 ffffeee24b35ec08 0000000000000000 raw: 0000000000000000 0000000000000001 00000000ffffff7f 0000000000000000 page dumped because: nonzero mapcount CPU: 0 PID: 15567 Comm: ganesha.nfsd Kdump: loaded Tainted: P B O 5.10.158-1.nutanix.20221209.el7.x86_64 #1 Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 04/05/2016 Call Trace: dump_stack+0x74/0x96 bad_page.cold+0x63/0x94 check_new_page_bad+0x6d/0x80 rmqueue+0x46e/0x970 get_page_from_freelist+0xcb/0x3f0 ? _cond_resched+0x19/0x40 __alloc_pages_nodemask+0x164/0x300 alloc_pages_current+0x87/0xf0 skb_page_frag_refill+0x84/0x110 ... Sometimes, it would also show up as corruption in the free list pointer and cause crashes. After bisecting the issue, we found the issue started from commit e320d3012d25 ("mm/page_alloc.c: fix freeing non-compound pages"): if (put_page_testzero(page)) free_the_page(page, order); else if (!PageHead(page)) while (order-- > 0) free_the_page(page + (1 << order), order); So the problem is the check PageHead is racy because at this point we already dropped our reference to the page. So even if we came in with compound page, the page can already be freed and PageHead can return false and we will end up freeing all the tail pages causing double free. Fixes: e320d3012d25 ("mm/page_alloc.c: fix freeing non-compound pages") Link: https://lore.kernel.org/lkml/BYAPR02MB448855960A9656EEA81141FC94D99@BYAPR02MB4488.namprd02.prod.outlook.com/ Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@vger.kernel.org Signed-off-by: Chunwei Chen <david.chen@nutanix.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14arm64: dts: meson-axg: Make mmc host controller interrupts level-sensitiveHeiner Kallweit
commit d182bcf300772d8b2e5f43e47fa0ebda2b767cc4 upstream. The usage of edge-triggered interrupts lead to lost interrupts under load, see [0]. This was confirmed to be fixed by using level-triggered interrupts. The report was about SDIO. However, as the host controller is the same for SD and MMC, apply the change to all mmc controller instances. [0] https://www.spinics.net/lists/linux-mmc/msg73991.html Fixes: 221cf34bac54 ("ARM64: dts: meson-axg: enable the eMMC controller") Reported-by: Peter Suti <peter.suti@streamunlimited.com> Tested-by: Vyacheslav Bocharov <adeep@lexina.in> Tested-by: Peter Suti <peter.suti@streamunlimited.com> Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/c00655d3-02f8-6f5f-4239-ca2412420cad@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14arm64: dts: meson-g12-common: Make mmc host controller interrupts ↵Heiner Kallweit
level-sensitive commit ac8db4cceed218cca21c84f9d75ce88182d8b04f upstream. The usage of edge-triggered interrupts lead to lost interrupts under load, see [0]. This was confirmed to be fixed by using level-triggered interrupts. The report was about SDIO. However, as the host controller is the same for SD and MMC, apply the change to all mmc controller instances. [0] https://www.spinics.net/lists/linux-mmc/msg73991.html Fixes: 4759fd87b928 ("arm64: dts: meson: g12a: add mmc nodes") Tested-by: FUKAUMI Naoki <naoki@radxa.com> Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Tested-by: Jerome Brunet <jbrunet@baylibre.com> Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/27d89baa-b8fa-baca-541b-ef17a97cde3c@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14arm64: dts: meson-gx: Make mmc host controller interrupts level-sensitiveHeiner Kallweit
commit 66e45351f7d6798751f98001d1fcd572024d87f0 upstream. The usage of edge-triggered interrupts lead to lost interrupts under load, see [0]. This was confirmed to be fixed by using level-triggered interrupts. The report was about SDIO. However, as the host controller is the same for SD and MMC, apply the change to all mmc controller instances. [0] https://www.spinics.net/lists/linux-mmc/msg73991.html Fixes: ef8d2ffedf18 ("ARM64: dts: meson-gxbb: add MMC support") Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/76e042e0-a610-5ed5-209f-c4d7f879df44@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14rtmutex: Ensure that the top waiter is always woken upWander Lairson Costa
commit db370a8b9f67ae5f17e3d5482493294467784504 upstream. Let L1 and L2 be two spinlocks. Let T1 be a task holding L1 and blocked on L2. T1, currently, is the top waiter of L2. Let T2 be the task holding L2. Let T3 be a task trying to acquire L1. The following events will lead to a state in which the wait queue of L2 isn't empty, but no task actually holds the lock. T1 T2 T3 == == == spin_lock(L1) | raw_spin_lock(L1->wait_lock) | rtlock_slowlock_locked(L1) | | task_blocks_on_rt_mutex(L1, T3) | | | orig_waiter->lock = L1 | | | orig_waiter->task = T3 | | | raw_spin_unlock(L1->wait_lock) | | | rt_mutex_adjust_prio_chain(T1, L1, L2, orig_waiter, T3) spin_unlock(L2) | | | | | rt_mutex_slowunlock(L2) | | | | | | raw_spin_lock(L2->wait_lock) | | | | | | wakeup(T1) | | | | | | raw_spin_unlock(L2->wait_lock) | | | | | | | | waiter = T1->pi_blocked_on | | | | waiter == rt_mutex_top_waiter(L2) | | | | waiter->task == T1 | | | | raw_spin_lock(L2->wait_lock) | | | | dequeue(L2, waiter) | | | | update_prio(waiter, T1) | | | | enqueue(L2, waiter) | | | | waiter != rt_mutex_top_waiter(L2) | | | | L2->owner == NULL | | | | wakeup(T1) | | | | raw_spin_unlock(L2->wait_lock) T1 wakes up T1 != top_waiter(L2) schedule_rtlock() If the deadline of T1 is updated before the call to update_prio(), and the new deadline is greater than the deadline of the second top waiter, then after the requeue, T1 is no longer the top waiter, and the wrong task is woken up which will then go back to sleep because it is not the top waiter. This can be reproduced in PREEMPT_RT with stress-ng: while true; do stress-ng --sched deadline --sched-period 1000000000 \ --sched-runtime 800000000 --sched-deadline \ 1000000000 --mmapfork 23 -t 20 done A similar issue was pointed out by Thomas versus the cases where the top waiter drops out early due to a signal or timeout, which is a general issue for all regular rtmutex use cases, e.g. futex. The problematic code is in rt_mutex_adjust_prio_chain(): // Save the top waiter before dequeue/enqueue prerequeue_top_waiter = rt_mutex_top_waiter(lock); rt_mutex_dequeue(lock, waiter); waiter_update_prio(waiter, task); rt_mutex_enqueue(lock, waiter); // Lock has no owner? if (!rt_mutex_owner(lock)) { // Top waiter changed ----> if (prerequeue_top_waiter != rt_mutex_top_waiter(lock)) ----> wake_up_state(waiter->task, waiter->wake_state); This only takes the case into account where @waiter is the new top waiter due to the requeue operation. But it fails to handle the case where @waiter is not longer the top waiter due to the requeue operation. Ensure that the new top waiter is woken up so in all cases so it can take over the ownerless lock. [ tglx: Amend changelog, add Fixes tag ] Fixes: c014ef69b3ac ("locking/rtmutex: Add wake_state to rt_mutex_waiter") Signed-off-by: Wander Lairson Costa <wander@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230117172649.52465-1-wander@redhat.com Link: https://lore.kernel.org/r/20230202123020.14844-1-wander@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14powerpc/64s/interrupt: Fix interrupt exit race with security mitigation switchNicholas Piggin
commit 2ea31e2e62bbc4d11c411eeb36f1b02841dbcab1 upstream. The RFI and STF security mitigation options can flip the interrupt_exit_not_reentrant static branch condition concurrently with the interrupt exit code which tests that branch. Interrupt exit tests this condition to set MSR[EE|RI] for exit, then again in the case a soft-masked interrupt is found pending, to recover the MSR so the interrupt can be replayed before attempting to exit again. If the condition changes between these two tests, the MSR and irq soft-mask state will become corrupted, leading to warnings and possible crashes. For example, if the branch is initially true then false, MSR[EE] will be 0 but PACA_IRQ_HARD_DIS clear and EE may not get enabled, leading to warnings in irq_64.c. Fixes: 13799748b957 ("powerpc/64: use interrupt restart table to speed up return from interrupt") Cc: stable@vger.kernel.org # v5.14+ Reported-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230206042240.92103-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14riscv: Fixup race condition on PG_dcache_clean in flush_icache_pteGuo Ren
commit 950b879b7f0251317d26bae0687e72592d607532 upstream. In commit 588a513d3425 ("arm64: Fix race condition on PG_dcache_clean in __sync_icache_dcache()"), we found RISC-V has the same issue as the previous arm64. The previous implementation didn't guarantee the correct sequence of operations, which means flush_icache_all() hasn't been called when the PG_dcache_clean was set. That would cause a risk of page synchronization. Fixes: 08f051eda33b ("RISC-V: Flush I$ when making a dirty page executable") Signed-off-by: Guo Ren <guoren@linux.alibaba.com> Signed-off-by: Guo Ren <guoren@kernel.org> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20230127035306.1819561-1-guoren@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14ceph: flush cap releases when the session is flushedXiubo Li
commit e7d84c6a1296d059389f7342d9b4b7defb518d3a upstream. MDS expects the completed cap release prior to responding to the session flush for cache drop. Cc: stable@vger.kernel.org Link: http://tracker.ceph.com/issues/38009 Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14clk: ingenic: jz4760: Update M/N/OD calculation algorithmPaul Cercueil
commit ecfb9f404771dde909ce7743df954370933c3be2 upstream. The previous algorithm was pretty broken. - The inner loop had a '(m > m_max)' condition, and the value of 'm' would increase in each iteration; - Each iteration would actually multiply 'm' by two, so it is not needed to re-compute the whole equation at each iteration; - It would loop until (m & 1) == 0, which means it would loop at most once. - The outer loop would divide the 'n' value by two at the end of each iteration. This meant that for a 12 MHz parent clock and a 1.2 GHz requested clock, it would first try n=12, then n=6, then n=3, then n=1, none of which would work; the only valid value is n=2 in this case. Simplify this algorithm with a single for loop, which decrements 'n' after each iteration, addressing all of the above problems. Fixes: bdbfc029374f ("clk: ingenic: Add support for the JZ4760") Cc: <stable@vger.kernel.org> Signed-off-by: Paul Cercueil <paul@crapouillou.net> Link: https://lore.kernel.org/r/20221214123704.7305-1-paul@crapouillou.net Signed-off-by: Stephen Boyd <sboyd@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14usb: typec: altmodes/displayport: Fix probe pin assign checkPrashant Malani
commit 54e5c00a4eb0a4c663445b245f641bbfab142430 upstream. While checking Pin Assignments of the port and partner during probe, we don't take into account whether the peripheral is a plug or receptacle. This manifests itself in a mode entry failure on certain docks and dongles with captive cables. For instance, the Startech.com Type-C to DP dongle (Model #CDP2DP) advertises its DP VDO as 0x405. This would fail the Pin Assignment compatibility check, despite it supporting Pin Assignment C as a UFP. Update the check to use the correct DP Pin Assign macros that take the peripheral's receptacle bit into account. Fixes: c1e5c2f0cb8a ("usb: typec: altmodes/displayport: correct pin assignment for UFP receptacles") Cc: stable@vger.kernel.org Reported-by: Diana Zigterman <dzigterman@chromium.org> Signed-off-by: Prashant Malani <pmalani@chromium.org> Link: https://lore.kernel.org/r/20230208205318.131385-1-pmalani@chromium.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14usb: core: add quirk for Alcor Link AK9563 smartcard readerMark Pearson
commit 303e724d7b1e1a0a93daf0b1ab5f7c4f53543b34 upstream. The Alcor Link AK9563 smartcard reader used on some Lenovo platforms doesn't work. If LPM is enabled the reader will provide an invalid usb config descriptor. Added quirk to disable LPM. Verified fix on Lenovo P16 G1 and T14 G3 Tested-by: Miroslav Zatko <mzatko@mirexoft.com> Tested-by: Dennis Wassenberg <dennis.wassenberg@secunet.com> Cc: stable@vger.kernel.org Signed-off-by: Dennis Wassenberg <dennis.wassenberg@secunet.com> Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca> Link: https://lore.kernel.org/r/20230208181223.1092654-1-mpearson-lenovo@squebb.ca Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14btrfs: free device in btrfs_close_devices for a single device filesystemAnand Jain
commit 5f58d783fd7823b2c2d5954d1126e702f94bfc4c upstream. We have this check to make sure we don't accidentally add older devices that may have disappeared and re-appeared with an older generation from being added to an fs_devices (such as a replace source device). This makes sense, we don't want stale disks in our file system. However for single disks this doesn't really make sense. I've seen this in testing, but I was provided a reproducer from a project that builds btrfs images on loopback devices. The loopback device gets cached with the new generation, and then if it is re-used to generate a new file system we'll fail to mount it because the new fs is "older" than what we have in cache. Fix this by freeing the cache when closing the device for a single device filesystem. This will ensure that the mount command passed device path is scanned successfully during the next mount. CC: stable@vger.kernel.org # 5.10+ Reported-by: Daan De Meyer <daandemeyer@fb.com> Signed-off-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Anand Jain <anand.jain@oracle.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14mptcp: be careful on subflow status propagation on errorsPaolo Abeni
commit 1249db44a102d9d3541ed7798d4b01ffdcf03524 upstream. Currently the subflow error report callback unconditionally propagates the fallback subflow status to the owning msk. If the msk is already orphaned, the above prevents the code from correctly tracking the msk moving to the TCP_CLOSE state and doing the appropriate cleanup. All the above causes increasing memory usage over time and sporadic self-tests failures. There is a great deal of infrastructure trying to propagate correctly the fallback subflow status to the owning mptcp socket, e.g. via mptcp_subflow_eof() and subflow_sched_work_if_closed(): in the error propagation path we need only to cope with unorphaned sockets. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/339 Fixes: 15cc10453398 ("mptcp: deliver ssk errors to msk") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14net: USB: Fix wrong-direction WARNING in plusb.cAlan Stern
commit 811d581194f7412eda97acc03d17fc77824b561f upstream. The syzbot fuzzer detected a bug in the plusb network driver: A zero-length control-OUT transfer was treated as a read instead of a write. In modern kernels this error provokes a WARNING: usb 1-1: BOGUS control dir, pipe 80000280 doesn't match bRequestType c0 WARNING: CPU: 0 PID: 4645 at drivers/usb/core/urb.c:411 usb_submit_urb+0x14a7/0x1880 drivers/usb/core/urb.c:411 Modules linked in: CPU: 1 PID: 4645 Comm: dhcpcd Not tainted 6.2.0-rc6-syzkaller-00050-g9f266ccaa2f5 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023 RIP: 0010:usb_submit_urb+0x14a7/0x1880 drivers/usb/core/urb.c:411 ... Call Trace: <TASK> usb_start_wait_urb+0x101/0x4b0 drivers/usb/core/message.c:58 usb_internal_control_msg drivers/usb/core/message.c:102 [inline] usb_control_msg+0x320/0x4a0 drivers/usb/core/message.c:153 __usbnet_read_cmd+0xb9/0x390 drivers/net/usb/usbnet.c:2010 usbnet_read_cmd+0x96/0xf0 drivers/net/usb/usbnet.c:2068 pl_vendor_req drivers/net/usb/plusb.c:60 [inline] pl_set_QuickLink_features drivers/net/usb/plusb.c:75 [inline] pl_reset+0x2f/0xf0 drivers/net/usb/plusb.c:85 usbnet_open+0xcc/0x5d0 drivers/net/usb/usbnet.c:889 __dev_open+0x297/0x4d0 net/core/dev.c:1417 __dev_change_flags+0x587/0x750 net/core/dev.c:8530 dev_change_flags+0x97/0x170 net/core/dev.c:8602 devinet_ioctl+0x15a2/0x1d70 net/ipv4/devinet.c:1147 inet_ioctl+0x33f/0x380 net/ipv4/af_inet.c:979 sock_do_ioctl+0xcc/0x230 net/socket.c:1169 sock_ioctl+0x1f8/0x680 net/socket.c:1286 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl fs/ioctl.c:856 [inline] __x64_sys_ioctl+0x197/0x210 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd The fix is to call usbnet_write_cmd() instead of usbnet_read_cmd() and remove the USB_DIR_IN flag. Reported-and-tested-by: syzbot+2a0e7abd24f1eb90ce25@syzkaller.appspotmail.com Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Fixes: 090ffa9d0e90 ("[PATCH] USB: usbnet (9/9) module for pl2301/2302 cables") CC: stable@vger.kernel.org Link: https://lore.kernel.org/r/00000000000052099f05f3b3e298@google.com/ Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14cifs: Fix use-after-free in rdata->read_into_pages()ZhaoLong Wang
commit aa5465aeca3c66fecdf7efcf554aed79b4c4b211 upstream. When the network status is unstable, use-after-free may occur when read data from the server. BUG: KASAN: use-after-free in readpages_fill_pages+0x14c/0x7e0 Call Trace: <TASK> dump_stack_lvl+0x38/0x4c print_report+0x16f/0x4a6 kasan_report+0xb7/0x130 readpages_fill_pages+0x14c/0x7e0 cifs_readv_receive+0x46d/0xa40 cifs_demultiplex_thread+0x121c/0x1490 kthread+0x16b/0x1a0 ret_from_fork+0x2c/0x50 </TASK> Allocated by task 2535: kasan_save_stack+0x22/0x50 kasan_set_track+0x25/0x30 __kasan_kmalloc+0x82/0x90 cifs_readdata_direct_alloc+0x2c/0x110 cifs_readdata_alloc+0x2d/0x60 cifs_readahead+0x393/0xfe0 read_pages+0x12f/0x470 page_cache_ra_unbounded+0x1b1/0x240 filemap_get_pages+0x1c8/0x9a0 filemap_read+0x1c0/0x540 cifs_strict_readv+0x21b/0x240 vfs_read+0x395/0x4b0 ksys_read+0xb8/0x150 do_syscall_64+0x3f/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc Freed by task 79: kasan_save_stack+0x22/0x50 kasan_set_track+0x25/0x30 kasan_save_free_info+0x2e/0x50 __kasan_slab_free+0x10e/0x1a0 __kmem_cache_free+0x7a/0x1a0 cifs_readdata_release+0x49/0x60 process_one_work+0x46c/0x760 worker_thread+0x2a4/0x6f0 kthread+0x16b/0x1a0 ret_from_fork+0x2c/0x50 Last potentially related work creation: kasan_save_stack+0x22/0x50 __kasan_record_aux_stack+0x95/0xb0 insert_work+0x2b/0x130 __queue_work+0x1fe/0x660 queue_work_on+0x4b/0x60 smb2_readv_callback+0x396/0x800 cifs_abort_connection+0x474/0x6a0 cifs_reconnect+0x5cb/0xa50 cifs_readv_from_socket.cold+0x22/0x6c cifs_read_page_from_socket+0xc1/0x100 readpages_fill_pages.cold+0x2f/0x46 cifs_readv_receive+0x46d/0xa40 cifs_demultiplex_thread+0x121c/0x1490 kthread+0x16b/0x1a0 ret_from_fork+0x2c/0x50 The following function calls will cause UAF of the rdata pointer. readpages_fill_pages cifs_read_page_from_socket cifs_readv_from_socket cifs_reconnect __cifs_reconnect cifs_abort_connection mid->callback() --> smb2_readv_callback queue_work(&rdata->work) # if the worker completes first, # the rdata is freed cifs_readv_complete kref_put cifs_readdata_release kfree(rdata) return rdata->... # UAF in readpages_fill_pages() Similarly, this problem also occurs in the uncache_fill_pages(). Fix this by adjusts the order of condition judgment in the return statement. Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com> Cc: stable@vger.kernel.org Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz> Signed-off-by: Steve French <stfrench@microsoft.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-02-14pinctrl: intel: Restore the pins that used to be in Direct IRQ modeAndy Shevchenko
[ Upstream commit a8520be3ffef3d25b53bf171a7ebe17ee0154175 ] If the firmware mangled the register contents too much, check the saved value for the Direct IRQ mode. If it matches, we will restore the pin state. Reported-by: Jim Minter <jimminter@microsoft.com> Fixes: 6989ea4881c8 ("pinctrl: intel: Save and restore pins in "direct IRQ" mode") Tested-by: Jim Minter <jimminter@microsoft.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Acked-by: Mika Westerberg <mika.westerberg@linux.intel.com> Link: https://lore.kernel.org/r/20230206141558.20916-1-andriy.shevchenko@linux.intel.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14spi: dw: Fix wrong FIFO level setting for long xfersSerge Semin
[ Upstream commit c63b8fd14a7db719f8252038a790638728c4eb66 ] Due to using the u16 type in the min_t() macros the SPI transfer length will be cast to word before participating in the conditional statement implied by the macro. Thus if the transfer length is greater than 64KB the Tx/Rx FIFO threshold level value will be determined by the leftover of the truncated after the type-case length. In the worst case it will cause the dramatical performance drop due to the "Tx FIFO Empty" or "Rx FIFO Full" interrupts triggered on each xfer word sent/received to/from the bus. The problem can be easily fixed by specifying the unsigned int type in the min_t() macros thus preventing the possible data loss. Fixes: ea11370fffdf ("spi: dw: get TX level without an additional variable") Reported-by: Sergey Nazarov <Sergey.Nazarov@baikalelectronics.ru> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230113185942.2516-1-Sergey.Semin@baikalelectronics.ru Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14pinctrl: single: fix potential NULL dereferenceMaxim Korotkov
[ Upstream commit d2d73e6d4822140445ad4a7b1c6091e0f5fe703b ] Added checking of pointer "function" in pcs_set_mux(). pinmux_generic_get_function() can return NULL and the pointer "function" was dereferenced without checking against NULL. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 571aec4df5b7 ("pinctrl: single: Use generic pinmux helpers for managing functions") Signed-off-by: Maxim Korotkov <korotkov.maxim.s@gmail.com> Reviewed-by: Tony Lindgren <tony@atomide.com> Link: https://lore.kernel.org/r/20221118104332.943-1-korotkov.maxim.s@gmail.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14pinctrl: aspeed: Fix confusing types in return valueJoel Stanley
[ Upstream commit 287a344a11f1ebd31055cf9b22c88d7005f108d7 ] The function signature is int, but we return a bool. Instead return a negative errno as the kerneldoc suggests. Fixes: 4d3d0e4272d8 ("pinctrl: Add core support for Aspeed SoCs") Signed-off-by: Joel Stanley <joel@jms.id.au> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Link: https://lore.kernel.org/r/20230119231856.52014-1-joel@jms.id.au Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14pinctrl: mediatek: Fix the drive register definition of some PinsGuodong Liu
[ Upstream commit 5754a1c98b18009cb3030dc391aa37b77428a0bd ] The drive adjustment register definition of gpio13 and gpio81 is wrong: "the start address for the range" of gpio18 is corrected to 0x000, "the start bit for the first register within the range" of gpio81 is corrected to 24. Fixes: 6cf5e9ef362a ("pinctrl: add pinctrl driver on mt8195") Signed-off-by: Guodong Liu <Guodong.Liu@mediatek.com> Link: https://lore.kernel.org/r/20230118062116.26315-1-Guodong.Liu@mediatek.com Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14ASoC: topology: Return -ENOMEM on memory allocation failureAmadeusz Sławiński
[ Upstream commit c173ee5b2fa6195066674d66d1d7e191010fb1ff ] When handling error path, ret needs to be set to correct value. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Fixes: d29d41e28eea ("ASoC: topology: Add support for multiple kcontrol types to a widget") Reviewed-by: Cezary Rojewski <cezary.rojewski@intel.com> Signed-off-by: Amadeusz Sławiński <amadeuszx.slawinski@linux.intel.com> Link: https://lore.kernel.org/r/20230207210428.2076354-1-amadeuszx.slawinski@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14riscv: stacktrace: Fix missing the first frameLiu Shixin
[ Upstream commit cb80242cc679d6397e77d8a964deeb3ff218d2b5 ] When running kfence_test, I found some testcases failed like this: # test_out_of_bounds_read: EXPECTATION FAILED at mm/kfence/kfence_test.c:346 Expected report_matches(&expect) to be true, but is false not ok 1 - test_out_of_bounds_read The corresponding call-trace is: BUG: KFENCE: out-of-bounds read in kunit_try_run_case+0x38/0x84 Out-of-bounds read at 0x(____ptrval____) (32B right of kfence-#10): kunit_try_run_case+0x38/0x84 kunit_generic_run_threadfn_adapter+0x12/0x1e kthread+0xc8/0xde ret_from_exception+0x0/0xc The kfence_test using the first frame of call trace to check whether the testcase is succeed or not. Commit 6a00ef449370 ("riscv: eliminate unreliable __builtin_frame_address(1)") skip first frame for all case, which results the kfence_test failed. Indeed, we only need to skip the first frame for case (task==NULL || task==current). With this patch, the call-trace will be: BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0x88/0x19e Out-of-bounds read at 0x(____ptrval____) (1B left of kfence-#7): test_out_of_bounds_read+0x88/0x19e kunit_try_run_case+0x38/0x84 kunit_generic_run_threadfn_adapter+0x12/0x1e kthread+0xc8/0xde ret_from_exception+0x0/0xc Fixes: 6a00ef449370 ("riscv: eliminate unreliable __builtin_frame_address(1)") Signed-off-by: Liu Shixin <liushixin2@huawei.com> Tested-by: Samuel Holland <samuel@sholland.org> Link: https://lore.kernel.org/r/20221207025038.1022045-1-liushixin2@huawei.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14ALSA: pci: lx6464es: fix a debug loopDan Carpenter
[ Upstream commit 5dac9f8dc25fefd9d928b98f6477ff3daefd73e3 ] This loop accidentally reuses the "i" iterator for both the inside and the outside loop. The value of MAX_STREAM_BUFFER is 5. I believe that chip->rmh.stat_len is in the 2-12 range. If the value of .stat_len is 4 or more then it will loop exactly one time, but if it's less then it is a forever loop. It looks like it was supposed to combined into one loop where conditions are checked. Fixes: 8e6320064c33 ("ALSA: lx_core: Remove useless #if 0 .. #endif") Signed-off-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/Y9jnJTis/mRFJAQp@kili Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14selftests: forwarding: lib: quote the sysctl valuesHangbin Liu
[ Upstream commit 3a082086aa200852545cf15159213582c0c80eba ] When set/restore sysctl value, we should quote the value as some keys may have multi values, e.g. net.ipv4.ping_group_range Fixes: f5ae57784ba8 ("selftests: forwarding: lib: Add sysctl_set(), sysctl_restore()") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://lore.kernel.org/r/20230208032110.879205-1-liuhangbin@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14rds: rds_rm_zerocopy_callback() use list_first_entry()Pietro Borrello
[ Upstream commit f753a68980cf4b59a80fe677619da2b1804f526d ] rds_rm_zerocopy_callback() uses list_entry() on the head of a list causing a type confusion. Use list_first_entry() to actually access the first element of the rs_zcookie_queue list. Fixes: 9426bbc6de99 ("rds: use list structure to track information for zerocopy completion notification") Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Pietro Borrello <borrello@diag.uniroma1.it> Link: https://lore.kernel.org/r/20230202-rds-zerocopy-v3-1-83b0df974f9a@diag.uniroma1.it Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14igc: Add ndo_tx_timeout supportSasha Neftin
[ Upstream commit 9b275176270efd18f2f4e328b32be1bad34c4c0d ] On some platforms, 100/1000/2500 speeds seem to have sometimes problems reporting false positive tx unit hang during stressful UDP traffic. Likely other Intel drivers introduce responses to a tx hang. Update the 'tx hang' comparator with the comparison of the head and tail of ring pointers and restore the tx_timeout_factor to the previous value (one). This can be test by using netperf or iperf3 applications. Example: iperf3 -s -p 5001 iperf3 -c 192.168.0.2 --udp -p 5001 --time 600 -b 0 netserver -p 16604 netperf -H 192.168.0.2 -l 600 -p 16604 -t UDP_STREAM -- -m 64000 Fixes: b27b8dc77b5e ("igc: Increase timeout value for Speed 100/1000/2500") Signed-off-by: Sasha Neftin <sasha.neftin@intel.com> Tested-by: Naama Meir <naamax.meir@linux.intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://lore.kernel.org/r/20230206235818.662384-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14net/mlx5: Serialize module cleanup with reload and removeShay Drory
[ Upstream commit 8f0d1451ecf7b3bd5a06ffc866c753d0f3ab4683 ] Currently, remove and reload flows can run in parallel to module cleanup. This design is error prone. For example: aux_drivers callbacks are called from both cleanup and remove flows with different lockings, which can cause a deadlock[1]. Hence, serialize module cleanup with reload and remove. [1] cleanup remove ------- ------ auxiliary_driver_unregister(); devl_lock() auxiliary_device_delete(mlx5e_aux) device_lock(mlx5e_aux) devl_lock() device_lock(mlx5e_aux) Fixes: 912cebf420c2 ("net/mlx5e: Connect ethernet part to auxiliary bus") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14net/mlx5: fw_tracer, Zero consumer index when reloading the tracerShay Drory
[ Upstream commit 184e1e4474dbcfebc4dbd1fa823a329978f25506 ] When tracer is reloaded, the device will log the traces at the beginning of the log buffer. Also, driver is reading the log buffer in chunks in accordance to the consumer index. Hence, zero consumer index when reloading the tracer. Fixes: 4383cfcc65e7 ("net/mlx5: Add devlink reload") Signed-off-by: Shay Drory <shayd@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14net/mlx5: fw_tracer, Clear load bit when freeing string DBs buffersShay Drory
[ Upstream commit db561fed6b8fa3878e74d5df6512a4a38152b63e ] Whenever the driver is reading the string DBs into buffers, the driver is setting the load bit, but the driver never clears this bit. As a result, in case load bit is on and the driver query the device for new string DBs, the driver won't read again the string DBs. Fix it by clearing the load bit when query the device for new string DBs. Fixes: 2d69356752ff ("net/mlx5: Add support for fw live patch event") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14net/mlx5e: IPoIB, Show unknown speed instead of errorDragos Tatulea
[ Upstream commit 8aa5f171d51c1cb69e5e3106df4dd1a446102823 ] ethtool is returning an error for unknown speeds for the IPoIB interface: $ ethtool ib0 netlink error: failed to retrieve link settings netlink error: Invalid argument netlink error: failed to retrieve link settings netlink error: Invalid argument Settings for ib0: Link detected: no After this change, ethtool will return success and show "unknown speed": $ ethtool ib0 Settings for ib0: Supported ports: [ ] Supported link modes: Not reported Supported pause frame use: No Supports auto-negotiation: No Supported FEC modes: Not reported Advertised link modes: Not reported Advertised pause frame use: No Advertised auto-negotiation: No Advertised FEC modes: Not reported Speed: Unknown! Duplex: Full Auto-negotiation: off Port: Other PHYAD: 0 Transceiver: internal Link detected: no Fixes: eb234ee9d541 ("net/mlx5e: IPoIB, Add support for get_link_ksettings in ethtool") Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14net/mlx5: Bridge, fix ageing of peer FDB entriesVlad Buslov
[ Upstream commit da0c52426cd23f8728eff72c2b2d2a3eb6b451f5 ] SWITCHDEV_FDB_ADD_TO_BRIDGE event handler that updates FDB entry 'lastuse' field is only executed for eswitch that owns the entry. However, if peer entry processed packets at least once it will have hardware counter 'used' value greater than entry 'lastuse' from that point on, which will cause FDB entry not being aged out. Process the event on all eswitch instances. Fixes: ff9b7521468b ("net/mlx5: Bridge, support LAG") Signed-off-by: Vlad Buslov <vladbu@nvidia.com> Reviewed-by: Maor Dickman <maord@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag changeAdham Faris
[ Upstream commit 1e66220948df815d7b37e0ff8b4627ce10433738 ] rq->hw_mtu is used in function en_rx.c/mlx5e_skb_from_cqe_mpwrq_linear() to catch oversized packets. If FCS is concatenated to the end of the packet then the check should be updated accordingly. Rx rings initialization (mlx5e_init_rxq_rq()) invoked for every new set of channels, as part of mlx5e_safe_switch_params(), unknowingly if it runs with default configuration or not. Current rq->hw_mtu initialization assumes default configuration and ignores params->scatter_fcs_en flag state. Fix this, by accounting for params->scatter_fcs_en flag state during rq->hw_mtu initialization. In addition, updating rq->hw_mtu value during ingress traffic might lead to packets drop and oversize_pkts_sw_drop counter increase with no good reason. Hence we remove this optimization and switch the set of channels with a new one, to make sure we don't get false positives on the oversize_pkts_sw_drop counter. Fixes: 102722fc6832 ("net/mlx5e: Add support for RXFCS feature flag") Signed-off-by: Adham Faris <afaris@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14net/mlx5e: Introduce the mlx5e_flush_rq functionMaxim Mikityanskiy
[ Upstream commit d9ba64deb2f1ad58eb3067c7485518f3e96559ee ] Add a function to flush an RQ: clean up descriptors, release pages and reset the RQ. This procedure is used by the recovery flow, and it will also be used in a following commit to free some memory when switching a channel to the XSK mode. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 1e66220948df ("net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag change") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14net/mlx5e: Move repeating clear_bit in mlx5e_rx_reporter_err_rq_cqe_recoverMaxim Mikityanskiy
[ Upstream commit e64d71d055ca01fa5054d25b99fb29b98e543a31 ] The same clear_bit is called in both error and success flows. Move the call to do it only once and remove the out label. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Reviewed-by: Saeed Mahameed <saeedm@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Saeed Mahameed <saeedm@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: 1e66220948df ("net/mlx5e: Update rx ring hw mtu upon each rx-fcs flag change") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14net: mscc: ocelot: fix VCAP filters not matching on MAC with "protocol 802.1Q"Vladimir Oltean
[ Upstream commit f964f8399df29d3e3ced77177cf35131cd2491bf ] Alternative short title: don't instruct the hardware to match on EtherType with "protocol 802.1Q" flower filters. It doesn't work for the reasons detailed below. With a command such as the following: tc filter add dev $swp1 ingress chain $(IS1 2) pref 3 \ protocol 802.1Q flower skip_sw vlan_id 200 src_mac $h1_mac \ action vlan modify id 300 \ action goto chain $(IS2 0 0) the created filter is set by ocelot_flower_parse_key() to be of type OCELOT_VCAP_KEY_ETYPE, and etype is set to {value=0x8100, mask=0xffff}. This gets propagated all the way to is1_entry_set() which commits it to hardware (the VCAP_IS1_HK_ETYPE field of the key). Compare this to the case where src_mac isn't specified - the key type is OCELOT_VCAP_KEY_ANY, and is1_entry_set() doesn't populate VCAP_IS1_HK_ETYPE. The problem is that for VLAN-tagged frames, the hardware interprets the ETYPE field as holding the encapsulated VLAN protocol. So the above filter will only match those packets which have an encapsulated protocol of 0x8100, rather than all packets with VLAN ID 200 and the given src_mac. The reason why this is allowed to occur is because, although we have a block of code in ocelot_flower_parse_key() which sets "match_protocol" to false when VLAN keys are present, that code executes too late. There is another block of code, which executes for Ethernet addresses, and has a "goto finished_key_parsing" and skips the VLAN header parsing. By skipping it, "match_protocol" remains with the value it was initialized with, i.e. "true", and "proto" is set to f->common.protocol, or 0x8100. The concept of ignoring some keys rather than erroring out when they are present but can't be offloaded is dubious in itself, but is present since the initial commit fe3490e6107e ("net: mscc: ocelot: Hardware ofload for tc flower filter"), and it's outside of the scope of this patch to change that. The problem was introduced when the driver started to interpret the flower filter's protocol, and populate the VCAP filter's ETYPE field based on it. To fix this, it is sufficient to move the code that parses the VLAN keys earlier than the "goto finished_key_parsing" instruction. This will ensure that if we have a flower filter with both VLAN and Ethernet address keys, it won't match on ETYPE 0x8100, because the VLAN key parsing sets "match_protocol = false". Fixes: 86b956de119c ("net: mscc: ocelot: support matching on EtherType") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230205192409.1796428-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14net: dsa: mt7530: don't change PVC_EG_TAG when CPU port becomes VLAN-awareVladimir Oltean
[ Upstream commit 0b6d6425103a676e2b6a81f3fd35d7ea4f9b90ec ] Frank reports that in a mt7530 setup where some ports are standalone and some are in a VLAN-aware bridge, 8021q uppers of the standalone ports lose their VLAN tag on xmit, as seen by the link partner. This seems to occur because once the other ports join the VLAN-aware bridge, mt7530_port_vlan_filtering() also calls mt7530_port_set_vlan_aware(ds, cpu_dp->index), and this affects the way that the switch processes the traffic of the standalone port. Relevant is the PVC_EG_TAG bit. The MT7530 documentation says about it: EG_TAG: Incoming Port Egress Tag VLAN Attribution 0: disabled (system default) 1: consistent (keep the original ingress tag attribute) My interpretation is that this setting applies on the ingress port, and "disabled" is basically the normal behavior, where the egress tag format of the packet (tagged or untagged) is decided by the VLAN table (MT7530_VLAN_EGRESS_UNTAG or MT7530_VLAN_EGRESS_TAG). But there is also an option of overriding the system default behavior, and for the egress tagging format of packets to be decided not by the VLAN table, but simply by copying the ingress tag format (if ingress was tagged, egress is tagged; if ingress was untagged, egress is untagged; aka "consistent). This is useful in 2 scenarios: - VLAN-unaware bridge ports will always encounter a miss in the VLAN table. They should forward a packet as-is, though. So we use "consistent" there. See commit e045124e9399 ("net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode"). - Traffic injected from the CPU port. The operating system is in god mode; if it wants a packet to exit as VLAN-tagged, it sends it as VLAN-tagged. Otherwise it sends it as VLAN-untagged*. *This is true only if we don't consider the bridge TX forwarding offload feature, which mt7530 doesn't support. So for now, make the CPU port always stay in "consistent" mode to allow software VLANs to be forwarded to their egress ports with the VLAN tag intact, and not stripped. Link: https://lore.kernel.org/netdev/trinity-e6294d28-636c-4c40-bb8b-b523521b00be-1674233135062@3c-app-gmx-bs36/ Fixes: e045124e9399 ("net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode") Reported-by: Frank Wunderlich <frank-w@public-files.de> Tested-by: Frank Wunderlich <frank-w@public-files.de> Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Link: https://lore.kernel.org/r/20230205140713.1609281-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14ice: Do not use WQ_MEM_RECLAIM flag for workqueueAnirudh Venkataramanan
[ Upstream commit 4d159f7884f78b1aacb99b4fc37d1e3cb1194e39 ] When both ice and the irdma driver are loaded, a warning in check_flush_dependency is being triggered. This is due to ice driver workqueue being allocated with the WQ_MEM_RECLAIM flag and the irdma one is not. According to kernel documentation, this flag should be set if the workqueue will be involved in the kernel's memory reclamation flow. Since it is not, there is no need for the ice driver's WQ to have this flag set so remove it. Example trace: [ +0.000004] workqueue: WQ_MEM_RECLAIM ice:ice_service_task [ice] is flushing !WQ_MEM_RECLAIM infiniband:0x0 [ +0.000139] WARNING: CPU: 0 PID: 728 at kernel/workqueue.c:2632 check_flush_dependency+0x178/0x1a0 [ +0.000011] Modules linked in: bonding tls xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_cha in_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink bridge stp llc rfkill vfat fat intel_rapl_msr intel _rapl_common isst_if_common skx_edac nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct1 0dif_pclmul crc32_pclmul ghash_clmulni_intel rapl intel_cstate rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_ core_mod ib_iser libiscsi scsi_transport_iscsi rdma_cm ib_cm iw_cm iTCO_wdt iTCO_vendor_support ipmi_ssif irdma mei_me ib_uverbs ib_core intel_uncore joydev pcspkr i2c_i801 acpi_ipmi mei lpc_ich i2c_smbus intel_pch_thermal ioatdma ipmi_si acpi_power_meter acpi_pad xfs libcrc32c sd_mod t10_pi crc64_rocksoft crc64 sg ahci ixgbe libahci ice i40e igb crc32c_intel mdio i2c_algo_bit liba ta dca wmi dm_mirror dm_region_hash dm_log dm_mod ipmi_devintf ipmi_msghandler fuse [ +0.000161] [last unloaded: bonding] [ +0.000006] CPU: 0 PID: 728 Comm: kworker/0:2 Tainted: G S 6.2.0-rc2_next-queue-13jan-00458-gc20aabd57164 #1 [ +0.000006] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0010.010620200716 01/06/2020 [ +0.000003] Workqueue: ice ice_service_task [ice] [ +0.000127] RIP: 0010:check_flush_dependency+0x178/0x1a0 [ +0.000005] Code: 89 8e 02 01 e8 49 3d 40 00 49 8b 55 18 48 8d 8d d0 00 00 00 48 8d b3 d0 00 00 00 4d 89 e0 48 c7 c7 e0 3b 08 9f e8 bb d3 07 01 <0f> 0b e9 be fe ff ff 80 3d 24 89 8e 02 00 0f 85 6b ff ff ff e9 06 [ +0.000004] RSP: 0018:ffff88810a39f990 EFLAGS: 00010282 [ +0.000005] RAX: 0000000000000000 RBX: ffff888141bc2400 RCX: 0000000000000000 [ +0.000004] RDX: 0000000000000001 RSI: dffffc0000000000 RDI: ffffffffa1213a80 [ +0.000003] RBP: ffff888194bf3400 R08: ffffed117b306112 R09: ffffed117b306112 [ +0.000003] R10: ffff888bd983088b R11: ffffed117b306111 R12: 0000000000000000 [ +0.000003] R13: ffff888111f84d00 R14: ffff88810a3943ac R15: ffff888194bf3400 [ +0.000004] FS: 0000000000000000(0000) GS:ffff888bd9800000(0000) knlGS:0000000000000000 [ +0.000003] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000003] CR2: 000056035b208b60 CR3: 000000017795e005 CR4: 00000000007706f0 [ +0.000003] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ +0.000003] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ +0.000002] PKRU: 55555554 [ +0.000003] Call Trace: [ +0.000002] <TASK> [ +0.000003] __flush_workqueue+0x203/0x840 [ +0.000006] ? mutex_unlock+0x84/0xd0 [ +0.000008] ? __pfx_mutex_unlock+0x10/0x10 [ +0.000004] ? __pfx___flush_workqueue+0x10/0x10 [ +0.000006] ? mutex_lock+0xa3/0xf0 [ +0.000005] ib_cache_cleanup_one+0x39/0x190 [ib_core] [ +0.000174] __ib_unregister_device+0x84/0xf0 [ib_core] [ +0.000094] ib_unregister_device+0x25/0x30 [ib_core] [ +0.000093] irdma_ib_unregister_device+0x97/0xc0 [irdma] [ +0.000064] ? __pfx_irdma_ib_unregister_device+0x10/0x10 [irdma] [ +0.000059] ? up_write+0x5c/0x90 [ +0.000005] irdma_remove+0x36/0x90 [irdma] [ +0.000062] auxiliary_bus_remove+0x32/0x50 [ +0.000007] device_release_driver_internal+0xfa/0x1c0 [ +0.000005] bus_remove_device+0x18a/0x260 [ +0.000007] device_del+0x2e5/0x650 [ +0.000005] ? __pfx_device_del+0x10/0x10 [ +0.000003] ? mutex_unlock+0x84/0xd0 [ +0.000004] ? __pfx_mutex_unlock+0x10/0x10 [ +0.000004] ? _raw_spin_unlock+0x18/0x40 [ +0.000005] ice_unplug_aux_dev+0x52/0x70 [ice] [ +0.000160] ice_service_task+0x1309/0x14f0 [ice] [ +0.000134] ? __pfx___schedule+0x10/0x10 [ +0.000006] process_one_work+0x3b1/0x6c0 [ +0.000008] worker_thread+0x69/0x670 [ +0.000005] ? __kthread_parkme+0xec/0x110 [ +0.000007] ? __pfx_worker_thread+0x10/0x10 [ +0.000005] kthread+0x17f/0x1b0 [ +0.000005] ? __pfx_kthread+0x10/0x10 [ +0.000004] ret_from_fork+0x29/0x50 [ +0.000009] </TASK> Fixes: 940b61af02f4 ("ice: Initialize PF and setup miscellaneous interrupt") Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Tested-by: Jakub Andrysiak <jakub.andrysiak@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14uapi: add missing ip/ipv6 header dependencies for linux/stddef.hHerton R. Krzesinski
[ Upstream commit 03702d4d29be4e2510ec80b248dbbde4e57030d9 ] Since commit 58e0be1ef6118 ("net: use struct_group to copy ip/ipv6 header addresses"), ip and ipv6 headers started to use the __struct_group definition, which is defined at include/uapi/linux/stddef.h. However, linux/stddef.h isn't explicitly included in include/uapi/linux/{ip,ipv6}.h, which breaks build of xskxceiver bpf selftest if you install the uapi headers in the system: $ make V=1 xskxceiver -C tools/testing/selftests/bpf ... make: Entering directory '(...)/tools/testing/selftests/bpf' gcc -g -O0 -rdynamic -Wall -Werror (...) In file included from xskxceiver.c:79: /usr/include/linux/ip.h:103:9: error: expected specifier-qualifier-list before ‘__struct_group’ 103 | __struct_group(/* no tag */, addrs, /* no attrs */, | ^~~~~~~~~~~~~~ ... Include the missing <linux/stddef.h> dependency in ip.h and do the same for the ipv6.h header. Fixes: 58e0be1ef611 ("net: use struct_group to copy ip/ipv6 header addresses") Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Reviewed-by: Carlos O'Donell <carlos@redhat.com> Tested-by: Carlos O'Donell <carlos@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14ionic: clean interrupt before enabling queue to avoid credit raceNeel Patel
[ Upstream commit e8797a058466b60fc5a3291b92430c93ba90eaff ] Clear the interrupt credits before enabling the queue rather than after to be sure that the enabled queue starts at 0 and that we don't wipe away possible credits after enabling the queue. Fixes: 0f3154e6bcb3 ("ionic: Add Tx and Rx handling") Signed-off-by: Neel Patel <neel.patel@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14net: phy: meson-gxl: use MMD access dummy stubs for GXL, internal PHYHeiner Kallweit
[ Upstream commit 69ff53e4a4c9498eeed7d1441f68a1481dc69251 ] Jerome provided the information that also the GXL internal PHY doesn't support MMD register access and EEE. MMD reads return 0xffff, what results in e.g. completely wrong ethtool --show-eee output. Therefore use the MMD dummy stubs. Fixes: d853d145ea3e ("net: phy: add an option to disable EEE advertisement") Suggested-by: Jerome Brunet <jbrunet@baylibre.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://lore.kernel.org/r/84432fe4-0be4-bc82-4e5c-557206b40f56@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14bonding: fix error checking in bond_debug_reregister()Qi Zheng
[ Upstream commit cbe83191d40d8925b7a99969d037d2a0caf69294 ] Since commit ff9fb72bc077 ("debugfs: return error values, not NULL") changed return value of debugfs_rename() in error cases from %NULL to %ERR_PTR(-ERROR), we should also check error values instead of NULL. Fixes: ff9fb72bc077 ("debugfs: return error values, not NULL") Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> Link: https://lore.kernel.org/r/20230202093256.32458-1-zhengqi.arch@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14net: phylink: move phy_device_free() to correctly release phy deviceClément Léger
[ Upstream commit ce93fdb5f2ca5c9e2a9668411cc39091507f8dc9 ] After calling fwnode_phy_find_device(), the phy device refcount is incremented. Then, when the phy device is attached to a netdev with phy_attach_direct(), the refcount is also incremented but only decremented in the caller if phy_attach_direct() fails. Move phy_device_free() before the "if" to always release it correctly. Indeed, either phy_attach_direct() failed and we don't want to keep a reference to the phydev or it succeeded and a reference has been taken internally. Fixes: 25396f680dd6 ("net: phylink: introduce phylink_fwnode_phy_connect()") Signed-off-by: Clément Léger <clement.leger@bootlin.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-14xfrm: fix bug with DSCP copy to v6 from v4 tunnelChristian Hopps
[ Upstream commit 6028da3f125fec34425dbd5fec18e85d372b2af6 ] When copying the DSCP bits for decap-dscp into IPv6 don't assume the outer encap is always IPv6. Instead, as with the inner IPv4 case, copy the DSCP bits from the correctly saved "tos" value in the control block. Fixes: 227620e29509 ("[IPSEC]: Separate inner/outer mode processing on input") Signed-off-by: Christian Hopps <chopps@chopps.org> Acked-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Sasha Levin <sashal@kernel.org>