linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
19 hours	rtc: pcf2127: fix SPI command byte for PCF2131 backport	Bruno Thomsen
	When commit fa78e9b606a472495ef5b6b3d8b45c37f7727f9d upstream was backported to LTS branches linux-6.12.y and linux-6.6.y, the SPI regmap config fix got applied to the I2C regmap config. Most likely due to a new RTC get/set parm feature introduced in 6.14 causing regmap config sections in the buttom of the driver to move. LTS branch linux-6.1.y and earlier does not have PCF2131 device support. Issue can be seen in buttom of this diff in stable/linux.git tree: git diff master..linux-6.12.y -- drivers/rtc/rtc-pcf2127.c Fixes: ee61aec8529e ("rtc: pcf2127: fix SPI command byte for PCF2131") Fixes: 5cdd1f73401d ("rtc: pcf2127: fix SPI command byte for PCF2131") Cc: stable@vger.kernel.org Cc: Alexandre Belloni <alexandre.belloni@bootlin.com> Cc: Elena Popa <elena.popa@nxp.com> Cc: Hugo Villeneuve <hvilleneuve@dimonoff.com> Signed-off-by: Bruno Thomsen <bruno.thomsen@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	xhci: dbc: Fix full DbC transfer ring after several reconnects	Mathias Nyman
	[ Upstream commit a5c98e8b1398534ae1feb6e95e2d3ee5215538ed ] Pending requests will be flushed on disconnect, and the corresponding TRBs will be turned into No-op TRBs, which are ignored by the xHC controller once it starts processing the ring. If the USB debug cable repeatedly disconnects before ring is started then the ring will eventually be filled with No-op TRBs. No new transfers can be queued when the ring is full, and driver will print the following error message: "xhci_hcd 0000:00:14.0: failed to queue trbs" This is a normal case for 'in' transfers where TRBs are always enqueued in advance, ready to take on incoming data. If no data arrives, and device is disconnected, then ring dequeue will remain at beginning of the ring while enqueue points to first free TRB after last cancelled No-op TRB. s Solve this by reinitializing the rings when the debug cable disconnects and DbC is leaving the configured state. Clear the whole ring buffer and set enqueue and dequeue to the beginning of ring, and set cycle bit to its initial state. Cc: stable@vger.kernel.org Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver") Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250902105306.877476-3-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	xhci: dbc: decouple endpoint allocation from initialization	Mathias Nyman
	[ Upstream commit 220a0ffde02f962c13bc752b01aa570b8c65a37b ] Decouple allocation of endpoint ring buffer from initialization of the buffer, and initialization of endpoint context parts from from the rest of the contexts. It allows driver to clear up and reinitialize endpoint rings after disconnect without reallocating everything. This is a prerequisite for the next patch that prevents the transfer ring from filling up with cancelled (no-op) TRBs if a debug cable is reconnected several times without transferring anything. Cc: stable@vger.kernel.org Fixes: dfba2174dc42 ("usb: xhci: Add DbC support in xHCI driver") Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20250902105306.877476-2-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: a5c98e8b1398 ("xhci: dbc: Fix full DbC transfer ring after several reconnects") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	usb: xhci: remove option to change a default ring's TRB cycle bit	Niklas Neronin
	[ Upstream commit e1b0fa863907a61e86acc19ce2d0633941907c8e ] The TRB cycle bit indicates TRB ownership by the Host Controller (HC) or Host Controller Driver (HCD). New rings are initialized with 'cycle_state' equal to one, and all its TRBs' cycle bits are set to zero. When handling ring expansion, set the source ring cycle bits to the same value as the destination ring. Move the cycle bit setting from xhci_segment_alloc() to xhci_link_rings(), and remove the 'cycle_state' argument from xhci_initialize_ring_info(). The xhci_segment_alloc() function uses kzalloc_node() to allocate segments, ensuring that all TRB cycle bits are initialized to zero. Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-12-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: a5c98e8b1398 ("xhci: dbc: Fix full DbC transfer ring after several reconnects") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	usb: xhci: introduce macro for ring segment list iteration	Niklas Neronin
	[ Upstream commit 3f970bd06c5295e742ef4f9cf7808a3cb74a6816 ] Add macro to streamline and standardize the iteration over ring segment list. xhci_for_each_ring_seg(): Iterates over the entire ring segment list. The xhci_free_segments_for_ring() function's while loop has not been updated to use the new macro. This function has some underlying issues, and as a result, it will be handled separately in a future patch. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Niklas Neronin <niklas.neronin@linux.intel.com> Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com> Link: https://lore.kernel.org/r/20241106101459.775897-11-mathias.nyman@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Stable-dep-of: a5c98e8b1398 ("xhci: dbc: Fix full DbC transfer ring after several reconnects") Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	vmxnet3: unregister xdp rxq info in the reset path	Sankararaman Jayaraman
	commit 0dd765fae295832934bf28e45dd5a355e0891ed4 upstream. vmxnet3 does not unregister xdp rxq info in the vmxnet3_reset_work() code path as vmxnet3_rq_destroy() is not invoked in this code path. So, we get below message with a backtrace. Missing unregister, handled but fix driver WARNING: CPU:48 PID: 500 at net/core/xdp.c:182 __xdp_rxq_info_reg+0x93/0xf0 This patch fixes the problem by moving the unregister code of XDP from vmxnet3_rq_destroy() to vmxnet3_rq_cleanup(). Fixes: 54f00cce1178 ("vmxnet3: Add XDP support.") Signed-off-by: Sankararaman Jayaraman <sankararaman.jayaraman@broadcom.com> Signed-off-by: Ronak Doshi <ronak.doshi@broadcom.com> Link: https://patch.msgid.link/20250320045522.57892-1-sankararaman.jayaraman@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> [ Ajay: Modified to apply on v6.6, v6.12 ] Signed-off-by: Ajay Kaher <ajay.kaher@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	platform/x86: asus-wmi: Re-add extra keys to ignore_key_wlan quirk	Antheas Kapenekakis
	commit 225d1ee0f5ba3218d1814d36564fdb5f37b50474 upstream. It turns out that the dual screen models use 0x5E for attaching and detaching the keyboard instead of 0x5F. So, re-add the codes by reverting commit cf3940ac737d ("platform/x86: asus-wmi: Remove extra keys from ignore_key_wlan quirk"). For our future reference, add a comment next to 0x5E indicating that it is used for that purpose. Fixes: cf3940ac737d ("platform/x86: asus-wmi: Remove extra keys from ignore_key_wlan quirk") Reported-by: Rahul Chandra <rahul@chandra.net> Closes: https://lore.kernel.org/all/10020-68c90c80-d-4ac6c580@106290038/ Cc: stable@kernel.org Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev> Link: https://patch.msgid.link/20250916072818.196462-1-lkml@antheas.dev Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	platform/x86: asus-wmi: Fix ROG button mapping, tablet mode on ASUS ROG Z13	Antheas Kapenekakis
	commit 132bfcd24925d4d4531a19b87acb8474be82a017 upstream. On commit 9286dfd5735b ("platform/x86: asus-wmi: Fix spurious rfkill on UX8406MA"), Mathieu adds a quirk for the Zenbook Duo to ignore the code 0x5f (WLAN button disable). On that laptop, this code is triggered when the device keyboard is attached. On the ASUS ROG Z13 2025, this code is triggered when pressing the side button of the device, which is used to open Armoury Crate in Windows. As this is becoming a pattern, where newer Asus laptops use this keycode for emitting events, let's convert the wlan ignore quirk to instead allow emitting codes, so that userspace programs can listen to it and so that it does not interfere with the rfkill state. With this patch, the Z13 wil emit KEY_PROG3 and the Duo will remain unchanged and emit no event. While at it, add a quirk for the Z13 to switch into tablet mode when removing the keyboard. Signed-off-by: Antheas Kapenekakis <lkml@antheas.dev> Link: https://lore.kernel.org/r/20250808154710.8981-2-lkml@antheas.dev Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	drm/xe: Fix a NULL vs IS_ERR() in xe_vm_add_compute_exec_queue()	Dan Carpenter
	[ Upstream commit cbc7f3b4f6ca19320e2eacf8fc1403d6f331ce14 ] The xe_preempt_fence_create() function returns error pointers. It never returns NULL. Update the error checking to match. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/aJTMBdX97cof_009@stanley.mountain Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 75cc23ffe5b422bc3cbd5cf0956b8b86e4b0e162) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	drm: bridge: cdns-mhdp8546: Fix missing mutex unlock on error path	Qi Xi
	[ Upstream commit 288dac9fb6084330d968459c750c838fd06e10e6 ] Add missing mutex unlock before returning from the error path in cdns_mhdp_atomic_enable(). Fixes: 935a92a1c400 ("drm: bridge: cdns-mhdp8546: Fix possible null pointer dereference") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20250904034447.665427-1-xiqi2@huawei.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	drm: bridge: anx7625: Fix NULL pointer dereference with early IRQ	Loic Poulain
	[ Upstream commit a10f910c77f280327b481e77eab909934ec508f0 ] If the interrupt occurs before resource initialization is complete, the interrupt handler/worker may access uninitialized data such as the I2C tcpc_client device, potentially leading to NULL pointer dereference. Signed-off-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Fixes: 8bdfc5dae4e3 ("drm/bridge: anx7625: Add anx7625 MIPI DSI/DPI to DP") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250709085438.56188-1-loic.poulain@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	drm/xe/tile: Release kobject for the failure path	Shuicheng Lin
	[ Upstream commit 013e484dbd687a9174acf8f4450217bdb86ad788 ] Call kobject_put() for the failure path to release the kobject v2: remove extra newline. (Matt) Fixes: e3d0839aa501 ("drm/xe/tile: Abort driver load for sysfs creation failure") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://lore.kernel.org/r/20250819153950.2973344-2-shuicheng.lin@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit b98775bca99511cc22ab459a2de646cd2fa7241f) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	drm/amd/display: Allow RX6xxx & RX7700 to invoke amdgpu_irq_get/put	Ivan Lipski
	commit 29a2f430475357f760679b249f33e7282688e292 upstream. [Why&How] As reported on https://gitlab.freedesktop.org/drm/amd/-/issues/3936, SMU hang can occur if the interrupts are not enabled appropriately, causing a vblank timeout. This patch reverts commit 5009628d8509 ("drm/amd/display: Remove unnecessary amdgpu_irq_get/put"), but only for RX6xxx & RX7700 GPUs, on which the issue was observed. This will re-enable interrupts regardless of whether the user space needed it or not. Fixes: 5009628d8509 ("drm/amd/display: Remove unnecessary amdgpu_irq_get/put") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3936 Suggested-by: Sun peng Li <sunpeng.li@amd.com> Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 95d168b367aa28a59f94fc690ff76ebf69312c6d) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	mmc: mvsdio: Fix dma_unmap_sg() nents value	Thomas Fourier
	commit 8ab2f1c35669bff7d7ed1bb16bf5cc989b3e2e17 upstream. The dma_unmap_sg() functions should be called with the same nents as the dma_map_sg(), not the value the map function returned. Fixes: 236caa7cc351 ("mmc: SDIO driver for Marvell SoCs") Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	iommu/amd/pgtbl: Fix possible race while increase page table level	Vasant Hegde
	commit 1e56310b40fd2e7e0b9493da9ff488af145bdd0c upstream. The AMD IOMMU host page table implementation supports dynamic page table levels (up to 6 levels), starting with a 3-level configuration that expands based on IOVA address. The kernel maintains a root pointer and current page table level to enable proper page table walks in alloc_pte()/fetch_pte() operations. The IOMMU IOVA allocator initially starts with 32-bit address and onces its exhuasted it switches to 64-bit address (max address is determined based on IOMMU and device DMA capability). To support larger IOVA, AMD IOMMU driver increases page table level. But in unmap path (iommu_v1_unmap_pages()), fetch_pte() reads pgtable->[root/mode] without lock. So its possible that in exteme corner case, when increase_address_space() is updating pgtable->[root/mode], fetch_pte() reads wrong page table level (pgtable->mode). It does compare the value with level encoded in page table and returns NULL. This will result is iommu_unmap ops to fail and upper layer may retry/log WARN_ON. CPU 0 CPU 1 ------ ------ map pages unmap pages alloc_pte() -> increase_address_space() iommu_v1_unmap_pages() -> fetch_pte() pgtable->root = pte (new root value) READ pgtable->[mode/root] Reads new root, old mode Updates mode (pgtable->mode += 1) Since Page table level updates are infrequent and already synchronized with a spinlock, implement seqcount to enable lock-free read operations on the read path. Fixes: 754265bcab7 ("iommu/amd: Fix race in increase_address_space()") Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Cc: stable@vger.kernel.org Cc: Joao Martins <joao.m.martins@oracle.com> Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com> Signed-off-by: Vasant Hegde <vasant.hegde@amd.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	iommu/vt-d: Fix __domain_mapping()'s usage of switch_to_super_page()	Eugene Koira
	commit dce043c07ca1ac19cfbe2844a6dc71e35c322353 upstream. switch_to_super_page() assumes the memory range it's working on is aligned to the target large page level. Unfortunately, __domain_mapping() doesn't take this into account when using it, and will pass unaligned ranges ultimately freeing a PTE range larger than expected. Take for example a mapping with the following iov_pfn range [0x3fe400, 0x4c0600), which should be backed by the following mappings: iov_pfn [0x3fe400, 0x3fffff] covered by 2MiB pages iov_pfn [0x400000, 0x4bffff] covered by 1GiB pages iov_pfn [0x4c0000, 0x4c05ff] covered by 2MiB pages Under this circumstance, __domain_mapping() will pass [0x400000, 0x4c05ff] to switch_to_super_page() at a 1 GiB granularity, which will in turn free PTEs all the way to iov_pfn 0x4fffff. Mitigate this by rounding down the iov_pfn range passed to switch_to_super_page() in __domain_mapping() to the target large page level. Additionally add range alignment checks to switch_to_super_page. Fixes: 9906b9352a35 ("iommu/vt-d: Avoid duplicate removing in __domain_mapping()") Signed-off-by: Eugene Koira <eugkoira@amazon.com> Cc: stable@vger.kernel.org Reviewed-by: Nicolas Saenz Julienne <nsaenz@amazon.com> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Link: https://lore.kernel.org/r/20250826143816.38686-1-eugkoira@amazon.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	dm-stripe: fix a possible integer overflow	Mikulas Patocka
	commit 1071d560afb4c245c2076494226df47db5a35708 upstream. There's a possible integer overflow in stripe_io_hints if we have too large chunk size. Test if the overflow happened, and if it did, don't set limits->io_min and limits->io_opt; Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Suggested-by: Dongsheng Yang <dongsheng.yang@linux.dev> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	dm-raid: don't set io_min and io_opt for raid1	Mikulas Patocka
	commit a86556264696b797d94238d99d8284d0d34ed960 upstream. These commands modprobe brd rd_size=1048576 vgcreate vg /dev/ram* lvcreate -m4 -L10 -n lv vg trigger the following warnings: device-mapper: table: 252:10: adding target device (start sect 0 len 24576) caused an alignment inconsistency device-mapper: table: 252:10: adding target device (start sect 0 len 24576) caused an alignment inconsistency The warnings are caused by the fact that io_min is 512 and physical block size is 4096. If there's chunk-less raid, such as raid1, io_min shouldn't be set to zero because it would be raised to 512 and it would trigger the warning. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	power: supply: bq27xxx: restrict no-battery detection to bq27000	H. Nikolaus Schaller
	commit 1e451977e1703b6db072719b37cd1b8e250b9cc9 upstream. There are fuel gauges in the bq27xxx series (e.g. bq27z561) which may in some cases report 0xff as the value of BQ27XXX_REG_FLAGS that should not be interpreted as "no battery" like for a disconnected battery with some built in bq27000 chip. So restrict the no-battery detection originally introduced by commit 3dd843e1c26a ("bq27000: report missing device better.") to the bq27000. There is no need to backport further because this was hidden before commit f16d9fb6cf03 ("power: supply: bq27xxx: Retrieve again when busy") Fixes: f16d9fb6cf03 ("power: supply: bq27xxx: Retrieve again when busy") Suggested-by: Jerry Lv <Jerry.Lv@axis.com> Cc: stable@vger.kernel.org Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Link: https://lore.kernel.org/r/dd979fa6855fd051ee5117016c58daaa05966e24.1755945297.git.hns@goldelico.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	power: supply: bq27xxx: fix error return in case of no bq27000 hdq battery	H. Nikolaus Schaller
	commit 2c334d038466ac509468fbe06905a32d202117db upstream. Since commit commit f16d9fb6cf03 ("power: supply: bq27xxx: Retrieve again when busy") the console log of some devices with hdq enabled but no bq27000 battery (like e.g. the Pandaboard) is flooded with messages like: [ 34.247833] power_supply bq27000-battery: driver failed to report 'status' property: -1 as soon as user-space is finding a /sys entry and trying to read the "status" property. It turns out that the offending commit changes the logic to now return the value of cache.flags if it is <0. This is likely under the assumption that it is an error number. In normal errors from bq27xxx_read() this is indeed the case. But there is special code to detect if no bq27000 is installed or accessible through hdq/1wire and wants to report this. In that case, the cache.flags are set historically by commit 3dd843e1c26a ("bq27000: report missing device better.") to constant -1 which did make reading properties return -ENODEV. So everything appeared to be fine before the return value was passed upwards. Now the -1 is returned as -EPERM instead of -ENODEV, triggering the error condition in power_supply_format_property() which then floods the console log. So we change the detection of missing bq27000 battery to simply set cache.flags = -ENODEV instead of -1. Fixes: f16d9fb6cf03 ("power: supply: bq27xxx: Retrieve again when busy") Cc: Jerry Lv <Jerry.Lv@axis.com> Cc: stable@vger.kernel.org Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Link: https://lore.kernel.org/r/692f79eb6fd541adb397038ea6e750d4de2deddf.1755945297.git.hns@goldelico.com Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
19 hours	octeontx2-pf: Fix use-after-free bugs in otx2_sync_tstamp()	Duoming Zhou
	[ Upstream commit f8b4687151021db61841af983f1cb7be6915d4ef ] The original code relies on cancel_delayed_work() in otx2_ptp_destroy(), which does not ensure that the delayed work item synctstamp_work has fully completed if it was already running. This leads to use-after-free scenarios where otx2_ptp is deallocated by otx2_ptp_destroy(), while synctstamp_work remains active and attempts to dereference otx2_ptp in otx2_sync_tstamp(). Furthermore, the synctstamp_work is cyclic, the likelihood of triggering the bug is nonnegligible. A typical race condition is illustrated below: CPU 0 (cleanup) \| CPU 1 (delayed work callback) otx2_remove() \| otx2_ptp_destroy() \| otx2_sync_tstamp() cancel_delayed_work() \| kfree(ptp) \| \| ptp = container_of(...); //UAF \| ptp-> //UAF This is confirmed by a KASAN report: BUG: KASAN: slab-use-after-free in __run_timer_base.part.0+0x7d7/0x8c0 Write of size 8 at addr ffff88800aa09a18 by task bash/136 ... Call Trace: <IRQ> dump_stack_lvl+0x55/0x70 print_report+0xcf/0x610 ? __run_timer_base.part.0+0x7d7/0x8c0 kasan_report+0xb8/0xf0 ? __run_timer_base.part.0+0x7d7/0x8c0 __run_timer_base.part.0+0x7d7/0x8c0 ? __pfx___run_timer_base.part.0+0x10/0x10 ? __pfx_read_tsc+0x10/0x10 ? ktime_get+0x60/0x140 ? lapic_next_event+0x11/0x20 ? clockevents_program_event+0x1d4/0x2a0 run_timer_softirq+0xd1/0x190 handle_softirqs+0x16a/0x550 irq_exit_rcu+0xaf/0xe0 sysvec_apic_timer_interrupt+0x70/0x80 </IRQ> ... Allocated by task 1: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 __kasan_kmalloc+0x7f/0x90 otx2_ptp_init+0xb1/0x860 otx2_probe+0x4eb/0xc30 local_pci_probe+0xdc/0x190 pci_device_probe+0x2fe/0x470 really_probe+0x1ca/0x5c0 __driver_probe_device+0x248/0x310 driver_probe_device+0x44/0x120 __driver_attach+0xd2/0x310 bus_for_each_dev+0xed/0x170 bus_add_driver+0x208/0x500 driver_register+0x132/0x460 do_one_initcall+0x89/0x300 kernel_init_freeable+0x40d/0x720 kernel_init+0x1a/0x150 ret_from_fork+0x10c/0x1a0 ret_from_fork_asm+0x1a/0x30 Freed by task 136: kasan_save_stack+0x24/0x50 kasan_save_track+0x14/0x30 kasan_save_free_info+0x3a/0x60 __kasan_slab_free+0x3f/0x50 kfree+0x137/0x370 otx2_ptp_destroy+0x38/0x80 otx2_remove+0x10d/0x4c0 pci_device_remove+0xa6/0x1d0 device_release_driver_internal+0xf8/0x210 pci_stop_bus_device+0x105/0x150 pci_stop_and_remove_bus_device_locked+0x15/0x30 remove_store+0xcc/0xe0 kernfs_fop_write_iter+0x2c3/0x440 vfs_write+0x871/0xd70 ksys_write+0xee/0x1c0 do_syscall_64+0xac/0x280 entry_SYSCALL_64_after_hwframe+0x77/0x7f ... Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure that the delayed work item is properly canceled before the otx2_ptp is deallocated. This bug was initially identified through static analysis. To reproduce and test it, I simulated the OcteonTX2 PCI device in QEMU and introduced artificial delays within the otx2_sync_tstamp() function to increase the likelihood of triggering the bug. Fixes: 2958d17a8984 ("octeontx2-pf: Add support for ptp 1-step mode on CN10K silicon") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	cnic: Fix use-after-free bugs in cnic_delete_task	Duoming Zhou
	[ Upstream commit cfa7d9b1e3a8604afc84e9e51d789c29574fb216 ] The original code uses cancel_delayed_work() in cnic_cm_stop_bnx2x_hw(), which does not guarantee that the delayed work item 'delete_task' has fully completed if it was already running. Additionally, the delayed work item is cyclic, the flush_workqueue() in cnic_cm_stop_bnx2x_hw() only blocks and waits for work items that were already queued to the workqueue prior to its invocation. Any work items submitted after flush_workqueue() is called are not included in the set of tasks that the flush operation awaits. This means that after the cyclic work items have finished executing, a delayed work item may still exist in the workqueue. This leads to use-after-free scenarios where the cnic_dev is deallocated by cnic_free_dev(), while delete_task remains active and attempt to dereference cnic_dev in cnic_delete_task(). A typical race condition is illustrated below: CPU 0 (cleanup) \| CPU 1 (delayed work callback) cnic_netdev_event() \| cnic_stop_hw() \| cnic_delete_task() cnic_cm_stop_bnx2x_hw() \| ... cancel_delayed_work() \| /* the queue_delayed_work() flush_workqueue() \| executes after flush_workqueue()*/ \| queue_delayed_work() cnic_free_dev(dev)//free \| cnic_delete_task() //new instance \| dev = cp->dev; //use Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure that the cyclic delayed work item is properly canceled and that any ongoing execution of the work item completes before the cnic_dev is deallocated. Furthermore, since cancel_delayed_work_sync() uses __flush_work(work, true) to synchronously wait for any currently executing instance of the work item to finish, the flush_workqueue() becomes redundant and should be removed. This bug was identified through static analysis. To reproduce the issue and validate the fix, I simulated the cnic PCI device in QEMU and introduced intentional delays — such as inserting calls to ssleep() within the cnic_delete_task() function — to increase the likelihood of triggering the bug. Fixes: fdf24086f475 ("cnic: Defer iscsi connection cleanup") Signed-off-by: Duoming Zhou <duoming@zju.edu.cn> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	net: liquidio: fix overflow in octeon_init_instr_queue()	Alexey Nepomnyashih
	[ Upstream commit cca7b1cfd7b8a0eff2a3510c5e0f10efe8fa3758 ] The expression `(conf->instr_type == 64) << iq_no` can overflow because `iq_no` may be as high as 64 (`CN23XX_MAX_RINGS_PER_PF`). Casting the operand to `u64` ensures correct 64-bit arithmetic. Fixes: f21fb3ed364b ("Add support of Cavium Liquidio ethernet adapters") Signed-off-by: Alexey Nepomnyashih <sdl@nppct.ru> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	Revert "net/mlx5e: Update and set Xon/Xoff upon port speed set"	Tariq Toukan
	[ Upstream commit 3fbfe251cc9f6d391944282cdb9bcf0bd02e01f8 ] This reverts commit d24341740fe48add8a227a753e68b6eedf4b385a. It causes errors when trying to configure QoS, as well as loss of L2 connectivity (on multi-host devices). Reported-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/20250910170011.70528106@kernel.org Fixes: d24341740fe4 ("net/mlx5e: Update and set Xon/Xoff upon port speed set") Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	octeon_ep: fix VF MAC address lifecycle handling	Sathesh B Edara
	[ Upstream commit a72175c985132885573593222a7b088cf49b07ae ] Currently, VF MAC address info is not updated when the MAC address is configured from VF, and it is not cleared when the VF is removed. This leads to stale or missing MAC information in the PF, which may cause incorrect state tracking or inconsistencies when VFs are hot-plugged or reassigned. Fix this by: - storing the VF MAC address in the PF when it is set from VF - clearing the stored VF MAC address when the VF is removed This ensures that the PF always has correct VF MAC state. Fixes: cde29af9e68e ("octeon_ep: add PF-VF mailbox communication") Signed-off-by: Sathesh B Edara <sedara@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250916133207.21737-1-sedara@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	bonding: don't set oif to bond dev when getting NS target destination	Hangbin Liu
	[ Upstream commit a8ba87f04ca9cdec06776ce92dce1395026dc3bb ] Unlike IPv4, IPv6 routing strictly requires the source address to be valid on the outgoing interface. If the NS target is set to a remote VLAN interface, and the source address is also configured on a VLAN over a bond interface, setting the oif to the bond device will fail to retrieve the correct destination route. Fix this by not setting the oif to the bond device when retrieving the NS target destination. This allows the correct destination device (the VLAN interface) to be determined, so that bond_verify_device_path can return the proper VLAN tags for sending NS messages. Reported-by: David Wilder <wilder@us.ibm.com> Closes: https://lore.kernel.org/netdev/aGOKggdfjv0cApTO@fedora/ Suggested-by: Jay Vosburgh <jv@jvosburgh.net> Tested-by: David Wilder <wilder@us.ibm.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Fixes: 4e24be018eb9 ("bonding: add new parameter ns_targets") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250916080127.430626-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	net/mlx5e: Harden uplink netdev access against device unbind	Jianbo Liu
	[ Upstream commit 6b4be64fd9fec16418f365c2d8e47a7566e9eba5 ] The function mlx5_uplink_netdev_get() gets the uplink netdevice pointer from mdev->mlx5e_res.uplink_netdev. However, the netdevice can be removed and its pointer cleared when unbound from the mlx5_core.eth driver. This results in a NULL pointer, causing a kernel panic. BUG: unable to handle page fault for address: 0000000000001300 at RIP: 0010:mlx5e_vport_rep_load+0x22a/0x270 [mlx5_core] Call Trace: <TASK> mlx5_esw_offloads_rep_load+0x68/0xe0 [mlx5_core] esw_offloads_enable+0x593/0x910 [mlx5_core] mlx5_eswitch_enable_locked+0x341/0x420 [mlx5_core] mlx5_devlink_eswitch_mode_set+0x17e/0x3a0 [mlx5_core] devlink_nl_eswitch_set_doit+0x60/0xd0 genl_family_rcv_msg_doit+0xe0/0x130 genl_rcv_msg+0x183/0x290 netlink_rcv_skb+0x4b/0xf0 genl_rcv+0x24/0x40 netlink_unicast+0x255/0x380 netlink_sendmsg+0x1f3/0x420 __sock_sendmsg+0x38/0x60 __sys_sendto+0x119/0x180 do_syscall_64+0x53/0x1d0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Ensure the pointer is valid before use by checking it for NULL. If it is valid, immediately call netdev_hold() to take a reference, and preventing the netdevice from being freed while it is in use. Fixes: 7a9fb35e8c3a ("net/mlx5e: Do not reload ethernet ports when changing eswitch mode") Signed-off-by: Jianbo Liu <jianbol@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/1757939074-617281-2-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	igc: don't fail igc_probe() on LED setup error	Kohei Enju
	[ Upstream commit 528eb4e19ec0df30d0c9ae4074ce945667dde919 ] When igc_led_setup() fails, igc_probe() fails and triggers kernel panic in free_netdev() since unregister_netdev() is not called. [1] This behavior can be tested using fault-injection framework, especially the failslab feature. [2] Since LED support is not mandatory, treat LED setup failures as non-fatal and continue probe with a warning message, consequently avoiding the kernel panic. [1] kernel BUG at net/core/dev.c:12047! Oops: invalid opcode: 0000 [#1] SMP NOPTI CPU: 0 UID: 0 PID: 937 Comm: repro-igc-led-e Not tainted 6.17.0-rc4-enjuk-tnguy-00865-gc4940196ab02 #64 PREEMPT(voluntary) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:free_netdev+0x278/0x2b0 [...] Call Trace: <TASK> igc_probe+0x370/0x910 local_pci_probe+0x3a/0x80 pci_device_probe+0xd1/0x200 [...] [2] #!/bin/bash -ex FAILSLAB_PATH=/sys/kernel/debug/failslab/ DEVICE=0000:00:05.0 START_ADDR=$(grep " igc_led_setup" /proc/kallsyms \ \| awk '{printf("0x%s", $1)}') END_ADDR=$(printf "0x%x" $((START_ADDR + 0x100))) echo $START_ADDR > $FAILSLAB_PATH/require-start echo $END_ADDR > $FAILSLAB_PATH/require-end echo 1 > $FAILSLAB_PATH/times echo 100 > $FAILSLAB_PATH/probability echo N > $FAILSLAB_PATH/ignore-gfp-wait echo $DEVICE > /sys/bus/pci/drivers/igc/bind Fixes: ea578703b03d ("igc: Add support for LEDs on i225/i226") Signed-off-by: Kohei Enju <enjuk@amazon.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Vitaly Lifshits <vitaly.lifshits@intel.com> Reviewed-by: Kurt Kanzenbach <kurt@linutronix.de> Tested-by: Mor Bar-Gabay <morx.bar.gabay@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	i40e: remove redundant memory barrier when cleaning Tx descs	Maciej Fijalkowski
	[ Upstream commit e37084a26070c546ae7961ee135bbfb15fbe13fd ] i40e has a feature which writes to memory location last descriptor successfully sent. Memory barrier in i40e_clean_tx_irq() was used to avoid forward-reading descriptor fields in case DD bit was not set. Having mentioned feature in place implies that such situation will not happen as we know in advance how many descriptors HW has dealt with. Besides, this barrier placement was wrong. Idea is to have this protection after reading DD bit from HW descriptor, not before. Digging through git history showed me that indeed barrier was before DD bit check, anyways the commit introducing i40e_get_head() should have wiped it out altogether. Also, there was one commit doing s/read_barrier_depends/smp_rmb when get head feature was already in place, but it was only theoretical based on ixgbe experiences, which is different in these terms as that driver has to read DD bit from HW descriptor. Fixes: 1943d8ba9507 ("i40e/i40evf: enable hardware feature head write back") Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	ice: fix Rx page leak on multi-buffer frames	Jacob Keller
	[ Upstream commit 84bf1ac85af84d354c7a2fdbdc0d4efc8aaec34b ] The ice_put_rx_mbuf() function handles calling ice_put_rx_buf() for each buffer in the current frame. This function was introduced as part of handling multi-buffer XDP support in the ice driver. It works by iterating over the buffers from first_desc up to 1 plus the total number of fragments in the frame, cached from before the XDP program was executed. If the hardware posts a descriptor with a size of 0, the logic used in ice_put_rx_mbuf() breaks. Such descriptors get skipped and don't get added as fragments in ice_add_xdp_frag. Since the buffer isn't counted as a fragment, we do not iterate over it in ice_put_rx_mbuf(), and thus we don't call ice_put_rx_buf(). Because we don't call ice_put_rx_buf(), we don't attempt to re-use the page or free it. This leaves a stale page in the ring, as we don't increment next_to_alloc. The ice_reuse_rx_page() assumes that the next_to_alloc has been incremented properly, and that it always points to a buffer with a NULL page. Since this function doesn't check, it will happily recycle a page over the top of the next_to_alloc buffer, losing track of the old page. Note that this leak only occurs for multi-buffer frames. The ice_put_rx_mbuf() function always handles at least one buffer, so a single-buffer frame will always get handled correctly. It is not clear precisely why the hardware hands us descriptors with a size of 0 sometimes, but it happens somewhat regularly with "jumbo frames" used by 9K MTU. To fix ice_put_rx_mbuf(), we need to make sure to call ice_put_rx_buf() on all buffers between first_desc and next_to_clean. Borrow the logic of a similar function in i40e used for this same purpose. Use the same logic also in ice_get_pgcnts(). Instead of iterating over just the number of fragments, use a loop which iterates until the current index reaches to the next_to_clean element just past the current frame. Unlike i40e, the ice_put_rx_mbuf() function does call ice_put_rx_buf() on the last buffer of the frame indicating the end of packet. For non-linear (multi-buffer) frames, we need to take care when adjusting the pagecnt_bias. An XDP program might release fragments from the tail of the frame, in which case that fragment page is already released. Only update the pagecnt_bias for the first descriptor and fragments still remaining post-XDP program. Take care to only access the shared info for fragmented buffers, as this avoids a significant cache miss. The xdp_xmit value only needs to be updated if an XDP program is run, and only once per packet. Drop the xdp_xmit pointer argument from ice_put_rx_mbuf(). Instead, set xdp_xmit in the ice_clean_rx_irq() function directly. This avoids needing to pass the argument and avoids an extra bit-wise OR for each buffer in the frame. Move the increment of the ntc local variable to ensure its updated before all calls to ice_get_pgcnts() or ice_put_rx_mbuf(), as the loop logic requires the index of the element just after the current frame. Now that we use an index pointer in the ring to identify the packet, we no longer need to track or cache the number of fragments in the rx_ring. Cc: Christoph Petrausch <christoph.petrausch@deepl.com> Cc: Jesper Dangaard Brouer <hawk@kernel.org> Reported-by: Jaroslav Pulchart <jaroslav.pulchart@gooddata.com> Closes: https://lore.kernel.org/netdev/CAK8fFZ4hY6GUJNENz3wY9jaYLZXGfpr7dnZxzGMYoE44caRbgw@mail.gmail.com/ Fixes: 743bbd93cf29 ("ice: put Rx buffers after being done with current frame") Tested-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Tested-by: Priya Singh <priyax.singh@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	ice: store max_frame and rx_buf_len only in ice_rx_ring	Jacob Keller
	[ Upstream commit 7e61c89c6065731dfc11ac7a2c0dd27a910f2afb ] The max_frame and rx_buf_len fields of the VSI set the maximum frame size for packets on the wire, and configure the size of the Rx buffer. In the hardware, these are per-queue configuration. Most VSI types use a simple method to determine the size of the buffers for all queues. However, VFs may potentially configure different values for each queue. While the Linux iAVF driver does not do this, it is allowed by the virtchnl interface. The current virtchnl code simply sets the per-VSI fields inbetween calls to ice_vsi_cfg_single_rxq(). This technically works, as these fields are only ever used when programming the Rx ring, and otherwise not checked again. However, it is confusing to maintain. The Rx ring also already has an rx_buf_len field in order to access the buffer length in the hotpath. It also has extra unused bytes in the ring structure which we can make use of to store the maximum frame size. Drop the VSI max_frame and rx_buf_len fields. Add max_frame to the Rx ring, and slightly re-order rx_buf_len to better fit into the gaps in the structure layout. Change the ice_vsi_cfg_frame_size function so that it writes to the ring fields. Call this function once per ring in ice_vsi_cfg_rxqs(). This is done over calling it inside the ice_vsi_cfg_rxq(), because ice_vsi_cfg_rxq() is called in the virtchnl flow where the max_frame and rx_buf_len have already been configured. Change the accesses for rx_buf_len and max_frame to all point to the ring structure. This has the added benefit that ice_vsi_cfg_rxq() no longer has the surprise side effect of updating ring->rx_buf_len based on the VSI field. Update the virtchnl ice_vc_cfg_qs_msg() function to set the ring values directly, and drop references to the removed VSI fields. This now makes the VF logic clear, as the ring fields are obviously per-queue. This reduces the required cognitive load when reasoning about this logic. Note that removing the VSI fields does leave a 4 byte gap, but the ice_vsi structure has many gaps, and its layout is not as critical in the hot path. The structure may benefit from a more thorough repacking, but no attempt was made in this change. Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Stable-dep-of: 84bf1ac85af8 ("ice: fix Rx page leak on multi-buffer frames") Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	net: natsemi: fix `rx_dropped` double accounting on `netif_rx()` failure	Yeounsu Moon
	[ Upstream commit 93ab4881a4e2b9657bdce4b8940073bfb4ed5eab ] `netif_rx()` already increments `rx_dropped` core stat when it fails. The driver was also updating `ndev->stats.rx_dropped` in the same path. Since both are reported together via `ip -s -s` command, this resulted in drops being counted twice in user-visible stats. Keep the driver update on `if (unlikely(!skb))`, but skip it after `netif_rx()` errors. Fixes: caf586e5f23c ("net: add a core netdev->rx_dropped counter") Signed-off-by: Yeounsu Moon <yyyynoom@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250913060135.35282-3-yyyynoom@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	bonding: set random address only when slaves already exist	Hangbin Liu
	[ Upstream commit 35ae4e86292ef7dfe4edbb9942955c884e984352 ] After commit 5c3bf6cba791 ("bonding: assign random address if device address is same as bond"), bonding will erroneously randomize the MAC address of the first interface added to the bond if fail_over_mac = follow. Correct this by additionally testing for the bond being empty before randomizing the MAC. Fixes: 5c3bf6cba791 ("bonding: assign random address if device address is same as bond") Reported-by: Qiuling Ren <qren@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20250910024336.400253-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	qed: Don't collect too many protection override GRC elements	Jamie Bainbridge
	[ Upstream commit 56c0a2a9ddc2f5b5078c5fb0f81ab76bbc3d4c37 ] In the protection override dump path, the firmware can return far too many GRC elements, resulting in attempting to write past the end of the previously-kmalloc'ed dump buffer. This will result in a kernel panic with reason: BUG: unable to handle kernel paging request at ADDRESS where "ADDRESS" is just past the end of the protection override dump buffer. The start address of the buffer is: p_hwfn->cdev->dbg_features[DBG_FEATURE_PROTECTION_OVERRIDE].dump_buf and the size of the buffer is buf_size in the same data structure. The panic can be arrived at from either the qede Ethernet driver path: [exception RIP: qed_grc_dump_addr_range+0x108] qed_protection_override_dump at ffffffffc02662ed [qed] qed_dbg_protection_override_dump at ffffffffc0267792 [qed] qed_dbg_feature at ffffffffc026aa8f [qed] qed_dbg_all_data at ffffffffc026b211 [qed] qed_fw_fatal_reporter_dump at ffffffffc027298a [qed] devlink_health_do_dump at ffffffff82497f61 devlink_health_report at ffffffff8249cf29 qed_report_fatal_error at ffffffffc0272baf [qed] qede_sp_task at ffffffffc045ed32 [qede] process_one_work at ffffffff81d19783 or the qedf storage driver path: [exception RIP: qed_grc_dump_addr_range+0x108] qed_protection_override_dump at ffffffffc068b2ed [qed] qed_dbg_protection_override_dump at ffffffffc068c792 [qed] qed_dbg_feature at ffffffffc068fa8f [qed] qed_dbg_all_data at ffffffffc0690211 [qed] qed_fw_fatal_reporter_dump at ffffffffc069798a [qed] devlink_health_do_dump at ffffffff8aa95e51 devlink_health_report at ffffffff8aa9ae19 qed_report_fatal_error at ffffffffc0697baf [qed] qed_hw_err_notify at ffffffffc06d32d7 [qed] qed_spq_post at ffffffffc06b1011 [qed] qed_fcoe_destroy_conn at ffffffffc06b2e91 [qed] qedf_cleanup_fcport at ffffffffc05e7597 [qedf] qedf_rport_event_handler at ffffffffc05e7bf7 [qedf] fc_rport_work at ffffffffc02da715 [libfc] process_one_work at ffffffff8a319663 Resolve this by clamping the firmware's return value to the maximum number of legal elements the firmware should return. Fixes: d52c89f120de8 ("qed*: Utilize FW 8.37.2.0") Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com> Link: https://patch.msgid.link/f8e1182934aa274c18d0682a12dbaf347595469c.1757485536.git.jamie.bainbridge@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	dpaa2-switch: fix buffer pool seeding for control traffic	Ioana Ciornei
	[ Upstream commit 2690cb089502b80b905f2abdafd1bf2d54e1abef ] Starting with commit c50e7475961c ("dpaa2-switch: Fix error checking in dpaa2_switch_seed_bp()"), the probing of a second DPSW object errors out like below. fsl_dpaa2_switch dpsw.1: fsl_mc_driver_probe failed: -12 fsl_dpaa2_switch dpsw.1: probe with driver fsl_dpaa2_switch failed with error -12 The aforementioned commit brought to the surface the fact that seeding buffers into the buffer pool destined for control traffic is not successful and an access violation recoverable error can be seen in the MC firmware log: [E, qbman_rec_isr:391, QBMAN] QBMAN recoverable event 0x1000000 This happens because the driver incorrectly used the ID of the DPBP object instead of the hardware buffer pool ID when trying to release buffers into it. This is because any DPSW object uses two buffer pools, one managed by the Linux driver and destined for control traffic packet buffers and the other one managed by the MC firmware and destined only for offloaded traffic. And since the buffer pool managed by the MC firmware does not have an external facing DPBP equivalent, any subsequent DPBP objects created after the first DPSW will have a DPBP id different to the underlying hardware buffer ID. The issue was not caught earlier because these two numbers can be identical when all DPBP objects are created before the DPSW objects are. This is the case when the DPL file is used to describe the entire DPAA2 object layout and objects are created at boot time and it's also true for the first DPSW being created dynamically using ls-addsw. Fix this by using the buffer pool ID instead of the DPBP id when releasing buffers into the pool. Fixes: 2877e4f7e189 ("staging: dpaa2-switch: setup buffer pool and RX path rings") Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com> Link: https://patch.msgid.link/20250910144825.2416019-1-ioana.ciornei@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	pcmcia: omap_cf: Mark driver struct with __refdata to prevent section mismatch	Geert Uytterhoeven
	[ Upstream commit d1dfcdd30140c031ae091868fb5bed084132bca1 ] As described in the added code comment, a reference to .exit.text is ok for drivers registered via platform_driver_probe(). Make this explicit to prevent the following section mismatch warning WARNING: modpost: drivers/pcmcia/omap_cf: section mismatch in reference: omap_cf_driver+0x4 (section: .data) -> omap_cf_remove (section: .exit.text) that triggers on an omap1_defconfig + CONFIG_OMAP_CF=m build. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Aaro Koskinen <aaro.koskinen@iki.fi> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	nvme: fix PI insert on write	Christoph Hellwig
	[ Upstream commit 7ac3c2889bc060c3f67cf44df0dbb093a835c176 ] I recently ran into an issue where the PI generated using the block layer integrity code differs from that from a kernel using the PRACT fallback when the block layer integrity code is disabled, and I tracked this down to us using PRACT incorrectly. The NVM Command Set Specification (section 5.33 in 1.2, similar in older versions) specifies the PRACT insert behavior as: Inserted protection information consists of the computed CRC for the protection information format (refer to section 5.3.1) in the Guard field, the LBAT field value in the Application Tag field, the LBST field value in the Storage Tag field, if defined, and the computed reference tag in the Logical Block Reference Tag. Where the computed reference tag is defined as following for type 1 and type 2 using the text below that is duplicated in the respective bullet points: the value of the computed reference tag for the first logical block of the command is the value contained in the Initial Logical Block Reference Tag (ILBRT) or Expected Initial Logical Block Reference Tag (EILBRT) field in the command, and the computed reference tag is incremented for each subsequent logical block. So we need to set ILBRT field, but we currently don't. Interestingly this works fine on my older type 1 formatted SSD, but Qemu trips up on this. We already set ILBRT for Write Same since commit aeb7bb061be5 ("nvme: set the PRACT bit when using Write Zeroes with T10 PI"). To ease this, move the PI type check into nvme_set_ref_tag. Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
19 hours	wifi: wilc1000: avoid buffer overflow in WID string configuration	Ajay.Kathat@microchip.com
	[ Upstream commit fe9e4d0c39311d0f97b024147a0d155333f388b5 ] Fix the following copy overflow warning identified by Smatch checker. drivers/net/wireless/microchip/wilc1000/wlan_cfg.c:184 wilc_wlan_parse_response_frame() error: '__memcpy()' 'cfg->s[i]->str' copy overflow (512 vs 65537) This patch introduces size check before accessing the memory buffer. The checks are base on the WID type of received data from the firmware. For WID string configuration, the size limit is determined by individual element size in 'struct wilc_cfg_str_vals' that is maintained in 'len' field of 'struct wilc_cfg_str'. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/linux-wireless/aLFbr9Yu9j_TQTey@stanley.mountain Suggested-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Ajay Singh <ajay.kathat@microchip.com> Link: https://patch.msgid.link/20250829225829.5423-1-ajay.kathat@microchip.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
7 days	drm/amdgpu: fix a memory leak in fence cleanup when unloading	Alex Deucher
	commit 7838fb5f119191403560eca2e23613380c0e425e upstream. Commit b61badd20b44 ("drm/amdgpu: fix usage slab after free") reordered when amdgpu_fence_driver_sw_fini() was called after that patch, amdgpu_fence_driver_sw_fini() effectively became a no-op as the sched entities we never freed because the ring pointers were already set to NULL. Remove the NULL setting. Reported-by: Lin.Cao <lincao12@amd.com> Cc: Vitaly Prosyak <vitaly.prosyak@amd.com> Cc: Christian König <christian.koenig@amd.com> Fixes: b61badd20b44 ("drm/amdgpu: fix usage slab after free") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a525fa37aac36c4591cc8b07ae8957862415fbd5) Cc: stable@vger.kernel.org [ Adapt to conditional check ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 days	drm/i915/power: fix size for for_each_set_bit() in abox iteration	Jani Nikula
	commit cfa7b7659757f8d0fc4914429efa90d0d2577dd7 upstream. for_each_set_bit() expects size to be in bits, not bytes. The abox mask iteration uses bytes, but it works by coincidence, because the local variable holding the mask is unsigned long, and the mask only ever has bit 2 as the highest bit. Using a smaller type could lead to subtle and very hard to track bugs. Fixes: 62afef2811e4 ("drm/i915/rkl: RKL uses ABOX0 for pixel transfers") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: stable@vger.kernel.org # v5.9+ Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250905104149.1144751-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit 7ea3baa6efe4bb93d11e1c0e6528b1468d7debf6) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> [ adapted struct intel_display display parameters to struct drm_i915_private dev_priv ] Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 days	net: mdiobus: release reset_gpio in mdiobus_unregister_device()	Buday Csaba
	commit 8ea25274ebaf2f6be8be374633b2ed8348ec0e70 upstream. reset_gpio is claimed in mdiobus_register_device(), but it is not released in mdiobus_unregister_device(). It is instead only released when the whole MDIO bus is unregistered. When a device uses the reset_gpio property, it becomes impossible to unregister it and register it again, because the GPIO remains claimed. This patch resolves that issue. Fixes: bafbdd527d56 ("phylib: Add device reset GPIO support") # see notes Reviewed-by: Andrew Lunn <andrew@lunn.ch> Cc: Csókás Bence <csokas.bence@prolan.hu> [ csokas.bence: Resolve rebase conflict and clarify msg ] Signed-off-by: Buday Csaba <buday.csaba@prolan.hu> Link: https://patch.msgid.link/20250807135449.254254-2-csokas.bence@prolan.hu Signed-off-by: Paolo Abeni <pabeni@redhat.com> [ csokas.bence: Use the v1 patch on top of 6.12, as specified in notes ] Signed-off-by: Bence Csókás <csokas.bence@prolan.hu> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 days	phy: ti-pipe3: fix device leak at unbind	Johan Hovold
	commit e19bcea99749ce8e8f1d359f68ae03210694ad56 upstream. Make sure to drop the reference to the control device taken by of_find_device_by_node() during probe when the driver is unbound. Fixes: 918ee0d21ba4 ("usb: phy: omap-usb3: Don't use omap_get_control_dev()") Cc: stable@vger.kernel.org # 3.13 Cc: Roger Quadros <rogerq@kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20250724131206.2211-4-johan@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 days	phy: ti: omap-usb2: fix device leak at unbind	Johan Hovold
	commit 64961557efa1b98f375c0579779e7eeda1a02c42 upstream. Make sure to drop the reference to the control device taken by of_find_device_by_node() during probe when the driver is unbound. Fixes: 478b6c7436c2 ("usb: phy: omap-usb2: Don't use omap_get_control_dev()") Cc: stable@vger.kernel.org # 3.13 Cc: Roger Quadros <rogerq@kernel.org> Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20250724131206.2211-3-johan@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 days	phy: tegra: xusb: fix device and OF node leak at probe	Johan Hovold
	commit bca065733afd1e3a89a02f05ffe14e966cd5f78e upstream. Make sure to drop the references taken to the PMC OF node and device by of_parse_phandle() and of_find_device_by_node() during probe. Note the holding a reference to the PMC device does not prevent the PMC regmap from going away (e.g. if the PMC driver is unbound) so there is no need to keep the reference. Fixes: 2d1021487273 ("phy: tegra: xusb: Add wake/sleepwalk for Tegra210") Cc: stable@vger.kernel.org # 5.14 Cc: JC Kuo <jckuo@nvidia.com> Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20250724131206.2211-2-johan@kernel.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 days	dmaengine: dw: dmamux: Fix device reference leak in rzn1_dmamux_route_allocate	Miaoqian Lin
	commit aa2e1e4563d3ab689ffa86ca1412ecbf9fd3b308 upstream. The reference taken by of_find_device_by_node() must be released when not needed anymore. Add missing put_device() call to fix device reference leaks. Fixes: 134d9c52fca2 ("dmaengine: dw: dmamux: Introduce RZN1 DMA router support") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com> Link: https://lore.kernel.org/r/20250902090358.2423285-1-linmq006@gmail.com Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 days	dmaengine: qcom: bam_dma: Fix DT error handling for num-channels/ees	Stephan Gerhold
	commit 5068b5254812433e841a40886e695633148d362d upstream. When we don't have a clock specified in the device tree, we have no way to ensure the BAM is on. This is often the case for remotely-controlled or remotely-powered BAM instances. In this case, we need to read num-channels from the DT to have all the necessary information to complete probing. However, at the moment invalid device trees without clock and without num-channels still continue probing, because the error handling is missing return statements. The driver will then later try to read the number of channels from the registers. This is unsafe, because it relies on boot firmware and lucky timing to succeed. Unfortunately, the lack of proper error handling here has been abused for several Qualcomm SoCs upstream, causing early boot crashes in several situations [1, 2]. Avoid these early crashes by erroring out when any of the required DT properties are missing. Note that this will break some of the existing DTs upstream (mainly BAM instances related to the crypto engine). However, clearly these DTs have never been tested properly, since the error in the kernel log was just ignored. It's safer to disable the crypto engine for these broken DTBs. [1]: https://lore.kernel.org/r/CY01EKQVWE36.B9X5TDXAREPF@fairphone.com/ [2]: https://lore.kernel.org/r/20230626145959.646747-1-krzysztof.kozlowski@linaro.org/ Cc: stable@vger.kernel.org Fixes: 48d163b1aa6e ("dmaengine: qcom: bam_dma: get num-channels and num-ees from dt") Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250212-bam-dma-fixes-v1-8-f560889e65d8@linaro.org Signed-off-by: Vinod Koul <vkoul@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 days	usb: gadget: midi2: Fix MIDI2 IN EP max packet size	Takashi Iwai
	commit 116e79c679a1530cf833d0ff3007061d7a716bd9 upstream. The EP-IN of MIDI2 (altset 1) wasn't initialized in f_midi2_create_usb_configs() as it's an INT EP unlike others BULK EPs. But this leaves rather the max packet size unchanged no matter which speed is used, resulting in the very slow access. And the wMaxPacketSize values set there look legit for INT EPs, so let's initialize the MIDI2 EP-IN there for achieving the equivalent speed as well. Fixes: 8b645922b223 ("usb: gadget: Add support for USB MIDI 2.0 function driver") Cc: stable <stable@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20250905133240.20966-1-tiwai@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 days	usb: gadget: midi2: Fix missing UMP group attributes initialization	Takashi Iwai
	commit 21d8525d2e061cde034277d518411b02eac764e2 upstream. The gadget card driver forgot to call snd_ump_update_group_attrs() after adding FBs, and this leaves the UMP group attributes uninitialized. As a result, -ENODEV error is returned at opening a legacy rawmidi device as an inactive group. This patch adds the missing call to address the behavior above. Fixes: 8b645922b223 ("usb: gadget: Add support for USB MIDI 2.0 function driver") Cc: stable <stable@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de> Link: https://lore.kernel.org/r/20250904153932.13589-1-tiwai@suse.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 days	usb: typec: tcpm: properly deliver cable vdms to altmode drivers	RD Babiera
	commit f34bfcc77b18375a87091c289c2eb53c249787b4 upstream. tcpm_handle_vdm_request delivers messages to the partner altmode or the cable altmode depending on the SVDM response type, which is incorrect. The partner or cable should be chosen based on the received message type instead. Also add this filter to ADEV_NOTIFY_USB_AND_QUEUE_VDM, which is used when the Enter Mode command is responded to by a NAK on SOP or SOP' and when the Exit Mode command is responded to by an ACK on SOP. Fixes: 7e7877c55eb1 ("usb: typec: tcpm: add alt mode enter/exit/vdm support for sop'") Cc: stable@vger.kernel.org Signed-off-by: RD Babiera <rdbabiera@google.com> Reviewed-by: Badhri Jagan Sridharan <badhri@google.com> Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com> Link: https://lore.kernel.org/r/20250821203759.1720841-2-rdbabiera@google.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
7 days	USB: gadget: dummy-hcd: Fix locking bug in RT-enabled kernels	Alan Stern
	commit 8d63c83d8eb922f6c316320f50c82fa88d099bea upstream. Yunseong Kim and the syzbot fuzzer both reported a problem in RT-enabled kernels caused by the way dummy-hcd mixes interrupt management and spin-locking. The pattern was: local_irq_save(flags); spin_lock(&dum->lock); ... spin_unlock(&dum->lock); ... // calls usb_gadget_giveback_request() local_irq_restore(flags); The code was written this way because usb_gadget_giveback_request() needs to be called with interrupts disabled and the private lock not held. While this pattern works fine in non-RT kernels, it's not good when RT is enabled. RT kernels handle spinlocks much like mutexes; in particular, spin_lock() may sleep. But sleeping is not allowed while local interrupts are disabled. To fix the problem, rewrite the code to conform to the pattern used elsewhere in dummy-hcd and other UDC drivers: spin_lock_irqsave(&dum->lock, flags); ... spin_unlock(&dum->lock); usb_gadget_giveback_request(...); spin_lock(&dum->lock); ... spin_unlock_irqrestore(&dum->lock, flags); This approach satisfies the RT requirements. Signed-off-by: Alan Stern <stern@rowland.harvard.edu> Cc: stable <stable@kernel.org> Fixes: b4dbda1a22d2 ("USB: dummy-hcd: disable interrupts during req->complete") Reported-by: Yunseong Kim <ysk@kzalloc.com> Closes: <https://lore.kernel.org/linux-usb/5b337389-73b9-4ee4-a83e-7e82bf5af87a@kzalloc.com/> Reported-by: syzbot+8baacc4139f12fa77909@syzkaller.appspotmail.com Closes: <https://lore.kernel.org/linux-usb/68ac2411.050a0220.37038e.0087.GAE@google.com/> Tested-by: syzbot+8baacc4139f12fa77909@syzkaller.appspotmail.com CC: Sebastian Andrzej Siewior <bigeasy@linutronix.de> CC: stable@vger.kernel.org Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/bb192ae2-4eee-48ee-981f-3efdbbd0d8f0@rowland.harvard.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>