linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2025-02-19	can: flexcan: Add quirk to handle separate interrupt lines for mailboxes	Ciprian Marian Costea
	Introduce 'FLEXCAN_QUIRK_SECONDARY_MB_IRQ' quirk to handle a FlexCAN hardware module integration particularity where two ranges of mailboxes are controlled by separate hardware interrupt lines. The same 'flexcan_irq' handler is used for both separate mailbox interrupt lines, with no other changes. Signed-off-by: Ciprian Marian Costea <ciprianmarian.costea@oss.nxp.com> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://patch.msgid.link/20250113120704.522307-3-ciprianmarian.costea@oss.nxp.com [mkl: flexcan_open(): change order and free irq_secondary_mb first] Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-02-19	can: c_can: Use syscon_regmap_lookup_by_phandle_args	Krzysztof Kozlowski
	Use syscon_regmap_lookup_by_phandle_args() which is a wrapper over syscon_regmap_lookup_by_phandle() combined with getting the syscon argument. Except simpler code this annotates within one line that given phandle has arguments, so grepping for code would be easier. There is also no real benefit in printing errors on missing syscon argument, because this is done just too late: runtime check on static/build-time data. Dtschema and Devicetree bindings offer the static/build-time check for this already. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://patch.msgid.link/20250212-syscon-phandle-args-can-v2-4-ac9a1253396b@linaro.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-02-19	can: c_can: Use of_property_present() to test existence of DT property	Krzysztof Kozlowski
	of_property_read_bool() should be used only on boolean properties. Cc: Rob Herring <robh@kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://patch.msgid.link/20250212-syscon-phandle-args-can-v2-3-ac9a1253396b@linaro.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-02-19	can: c_can: Simplify handling syscon error path	Krzysztof Kozlowski
	Use error handling block instead of open-coding it in one of probe failure cases. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://patch.msgid.link/20250212-syscon-phandle-args-can-v2-2-ac9a1253396b@linaro.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-02-19	can: c_can: Drop useless final probe failure message	Krzysztof Kozlowski
	Generic probe failure message is useless: does not give information what failed and it duplicates messages provided by the core, e.g. from memory allocation or platform_get_irq(). It also floods dmesg in case of deferred probe, e.g. resulting from devm_clk_get(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Link: https://patch.msgid.link/20250212-syscon-phandle-args-can-v2-1-ac9a1253396b@linaro.org Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-02-19	hv_netvsc: Use VF's tso_max_size value when data path is VF	Shradha Gupta
	On Azure, increasing VF's gso/gro packet size to up-to GSO_MAX_SIZE is not possible without allowing the same for netvsc NIC (as the NICs are bonded together). For bonded NICs, the min of the max aggregated pkt size of the members is propagated in the stack. Therefore, we use netif_set_tso_max_size() to set max aggregated pkt size to VF's packet size for netvsc too, when the data path is switched over to the VF Tested on azure env with Accelerated Networking enabled and disabled. Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-19	net: mana: Allow tso_max_size to go up-to GSO_MAX_SIZE	Shradha Gupta
	Allow the max aggregated pkt size to go up-to GSO_MAX_SIZE for MANA NIC. This patch only increases the max allowable gso/gro pkt size for MANA devices and does not change the defaults. Following are the perf benefits by increasing the pkt aggregate size from legacy gso_max_size value(64K) to newer one(up-to 511K IPv4 tests for i in {1..10}; do netperf -t TCP_RR -H 10.0.0.5 -p50000 -- -r80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT\|tail -1; done min p90 p99 Throughput gso_max_size 93 171 194 6594.25 97 154 180 7183.74 95 165 189 6927.86 96 165 188 6976.04 93 154 185 7338.05 64K 93 168 189 6938.03 94 169 189 6784.93 92 166 189 7117.56 94 179 191 6678.44 95 157 183 7277.81 min p90 p99 Throughput 93 134 146 8448.75 95 134 140 8396.54 94 137 148 8204.12 94 137 148 8244.41 94 128 139 8666.52 80K 94 141 153 8116.86 94 138 149 8163.92 92 135 142 8362.72 92 134 142 8497.57 93 136 148 8393.23 IPv6 Tests for i in {1..10}; do netperf -t TCP_RR -H fd00:9013:cadd::4 -p50000 -- -r80000,80000 -O MIN_LATENCY,P90_LATENCY,P99_LATENCY,THROUGHPUT\|tail -1; done min p90 p99 Throughput gso_max_size 108 165 170 6673.2 101 169 189 6451.69 101 165 169 6737.65 102 167 175 6614.64 101 178 189 6247.13 64K 107 163 169 6678.63 106 176 187 6350.86 100 164 169 6617.36 102 163 170 6849.21 102 168 175 6605.7 min p90 p99 Throughput 108 155 166 7183 110 154 163 7268.87 109 152 159 7434.35 107 145 157 7569.15 107 149 164 7496.17 80K 110 154 159 7245.85 108 156 162 7266.24 109 145 158 7526.66 106 145 151 7785.75 111 148 157 7246.65 Tested on azure env with Accelerated Networking enabled and disabled. Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-18	net: pse-pd: pd692x0: Fix power limit retrieval	Kory Maincent
	Fix incorrect data offset read in the pd692x0_pi_get_pw_limit callback. The issue was previously unnoticed as it was only used by the regulator API and not thoroughly tested, since the PSE is mainly controlled via ethtool. The function became actively used by ethtool after commit 3e9dbfec4998 ("net: pse-pd: Split ethtool_get_status into multiple callbacks"), which led to the discovery of this issue. Fix it by using the correct data offset. Fixes: a87e699c9d33 ("net: pse-pd: pd692x0: Enhance with new current limit and voltage read callbacks") Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20250217134812.1925345-1-kory.maincent@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	net: stmmac: Use str_enabled_disabled() helper	Yu-Chun Lin
	As kernel test robot reported, the following warning occurs: cocci warnings: (new ones prefixed by >>) >> drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c:582:6-8: opportunity for str_enabled_disabled(on) Replace ternary (condition ? "enabled" : "disabled") with str_enabled_disabled() from string_choices.h to improve readability, maintain uniform string usage, and reduce binary size through linker deduplication. Reviewed-by: Huacai Chen <chenhuacai@loongson.cn> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Yu-Chun Lin <eleanor15x@gmail.com> Link: https://patch.msgid.link/20250217155833.3105775-1-eleanor15x@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	net: ethernet: mediatek: add EEE support	Qingfang Deng
	Add EEE support to MediaTek SoC Ethernet. The register fields are similar to the ones in MT7531, except that the LPI threshold is in milliseconds. Signed-off-by: Qingfang Deng <dqfext@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20250217094022.1065436-1-dqfext@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	net: freescale: ucc_geth: make ugeth_mac_ops be static const	Pei Xiao
	sparse warning: sparse: symbol 'ugeth_mac_ops' was not declared. Should it be static. Add static to fix sparse warnings and add const. phylink_create() will accept a const struct. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/202502141128.9HfxcdIE-lkp@intel.com Signed-off-by: Pei Xiao <xiaopei01@kylinos.cn> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	net: phy: c45: remove local advertisement parameter from ↵	Heiner Kallweit
	genphy_c45_eee_is_active After the last user has gone, we can remove the local advertisement parameter from genphy_c45_eee_is_active. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/bd121330-9e28-4bc8-8422-794bd54d561f@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	net: phy: c45: use cached EEE advertisement in genphy_c45_ethtool_get_eee	Heiner Kallweit
	Now that disabled EEE modes are considered when populating advertising_eee, we can use this bitmap here instead of reading the PHY register. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/e57ed3d4-d0bc-4f91-83f6-8f48dfb6d7d7@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	net: phy: c45: Don't silently remove disabled EEE modes any longer when ↵	Heiner Kallweit
	writing advertisement register advertising_eee is adjusted now whenever an EEE mode gets disabled. Therefore we can remove the silent removal of disabled EEE modes here. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/e95b9dad-24a7-4e3e-9af9-6f0770cf1520@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	net: phy: remove disabled EEE modes from advertising_eee in phy_probe	Heiner Kallweit
	A PHY driver may populate eee_disabled_modes in its probe or get_features callback, therefore filter the EEE advertisement read from the PHY. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/493f3e2e-9cfc-445d-adbe-58d9c117a489@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	net: phy: realtek: add defines for shadowed c45 standard registers	Heiner Kallweit
	Realtek shadows standard c45 registers in VEND2 device register space. Add defines for these VEND2 registers, based on the names of the standard c45 registers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/c90bdf76-f8b8-4d06-9656-7a52d5658ee6@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	gve: set xdp redirect target only when it is available	Joshua Washington
	Before this patch the NETDEV_XDP_ACT_NDO_XMIT XDP feature flag is set by default as part of driver initialization, and is never cleared. However, this flag differs from others in that it is used as an indicator for whether the driver is ready to perform the ndo_xdp_xmit operation as part of an XDP_REDIRECT. Kernel helpers xdp_features_(set\|clear)_redirect_target exist to convey this meaning. This patch ensures that the netdev is only reported as a redirect target when XDP queues exist to forward traffic. Fixes: 39a7f4aa3e4a ("gve: Add XDP REDIRECT support for GQI-QPL format") Cc: stable@vger.kernel.org Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Link: https://patch.msgid.link/20250214224417.1237818-1-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	net: cadence: macb: Report standard stats	Sean Anderson
	Report standard statistics using the dedicated callbacks instead of get_ethtool_stats. OCTTX is split over two registers. Accumulating these registers separately in gem_stats just means we need to combine them again later. Instead, combine these stats before saving them, like is done for ethtool_stats. Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20250214212703.2618652-3-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	net: cadence: macb: Convert to get_stats64	Sean Anderson
	Convert the existing get_stats implementation to get_stats64. Since we now report 64-bit values, increase the counters to 64-bits as well. Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20250214212703.2618652-2-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	net: xilinx: axienet: Implement BQL	Sean Anderson
	Implement byte queue limits to allow queueing disciplines to account for packets enqueued in the ring buffers but not yet transmitted. Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20250214211252.2615573-1-sean.anderson@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	scsi: ufs: core: Fix ufshcd_is_ufs_dev_busy() and ufshcd_eh_timed_out()	Bart Van Assche
	ufshcd_is_ufs_dev_busy(), ufshcd_print_host_state() and ufshcd_eh_timed_out() are used in both modes (legacy mode and MCQ mode). hba->outstanding_reqs only represents the outstanding requests in legacy mode. Hence, change hba->outstanding_reqs into scsi_host_busy(hba->host) in these functions. Fixes: eacb139b77ff ("scsi: ufs: core: mcq: Enable multi-circular queue") Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20250214224352.3025151-1-bvanassche@acm.org Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2025-02-18	net: phy: realtek: add helper RTL822X_VND2_C22_REG	Heiner Kallweit
	C22 register space is mapped to 0xa400 in MMD VEND2 register space. Add a helper to access mapped C22 registers. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/6344277b-c5c7-449b-ac89-d5425306ca76@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	eth: mlx4: use the page pool for Rx buffers	Jakub Kicinski
	Simple conversion to page pool. Preserve the current fragmentation logic / page splitting. Each page starts with a single frag reference, and then we bump that when attaching to skbs. This can likely be optimized further. Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250213010635.1354034-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	eth: mlx4: remove the local XDP fast-recycling ring	Jakub Kicinski
	It will be replaced with page pool's built-in recycling. Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250213010635.1354034-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	eth: mlx4: don't try to complete XDP frames in netpoll	Jakub Kicinski
	mlx4 doesn't support ndo_xdp_xmit / XDP_REDIRECT and wasn't using page pool until now, so it could run XDP completions in netpoll (NAPI budget == 0) just fine. Page pool has calling context requirements, make sure we don't try to call it from what is potentially HW IRQ context. Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250213010635.1354034-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	eth: mlx4: create a page pool for Rx	Jakub Kicinski
	Create a pool per rx queue. Subsequent patches will make use of it. Move fcs_del to a hole to make space for the pointer. Per common "wisdom" base the page pool size on the ring size. Note that the page pool cache size is in full pages, so just round up the effective buffer size to pages. Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20250213010635.1354034-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	drm/xe/guc: Fix size_t print format	Lucas De Marchi
	Use %zx format to print size_t to remove the following warning when building for i386: >> drivers/gpu/drm/xe/xe_guc_ct.c:1727:43: warning: format specifies type 'unsigned long' but the argument has type 'size_t' (aka 'unsigned int') [-Wformat] 1727 \| drm_printf(p, "[CTB].length: 0x%lx\n", snapshot->ctb_size); \| ~~~ ^~~~~~~~~~~~~~~~~~ \| %zx Cc: José Roberto de Souza <jose.souza@intel.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202501281627.H6nj184e-lkp@intel.com/ Fixes: 643f209ba3fd ("drm/xe: Make GUC binaries dump consistent with other binaries in devcoredump") Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250128154242.3371687-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 7748289df510638ba61fed86b59ce7d2fb4a194c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-18	drm/xe: Make GUC binaries dump consistent with other binaries in devcoredump	José Roberto de Souza
	All other(hwsp, hwctx and vmas) binaries follow this format: [name].length: 0x1000 [name].data: xxxxxxx [name].error: errno The error one is just in case by some reason it was not able to capture the binary. So this GuC binaries should follow the same patern. v2: - renamed GUC binary to LOG Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250123202307.95103-3-jose.souza@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit cb1f868ca13756c0c18ba54d1591332476760d07) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2025-02-18	ACPI: platform_profile: Fix memory leak in profile_class_is_visible()	Kurt Borja
	If class_find_device() finds a device, it's reference count is incremented. Call put_device() to drop this reference before returning. Fixes: 77be5cacb2c2 ("ACPI: platform_profile: Create class for ACPI platform profile") Signed-off-by: Kurt Borja <kuurtb@gmail.com> Reviewed-by: Mark Pearson <mpearson-lenovo@squebb.ca> Link: https://patch.msgid.link/20250212193058.32110-1-kuurtb@gmail.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-02-18	apple-nvme: Support coprocessors left idle	Hector Martin
	iBoot on at least some firmwares/machines leaves ANS2 running, requiring a wake command instead of a CPU boot (and if we reset ANS2 in that state, everything breaks). Only stop the CPU if RTKit was running, and only do the reset dance if the CPU is stopped. Normal shutdown handoff: - RTKit not yet running - CPU detected not running - Reset - CPU powerup - RTKit boot wait ANS2 left running/idle: - RTKit not yet running - CPU detected running - RTKit wake message Sleep/resume cycle: - RTKit shutdown - CPU stopped - (sleep here) - CPU detected not running - Reset - CPU powerup - RTKit boot wait Shutdown or device removal: - RTKit shutdown - CPU stopped Therefore, the CPU running bit serves as a consistent flag of whether the coprocessor is fully stopped or just idle. Signed-off-by: Hector Martin <marcan@marcan.st> Reviewed-by: Neal Gompa <neal@gompa.dev> Reviewed-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	apple-nvme: Release power domains when probe fails	Hector Martin
	Signed-off-by: Hector Martin <marcan@marcan.st> Reviewed-by: Neal Gompa <neal@gompa.dev> Reviewed-by: Sven Peter <sven@svenpeter.dev> Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	nvmet: Use enum definitions instead of hardcoded values	Damien Le Moal
	Change the definition of the inline functions nvmet_cc_en(), nvmet_cc_css(), nvmet_cc_mps(), nvmet_cc_ams(), nvmet_cc_shn(), nvmet_cc_iosqes(), and nvmet_cc_iocqes() to use the enum difinitions in include/linux/nvme.h instead of hardcoded values. Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	nvme/ioctl: add missing space in err message	Caleb Sander Mateos
	nvme_validate_passthru_nsid() logs an err message whose format string is split over 2 lines. There is a missing space between the two pieces, resulting in log lines like "... does not match nsid (1)of namespace". Add the missing space between ")" and "of". Also combine the format string pieces onto a single line to make the err message easier to grep. Fixes: e7d4b5493a2d ("nvme: factor out a nvme_validate_passthru_nsid helper") Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	nvme-tcp: fix connect failure on receiving partial ICResp PDU	Caleb Sander Mateos
	nvme_tcp_init_connection() attempts to receive an ICResp PDU but only checks that the return value from recvmsg() is non-negative. If the sender closes the TCP connection or sends fewer than 128 bytes, this check will pass even though the full PDU wasn't received. Ensure the full ICResp PDU is received by checking that recvmsg() returns the expected 128 bytes. Additionally set the MSG_WAITALL flag for recvmsg(), as a sender could split the ICResp over multiple TCP frames. Without MSG_WAITALL, recvmsg() could return prematurely with only part of the PDU. Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver") Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	nvme: tcp: Fix compilation warning with W=1	Damien Le Moal
	When compiling with W=1, a warning result for the function nvme_tcp_set_queue_io_cpu(): host/tcp.c:1578: warning: Function parameter or struct member 'queue' not described in 'nvme_tcp_set_queue_io_cpu' host/tcp.c:1578: warning: expecting prototype for Track the number of queues assigned to each cpu using a global per(). Prototype was for nvme_tcp_set_queue_io_cpu() instead Avoid this warning by using the regular comment format for the function nvme_tcp_set_queue_io_cpu() instead of the kdoc comment format. Fixes: 32193789878c ("nvme-tcp: Fix I/O queue cpu spreading for multiple controllers") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	nvmet: pci-epf: Avoid RCU stalls under heavy workload	Damien Le Moal
	The delayed work item function nvmet_pci_epf_poll_sqs_work() polls all submission queues and keeps running in a loop as long as commands are being submitted by the host. Depending on the preemption configuration of the kernel, under heavy command workload, this function can thus run for more than RCU_CPU_STALL_TIMEOUT seconds, leading to a RCU stall: rcu: INFO: rcu_sched self-detected stall on CPU rcu: 5-....: (20998 ticks this GP) idle=4244/1/0x4000000000000000 softirq=301/301 fqs=5132 rcu: (t=21000 jiffies g=-443 q=12 ncpus=8) CPU: 5 UID: 0 PID: 82 Comm: kworker/5:1 Not tainted 6.14.0-rc2 #1 Hardware name: Radxa ROCK 5B (DT) Workqueue: events nvmet_pci_epf_poll_sqs_work [nvmet_pci_epf] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : dw_edma_device_tx_status+0xb8/0x130 lr : dw_edma_device_tx_status+0x9c/0x130 sp : ffff800080b5bbb0 x29: ffff800080b5bbb0 x28: ffff0331c5c78400 x27: ffff0331c1cd1960 x26: ffff0331c0e39010 x25: ffff0331c20e4000 x24: ffff0331c20e4a90 x23: 0000000000000000 x22: 0000000000000001 x21: 00000000005aca33 x20: ffff800080b5bc30 x19: ffff0331c123e370 x18: 000000000ab29e62 x17: ffffb2a878c9c118 x16: ffff0335bde82040 x15: 0000000000000000 x14: 000000000000017b x13: 00000000ee601780 x12: 0000000000000018 x11: 0000000000000000 x10: 0000000000000001 x9 : 0000000000000040 x8 : 00000000ee601780 x7 : 0000000105c785c0 x6 : ffff0331c1027d80 x5 : 0000000001ee7ad6 x4 : ffff0335bdea16c0 x3 : ffff0331c123e438 x2 : 00000000005aca33 x1 : 0000000000000000 x0 : ffff0331c123e410 Call trace: dw_edma_device_tx_status+0xb8/0x130 (P) dma_sync_wait+0x60/0xbc nvmet_pci_epf_dma_transfer+0x128/0x264 [nvmet_pci_epf] nvmet_pci_epf_poll_sqs_work+0x2a0/0x2e0 [nvmet_pci_epf] process_one_work+0x144/0x390 worker_thread+0x27c/0x458 kthread+0xe8/0x19c ret_from_fork+0x10/0x20 The solution for this is simply to explicitly allow rescheduling using cond_resched(). However, since doing so for every loop of nvmet_pci_epf_poll_sqs_work() significantly degrades performance (for 4K random reads using 4 I/O queues, the maximum IOPS goes down from 137 KIOPS to 110 KIOPS), call cond_resched() every second to avoid the RCU stalls. Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	nvmet: pci-epf: Do not uselessly write the CSTS register	Damien Le Moal
	The function nvmet_pci_epf_poll_cc_work() will do nothing if there are no changes to the controller configuration (CC) register. However, even for such case, this function still calls nvmet_update_cc() and uselessly writes the CSTS register. Avoid this by simply rescheduling the poll_cc work if the CC register has not changed. Also reschedule the poll_cc work if the function nvmet_pci_epf_enable_ctrl() fails to allow the host the chance to try again enabling the controller. While at it, since there is no point in trying to handle the CC register as quickly as possible, change the poll_cc work scheduling interval to 10 ms (from 5ms), to avoid excessive read accesses to that register. Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	nvmet: pci-epf: Correctly initialize CSTS when enabling the controller	Damien Le Moal
	The function nvmet_pci_epf_poll_cc_work() sets the NVME_CSTS_RDY bit of the controller status register (CSTS) when nvmet_pci_epf_enable_ctrl() returns success. However, since this function can be called several times (e.g. if the host reboots), instead of setting the bit in ctrl->csts, initialize this field to only have NVME_CSTS_RDY set. Conversely, if nvmet_pci_epf_enable_ctrl() fails, make sure to clear all bits from ctrl->csts. To simplify nvmet_pci_epf_poll_cc_work(), initialize ctrl->csts to NVME_CSTS_RDY directly inside nvmet_pci_epf_enable_ctrl() and clear this field in that function as well in case of a failure. To be consistent, move clearing the NVME_CSTS_RDY bit from ctrl->csts when the controller is being disabled from nvmet_pci_epf_poll_cc_work() into nvmet_pci_epf_disable_ctrl(). Fixes: 0faa0fe6f90e ("nvmet: New NVMe PCI endpoint function target driver") Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	nvmet-rdma: recheck queue state is LIVE in state lock in recv done	Ruozhu Li
	The queue state checking in nvmet_rdma_recv_done is not in queue state lock.Queue state can transfer to LIVE in cm establish handler between state checking and state lock here, cause a silent drop of nvme connect cmd. Recheck queue state whether in LIVE state in state lock to prevent this issue. Signed-off-by: Ruozhu Li <david.li@jaguarmicro.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	nvmet: Fix crash when a namespace is disabled	Hannes Reinecke
	The namespace percpu counter protects pending I/O, and we can only safely diable the namespace once the counter drop to zero. Otherwise we end up with a crash when running blktests/nvme/058 (eg for loop transport): [ 2352.930426] [ T53909] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000005: 0000 [#1] PREEMPT SMP KASAN PTI [ 2352.930431] [ T53909] KASAN: null-ptr-deref in range [0x0000000000000028-0x000000000000002f] [ 2352.930434] [ T53909] CPU: 3 UID: 0 PID: 53909 Comm: kworker/u16:5 Tainted: G W 6.13.0-rc6 #232 [ 2352.930438] [ T53909] Tainted: [W]=WARN [ 2352.930440] [ T53909] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-3.fc41 04/01/2014 [ 2352.930443] [ T53909] Workqueue: nvmet-wq nvme_loop_execute_work [nvme_loop] [ 2352.930449] [ T53909] RIP: 0010:blkcg_set_ioprio+0x44/0x180 as the queue is already torn down when calling submit_bio(); So we need to init the percpu counter in nvmet_ns_enable(), and wait for it to drop to zero in nvmet_ns_disable() to avoid having I/O pending after the namespace has been disabled. Fixes: 74d16965d7ac ("nvmet-loop: avoid using mutex in IO hotpath") Signed-off-by: Hannes Reinecke <hare@kernel.org> Reviewed-by: Nilay Shroff <nilay@linux.ibm.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com> Tested-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	nvme-tcp: add basic support for the C2HTermReq PDU	Maurizio Lombardi
	Previously, the NVMe/TCP host driver did not handle the C2HTermReq PDU, instead printing "unsupported pdu type (3)" when received. This patch adds support for processing the C2HTermReq PDU, allowing the driver to print the Fatal Error Status field. Example of output: nvme nvme4: Received C2HTermReq (FES = Invalid PDU Header Field) Signed-off-by: Maurizio Lombardi <mlombard@redhat.com> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	nvme-pci: quirk Acer FA100 for non-uniqueue identifiers	Christopher Lentocha
	In order for two Acer FA100 SSDs to work in one PC (in the case of myself, a Lenovo Legion T5 28IMB05), and not show one drive and not the other, and sometimes mix up what drive shows up (randomly), these two lines of code need to be added, and then both of the SSDs will show up and not conflict when booting off of one of them. If you boot up your computer with both SSDs installed without this patch, you may also randomly get into a kernel panic (if the initrd is not set up) or stuck in the initrd "/init" process, it is set up, however, if you do apply this patch, there should not be problems with booting or seeing both contents of the drive. Tested with the btrfs filesystem with a RAID configuration of having the root drive '/' combined to make two 256GB Acer FA100 SSDs become 512GB in total storage. Kernel Logs with patch applied (`dmesg -t \| grep -i nvm`): ``` ... nvme 0000:04:00.0: platform quirk: setting simple suspend nvme nvme0: pci function 0000:04:00.0 nvme 0000:05:00.0: platform quirk: setting simple suspend nvme nvme1: pci function 0000:05:00.0 nvme nvme1: missing or invalid SUBNQN field. nvme nvme1: allocated 64 MiB host memory buffer. nvme nvme0: missing or invalid SUBNQN field. nvme nvme0: allocated 64 MiB host memory buffer. nvme nvme1: 8/0/0 default/read/poll queues nvme nvme1: Ignoring bogus Namespace Identifiers nvme nvme0: 8/0/0 default/read/poll queues nvme nvme0: Ignoring bogus Namespace Identifiers nvme0n1: p1 p2 ... ``` Kernel Logs with patch not applied (`dmesg -t \| grep -i nvm`): ``` ... nvme 0000:04:00.0: platform quirk: setting simple suspend nvme nvme0: pci function 0000:04:00.0 nvme 0000:05:00.0: platform quirk: setting simple suspend nvme nvme1: pci function 0000:05:00.0 nvme nvme0: missing or invalid SUBNQN field. nvme nvme1: missing or invalid SUBNQN field. nvme nvme0: allocated 64 MiB host memory buffer. nvme nvme1: allocated 64 MiB host memory buffer. nvme nvme0: 8/0/0 default/read/poll queues nvme nvme1: 8/0/0 default/read/poll queues nvme nvme1: globally duplicate IDs for nsid 1 nvme nvme1: VID:DID 1dbe:5216 model:Acer SSD FA100 256GB firmware:1.Z.J.2X nvme0n1: p1 p2 ... ``` Signed-off-by: Christopher Lentocha <christopherericlentocha@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Keith Busch <kbusch@kernel.org>
2025-02-18	net: phy: marvell-88q2xxx: Init PHY private structure for mv88q211x	Niklas Söderlund
	When adding LED support for mv88q222x devices the PHY private data structure was added to the mv88q211x code path, the data structure is however only allocated during mv88q222x probe. This results in a nullptr deference for mv88q2110 devices. Unable to handle kernel NULL pointer dereference at virtual address 0000000000000001 Mem abort info: ESR = 0x0000000096000004 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault Data abort info: ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [0000000000000001] user address but active_mm is swapper Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP CPU: 3 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.14.0-rc1-arm64-renesas-00342-ga3783dbf2574 #7 Hardware name: Renesas White Hawk Single board based on r8a779g2 (DT) pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : mv88q2xxx_config_init+0x28/0x84 lr : mv88q2110_config_init+0x98/0xb0 sp : ffff8000823eb9d0 x29: ffff8000823eb9d0 x28: ffff000440942000 x27: ffff80008144e400 x26: 0000000000001002 x25: 0000000000000000 x24: 0000000000000000 x23: 0000000000000009 x22: ffff8000810534f0 x21: ffff800081053550 x20: 0000000000000000 x19: ffff0004437d6800 x18: 0000000000000018 x17: 00000000000961c8 x16: ffff0006bef75ec0 x15: 0000000000000001 x14: 0000000000000001 x13: ffff000440218080 x12: 071c71c71c71c71c x11: ffff000440218080 x10: 0000000000001420 x9 : ffff8000823eb770 x8 : ffff8000823eb650 x7 : ffff8000823eb750 x6 : ffff8000823eb710 x5 : 0000000000000000 x4 : 0000000000000800 x3 : 0000000000000001 x2 : 0000000000000000 x1 : 00000000ffffffff x0 : ffff0004437d6800 Call trace: mv88q2xxx_config_init+0x28/0x84 (P) mv88q2110_config_init+0x98/0xb0 phy_init_hw+0x64/0x9c phy_attach_direct+0x118/0x320 phy_connect_direct+0x24/0x80 of_phy_connect+0x5c/0xa0 rtsn_open+0x5bc/0x78c __dev_open+0xf8/0x1fc __dev_change_flags+0x198/0x220 dev_change_flags+0x20/0x64 ip_auto_config+0x270/0xefc do_one_initcall+0xe4/0x22c kernel_init_freeable+0x2a8/0x308 kernel_init+0x20/0x130 ret_from_fork+0x10/0x20 Code: b907e404 f9432814 3100083f 540000e3 (39400680) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b SMP: stopping secondary CPUs Kernel Offset: disabled CPU features: 0x000,00000070,00801250,8200700b Memory Limit: none ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]--- Fix this by using a generic probe function for both mv88q211x and mv88q222x devices that allocates the PHY private data structure, while only the mv88q222x probes for LED support. Fixes: a3783dbf2574 ("net: phy: marvell-88q2xxx: Add support for PHY LEDs on 88q2xxx") Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Link: https://patch.msgid.link/20250214174650.2056949-1-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-18	net: phy: marvell-88q2xxx: enable temperature sensor in mv88q2xxx_config_init	Dimitri Fedrau
	Temperature sensor gets enabled for 88Q222X devices in mv88q222x_config_init. Move enabling to mv88q2xxx_config_init because all 88Q2XXX devices support the temperature sensor. Signed-off-by: Dimitri Fedrau <dima.fedrau@gmail.com> Tested-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-18	net: phy: marvell-88q2xxx: order includes alphabetically	Dimitri Fedrau
	Order includes alphabetically. Signed-off-by: Dimitri Fedrau <dima.fedrau@gmail.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-18	net: phy: marvell-88q2xxx: align defines	Dimitri Fedrau
	Align some defines. Signed-off-by: Dimitri Fedrau <dima.fedrau@gmail.com> Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-18	vxlan: Join / leave MC group after remote changes	Petr Machata
	When a vxlan netdevice is brought up, if its default remote is a multicast address, the device joins the indicated group. Therefore when the multicast remote address changes, the device should leave the current group and subscribe to the new one. Similarly when the interface used for endpoint communication is changed in a situation when multicast remote is configured. This is currently not done. Both vxlan_igmp_join() and vxlan_igmp_leave() can however fail. So it is possible that with such fix, the netdevice will end up in an inconsistent situation where the old group is not joined anymore, but joining the new group fails. Should we join the new group first, and leave the old one second, we might end up in the opposite situation, where both groups are joined. Undoing any of this during rollback is going to be similarly problematic. One solution would be to just forbid the change when the netdevice is up. However in vnifilter mode, changing the group address is allowed, and these problems are simply ignored (see vxlan_vni_update_group()): # ip link add name br up type bridge vlan_filtering 1 # ip link add vx1 up master br type vxlan external vnifilter local 192.0.2.1 dev lo dstport 4789 # bridge vni add dev vx1 vni 200 group 224.0.0.1 # tcpdump -i lo & # bridge vni add dev vx1 vni 200 group 224.0.0.2 18:55:46.523438 IP 0.0.0.0 > 224.0.0.22: igmp v3 report, 1 group record(s) 18:55:46.943447 IP 0.0.0.0 > 224.0.0.22: igmp v3 report, 1 group record(s) # bridge vni dev vni group/remote vx1 200 224.0.0.2 Having two different modes of operation for conceptually the same interface is silly, so in this patch, just do what the vnifilter code does and deal with the errors by crossing fingers real hard. The vnifilter code leaves old before joining new, and in case of join / leave failures does not roll back the configuration changes that have already been applied, but bails out of joining if it could not leave. Do the same here: leave before join, apply changes unconditionally and do not attempt to join if we couldn't leave. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-18	vxlan: Drop 'changelink' parameter from vxlan_dev_configure()	Petr Machata
	vxlan_dev_configure() only has a single caller that passes false for the changelink parameter. Drop the parameter and inline the sole value. Signed-off-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-18	octeontx2-pf: AF_XDP zero copy transmit support	Suman Ghosh
	This patch implements below changes, 1. To avoid concurrency with normal traffic uses XDP queues. 2. Since there are chances that XDP and AF_XDP can fall under same queue uses separate flags to handle dma buffers. Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-18	octeontx2-pf: Prepare for AF_XDP	Suman Ghosh
	Implement necessary APIs required for AF_XDP transmit. Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Signed-off-by: Suman Ghosh <sumang@marvell.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>