summaryrefslogtreecommitdiff
path: root/drivers
AgeCommit message (Collapse)Author
2025-04-29igb: Get rid of spurious interruptsKurt Kanzenbach
When running the igc with XDP/ZC in busy polling mode with deferral of hard interrupts, interrupts still happen from time to time. That is caused by the igb task watchdog which triggers Rx interrupts periodically. That mechanism has been introduced to overcome skb/memory allocation failures [1]. So the Rx clean functions stop processing the Rx ring in case of such failure. The task watchdog triggers Rx interrupts periodically in the hope that memory became available in the mean time. The current behavior is undesirable for real time applications, because the driver induced Rx interrupts trigger also the softirq processing. However, all real time packets should be processed by the application which uses the busy polling method. Therefore, only trigger the Rx interrupts in case of real allocation failures. Introduce a new flag for signaling that condition. Follow the same logic as in commit 8dcf2c212078 ("igc: Get rid of spurious interrupts"). [1] - https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/commit/?id=3be507547e6177e5c808544bd6a2efa2c7f1d436 Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Sweta Kumari <sweta.kumari@intel.com> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-29igb: Add support for persistent NAPI configKurt Kanzenbach
Use netif_napi_add_config() to assign persistent per-NAPI config. This is useful for preserving NAPI settings when changing queue counts or for user space programs using SO_INCOMING_NAPI_ID. Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-29igb: Link queues to NAPI instancesKurt Kanzenbach
Link queues to NAPI instances via netdev-genl API. This is required to use XDP/ZC busy polling. See commit 5ef44b3cb43b ("xsk: Bring back busy polling support") for details. This also allows users to query the info with netlink: |$ ./tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ | --dump queue-get --json='{"ifindex": 2}' |[{'id': 0, 'ifindex': 2, 'napi-id': 8201, 'type': 'rx'}, | {'id': 1, 'ifindex': 2, 'napi-id': 8202, 'type': 'rx'}, | {'id': 2, 'ifindex': 2, 'napi-id': 8203, 'type': 'rx'}, | {'id': 3, 'ifindex': 2, 'napi-id': 8204, 'type': 'rx'}, | {'id': 0, 'ifindex': 2, 'napi-id': 8201, 'type': 'tx'}, | {'id': 1, 'ifindex': 2, 'napi-id': 8202, 'type': 'tx'}, | {'id': 2, 'ifindex': 2, 'napi-id': 8203, 'type': 'tx'}, | {'id': 3, 'ifindex': 2, 'napi-id': 8204, 'type': 'tx'}] Add rtnl locking to PCI error handlers, because netif_queue_set_napi() requires the lock held. While at __igb_open() use RCT coding style. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Sweta Kumari <sweta.kumari@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-29igb: Link IRQs to NAPI instancesKurt Kanzenbach
Link IRQs to NAPI instances via netdev-genl API. This allows users to query that information via netlink: |$ ./tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \ | --dump napi-get --json='{"ifindex": 2}' |[{'defer-hard-irqs': 0, | 'gro-flush-timeout': 0, | 'id': 8204, | 'ifindex': 2, | 'irq': 127, | 'irq-suspend-timeout': 0}, | {'defer-hard-irqs': 0, | 'gro-flush-timeout': 0, | 'id': 8203, | 'ifindex': 2, | 'irq': 126, | 'irq-suspend-timeout': 0}, | {'defer-hard-irqs': 0, | 'gro-flush-timeout': 0, | 'id': 8202, | 'ifindex': 2, | 'irq': 125, | 'irq-suspend-timeout': 0}, | {'defer-hard-irqs': 0, | 'gro-flush-timeout': 0, | 'id': 8201, | 'ifindex': 2, | 'irq': 124, | 'irq-suspend-timeout': 0}] |$ cat /proc/interrupts | grep enp2s0 |123: 0 1 IR-PCI-MSIX-0000:02:00.0 0-edge enp2s0 |124: 0 7 IR-PCI-MSIX-0000:02:00.0 1-edge enp2s0-TxRx-0 |125: 0 0 IR-PCI-MSIX-0000:02:00.0 2-edge enp2s0-TxRx-1 |126: 0 5 IR-PCI-MSIX-0000:02:00.0 3-edge enp2s0-TxRx-2 |127: 0 0 IR-PCI-MSIX-0000:02:00.0 4-edge enp2s0-TxRx-3 Reviewed-by: Joe Damato <jdamato@fastly.com> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-04-28net: ti: icssg-prueth: Add ICSSG FW StatsMD Danish Anwar
The ICSSG firmware maintains set of stats called PA_STATS. Currently the driver only dumps 4 stats. Add support for dumping more stats. The offset for different stats are defined as MACROs in icssg_switch_map.h file. All the offsets are for Slice0. Slice1 offsets are slice0 + 4. The offset calculation is taken care while reading the stats in emac_update_hardware_stats(). The statistics are documented in Documentation/networking/device_drivers/icssg_prueth.rst Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: MD Danish Anwar <danishanwar@ti.com> Link: https://patch.msgid.link/20250424095316.2643573-1-danishanwar@ti.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28rtase: Modify the format specifier in snprintf to %uJustin Lai
Modify the format specifier in snprintf to %u. Signed-off-by: Justin Lai <justinlai0215@realtek.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250425064057.30035-1-justinlai0215@realtek.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: thunder_bgx: Don't disable PCI device manuallyPhilipp Stanner
thunder_bgx's PCI device is enabled with pcim_enable_device(), a managed devres function which ensures that the device gets enabled on driver detach automatically. Remove the calls to pci_disable_device(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250425085740.65304-10-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: thunder_bgx: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Furthermore, the PCI function being managed implies that it's not necessary to call pci_release_regions() manually. Remove the calls to pci_release_regions(). Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250425085740.65304-9-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: mdio: thunder: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Furthermore, the PCI function being managed implies that it's not necessary to call pci_release_regions() manually. Remove the calls to pci_release_regions(). Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250425085740.65304-8-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: ethernet: sis900: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Acked-by: Daniele Venzano <venza@brownhat.org> Link: https://patch.msgid.link/20250425085740.65304-7-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: ethernet: natsemi: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250425085740.65304-6-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: tulip: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250425085740.65304-5-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: octeontx2: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Furthermore, the PCI function being managed implies that it's not necessary to call pci_release_regions() manually. Remove the calls to pci_release_regions(). Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patch.msgid.link/20250425085740.65304-4-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: prestera: Use pure PCI devres APIPhilipp Stanner
The currently used function pci_request_regions() is one of the problematic "hybrid devres" PCI functions, which are sometimes managed through devres, and sometimes not (depending on whether pci_enable_device() or pcim_enable_device() has been called before). The PCI subsystem wants to remove this behavior and, therefore, needs to port all users to functions that don't have this problem. Furthermore, the PCI function being managed implies that it's not necessary to call pci_release_regions() manually. Remove the calls to pci_release_regions(). Replace pci_request_regions() with pcim_request_all_regions(). Signed-off-by: Philipp Stanner <phasta@kernel.org> Acked-by: Elad Nachman <enachman@marvell.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250425085740.65304-3-phasta@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28veth: apply qdisc backpressure on full ptr_ring to reduce TX dropsJesper Dangaard Brouer
In production, we're seeing TX drops on veth devices when the ptr_ring fills up. This can occur when NAPI mode is enabled, though it's relatively rare. However, with threaded NAPI - which we use in production - the drops become significantly more frequent. The underlying issue is that with threaded NAPI, the consumer often runs on a different CPU than the producer. This increases the likelihood of the ring filling up before the consumer gets scheduled, especially under load, leading to drops in veth_xmit() (ndo_start_xmit()). This patch introduces backpressure by returning NETDEV_TX_BUSY when the ring is full, signaling the qdisc layer to requeue the packet. The txq (netdev queue) is stopped in this condition and restarted once veth_poll() drains entries from the ring, ensuring coordination between NAPI and qdisc. Backpressure is only enabled when a qdisc is attached. Without a qdisc, the driver retains its original behavior - dropping packets immediately when the ring is full. This avoids unexpected behavior changes in setups without a configured qdisc. With a qdisc in place (e.g. fq, sfq) this allows Active Queue Management (AQM) to fairly schedule packets across flows and reduce collateral damage from elephant flows. A known limitation of this approach is that the full ring sits in front of the qdisc layer, effectively forming a FIFO buffer that introduces base latency. While AQM still improves fairness and mitigates flow dominance, the latency impact is measurable. In hardware drivers, this issue is typically addressed using BQL (Byte Queue Limits), which tracks in-flight bytes needed based on physical link rate. However, for virtual drivers like veth, there is no fixed bandwidth constraint - the bottleneck is CPU availability and the scheduler's ability to run the NAPI thread. It is unclear how effective BQL would be in this context. This patch serves as a first step toward addressing TX drops. Future work may explore adapting a BQL-like mechanism to better suit virtual devices like veth. Reported-by: Yan Zhai <yan@cloudflare.com> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Toshiaki Makita <toshiaki.makita1@gmail.com> Link: https://patch.msgid.link/174559294022.827981.1282809941662942189.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: sched: generalize check for no-queue qdisc on TX queueJesper Dangaard Brouer
The "noqueue" qdisc can either be directly attached, or get default attached if net_device priv_flags has IFF_NO_QUEUE. In both cases, the allocated Qdisc structure gets it's enqueue function pointer reset to NULL by noqueue_init() via noqueue_qdisc_ops. This is a common case for software virtual net_devices. For these devices with no-queue, the transmission path in __dev_queue_xmit() will bypass the qdisc layer. Directly invoking device drivers ndo_start_xmit (via dev_hard_start_xmit). In this mode the device driver is not allowed to ask for packets to be queued (either via returning NETDEV_TX_BUSY or stopping the TXQ). The simplest and most reliable way to identify this no-queue case is by checking if enqueue == NULL. The vrf driver currently open-codes this check (!qdisc->enqueue). While functionally correct, this low-level detail is better encapsulated in a dedicated helper for clarity and long-term maintainability. To make this behavior more explicit and reusable, this patch introduce a new helper: qdisc_txq_has_no_queue(). Helper will also be used by the veth driver in the next patch, which introduces optional qdisc-based backpressure. This is a non-functional change. Reviewed-by: David Ahern <dsahern@kernel.org> Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com> Signed-off-by: Jesper Dangaard Brouer <hawk@kernel.org> Link: https://patch.msgid.link/174559293172.827981.7583862632045264175.stgit@firesoul Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28mdio: fix CONFIG_MDIO_DEVRES selectsArnd Bergmann
The newly added rtl9300 driver needs MDIO_DEVRES: x86_64-linux-ld: drivers/net/mdio/mdio-realtek-rtl9300.o: in function `rtl9300_mdiobus_probe': mdio-realtek-rtl9300.c:(.text+0x941): undefined reference to `devm_mdiobus_alloc_size' x86_64-linux-ld: mdio-realtek-rtl9300.c:(.text+0x9e2): undefined reference to `__devm_mdiobus_register' Since this is a hidden symbol, it needs to be selected by each user, rather than the usual 'depends on'. I see that there are a few other drivers that accidentally use 'depends on', so fix these as well for consistency and to avoid dependency loops. Fixes: 37f9b2a6c086 ("net: ethernet: Add missing depends on MDIO_DEVRES") Fixes: 24e31e474769 ("net: mdio: Add RTL9300 MDIO driver") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz> Link: https://patch.msgid.link/20250425112819.1645342-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: stmmac: dwmac-loongson: Add new GMAC's PCI device ID supportHuacai Chen
Add a new GMAC's PCI device ID (0x7a23) support which is used in Loongson-2K3000/Loongson-3B6000M. The new GMAC device use external PHY, so it reuses loongson_gmac_data() as the old GMAC device (0x7a03), and the new GMAC device still doesn't support flow control now. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Yanteng Si <si.yanteng@linux.dev> Tested-by: Henry Chen <chenx97@aosc.io> Tested-by: Biao Dong <dongbiao@loongson.cn> Signed-off-by: Baoqi Zhang <zhangbaoqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Link: https://patch.msgid.link/20250424072209.3134762-4-chenhuacai@loongson.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: stmmac: dwmac-loongson: Add new multi-chan IP core supportHuacai Chen
Add a new multi-chan IP core (0x12) support which is used in Loongson- 2K3000/Loongson-3B6000M. Compared with the 0x10 core, the new 0x12 core reduces channel numbers from 8 to 4, but checksum is supported for all channels. Add a "multichan" flag to loongson_data, so that we can simply use a "if (ld->multichan)" condition rather than the complicated condition "if (ld->loongson_id == DWMAC_CORE_MULTICHAN_V1 || ld->loongson_id == DWMAC_CORE_MULTICHAN_V2)". Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Henry Chen <chenx97@aosc.io> Tested-by: Biao Dong <dongbiao@loongson.cn> Signed-off-by: Baoqi Zhang <zhangbaoqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Reviewed-by: Yanteng Si <si.yanteng@linux.dev> Link: https://patch.msgid.link/20250424072209.3134762-3-chenhuacai@loongson.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: stmmac: socfpga: Remove unused pcs-mdiodev fieldMaxime Chevallier
When dwmac-socfpga was converted to using the Lynx PCS (previously referred to in the driver as the Altera TSE PCS), the lynx_pcs_create_mdiodev() was used to create the pcs instance. As this function didn't exist in the early versions of the series, a local mdiodev object was stored for PCS creation. It was never used, but still made it into the driver, so remove it. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250424071223.221239-4-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: stmmac: socfpga: Don't check for phy to enable the SGMII adapterMaxime Chevallier
The SGMII adapter needs to be enabled for both Cisco SGMII and 1000BaseX operations. It doesn't make sense to check for an attached phydev here, as we simply might not have any, in particular if we're using the 1000BaseX interface mode. Make so that we only re-enable the SGMII adapter when it's present, and when we use a phy_mode that is handled by said adapter. Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250424071223.221239-3-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: stmmac: socfpga: Enable internal GMII when using 1000BaseXMaxime Chevallier
Dwmac Socfpga may be used with an instance of a Lynx / Altera TSE PCS, in which case it gains support for 1000BaseX. It appears that the PCS is wired to the MAC through an internal GMII bus. Make sure that we enable the GMII_MII mode for the internal MAC when using 1000BaseX. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250424071223.221239-2-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28net: stmmac: dwmac-loongson: Move queue number init to common functionHuacai Chen
Currently, the tx and rx queue number initialization is duplicated in loongson_gmac_data() and loongson_gnet_data(), so move it to the common function loongson_default_data(). This is a preparation for later patches. Reviewed-by: Yanteng Si <si.yanteng@linux.dev> Tested-by: Henry Chen <chenx97@aosc.io> Tested-by: Biao Dong <dongbiao@loongson.cn> Signed-off-by: Baoqi Zhang <zhangbaoqi@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-04-28bonding: assign random address if device address is same as bondHangbin Liu
This change addresses a MAC address conflict issue in failover scenarios, similar to the problem described in commit a951bc1e6ba5 ("bonding: correct the MAC address for 'follow' fail_over_mac policy"). In fail_over_mac=follow mode, the bonding driver expects the formerly active slave to swap MAC addresses with the newly active slave during failover. However, under certain conditions, two slaves may end up with the same MAC address, which breaks this policy: 1) ip link set eth0 master bond0 -> bond0 adopts eth0's MAC address (MAC0). 2) ip link set eth1 master bond0 -> eth1 is added as a backup with its own MAC (MAC1). 3) ip link set eth0 nomaster -> eth0 is released and restores its MAC (MAC0). -> eth1 becomes the active slave, and bond0 assigns MAC0 to eth1. 4) ip link set eth0 master bond0 -> eth0 is re-added to bond0, now both eth0 and eth1 have MAC0. This results in a MAC address conflict and violates the expected behavior of the failover policy. To fix this, we assign a random MAC address to any newly added slave if its current MAC address matches that of the bond. The original (permanent) MAC address is saved and will be restored when the device is released from the bond. This ensures that each slave has a unique MAC address during failover transitions, preserving the integrity of the fail_over_mac=follow policy. Fixes: 3915c1e8634a ("bonding: Add "follow" option to fail_over_mac") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-04-24netdevsim: Mark NAPI ID on skb in nsim_rcvJoe Damato
Previously, nsim_rcv was not marking the NAPI ID on the skb, leading to applications seeing a napi ID of 0 when using SO_INCOMING_NAPI_ID. To add to the userland confusion, netlink appears to correctly report the NAPI IDs for netdevsim queues but the resulting file descriptor from a call to accept() was reporting a NAPI ID of 0. Signed-off-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250424002746.16891-2-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24net: ethernet: mtk_eth_soc: convert cap_bit in mtk_eth_muxc struct to u64Bo-Cun Chen
With commit 51a4df60db5c2 ("net: ethernet: mtk_eth_soc: convert caps in mtk_soc_data struct to u64") the capabilities bitfield was converted to a 64-bit value, but a cap_bit in struct mtk_eth_muxc which is used to store a full bitfield (rather than the bit number, as the name would suggest) still holds only a 32-bit value. Change the type of cap_bit to u64 in order to avoid truncating the bitfield which results in path selection to not work with capabilities above the 32-bit limit. The values currently stored in the cap_bit field are MTK_ETH_MUX_GDM1_TO_GMAC1_ESW: BIT_ULL(18) | BIT_ULL(5) MTK_ETH_MUX_GMAC2_GMAC0_TO_GEPHY: BIT_ULL(19) | BIT_ULL(5) | BIT_ULL(6) MTK_ETH_MUX_U3_GMAC2_TO_QPHY: BIT_ULL(20) | BIT_ULL(5) | BIT_ULL(6) MTK_ETH_MUX_GMAC1_GMAC2_TO_SGMII_RGMII: BIT_ULL(20) | BIT_ULL(5) | BIT_ULL(7) MTK_ETH_MUX_GMAC12_TO_GEPHY_SGMII: BIT_ULL(21) | BIT_ULL(5) While all those values are currently still within 32-bit boundaries, the addition of new capabilities of MT7988 as well as future SoC's like MT7987 will exceed them. Also, the use of a 32-bit 'int' type to store the result of a BIT_ULL(...) is misleading. Signed-off-by: Bo-Cun Chen <bc-bocun.chen@mediatek.com> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/ded98b0d716c3203017a7a92151516ec2bf1abee.1745369249.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24net: phy: mdio-bcm-unimac: Add asp-v3.0Justin Chen
Add mdio compat string for asp-v3.0 ethernet driver. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250422233645.1931036-9-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24net: bcmasp: Add support for asp-v3.0Justin Chen
The asp-v3.0 is a major HW revision that reduced the number of channels and filters. The goal was to save cost by reducing the feature set. Changes for asp-v3.0 - Number of network filters were reduced. - Number of channels were reduced. - EDPKT stats were removed. - Fix a bug with csum offload. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250422233645.1931036-8-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24net: phy: mdio-bcm-unimac: Remove asp-v2.0Justin Chen
Remove asp-v2.0 which will no longer be supported. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250422233645.1931036-5-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24net: bcmasp: Remove support for asp-v2.0Justin Chen
The SoC that supported asp-v2.0 never saw the light of day. asp-v2.0 has quirks that makes the logic overly complicated. For example, asp-v2.0 is the only revision that has a different wake up IRQ hook up. Remove asp-v2.0 support to make supporting future HW revisions cleaner. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250422233645.1931036-4-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.15-rc4). This pull includes wireless and a fix to vxlan which isn't in Linus's tree just yet. The latter creates with a silent conflict / build breakage, so merging it now to avoid causing problems. drivers/net/vxlan/vxlan_vnifilter.c 094adad91310 ("vxlan: Use a single lock to protect the FDB table") 087a9eb9e597 ("vxlan: vnifilter: Fix unlocked deletion of default FDB entry") https://lore.kernel.org/20250423145131.513029-1-idosch@nvidia.com No "normal" conflicts, or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24vxlan: vnifilter: Fix unlocked deletion of default FDB entryIdo Schimmel
When a VNI is deleted from a VXLAN device in 'vnifilter' mode, the FDB entry associated with the default remote (assuming one was configured) is deleted without holding the hash lock. This is wrong and will result in a warning [1] being generated by the lockdep annotation that was added by commit ebe642067455 ("vxlan: Create wrappers for FDB lookup"). Reproducer: # ip link add vx0 up type vxlan dstport 4789 external vnifilter local 192.0.2.1 # bridge vni add vni 10010 remote 198.51.100.1 dev vx0 # bridge vni del vni 10010 dev vx0 Fix by acquiring the hash lock before the deletion and releasing it afterwards. Blame the original commit that introduced the issue rather than the one that exposed it. [1] WARNING: CPU: 3 PID: 392 at drivers/net/vxlan/vxlan_core.c:417 vxlan_find_mac+0x17f/0x1a0 [...] RIP: 0010:vxlan_find_mac+0x17f/0x1a0 [...] Call Trace: <TASK> __vxlan_fdb_delete+0xbe/0x560 vxlan_vni_delete_group+0x2ba/0x940 vxlan_vni_del.isra.0+0x15f/0x580 vxlan_process_vni_filter+0x38b/0x7b0 vxlan_vnifilter_process+0x3bb/0x510 rtnetlink_rcv_msg+0x2f7/0xb70 netlink_rcv_skb+0x131/0x360 netlink_unicast+0x426/0x710 netlink_sendmsg+0x75a/0xc20 __sock_sendmsg+0xc1/0x150 ____sys_sendmsg+0x5aa/0x7b0 ___sys_sendmsg+0xfc/0x180 __sys_sendmsg+0x121/0x1b0 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250423145131.513029-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24Merge tag 'wireless-2025-04-24' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Some more fixes, notably: * iwlwifi: various regression and iwlmld fixes * mac80211: fix TX frames in monitor mode * brcmfmac: error handling for firmware load * tag 'wireless-2025-04-24' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: iwlwifi: restore missing initialization of async_handlers_list wifi: brcm80211: fmac: Add error handling for brcmf_usb_dl_writeimage() wifi: plfxlc: Remove erroneous assert in plfxlc_mac_release wifi: iwlwifi: fix the check for the SCRATCH register upon resume wifi: iwlwifi: don't warn if the NIC is gone in resume wifi: iwlwifi: mld: fix BAID validity check wifi: iwlwifi: back off on continuous errors wifi: iwlwifi: mld: only create debugfs symlink if it does not exist wifi: iwlwifi: mld: inform trans on init failure wifi: iwlwifi: mld: properly handle async notification in op mode start Revert "wifi: iwlwifi: make no_160 more generic" Revert "wifi: iwlwifi: add support for BE213" wifi: mac80211: restore monitor for outgoing frames ==================== Link: https://patch.msgid.link/20250424120535.56499-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24Merge tag 'net-6.15-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "No fixes from any subtree. Current release - regressions: - net: fix the missing unlock for detached devices Previous releases - regressions: - sched: fix UAF vulnerability in HFSC qdisc - lwtunnel: disable BHs when required - mptcp: pm: defer freeing of MPTCP userspace path manager entries - tipc: fix NULL pointer dereference in tipc_mon_reinit_self() - eth: virtio-net: disable delayed refill when pausing rx Previous releases - always broken: - phylink: fix suspend/resume with WoL enabled and link down - eth: - mlx5: fix null-ptr-deref in mlx5_create_{inner_,}ttc_table() - xen-netfront: handle NULL returned by xdp_convert_buff_to_frame() - enetc: fix frame corruption on bpf_xdp_adjust_head/tail() and XDP_PASS - stmmac: fix dwmac1000 ptp timestamp status offset - pds_core: prevent possible adminq overflow/stuck condition Misc: - a bunch of MAINTAINERS updates" * tag 'net-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (32 commits) net: stmmac: fix multiplication overflow when reading timestamp net: stmmac: fix dwmac1000 ptp timestamp status offset net: dp83822: Fix OF_MDIO config check pds_core: make wait_context part of q_info pds_core: Remove unnecessary check in pds_client_adminq_cmd() pds_core: handle unsupported PDS_CORE_CMD_FW_CONTROL result pds_core: Prevent possible adminq overflow/stuck condition net: dsa: mt7530: sync driver-specific behavior of MT7531 variants selftests/tc-testing: Add test for HFSC queue emptying during peek operation net_sched: hfsc: Fix a potential UAF in hfsc_dequeue() too net_sched: hfsc: Fix a UAF vulnerability in class handling selftests: mptcp: diag: use mptcp_lib_get_info_value mptcp: pm: Defer freeing of MPTCP userspace path manager entries net: ethernet: mtk_eth_soc: net: revise NETSYSv3 hardware configuration tipc: fix NULL pointer dereference in tipc_mon_reinit_self() virtio-net: disable delayed refill when pausing rx net: phy: leds: fix memory leak net: phylink: mac_link_(up|down)() clarifications net: phylink: fix suspend/resume with WoL enabled and link down net: lwtunnel: disable BHs when required ...
2025-04-24Merge tag 'v6.15-p5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Revert acomp multibuffer tests which were buggy - Fix off-by-one regression in new scomp code - Lower quality setting on atmel-sha204a as it may not be random * tag 'v6.15-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: atmel-sha204a - Set hwrng quality to lowest possible crypto: scomp - Fix off-by-one bug when calculating last page Revert "crypto: testmgr - Add multibuffer acomp testing"
2025-04-24net: phy: marvell-88q2xxx: Enable temperature sensor for mv88q211xNiklas Söderlund
The temperature sensor enabled for mv88q222x devices also functions for mv88q211x based devices. Unify the two devices probe functions to enable the sensors for all devices supported by this driver. The same oddity as for mv88q222x devices exists, the PHY link must be up for a correct temperature reading to be reported. # cat /sys/class/hwmon/hwmon9/temp1_input -75000 # ifconfig end5 up # cat /sys/class/hwmon/hwmon9/temp1_input 59000 Worth noting is that while the temperature register offsets and layout are the same between mv88q211x and mv88q222x devices their names in the datasheets are different. This change keeps the mv88q222x names for the mv88q211x support. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Dimitri Fedrau <dima.fedrau@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250418145800.2420751-1-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24net: stmmac: fix multiplication overflow when reading timestampAlexis Lothoré
The current way of reading a timestamp snapshot in stmmac can lead to integer overflow, as the computation is done on 32 bits. The issue has been observed on a dwmac-socfpga platform returning chaotic timestamp values due to this overflow. The corresponding multiplication is done with a MUL instruction, which returns 32 bit values. Explicitly casting the value to 64 bits replaced the MUL with a UMLAL, which computes and returns the result on 64 bits, and so returns correctly the timestamps. Prevent this overflow by explicitly casting the intermediate value to u64 to make sure that the whole computation is made on u64. While at it, apply the same cast on the other dwmac variant (GMAC4) method for snapshot retrieval. Fixes: 477c3e1f6363 ("net: stmmac: Introduce dwmac1000 timestamping operations") Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250423-stmmac_ts-v2-2-e2cf2bbd61b1@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24net: stmmac: fix dwmac1000 ptp timestamp status offsetAlexis Lothore
When a PTP interrupt occurs, the driver accesses the wrong offset to learn about the number of available snapshots in the FIFO for dwmac1000: it should be accessing bits 29..25, while it is currently reading bits 19..16 (those are bits about the auxiliary triggers which have generated the timestamps). As a consequence, it does not compute correctly the number of available snapshots, and so possibly do not generate the corresponding clock events if the bogus value ends up being 0. Fix clock events generation by reading the correct bits in the timestamp register for dwmac1000. Fixes: 477c3e1f6363 ("net: stmmac: Introduce dwmac1000 timestamping operations") Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250423-stmmac_ts-v2-1-e2cf2bbd61b1@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24net: dp83822: Fix OF_MDIO config checkJohannes Schneider
When CONFIG_OF_MDIO is set to be a module the code block is not compiled. Use the IS_ENABLED macro that checks for both built in as well as module. Fixes: 5dc39fd5ef35 ("net: phy: DP83822: Add ability to advertise Fiber connection") Signed-off-by: Johannes Schneider <johannes.schneider@leica-geosystems.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250423044724.1284492-1-johannes.schneider@leica-geosystems.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-23pds_core: make wait_context part of q_infoShannon Nelson
Make the wait_context a full part of the q_info struct rather than a stack variable that goes away after pdsc_adminq_post() is done so that the context is still available after the wait loop has given up. There was a case where a slow development firmware caused the adminq request to time out, but then later the FW finally finished the request and sent the interrupt. The handler tried to complete_all() the completion context that had been created on the stack in pdsc_adminq_post() but no longer existed. This caused bad pointer usage, kernel crashes, and much wailing and gnashing of teeth. Fixes: 01ba61b55b20 ("pds_core: Add adminq processing and commands") Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250421174606.3892-5-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23pds_core: Remove unnecessary check in pds_client_adminq_cmd()Brett Creeley
When the pds_core driver was first created there were some race conditions around using the adminq, especially for client drivers. To reduce the possibility of a race condition there's a check against pf->state in pds_client_adminq_cmd(). This is problematic for a couple of reasons: 1. The PDSC_S_INITING_DRIVER bit is set during probe, but not cleared until after everything in probe is complete, which includes creating the auxiliary devices. For pds_fwctl this means it can't make any adminq commands until after pds_core's probe is complete even though the adminq is fully up by the time pds_fwctl's auxiliary device is created. 2. The race conditions around using the adminq have been fixed and this path is already protected against client drivers calling pds_client_adminq_cmd() if the adminq isn't ready, i.e. see pdsc_adminq_post() -> pdsc_adminq_inc_if_up(). Fix this by removing the pf->state check in pds_client_adminq_cmd() because invalid accesses to pds_core's adminq is already handled by pdsc_adminq_post()->pdsc_adminq_inc_if_up(). Fixes: 10659034c622 ("pds_core: add the aux client API") Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250421174606.3892-4-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23pds_core: handle unsupported PDS_CORE_CMD_FW_CONTROL resultBrett Creeley
If the FW doesn't support the PDS_CORE_CMD_FW_CONTROL command the driver might at the least print garbage and at the worst crash when the user runs the "devlink dev info" devlink command. This happens because the stack variable fw_list is not 0 initialized which results in fw_list.num_fw_slots being a garbage value from the stack. Then the driver tries to access fw_list.fw_names[i] with i >= ARRAY_SIZE and runs off the end of the array. Fix this by initializing the fw_list and by not failing completely if the devcmd fails because other useful information is printed via devlink dev info even if the devcmd fails. Fixes: 45d76f492938 ("pds_core: set up device and adminq") Signed-off-by: Brett Creeley <brett.creeley@amd.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250421174606.3892-3-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23pds_core: Prevent possible adminq overflow/stuck conditionBrett Creeley
The pds_core's adminq is protected by the adminq_lock, which prevents more than 1 command to be posted onto it at any one time. This makes it so the client drivers cannot simultaneously post adminq commands. However, the completions happen in a different context, which means multiple adminq commands can be posted sequentially and all waiting on completion. On the FW side, the backing adminq request queue is only 16 entries long and the retry mechanism and/or overflow/stuck prevention is lacking. This can cause the adminq to get stuck, so commands are no longer processed and completions are no longer sent by the FW. As an initial fix, prevent more than 16 outstanding adminq commands so there's no way to cause the adminq from getting stuck. This works because the backing adminq request queue will never have more than 16 pending adminq commands, so it will never overflow. This is done by reducing the adminq depth to 16. Fixes: 45d76f492938 ("pds_core: set up device and adminq") Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Brett Creeley <brett.creeley@amd.com> Signed-off-by: Shannon Nelson <shannon.nelson@amd.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250421174606.3892-2-shannon.nelson@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23net/mlx5: HWS, Disallow matcher IP version mixingVlad Dogaru
Signal clearly to the user, via an error, that mixing IPv4 and IPv6 rules in the same matcher is not supported. Previously such cases silently failed by adding a rule that did not work correctly. Rules can specify an IP version by one of two fields: IP version or ethertype. At matcher creation, store whether the template matches on any of these two fields. If yes, inspect each rule for its corresponding match value and store the IP version inside the matcher to guard against inconsistencies with subsequent rules. Furthermore, also check rules for internal consistency, i.e. verify that the ethertype and IP version match values do not contradict each other. The logic applies to inner and outer headers independently, to account for tunneling. Rules that do not match on IP addresses are not affected. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250422092540.182091-4-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23net/mlx5: HWS, Harden IP version definer checksVlad Dogaru
Replicate some sanity checks that firmware does, since hardware steering does not go through firmware. When creating a definer, disallow matching on IP addresses without also matching on IP version. The latter can be satisfied by matching either on the version field in the IP header, or on the ethertype field. Also refuse to match IPv4 IHL alongside IPv6. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250422092540.182091-3-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23net/mlx5: HWS, Fix IP version decisionVlad Dogaru
Unify the check for IP version when creating a definer. A given matcher is deemed to match on IPv6 if any of the higher order (>31) bits of source or destination address mask are set. A single packet cannot mix IP versions between source and destination addresses, so it makes no sense that they would be decided on independently. Signed-off-by: Vlad Dogaru <vdogaru@nvidia.com> Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com> Signed-off-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250422092540.182091-2-mbloch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23net: dsa: mt7530: sync driver-specific behavior of MT7531 variantsDaniel Golle
MT7531 standalone and MMIO variants found in MT7988 and EN7581 share most basic properties. Despite that, assisted_learning_on_cpu_port and mtu_enforcement_ingress were only applied for MT7531 but not for MT7988 or EN7581, causing the expected issues on MMIO devices. Apply both settings equally also for MT7988 and EN7581 by moving both assignments form mt7531_setup() to mt7531_setup_common(). This fixes unwanted flooding of packets due to unknown unicast during DA lookup, as well as issues with heterogenous MTU settings. Fixes: 7f54cc9772ce ("net: dsa: mt7530: split-off common parts from mt7531_setup") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Chester A. Unal <chester.a.unal@arinc9.com> Link: https://patch.msgid.link/89ed7ec6d4fa0395ac53ad2809742bb1ce61ed12.1745290867.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23octeontx2-pf: AF_XDP: code clean upHariprasad Kelam
The current API, otx2_xdp_sq_append_pkt, verifies the number of available descriptors before sending packets to the hardware. However, for AF_XDP, this check is unnecessary because the batch value is already determined based on the free descriptors. This patch introduces a new API, "otx2_xsk_sq_append_pkt" to address this. Remove the logic for releasing the TX buffers, as it is implicitly handled by xsk_tx_peek_release_desc_batch Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://patch.msgid.link/20250420032350.4047706-1-hkelam@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23Merge branch '1GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== igc: Add support for Frame Preemption Faizal Rahim says: Introduce support for the FPE feature in the IGC driver. The patches aligns with the upstream FPE API: https://patchwork.kernel.org/project/netdevbpf/cover/20230220122343.1156614-1-vladimir.oltean@nxp.com/ https://patchwork.kernel.org/project/netdevbpf/cover/20230119122705.73054-1-vladimir.oltean@nxp.com/ It builds upon earlier work: https://patchwork.kernel.org/project/netdevbpf/cover/20220520011538.1098888-1-vinicius.gomes@intel.com/ The patch series adds the following functionalities to the IGC driver: a) Configure FPE using `ethtool --set-mm`. b) Display FPE settings via `ethtool --show-mm`. c) View FPE statistics using `ethtool --include-statistics --show-mm'. e) Block setting preemptible tc in taprio since it is not supported yet. Existing code already blocks it in mqprio. Tested: Enabled CONFIG_PROVE_LOCKING, CONFIG_DEBUG_ATOMIC_SLEEP, CONFIG_DMA_API_DEBUG, and CONFIG_KASAN 1) selftests 2) netdev down/up cycles 3) suspend/resume cycles 4) fpe verification No bugs or unusual dmesg logs were observed. Ran 1), 2) and 3) with and without the patch series, compared dmesg and selftest logs - no differences found. * '1GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: igc: add support to get frame preemption statistics via ethtool igc: add support to get MAC Merge data via ethtool igc: block setting preemptible traffic class in taprio igc: add support to set tx-min-frag-size igc: add support for frame preemption verification igc: set the RX packet buffer size for TSN mode igc: use FIELD_PREP and GENMASK for existing RX packet buffer size igc: optimize TX packet buffer utilization for TSN mode igc: use FIELD_PREP and GENMASK for existing TX packet buffer size igc: rename I225_RXPBSIZE_DEFAULT and I225_TXPBSIZE_DEFAULT igc: rename xdp_get_tx_ring() for non-xdp usage net: ethtool: mm: reset verification status when link is down net: ethtool: mm: extract stmmac verification logic into common library net: stmmac: move frag_size handling out of spin_lock ==================== Link: https://patch.msgid.link/20250418163822.3519810-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-23net: airoha: Enable multiple IRQ lines support in airoha_eth driver.Lorenzo Bianconi
EN7581 ethernet SoC supports 4 programmable IRQ lines for Tx and Rx interrupts. Enable multiple IRQ lines support. Map Rx/Tx queues to the available IRQ lines using the default scheme used in the vendor SDK: - IRQ0: rx queues [0-4],[7-9],15 - IRQ1: rx queues [21-30] - IRQ2: rx queues 5 - IRQ3: rx queues 6 Tx queues interrupts are managed by IRQ0. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20250418-airoha-eth-multi-irq-v1-2-1ab0083ca3c1@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>