summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-02-07net: page_pool: avoid false positive warning if NAPI was never addedJakub Kicinski
We expect NAPI to be in disabled state when page pool is torn down. But it is also legal if the NAPI is completely uninitialized. Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250206225638.1387810-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: devmem: don't call queue stop / start when the interface is downJakub Kicinski
We seem to be missing a netif_running() check from the devmem installation path. Starting a queue on a stopped device makes no sense. We still want to be able to allocate the memory, just to test that the device is indeed setting up the page pools in a memory provider compatible way. This is not a bug fix, because existing drivers check if the interface is down as part of the ops. But new drivers shouldn't have to do this, as long as they can correctly alloc/free while down. Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250206225638.1387810-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: refactor netdev_rx_queue_restart() to use local qopsJakub Kicinski
Shorten the lines by storing dev->queue_mgmt_ops in a temp variable. Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250206225638.1387810-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: gianfar: simplify init_phy()Heiner Kallweit
Use phy_set_max_speed() to simplify init_phy(). Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/b863dcf7-31e8-45a1-a284-7075da958ff0@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07Merge branch 'add-usb-support-for-telit-cinterion-fn990b'Jakub Kicinski
Fabio Porcedda says: ==================== Add usb support for Telit Cinterion FN990B Add usb support for Telit Cinterion FN990B. Also fix Telit Cinterion FN990A name. Connection with ModemManager was tested also AT ports. ==================== Link: https://patch.msgid.link/20250205171649.618162-1-fabio.porcedda@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: usb: cdc_mbim: fix Telit Cinterion FN990A nameFabio Porcedda
The correct name for FN990 is FN990A so use it in order to avoid confusion with FN990B. Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com> Link: https://patch.msgid.link/20250205171649.618162-6-fabio.porcedda@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: usb: qmi_wwan: fix Telit Cinterion FN990A nameFabio Porcedda
The correct name for FN990 is FN990A so use it in order to avoid confusion with FN990B. Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com> Link: https://patch.msgid.link/20250205171649.618162-5-fabio.porcedda@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: usb: qmi_wwan: add Telit Cinterion FN990B compositionFabio Porcedda
Add the following Telit Cinterion FN990B composition: 0x10d0: rmnet + tty (AT/NMEA) + tty (AT) + tty (AT) + tty (AT) + tty (diag) + DPL + QDSS (Qualcomm Debug SubSystem) + adb T: Bus=01 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 17 Spd=480 MxCh= 0 D: Ver= 2.10 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1 P: Vendor=1bc7 ProdID=10d0 Rev=05.15 S: Manufacturer=Telit Cinterion S: Product=FN990 S: SerialNumber=43b38f19 C: #Ifs= 9 Cfg#= 1 Atr=e0 MxPwr=500mA I: If#= 0 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=50 Driver=qmi_wwan E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=82(I) Atr=03(Int.) MxPS= 8 Ivl=32ms I: If#= 1 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=60 Driver=option E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=83(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=84(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=85(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=86(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=04(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=87(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=88(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 4 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=40 Driver=option E: Ad=05(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=89(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8a(I) Atr=03(Int.) MxPS= 10 Ivl=32ms I: If#= 5 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=30 Driver=option E: Ad=06(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8b(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 6 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=80 Driver=(none) E: Ad=8c(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 7 Alt= 0 #EPs= 1 Cls=ff(vend.) Sub=ff Prot=70 Driver=(none) E: Ad=8d(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms I: If#= 8 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=42 Prot=01 Driver=usbfs E: Ad=07(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms Cc: stable@vger.kernel.org Signed-off-by: Fabio Porcedda <fabio.porcedda@gmail.com> Link: https://patch.msgid.link/20250205171649.618162-3-fabio.porcedda@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: renesas: rswitch: Convert to for_each_available_child_of_node()Geert Uytterhoeven
Simplify rswitch_get_port_node() by using the for_each_available_child_of_node() helper instead of manually ignoring unavailable child nodes, and leaking a reference. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/54f544d573a64b96e01fd00d3481b10806f4d110.1738771798.git.geert+renesas@glider.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07Merge branch 'net-stmmac-yet-more-eee-updates'Jakub Kicinski
Russell King says: ==================== net: stmmac: yet more EEE updates Continuing on with the STMMAC EEE cleanups from last cycle, this series further cleans up the EEE code, and fixes a problem with the existing implementation - disabling EEE doesn't immediately disable LPI signalling until the next packet is transmitted. It likely also fixes a potential race condition when trying to disable LPI vs the software timer. ==================== Link: https://patch.msgid.link/Z6NqGnM2yL7Ayo-T@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: remove old EEE methodsRussell King (Oracle)
As we no longer call the set_eee_mode(), reset_eee_mode() and set_eee_lpi_entry_timer() methods, remove these and their glue in common.h Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffe7-003ZIm-Qv@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: use stmmac_set_lpi_mode()Russell King (Oracle)
Use the new stmmac_set_lpi_mode() API to configure the parameters of the desired LPI mode rather than the older methods. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffe2-003ZIg-Mx@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: dwmac4: clear LPI_CTRL_STATUS_LPITCSE tooRussell King (Oracle)
Ensure that LPI_CTRL_STATUS_LPITCSE is also appropriately cleared when disabling LPI or enabling LPI without TX clock gating. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffdx-003ZIZ-JQ@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: add new MAC method set_lpi_mode()Russell King (Oracle)
Add a new method to control LPI mode configuration. This is architected to have three configuration states: LPI disabled, LPI forced (active), or LPI under hardware timer control. This reflects the three modes which the main body of the driver wishes to deal with. We pass in whether transmit clock gating should be used, and the hardware timer value in microseconds to be set when using hardware timer control. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffds-003ZIT-E8@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: use common LPI_CTRL_STATUS bit definitionsRussell King (Oracle)
The bit definitions for the LPI control/status register are identical across all MAC versions, with the exception that some bits may not be implemented. Provide definitions for bits in this register in common.h, convert to use them, and remove the core- specific definitions. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffdn-003ZIN-9p@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: remove unnecessary LPI disable when enabling LPIRussell King (Oracle)
Remove the unnecessary LPI disable when enabling LPI - as noted in previous commits, there will never be two consecutive calls to stmmac_mac_enable_tx_lpi() without an intervening stmmac_mac_disable_tx_lpi. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffdi-003ZIH-5h@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: clear priv->tx_path_in_lpi_mode when disabling LPIRussell King (Oracle)
As other code paths do, clear priv->tx_path_in_lpi_mode when disabling LPI. This is done after the software timer has been deleted and hardware LPI has been disabled. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffdd-003ZIB-22@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: remove unnecessary priv->eee_enabled testsRussell King (Oracle)
Phylink will not call the mac_disable_tx_lpi() and mac_enable_tx_lpi() methods randomly - the first method to be called will be the enable method, and then after, the disable method will be called once between subsequent enable calls. Thus there is a guaranteed ordering. Therefore, we know the previous state of priv->eee_enabled, and can remove it from both methods. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffdX-003ZI5-UV@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: remove unnecessary priv->eee_active testsRussell King (Oracle)
Since priv->eee_active is assigned with a constant value in each of these methods, there is no need to test its value later. Remove these unnecessary tests. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffdS-003ZHz-Qi@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: remove priv->dma_cap.eee test in tx_lpi methodsRussell King (Oracle)
The tests for priv->dma_cap.eee in stmmac_mac_{en,dis}able_tx_lpi() is useless as these methods will only be called when using phylink managed EEE, and that will only be enabled if the LPI capabilities in phylink_config have been populated during initialisation. This only occurs when priv->dma_cap.eee was true. As priv->dma_cap.eee remains constant during the lifetime of the driver instance, there is no need to re-check it in these methods. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffdN-003ZHt-Mq@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: split stmmac_init_eee() and move to phylink methodsRussell King (Oracle)
Move the appropriate parts of stmmac_init_eee() into the phylink mac_enable_tx_lpi() and mac_disable_tx_lpi() methods. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffdI-003ZHn-Iz@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: dwmac4: ensure LPIATE is clearedRussell King (Oracle)
LPIATE enables the hardware timer for entering LPI mode. To sure that the correct mode is used, clear LPIATE when using manual/software-timed mode to prevent the hardware using the timer. stmmac_main.c avoids this being a problem at the moment by calling stmmac_set_eee_lpi_timer(..., 0) before switching to software mode. We no longer need to call stmmac_set_eee_lpi_timer(..., 0) when disabling EEE as stmmac_reset_eee_mode() will now clear all LPI settings. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffdD-003ZHh-Ew@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: ensure LPI is disabled when disabling EEERussell King (Oracle)
When EEE is disabled, we call stmmac_set_eee_lpi_timer(..., 0). For dwmac4, this will result in LPIATE being cleared, but LPIEN and LPITXA being set, causing LPI mode to be signalled (if it wasn't before). For others MACs, stmmac_set_eee_lpi_timer() does nothing, which means that LPI mode will continue to be signalled despite the expectation for it to be disabled. In both cases, LPI mode will be terminated when the transmitter has a packet to send, and LPIEN will be cleared by hardware. Call stmmac_reset_eee_mode() to ensure that LPI mode is disabled when EEE mode is requested to be disabled. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffd8-003ZHb-AX@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07net: stmmac: delete software timer before disabling LPIRussell King (Oracle)
Delete the software timer to ensure that the timer doesn't fire while we are modifying the LPI register state, potentially re-enabling LPI. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1tffd3-003ZHV-6C@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07tcp: rename inet_csk_{delete|reset}_keepalive_timer()Eric Dumazet
inet_csk_delete_keepalive_timer() and inet_csk_reset_keepalive_timer() are only used from core TCP, there is no need to export them. Replace their prefix by tcp. Move them to net/ipv4/tcp_timer.c and make tcp_delete_keepalive_timer() static. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250206094605.2694118-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07tcp: do not export tcp_parse_mss_option() and tcp_mtup_init()Eric Dumazet
These two functions are not called from modules. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250206093436.2609008-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07vxlan: Remove unnecessary comments for vxlan_rcv() and vxlan_err_lookup()Ted Chen
Remove the two unnecessary comments around vxlan_rcv() and vxlan_err_lookup(), which indicate that the callers are from net/ipv{4,6}/udp.c. These callers are trivial to find. Additionally, the comment for vxlan_rcv() missed that the caller could also be from net/ipv6/udp.c. Suggested-by: Nikolay Aleksandrov <razor@blackwall.org> Suggested-by: Ido Schimmel <idosch@idosch.org> Signed-off-by: Ted Chen <znscnchen@gmail.com> Acked-by: Nikolay Aleksandrov <razor@blackwall.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250206140002.116178-1-znscnchen@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-07Merge branch 'of_get_available_child_by_name'David S. Miller
Biju Das says: ==================== Add of_get_available_child_by_name() There are lot of net drivers using of_get_child_by_name() followed by of_device_is_available() to find the available child node by name for a given parent. Provide a helper for these users to simplify the code. v1->v2: * Make it as a series as per [1] to cover the dependency. * Added Rb tag from Rob for patch#1 and this patch can be merged through net as it is the main user. * Updated all the patches with patch suffix net-next * Dropped _free() usage. [1] https://lore.kernel.org/all/CAL_JsqLo4uSGYMcLXN=0iSUMHdW8RaGCY+o8ThQHq3_eUTV9wQ@mail.gmail.com/ ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-07net: ibm: emac: Use of_get_available_child_by_name()Biju Das
Use the helper of_get_available_child_by_name() to simplify emac_dt_mdio_probe(). Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-07net: ethernet: actions: Use of_get_available_child_by_name()Biju Das
Use the helper of_get_available_child_by_name() to simplify owl_emac_mdio_init(). Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-07net: ethernet: mtk_eth_soc: Use of_get_available_child_by_name()Biju Das
Use the helper of_get_available_child_by_name() to simplify mtk_mdio_init(). Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-07net: ethernet: mtk-star-emac: Use of_get_available_child_by_name()Biju Das
Use the helper of_get_available_child_by_name() to simplify mtk_star_mdio_init(). Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-07net: dsa: sja1105: Use of_get_available_child_by_name()Biju Das
Use the helper of_get_available_child_by_name() to simplify sja1105_mdiobus_register(). Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-07net: dsa: rzn1_a5psw: Use of_get_available_child_by_name()Biju Das
Simplify a5psw_probe() by using of_get_available_child_by_name(). While at it, move of_node_put(mdio) inside the if block to avoid code duplication. Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-07of: base: Add of_get_available_child_by_name()Biju Das
There are lot of drivers using of_get_child_by_name() followed by of_device_is_available() to find the available child node by name for a given parent. Provide a helper for these users to simplify the code. Suggested-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-06Merge branch 'for-next' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: managing MSI-X in driver Michal Swiatkowski says: It is another try to allow user to manage amount of MSI-X used for each feature in ice. First was via devlink resources API, it wasn't accepted in upstream. Also static MSI-X allocation using devlink resources isn't really user friendly. This try is using more dynamic way. "Dynamic" across whole kernel when platform supports it and "dynamic" across the driver when not. To achieve that reuse global devlink parameter pf_msix_max and pf_msix_min. It fits how ice hardware counts MSI-X. In case of ice amount of MSI-X reported on PCI is a whole MSI-X for the card (with MSI-X for VFs also). Having pf_msix_max allow user to statically set how many MSI-X he wants on PF and how many should be reserved for VFs. pf_msix_min is used to set minimum number of MSI-X with which ice driver should probe correctly. Meaning of this field in case of dynamic vs static allocation: - on system with dynamic MSI-X allocation support * alloc pf_msix_min as static, rest will be allocated dynamically - on system without dynamic MSI-X allocation support * try alloc pf_msix_max as static, minimum acceptable result is pf_msix_min As Jesse and Piotr suggested pf_msix_max and pf_msix_min can (an probably should) be stored in NVM. This patchset isn't implementing that. Dynamic (kernel or driver) way means that splitting MSI-X across the RDMA and eth in case there is a MSI-X shortage isn't correct. Can work when dynamic is only on driver site, but can't when dynamic is on kernel site. Let's remove this code and move to MSI-X allocation feature by feature. If there is no more MSI-X for a feature, a feature is working with less MSI-X or it is turned off. There is a regression here. With MSI-X splitting user can run RDMA and eth even on system with not enough MSI-X. Now only eth will work. RDMA can be turned on by changing number of PF queues (lowering) and reprobe RDMA driver. Example: 72 CPU number, eth, RDMA and flow director (1 MSI-X), 1 MSI-X for OICR on PF, and 1 more for RDMA. Card is using 1 + 72 + 1 + 72 + 1 = 147. We set pf_msix_min = 2, pf_msix_max = 128 OICR: 1 eth: 72 flow director: 1 RDMA: 128 - 74 = 54 We can change number of queues on pf to 36 and do devlink reinit OICR: 1 eth: 36 RDMA: 73 flow director: 1 We can also (implemented in "ice: enable_rdma devlink param") turned RDMA off. OICR: 1 eth: 72 RDMA: 0 (turned off) flow director: 1 After this changes we have a static base vector for SRIOV (SIOV probably in the feature). Last patch from this series is simplifying managing VF MSI-X code based on static vector. Now changing queues using ethtool is also changing MSI-X. If there is enough MSI-X it is always one to one. When there is not enough there will be more queues than MSI-X. * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: init flow director before RDMA ice: simplify VF MSI-X managing ice: enable_rdma devlink param ice: treat dyn_allowed only as suggestion ice, irdma: move interrupts code to irdma ice: get rid of num_lan_msix field ice: remove splitting MSI-X between features ice: devlink PF MSI-X max and min parameter ice: count combined queues using Rx/Tx count ==================== Link: https://patch.msgid.link/20250205185512.895887-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06net: pcs: rzn1-miic: Convert to for_each_available_child_of_node() helperGeert Uytterhoeven
Simplify miic_parse_dt() by using the for_each_available_child_of_node() helper instead of manually skipping unavailable child nodes. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/3e394d4cf8204bcf17b184bfda474085aa8ed0dd.1738771631.git.geert+renesas@glider.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06net: pcs: rzn1-miic: fill in PCS supported_interfacesRussell King (Oracle)
Populate the PCS supported_interfaces bitmap with the interfaces that this PCS supports. This makes the manual checking in miic_validate() redundant, so remove that. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1tfhYq-003aTm-Nx@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06Merge branch 'enic-use-page-pool-api-for-receiving-packets'Jakub Kicinski
John Daley says: ==================== enic: Use Page Pool API for receiving packets Use the Page Pool API for RX. The Page Pool API improves bandwidth and CPU overhead by recycling pages instead of allocating new buffers in the driver. Also, page pool fragment allocation for smaller MTUs is used allow multiple packets to share pages. RX code was moved to its own file and some refactoring was done beforehand to make the page pool changes more trasparent and to simplify the resulting code. ==================== Link: https://patch.msgid.link/20250205235416.25410-1-johndale@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06enic: remove copybreak tunableJohn Daley
With the move to using the Page Pool API for RX, rx copybreak was not showing any improvement in host CPU overhead, latency or bandwidth so the driver no longer makes use of the rx_copybreak setting. This patch removes the ethtool tuneable hooks to set and get the rx copybreak since they and now no-ops. Rx copybreak was the only tunable supported, so remove the set and get tunable callbacks all together. Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Link: https://patch.msgid.link/20250205235416.25410-5-johndale@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06enic: Use the Page Pool API for RXJohn Daley
The Page Pool API improves bandwidth and CPU overhead by recycling pages instead of allocating new buffers in the driver. Make use of page pool fragment allocation for smaller MTUs so that multiple packets can share a page. For MTUs larger than PAGE_SIZE, adjust the 'order' page parameter so that contiguous pages can be used to receive the larger packets. The RQ descriptor field 'os_buf' is repurposed to hold page pointers allocated from page_pool instead of SKBs. When packets arrive, SKBs are allocated and the page pointers are attached instead of preallocating SKBs. 'alloc_fail' netdev statistic is incremented when page_pool_dev_alloc() fails. Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Link: https://patch.msgid.link/20250205235416.25410-4-johndale@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06enic: Simplify RX handler functionJohn Daley
Split up RX handler functions in preparation for moving to a page pool based implementation. No functional changes. Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Link: https://patch.msgid.link/20250205235416.25410-3-johndale@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06enic: Move RX functions to their own fileJohn Daley
Move RX handler code into its own file in preparation for further changes. Some formatting changes were necessary in order to satisfy checkpatch but there were no functional changes. Co-developed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: Nelson Escobar <neescoba@cisco.com> Co-developed-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: Satish Kharat <satishkh@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Link: https://patch.msgid.link/20250205235416.25410-2-johndale@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06netdev-genl: Elide napi_id when not presentJoe Damato
There are at least two cases where napi_id may not present and the napi_id should be elided: 1. Queues could be created, but napi_enable may not have been called yet. In this case, there may be a NAPI but it may not have an ID and output of a napi_id should be elided. 2. TX-only NAPIs currently do not have NAPI IDs. If a TX queue happens to be linked with a TX-only NAPI, elide the NAPI ID from the netlink output as a NAPI ID of 0 is not useful for users. Signed-off-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250205193751.297211-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06Merge branch 'io_uring-zero-copy-rx'Jakub Kicinski
David Wei says: ==================== io_uring zero copy rx This patchset contains net/ patches needed by a new io_uring request implementing zero copy rx into userspace pages, eliminating a kernel to user copy. We configure a page pool that a driver uses to fill a hw rx queue to hand out user pages instead of kernel pages. Any data that ends up hitting this hw rx queue will thus be dma'd into userspace memory directly, without needing to be bounced through kernel memory. 'Reading' data out of a socket instead becomes a _notification_ mechanism, where the kernel tells userspace where the data is. The overall approach is similar to the devmem TCP proposal. This relies on hw header/data split, flow steering and RSS to ensure packet headers remain in kernel memory and only desired flows hit a hw rx queue configured for zero copy. Configuring this is outside of the scope of this patchset. We share netdev core infra with devmem TCP. The main difference is that io_uring is used for the uAPI and the lifetime of all objects are bound to an io_uring instance. Data is 'read' using a new io_uring request type. When done, data is returned via a new shared refill queue. A zero copy page pool refills a hw rx queue from this refill queue directly. Of course, the lifetime of these data buffers are managed by io_uring rather than the networking stack, with different refcounting rules. This patchset is the first step adding basic zero copy support. We will extend this iteratively with new features e.g. dynamically allocated zero copy areas, THP support, dmabuf support, improved copy fallback, general optimisations and more. In terms of netdev support, we're first targeting Broadcom bnxt. Patches aren't included since Taehee Yoo has already sent a more comprehensive patchset adding support in [1]. Google gve should already support this, and Mellanox mlx5 support is WIP pending driver changes. =========== Performance =========== Note: Comparison with epoll + TCP_ZEROCOPY_RECEIVE isn't done yet. Test setup: * AMD EPYC 9454 * Broadcom BCM957508 200G * Kernel v6.11 base [2] * liburing fork [3] * kperf fork [4] * 4K MTU * Single TCP flow With application thread + net rx softirq pinned to _different_ cores: +-------------------------------+ | epoll | io_uring | |-----------|-------------------| | 82.2 Gbps | 116.2 Gbps (+41%) | +-------------------------------+ Pinned to _same_ core: +-------------------------------+ | epoll | io_uring | |-----------|-------------------| | 62.6 Gbps | 80.9 Gbps (+29%) | +-------------------------------+ ===== Links ===== Broadcom bnxt support: [1]: https://lore.kernel.org/20241003160620.1521626-8-ap420073@gmail.com Linux kernel branch including io_uring bits: [2]: https://github.com/isilence/linux.git zcrx/v13 liburing for testing: [3]: https://github.com/isilence/liburing.git zcrx/next kperf for testing: [4]: https://git.kernel.dk/kperf.git ==================== Link: https://patch.msgid.link/20250204215622.695511-1-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06net: add helpers for setting a memory provider on an rx queueDavid Wei
Add helpers that properly prep or remove a memory provider for an rx queue then restart the queue. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250204215622.695511-11-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06net: page_pool: add memory provider helpersPavel Begunkov
Add helpers for memory providers to interact with page pools. net_mp_niov_{set,clear}_page_pool() serve to [dis]associate a net_iov with a page pool. If used, the memory provider is responsible to match "set" calls with "clear" once a net_iov is not going to be used by a page pool anymore, changing a page pool, etc. Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250204215622.695511-10-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06net: prepare for non devmem TCP memory providersPavel Begunkov
There is a good bunch of places in generic paths assuming that the only page pool memory provider is devmem TCP. As we want to reuse the net_iov and provider infrastructure, we need to patch it up and explicitly check the provider type when we branch into devmem TCP code. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250204215622.695511-9-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06net: page_pool: add a mp hook to unregister_netdevice*Pavel Begunkov
Devmem TCP needs a hook in unregister_netdevice_many_notify() to upkeep the set tracking queues it's bound to, i.e. ->bound_rxqs. Instead of devmem sticking directly out of the genetic path, add a mp function. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250204215622.695511-8-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-06net: page_pool: add callback for mp info printingPavel Begunkov
Add a mandatory callback that prints information about the memory provider to netlink. Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Pavel Begunkov <asml.silence@gmail.com> Signed-off-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250204215622.695511-7-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>