linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
9 days	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.16-rc8). Conflicts: drivers/net/ethernet/microsoft/mana/gdma_main.c 9669ddda18fb ("net: mana: Fix warnings for missing export.h header inclusion") 755391121038 ("net: mana: Allocate MSI-X vectors dynamically") https://lore.kernel.org/20250711130752.23023d98@canb.auug.org.au Adjacent changes: drivers/net/ethernet/ti/icssg/icssg_prueth.h 6e86fb73de0f ("net: ti: icssg-prueth: Fix buffer allocation for ICSSG") ffe8a4909176 ("net: ti: icssg-prueth: Read firmware-names from device tree") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days	gve: implement DQO RX datapath and control path for AF_XDP zero-copy	Joshua Washington
	Add the RX datapath for AF_XDP zero-copy for DQ RDA. The RX path is quite similar to that of the normal XDP case. Parallel methods are introduced to properly handle XSKs instead of normal driver buffers. To properly support posting from XSKs, queues are destroyed and recreated, as the driver was initially making use of page pool buffers instead of the XSK pool memory. Expose support for AF_XDP zero-copy, as the TX and RX datapaths both exist. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://patch.msgid.link/20250717152839.973004-6-jeroendb@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 days	gve: implement DQO TX datapath for AF_XDP zero-copy	Joshua Washington
	In the descriptor clean path, a number of changes need to be made to accommodate out of order completions and double completions. The XSK stack can only handle completions being processed in order, as a single counter is incremented in xsk_tx_completed to sigify how many XSK descriptors have been completed. Because completions can come back out of order in DQ, a separate queue of XSK descriptors must be maintained. This queue keeps the pending packets in the order that they were written so that the descriptors can be counted in xsk_tx_completed in the same order. For double completions, a new pending packet state and type are introduced. The new type, GVE_TX_PENDING_PACKET_DQO_XSK, plays an anlogous role to pre-existing _SKB and _XDP_FRAME pending packet types for XSK descriptors. The new state, GVE_PACKET_STATE_XSK_COMPLETE, represents packets for which no more completions are expected. This includes packets which have received a packet completion or reinjection completion, as well as packets whose reinjection completion timer have timed out. At this point, such packets can be counted as part of xsk_tx_completed() and freed. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://patch.msgid.link/20250717152839.973004-5-jeroendb@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 days	gve: keep registry of zc xsk pools in netdev_priv	Joshua Washington
	Relying on xsk_get_pool_from_qid for getting whether zero copy is enabled on a queue is erroneous, as an XSK pool is registered in xp_assign_dev whether AF_XDP zero-copy is enabled or not. This becomes problematic when queues are restarted in copy mode, as all RX queues with XSKs will register a pool, causing the driver to exercise the zero-copy codepath. This patch adds a bitmap to keep track of which queues have zero-copy enabled. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://patch.msgid.link/20250717152839.973004-4-jeroendb@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 days	gve: merge xdp and xsk registration	Joshua Washington
	The existence of both of these xdp_rxq and xsk_rxq is redundant. xdp_rxq can be used in both the zero-copy mode and the copy mode case. XSK pool memory model registration is prioritized over normal memory model registration to ensure that memory model registration happens only once per queue. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://patch.msgid.link/20250717152839.973004-3-jeroendb@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
11 days	gve: deduplicate xdp info and xsk pool registration logic	Joshua Washington
	The XDP registration path currently has a lot of reused logic, leading changes to the codepaths to be unnecessarily complex. gve_reg_xsk_pool extracts the logic of registering an XSK pool with a queue into a method that can be used by both XDP_SETUP_XSK_POOL and gve_reg_xdp_info. gve_unreg_xdp_info is used to undo XDP info registration in the error path instead of explicitly unregistering the XDP info, as it is more complete and idempotent. This patch will be followed by other changes to the XDP registration logic, and will simplify those changes due to the use of common methods. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://patch.msgid.link/20250717152839.973004-2-jeroendb@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
12 days	gve: Fix stuck TX queue for DQ queue format	Praveen Kaligineedi
	gve_tx_timeout was calculating missed completions in a way that is only relevant in the GQ queue format. Additionally, it was attempting to disable device interrupts, which is not needed in either GQ or DQ queue formats. As a result, TX timeouts with the DQ queue format likely would have triggered early resets without kicking the queue at all. This patch drops the check for pending work altogether and always kicks the queue after validating the queue has not seen a TX timeout too recently. Cc: stable@vger.kernel.org Fixes: 87a7f321bb6a ("gve: Recover from queue stall due to missed IRQ") Co-developed-by: Tim Hostetler <thostet@google.com> Signed-off-by: Tim Hostetler <thostet@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250717192024.1820931-1-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-09	gve: make IRQ handlers and page allocation NUMA aware	Bailey Forrest
	All memory in GVE is currently allocated without regard for the NUMA node of the device. Because access to NUMA-local memory access is significantly cheaper than access to a remote node, this change attempts to ensure that page frags used in the RX path, including page pool frags, are allocated on the NUMA node local to the gVNIC device. Note that this attempt is best-effort. If necessary, the driver will still allocate non-local memory, as __GFP_THISNODE is not passed. Descriptor ring allocations are not updated, as dma_alloc_coherent handles that. This change also modifies the IRQ affinity setting to only select CPUs from the node local to the device, preserving the behavior that TX and RX queues of the same index share CPU affinity. Signed-off-by: Bailey Forrest <bcf@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250707210107.2742029-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-07-08	gve: global: fix "for a while" typo	Ahelenia Ziemiańska
	Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/5zsbhtyox3cvbntuvhigsn42uooescbvdhrat6s3d6rczznzg5@tarta.nabijaczleweli.xyz Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-21	gve: add XDP_TX and XDP_REDIRECT support for DQ RDA	Joshua Washington
	This patch adds support for XDP_TX and XDP_REDIRECT for the DQ RDA queue format. To appropriately support transmission of XDP frames, a new pending packet type GVE_TX_PENDING_PACKET_DQO_XDP_FRAME is introduced for completion handling, as there was a previous assumption that completed packets would be SKBs. XDP_TX handling completes the basic XDP actions, so the feature is recorded accordingly. This patch also enables the ndo_xdp_xmit callback allowing DQ to handle XDP_REDIRECT packets originating from another interface. The XDP spinlock is moved to common TX ring fields so that it can be used in both GQ and DQ. Originally, it was in a section which was mutually exclusive for GQ and DQ. In summary, 3 XDP features are exposed for the DQ RDA queue format: 1) NETDEV_XDP_ACT_BASIC 2) NETDEV_XDP_ACT_NDO_XMIT 3) NETDEV_XDP_ACT_REDIRECT Note that XDP and header-data split are mutually exclusive for the time being due to lack of multi-buffer XDP support. This patch does not add support for the DQ QPL format. That is to come in a future patch series. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-06-21	gve: refactor DQO TX methods to be more generic for XDP	Joshua Washington
	This patch performs various minor DQO TX datapath refactors in preparation for adding XDP_TX and XDP_REDIRECT support. The following refactors are performed: 1) gve_tx_fill_pkt_desc_dqo() relies on a SKB pointer to get whether checksum offloading should be enabled. This won't work for the XDP case, which does not have a SKB. This patch updates the method to use a boolean representing whether checksum offloading should be enabled directly. 2) gve_maybe_stop_dqo() contains some synchronization between the true TX head and the cached value, a synchronization which is common for XDP queues and normal netdev queues. However, that method is reserved for netdev TX queues. To avoid duplicate code, this logic is factored out into a new method, gve_has_tx_slots_available(). 3) gve_tx_update_tail() is added to update the TX tail, a functionality that will be common between normal TX and XDP TX codepaths. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-06-21	gve: rename gve_xdp_xmit to gve_xdp_xmit_gqi	Joshua Washington
	In preparation for XDP DQ support, the gve_xdp_xmit callback needs to be generalized for all queue formats. This patch renames the GQ-specific function to gve_xdp_xmit_gqi, and introduces a new gve_xdp_xmit callback which branches on queue format. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-06-17	gve: Return error for unknown admin queue command	Alok Tiwari
	In gve_adminq_issue_cmd(), return -EINVAL instead of 0 when an unknown admin queue command opcode is encountered. This prevents the function from silently succeeding on invalid input and prevents undefined behavior by ensuring the function fails gracefully when an unrecognized opcode is provided. These changes improve error handling. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Link: https://patch.msgid.link/20250616054504.1644770-2-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-17	gve: Fix various typos and improve code comments	Alok Tiwari
	- Correct spelling and improves the clarity of comments "confiugration" -> "configuration" "spilt" -> "split" "It if is 0" -> "If it is 0" "DQ" -> "DQO" (correct abbreviation) - Clarify BIT(0) flag usage in gve_get_priv_flags() - Replaced hardcoded array size with GVE_NUM_PTYPES for clarity and maintainability. These changes are purely cosmetic and do not affect functionality. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Joe Damato <joe@dama.to> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250616054504.1644770-1-alok.a.tiwari@oracle.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16	gve: Advertise support for rx hardware timestamping	John Fraker
	Expand the get_ts_info ethtool handler with the new gve_get_ts_info which advertises support for rx hardware timestamping. With this patch, the driver now fully supports rx hardware timestamping. Signed-off-by: John Fraker <jfraker@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250614000754.164827-9-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16	gve: Implement ndo_hwtstamp_get/set for RX timestamping	John Fraker
	Implement ndo_hwtstamp_get/set to enable hardware RX timestamping, providing support for SIOC[SG]HWTSTAMP IOCTLs. Included with this support is the small change necessary to read the rx timestamp out of the rx descriptor, now that timestamps start being enabled. The gve clock is only used for hardware timestamps, so started when timestamps are requested and stopped when not needed. This version only supports RX hardware timestamping with the rx filter HWTSTAMP_FILTER_ALL. If the user attempts to configure a more restrictive filter, the filter will be set to HWTSTAMP_FILTER_ALL in the returned structure. Signed-off-by: John Fraker <jfraker@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250614000754.164827-8-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16	gve: Add rx hardware timestamp expansion	John Fraker
	Allow the rx path to recover the high 32 bits of the full 64 bit rx timestamp. Use the low 32 bits of the last synced nic time and the 32 bits of the timestamp provided in the rx descriptor to generate a difference, which is then applied to the last synced nic time to reconstruct the complete 64-bit timestamp. This scheme remains accurate as long as no more than ~2 seconds have passed between the last read of the nic clock and the timestamping application of the received packet. Signed-off-by: John Fraker <jfraker@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250614000754.164827-7-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16	gve: Add support to query the nic clock	Kevin Yang
	Query the nic clock and store the results. The timestamp delivered in descriptors has a wraparound time of ~4 seconds so 250ms is chosen as the sync cadence to provide a balance between performance, and drift potential when we do start associating host time and nic time. Leverage PTP's aux_work to query the nic clock periodically. Signed-off-by: Kevin Yang <yyd@google.com> Signed-off-by: John Fraker <jfraker@google.com> Signed-off-by: Tim Hostetler <thostet@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250614000754.164827-6-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16	gve: Add adminq lock for queues creation and destruction	Ziwei Xiao
	Adminq commands for queues creation and destruction were not consistently protected by the driver's adminq_lock. This was previously benign as these operations were always initiated from contexts holding kernel-level locks (e.g., rtnl_lock, netdev_lock), which provided serialization. Upcoming PTP aux_work will issue adminq commands directly from the driver to read the NIC clock, without such kernel lock protection. To prevent race conditions with this new PTP work, this patch ensures the adminq_lock is held during queues creation and destruction. Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250614000754.164827-5-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16	gve: Add initial PTP device support	Harshitha Ramamurthy
	If the device supports reading of the nic clock, add support to initialize and register the PTP clock. Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250614000754.164827-4-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16	gve: Add adminq command to report nic timestamp	John Fraker
	Add an adminq command to read NIC's hardware clock. The driver allocates dma memory and passes that dma memory address to the device. The device then writes the clock to the given address. Signed-off-by: Jeff Rogers <jefrogers@google.com> Signed-off-by: John Fraker <jfraker@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250614000754.164827-3-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-16	gve: Add device option for nic clock synchronization	John Fraker
	Add the device option and negotiation with the device for clock synchronization with the nic. This option is necessary before the driver will advertise support for hardware timestamping or other related features. Signed-off-by: Jeff Rogers <jefrogers@google.com> Signed-off-by: John Fraker <jfraker@google.com> Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250614000754.164827-2-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-12	eth: remove empty RXFH handling from drivers	Jakub Kicinski
	We're migrating RXFH config to new callbacks. Remove RXFH handling from drivers where it does nothing. Reviewed-by: Ziwei Xiao <ziweixiao@google.com> Link: https://patch.msgid.link/20250611145949.2674086-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-06-08	treewide, timers: Rename from_timer() to timer_container_of()	Ingo Molnar
	Move this API to the canonical timer_*() namespace. [ tglx: Redone against pre rc1 ] Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
2025-06-04	gve: add missing NULL check for gve_alloc_pending_packet() in TX DQO	Alok Tiwari
	gve_alloc_pending_packet() can return NULL, but gve_tx_add_skb_dqo() did not check for this case before dereferencing the returned pointer. Add a missing NULL check to prevent a potential NULL pointer dereference when allocation fails. This improves robustness in low-memory scenarios. Fixes: a57e5de476be ("gve: DQO: Add TX path") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-29	gve: Fix RX_BUFFERS_POSTED stat to report per-queue fill_cnt	Alok Tiwari
	Previously, the RX_BUFFERS_POSTED stat incorrectly reported the fill_cnt from RX queue 0 for all queues, resulting in inaccurate per-queue statistics. Fix this by correctly indexing priv->rx[idx].fill_cnt for each RX queue. Fixes: 24aeb56f2d38 ("gve: Add Gvnic stats AQ command and ethtool show/set-priv-flags.") Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Link: https://patch.msgid.link/20250527130830.1812903-1-alok.a.tiwari@oracle.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-13	gve: add netmem TX support to GVE DQO-RDA mode	Mina Almasry
	Use netmem_dma_*() helpers in gve_tx_dqo.c DQO-RDA paths to enable netmem TX support in that mode. Declare support for netmem TX in GVE DQO-RDA mode. Signed-off-by: Mina Almasry <almasrymina@google.com> Acked-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250508004830.4100853-8-almasrymina@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-22	xdp: create locked/unlocked instances of xdp redirect target setters	Joshua Washington
	Commit 03df156dd3a6 ("xdp: double protect netdev->xdp_flags with netdev->lock") introduces the netdev lock to xdp_set_features_flag(). The change includes a _locked version of the method, as it is possible for a driver to have already acquired the netdev lock before calling this helper. However, the same applies to xdp_features_(set\|clear)_redirect_flags(), which ends up calling the unlocked version of xdp_set_features_flags() leading to deadlocks in GVE, which grabs the netdev lock as part of its suspend, reset, and shutdown processes: [ 833.265543] WARNING: possible recursive locking detected [ 833.270949] 6.15.0-rc1 #6 Tainted: G E [ 833.276271] -------------------------------------------- [ 833.281681] systemd-shutdow/1 is trying to acquire lock: [ 833.287090] ffff949d2b148c68 (&dev->lock){+.+.}-{4:4}, at: xdp_set_features_flag+0x29/0x90 [ 833.295470] [ 833.295470] but task is already holding lock: [ 833.301400] ffff949d2b148c68 (&dev->lock){+.+.}-{4:4}, at: gve_shutdown+0x44/0x90 [gve] [ 833.309508] [ 833.309508] other info that might help us debug this: [ 833.316130] Possible unsafe locking scenario: [ 833.316130] [ 833.322142] CPU0 [ 833.324681] ---- [ 833.327220] lock(&dev->lock); [ 833.330455] lock(&dev->lock); [ 833.333689] [ 833.333689] * DEADLOCK * [ 833.333689] [ 833.339701] May be due to missing lock nesting notation [ 833.339701] [ 833.346582] 5 locks held by systemd-shutdow/1: [ 833.351205] #0: ffffffffa9c89130 (system_transition_mutex){+.+.}-{4:4}, at: __se_sys_reboot+0xe6/0x210 [ 833.360695] #1: ffff93b399e5c1b8 (&dev->mutex){....}-{4:4}, at: device_shutdown+0xb4/0x1f0 [ 833.369144] #2: ffff949d19a471b8 (&dev->mutex){....}-{4:4}, at: device_shutdown+0xc2/0x1f0 [ 833.377603] #3: ffffffffa9eca050 (rtnl_mutex){+.+.}-{4:4}, at: gve_shutdown+0x33/0x90 [gve] [ 833.386138] #4: ffff949d2b148c68 (&dev->lock){+.+.}-{4:4}, at: gve_shutdown+0x44/0x90 [gve] Introduce xdp_features_(set\|clear)_redirect_target_locked() versions which assume that the netdev lock has already been acquired before setting the XDP feature flag and update GVE to use the locked version. Fixes: 03df156dd3a6 ("xdp: double protect netdev->xdp_flags with netdev->lock") Tested-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250422011643.3509287-1-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17	net: ethtool: Adjust exactly ETH_GSTRING_LEN-long stats to use memcpy	Kees Cook
	Many drivers populate the stats buffer using C-String based APIs (e.g. ethtool_sprintf() and ethtool_puts()), usually when building up the list of stats individually (i.e. with a for() loop). This, however, requires that the source strings be populated in such a way as to have a terminating NUL byte in the source. Other drivers populate the stats buffer directly using one big memcpy() of an entire array of strings. No NUL termination is needed here, as the bytes are being directly passed through. Yet others will build up the stats buffer individually, but also use memcpy(). This, too, does not need NUL termination of the source strings. However, there are cases where the strings that populate the source stats strings are exactly ETH_GSTRING_LEN long, and GCC 15's -Wunterminated-string-initialization option complains that the trailing NUL byte has been truncated. This situation is fine only if the driver is using the memcpy() approach. If the C-String APIs are used, the destination string name will have its final byte truncated by the required trailing NUL byte applied by the C-string API. For drivers that are already using memcpy() but have initializers that truncate the NUL terminator, mark their source strings as __nonstring to silence the GCC warnings. For drivers that have initializers that truncate the NUL terminator and are using the C-String APIs, switch to memcpy() to avoid destination string truncation and mark their source strings as __nonstring to silence the GCC warnings. (Also introduce ethtool_cpy() as a helper to make this an easy replacement). Specifically the following warnings were investigated and addressed: ../drivers/net/ethernet/chelsio/cxgb/cxgb2.c:364:9: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 364 \| "TxFramesAbortedDueToXSCollisions", \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/freescale/enetc/enetc_ethtool.c:165:33: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 165 \| { ENETC_PM_R1523X(0), "MAC rx 1523 to max-octet packets" }, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/freescale/enetc/enetc_ethtool.c:190:33: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 190 \| { ENETC_PM_T1523X(0), "MAC tx 1523 to max-octet packets" }, \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/google/gve/gve_ethtool.c:76:9: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 76 \| "adminq_dcfg_device_resources_cnt", "adminq_set_driver_parameter_cnt", \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:117:53: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 117 \| STMMAC_STAT(ptp_rx_msg_type_pdelay_follow_up), \| ^ ../drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:46:12: note: in definition of macro 'STMMAC_STAT' 46 \| { #m, sizeof_field(struct stmmac_extra_stats, m), \ \| ^ ../drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c:328:24: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 328 \| .str = "a_mac_control_frames_transmitted", \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c:340:24: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 340 \| .str = "a_pause_mac_ctrl_frames_received", \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Petr Machata <petrm@nvidia.com> # for mlxsw Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250416010210.work.904-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.15-rc2). Conflict: Documentation/networking/netdevices.rst net/core/lock_debug.c 04efcee6ef8d ("net: hold instance lock during NETDEV_CHANGE") 03df156dd3a6 ("xdp: double protect netdev->xdp_flags with netdev->lock") No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-09	xdp: double protect netdev->xdp_flags with netdev->lock	Jakub Kicinski
	Protect xdp_features with netdev->lock. This way pure readers no longer have to take rtnl_lock to access the field. This includes calling NETDEV_XDP_FEAT_CHANGE under the lock. Looks like that's fine for bonding, the only "real" listener, it's the same as ethtool feature change. In terms of normal drivers - only GVE need special consideration (other drivers don't use instance lock or don't support XDP). It calls xdp_set_features_flag() helper from gve_init_priv() which in turn is called from gve_reset_recovery() (locked), or prior to netdev registration. So switch to _locked. Reviewed-by: Joe Damato <jdamato@fastly.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Acked-by: Harshitha Ramamurthy <hramamurthy@google.com> Acked-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20250408195956.412733-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-05	treewide: Switch/rename to timer_delete[_sync]()	Thomas Gleixner
	timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree over and remove the historical wrapper inlines. Conversion was done with coccinelle plus manual fixups where necessary. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-04-03	gve: handle overflow when reporting TX consumed descriptors	Joshua Washington
	When the tx tail is less than the head (in cases of wraparound), the TX consumed descriptor statistic in DQ will be reported as UINT32_MAX - head + tail, which is incorrect. Mask the difference of head and tail according to the ring size when reporting the statistic. Cc: stable@vger.kernel.org Fixes: 2c9198356d56 ("gve: Add consumed counts to ethtool stats") Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250402001037.2717315-1-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-31	eth: gve: add missing netdev locks on reset and shutdown paths	Jakub Kicinski
	All the misc entry points end up calling into either gve_open() or gve_close(), they take rtnl_lock today but since the recent instance locking changes should also take the instance lock. Found by code inspection and untested. Fixes: cae03e5bdd9e ("net: hold netdev instance lock during queue operations") Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250328164742.1268069-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-26	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Merge in late fixes to prepare for the 6.15 net-next PR. No conflicts, adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt.c 919f9f497dbc ("eth: bnxt: fix out-of-range access of vnic_info array") fe96d717d38e ("bnxt_en: Extend queue stop/start for TX rings") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-25	gve: add XDP DROP and PASS support for DQ	Joshua Washington
	This patch adds support for running XDP programs on DQ, along with rudimentary processing for XDP_DROP and XDP_PASS. These actions require very limited driver functionality when it comes to processing an XDP buffer, so currently if the XDP action is not XDP_PASS, the packet is dropped and stats are updated. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaliginedi <pkaligineedi@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Harshitha Ramamurthy<hramamurthy@google.com> Link: https://patch.msgid.link/20250321002910.1343422-7-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-25	gve: update XDP allocation path support RX buffer posting	Joshua Washington
	In order to support installing an XDP program on DQ, RX buffers need to be reposted using 4K buffers, which is larger than the default packet buffer size of 2K. This is needed to accommodate the extra head and tail that accompanies the data portion of an XDP buffer. Continuing to use 2K buffers would mean that the packet buffer size for the NIC would have to be restricted to 2048 - 320 - 256 = 1472B. However, this is problematic for two reasons: first, 1472 is not a packet buffer size accepted by GVE; second, at least 1474B of buffer space is needed to accommodate an MTU of 1460, which is the default on GCP. As such, we allocate 4K buffers, and post a 2K section of those 4K buffers (offset relative to the XDP headroom) to the NIC for DMA to avoid a potential extra copy. Because the GQ-QPL datapath requires copies regardless, this change was not needed to support XDP in that case. To capture this subtlety, a new field, packet_buffer_truesize, has been added to the rx ring struct to represent size of the allocated buffer, while packet_buffer_size has been left to represent the portion of the buffer posted to the NIC. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250321002910.1343422-6-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-25	gve: merge packet buffer size fields	Joshua Washington
	The data_buffer_size_dqo field in gve_priv and the packet_buffer_size field in gve_rx_ring theoretically have the same meaning, but they are defined in two different places and used in two separate contexts. There is no good reason for this, so this change merges those fields into the packet_buffer_size field in the RX ring. This change also introduces a packet_buffer_size field to struct gve_rx_queue_config to account for cases where queues are not allocated, such as when the interface is down. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250321002910.1343422-5-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-25	gve: update GQ RX to use buf_size	Joshua Washington
	Commit ebdfae0d377b ("gve: adopt page pool for DQ RDA mode") introduced a buf_size field to the gve_rx_slot_page_info struct, which can be used in the datapath to take the place of the packet_buffer_size field, as it will already be hot in the cache due to its extensive use. Using the buf_size field in the datapath frees up the packet_buffer_size field in the GQ-specific RX cacheline to be generalized for GQ and DQ (in the next patch), as there is currently no common packet buffer size field between the two queue formats. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250321002910.1343422-4-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-25	gve: introduce config-based allocation for XDP	Joshua Washington
	An earlier patch series[1] introduced RX/TX ring allocation configuration structs which contained metadata used to allocate and configure new RX and TX rings. This led to a much cleaner and safer allocation pattern wherein queue resources were not deallocated until new queue resources were successfully allocated. Migrate the XDP allocation path to use the same pattern to allow for the existence of a single allocation path instead of relying on XDP-specific allocation methods. These extra allocation methods result in the duplication of many existing behaviors while being prone to error when configuration changes unrelated to XDP occur. Link: https://lore.kernel.org/netdev/20240122182632.1102721-1-shailend@google.com/ [1] Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250321002910.1343422-3-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-25	gve: remove xdp_xsk_done and xdp_xsk_wakeup statistics	Joshua Washington
	These statistics pollute the hotpath and do not have any real-world use or meaning. Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250321002910.1343422-2-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-24	gve: unlink old napi only if page pool exists	Harshitha Ramamurthy
	Commit de70981f295e ("gve: unlink old napi when stopping a queue using queue API") unlinks the old napi when stopping a queue. But this breaks QPL mode of the driver which does not use page pool. Fix this by checking that there's a page pool associated with the ring. Cc: stable@vger.kernel.org Fixes: de70981f295e ("gve: unlink old napi when stopping a queue using queue API") Reviewed-by: Joshua Washington <joshwash@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250317214141.286854-1-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-07	gve: convert to use netmem for DQO RDA mode	Harshitha Ramamurthy
	To add netmem support to the gve driver, add a union to the struct gve_rx_slot_page_info. netmem_ref is used for DQO queue format's raw DMA addressing(RDA) mode. The struct page is retained for other usecases. Then, switch to using relevant netmem helper functions for page pool and skb frag management. Reviewed-by: Mina Almasry <almasrymina@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250307003905.601175-1-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-06	net: hold netdev instance lock during queue operations	Stanislav Fomichev
	For the drivers that use queue management API, switch to the mode where core stack holds the netdev instance lock. This affects the following drivers: - bnxt - gve - netdevsim Originally I locked only start/stop, but switched to holding the lock over all iterations to make them look atomic to the device (feels like it should be easier to reason about). Reviewed-by: Eric Dumazet <edumazet@google.com> Cc: Saeed Mahameed <saeed@kernel.org> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250305163732.2766420-6-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-27	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.14-rc5). Conflicts: drivers/net/ethernet/cadence/macb_main.c fa52f15c745c ("net: cadence: macb: Synchronize stats calculations") 75696dd0fd72 ("net: cadence: macb: Convert to get_stats64") https://lore.kernel.org/20250224125848.68ee63e5@canb.auug.org.au Adjacent changes: drivers/net/ethernet/intel/ice/ice_sriov.c 79990cf5e7ad ("ice: Fix deinitializing VF in error path") a203163274a4 ("ice: simplify VF MSI-X managing") net/ipv4/tcp.c 18912c520674 ("tcp: devmem: don't write truncated dmabuf CMSGs to userspace") 297d389e9e5b ("net: prefix devmem specific helpers") net/mptcp/subflow.c 8668860b0ad3 ("mptcp: reset when MPTCP opts are dropped after join") c3349a22c200 ("mptcp: consolidate subflow cleanup") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-26	gve: unlink old napi when stopping a queue using queue API	Harshitha Ramamurthy
	When a queue is stopped using the ndo queue API, before destroying its page pool, the associated NAPI instance needs to be unlinked to avoid warnings. Handle this by calling page_pool_disable_direct_recycling() when stopping a queue. Cc: stable@vger.kernel.org Fixes: ebdfae0d377b ("gve: adopt page pool for DQ RDA mode") Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250226003526.1546854-1-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21	gve: Add RSS cache for non RSS device option scenario	Ziwei Xiao
	Not all the devices have the capability for the driver to query for the registered RSS configuration. The driver can discover this by checking the relevant device option during setup. If it cannot, the driver needs to store the RSS config cache and directly return such cache when queried by the ethtool. RSS config is inited when driver probes. Also the default RSS config will be adjusted when there is RX queue count change. At this point, only keys of GVE_RSS_KEY_SIZE and indirection tables of GVE_RSS_INDIR_SIZE are supported. Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Signed-off-by: Praveen Kaligineedi <pkaligineedi@google.com> Signed-off-by: Jeroen de Borst <jeroendb@google.com> Link: https://patch.msgid.link/20250219200451.3348166-1-jeroendb@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-18	gve: set xdp redirect target only when it is available	Joshua Washington
	Before this patch the NETDEV_XDP_ACT_NDO_XMIT XDP feature flag is set by default as part of driver initialization, and is never cleared. However, this flag differs from others in that it is used as an indicator for whether the driver is ready to perform the ndo_xdp_xmit operation as part of an XDP_REDIRECT. Kernel helpers xdp_features_(set\|clear)_redirect_target exist to convey this meaning. This patch ensures that the netdev is only reported as a redirect target when XDP queues exist to forward traffic. Fixes: 39a7f4aa3e4a ("gve: Add XDP REDIRECT support for GQI-QPL format") Cc: stable@vger.kernel.org Reviewed-by: Praveen Kaligineedi <pkaligineedi@google.com> Reviewed-by: Jeroen de Borst <jeroendb@google.com> Signed-off-by: Joshua Washington <joshwash@google.com> Link: https://patch.msgid.link/20250214224417.1237818-1-joshwash@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-09	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.13-rc7). Conflicts: a42d71e322a8 ("net_sched: sch_cake: Add drop reasons") 737d4d91d35b ("sched: sch_cake: add bounds checks to host bulk flow fairness counts") Adjacent changes: drivers/net/ethernet/meta/fbnic/fbnic.h 3a856ab34726 ("eth: fbnic: add IRQ reuse support") 95978931d55f ("eth: fbnic: Revert "eth: fbnic: Add hardware monitoring support via HWMON interface"") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07	eth: gve: use appropriate helper to set xdp_features	Jakub Kicinski
	Commit f85949f98206 ("xdp: add xdp_set_features_flag utility routine") added routines to inform the core about XDP flag changes. GVE support was added around the same time and missed using them. GVE only changes the flags on error recover or resume. Presumably the flags may change during resume if VM migrated. User would not get the notification and upper devices would not get a chance to recalculate their flags. Fixes: 75eaae158b1b ("gve: Add XDP DROP and TX support for GQI-QPL format") Reviewed-By: Jeroen de Borst <jeroendb@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20250106180210.1861784-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>