net/mlx5e: SHAMPO, Coalesce skb fragments to page size - linux/linux-stable.git


author	Dragos Tatulea <dtatulea@nvidia.com>	2024-06-04 00:22:19 +0300
committer	Jakub Kicinski <kuba@kernel.org>	2024-06-05 20:20:46 -0700
commit	14ae2fd12be8c5089e43fee8a21cd8631699b97a (patch)
tree	ac187f11c0650989716a44f6f752477936b4b6df /net/unix/af_unix.c
parent	99be56171fa9ffea494dfe3d4a7f6e7e51630c2e (diff)

net/mlx5e: SHAMPO, Coalesce skb fragments to page size

When doing hardware GRO (SHAMPO), the driver puts each data payload of a
packet from the wire into one skb fragment. TCP Zero-Copy expects
page-sized skb fragments to be able to do its page-flipping magic. With
the current way the driver arranges fragments, only specific MTUs
(a page-sized multiple + header size) will yield a high percentage of
such page-sized fragments.

This change improves payload arrangement in the skb for hardware GRO by
coalescing payloads into a single skb fragment when possible.

To demonstrate the fix, running tcp_mmap with an MTU of 1500 yields:
- Before:  0 % bytes mmap'ed
- After : 81 % bytes mmap'ed

More importantly, coalescing considerably improves the HW GRO
performance. Here are the results for an iperf3 bandwidth benchmark:

+---------+--------+--------+------------------------+-----------+
| streams | SW GRO | HW GRO | HW GRO with coalescing | Unit      |
|---------+--------+--------+------------------------+-----------|
| 1       | 36     | 42     | 57                     | Gbits/sec |
| 4       | 34     | 39     | 50                     | Gbits/sec |
| 8       | 31     | 35     | 43                     | Gbits/sec |
+---------+--------+--------+------------------------+-----------+

Benchmark details:
VM based setup
CPU: Intel(R) Xeon(R) Platinum 8380 CPU, 24 cores
NIC: ConnectX-7 100GbE
iperf3 and irq running on same CPU over a single receive queue

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240603212219.1037656-15-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
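The coalescing idea described in the commit message can be illustrated with a small userspace model. This is only a sketch and not the driver code: `struct frag`, `struct frag_list`, `add_range()`, `fill_payloads()` and `count_full_pages()` are hypothetical names invented for the illustration. It assumes 4096-byte pages and that the hardware writes consecutive payloads back to back into the receive pages. The rule being modelled is that a payload starting exactly where the previous fragment ends, on the same page, extends that fragment instead of opening a new one.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u /* assumption: 4 KiB pages */
#define MAX_FRAGS 64    /* arbitrary limit for this model */

/* Hypothetical stand-in for an skb fragment: a byte range inside one page. */
struct frag {
	unsigned int page;   /* index of the backing page */
	unsigned int offset; /* start offset within that page */
	unsigned int len;    /* length in bytes */
};

struct frag_list {
	struct frag frags[MAX_FRAGS];
	size_t count;
};

/*
 * Add a byte range to the list. With coalescing enabled, a range that
 * starts exactly where the last fragment ends on the same page extends
 * that fragment. Returns 0 on success, -1 if the list is full.
 */
static int add_range(struct frag_list *fl, unsigned int page,
		     unsigned int offset, unsigned int len, int coalesce)
{
	assert(len > 0 && offset + len <= PAGE_SIZE);

	if (coalesce && fl->count > 0) {
		struct frag *last = &fl->frags[fl->count - 1];

		if (last->page == page && last->offset + last->len == offset) {
			last->len += len;
			return 0;
		}
	}
	if (fl->count == MAX_FRAGS)
		return -1;
	fl->frags[fl->count++] = (struct frag){ page, offset, len };
	return 0;
}

/*
 * Lay out n payloads of payload_len bytes back to back across pages,
 * splitting a payload wherever it crosses a page boundary.
 */
static int fill_payloads(struct frag_list *fl, unsigned int n,
			 unsigned int payload_len, int coalesce)
{
	unsigned long pos = 0;

	fl->count = 0;
	for (unsigned int i = 0; i < n; i++) {
		unsigned int remaining = payload_len;

		while (remaining > 0) {
			unsigned int page = (unsigned int)(pos / PAGE_SIZE);
			unsigned int offset = (unsigned int)(pos % PAGE_SIZE);
			unsigned int room = PAGE_SIZE - offset;
			unsigned int chunk = remaining < room ? remaining : room;

			if (add_range(fl, page, offset, chunk, coalesce))
				return -1;
			pos += chunk;
			remaining -= chunk;
		}
	}
	return 0;
}

/* Count fragments that cover one whole page. */
static size_t count_full_pages(const struct frag_list *fl)
{
	size_t full = 0;

	for (size_t i = 0; i < fl->count; i++)
		if (fl->frags[i].offset == 0 && fl->frags[i].len == PAGE_SIZE)
			full++;
	return full;
}
```

As a usage example, calling `fill_payloads(&fl, 17, 1448, 0)` and then `fill_payloads(&fl, 17, 1448, 1)` compares the two layouts. The payload size of 1448 bytes is itself an assumption: an MTU of 1500 minus 20 bytes of IPv4 header and 32 bytes of TCP header with timestamps. In this model the uncoalesced layout produces 23 fragments, none of them page sized, while the coalesced layout produces 7 fragments, 6 of which cover a full page. The model says nothing about the real tcp_mmap or iperf3 numbers quoted above; it only shows why per-payload fragments rarely line up with page boundaries.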

Diffstat (limited to 'net/unix/af_unix.c')

0 files changed, 0 insertions, 0 deletions

