author Jakub Kicinski <kuba@kernel.org> 2025-09-17 18:29:33 -0700
committer Jakub Kicinski <kuba@kernel.org> 2025-09-17 18:30:56 -0700
commit 152ba35c04ade1a164c774d6fccbf8e8cf4652cf
tree 75680a9629fbc958480c9265e4e489a7df891721 /net/ipv4/tcp_output.c
parent cbff0b1ec64ee984759e3a0af84f4384540148a8
parent 11bbcfb7668c6f4d97260f7caaefea22678bc31e

Merge branch 'net-mlx5e-use-multiple-doorbells'

Tariq Toukan says:

====================
net/mlx5e: Use multiple doorbells

mlx5e uses a single MMIO-mapped doorbell per netdevice for all send and receive operations. Writes to the doorbell go over the PCIe bus directly to the device, which then services the indicated queues.

On certain architectures and with sufficiently high volume of doorbell ringing (many cores, many active channels, small MTU, no GSO, etc.), the MMIO-mapped doorbell address can become contended, leading to delays in servicing writes to that address and a global slowdown of all traffic for that netdevice.

mlx5 NICs have supported using multiple doorbells for many years, the mlx5_ib driver for the same hardware has been using multiple doorbells traditionally.

This patch series extends the mlx5 Ethernet driver to also use multiple doorbells to solve the MMIO contention issues. By allocating and using more doorbells for all channel queues (TX and RX), the MMIO contention on any particular doorbell address is reduced significantly.

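As a rough illustration of why more doorbells reduce contention, the sketch below spreads channel queues over the available doorbells round-robin. It is not taken from the mlx5 driver: the helper names `doorbell_for_channel` and `channels_on_doorbell`, and the modulo mapping itself, are assumptions made for this example only.

```c
#include <assert.h>

/* Hypothetical helper, not part of the mlx5 driver: pick a doorbell
 * index for a channel by simple round-robin (modulo) assignment. */
static unsigned int doorbell_for_channel(unsigned int channel_ix,
                                         unsigned int num_doorbells)
{
    /* At least one doorbell must exist. */
    assert(num_doorbells > 0);
    return channel_ix % num_doorbells;
}

/* Hypothetical helper: count how many of the first num_channels
 * channels end up writing to the given doorbell. */
static unsigned int channels_on_doorbell(unsigned int num_channels,
                                         unsigned int num_doorbells,
                                         unsigned int doorbell_ix)
{
    unsigned int count = 0;

    for (unsigned int ix = 0; ix < num_channels; ix++)
        if (doorbell_for_channel(ix, num_doorbells) == doorbell_ix)
            count++;
    return count;
}
```

Under this assumed mapping, 64 channels sharing 8 doorbells leave 8 channels writing to each doorbell address, instead of all 64 writing to one.
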
The first patches are cleanups:
  net/mlx5: Fix typo of MLX5_EQ_DOORBEL_OFFSET
  net/mlx5: Remove unused 'offset' field from struct mlx5_sq_bfreg
  net/mlx5e: Remove unused 'xsk' param of mlx5e_build_xdpsq_param

The next patch separates the global doorbell from Ethernet-specific resources:
  net/mlx5: Store the global doorbell in mlx5_priv

Next, plumbing to allow a different doorbell to be used for channel TX and RX queues:
  net/mlx5e: Prepare for using multiple TX doorbells
  net/mlx5e: Prepare for using different CQ doorbells

Then, enable using multiple doorbells for channel queues:
  net/mlx5e: Use multiple TX doorbells
  net/mlx5e: Use multiple CQ doorbells

Finally, introduce a devlink parameter to control this:
  devlink: Add a 'num_doorbells' driverinit param
  net/mlx5e: Use the 'num_doorbells' devlink param

Some performance results, done with the Linux pktgen script, running b2b over Connect-X 8 NICs:

  samples/pktgen/pktgen_sample02_multiqueue.sh -i $NIC -s 64 -d $DST_IP \
      -m $MAC -t 64

  Baseline (1 doorbell): 9 Mpps
  This series (8 doorbells): 56 Mpps

Note that pktgen without 'burst' rings the doorbell after every packet, while real packet TX using NAPI usually batches multiple pending packets with the xmit_more mechanism. So this is in essence a micro-benchmark showcasing the improvement of using multiple doorbells on platforms affected by MMIO contention. Real life traffic usually sees little movement either way.
====================

Link: https://patch.msgid.link/1758031904-634231-1-git-send-email-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Diffstat (limited to 'net/ipv4/tcp_output.c')
0 files changed, 0 insertions, 0 deletions