summaryrefslogtreecommitdiff
path: root/drivers/net/ethernet/engleder/tsnep.h
AgeCommit message (Collapse)Author
2023-04-24tsnep: Add XDP socket zero-copy TX supportGerhard Engleder
Send and complete XSK pool frames within TX NAPI context. NAPI context is triggered by ndo_xsk_wakeup. Test results with A53 1.2GHz: xdpsock txonly copy mode, 64 byte frames: pps pkts 1.00 tx 284,409 11,398,144 Two CPUs with 100% and 10% utilization. xdpsock txonly zero-copy mode, 64 byte frames: pps pkts 1.00 tx 511,929 5,890,368 Two CPUs with 100% and 1% utilization. xdpsock l2fwd copy mode, 64 byte frames: pps pkts 1.00 rx 248,985 7,315,885 tx 248,921 7,315,885 Two CPUs with 100% and 10% utilization. xdpsock l2fwd zero-copy mode, 64 byte frames: pps pkts 1.00 rx 254,735 3,039,456 tx 254,735 3,039,456 Two CPUs with 100% and 4% utilization. Packet rate increases and CPU utilization is reduced in both cases. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-24tsnep: Add XDP socket zero-copy RX supportGerhard Engleder
Add support for XSK zero-copy to RX path. The setup of the XSK pool can be done at runtime. If the netdev is running, then the queue must be disabled and enabled during reconfiguration. This can be done easily with functions introduced in previous commits. A more important property is that, if the netdev is running, then the setup of the XSK pool shall not stop the netdev in case of errors. A broken netdev after a failed XSK pool setup is bad behavior. Therefore, the allocation and setup of resources during XSK pool setup is done only before any queue is disabled. Additionally, freeing and later allocation of resources is eliminated in some cases. Page pool entries are kept for later use. Two memory models are registered in parallel. As a result, the XSK pool setup cannot fail during queue reconfiguration. In contrast to other drivers, XSK pool setup and XDP BPF program setup are separate actions. XSK pool setup can be done without any XDP BPF program. The XDP BPF program can be added, removed or changed without any reconfiguration of the XSK pool. Test results with A53 1.2GHz: xdpsock rxdrop copy mode, 64 byte frames: pps pkts 1.00 rx 856,054 10,625,775 Two CPUs with both 100% utilization. xdpsock rxdrop zero-copy mode, 64 byte frames: pps pkts 1.00 rx 889,388 4,615,284 Two CPUs with 100% and 20% utilization. Packet rate increases and CPU utilization is reduced. 100% CPU load seems to the base load. This load is consumed by ksoftirqd just for dropping the generated packets without xdpsock running. Using batch API reduced CPU utilization slightly, but measurements are not stable enough to provide meaningful numbers. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-24tsnep: Replace modulo operation with maskGerhard Engleder
TX/RX ring size is static and power of 2 to enable compiler to optimize modulo operation to mask operation. Make this optimization already in the code and don't rely on the compiler. CPU utilisation during high packet rate has not changed. So no performance improvement has been measured. But it is best practice to prevent modulo operations. Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-18tsnep: Support XDP BPF program setupGerhard Engleder
Implement setup of BPF programs for XDP RX path with command XDP_SETUP_PROG of ndo_bpf(). This is the final step for XDP RX path support. There is no need to reinit the RX queues as they are always prepared for XDP. Additionally remove $(tsnep-y) from $(tsnep-objs) because it is added automatically. Test results with A53 1.2GHz: XDP_DROP (samples/bpf/xdp1) proto 17: 883878 pkt/s XDP_TX (samples/bpf/xdp2) proto 17: 255693 pkt/s XDP_REDIRECT (samples/bpf/xdpsock) sock0@eth2:0 rxdrop xdp-drv pps pkts 1.00 rx 855,582 5,404,523 tx 0 0 XDP_REDIRECT (samples/bpf/xdp_redirect) eth2->eth1 613,267 rx/s 0 err,drop/s 613,272 xmit/s Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-18tsnep: Add XDP RX supportGerhard Engleder
If BPF program is set up, then run BPF program for every received frame and execute the selected action. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-18tsnep: Add RX queue info for XDP supportGerhard Engleder
Register xdp_rxq_info with page_pool memory model. This is needed for XDP buffer handling. Additionally fix error path by removing call of tsnep_phy_close() after failed tsnep_phy_open(). Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-18tsnep: Add XDP TX supportGerhard Engleder
Implement ndo_xdp_xmit() for XDP TX support. Support for fragmented XDP frames is included. Also some braces and logic cleanups are done in normal TX path to keep both TX paths in sync. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-01-18tsnep: Replace TX spin_lock with __netif_tx_lockGerhard Engleder
TX spin_lock can be eliminated, because the normal TX path is already protected with __netif_tx_lock and this lock can be used for access to queue outside of normal TX path too. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-02tsnep: Rework RX buffer allocationGerhard Engleder
Refill RX queue in batches of descriptors to improve performance. Refill is allowed to fail as long as a minimum number of descriptors is active. Thus, a limited number of failed RX buffer allocations is now allowed for normal operation. Previously every failed allocation resulted in a dropped frame. If the minimum number of active descriptors is reached, then RX buffers are still reused and frames are dropped. This ensures that the RX queue never runs empty and always continues to operate. Prework for future XDP support. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-12-02tsnep: Throttle interruptsGerhard Engleder
Without interrupt throttling, iperf server mode generates a CPU load of 100% (A53 1.2GHz). Also the throughput suffers with less than 900Mbit/s on a 1Gbit/s link. The reason is a high interrupt load with interrupts every ~20us. Reduce interrupt load by throttling of interrupts. Interrupt delay default is 64us. For iperf server mode the CPU load is significantly reduced to ~20% and the throughput reaches the maximum of 941MBit/s. Interrupts are generated every ~140us. RX and TX coalesce can be configured with ethtool. RX coalesce has priority over TX coalesce if the same interrupt is used. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30tsnep: Use page pool for RXGerhard Engleder
Use page pool for RX buffer handling. Makes RX path more efficient and is required prework for future XDP support. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30tsnep: Add EtherType RX flow classification supportGerhard Engleder
Received Ethernet frames are assigned to first RX queue per default. Based on EtherType Ethernet frames can be assigned to other RX queues. This enables processing of real-time Ethernet protocols on dedicated RX queues. Add RX flow classification interface for EtherType based RX queue assignment. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30tsnep: Support multiple TX/RX queue pairsGerhard Engleder
Support additional TX/RX queue pairs if dedicated interrupt is available. Interrupts are detected by name in device tree. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-30tsnep: Move interrupt from device to queueGerhard Engleder
For multiple queues multiple interrupts shall be used. Therefore, rework global interrupt to per queue interrupt. Every interrupt name shall contain interface name and queue information. To get a valid interface name, the interrupt request needs to by done during open like in other drivers. Additionally, this allows the removal of some initialisation checks in the interrupt handler. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-08-22tsnep: Record RX queueGerhard Engleder
Other drivers record RX queue so it should make sense to do that also. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2021-11-24tsnep: Fix resource_size cocci warningGerhard Engleder
The following warning is fixed, by removing the unused resource size: drivers/net/ethernet/engleder/tsnep_main.c:1155:21-24: WARNING: Suspicious code. resource_size is maybe missing with io Reported-by: kernel test robot <lkp@intel.com> Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Link: https://lore.kernel.org/r/20211124205225.13985-1-gerhard@engleder-embedded.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-11-22tsnep: Add TSN endpoint Ethernet MAC driverGerhard Engleder
The TSN endpoint Ethernet MAC is a FPGA based network device for real-time communication. It is integrated as Ethernet controller with ethtool and PTP support. For real-time communcation TC_SETUP_QDISC_TAPRIO is supported. Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com> Signed-off-by: David S. Miller <davem@davemloft.net>