diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2025-07-30 08:58:55 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2025-07-30 08:58:55 -0700 |
commit | 8be4d31cb8aaeea27bde4b7ddb26e28a89062ebf (patch) | |
tree | fec3039a08284cd87f4ec9c3bea5b5a439f1859f /drivers/net/ethernet/intel/igc/igc_tsn.c | |
parent | 4b290aae788e06561754b28c6842e4080957d3f7 (diff) | |
parent | fa582ca7e187a15e772e6a72fe035f649b387a60 (diff) |
Merge tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Wrap datapath globals into net_aligned_data, to avoid false sharing
- Preserve MSG_ZEROCOPY in forwarding (e.g. out of a container)
- Add SO_INQ and SCM_INQ support to AF_UNIX
- Add SIOCINQ support to AF_VSOCK
- Add TCP_MAXSEG sockopt to MPTCP
- Add IPv6 force_forwarding sysctl to enable forwarding per interface
- Make TCP validation of whether packet fully fits in the receive
window and the rcv_buf more strict. With increased use of HW
aggregation a single "packet" can be multiple 100s of kB
- Add MSG_MORE flag to optimize large TCP transmissions via sockmap,
improves latency up to 33% for sockmap users
- Convert TCP send queue handling from tasklet to BH workque
- Improve BPF iteration over TCP sockets to see each socket exactly
once
- Remove obsolete and unused TCP RFC3517/RFC6675 loss recovery code
- Support enabling kernel threads for NAPI processing on per-NAPI
instance basis rather than a whole device. Fully stop the kernel
NAPI thread when threaded NAPI gets disabled. Previously thread
would stick around until ifdown due to tricky synchronization
- Allow multicast routing to take effect on locally-generated packets
- Add output interface argument for End.X in segment routing
- MCTP: add support for gateway routing, improve bind() handling
- Don't require rtnl_lock when fetching an IPv6 neighbor over Netlink
- Add a new neighbor flag ("extern_valid"), which cedes refresh
responsibilities to userspace. This is needed for EVPN multi-homing
where a neighbor entry for a multi-homed host needs to be synced
across all the VTEPs among which the host is multi-homed
- Support NUD_PERMANENT for proxy neighbor entries
- Add a new queuing discipline for IETF RFC9332 DualQ Coupled AQM
- Add sequence numbers to netconsole messages. Unregister
netconsole's console when all net targets are removed. Code
refactoring. Add a number of selftests
- Align IPSec inbound SA lookup to RFC 4301. Only SPI and protocol
should be used for an inbound SA lookup
- Support inspecting ref_tracker state via DebugFS
- Don't force bonding advertisement frames tx to ~333 ms boundaries.
Add broadcast_neighbor option to send ARP/ND on all bonded links
- Allow providing upcall pid for the 'execute' command in openvswitch
- Remove DCCP support from Netfilter's conntrack
- Disallow multiple packet duplications in the queuing layer
- Prevent use of deprecated iptables code on PREEMPT_RT
Driver API:
- Support RSS and hashing configuration over ethtool Netlink
- Add dedicated ethtool callbacks for getting and setting hashing
fields
- Add support for power budget evaluation strategy in PSE /
Power-over-Ethernet. Generate Netlink events for overcurrent etc
- Support DPLL phase offset monitoring across all device inputs.
Support providing clock reference and SYNC over separate DPLL
inputs
- Support traffic classes in devlink rate API for bandwidth
management
- Remove rtnl_lock dependency from UDP tunnel port configuration
Device drivers:
- Add a new Broadcom driver for 800G Ethernet (bnge)
- Add a standalone driver for Microchip ZL3073x DPLL
- Remove IBM's NETIUCV device driver
- Ethernet high-speed NICs:
- Broadcom (bnxt):
- support zero-copy Tx of DMABUF memory
- take page size into account for page pool recycling rings
- Intel (100G, ice, idpf):
- idpf: XDP and AF_XDP support preparations
- idpf: add flow steering
- add link_down_events statistic
- clean up the TSPLL code
- preparations for live VM migration
- nVidia/Mellanox:
- support zero-copy Rx/Tx interfaces (DMABUF and io_uring)
- optimize context memory usage for matchers
- expose serial numbers in devlink info
- support PCIe congestion metrics
- Meta (fbnic):
- add 25G, 50G, and 100G link modes to phylink
- support dumping FW logs
- Marvell/Cavium:
- support for CN20K generation of the Octeon chips
- Amazon:
- add HW clock (without timestamping, just hypervisor time access)
- Ethernet virtual:
- VirtIO net:
- support segmentation of UDP-tunnel-encapsulated packets
- Google (gve):
- support packet timestamping and clock synchronization
- Microsoft vNIC:
- add handler for device-originated servicing events
- allow dynamic MSI-X vector allocation
- support Tx bandwidth clamping
- Ethernet NICs consumer, and embedded:
- AMD:
- amd-xgbe: hardware timestamping and PTP clock support
- Broadcom integrated MACs (bcmgenet, bcmasp):
- use napi_complete_done() return value to support NAPI polling
- add support for re-starting auto-negotiation
- Broadcom switches (b53):
- support BCM5325 switches
- add bcm63xx EPHY power control
- Synopsys (stmmac):
- lots of code refactoring and cleanups
- TI:
- icssg-prueth: read firmware-names from device tree
- icssg: PRP offload support
- Microchip:
- lan78xx: convert to PHYLINK for improved PHY and MAC management
- ksz: add KSZ8463 switch support
- Intel:
- support similar queue priority scheme in multi-queue and
time-sensitive networking (taprio)
- support packet pre-emption in both
- RealTek (r8169):
- enable EEE at 5Gbps on RTL8126
- Airoha:
- add PPPoE offload support
- MDIO bus controller for Airoha AN7583
- Ethernet PHYs:
- support for the IPQ5018 internal GE PHY
- micrel KSZ9477 switch-integrated PHYs:
- add MDI/MDI-X control support
- add RX error counters
- add cable test support
- add Signal Quality Indicator (SQI) reporting
- dp83tg720: improve reset handling and reduce link recovery time
- support bcm54811 (and its MII-Lite interface type)
- air_en8811h: support resume/suspend
- support PHY counters for QCA807x and QCA808x
- support WoL for QCA807x
- CAN drivers:
- rcar_canfd: support for Transceiver Delay Compensation
- kvaser: report FW versions via devlink dev info
- WiFi:
- extended regulatory info support (6 GHz)
- add statistics and beacon monitor for Multi-Link Operation (MLO)
- support S1G aggregation, improve S1G support
- add Radio Measurement action fields
- support per-radio RTS threshold
- some work around how FIPS affects wifi, which was wrong (RC4 is
used by TKIP, not only WEP)
- improvements for unsolicited probe response handling
- WiFi drivers:
- RealTek (rtw88):
- IBSS mode for SDIO devices
- RealTek (rtw89):
- BT coexistence for MLO/WiFi7
- concurrent station + P2P support
- support for USB devices RTL8851BU/RTL8852BU
- Intel (iwlwifi):
- use embedded PNVM in (to be released) FW images to fix
compatibility issues
- many cleanups (unused FW APIs, PCIe code, WoWLAN)
- some FIPS interoperability
- MediaTek (mt76):
- firmware recovery improvements
- more MLO work
- Qualcomm/Atheros (ath12k):
- fix scan on multi-radio devices
- more EHT/Wi-Fi 7 features
- encapsulation/decapsulation offload
- Broadcom (brcm80211):
- support SDIO 43751 device
- Bluetooth:
- hci_event: add support for handling LE BIG Sync Lost event
- ISO: add socket option to report packet seqnum via CMSG
- ISO: support SCM_TIMESTAMPING for ISO TS
- Bluetooth drivers:
- intel_pcie: support Function Level Reset
- nxpuart: add support for 4M baudrate
- nxpuart: implement powerup sequence, reset, FW dump, and FW loading"
* tag 'net-next-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1742 commits)
dpll: zl3073x: Fix build failure
selftests: bpf: fix legacy netfilter options
ipv6: annotate data-races around rt->fib6_nsiblings
ipv6: fix possible infinite loop in fib6_info_uses_dev()
ipv6: prevent infinite loop in rt6_nlmsg_size()
ipv6: add a retry logic in net6_rt_notify()
vrf: Drop existing dst reference in vrf_ip6_input_dst
net/sched: taprio: align entry index attr validation with mqprio
net: fsl_pq_mdio: use dev_err_probe
selftests: rtnetlink.sh: remove esp4_offload after test
vsock: remove unnecessary null check in vsock_getname()
igb: xsk: solve negative overflow of nb_pkts in zerocopy mode
stmmac: xsk: fix negative overflow of budget in zerocopy mode
dt-bindings: ieee802154: Convert at86rf230.txt yaml format
net: dsa: microchip: Disable PTP function of KSZ8463
net: dsa: microchip: Setup fiber ports for KSZ8463
net: dsa: microchip: Write switch MAC address differently for KSZ8463
net: dsa: microchip: Use different registers for KSZ8463
net: dsa: microchip: Add KSZ8463 switch support to KSZ DSA driver
dt-bindings: net: dsa: microchip: Add KSZ8463 switch support
...
Diffstat (limited to 'drivers/net/ethernet/intel/igc/igc_tsn.c')
-rw-r--r-- | drivers/net/ethernet/intel/igc/igc_tsn.c | 118 |
1 files changed, 104 insertions, 14 deletions
diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c index f22cc4d4f459..8a110145bfee 100644 --- a/drivers/net/ethernet/intel/igc/igc_tsn.c +++ b/drivers/net/ethernet/intel/igc/igc_tsn.c @@ -13,6 +13,13 @@ #define TX_MAX_FRAG_SIZE (TX_MIN_FRAG_SIZE * \ (MAX_MULTPLIER_TX_MIN_FRAG + 1)) +enum tx_queue { + TX_QUEUE_0 = 0, + TX_QUEUE_1, + TX_QUEUE_2, + TX_QUEUE_3, +}; + DEFINE_STATIC_KEY_FALSE(igc_fpe_enabled); static int igc_fpe_init_smd_frame(struct igc_ring *ring, @@ -109,6 +116,18 @@ static int igc_fpe_xmit_smd_frame(struct igc_adapter *adapter, return err; } +static void igc_fpe_configure_tx(struct ethtool_mmsv *mmsv, bool tx_enable) +{ + struct igc_fpe_t *fpe = container_of(mmsv, struct igc_fpe_t, mmsv); + struct igc_adapter *adapter; + + adapter = container_of(fpe, struct igc_adapter, fpe); + adapter->fpe.tx_enabled = tx_enable; + + /* Update config since tx_enabled affects preemptible queue configuration */ + igc_tsn_offload_apply(adapter); +} + static void igc_fpe_send_mpacket(struct ethtool_mmsv *mmsv, enum ethtool_mpacket type) { @@ -130,15 +149,59 @@ static void igc_fpe_send_mpacket(struct ethtool_mmsv *mmsv, } static const struct ethtool_mmsv_ops igc_mmsv_ops = { + .configure_tx = igc_fpe_configure_tx, .send_mpacket = igc_fpe_send_mpacket, }; void igc_fpe_init(struct igc_adapter *adapter) { adapter->fpe.tx_min_frag_size = TX_MIN_FRAG_SIZE; + adapter->fpe.tx_enabled = false; ethtool_mmsv_init(&adapter->fpe.mmsv, adapter->netdev, &igc_mmsv_ops); } +void igc_fpe_clear_preempt_queue(struct igc_adapter *adapter) +{ + for (int i = 0; i < adapter->num_tx_queues; i++) { + struct igc_ring *tx_ring = adapter->tx_ring[i]; + + tx_ring->preemptible = false; + } +} + +static u32 igc_fpe_map_preempt_tc_to_queue(const struct igc_adapter *adapter, + unsigned long preemptible_tcs) +{ + struct net_device *dev = adapter->netdev; + u32 i, queue = 0; + + for (i = 0; i < dev->num_tc; i++) { + u32 offset, count; + + if (!(preemptible_tcs & BIT(i))) + continue; + + offset = dev->tc_to_txq[i].offset; + count = dev->tc_to_txq[i].count; + queue |= GENMASK(offset + count - 1, offset); + } + + return queue; +} + +void igc_fpe_save_preempt_queue(struct igc_adapter *adapter, + const struct tc_mqprio_qopt_offload *mqprio) +{ + u32 preemptible_queue = igc_fpe_map_preempt_tc_to_queue(adapter, + mqprio->preemptible_tcs); + + for (int i = 0; i < adapter->num_tx_queues; i++) { + struct igc_ring *tx_ring = adapter->tx_ring[i]; + + tx_ring->preemptible = !!(preemptible_queue & BIT(i)); + } +} + static bool is_any_launchtime(struct igc_adapter *adapter) { int i; @@ -238,7 +301,7 @@ bool igc_tsn_is_taprio_activated_by_user(struct igc_adapter *adapter) adapter->taprio_offload_enable; } -static void igc_tsn_tx_arb(struct igc_adapter *adapter, u16 *queue_per_tc) +static void igc_tsn_tx_arb(struct igc_adapter *adapter, bool reverse_prio) { struct igc_hw *hw = &adapter->hw; u32 txarb; @@ -250,10 +313,17 @@ static void igc_tsn_tx_arb(struct igc_adapter *adapter, u16 *queue_per_tc) IGC_TXARB_TXQ_PRIO_2_MASK | IGC_TXARB_TXQ_PRIO_3_MASK); - txarb |= IGC_TXARB_TXQ_PRIO_0(queue_per_tc[3]); - txarb |= IGC_TXARB_TXQ_PRIO_1(queue_per_tc[2]); - txarb |= IGC_TXARB_TXQ_PRIO_2(queue_per_tc[1]); - txarb |= IGC_TXARB_TXQ_PRIO_3(queue_per_tc[0]); + if (reverse_prio) { + txarb |= IGC_TXARB_TXQ_PRIO_0(TX_QUEUE_3); + txarb |= IGC_TXARB_TXQ_PRIO_1(TX_QUEUE_2); + txarb |= IGC_TXARB_TXQ_PRIO_2(TX_QUEUE_1); + txarb |= IGC_TXARB_TXQ_PRIO_3(TX_QUEUE_0); + } else { + txarb |= IGC_TXARB_TXQ_PRIO_0(TX_QUEUE_0); + txarb |= IGC_TXARB_TXQ_PRIO_1(TX_QUEUE_1); + txarb |= IGC_TXARB_TXQ_PRIO_2(TX_QUEUE_2); + txarb |= IGC_TXARB_TXQ_PRIO_3(TX_QUEUE_3); + } wr32(IGC_TXARB, txarb); } @@ -286,7 +356,6 @@ static void igc_tsn_set_rxpbsize(struct igc_adapter *adapter, */ static int igc_tsn_disable_offload(struct igc_adapter *adapter) { - u16 queue_per_tc[4] = { 3, 2, 1, 0 }; struct igc_hw *hw = &adapter->hw; u32 tqavctrl; int i; @@ -308,9 +377,16 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter) wr32(IGC_TQAVCTRL, tqavctrl); for (i = 0; i < adapter->num_tx_queues; i++) { + int reg_idx = adapter->tx_ring[i]->reg_idx; + u32 txdctl; + wr32(IGC_TXQCTL(i), 0); wr32(IGC_STQT(i), 0); wr32(IGC_ENDQT(i), NSEC_PER_SEC); + + txdctl = rd32(IGC_TXDCTL(reg_idx)); + txdctl &= ~IGC_TXDCTL_PRIORITY_HIGH; + wr32(IGC_TXDCTL(reg_idx), txdctl); } wr32(IGC_QBVCYCLET_S, 0); @@ -319,7 +395,7 @@ static int igc_tsn_disable_offload(struct igc_adapter *adapter) /* Restore the default Tx arbitration: Priority 0 has the highest * priority and is assigned to queue 0 and so on and so forth. */ - igc_tsn_tx_arb(adapter, queue_per_tc); + igc_tsn_tx_arb(adapter, false); adapter->flags &= ~IGC_FLAG_TSN_QBV_ENABLED; @@ -355,7 +431,7 @@ static u8 igc_fpe_get_frag_size_mult(const struct igc_fpe_t *fpe) u32 igc_fpe_get_supported_frag_size(u32 frag_size) { - const u32 supported_sizes[] = {64, 128, 192, 256}; + static const u32 supported_sizes[] = { 64, 128, 192, 256 }; /* Find the smallest supported size that is >= frag_size */ for (int i = 0; i < ARRAY_SIZE(supported_sizes); i++) { @@ -385,15 +461,13 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter) if (igc_is_device_id_i226(hw)) igc_tsn_set_retx_qbvfullthreshold(adapter); - if (adapter->strict_priority_enable) { - /* Configure queue priorities according to the user provided - * mapping. - */ - igc_tsn_tx_arb(adapter, adapter->queue_per_tc); - } + if (adapter->strict_priority_enable || + adapter->flags & IGC_FLAG_TSN_REVERSE_TXQ_PRIO) + igc_tsn_tx_arb(adapter, true); for (i = 0; i < adapter->num_tx_queues; i++) { struct igc_ring *ring = adapter->tx_ring[i]; + u32 txdctl = rd32(IGC_TXDCTL(ring->reg_idx)); u32 txqctl = 0; u16 cbs_value; u32 tqavcc; @@ -427,6 +501,22 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter) if (ring->launchtime_enable) txqctl |= IGC_TXQCTL_QUEUE_MODE_LAUNCHT; + if (!adapter->fpe.tx_enabled) { + /* fpe inactive: clear both flags */ + txqctl &= ~IGC_TXQCTL_PREEMPTIBLE; + txdctl &= ~IGC_TXDCTL_PRIORITY_HIGH; + } else if (ring->preemptible) { + /* fpe active + preemptible: enable preemptible queue + set low priority */ + txqctl |= IGC_TXQCTL_PREEMPTIBLE; + txdctl &= ~IGC_TXDCTL_PRIORITY_HIGH; + } else { + /* fpe active + express: enable express queue + set high priority */ + txqctl &= ~IGC_TXQCTL_PREEMPTIBLE; + txdctl |= IGC_TXDCTL_PRIORITY_HIGH; + } + + wr32(IGC_TXDCTL(ring->reg_idx), txdctl); + /* Skip configuring CBS for Q2 and Q3 */ if (i > 1) goto skip_cbs; |