diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2025-10-02 15:17:01 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2025-10-02 15:17:01 -0700 |
commit | 07fdad3a93756b872da7b53647715c48d0f4a2d0 (patch) | |
tree | 133af559ac91e6b24358b57a025abc060a782129 /drivers/net/ethernet/intel/ice/ice_common.c | |
parent | f79e772258df311c2cb21594ca0996318e720d28 (diff) | |
parent | f1455695d2d99894b65db233877acac9a0e120b9 (diff) |
Merge tag 'net-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Paolo Abeni:
"Core & protocols:
- Improve drop account scalability on NUMA hosts for RAW and UDP
sockets and the backlog, almost doubling the Pps capacity under DoS
- Optimize the UDP RX performance under stress, reducing contention,
revisiting the binary layout of the involved data structs and
implementing NUMA-aware locking. This improves UDP RX performance
by an additional 50%, even more under extreme conditions
- Add support for PSP encryption of TCP connections; this mechanism
has some similarities with IPsec and TLS, but offers superior HW
offloads capabilities
- Ongoing work to support Accurate ECN for TCP. AccECN allows more
than one congestion notification signal per RTT and is a building
block for Low Latency, Low Loss, and Scalable Throughput (L4S)
- Reorganize the TCP socket binary layout for data locality, reducing
the number of touched cachelines in the fastpath
- Refactor skb deferral free to better scale on large multi-NUMA
hosts, this improves TCP and UDP RX performances significantly on
such HW
- Increase the default socket memory buffer limits from 256K to 4M to
better fit modern link speeds
- Improve handling of setups with a large number of nexthop, making
dump operating scaling linearly and avoiding unneeded
synchronize_rcu() on delete
- Improve bridge handling of VLAN FDB, storing a single entry per
bridge instead of one entry per port; this makes the dump order of
magnitude faster on large switches
- Restore IP ID correctly for encapsulated packets at GSO
segmentation time, allowing GRO to merge packets in more scenarios
- Improve netfilter matching performance on large sets
- Improve MPTCP receive path performance by leveraging recently
introduced core infrastructure (skb deferral free) and adopting
recent TCP autotuning changes
- Allow bridges to redirect to a backup port when the bridge port is
administratively down
- Introduce MPTCP 'laminar' endpoint that con be used only once per
connection and simplify common MPTCP setups
- Add RCU safety to dst->dev, closing a lot of possible races
- A significant crypto library API for SCTP, MPTCP and IPv6 SR,
reducing code duplication
- Supports pulling data from an skb frag into the linear area of an
XDP buffer
Things we sprinkled into general kernel code:
- Generate netlink documentation from YAML using an integrated YAML
parser
Driver API:
- Support using IPv6 Flow Label in Rx hash computation and RSS queue
selection
- Introduce API for fetching the DMA device for a given queue,
allowing TCP zerocopy RX on more H/W setups
- Make XDP helpers compatible with unreadable memory, allowing more
easily building DevMem-enabled drivers with a unified XDP/skbs
datapath
- Add a new dedicated ethtool callback enabling drivers to provide
the number of RX rings directly, improving efficiency and clarity
in RX ring queries and RSS configuration
- Introduce a burst period for the health reporter, allowing better
handling of multiple errors due to the same root cause
- Support for DPLL phase offset exponential moving average,
controlling the average smoothing factor
Device drivers:
- Add a new Huawei driver for 3rd gen NIC (hinic3)
- Add a new SpacemiT driver for K1 ethernet MAC
- Add a generic abstraction for shared memory communication
devices (dibps)
- Ethernet high-speed NICs:
- nVidia/Mellanox:
- Use multiple per-queue doorbell, to avoid MMIO contention
issues
- support adjacent functions, allowing them to delegate their
SR-IOV VFs to sibling PFs
- support RSS for IPSec offload
- support exposing raw cycle counters in PTP and mlx5
- support for disabling host PFs.
- Intel (100G, ice, idpf):
- ice: support for SRIOV VFs over an Active-Active link
aggregate
- ice: support for firmware logging via debugfs
- ice: support for Earliest TxTime First (ETF) hardware offload
- idpf: support basic XDP functionalities and XSk
- Broadcom (bnxt):
- support Hyper-V VF ID
- dynamic SRIOV resource allocations for RoCE
- Meta (fbnic):
- support queue API, zero-copy Rx and Tx
- support basic XDP functionalities
- devlink health support for FW crashes and OTP mem corruptions
- expand hardware stats coverage to FEC, PHY, and Pause
- Wangxun:
- support ethtool coalesce options
- support for multiple RSS contexts
- Ethernet virtual:
- Macsec:
- replace custom netlink attribute checks with policy-level
checks
- Bonding:
- support aggregator selection based on port priority
- Microsoft vNIC:
- use page pool fragments for RX buffers instead of full pages
to improve memory efficiency
- Ethernet NICs consumer, and embedded:
- Qualcomm: support Ethernet function for IPQ9574 SoC
- Airoha: implement wlan offloading via NPU
- Freescale
- enetc: add NETC timer PTP driver and add PTP support
- fec: enable the Jumbo frame support for i.MX8QM
- Renesas (R-Car S4):
- support HW offloading for layer 2 switching
- support for RZ/{T2H, N2H} SoCs
- Cadence (macb): support TAPRIO traffic scheduling
- TI:
- support for Gigabit ICSS ethernet SoC (icssm-prueth)
- Synopsys (stmmac): a lot of cleanups
- Ethernet PHYs:
- Support 10g-qxgmi phy-mode for AQR412C, Felix DSA and Lynx PCS
driver
- Support bcm63268 GPHY power control
- Support for Micrel lan8842 PHY and PTP
- Support for Aquantia AQR412 and AQR115
- CAN:
- a large CAN-XL preparation work
- reorganize raw_sock and uniqframe struct to minimize memory
usage
- rcar_canfd: update the CAN-FD handling
- WiFi:
- extended Neighbor Awareness Networking (NAN) support
- S1G channel representation cleanup
- improve S1G support
- WiFi drivers:
- Intel (iwlwifi):
- major refactor and cleanup
- Broadcom (brcm80211):
- support for AP isolation
- RealTek (rtw88/89) rtw88/89:
- preparation work for RTL8922DE support
- MediaTek (mt76):
- HW restart improvements
- MLO support
- Qualcomm/Atheros (ath10k):
- GTK rekey fixes
- Bluetooth drivers:
- btusb: support for several new IDs for MT7925
- btintel: support for BlazarIW core
- btintel_pcie: support for _suspend() / _resume()
- btintel_pcie: support for Scorpious, Panther Lake-H484 IDs"
* tag 'net-next-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1536 commits)
net: stmmac: Add support for Allwinner A523 GMAC200
dt-bindings: net: sun8i-emac: Add A523 GMAC200 compatible
Revert "Documentation: net: add flow control guide and document ethtool API"
octeontx2-pf: fix bitmap leak
octeontx2-vf: fix bitmap leak
net/mlx5e: Use extack in set rxfh callback
net/mlx5e: Introduce mlx5e_rss_params for RSS configuration
net/mlx5e: Introduce mlx5e_rss_init_params
net/mlx5e: Remove unused mdev param from RSS indir init
net/mlx5: Improve QoS error messages with actual depth values
net/mlx5e: Prevent entering switchdev mode with inconsistent netns
net/mlx5: HWS, Generalize complex matchers
net/mlx5: Improve write-combining test reliability for ARM64 Grace CPUs
selftests/net: add tcp_port_share to .gitignore
Revert "net/mlx5e: Update and set Xon/Xoff upon MTU set"
net: add NUMA awareness to skb_attempt_defer_free()
net: use llist for sd->defer_list
net: make softnet_data.defer_count an atomic
selftests: drv-net: psp: add tests for destroying devices
selftests: drv-net: psp: add test for auto-adjusting TCP MSS
...
Diffstat (limited to 'drivers/net/ethernet/intel/ice/ice_common.c')
-rw-r--r-- | drivers/net/ethernet/intel/ice/ice_common.c | 143 |
1 files changed, 133 insertions, 10 deletions
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index 003d60a4db21..2250426ec91b 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -984,6 +984,37 @@ static int ice_wait_for_fw(struct ice_hw *hw, u32 timeout) return -ETIMEDOUT; } +static int __fwlog_send_cmd(void *priv, struct libie_aq_desc *desc, void *buf, + u16 size) +{ + struct ice_hw *hw = priv; + + return ice_aq_send_cmd(hw, desc, buf, size, NULL); +} + +static int __fwlog_init(struct ice_hw *hw) +{ + struct ice_pf *pf = hw->back; + struct libie_fwlog_api api = { + .pdev = pf->pdev, + .send_cmd = __fwlog_send_cmd, + .priv = hw, + }; + int err; + + /* only support fw log commands on PF 0 */ + if (hw->bus.func) + return -EINVAL; + + err = ice_debugfs_pf_init(pf); + if (err) + return err; + + api.debugfs_root = pf->ice_debugfs_pf; + + return libie_fwlog_init(&hw->fwlog, &api); +} + /** * ice_init_hw - main hardware initialization routine * @hw: pointer to the hardware structure @@ -1012,7 +1043,7 @@ int ice_init_hw(struct ice_hw *hw) if (status) goto err_unroll_cqinit; - status = ice_fwlog_init(hw); + status = __fwlog_init(hw); if (status) ice_debug(hw, ICE_DBG_FW_LOG, "Error initializing FW logging: %d\n", status); @@ -1159,6 +1190,16 @@ err_unroll_cqinit: return status; } +static void __fwlog_deinit(struct ice_hw *hw) +{ + /* only support fw log commands on PF 0 */ + if (hw->bus.func) + return; + + ice_debugfs_pf_deinit(hw->back); + libie_fwlog_deinit(&hw->fwlog); +} + /** * ice_deinit_hw - unroll initialization operations done by ice_init_hw * @hw: pointer to the hardware structure @@ -1177,8 +1218,7 @@ void ice_deinit_hw(struct ice_hw *hw) ice_free_seg(hw); ice_free_hw_tbls(hw); mutex_destroy(&hw->tnl_lock); - - ice_fwlog_deinit(hw); + __fwlog_deinit(hw); ice_destroy_all_ctrlq(hw); /* Clear VSI contexts if not already cleared */ @@ -1693,6 +1733,44 @@ int ice_write_txq_ctx(struct ice_hw *hw, struct ice_tlan_ctx *tlan_ctx, return 0; } +/* Tx time Queue Context */ +static const struct packed_field_u8 ice_txtime_ctx_fields[] = { + /* Field Width LSB */ + ICE_CTX_STORE(ice_txtime_ctx, base, 57, 0), + ICE_CTX_STORE(ice_txtime_ctx, pf_num, 3, 57), + ICE_CTX_STORE(ice_txtime_ctx, vmvf_num, 10, 60), + ICE_CTX_STORE(ice_txtime_ctx, vmvf_type, 2, 70), + ICE_CTX_STORE(ice_txtime_ctx, src_vsi, 10, 72), + ICE_CTX_STORE(ice_txtime_ctx, cpuid, 8, 82), + ICE_CTX_STORE(ice_txtime_ctx, tphrd_desc, 1, 90), + ICE_CTX_STORE(ice_txtime_ctx, qlen, 13, 91), + ICE_CTX_STORE(ice_txtime_ctx, timer_num, 1, 104), + ICE_CTX_STORE(ice_txtime_ctx, txtime_ena_q, 1, 105), + ICE_CTX_STORE(ice_txtime_ctx, drbell_mode_32, 1, 106), + ICE_CTX_STORE(ice_txtime_ctx, ts_res, 4, 107), + ICE_CTX_STORE(ice_txtime_ctx, ts_round_type, 2, 111), + ICE_CTX_STORE(ice_txtime_ctx, ts_pacing_slot, 3, 113), + ICE_CTX_STORE(ice_txtime_ctx, merging_ena, 1, 116), + ICE_CTX_STORE(ice_txtime_ctx, ts_fetch_prof_id, 4, 117), + ICE_CTX_STORE(ice_txtime_ctx, ts_fetch_cache_line_aln_thld, 4, 121), + ICE_CTX_STORE(ice_txtime_ctx, tx_pipe_delay_mode, 1, 125), +}; + +/** + * ice_pack_txtime_ctx - pack Tx time queue context into a HW buffer + * @ctx: the Tx time queue context to pack + * @buf: the HW buffer to pack into + * + * Pack the Tx time queue context from the CPU-friendly unpacked buffer into + * its bit-packed HW layout. + */ +void ice_pack_txtime_ctx(const struct ice_txtime_ctx *ctx, + ice_txtime_ctx_buf_t *buf) +{ + pack_fields(buf, sizeof(*buf), ctx, ice_txtime_ctx_fields, + QUIRK_LITTLE_ENDIAN | QUIRK_LSW32_IS_FIRST); +} + /* Sideband Queue command wrappers */ /** @@ -2418,12 +2496,15 @@ ice_parse_common_caps(struct ice_hw *hw, struct ice_hw_common_caps *caps, caps->reset_restrict_support); break; case LIBIE_AQC_CAPS_FW_LAG_SUPPORT: - caps->roce_lag = !!(number & LIBIE_AQC_BIT_ROCEV2_LAG); + caps->roce_lag = number & LIBIE_AQC_BIT_ROCEV2_LAG; ice_debug(hw, ICE_DBG_INIT, "%s: roce_lag = %u\n", prefix, caps->roce_lag); - caps->sriov_lag = !!(number & LIBIE_AQC_BIT_SRIOV_LAG); + caps->sriov_lag = number & LIBIE_AQC_BIT_SRIOV_LAG; ice_debug(hw, ICE_DBG_INIT, "%s: sriov_lag = %u\n", prefix, caps->sriov_lag); + caps->sriov_aa_lag = number & LIBIE_AQC_BIT_SRIOV_AA_LAG; + ice_debug(hw, ICE_DBG_INIT, "%s: sriov_aa_lag = %u\n", + prefix, caps->sriov_aa_lag); break; case LIBIE_AQC_CAPS_TX_SCHED_TOPO_COMP_MODE: caps->tx_sched_topo_comp_mode_en = (number == 1); @@ -4712,24 +4793,24 @@ do_aq: } /** - * ice_aq_cfg_lan_txq + * ice_aq_cfg_lan_txq - send AQ command 0x0C32 to FW * @hw: pointer to the hardware structure * @buf: buffer for command * @buf_size: size of buffer in bytes * @num_qs: number of queues being configured * @oldport: origination lport * @newport: destination lport + * @mode: cmd_type for move to use * @cd: pointer to command details structure or NULL * * Move/Configure LAN Tx queue (0x0C32) * - * There is a better AQ command to use for moving nodes, so only coding - * this one for configuring the node. + * Return: Zero on success, associated error code on failure. */ int ice_aq_cfg_lan_txq(struct ice_hw *hw, struct ice_aqc_cfg_txqs_buf *buf, u16 buf_size, u16 num_qs, u8 oldport, u8 newport, - struct ice_sq_cd *cd) + u8 mode, struct ice_sq_cd *cd) { struct ice_aqc_cfg_txqs *cmd; struct libie_aq_desc desc; @@ -4742,10 +4823,12 @@ ice_aq_cfg_lan_txq(struct ice_hw *hw, struct ice_aqc_cfg_txqs_buf *buf, if (!buf) return -EINVAL; - cmd->cmd_type = ICE_AQC_Q_CFG_TC_CHNG; + cmd->cmd_type = mode; cmd->num_qs = num_qs; cmd->port_num_chng = (oldport & ICE_AQC_Q_CFG_SRC_PRT_M); cmd->port_num_chng |= FIELD_PREP(ICE_AQC_Q_CFG_DST_PRT_M, newport); + cmd->port_num_chng |= FIELD_PREP(ICE_AQC_Q_CFG_MODE_M, + ICE_AQC_Q_CFG_MODE_KEEP_OWN); cmd->time_out = FIELD_PREP(ICE_AQC_Q_CFG_TIMEOUT_M, 5); cmd->blocked_cgds = 0; @@ -4801,6 +4884,46 @@ ice_aq_add_rdma_qsets(struct ice_hw *hw, u8 num_qset_grps, return ice_aq_send_cmd(hw, &desc, qset_list, buf_size, cd); } +/** + * ice_aq_set_txtimeq - set Tx time queues + * @hw: pointer to the hardware structure + * @txtimeq: first Tx time queue id to configure + * @q_count: number of queues to configure + * @txtime_qg: queue group to be set + * @buf_size: size of buffer for indirect command + * @cd: pointer to command details structure or NULL + * + * Set Tx Time queue (0x0C35) + * Return: 0 on success or negative value on failure. + */ +int +ice_aq_set_txtimeq(struct ice_hw *hw, u16 txtimeq, u8 q_count, + struct ice_aqc_set_txtime_qgrp *txtime_qg, u16 buf_size, + struct ice_sq_cd *cd) +{ + struct ice_aqc_set_txtimeqs *cmd; + struct libie_aq_desc desc; + u16 size; + + if (!txtime_qg || txtimeq > ICE_TXTIME_MAX_QUEUE || + q_count < 1 || q_count > ICE_SET_TXTIME_MAX_Q_AMOUNT) + return -EINVAL; + + size = struct_size(txtime_qg, txtimeqs, q_count); + if (buf_size != size) + return -EINVAL; + + cmd = libie_aq_raw(&desc); + + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_set_txtimeqs); + + desc.flags |= cpu_to_le16(LIBIE_AQ_FLAG_RD); + + cmd->q_id = cpu_to_le16(txtimeq); + cmd->q_amount = cpu_to_le16(q_count); + return ice_aq_send_cmd(hw, &desc, txtime_qg, buf_size, cd); +} + /* End of FW Admin Queue command wrappers */ /** |