Age | Commit message (Collapse) | Author |
|
Remove commented out code.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Link: https://patch.msgid.link/20250211102057.587182-1-thorsten.blum@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Corrected spelling mistake
Signed-off-by: Ritvik Gupta <ritvikfoss@gmail.com>
Link: https://patch.msgid.link/20250212021311.13257-1-ritvikfoss@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
ioctl(SIOCDARP/SIOCSARP) operates on a single netns fetched from
an AF_INET socket in inet_ioctl().
Let's hold rtnl_net_lock() for SIOCDARP and SIOCSARP.
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250211045057.10419-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Marvell 88Q2XXX devices support up to two configurable Light Emitting
Diode (LED). Add minimal LED controller driver supporting the most common
uses with the 'netdev' trigger.
Reviewed-by: Stefan Eichenberger <eichest@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Dimitri Fedrau <dima.fedrau@gmail.com>
Link: https://patch.msgid.link/20250210-marvell-88q2xxx-leds-v4-1-3a0900dc121f@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Now for dwmac-loongson {tx,rx}_fifo_size are uninitialised, which means
zero. This means dwmac-loongson doesn't support changing MTU because in
stmmac_change_mtu() it requires the fifo size be no less than MTU. Thus,
set the correct tx_fifo_size and rx_fifo_size for it (16KB multiplied by
queue counts).
Here {tx,rx}_fifo_size is initialised with the initial value (also the
maximum value) of {tx,rx}_queues_to_use. So it will keep as 16KB if we
don't change the queue count, and will be larger than 16KB if we change
(decrease) the queue count. However stmmac_change_mtu() still work well
with current logic (MTU cannot be larger than 16KB for stmmac).
Note: the Fixes tag picked here is the oldest commit and key commit of
the dwmac-loongson series "stmmac: Add Loongson platform support".
Acked-by: Yanteng Si <si.yanteng@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Chong Qiao <qiaochong@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Link: https://patch.msgid.link/20250210134328.2755328-1-chenhuacai@loongson.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This adds support to dump the connection tracking table
("conntrack -L") and the conntrack statistics, ("conntrack -S").
Example conntrack dump:
tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/conntrack.yaml --dump get
[{'id': 59489769,
'mark': 0,
'nfgen-family': 2,
'protoinfo': {'protoinfo-tcp': {'tcp-flags-original': {'flags': {'maxack',
'sack-perm',
'window-scale'},
'mask': set()},
'tcp-flags-reply': {'flags': {'maxack',
'sack-perm',
'window-scale'},
'mask': set()},
'tcp-state': 'established',
'tcp-wscale-original': 7,
'tcp-wscale-reply': 8}},
'res-id': 0,
'secctx': {'secctx-name': 'system_u:object_r:unlabeled_t:s0'},
'status': {'assured',
'confirmed',
'dst-nat-done',
'seen-reply',
'src-nat-done'},
'timeout': 431949,
'tuple-orig': {'tuple-ip': {'ip-v4-dst': '34.107.243.93',
'ip-v4-src': '192.168.0.114'},
'tuple-proto': {'proto-dst-port': 443,
'proto-num': 6,
'proto-src-port': 37104}},
'tuple-reply': {'tuple-ip': {'ip-v4-dst': '192.168.0.114',
'ip-v4-src': '34.107.243.93'},
'tuple-proto': {'proto-dst-port': 37104,
'proto-num': 6,
'proto-src-port': 443}},
'use': 1,
'version': 0},
{'id': 3402229480,
Example stats dump:
tools/net/ynl/pyynl/cli.py --spec Documentation/netlink/specs/conntrack.yaml --dump get-stats
[{'chain-toolong': 0,
'clash-resolve': 3,
'drop': 0,
....
Changes since last iteration:
- Address comments from Donald Hunter, in particular, fixup "get" and
"get-stats" descriptions, the former operation supports both dump
and normal request (returns a single entry, if found), the latter
only supports dumps.
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://patch.msgid.link/20250210152159.41077-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
After commit 5d4cc87414c5 ("net: reorganize "struct sock" fields"),
the sk_tsflags field shares the same cacheline with sk_forward_alloc.
The UDP protocol does not acquire the sock lock in the RX path;
forward allocations are protected via the receive queue spinlock;
additionally udp_recvmsg() calls sock_recv_cmsgs() unconditionally
touching sk_tsflags on each packet reception.
Due to the above, under high packet rate traffic, when the BH and the
user-space process run on different CPUs, UDP packet reception
experiences a cache miss while accessing sk_tsflags.
The receive path doesn't strictly need to access the problematic field;
change sock_set_timestamping() to maintain the relevant information
in a newly allocated sk_flags bit, so that sock_recv_cmsgs() can
take decisions accessing the latter field only.
With this patch applied, on an AMD epic server with i40e NICs, I
measured a 10% performance improvement for small packets UDP flood
performance tests - possibly a larger delta could be observed with more
recent H/W.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/dbd18c8a1171549f8249ac5a8b30b1b5ec88a425.1739294057.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Donald Hunter says:
====================
netlink: specs: add a spec for nl80211 wiphy
Add a rudimentary YNL spec for nl80211 that includes get-wiphy and
get-interface, along with some required enhancements to YNL and the
netlink schemas.
Patch 1 is a minor cleanup to prepare for patch 2
Patches 2-4 are new features for YNL
Patches 5-7 are updates to ynl_gen_c
Patches 8-9 are schema updates for feature parity
Patch 10 is the new nl80211 spec
====================
Link: https://patch.msgid.link/20250211120127.84858-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a rudimentary YNL spec for nl80211 that covers get-wiphy,
get-interface and get-protocol-features.
./tools/net/ynl/pyynl/cli.py --family nl80211 \
--do get-protocol-features
{'protocol-features': {'split-wiphy-dump'}}
./tools/net/ynl/pyynl/cli.py --family nl80211 \
--dump get-wiphy --json '{ "split-wiphy-dump": true }'
./tools/net/ynl/pyynl/cli.py --family nl80211 \
--dump get-interface
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Acked-by: Johannes Berg <johannes@sipsolutions.net>
Link: https://patch.msgid.link/20250211120127.84858-11-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add s8 and s16 types to the genetlink schemas to align scalar types
across all schemas.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-10-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Nested structs are already supported in netlink-raw. Add the same
capability to the genetlink legacy schema.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-9-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Extend ynl-gen-c.py with support for indexed-array that has a scalar
sub-type.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-8-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Turn attribute names with leading digits into valid C names by
prepending an underscore, e.g. 5ghz -> _5ghz
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-7-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the missing s8 and s16 scalar types to the list of recognised
scalars in ynl-gen-c.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-6-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ynl tool uses display-hint to know when to format IP addresses in
printed output, but not to parse IP addresses from --json input. Add
support for parsing ipv4 and ipv6 strings.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-5-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The nl80211 family encodes the list of supported ciphers as a C array of
u32 values. Add support for translating arrays of scalars into strings
for enum names and display hints.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-4-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When decoding an indexed-array with a scalar subtype, it is currently
only possible to add a display-hint. Add support for decoding each value
as an enum.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-3-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
_decode_array_attr() uses variable subattrs in every branch when only
one branch decodes more than a single attribute.
Change the variable name to subattr in the branches that only decode a
single attribute so that the intent is more obvious.
Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20250211120127.84858-2-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Russell King says:
====================
net: dsa: add support for phylink managed EEE
This series adds support for phylink managed EEE to DSA, and converts
mt753x to make use of this feature.
Patch 1 implements a helper to indicate whether the MAC LPI operations
are populated (suggested by Vladimir)
Patch 2 makes the necessary changes to the core code - we retain calling
set_mac_eee(), but this method now becomes a way to merely validate the
arguments when using phylink managed EEE rather than performing any
configuration.
Patch 3 converts the mt7530 driver to use phylink managed EEE.
====================
Link: https://patch.msgid.link/Z6nWujbjxlkzK_3P@shell.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert mt7530 to use phylink managed EEE. When enabling EEE, we set
both PMCR_FORCE_EEE1G and PMCR_FORCE_EEE100 irrespective of the speed,
and clear them both when disabling.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thR9q-003vXI-Cp@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In order to allow DSA drivers to use phylink managed EEE, we need to
change the behaviour of the DSA's .set_eee() ethtool method.
Implementation of the DSA .set_mac_eee() method becomes optional with
phylink managed EEE as it is only used to validate the EEE parameters
supplied from userspace. The rest of the EEE state management should
be left to phylink.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thR9l-003vXC-9F@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Provide a helper to determine whether the MAC operations structure
implements the LPI operations, which will be used by both phylink and
DSA.
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/E1thR9g-003vX6-4s@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jakub Kicinski says:
====================
eth: fbnic: report software queue stats
Fill in typical software queue stats.
# ./pyynl/cli.py --spec netlink/specs/netdev.yaml --dump qstats-get
[{'ifindex': 2,
'rx-alloc-fail': 0,
'rx-bytes': 398064076,
'rx-csum-complete': 271,
'rx-csum-none': 0,
'rx-packets': 276044,
'tx-bytes': 7223770,
'tx-needs-csum': 28148,
'tx-packets': 28449,
'tx-stop': 0,
'tx-wake': 0}]
Note that we don't collect csum-unnecessary, just the uncommon
cases (and unnecessary is all the rest of the packets). There
is no programatic use for these stats AFAIK, just manual debug.
====================
Link: https://patch.msgid.link/20250211181356.580800-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Looks like recent commit broke the sort order, fix it.
Acked-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250211181356.580800-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Gather and report software Tx queue stats - checksum stats
and queue stop / start.
Acked-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250211181356.580800-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Gather and report software Rx queue stats - checksum stats
and allocation failures.
Acked-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250211181356.580800-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The queue stats struct is used for Rx and Tx queues. Wrap
the Tx stats in a struct and a union, so that we can reuse
the same space for Rx stats on Rx queues.
This also makes it easy to add an assert to the stat handling
code to catch new stats not being aggregated on shutdown.
Acked-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250211181356.580800-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 13c7c941e729 ("netdev: add qstat for csum complete") reserved
the entry for csum complete in the qstats uAPI. Start reporting this
value now that we have a driver which needs it.
Reviewed-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://patch.msgid.link/20250211181356.580800-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Oleksij Rempel says:
====================
Use PHYlib for reset randomization and adjustable polling
This patch set tackles a DP83TG720 reset lock issue and improves PHY
polling. Rather than adding a separate polling worker to randomize PHY
resets, I chose to extend the PHYlib framework - which already handles
most of the needed functionality - with adjustable polling. This
approach not only addresses the DP83TG720-specific problem (where
synchronized resets can lock the link) but also lays the groundwork for
optimizing PHY stats polling across all PHY drivers. With generic PHY
stats coming in, we can adjust the polling interval based on hardware
characteristics, such as using longer intervals for PHYs with stable HW
counters or shorter ones for high-speed links prone to counter
overflows.
Patch version changes are tracked in separate patches.
====================
Link: https://patch.msgid.link/20250210082358.200751-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Address the limitations of the DP83TG720 PHY, which cannot reliably
detect or report a stable link state. To handle this, the PHY must be
periodically reset when the link is down. However, synchronized reset
intervals between the PHY and its link partner can result in a deadlock,
preventing the link from re-establishing.
This change introduces a randomized polling interval when the link is
down to desynchronize resets between link partners.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250210082358.200751-3-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Introduce the `phy_get_next_update_time` function to allow PHY drivers
to dynamically determine the time (in jiffies) until the next state
update event. This enables more flexible and adaptive polling intervals
based on the link state or other conditions.
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20250210082358.200751-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Tariq Toukan says:
====================
mlx5 misc
Patches 1-3 by William reduce the memory consumption for representors to
achieve better scalability.
Patches 4-5 by Akiva expose ICM memory consumption per function.
Patches 6-8 expose helpful information on RSS resources in devlink RX
reporter diagnose.
Patches 9-10 are simple enhancements by Alex Lazar.
====================
Link: https://patch.msgid.link/20250209101716.112774-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In XDP scenarios, fragmented packets can occur if the MTU is larger
than the page size, even when the packet size fits within the linear
part.
If XDP multi-buffer support is disabled, the fragmented part won't be
handled in the TX flow, leading to packet drops.
Since XDP multi-buffer support is always available, this commit removes
the conditional check for enabling it.
This ensures that XDP multi-buffer support is always enabled,
regardless of the `is_xdp_mb` parameter, and guarantees the handling of
fragmented packets in such scenarios.
Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-16-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Current loopback test validation ignores non-linear SKB case in
the SKB access, which can lead to failures in scenarios such as
when HW GRO is enabled.
Linearize the SKB so both cases will be handled.
Signed-off-by: Alexei Lazar <alazar@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-15-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Underneath "rx resources" tag expose RSS diagnostic information. For
each RSS expose its rqtn, TIRs and inner TIRs.
$ devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter rx
.......
RSS:
Index: 0 rqtn: 0
TIRs Numbers:
tt: TT_IPV4_TCP tirn: 0
tt: TT_IPV6_TCP tirn: 1
tt: TT_IPV4_UDP tirn: 2
tt: TT_IPV6_UDP tirn: 3
tt: TT_IPV4_IPSEC_AH tirn: 4
tt: TT_IPV6_IPSEC_AH tirn: 5
tt: TT_IPV4_IPSEC_ESP tirn: 6
tt: TT_IPV6_IPSEC_ESP tirn: 7
tt: TT_IPV4 tirn: 8
tt: TT_IPV6 tirn: 9
Inner TIRs Numbers:
tt: TT_IPV4_TCP tirn: 10
tt: TT_IPV6_TCP tirn: 11
tt: TT_IPV4_UDP tirn: 12
tt: TT_IPV6_UDP tirn: 13
tt: TT_IPV4_IPSEC_AH tirn: 14
tt: TT_IPV6_IPSEC_AH tirn: 15
tt: TT_IPV4_IPSEC_ESP tirn: 16
tt: TT_IPV6_IPSEC_ESP tirn: 17
tt: TT_IPV4 tirn: 18
tt: TT_IPV6 tirn: 19
Index: 2 rqtn: 27
TIRs Numbers:
tt: TT_IPV6_TCP tirn: 46
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-14-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add "RX resources" tag to the output of rx reporter diagnose callback.
Underneath add tag for direct TIRs, for each TIR expose its tirn and
the corresponding rqtn.
$ sudo devlink health diagnose auxiliary/mlx5_core.eth.0/65535 reporter rx
....
rx resources:
Direct TIRs:
ix: 0 tirn: 20 rqtn: 1
ix: 1 tirn: 21 rqtn: 2
ix: 2 tirn: 22 rqtn: 3
ix: 3 tirn: 23 rqtn: 4
ix: 4 tirn: 24 rqtn: 5
ix: 5 tirn: 25 rqtn: 6
ix: 6 tirn: 26 rqtn: 7
ix: 7 tirn: 27 rqtn: 8
ix: 8 tirn: 28 rqtn: 9
ix: 9 tirn: 29 rqtn: 10
ix: 10 tirn: 30 rqtn: 11
ix: 11 tirn: 31 rqtn: 12
ix: 12 tirn: 32 rqtn: 13
ix: 13 tirn: 33 rqtn: 14
ix: 14 tirn: 34 rqtn: 15
ix: 15 tirn: 35 rqtn: 16
ix: 16 tirn: 36 rqtn: 17
ix: 17 tirn: 37 rqtn: 18
ix: 18 tirn: 38 rqtn: 19
ix: 19 tirn: 39 rqtn: 20
ix: 20 tirn: 40 rqtn: 21
ix: 21 tirn: 41 rqtn: 22
ix: 22 tirn: 42 rqtn: 23
ix: 23 tirn: 43 rqtn: 24
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-13-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move rx reporter RQs diagnose from mlx5e_rx_reporter_diagnose() to a
dedicated function. This change is a preparation for the following
series which extends diagnose output for the rx reporter. While at it,
also pass a mlx5e_priv pointer to
mlx5e_rx_reporter_diagnose_common_config() as this is the argument the
latter actually needs.
Signed-off-by: Amir Tzin <amirtz@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-12-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ICM is a portion of the host's memory assigned to a function by the OS
through requests made by the NIC's firmware.
PF ICM consumption can be accessed directly, while VF/SF ICM consumption
can be accessed through their representors in switchdev mode.
The value is exposed to the user in granularity of 4KB through the vnic
health reporter as follows:
$ devlink health diagnose pci/0000:08:00.0 reporter vnic
vNIC env counters:
total_error_queues: 0 send_queue_priority_update_flow: 0
comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
invalid_command: 0 quota_exceeded_command: 0
nic_receive_steering_discard: 0 icm_consumption: 1032
Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-11-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rename mlx5_esw_query_vport_vhca_id to mlx5_vport_get_vhca_id and move
it to vport file. Also, add function declaration to mlx5_core header
file. This better represents the function's usage and allows for it to
be called from other parts of the mlx5_core driver.
Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-10-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
By default, the mq netdev creates a pfifo_fast qdisc. On a
system with 16 core, the pfifo_fast with 3 bands consumes
16 * 3 * 8 (size of pointer) * 1024 (default tx queue len)
= 393KB. The patch sets the tx qlen to representor default
value, 128 (1<<MLX5E_REP_PARAMS_DEF_LOG_SQ_SIZE), which
consumes 16 * 3 * 8 * 128 = 49KB, saving 344KB for each
representor at ECPF.
Signed-off-by: William Tu <witu@nvidia.com>
Reviewed-by: Daniel Jurgens <danielj@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250209101716.112774-9-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
By experiments, a single queue representor netdev consumes kernel
memory around 2.8MB, and 1.8MB out of the 2.8MB is due to page
pool for the RXQ. Scaling to a thousand representors consumes 2.8GB,
which becomes a memory pressure issue for embedded devices such as
BlueField-2 16GB / BlueField-3 32GB memory.
Since representor netdevs mostly handles miss traffic, and ideally,
most of the traffic will be offloaded, reduce the default non-uplink
rep netdev's RXQ default depth from 1024 to 256 if mdev is ecpf eswitch
manager. This saves around 1MB of memory per regular RQ,
(1024 - 256) * 2KB, allocated from page pool.
With rxq depth of 256, the netlink page pool tool reports
$./tools/net/ynl/cli.py --spec Documentation/netlink/specs/netdev.yaml \
--dump page-pool-get
{'id': 277,
'ifindex': 9,
'inflight': 128,
'inflight-mem': 786432,
'napi-id': 775}]
This is due to mtu 1500 + headroom consumes half pages, so 256 rxq
entries consumes around 128 pages (thus create a page pool with
size 128), shown above at inflight.
Note that each netdev has multiple types of RQs, including
Regular RQ, XSK, PTP, Drop, Trap RQ. Since non-uplink representor
only supports regular rq, this patch only changes the regular RQ's
default depth.
Signed-off-by: William Tu <witu@nvidia.com>
Reviewed-by: Bodong Wang <bodong@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250209101716.112774-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For the ECPF and representors, reduce the max MPWRQ size from 256KB (18)
to 128KB (17). This prepares the later patch for saving representor
memory.
With Striding RQ, there is a minimum of 4 MPWQEs. So with 128KB of max
MPWRQ size, the minimal memory is 4 * 128KB = 512KB. When creating page
pool, consider 1500 mtu, the minimal page pool size will be 512KB/4KB =
128 pages = 256 rx ring entries (2 entries per page).
Before this patch, setting RX ringsize (ethtool -G rx) to 256 causes
driver to allocate page pool size more than it needs due to max MPWRQ
is 256KB (18). Ex: 4 * 256KB = 1MB, 1MB/4KB = 256 pages, but actually
128 pages is good enough. Reducing the max MPWRQ to 128KB fixes the
limitation.
Signed-off-by: William Tu <witu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Link: https://patch.msgid.link/20250209101716.112774-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-02-10 (ice, igc, e1000e)
For ice:
Karol, Jake, and Michal add PTP support for E830 devices. Karol
refactors and cleans up PTP code. Jake allows for a common
cross-timestamp implementation to be shared for all devices and
Michal adds E830 support.
Mateusz cleans up initial Flow Director rule creation to loop rather
than duplicate repeated similar calls.
For igc:
Siang adjust calls to remove need for close and open calls on loading
XDP program.
For e1000e:
Gerhard Engleder batches register writes for writing multicast table
on real-time kernels.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
e1000e: Fix real-time violations on link up
igc: Avoid unnecessary link down event in XDP_SETUP_PROG process
ice: refactor ice_fdir_create_dflt_rules() function
ice: Implement PTP support for E830 devices
ice: Refactor ice_ptp_init_tx_*
ice: Add unified ice_capture_crosststamp
ice: Process TSYN IRQ in a separate function
ice: Use FIELD_PREP for timestamp values
ice: Remove unnecessary ice_is_e8xx() functions
ice: Don't check device type when checking GNSS presence
====================
Link: https://patch.msgid.link/20250210192352.3799673-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Edward Cree says:
====================
sfc: support devlink flash
Allow upgrading device firmware on Solarflare NICs through standard tools.
====================
Link: https://patch.msgid.link/cover.1739186252.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update the information in sfc's devlink documentation including
support for firmware update with devlink flash.
Also update the help text for CONFIG_SFC_MTD, as it is no longer
strictly required for firmware updates.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/3476b0ef04a0944f03e0b771ec8ed1a9c70db4dc.1739186253.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use MC_CMD_NVRAM_* wrappers to write the firmware to the partition
identified from the image header.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/a746335876b621b3e54cf4e49948148e349a1745.1739186253.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Support variable write-alignment, and background updates. The latter
allows other MCDI to continue while the device is processing an
MC_CMD_NVRAM_UPDATE_FINISH, since this can take a long time owing to
e.g. cryptographic signature verification.
Expose these handlers in mcdi.h, and build them even when
CONFIG_SFC_MTD=n, so they can be used for devlink flash in a
subsequent patch.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/de3d9e14fee69e15d95b46258401a93b75659f78.1739186253.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This parsing is necessary to obtain the metadata which will be
used in a subsequent patch to write the image to the device.
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/65318300f3f1b1462925f917f7c0d0ac833955ae.1739186253.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Huisong Li says:
====================
Use HWMON_CHANNEL_INFO macro to simplify code
The HWMON_CHANNEL_INFO macro is provided by hwmon.h and used widely by many
other drivers. This series use HWMON_CHANNEL_INFO macro to simplify code
in net subsystem.
Note: These patches do not depend on each other. Put them togeter just for
belonging to the same subsystem.
====================
Link: https://patch.msgid.link/20250210054710.12855-1-lihuisong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use HWMON_CHANNEL_INFO macro to simplify code.
Signed-off-by: Huisong Li <lihuisong@huawei.com>
Link: https://patch.msgid.link/20250210054710.12855-6-lihuisong@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|