Age | Commit message (Collapse) | Author |
|
'virtio_net-vdpa-update-mac-address-when-it-is-generated-by-virtio-net'
Laurent Vivier says:
====================
virtio_net: vdpa: update MAC address when it is generated by virtio-net
When the MAC address is not provided by the vdpa device virtio_net
driver assigns a random one without notifying the device.
The consequence, in the case of mlx5_vdpa, is the internal routing
tables of the device are not updated and this can block the
communication between two namespaces.
To fix this problem, use virtnet_send_command(VIRTIO_NET_CTRL_MAC)
to set the address from virtnet_probe() when the MAC address is
not provided by the device.
====================
Link: https://lore.kernel.org/r/20230127204500.51930-1-lvivier@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In virtnet_probe(), if the device doesn't provide a MAC address the
driver assigns a random one.
As we modify the MAC address we need to notify the device to allow it
to update all the related information.
The problem can be seen with vDPA and mlx5_vdpa driver as it doesn't
assign a MAC address by default. The virtio_net device uses a random
MAC address (we can see it with "ip link"), but we can't ping a net
namespace from another one using the virtio-vdpa device because the
new MAC address has not been provided to the hardware:
RX packets are dropped since they don't go through the receive filters,
TX packets go through unaffected.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
failover relies on the MAC address to pair the primary and the standby
devices:
"[...] the hypervisor needs to enable VIRTIO_NET_F_STANDBY
feature on the virtio-net interface and assign the same MAC address
to both virtio-net and VF interfaces."
Documentation/networking/net_failover.rst
This patch disables the STANDBY feature if the MAC address is not
provided by the hypervisor.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch corrects two oversights relating to releasing resources
and DCB initialisation.
1. If mapping of the dcbcfg_tbl area fails: an error should be
propagated, allowing partial initialisation (probe) to be unwound.
2. Conversely, if where dcbcfg_tbl is successfully mapped: it should
be unmapped in nfp_nic_dcb_clean() which is called via various error
cleanup paths, and shutdown or removal of the PCIE device.
Fixes: 9b7fe8046d74 ("nfp: add DCB IEEE support")
Signed-off-by: Huayu Chen <huayu.chen@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/20230131163033.981937-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A mutex may sleep, which is not permitted in atomic context.
Avoid a case where this may arise by moving the to
nfp_flower_lag_get_info_from_netdev() in nfp_tun_write_neigh() spinlock.
Fixes: abc210952af7 ("nfp: flower: tunnel neigh support bond offload")
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Yanguo Li <yanguo.li@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230131080313.2076060-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Swap is a function interface that provides exchange function. To avoid
code duplication, we can use swap function.
./net/ipv6/icmp.c:344:25-26: WARNING opportunity for swap().
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3896
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20230131063456.76302-1-jiapeng.chong@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
'ip-ip6_gre-fix-gre-tunnels-not-generating-ipv6-link-local-addresses'
Thomas Winter says:
====================
ip/ip6_gre: Fix GRE tunnels not generating IPv6 link local addresses
For our point-to-point GRE tunnels, they have IN6_ADDR_GEN_MODE_NONE
when they are created then we set IN6_ADDR_GEN_MODE_EUI64 when they
come up to generate the IPv6 link local address for the interface.
Recently we found that they were no longer generating IPv6 addresses.
Also, non-point-to-point tunnels were not generating any IPv6 link
local address and instead generating an IPv6 compat address,
breaking IPv6 communication on the tunnel.
These failures were caused by commit e5dd729460ca and this patch set
aims to resolve these issues.
====================
Link: https://lore.kernel.org/r/20230131034646.237671-1-Thomas.Winter@alliedtelesis.co.nz
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We recently found that our non-point-to-point tunnels were not
generating any IPv6 link local address and instead generating an
IPv6 compat address, breaking IPv6 communication on the tunnel.
Previously, addrconf_gre_config always would call addrconf_addr_gen
and generate a EUI64 link local address for the tunnel.
Then commit e5dd729460ca changed the code path so that add_v4_addrs
is called but this only generates a compat IPv6 address for
non-point-to-point tunnels.
I assume the compat address is specifically for SIT tunnels so
have kept that only for SIT - GRE tunnels now always generate link
local addresses.
Fixes: e5dd729460ca ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address")
Signed-off-by: Thomas Winter <Thomas.Winter@alliedtelesis.co.nz>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For our point-to-point GRE tunnels, they have IN6_ADDR_GEN_MODE_NONE
when they are created then we set IN6_ADDR_GEN_MODE_EUI64 when they
come up to generate the IPv6 link local address for the interface.
Recently we found that they were no longer generating IPv6 addresses.
This issue would also have affected SIT tunnels.
Commit e5dd729460ca changed the code path so that GRE tunnels
generate an IPv6 address based on the tunnel source address.
It also changed the code path so GRE tunnels don't call addrconf_addr_gen
in addrconf_dev_config which is called by addrconf_sysctl_addr_gen_mode
when the IN6_ADDR_GEN_MODE is changed.
This patch aims to fix this issue by moving the code in addrconf_notify
which calls the addr gen for GRE and SIT into a separate function
and calling it in the places that expect the IPv6 address to be
generated.
The previous addrconf_dev_config is renamed to addrconf_eth_config
since it only expected eth type interfaces and follows the
addrconf_gre/sit_config format.
A part of this changes means that the loopback address will be
attempted to be configured when changing addr_gen_mode for lo.
This should not be a problem because the address should exist anyway
and if does already exist then no error is produced.
Fixes: e5dd729460ca ("ip/ip6_gre: use the same logic as SIT interfaces when computing v6LL address")
Signed-off-by: Thomas Winter <Thomas.Winter@alliedtelesis.co.nz>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jiri Pirko says:
====================
devlink: trivial names cleanup
This is a follow-up to Jakub's devlink code split and dump iteration
helper patchset. No functional changes, just couple of renames to makes
things consistent and perhaps easier to follow.
====================
Link: https://lore.kernel.org/r/20230131090613.2131740-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In order to maintain naming consistency, rename and reorder all usages
of struct struct devlink_cmd in the following way:
1) Remove "gen" and replace it with "cmd" to match the struct name
2) Order devl_cmds[] and the header file to match the order
of enum devlink_command
3) Move devl_cmd_rate_get among the peers
4) Remove "inst" for DEVLINK_CMD_GET
5) Add "_get" suffix to all to match DEVLINK_CMD_*_GET (only rate had it
done correctly)
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
No need to have "gen" inside name of the structure for devlink commands.
Remove it.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To have the name of the function consistent with the struct cb name,
rename devlink_nl_instance_iter_dump() to
devlink_nl_instance_iter_dumpit().
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pull virtio fixes from Michael Tsirkin:
"Just small bugfixes all over the place"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
vdpa: ifcvf: Do proper cleanup if IFCVF init fails
vhost-scsi: unbreak any layout for response
tools/virtio: fix the vringh test for virtio ring changes
vhost/net: Clear the pending messages when the backend is removed
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A bit higher volume of changes than wished, but each change is
relatively small and the fix targets are mostly device-specific, so
those should be safe as a late stage merge.
The most significant LoC is about the memalloc helper fix, which is
applied only to Xen PV. The other major parts are ASoC Intel SOF and
AVS fixes that are scattered as various small code changes. The rest
are device-specific fixes and quirks for HD- and USB-audio, FireWire
and ASoC AMD / HDMI"
* tag 'sound-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (30 commits)
ALSA: firewire-motu: fix unreleased lock warning in hwdep device
ALSA: memalloc: Workaround for Xen PV
ASoC: cs42l56: fix DT probe
ASoC: codecs: wsa883x: correct playback min/max rates
ALSA: hda/realtek: Add Acer Predator PH315-54
ASoC: amd: yc: Add Xiaomi Redmi Book Pro 15 2022 into DMI table
ALSA: hda: Do not unset preset when cleaning up codec
ASoC: SOF: sof-audio: prepare_widgets: Check swidget for NULL on sink failure
ASoC: hdmi-codec: zero clear HDMI pdata
ASoC: SOF: ipc4-mtrace: prevent underflow in sof_ipc4_priority_mask_dfs_write()
ASoC: Intel: sof_ssp_amp: always set dpcm_capture for amplifiers
ASoC: Intel: sof_nau8825: always set dpcm_capture for amplifiers
ASoC: Intel: sof_cs42l42: always set dpcm_capture for amplifiers
ASoC: Intel: sof_rt5682: always set dpcm_capture for amplifiers
ALSA: hda/via: Avoid potential array out-of-bound in add_secret_dac_path()
ALSA: usb-audio: Add FIXED_RATE quirk for JBL Quantum610 Wireless
ALSA: hda/realtek: fix mute/micmute LEDs, speaker don't work for a HP platform
ASoC: SOF: keep prepare/unprepare widgets in sink path
ASoC: SOF: sof-audio: skip prepare/unprepare if swidget is NULL
ASoC: SOF: sof-audio: unprepare when swidget->use_count > 0
...
|
|
The Flash Interface Unit (FIU) should have a reference to the Shared
Memory controller (SHM) so that flash access from the host (x86 computer
managed by the WPCM450 BMC) can be blocked during flash access by the
FIU driver.
Fixes: 38abcb0d68767 ("ARM: dts: wpcm450: Add FIU SPI controller node")
Signed-off-by: Jonathan Neuschäfer <j.neuschaefer@gmx.net>
Link: https://lore.kernel.org/r/20230129112611.1176517-1-j.neuschaefer@gmx.net
Signed-off-by: Joel Stanley <joel@jms.id.au>
Link: https://lore.kernel.org/r/20230201044158.962417-1-joel@jms.id.au
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
The linux-mediatek IRC channel has moved to liber.chat for quite some
time. Apart from that, not all patches are also send to LKML, so add
this ML explicitly.
And last but not least:
Angelo does a wunderfull job in reviewing patches for all kind of
devices from MediaTek.
Cc: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Link: https://lore.kernel.org/r/20230201152256.19514-1-matthias.bgg@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
|
|
Alex Elder says:
====================
net: ipa: remaining IPA v5.0 support
This series includes almost all remaining IPA code changes required
to support IPA v5.0. IPA register definitions and configuration
data for IPA v5.0 will be sent later (soon). Note that the GSI
register definitions still require work. GSI for IPA v5.0 supports
up to 256 (rather than 32) channels, and this changes the way GSI
register offsets are calculated. A few GSI register fields also
change.
The first patch in this series increases the number of IPA endpoints
supported by the driver, from 32 to 36. The next updates the width
of the destination field for the IP_PACKET_INIT immediate command so
it can represent up to 256 endpoints rather than just 32. The next
adds a few definitions of some IPA registers and fields that are
first available in IPA v5.0.
The next two patches update the code that handles router and filter
table caches. Previously these were referred to as "hashed" tables,
and the IPv4 and IPv6 tables are now combined into one "unified"
table. The sixth and seventh patches add support for a new pulse
generator, which allows time periods to be specified with a wider
range of clock resolution. And the last patch just defines two new
memory regions that were not previously used.
====================
Link: https://lore.kernel.org/r/20230130210158.4126129-1-elder@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
IPA v5.0 uses two memory regions not previously used. Define them
and treat them as valid only for IPA v5.0.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The AP has third pulse generator available starting with IPA v5.0.
Redefine ipa_qtime_val() to support that possibility. Pass the IPA
pointer as an argument so the version can be determined. And stop
using the sign of the returned tick count to indicate which of two
pulse generators to use.
Instead, have the caller provide the address of a variable that will
hold the selected pulse generator for the Qtime value. And for
version 5.0, check whether the third pulse generator best represents
the time period.
Add code in ipa_qtime_config() to configure the fourth pulse
generator for IPA v5.0+; in that case configure both the third and
fourth pulse generators to use 10 msec granularity.
Consistently use "ticks" for local variables that represent a tick
count.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Starting with IPA v5.0, the head-of-line blocking timer has more
than two pulse generators available to define timer granularity.
To prepare for that, change the way the field value is encoded
to use ipa_reg_encode() rather than ipa_reg_bit().
The aggregation granularity selection could (in principle) also use
an additional pulse generator starting with IPA v5.0. Encode the
AGGR_GRAN_SEL field differently to allow that as well.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
IPA v5.0+ separates the configuration of entries in the cached
(previously "hashed") routing and filtering tables into distinct
registers. Previously a single "filter and router" register updated
entries in both tables at once; now the routing and filter table
caches have separate registers that define their content.
This patch updates the code that zeroes entries in the cached filter
and router tables to support IPA versions including v5.0+.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update the code that causes filter and router table caches to be
flushed so that it supports IPA versions 5.0+. It adds a comment in
ipa_hardware_config_hashing() that explains that cacheing does not
need to be enabled, just as before, because it's enabled by default.
(For the record, the FILT_ROUT_CACHE_CFG register would have been
used if we wanted to explicitly enable these.)
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Define some new registers that appear starting with IPA v5.0, along
with enumerated types identifying their fields. Code that uses
these will be added by upcoming patches.
Most of the new registers are related to filter and routing tables,
and in particular, their "hashed" variant. These tables are better
described as "cached", where a hash value determines which entries
are cached. From now on, naming related to this functionality will
use "cache" instead of "hash", and that is reflected in these new
register names. Some registers for managing these caches and their
contents have changed as well.
A few other new field definitions for registers (unrelated to table
caches) are also defined.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The IP_PACKET_INIT immediate command defines the destination
endpoint to which a packet should be sent. Prior to IPA v5.0, a
5 bit field in that command represents the endpoint, but starting
with IPA v5.0, the field is extended to 8 bits to support more than
32 endpoints.
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Increase the number of endpoints supported by the driver to 36,
which IPA v5.0 supports. This makes it impossible to check at build
time whether the supported number is too big to fit within the
(5-bit) PACKET_INIT destination endpoint field. Instead, convert
the build time check to compare against what fits in 8 bits.
Add a check in ipa_endpoint_config() to also ensure the hardware
reports an endpoint count that's in the expected range. Just
open-code 32 as the limit (the PACKET_INIT field mask is not
available where we'd want to use it).
Signed-off-by: Alex Elder <elder@linaro.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2023-01-30
Add fast update encryption key
Jianbo Liu Says:
================
Data encryption keys (DEKs) are the keys used for data encryption and
decryption operations. Starting from version 22.33.0783, firmware is
optimized to accelerate the update of user keys into DEK object in
hardware. The support for bulk allocation and destruction of DEK
objects is added, and the bulk allocated DEKs are uninitialized, as
the bulk creation requires no input key. When offload
encryption/decryption, user gets one object from a bulk, and updates
key by a new "modify DEK" command. This command is the same as create
DEK object, but requires no heavy context memory allocation in
firmware, which consumes most cpu cycles of the create DEK command.
DEKs are cached internally by the NIC, so invalidating internal NIC
caches is required before reusing DEKs. The SYNC_CRYPTO command is
added to support it. DEK object can be reused, the keys in it can be
updated after this command is executed.
This patchset enhances the key creation and destruction flow, to get
use of this new feature. Any user, for example, ktls, ipsec and
macsec, can use it to offload keys. But, only ktls uses it, as others
don't need many keys, and caching two many DEKs in pool is wasteful.
There are two new data struts added:
a. DEK pool. One pool is created for each key type. The bulks by
the type, are placed in the pool's different bulk lists, according to
the number of available and in_used DEKs in the bulk.
b. DEK bulk. All DEKs in one bulk allocation are store here. There
are two bitmaps to indicate the state of each DEK.
New APIs are then added. When user need a DEK object,
a. Fetch one bulk with avail DEKs, from the partial_list or
avail_list, otherwise create new one.
b. Pick one DEK, and set its need_sync and in_used bits to 1.
Move the bulk to full_list if no more available keys, or put it to
partial_list if the bulk is newly created.
c. Update DEK object's key with user key, by the "modify DEK"
command.
d. Return DEK struct to user, then it gets the object id and fills
it into the offload commands.
When user free a DEK,
a. Set in_use bit to 0. If all need_sync bits are 1 and all in_use
bits of this bulk are 0, move it to sync_list.
b. If the number of DEKs, which are freed by users, is over the
threshold (128), schedule a workqueue to do the sync process.
For the sync process, the SYNC_CRYPTO command is executed first. Then,
for each bulks in partial_list, full_list and sync_list, reset
need_sync bits of the freed DEK objects. If all need_sync bits in one
bulk are zero, move it to avail_list.
We already supported TIS pool to recycle the TISes. With this series
and TIS pool, TLS CPS performance is improved greatly.
And we tested https on the system:
CPU: dual AMD EPYC 7763 64-Core processors
RAM: 512G
DEV: ConnectX-6 DX, with FW ver 22.33.0838 and TLS_OPTIMISE=true
TLS CPS performance numbers are:
Before: 11k connections/sec
After: 101 connections/sec
================
* tag 'mlx5-updates-2023-01-30' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5e: kTLS, Improve connection rate by using fast update encryption key
net/mlx5: Keep only one bulk of full available DEKs
net/mlx5: Add async garbage collector for DEK bulk
net/mlx5: Reuse DEKs after executing SYNC_CRYPTO command
net/mlx5: Use bulk allocation for fast update encryption key
net/mlx5: Add bulk allocation and modify_dek operation
net/mlx5: Add support SYNC_CRYPTO command
net/mlx5: Add new APIs for fast update encryption key
net/mlx5: Refactor the encryption key creation
net/mlx5: Add const to the key pointer of encryption key creation
net/mlx5: Prepare for fast crypto key update if hardware supports it
net/mlx5: Change key type to key purpose
net/mlx5: Add IFC bits and enums for crypto key
net/mlx5: Add IFC bits for general obj create param
net/mlx5: Header file for crypto
====================
Link: https://lore.kernel.org/r/20230131031201.35336-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
1) Release bridge info once packet escapes the br_netfilter path,
from Florian Westphal.
2) Revert incorrect fix for the SCTP connection tracking chunk
iterator, also from Florian.
First path fixes a long standing issue, the second path addresses
a mistake in the previous pull request for net.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
Revert "netfilter: conntrack: fix bug in for_each_sctp_chunk"
netfilter: br_netfilter: disable sabotage_in hook after first suppression
====================
Link: https://lore.kernel.org/r/20230131133158.4052-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The Meson G12A Internal PHY does not support standard IEEE MMD extended
register access, therefore add generic dummy stubs to fail the read and
write MMD calls. This is necessary to prevent the core PHY code from
erroneously believing that EEE is supported by this PHY even though this
PHY does not support EEE, as MMD register access returns all FFFFs.
Fixes: 5c3407abb338 ("net: phy: meson-gxl: add g12a support")
Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Chris Healy <healych@amazon.com>
Reviewed-by: Jerome Brunet <jbrunet@baylibre.com>
Link: https://lore.kernel.org/r/20230130231402.471493-1-cphealy@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
introduced UDP listifyed GRO. The segmentation relies on frag_list being
untouched when passing through the network stack. This assumption can be
broken sometimes, where frag_list itself gets pulled into linear area,
leaving frag_list being NULL. When this happens it can trigger
following NULL pointer dereference, and panic the kernel. Reverse the
test condition should fix it.
[19185.577801][ C1] BUG: kernel NULL pointer dereference, address:
...
[19185.663775][ C1] RIP: 0010:skb_segment_list+0x1cc/0x390
...
[19185.834644][ C1] Call Trace:
[19185.841730][ C1] <TASK>
[19185.848563][ C1] __udp_gso_segment+0x33e/0x510
[19185.857370][ C1] inet_gso_segment+0x15b/0x3e0
[19185.866059][ C1] skb_mac_gso_segment+0x97/0x110
[19185.874939][ C1] __skb_gso_segment+0xb2/0x160
[19185.883646][ C1] udp_queue_rcv_skb+0xc3/0x1d0
[19185.892319][ C1] udp_unicast_rcv_skb+0x75/0x90
[19185.900979][ C1] ip_protocol_deliver_rcu+0xd2/0x200
[19185.910003][ C1] ip_local_deliver_finish+0x44/0x60
[19185.918757][ C1] __netif_receive_skb_one_core+0x8b/0xa0
[19185.927834][ C1] process_backlog+0x88/0x130
[19185.935840][ C1] __napi_poll+0x27/0x150
[19185.943447][ C1] net_rx_action+0x27e/0x5f0
[19185.951331][ C1] ? mlx5_cq_tasklet_cb+0x70/0x160 [mlx5_core]
[19185.960848][ C1] __do_softirq+0xbc/0x25d
[19185.968607][ C1] irq_exit_rcu+0x83/0xb0
[19185.976247][ C1] common_interrupt+0x43/0xa0
[19185.984235][ C1] asm_common_interrupt+0x22/0x40
...
[19186.094106][ C1] </TASK>
Fixes: 3a1296a38d0c ("net: Support GRO/GSO fraglist chaining.")
Suggested-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Yan Zhai <yan@cloudflare.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/Y9gt5EUizK1UImEP@debian
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When memory allocation fails in lynx_pcs_create() and it returns NULL,
there remains a dangling reference to the mdiodev returned by
of_mdio_find_device() which is leaked as soon as memac_pcs_create()
returns empty-handed.
Fixes: a7c2a32e7f22 ("net: fman: memac: Use lynx pcs driver")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://lore.kernel.org/r/20230130193051.563315-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:
====================
Intel Wired LAN: Remove redundant Device Control Error Reporting Enable
Bjorn Helgaas says:
Since f26e58bf6f54 ("PCI/AER: Enable error reporting when AER is native"),
the PCI core sets the Device Control bits that enable error reporting for
PCIe devices.
This series removes redundant calls to pci_enable_pcie_error_reporting()
that do the same thing from several NIC drivers.
There are several more drivers where this should be removed; I started with
just the Intel drivers here.
* '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ixgbe: Remove redundant pci_enable_pcie_error_reporting()
igc: Remove redundant pci_enable_pcie_error_reporting()
igb: Remove redundant pci_enable_pcie_error_reporting()
ice: Remove redundant pci_enable_pcie_error_reporting()
iavf: Remove redundant pci_enable_pcie_error_reporting()
i40e: Remove redundant pci_enable_pcie_error_reporting()
fm10k: Remove redundant pci_enable_pcie_error_reporting()
e1000e: Remove redundant pci_enable_pcie_error_reporting()
====================
Link: https://lore.kernel.org/r/20230130192519.686446-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Petr Machata says:
====================
selftests: mlxsw: Convert to iproute2 dcb
There is a dedicated tool for configuration of DCB in iproute2. Use it
in the selftests instead of lldpad.
Patches #1-#3 convert three tests. Patch #4 drops the now-unnecessary
lldpad helpers.
====================
Link: https://lore.kernel.org/r/cover.1675096231.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The existing users of these helpers have been converted to iproute2 dcb.
Drop the helpers.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Set up default port priority through the iproute2 dcb tool, which is easier
to understand and manage.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Set up DSCP prioritization through the iproute2 dcb tool, which is easier
to understand and manage.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Set up DSCP prioritization through the iproute2 dcb tool, which is easier
to understand and manage.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It tries to avoid the frequently hb_timer refresh in commit ba6f5e33bdbb
("sctp: avoid refreshing heartbeat timer too often"), and it only allows
mod_timer when the new expires is after hb_timer.expires. It means even
a much shorter interval for hb timer gets applied, it will have to wait
until the current hb timer to time out.
In sctp_do_8_2_transport_strike(), when a transport enters PF state, it
expects to update the hb timer to resend a heartbeat every rto after
calling sctp_transport_reset_hb_timer(), which will not work as the
change mentioned above.
The frequently hb_timer refresh was caused by sctp_transport_reset_timers()
called in sctp_outq_flush() and it was already removed in the commit above.
So we don't have to check hb_timer.expires when resetting hb_timer as it is
now not called very often.
Fixes: ba6f5e33bdbb ("sctp: avoid refreshing heartbeat timer too often")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Link: https://lore.kernel.org/r/d958c06985713ec84049a2d5664879802710179a.1675095933.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jerome Brunet says:
====================
net: mdio: add amlogic gxl mdio mux support
Add support for the MDIO multiplexer found in the Amlogic GXL SoC family.
This multiplexer allows to choose between the external (SoC pins) MDIO bus,
or the internal one leading to the integrated 10/100M PHY.
This multiplexer has been handled with the mdio-mux-mmioreg generic driver
so far. When it was added, it was thought the logic was handled by a
single register.
It turns out more than a single register need to be properly set.
As long as the device is using the Amlogic vendor bootloader, or upstream
u-boot with net support, it is working fine since the kernel is inheriting
the bootloader settings. Without net support in the bootloader, this glue
comes unset in the kernel and only the external path may operate properly.
With this driver (and the associated change in
arch/arm64/boot/dts/amlogic/meson-gxl.dtsi), the kernel no longer relies
on the bootloader to set things up, fixing the problem.
====================
Link: https://lore.kernel.org/r/20230130151616.375168-1-jbrunet@baylibre.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add support for the mdio mux and internal phy glue of the GXL SoC
family
Reported-by: Da Xue <da@lessconfused.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add documentation for the MDIO bus multiplexer found on the Amlogic GXL
SoC family
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jakub Kicinski says:
====================
tools: ynl: more docs and basic ethtool support
I got discouraged from supporting ethtool in specs, because
generating the user space C code seems a little tricky.
The messages are ID'ed in a "directional" way (to and from
kernel are separate ID "spaces"). There is value, however,
in having the spec and being able to for example use it
in Python.
After paying off some technical debt - add a partial
ethtool spec. Partial because the header for ethtool is almost
a 1000 LoC, so converting in one sitting is tough. But adding
new commands should be trivial now.
Last but not least I add more docs, I realized that I've been
sending a similar "instructions" email to people working on
new families. It's now intro-specs.rst.
====================
Link: https://lore.kernel.org/r/20230131023354.1732677-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The scripts require Python 3 and some distros are dropping
Python 2 support.
Reported-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We have a bit of documentation about the internals of Netlink
and the specs, but really the goal is for most people to not
worry about those. Add a practical guide for beginners who
want to poke at the specs.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ethtool is one of the most actively developed families.
With the changes to the CLI it should be possible to use
the YNL based code for easy prototyping and development.
Add a partial family definition. I've tested the string
set and rings. I don't have any MAC Merge implementation
to test with, but I added the definition for it, anyway,
because it's last. New commands can simply be added at
the end without having to worry about manually providing
IDs / values.
Set (with notification support - None is the response,
the data is from the notification):
$ sudo ./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/ethtool.yaml \
--do rings-set \
--json '{"header":{"dev-name":"enp0s31f6"}, "rx":129}' \
--subscribe monitor
None
[{'msg': {'header': {'dev-index': 2, 'dev-name': 'enp0s31f6'},
'rx': 136,
'rx-max': 4096,
'tx': 256,
'tx-max': 4096,
'tx-push': 0},
'name': 'rings-ntf'}]
Do / dump (yes, the kernel requires that even for dump and even
if empty - the "header" nest must be there):
$ ./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/ethtool.yaml \
--do rings-get \
--json '{"header":{"dev-index": 2}}'
{'header': {'dev-index': 2, 'dev-name': 'enp0s31f6'},
'rx': 136,
'rx-max': 4096,
'tx': 256,
'tx-max': 4096,
'tx-push': 0}
$ ./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/ethtool.yaml \
--dump rings-get \
--json '{"header":{}}'
[{'header': {'dev-index': 2, 'dev-name': 'enp0s31f6'},
'rx': 136,
'rx-max': 4096,
'tx': 256,
'tx-max': 4096,
'tx-push': 0},
{'header': {'dev-index': 3, 'dev-name': 'wlp0s20f3'}, 'tx-push': 0},
{'header': {'dev-index': 19, 'dev-name': 'enp58s0u1u1'},
'rx': 100,
'rx-max': 4096,
'tx-push': 0}]
And error reporting:
$ ./tools/net/ynl/cli.py \
--spec Documentation/netlink/specs/ethtool.yaml \
--dump rings-get \
--json '{"header":{"flags":5}}'
Netlink error: Invalid argument
nl_len = 68 (52) nl_flags = 0x300 nl_type = 2
error: -22 extack: {'msg': 'reserved bit set',
'bad-attr-offs': 24,
'bad-attr': '.header.flags'}
None
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
I had a (bright?) idea of introducing the concept of enum-models
to account for all the weird ways families enumerate their messages.
I've never finished it because generating C code for each of them
is pretty daunting. But for languages which can use ID values directly
the support is simple enough, so clean this up a bit.
"unified" model is what I recommend going forward.
"directional" model is what ethtool uses.
"notify-split" is used by the proposed DPLL code, but we can just
make them use "unified", it hasn't been merged :)
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The CLI script tries to validate jsonschema by default.
It's seems better to validate too many times than too few.
However, when copying the scripts to random servers having
to install jsonschema is tedious. Load jsonschema via
importlib, and let the user opt out.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When I wrote the first version of the Python code I was quite
excited that we can generate class methods directly from the
spec. Unfortunately we need to use valid identifiers for method
names (specifically no dashes are allowed). Don't reuse those
names on the CLI, it's much more natural to use the operation
names exactly as listed in the spec.
Instead of:
./cli --do rings_get
use:
./cli --do rings-get
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
One of my favorite features of the Netlink specs is that they
make decoding structured extack a ton easier.
Implement pretty printing bad attribute names in YNL.
For example it will now say:
'bad-attr': '.header.flags'
rather than the useless:
'bad-attr-offs': 32
Proof:
$ ./cli.py --spec ethtool.yaml --do rings_get \
--json '{"header":{"dev-index":1, "flags":4}}'
Netlink error: Invalid argument
nl_len = 68 (52) nl_flags = 0x300 nl_type = 2
error: -22 extack: {'msg': 'reserved bit set',
'bad-attr': '.header.flags'}
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ethtool uses mutli-attr, add the support to YNL.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|