|
When dereferencing the port vlan group we should use the rcu helper
instead of the one relying on rtnl. In br_multicast_pg_to_port_ctx the
entry cannot disappear as we hold the multicast lock and rcu as explained
in the comment above it.
For the same reason we're ok in br_multicast_start_querier.
=============================
WARNING: suspicious RCU usage
5.14.0-rc5+ #429 Tainted: G W
-----------------------------
net/bridge/br_private.h:1478 suspicious rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
3 locks held by swapper/2/0:
#0: ffff88822be85eb0 ((&p->timer)){+.-.}-{0:0}, at: call_timer_fn+0x5/0x2da
#1: ffff88810b32f260 (&br->multicast_lock){+.-.}-{3:3}, at: br_multicast_port_group_expired+0x28/0x13d [bridge]
#2: ffffffff824f6c80 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire.constprop.0+0x0/0x22 [bridge]
stack backtrace:
CPU: 2 PID: 0 Comm: swapper/2 Kdump: loaded Tainted: G W 5.14.0-rc5+ #429
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014
Call Trace:
<IRQ>
dump_stack_lvl+0x45/0x59
nbp_vlan_group+0x3e/0x44 [bridge]
br_multicast_pg_to_port_ctx+0xd6/0x10d [bridge]
br_multicast_star_g_handle_mode+0xa1/0x2ce [bridge]
? netlink_broadcast+0xf/0x11
? nlmsg_notify+0x56/0x99
? br_mdb_notify+0x224/0x2e9 [bridge]
? br_multicast_del_pg+0x1dc/0x26d [bridge]
br_multicast_del_pg+0x1dc/0x26d [bridge]
br_multicast_port_group_expired+0xaa/0x13d [bridge]
? __grp_src_delete_marked.isra.0+0x35/0x35 [bridge]
? __grp_src_delete_marked.isra.0+0x35/0x35 [bridge]
call_timer_fn+0x134/0x2da
__run_timers+0x169/0x193
run_timer_softirq+0x19/0x2d
__do_softirq+0x1bc/0x42a
__irq_exit_rcu+0x5c/0xb3
irq_exit_rcu+0xa/0x12
sysvec_apic_timer_interrupt+0x5e/0x75
</IRQ>
asm_sysvec_apic_timer_interrupt+0x12/0x20
RIP: 0010:default_idle+0xc/0xd
Code: e8 14 40 71 ff e8 10 b3 ff ff 4c 89 e2 48 89 ef 31 f6 5d 41 5c e9 a9 e8 c2 ff cc cc cc cc 0f 1f 44 00 00 e8 7f 55 65 ff fb f4 <c3> 0f 1f 44 00 00 55 65 48 8b 2c 25 40 6f 01 00 53 f0 80 4d 02 20
RSP: 0018:ffff88810033bf00 EFLAGS: 00000206
RAX: ffffffff819cf828 RBX: ffff888100328000 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff819cfa2d
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000001
R10: ffff8881008302c0 R11: 00000000000006db R12: 0000000000000000
R13: 0000000000000002 R14: 0000000000000000 R15: 0000000000000000
? __sched_text_end+0x4/0x4
? default_idle_call+0x15/0x7b
default_idle_call+0x4d/0x7b
do_idle+0x124/0x2a2
cpu_startup_entry+0x1d/0x1f
secondary_startup_64_no_verify+0xb0/0xbb
Fixes: 74edfd483de8 ("net: bridge: multicast: add helper to get port mcast context from port group")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When sending a global vlan notification we should account for the number
of router ports when allocating the skb, otherwise we might end up
losing notifications.
Fixes: dc002875c22b ("net: bridge: vlan: use br_rports_fill_info() to export mcast router ports")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We always create a vlan with enabled mcast snooping, so when the user
turns on per-vlan mcast contexts they'll get consistent behaviour with
the current situation, but one place wasn't updated when a bridge/master
vlan which already exists (created due to port vlans) is being added as
real bridge vlan (BRIDGE_VLAN_INFO_BRENTRY). We need to enable mcast
snooping for that vlan when that happens.
Fixes: 7b54aaaf53cb ("net: bridge: multicast: add vlan state initialization and control")
Signed-off-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Subbaraya Sundeep says:
====================
octeontx2: Rework MCAM flows management for VFs
From Octeontx2 hardware point of view there is no
difference between PFs and VFs. Hence with refactoring
in driver the packet classification features or offloads
can be supported by VFs also. This patchset unifies the
mcam flows management so that VFs can also support
ntuple filters. Since there are MCAM allocations by
all PFs and VFs in the system it is required to have
the ability to modify the mcam rule count
for a PF/VF at runtime. This is achieved by using devlink.
Below is the summary of patches:
Patches 1, 2 and 3 are trivial patches which help in debugging
in case of errors by using custom error codes and
displaying proper error messages.
Patches 4 and 5 bring rx-all and ntuple support
for CGX mapped VFs and LBK VFs.
Patches 6, 7 and 8 bring devlink support to the
PF netdev driver so that the mcam entry count
can be changed at runtime.
To change the mcam rule count at runtime where multiple rule
allocations are done, sorting is required.
Also both ntuple and TC rules need to be unified.
Patch 9 is related to AF NPC, where entries allocated
by a PF are placed at the bottom (low priority).
On CN10K there is slight change in reading
NPC counters which is handled by patch 10.
Patch 11 is to allow packets from CPT for
NPC parsing on CN10K.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
On CN10K, the higher bits in the channel number represent the CPT
channel number. Mask out these higher bits in the NPC configuration
to allow packets from CPT to be parsed.
Signed-off-by: Vidya <vvelumuri@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The way SW can identify the number of NPC counters supported by silicon
has changed for CN10K. This patch addresses this by reading the appropriate
registers to find out the number of counters available.
Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
If the mcam entry allocation request is from PF
and NOT a priority allocation request then allocate
low priority entries so that PF entries always have
lower priority than its VFs. This is required so
that entries with (base) MCAM match criteria have lower
priority compared to entries with (base + additional)
match criteria. This patch considers only best case
scenario where PF entries are allocated from low
priority zone if low priority zone has free space.
There are worst case scenarios like:
1. VFs allocating hundreds of MCAM entries leading to VFs
using all mid priority zone and low priority zone entries
hence no entries free from low priority zone for PF.
2. All the PFs and VFs in the system allocating and freeing
entries causing fragmentation in MCAM space and all the
entries requested by PF could not fit in low priority
zone for allocation.
This patch does not handle the worst case scenarios.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Added support for setting or modifying MCAM entry count at
runtime via devlink params.
commands:
devlink dev param show
pci/0002:02:00.0:
name mcam_count type driver-specific
values:
cmode runtime value 16
devlink dev param set pci/0002:02:00.0 name mcam_count
value 64 cmode runtime
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Variables used for TC flow management like maximum number
of flows, number of flows installed etc are a copy of ntuple
flow management variables. Since both TC and NTUPLE are not
supported at the same time, it's better to unify these with
common variables.
This patch addresses this unification and also does cleanup of
other minor stuff wrt TC.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Per single mailbox request a maximum of 256 MCAM entries
can be allocated. If more than 256 are being allocated, then
the mcam indices in the final list could get jumbled. Hence
sort the indices.
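The chunk-then-sort idea above can be sketched as follows. This is only an illustration in Python, not the driver code: `alloc_mcam_entries` and `fake_allocator` are hypothetical names, and the allocator that hands out ranges from high to low indices is an assumption chosen to show how indices can end up out of order across requests.

```python
MAX_PER_REQUEST = 256  # per-mailbox-request limit stated in the commit message


def alloc_mcam_entries(count, alloc_chunk):
    """Allocate `count` entries in chunks of at most 256 and sort the result."""
    indices = []
    remaining = count
    while remaining > 0:
        chunk = min(remaining, MAX_PER_REQUEST)
        indices.extend(alloc_chunk(chunk))
        remaining -= chunk
    # Indices from separate requests may not be in order, so sort the final list.
    indices.sort()
    return indices


def fake_allocator(top=1024):
    """Hypothetical allocator: each request returns a range below the previous one."""
    next_top = top

    def alloc_chunk(n):
        nonlocal next_top
        next_top -= n
        return list(range(next_top, next_top + n))

    return alloc_chunk


entries = alloc_mcam_entries(300, fake_allocator())
```

With this stand-in allocator the second request returns lower indices than the first, so the final sort is what restores ascending order.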
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add packet flow classification support for both LMAC mapped virtual
functions and loopback VFs. This patch adds support for the ntuple
offload feature.
Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Enabled NETIF_F_RXALL support for VF driver.
Also removed MTU range comments which are no longer valid.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Added debug messages for various failures during probe.
This will help in quickly identifying the API where the failure
is happening.
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add appropriate error codes to be used when returning from AF
mailbox handlers due to some error condition.
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When installing a flow using the npc_install_flow
mailbox there are a number of reasons to reject
the request, such as the caller not being permitted,
an invalid channel specified in the request, the flow
not being supported in the extraction profile and so on.
Hence define new error codes for npc flows and use
them instead of generic error codes.
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2021-08-16
The following patchset provides two separate mlx5 updates
1) Ethtool RSS context and MQPRIO channel mode support:
1.1) enable mlx5e netdev driver to allow creating Transport Interface RX
(TIRs) objects on the fly to be used for ethtool RSS contexts and
TX MQPRIO channel mode
1.2) Introduce mlx5e_rss object to manage such TIRs.
1.3) Ethtool support for RSS context
1.4) Support MQPRIO channel mode
2) Bridge offloads Lag support:
to allow adding bond net devices to mlx5 bridge
2.1) Address bridge port by (vport_num, esw_owner_vhca_id) pair
since vport_num is only unique per eswitch and in lag mode we
need to manage ports from both eswitches.
2.2) Allow connectivity between representors of different eswitch
instances that are attached to same bridge
2.3) Bridge LAG, Require representors to be in shared FDB mode and
introduce local and peer ports representors,
match on paired eswitch metadata in peer FDB entries,
And finally support addition/deletion and aging of peer flows.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ASUS B23E requires the same workaround as other machines with
VT1802, otherwise it loses the codec power on a few nodes and the
sound stays silent.
Fixes: a0645daf1610 ("ALSA: HDA: Early Forbid of runtime PM")
Link: https://lore.kernel.org/r/ac2232f142efcd67fe6ac38897f704f7176bd200.camel@gmail.com
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20210817052432.14751-1-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
During system shutdown codecs may be still active, and resetting the
controller->codec HW link in this state - based on the bug reporter's
tests - causes the shutdown sequence to get stuck. This happens at
least on the reporter's KBL system with an ALC662 codec.
For now fix the issue by skipping the link reset step.
Fixes: 472e18f63c42 ("ALSA: hda: Release controller display power during shutdown/reboot")
References: https://bugzilla.kernel.org/show_bug.cgi?id=214045
References: https://gitlab.freedesktop.org/drm/intel/-/issues/3618#note_1024665
Reported-and-tested-by: youling257@gmail.com
Cc: youling257@gmail.com
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://lore.kernel.org/r/20210816174259.2759103-1-imre.deak@intel.com
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This contains a fix for a potential boot failure due to a missing
Kconfig dependency for people upgrading with the DRBG enabled"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: drbg - select SHA512
|
|
Since the original TFO server code was implemented in commit
168a8f58059a22feb9e9a2dcc1b8053dbbbc12ef ("tcp: TCP Fast Open Server -
main code path") the TFO server code has supported the sysctl bit flag
TFO_SERVER_COOKIE_NOT_REQD. Currently, when the TFO_SERVER_ENABLE and
TFO_SERVER_COOKIE_NOT_REQD sysctl bit flags are set, a server connection
will accept a SYN with N bytes of data (N > 0) that has no TFO cookie,
create a new fast open connection, process the incoming data in the SYN,
and make the connection ready for accepting. After accepting, the
connection is ready for read()/recvmsg() to read the N bytes of data in
the SYN, ready for write()/sendmsg() calls and data transmissions to
transmit data.
This commit changes an edge case in this feature by changing this
behavior to apply to (N >= 0) bytes of data in the SYN rather than only
(N > 0) bytes of data in the SYN. Now, a server will accept a data-less
SYN without a TFO cookie if TFO_SERVER_COOKIE_NOT_REQD is set.
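The edge case can be illustrated with a small decision function. This is a sketch under stated assumptions, not the kernel logic: the function name and the `allow_dataless` switch are hypothetical, and the bit values are assumed for illustration and should be checked against the kernel headers.

```python
# Assumed bit values, for illustration only; verify against include/net/tcp.h.
TFO_SERVER_ENABLE = 0x2
TFO_SERVER_COOKIE_NOT_REQD = 0x200


def accepts_cookieless_syn(sysctl_flags, syn_data_len, allow_dataless):
    """Return True if a SYN without a TFO cookie creates a fast open connection.

    allow_dataless=False models the old behavior (N > 0),
    allow_dataless=True models the behavior after this commit (N >= 0).
    """
    required = TFO_SERVER_ENABLE | TFO_SERVER_COOKIE_NOT_REQD
    if (sysctl_flags & required) != required:
        return False
    min_len = 0 if allow_dataless else 1
    return syn_data_len >= min_len


flags = TFO_SERVER_ENABLE | TFO_SERVER_COOKIE_NOT_REQD
old_behavior = accepts_cookieless_syn(flags, 0, allow_dataless=False)
new_behavior = accepts_cookieless_syn(flags, 0, allow_dataless=True)
```

Only the data-less case changes; a SYN carrying data is treated the same before and after.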
Caveat! While this enables a new kind of TFO (data-less empty-cookie
SYN), some firewall rule setups may not work if they assume such packets
are not legitimate TFO SYNs and filter them.
Signed-off-by: Luke Hsiao <lukehsiao@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20210816205105.2533289-1-luke.w.hsiao@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Jonathan Lemon says:
====================
ptp: ocp: minor updates and fixes.
Fix errors spotted by automated tools.
Add myself to the MAINTAINERS for the ptp_ocp driver.
====================
Link: https://lore.kernel.org/r/20210816221337.390645-1-jonathan.lemon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add maintainer info for the OpenCompute PTP driver.
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
NET doesn't imply NET_DEVLINK. Select this separately, so that
random config combinations don't complain.
Reported-by: kernel test robot <lkp@intel.com>
Fixes: 773bda964921 ("ptp: ocp: Expose various resources on the timecard.")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If ptp_ocp_device_init() fails, pci_disable_device() is skipped.
Fix the error handling so this case is covered. Update ptp_ocp_remove()
so the normal exit path is identical.
Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: 773bda964921 ("ptp: ocp: Expose various resources on the timecard.")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If attempting to flash the firmware with a blob of size 0,
the entire write loop is skipped and the uninitialized err
is returned. Fix by setting to 0 first.
Fixes: 773bda964921 ("ptp: ocp: Expose various resources on the timecard.")
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To fix the "reverse-NAT" for replies.
When a packet is sent over a VRF, the POST_ROUTING hooks are called
twice: Once from the VRF interface, and once from the "actual"
interface the packet will be sent from:
1) First SNAT: l3mdev_l3_out() -> vrf_l3_out() -> .. -> vrf_output_direct()
This causes the POST_ROUTING hooks to run.
2) Second SNAT: 'ip_output()' calls POST_ROUTING hooks again.
Similarly for replies, first ip_rcv() calls PRE_ROUTING hooks, and
second vrf_l3_rcv() calls them again.
As an example, consider the following SNAT rule:
> iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j SNAT --to-source 2.2.2.2 -o vrf_1
In this case sending over a VRF will create 2 conntrack entries.
The first is from the VRF interface, which performs the IP SNAT.
The second will run the SNAT, but since the "expected reply" will remain
the same, conntrack randomizes the source port of the packet:
e.g. with a socket bound to 1.1.1.1:10000, sending to 3.3.3.3:53, the conntrack
rules are:
udp 17 29 src=2.2.2.2 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=61033 packets=0 bytes=0 mark=0 use=1
udp 17 29 src=1.1.1.1 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=10000 packets=0 bytes=0 mark=0 use=1
i.e. First SNAT IP from 1.1.1.1 --> 2.2.2.2, and second the src port is
SNAT-ed from 10000 --> 61033.
But when a reply is sent (3.3.3.3:53 -> 2.2.2.2:61033) only the later
conntrack entry is matched:
udp 17 29 src=2.2.2.2 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 src=3.3.3.3 dst=2.2.2.2 sport=53 dport=61033 packets=1 bytes=49 mark=0 use=1
udp 17 28 src=1.1.1.1 dst=3.3.3.3 sport=10000 dport=53 packets=1 bytes=68 [UNREPLIED] src=3.3.3.3 dst=2.2.2.2 sport=53 dport=10000 packets=0 bytes=0 mark=0 use=1
And a "port 61033 unreachable" ICMP packet is sent back.
The issue is that when PRE_ROUTING hooks are called from vrf_l3_rcv(),
the skb already has a conntrack flow attached to it, which means
nf_conntrack_in() will not resolve the flow again.
This means only the dest port is "reverse-NATed" (61033 -> 10000) but
the dest IP remains 2.2.2.2, and since the socket is bound to 1.1.1.1 it's
not received.
This can be verified by logging the 4-tuple of the packet in '__udp4_lib_rcv()'.
The fix is then to reset the flow when skb is received on a VRF, to let
conntrack resolve the flow again (which now will hit the earlier flow).
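The effect of resetting the flow can be modelled with a toy conntrack table. This is a heavily simplified sketch, not how netfilter is implemented: all class and function names are hypothetical, and the model only captures the one property described above, namely that a packet with a flow already attached is not looked up again.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tuple4:
    src: str
    sport: int
    dst: str
    dport: int


@dataclass
class Entry:
    orig: Tuple4   # original direction
    reply: Tuple4  # expected reply


@dataclass
class Packet:
    tup: Tuple4
    ct: Optional[Entry] = None  # conntrack flow attached to the skb


# The two entries from the example: first SNAT (IP), second SNAT (port).
TABLE = [
    Entry(Tuple4("1.1.1.1", 10000, "3.3.3.3", 53),
          Tuple4("3.3.3.3", 53, "2.2.2.2", 10000)),
    Entry(Tuple4("2.2.2.2", 10000, "3.3.3.3", 53),
          Tuple4("3.3.3.3", 53, "2.2.2.2", 61033)),
]


def prerouting(pkt, table):
    """Resolve the flow only if none is attached, then reverse-NAT the destination."""
    if pkt.ct is not None:
        return
    pkt.ct = next((e for e in table if e.reply == pkt.tup), None)
    if pkt.ct is not None:
        pkt.tup = Tuple4(pkt.tup.src, pkt.tup.sport,
                         pkt.ct.orig.src, pkt.ct.orig.sport)


def receive(reply, table, reset_on_vrf):
    pkt = Packet(reply)
    prerouting(pkt, table)   # hooks called from ip_rcv()
    if reset_on_vrf:
        pkt.ct = None        # the fix: drop the attached flow
    prerouting(pkt, table)   # hooks called from vrf_l3_rcv()
    return pkt.tup


reply = Tuple4("3.3.3.3", 53, "2.2.2.2", 61033)
without_fix = receive(reply, TABLE, reset_on_vrf=False)
with_fix = receive(reply, TABLE, reset_on_vrf=True)
```

In this model the packet ends up addressed to 2.2.2.2:10000 without the reset and to 1.1.1.1:10000 with it, which matches the address the socket is bound to.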
To reproduce: (Without the fix "Got pkt_to_nat_port" will not be printed by
running 'bash ./repro.sh'):
$ cat run_in_A1.py
import logging
logging.getLogger("scapy.runtime").setLevel(logging.ERROR)
from scapy.all import *
import argparse

def get_packet_to_send(udp_dst_port, msg_name):
    return Ether(src='11:22:33:44:55:66', dst=iface_mac)/ \
        IP(src='3.3.3.3', dst='2.2.2.2')/ \
        UDP(sport=53, dport=udp_dst_port)/ \
        Raw(f'{msg_name}\x0012345678901234567890')

parser = argparse.ArgumentParser()
parser.add_argument('-iface_mac', dest="iface_mac", type=str, required=True,
                    help="From run_in_A3.py")
parser.add_argument('-socket_port', dest="socket_port", type=str,
                    required=True, help="From run_in_A3.py")
parser.add_argument('-v1_mac', dest="v1_mac", type=str, required=True,
                    help="From script")

args, _ = parser.parse_known_args()
iface_mac = args.iface_mac
socket_port = int(args.socket_port)
v1_mac = args.v1_mac

print(f'Source port before NAT: {socket_port}')

while True:
    pkts = sniff(iface='_v0', store=True, count=1, timeout=10)
    if 0 == len(pkts):
        print('Something failed, rerun the script :(', flush=True)
        break
    pkt = pkts[0]
    if not pkt.haslayer('UDP'):
        continue

    pkt_sport = pkt.getlayer('UDP').sport
    print(f'Source port after NAT: {pkt_sport}', flush=True)

    pkt_to_send = get_packet_to_send(pkt_sport, 'pkt_to_nat_port')
    sendp(pkt_to_send, '_v0', verbose=False) # Will not be received

    pkt_to_send = get_packet_to_send(socket_port, 'pkt_to_socket_port')
    sendp(pkt_to_send, '_v0', verbose=False)
    break
$ cat run_in_A2.py
import socket
import netifaces

print(f"{netifaces.ifaddresses('e00000')[netifaces.AF_LINK][0]['addr']}",
      flush=True)
s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
s.setsockopt(socket.SOL_SOCKET, socket.SO_BINDTODEVICE,
             str('vrf_1' + '\0').encode('utf-8'))
s.connect(('3.3.3.3', 53))
print(f'{s.getsockname()[1]}', flush=True)
s.settimeout(5)

while True:
    try:
        # Periodically send in order to keep the conntrack entry alive.
        s.send(b'a'*40)
        resp = s.recvfrom(1024)
        msg_name = resp[0].decode('utf-8').split('\0')[0]
        print(f"Got {msg_name}", flush=True)
    except Exception as e:
        pass
$ cat repro.sh
ip netns del A1 2> /dev/null
ip netns del A2 2> /dev/null
ip netns add A1
ip netns add A2
ip -n A1 link add _v0 type veth peer name _v1 netns A2
ip -n A1 link set _v0 up
ip -n A2 link add e00000 type bond
ip -n A2 link add lo0 type dummy
ip -n A2 link add vrf_1 type vrf table 10001
ip -n A2 link set vrf_1 up
ip -n A2 link set e00000 master vrf_1
ip -n A2 addr add 1.1.1.1/24 dev e00000
ip -n A2 link set e00000 up
ip -n A2 link set _v1 master e00000
ip -n A2 link set _v1 up
ip -n A2 link set lo0 up
ip -n A2 addr add 2.2.2.2/32 dev lo0
ip -n A2 neigh add 1.1.1.10 lladdr 77:77:77:77:77:77 dev e00000
ip -n A2 route add 3.3.3.3/32 via 1.1.1.10 dev e00000 table 10001
ip netns exec A2 iptables -t nat -A POSTROUTING -p udp -m udp --dport 53 -j \
SNAT --to-source 2.2.2.2 -o vrf_1
sleep 5
ip netns exec A2 python3 run_in_A2.py > x &
XPID=$!
sleep 5
IFACE_MAC=`sed -n 1p x`
SOCKET_PORT=`sed -n 2p x`
V1_MAC=`ip -n A2 link show _v1 | sed -n 2p | awk '{print $2}'`
ip netns exec A1 python3 run_in_A1.py -iface_mac ${IFACE_MAC} -socket_port \
${SOCKET_PORT} -v1_mac ${V1_MAC}
sleep 5
kill -9 $XPID
wait $XPID 2> /dev/null
ip netns del A1
ip netns del A2
tail x -n 2
rm x
set +x
Fixes: 73e20b761acf ("net: vrf: Add support for PREROUTING rules on vrf device")
Signed-off-by: Lahav Schlesinger <lschlesinger@drivenets.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20210815120002.2787653-1-lschlesinger@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Allow adding bond net devices to mlx5 bridge with following changes:
- Modify bridge representor code to obtain uplink representor that belongs
to eswitch that is registered for notification. Require representor to be
in shared FDB mode. If representor is the lag master, then consider its
port as local, otherwise treat it as peer.
- Use devcom to match on paired eswitch metadata in peer FDB entries. This
is necessary for shared FDB LAG to function since packets are always
received on active eswitch instance as opposed to parent eswitch of port.
- Support for deleting peer flows when receiving
SWITCHDEV_FDB_DEL_TO_BRIDGE notification was implemented in one of previous
patches in series. Now also implement support for handling
SWITCHDEV_FDB_ADD_TO_BRIDGE which can be generated on peer by bridge update
workqueue task in LAG configuration. Refresh the flow 'lastuse' timestamp
to current jiffies when receiving such notification on eswitch that manages
the local FDB entry. This allows peer entries to prevent ageing of the FDB.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Allow connectivity between representors of different eswitch instances that
are attached to same bridge when merged_eswitch capability is enabled. Add
ports of peer eswitch to bridge instance and mark them with
MLX5_ESW_BRIDGE_PORT_FLAG_PEER. Mark FDBs offloaded on peer ports with
MLX5_ESW_BRIDGE_FLAG_PEER flag. Such FDBs can only be aged out on their
local eswitch instance, which then sends SWITCHDEV_FDB_DEL_TO_BRIDGE event.
Listen to the event on mlx5 bridge implementation and delete peer FDBs in
event handler.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
SWITCHDEV_FDB_DEL_TO_BRIDGE notification is generated in multiple places in
bridge code. Following patch in series changes the condition for the
notification. Extract the notification into dedicated helper function
mlx5_esw_bridge_fdb_del_notify() to only modify it in single place in the
future changes.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Following patches in series allow traffic between vports of different
eswitch instances, which requires addressing bridge port by
vport_num+esw_owner_vhca_id pair since vport_num is only unique
per-eswitch. As a preparation, extend struct mlx5_esw_bridge_port with
'esw_owner_vhca_id' field and use it as part of key for
mlx5_esw_bridge->vports xarray.
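Why the pair is needed as a key can be shown with a small lookup table. This is a sketch only: `port_key` is a hypothetical helper, the 16-bit width assumed for vport_num is for illustration, and a Python dict stands in for the xarray.

```python
VPORT_NUM_BITS = 16  # assumed width of vport_num, for illustration only


def port_key(vport_num, esw_owner_vhca_id):
    """Pack (vport_num, esw_owner_vhca_id) into a single integer key."""
    return vport_num | (esw_owner_vhca_id << VPORT_NUM_BITS)


# vport_num 1 exists on both eswitches; keyed by vport_num alone the
# second entry would overwrite the first.
ports = {}
ports[port_key(1, 0)] = "eswitch 0, vport 1"
ports[port_key(1, 1)] = "eswitch 1, vport 1"
```

Both ports stay addressable because the owner id occupies the bits above vport_num, so equal vport numbers on different eswitches map to different keys.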
With this change we can't rely on switchdev_handle_port_obj_add() helper to
get mlx5 representor from stacked device because we need specifically
representor from parent eswitch that registered the callback to obtain
correct esw_owner_vhca_id. The helper doesn't allow passing additional
parameters to predicate function and doesn't provide access to the notifier
block to obtain eswitch through br_offloads. Implement custom helpers to
obtain mlx5 representor and use them in
mlx5_esw_bridge_port_obj_{add|del|attr_set}() implementations.
Remove direct pointer to parent bridge from struct mlx5_vport as it is no
longer needed.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Following patches in series will pass bond device to bridge, which means
the code can't assume the device is mlx5 representor. Moreover, the core
device can be easily obtained from eswitch instance, so there is no reason
for more complex code that obtains struct mlx5_priv from net_device in
order to use its mdev. Refactor the code to use esw->dev instead of
priv->mdev.
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Refactor mlx5_esw_bridge_vport_link() to release the bridge instance if
mlx5_esw_bridge_vport_init() returned an error instead of relying on it to
release the bridge. This improves the design because object instance is
taken and released in same layer and simplifies following patches that add
more logic to mlx5_esw_bridge_vport_link().
Signed-off-by: Vlad Buslov <vladbu@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add support for MQPRIO channel mode, in which a partition to TCs
is defined over the channels. We allow partitions with contiguous
queue indices, with no holes within. We do not allow modification
to the num of channels while this MQPRIO mode is active.
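One possible reading of the "contiguous, no holes" constraint is sketched below. The function name and the (offset, count) representation are assumptions for illustration; the sketch also assumes the first TC starts at queue 0, which the commit message does not state explicitly.

```python
def validate_partition(tcs, num_channels):
    """Check that per-TC (offset, count) ranges are contiguous with no holes.

    tcs is a list of (offset, count) tuples in TC order.
    """
    next_queue = 0
    for offset, count in tcs:
        # Each TC must be non-empty and start right where the previous one ended.
        if count <= 0 or offset != next_queue:
            return False
        next_queue = offset + count
    # The partition must fit within the available channels.
    return next_queue <= num_channels


ok = validate_partition([(0, 4), (4, 4)], num_channels=8)
hole = validate_partition([(0, 4), (5, 3)], num_channels=8)
```

Here the second layout is rejected because queue 4 would belong to no TC.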
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add handling for failures in netdev_set_num_tc().
Let mlx5e_netdev_set_tcs return an int.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
This is in preparation for supporting MQPRIO CHANNEL mode in
downstream patch, in addition to DCB mode that's supported today.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Abstract the MQPRIO params into a struct.
Use a getter for DCB mode num_tcs.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Extend the existing flow classification support, to steer
flows not only directly to a receive ring, but also into
the new RSS contexts.
Create needed TIR objects on demand, and hold reference
on the RSS context.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Add support to multiple RSS contexts. Resources of the non-default
RSS contexts are allocated and created on demand. Each RSS context
can be controlled and configured separately, via the implemented
ethtool ops. Here we limit the num of total contexts to 16.
We do not enforce any kind of new limitation over the indirection table
content. More specifically, two separate contexts can be configured to
fully or partially point to the same set of receive rings.
The default RSS context (index 0) is created with its full set of TIRs.
All other contexts are created with an empty set, then TIRs are added
upon first usage when steering rules are added.
We use a reference counting mechanism to make sure an RSS context is
not removed before the rules pointing to it.
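The reference counting described here can be sketched as follows. This is an illustrative model, not the mlx5e code: the class and method names are hypothetical, and only the limit of 16 contexts and the always-present default context (index 0) are taken from the text above.

```python
MAX_RSS_CONTEXTS = 16  # total limit stated in the commit message


class RssContexts:
    """Toy model: an RSS context cannot be removed while rules point to it."""

    def __init__(self):
        self.refcnt = {0: 0}  # default context (index 0) always exists

    def create(self):
        if len(self.refcnt) >= MAX_RSS_CONTEXTS:
            raise RuntimeError("no free RSS context")
        idx = next(i for i in range(MAX_RSS_CONTEXTS) if i not in self.refcnt)
        self.refcnt[idx] = 0
        return idx

    def add_rule(self, idx):
        self.refcnt[idx] += 1  # a steering rule now points at this context

    def del_rule(self, idx):
        self.refcnt[idx] -= 1

    def destroy(self, idx):
        if idx == 0:
            raise ValueError("default context cannot be removed")
        if self.refcnt[idx] > 0:
            raise RuntimeError("context still referenced by rules")
        del self.refcnt[idx]


ctxs = RssContexts()
ctx = ctxs.create()
ctxs.add_rule(ctx)
try:
    ctxs.destroy(ctx)
    busy = False
except RuntimeError:
    busy = True
ctxs.del_rule(ctx)
ctxs.destroy(ctx)
```

Removal succeeds only after the last rule has dropped its reference.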
Block ethtool set_channels operations when multiple RSS contexts exist,
as currently the kernel doesn't protect against inconsistent channels
configs that break non-default RSS contexts.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Move from static to dynamic memory allocations for TIR.
This is in preparation for supporting on-demand TIR operations in
downstream patches, where every RSS context will be initialized with an
empty set of TIRs.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Code related to RSS is now encapsulated into a dedicated object and put
into new files en/rss.{c,h}. All usages are converted.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Bring all fields that define and maintain RSS behavior together
into a new structure.
Align all usages with this new structure. Keep it hidden within
rx_res.c.
This helps supporting multiple RSS contexts in downstream patch.
Use dynamic allocations for the RSS context.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Move TIR control operations in rx_res into functions.
This is in preparation for supporting on-demand TIR operations in
downstream patches.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
All calls to mlx5e_rx_res_rss_set_indir_uniform() occur while the RSS
state is inactive, i.e. the RQT is pointing to the drop RQ, not to the
channels' RQs.
It means that the "apply" part of the function is not called.
Remove this part from the function, and document the change. This is
useful for the next patches in the series, as it allows code simplifications
when multiple RSS contexts are introduced.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
|
|
Guangbin Huang says:
====================
net: hns3: add support ethtool extended link state
This series adds support for ethtool extended link state in the HNS3
ethernet driver, giving the user additional information about
why a link is not up.
====================
Link: https://lore.kernel.org/r/1629080129-46507-1-git-send-email-huangguangbin2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In order to know the reason for a link up failure, add support for ethtool
extended link state. The driver reads the link status code from firmware when
in link down state and converts it to an ethtool extended link state.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a new file hns3_ethtool.h, and move struct type definitions from
hns3_ethtool.c to hns3_ethtool.h.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ETHTOOL_LINK_EXT_SUBSTATE_BSI_SERDES_REFERENCE_CLOCK_LOST means the input
external clock signal for SerDes is too weak or lost.
ETHTOOL_LINK_EXT_SUBSTATE_BSI_SERDES_ALOS means the received signal for
SerDes is too weak because of analog loss of signal.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add documentation for two bad signal integrity substates:
ETHTOOL_LINK_EXT_SUBSTATE_BSI_SERDES_REFERENCE_CLOCK_LOST
ETHTOOL_LINK_EXT_SUBSTATE_BSI_SERDES_ALOS.
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull MTD fixes from Miquel Raynal:
"MTD core fixes:
- Fix lock hierarchy in deregister_mtd_blktrans
- Handle flashes without OTP gracefully
- Break circular locks in register_mtd_blktrans
MTD device fixes:
- mchp48l640:
- Fix memory leak on cmd
- Silence some uninitialized variable warnings
- blkdevs:
- Initialize rq.limits.discard_granularity
CFI fixes:
- Fix crash when erasing/writing AMD cards
Raw NAND fixes:
- Fix of_get_nand_secure_regions():
- Add a missing check
- Avoid an unwanted probe failure when a DT property is missing"
* tag 'mtd/fixes-for-5.14-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
mtd: rawnand: Fix probe failure due to of_get_nand_secure_regions()
mtd: fix lock hierarchy in deregister_mtd_blktrans
mtd: devices: mchp48l640: Fix memory leak on cmd
mtd: cfi_cmdset_0002: fix crash when erasing/writing AMD cards
mtd: core: handle flashes without OTP gracefully
mtd: mchp48l640: silence some uninitialized variable warnings
mtd: break circular locks in register_mtd_blktrans
mtd: rawnand: Add a check in of_get_nand_secure_regions()
mtd: mtd_blkdevs: Initialize rq.limits.discard_granularity
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
"Fixes and clean ups to tracing:
- Fix header alignment when PREEMPT_RT is enabled for osnoise tracer
- Inject "stop" event to see where osnoise stopped the trace
- Define DYNAMIC_FTRACE_WITH_ARGS as some code had an #ifdef for it
- Fix erroneous message for bootconfig cmdline parameter
- Fix crash caused by not found variable in histograms"
* tag 'trace-v5.14-rc5-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
tracing / histogram: Fix NULL pointer dereference on strcmp() on NULL event name
init: Suppress wrong warning for bootconfig cmdline parameter
tracing: define needed config DYNAMIC_FTRACE_WITH_ARGS
trace/osnoise: Print a stop tracing message
trace/timerlat: Add a header with PREEMPT_RT additional fields
trace/osnoise: Add a header with PREEMPT_RT additional fields
|