diff options
author | David S. Miller <davem@davemloft.net> | 2022-02-16 11:21:05 +0000 |
---|---|---|
committer | David S. Miller <davem@davemloft.net> | 2022-02-16 11:21:05 +0000 |
commit | f0ead99e623baca56dcbcc577299e8f97aefab0b (patch) | |
tree | d846755802b21a9b1cbf774f89ad193b0e475ee5 /net/switchdev/switchdev.c | |
parent | b0471c26108160217fc17acec4a9fdce92aaeeea (diff) | |
parent | 164f861bd40ccc3ed10a59ee72437b93670a525a (diff) |
Merge branch 'Replay-and-offload-host-VLAN-entries-in-DSA'
Vladimir Oltean says:
====================
Replay and offload host VLAN entries in DSA
v2->v3:
- make the bridge stop notifying switchdev for !BRENTRY VLANs
- create precommit and commit wrappers around __vlan_add_flags().
- special-case the BRENTRY transition from false to true, instead of
treating it as a change of flags and letting drivers figure out that
it really isn't.
- avoid setting *changed unless we know that functions will not error
out later.
- drop "old_flags" from struct switchdev_obj_port_vlan, nobody needs it
now, in v2 only DSA needed it to filter out BRENTRY transitions, that
is now solved cleaner.
- no BRIDGE_VLAN_INFO_BRENTRY flag checks and manipulations in DSA
whatsoever, use the "bool changed" bit as-is after changing what it
means.
- merge dsa_slave_host_vlan_{add,del}() with
dsa_slave_foreign_vlan_{add,del}(), since now they do the same thing,
because the host_vlan functions no longer need to mangle the vlan
BRENTRY flags and bool changed.
v1->v2:
- prune switchdev VLAN additions with no actual change differently
- no longer need to revert struct net_bridge_vlan changes on error from
switchdev
- no longer need to first delete a changed VLAN before readding it
- pass 'bool changed' and 'u16 old_flags' through switchdev_obj_port_vlan
so that DSA can do some additional post-processing with the
BRIDGE_VLAN_INFO_BRENTRY flag
- support VLANs on foreign interfaces
- fix the same -EOPNOTSUPP error in mv88e6xxx, this time on removal, due
to VLAN deletion getting replayed earlier than FDB deletion
The motivation behind these patches is that Rafael reported the
following error with mv88e6xxx when the first switch port joins a
bridge:
mv88e6085 0x0000000008b96000:00: port 0 failed to add a6:ef:77:c8:5f:3d vid 1 to fdb: -95 (-EOPNOTSUPP)
The FDB entry that's added is the MAC address of the bridge, in VID 1
(the default_pvid), being replayed as part of br_add_if() -> ... ->
nbp_switchdev_sync_objs().
-EOPNOTSUPP is the mv88e6xxx driver's way of saying that VID 1 doesn't
exist in the VTU, so it can't program the ATU with a FID, something
which it needs.
It appears to be a race, but it isn't, since we only end up installing
VID 1 in the VTU by coincidence. DSA's approximation of programming
VLANs on the CPU port together with the user ports breaks down with
host FDB entries on mv88e6xxx, since that strictly requires the VTU to
contain the VID. But the user may freely add VLANs pointing just towards
the bridge, and FDB entries in those VLANs, and DSA will not be aware of
them, because it only listens for VLANs on user ports.
To create a solution that scales properly to cross-chip setups and
doesn't leak entries behind, some changes in the bridge driver are
required. I believe that these are for the better overall, but I may be
wrong. Namely, the same refcounting procedure that DSA has in place for
host FDB and MDB entries can be replicated for VLANs, except that it's
garbage in, garbage out: the VLAN addition and removal notifications
from switchdev aren't balanced. So the first 2 patches attempt to deal
with that.
This patch set has been superficially tested on a board with 3 mv88e6xxx
switches in a daisy chain and appears to produce the primary desired
effect - the driver no longer returns -EOPNOTSUPP when the first port
joins a bridge, and is successful in performing local termination under
a VLAN-aware bridge.
As an additional side effect, it silences the annoying "p%d: already a
member of VLAN %d\n" warning messages that the mv88e6xxx driver produces
when coupled with systemd-networkd, and a few VLANs are configured.
Furthermore, it advances Florian's idea from a few years back, which
never got merged:
https://lore.kernel.org/lkml/20180624153339.13572-1-f.fainelli@gmail.com/
v2 has also been tested on the NXP LS1028A felix switch.
Some testing:
root@debian:~# bridge vlan add dev br0 vid 101 pvid self
[ 100.709220] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_add: port 9 vlan 101
[ 100.873426] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_add: port 10 vlan 101
[ 100.892314] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_add: port 9 vlan 101
[ 101.053392] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_add: port 10 vlan 101
[ 101.076994] mv88e6085 d0032004.mdio-mii:12: mv88e6xxx_port_vlan_add: port 9 vlan 101
root@debian:~# bridge vlan add dev br0 vid 101 pvid self
root@debian:~# bridge vlan add dev br0 vid 101 pvid self
root@debian:~# bridge vlan
port vlan-id
eth0 1 PVID Egress Untagged
lan9 1 PVID Egress Untagged
lan10 1 PVID Egress Untagged
lan11 1 PVID Egress Untagged
lan12 1 PVID Egress Untagged
lan13 1 PVID Egress Untagged
lan14 1 PVID Egress Untagged
lan15 1 PVID Egress Untagged
lan16 1 PVID Egress Untagged
lan17 1 PVID Egress Untagged
lan18 1 PVID Egress Untagged
lan19 1 PVID Egress Untagged
lan20 1 PVID Egress Untagged
lan21 1 PVID Egress Untagged
lan22 1 PVID Egress Untagged
lan23 1 PVID Egress Untagged
lan24 1 PVID Egress Untagged
sfp 1 PVID Egress Untagged
lan1 1 PVID Egress Untagged
lan2 1 PVID Egress Untagged
lan3 1 PVID Egress Untagged
lan4 1 PVID Egress Untagged
lan5 1 PVID Egress Untagged
lan6 1 PVID Egress Untagged
lan7 1 PVID Egress Untagged
lan8 1 PVID Egress Untagged
br0 1 Egress Untagged
101 PVID
root@debian:~# bridge vlan del dev br0 vid 101 pvid self
[ 108.340487] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_del: port 9 vlan 101
[ 108.379167] mv88e6085 d0032004.mdio-mii:11: mv88e6xxx_port_vlan_del: port 10 vlan 101
[ 108.402319] mv88e6085 d0032004.mdio-mii:12: mv88e6xxx_port_vlan_del: port 9 vlan 101
[ 108.425866] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_del: port 9 vlan 101
[ 108.452280] mv88e6085 d0032004.mdio-mii:10: mv88e6xxx_port_vlan_del: port 10 vlan 101
root@debian:~# bridge vlan del dev br0 vid 101 pvid self
root@debian:~# bridge vlan del dev br0 vid 101 pvid self
root@debian:~# bridge vlan
port vlan-id
eth0 1 PVID Egress Untagged
lan9 1 PVID Egress Untagged
lan10 1 PVID Egress Untagged
lan11 1 PVID Egress Untagged
lan12 1 PVID Egress Untagged
lan13 1 PVID Egress Untagged
lan14 1 PVID Egress Untagged
lan15 1 PVID Egress Untagged
lan16 1 PVID Egress Untagged
lan17 1 PVID Egress Untagged
lan18 1 PVID Egress Untagged
lan19 1 PVID Egress Untagged
lan20 1 PVID Egress Untagged
lan21 1 PVID Egress Untagged
lan22 1 PVID Egress Untagged
lan23 1 PVID Egress Untagged
lan24 1 PVID Egress Untagged
sfp 1 PVID Egress Untagged
lan1 1 PVID Egress Untagged
lan2 1 PVID Egress Untagged
lan3 1 PVID Egress Untagged
lan4 1 PVID Egress Untagged
lan5 1 PVID Egress Untagged
lan6 1 PVID Egress Untagged
lan7 1 PVID Egress Untagged
lan8 1 PVID Egress Untagged
br0 1 Egress Untagged
root@debian:~# bridge vlan del dev br0 vid 101 pvid self
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Diffstat (limited to 'net/switchdev/switchdev.c')
-rw-r--r-- | net/switchdev/switchdev.c | 152 |
1 files changed, 138 insertions, 14 deletions
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c index 12e6b4146bfb4..6a00c390547b8 100644 --- a/net/switchdev/switchdev.c +++ b/net/switchdev/switchdev.c @@ -409,6 +409,27 @@ static int switchdev_lower_dev_walk(struct net_device *lower_dev, } static struct net_device * +switchdev_lower_dev_find_rcu(struct net_device *dev, + bool (*check_cb)(const struct net_device *dev), + bool (*foreign_dev_check_cb)(const struct net_device *dev, + const struct net_device *foreign_dev)) +{ + struct switchdev_nested_priv switchdev_priv = { + .check_cb = check_cb, + .foreign_dev_check_cb = foreign_dev_check_cb, + .dev = dev, + .lower_dev = NULL, + }; + struct netdev_nested_priv priv = { + .data = &switchdev_priv, + }; + + netdev_walk_all_lower_dev_rcu(dev, switchdev_lower_dev_walk, &priv); + + return switchdev_priv.lower_dev; +} + +static struct net_device * switchdev_lower_dev_find(struct net_device *dev, bool (*check_cb)(const struct net_device *dev), bool (*foreign_dev_check_cb)(const struct net_device *dev, @@ -424,7 +445,7 @@ switchdev_lower_dev_find(struct net_device *dev, .data = &switchdev_priv, }; - netdev_walk_all_lower_dev_rcu(dev, switchdev_lower_dev_walk, &priv); + netdev_walk_all_lower_dev(dev, switchdev_lower_dev_walk, &priv); return switchdev_priv.lower_dev; } @@ -451,7 +472,7 @@ static int __switchdev_handle_fdb_event_to_device(struct net_device *dev, return mod_cb(dev, orig_dev, event, info->ctx, fdb_info); if (netif_is_lag_master(dev)) { - if (!switchdev_lower_dev_find(dev, check_cb, foreign_dev_check_cb)) + if (!switchdev_lower_dev_find_rcu(dev, check_cb, foreign_dev_check_cb)) goto maybe_bridged_with_us; /* This is a LAG interface that we offload */ @@ -465,7 +486,7 @@ static int __switchdev_handle_fdb_event_to_device(struct net_device *dev, * towards a bridge device. */ if (netif_is_bridge_master(dev)) { - if (!switchdev_lower_dev_find(dev, check_cb, foreign_dev_check_cb)) + if (!switchdev_lower_dev_find_rcu(dev, check_cb, foreign_dev_check_cb)) return 0; /* This is a bridge interface that we offload */ @@ -478,8 +499,8 @@ static int __switchdev_handle_fdb_event_to_device(struct net_device *dev, * that we offload. */ if (!check_cb(lower_dev) && - !switchdev_lower_dev_find(lower_dev, check_cb, - foreign_dev_check_cb)) + !switchdev_lower_dev_find_rcu(lower_dev, check_cb, + foreign_dev_check_cb)) continue; err = __switchdev_handle_fdb_event_to_device(lower_dev, orig_dev, @@ -501,7 +522,7 @@ maybe_bridged_with_us: if (!br || !netif_is_bridge_master(br)) return 0; - if (!switchdev_lower_dev_find(br, check_cb, foreign_dev_check_cb)) + if (!switchdev_lower_dev_find_rcu(br, check_cb, foreign_dev_check_cb)) return 0; return __switchdev_handle_fdb_event_to_device(br, orig_dev, event, fdb_info, @@ -536,13 +557,15 @@ EXPORT_SYMBOL_GPL(switchdev_handle_fdb_event_to_device); static int __switchdev_handle_port_obj_add(struct net_device *dev, struct switchdev_notifier_port_obj_info *port_obj_info, bool (*check_cb)(const struct net_device *dev), + bool (*foreign_dev_check_cb)(const struct net_device *dev, + const struct net_device *foreign_dev), int (*add_cb)(struct net_device *dev, const void *ctx, const struct switchdev_obj *obj, struct netlink_ext_ack *extack)) { struct switchdev_notifier_info *info = &port_obj_info->info; + struct net_device *br, *lower_dev; struct netlink_ext_ack *extack; - struct net_device *lower_dev; struct list_head *iter; int err = -EOPNOTSUPP; @@ -566,15 +589,42 @@ static int __switchdev_handle_port_obj_add(struct net_device *dev, if (netif_is_bridge_master(lower_dev)) continue; + /* When searching for switchdev interfaces that are neighbors + * of foreign ones, and @dev is a bridge, do not recurse on the + * foreign interface again, it was already visited. + */ + if (foreign_dev_check_cb && !check_cb(lower_dev) && + !switchdev_lower_dev_find(lower_dev, check_cb, foreign_dev_check_cb)) + continue; + err = __switchdev_handle_port_obj_add(lower_dev, port_obj_info, - check_cb, add_cb); + check_cb, foreign_dev_check_cb, + add_cb); if (err && err != -EOPNOTSUPP) return err; } - return err; + /* Event is neither on a bridge nor a LAG. Check whether it is on an + * interface that is in a bridge with us. + */ + if (!foreign_dev_check_cb) + return err; + + br = netdev_master_upper_dev_get(dev); + if (!br || !netif_is_bridge_master(br)) + return err; + + if (!switchdev_lower_dev_find(br, check_cb, foreign_dev_check_cb)) + return err; + + return __switchdev_handle_port_obj_add(br, port_obj_info, check_cb, + foreign_dev_check_cb, add_cb); } +/* Pass through a port object addition, if @dev passes @check_cb, or replicate + * it towards all lower interfaces of @dev that pass @check_cb, if @dev is a + * bridge or a LAG. + */ int switchdev_handle_port_obj_add(struct net_device *dev, struct switchdev_notifier_port_obj_info *port_obj_info, bool (*check_cb)(const struct net_device *dev), @@ -585,21 +635,46 @@ int switchdev_handle_port_obj_add(struct net_device *dev, int err; err = __switchdev_handle_port_obj_add(dev, port_obj_info, check_cb, - add_cb); + NULL, add_cb); if (err == -EOPNOTSUPP) err = 0; return err; } EXPORT_SYMBOL_GPL(switchdev_handle_port_obj_add); +/* Same as switchdev_handle_port_obj_add(), except if object is notified on a + * @dev that passes @foreign_dev_check_cb, it is replicated towards all devices + * that pass @check_cb and are in the same bridge as @dev. + */ +int switchdev_handle_port_obj_add_foreign(struct net_device *dev, + struct switchdev_notifier_port_obj_info *port_obj_info, + bool (*check_cb)(const struct net_device *dev), + bool (*foreign_dev_check_cb)(const struct net_device *dev, + const struct net_device *foreign_dev), + int (*add_cb)(struct net_device *dev, const void *ctx, + const struct switchdev_obj *obj, + struct netlink_ext_ack *extack)) +{ + int err; + + err = __switchdev_handle_port_obj_add(dev, port_obj_info, check_cb, + foreign_dev_check_cb, add_cb); + if (err == -EOPNOTSUPP) + err = 0; + return err; +} +EXPORT_SYMBOL_GPL(switchdev_handle_port_obj_add_foreign); + static int __switchdev_handle_port_obj_del(struct net_device *dev, struct switchdev_notifier_port_obj_info *port_obj_info, bool (*check_cb)(const struct net_device *dev), + bool (*foreign_dev_check_cb)(const struct net_device *dev, + const struct net_device *foreign_dev), int (*del_cb)(struct net_device *dev, const void *ctx, const struct switchdev_obj *obj)) { struct switchdev_notifier_info *info = &port_obj_info->info; - struct net_device *lower_dev; + struct net_device *br, *lower_dev; struct list_head *iter; int err = -EOPNOTSUPP; @@ -621,15 +696,42 @@ static int __switchdev_handle_port_obj_del(struct net_device *dev, if (netif_is_bridge_master(lower_dev)) continue; + /* When searching for switchdev interfaces that are neighbors + * of foreign ones, and @dev is a bridge, do not recurse on the + * foreign interface again, it was already visited. + */ + if (foreign_dev_check_cb && !check_cb(lower_dev) && + !switchdev_lower_dev_find(lower_dev, check_cb, foreign_dev_check_cb)) + continue; + err = __switchdev_handle_port_obj_del(lower_dev, port_obj_info, - check_cb, del_cb); + check_cb, foreign_dev_check_cb, + del_cb); if (err && err != -EOPNOTSUPP) return err; } - return err; + /* Event is neither on a bridge nor a LAG. Check whether it is on an + * interface that is in a bridge with us. + */ + if (!foreign_dev_check_cb) + return err; + + br = netdev_master_upper_dev_get(dev); + if (!br || !netif_is_bridge_master(br)) + return err; + + if (!switchdev_lower_dev_find(br, check_cb, foreign_dev_check_cb)) + return err; + + return __switchdev_handle_port_obj_del(br, port_obj_info, check_cb, + foreign_dev_check_cb, del_cb); } +/* Pass through a port object deletion, if @dev passes @check_cb, or replicate + * it towards all lower interfaces of @dev that pass @check_cb, if @dev is a + * bridge or a LAG. + */ int switchdev_handle_port_obj_del(struct net_device *dev, struct switchdev_notifier_port_obj_info *port_obj_info, bool (*check_cb)(const struct net_device *dev), @@ -639,13 +741,35 @@ int switchdev_handle_port_obj_del(struct net_device *dev, int err; err = __switchdev_handle_port_obj_del(dev, port_obj_info, check_cb, - del_cb); + NULL, del_cb); if (err == -EOPNOTSUPP) err = 0; return err; } EXPORT_SYMBOL_GPL(switchdev_handle_port_obj_del); +/* Same as switchdev_handle_port_obj_del(), except if object is notified on a + * @dev that passes @foreign_dev_check_cb, it is replicated towards all devices + * that pass @check_cb and are in the same bridge as @dev. + */ +int switchdev_handle_port_obj_del_foreign(struct net_device *dev, + struct switchdev_notifier_port_obj_info *port_obj_info, + bool (*check_cb)(const struct net_device *dev), + bool (*foreign_dev_check_cb)(const struct net_device *dev, + const struct net_device *foreign_dev), + int (*del_cb)(struct net_device *dev, const void *ctx, + const struct switchdev_obj *obj)) +{ + int err; + + err = __switchdev_handle_port_obj_del(dev, port_obj_info, check_cb, + foreign_dev_check_cb, del_cb); + if (err == -EOPNOTSUPP) + err = 0; + return err; +} +EXPORT_SYMBOL_GPL(switchdev_handle_port_obj_del_foreign); + static int __switchdev_handle_port_attr_set(struct net_device *dev, struct switchdev_notifier_port_attr_info *port_attr_info, bool (*check_cb)(const struct net_device *dev), |