summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-12-08Merge branch 'net-track-the-queue-count-at-unregistration'Jakub Kicinski
Antoine Tenart says: ==================== net: track the queue count at unregistration Those two patches allow to track the Rx and Tx queue count at unregistration and help in detecting illegal addition of Tx queues after unregister (a warning is added). This follows discussions on the following thread, https://lore.kernel.org/all/20211122162007.303623-1-atenart@kernel.org/T/ A patch fixing one issue linked to this was merged ealier, https://lore.kernel.org/all/20211203101318.435618-1-atenart@kernel.org/T/ ==================== Link: https://lore.kernel.org/r/20211207145725.352657-1-atenart@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net-sysfs: warn if new queue objects are being created during device ↵Antoine Tenart
unregistration Calling netdev_queue_update_kobjects is allowed during device unregistration since commit 5c56580b74e5 ("net: Adjust TX queue kobjects if number of queues changes during unregister"). But this is solely to allow queue unregistrations. Any path attempting to add new queues after a device started its unregistration should be fixed. This patch adds a warning to detect such illegal use. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net-sysfs: update the queue counts in the unregistration pathAntoine Tenart
When updating Rx and Tx queue kobjects, the queue count should always be updated to match the queue kobjects count. This was not done in the net device unregistration path, fix it. Tracking all queue count updates will allow in a following up patch to detect illegal updates. Signed-off-by: Antoine Tenart <atenart@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: mvpp2: fix XDP rx queues registeringLouis Amas
The registration of XDP queue information is incorrect because the RX queue id we use is invalid. When port->id == 0 it appears to works as expected yet it's no longer the case when port->id != 0. The problem arised while using a recent kernel version on the MACCHIATOBin. This board has several ports: * eth0 and eth1 are 10Gbps interfaces ; both ports has port->id == 0; * eth2 is a 1Gbps interface with port->id != 0. Code from xdp-tutorial (more specifically advanced03-AF_XDP) was used to test packet capture and injection on all these interfaces. The XDP kernel was simplified to: SEC("xdp_sock") int xdp_sock_prog(struct xdp_md *ctx) { int index = ctx->rx_queue_index; /* A set entry here means that the correspnding queue_id * has an active AF_XDP socket bound to it. */ if (bpf_map_lookup_elem(&xsks_map, &index)) return bpf_redirect_map(&xsks_map, index, 0); return XDP_PASS; } Starting the program using: ./af_xdp_user -d DEV Gives the following result: * eth0 : ok * eth1 : ok * eth2 : no capture, no injection Investigating the issue shows that XDP rx queues for eth2 are wrong: XDP expects their id to be in the range [0..3] but we found them to be in the range [32..35]. Trying to force rx queue ids using: ./af_xdp_user -d eth2 -Q 32 fails as expected (we shall not have more than 4 queues). When we register the XDP rx queue information (using xdp_rxq_info_reg() in function mvpp2_rxq_init()) we tell it to use rxq->id as the queue id. This value is computed as: rxq->id = port->id * max_rxq_count + queue_id where max_rxq_count depends on the device version. In the MACCHIATOBin case, this value is 32, meaning that rx queues on eth2 are numbered from 32 to 35 - there are four of them. Clearly, this is not the per-port queue id that XDP is expecting: it wants a value in the range [0..3]. It shall directly use queue_id which is stored in rxq->logic_rxq -- so let's use that value instead. rxq->id is left untouched ; its value is indeed valid but it should not be used in this context. This is consistent with the remaining part of the code in mvpp2_rxq_init(). With this change, packet capture is working as expected on all the MACCHIATOBin ports. Fixes: b27db2274ba8 ("mvpp2: use page_pool allocator") Signed-off-by: Louis Amas <louis.amas@eho.link> Signed-off-by: Emmanuel Deloget <emmanuel.deloget@eho.link> Reviewed-by: Marcin Wojtas <mw@semihalf.com> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Link: https://lore.kernel.org/r/20211207143423.916334-1-louis.amas@eho.link Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge branch 'wwan-debugfs-tweaks'Jakub Kicinski
Sergey Ryazanov says: ==================== WWAN debugfs tweaks This is a follow-up series to just applied IOSM (and WWAN) debugfs interface support [1]. The series has two main goals: 1. move the driver-specific debugfs knobs to a subdirectory; 2. make the debugfs interface optional for both IOSM and for the WWAN core. As for the first part, I must say that it was my mistake. I suggested to place debugfs entries under a common per WWAN device directory. But I missed the driver subdirectory in the example, so it become: /sys/kernel/debugfs/wwan/wwan0/trace Since the traces collection is a driver-specific feature, it is better to keep it under the driver-specific subdirectory: /sys/kernel/debugfs/wwan/wwan0/iosm/trace It is desirable to be able to entirely disable the debugfs interface. It can be disabled for several reasons, including security and consumed storage space. See detailed rationale with usage example in the 4th patch. The changes themselves are relatively simple, but require a code rearrangement. So to make changes clear, I chose to split them into preparatory and main changes and properly describe each of them. IOSM part is compile-tested only since I do not have IOSM supported device, so it needs Ack from the driver developers. I would like to thank Johannes Berg and Leon Romanovsky. Their suggestions and comments helped a lot to rework the initial over-engineered solution to something less confusing and much more simple. Thanks! 1. https://lore.kernel.org/netdev/20211120162155.1216081-1-m.chetan.kumar@linux.intel.com 2. https://patchwork.kernel.org/project/netdevbpf/patch/20211204174033.950528-1-arnd@kernel.org/ ==================== Link: https://lore.kernel.org/r/20211207092140.19142-1-ryazanov.s.a@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: wwan: make debugfs optionalSergey Ryazanov
Debugfs interface is optional for the regular modem use. Some distros and users will want to disable this feature for security or kernel size reasons. So add a configuration option that allows to completely disable the debugfs interface of the WWAN devices. A primary considered use case for this option was embedded firmwares. For example, in OpenWrt, you can not completely disable debugfs, as a lot of wireless stuff can only be configured and monitored with the debugfs knobs. At the same time, reducing the size of a kernel and modules is an essential task in the world of embedded software. Disabling the WWAN and IOSM debugfs interfaces allows us to save 50K (x86-64 build) of space for module storage. Not much, but already considerable when you only have 16MB of storage. So it is hard to just disable whole debugfs. Users need some fine grained set of options to control which debugfs interface is important and should be available and which is not. The new configuration symbol is enabled by default and is hidden under the EXPERT option. So a regular user would not be bothered by another one configuration question. While an embedded distro maintainer will be able to a little more reduce the final image size. Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Acked-by: M Chetan Kumar <m.chetan.kumar@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: wwan: iosm: move debugfs knobs into a subdirSergey Ryazanov
The modem traces collection is a device (and so driver) specific option. Therefore, move the related debugfs files into a driver-specific subdirectory under the common per WWAN device directory. Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Acked-by: M Chetan Kumar <m.chetan.kumar@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: wwan: iosm: allow trace port be uninitializedSergey Ryazanov
Collecting modem firmware traces is optional for the regular modem use. There are not many reasons for aborting device initialization due to an inability to initialize the trace port and (or) its debugfs interface. So, demote the initialization failure erro message into a warning and do not break the initialization sequence in this case. Rework packet processing and deinitialization so that they do not crash in case of uninitialized trace port. This change is mainly a preparation for an upcoming configuration option introduction that will help disable driver debugfs functionality. Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Acked-by: M Chetan Kumar <m.chetan.kumar@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: wwan: iosm: consolidate trace port init codeSergey Ryazanov
Move the channel related structures initialization from ipc_imem_channel_init() to ipc_trace_init() and call it directly. On the one hand, this makes the trace port initialization symmetric to the deitialization, that is, it removes the additional wrapper. On the other hand, this change consolidates the trace port related code into a single source file, what facilitates an upcoming disabling of this functionality by a user choise. Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com> Reviewed-by: Loic Poulain <loic.poulain@linaro.org> Acked-by: M Chetan Kumar <m.chetan.kumar@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08vmxnet3: fix minimum vectors alloc issueRonak Doshi
'Commit 39f9895a00f4 ("vmxnet3: add support for 32 Tx/Rx queues")' added support for 32Tx/Rx queues. Within that patch, value of VMXNET3_LINUX_MIN_MSIX_VECT was updated. However, there is a case (numvcpus = 2) which actually requires 3 intrs which matches VMXNET3_LINUX_MIN_MSIX_VECT which then is treated as failure by stack to allocate more vectors. This patch fixes this issue. Fixes: 39f9895a00f4 ("vmxnet3: add support for 32 Tx/Rx queues") Signed-off-by: Ronak Doshi <doshir@vmware.com> Acked-by: Guolin Yang <gyang@vmware.com> Link: https://lore.kernel.org/r/20211207081737.14000-1-doshir@vmware.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net, neigh: clear whole pneigh_entry at alloc timeEric Dumazet
Commit 2c611ad97a82 ("net, neigh: Extend neigh->flags to 32 bit to allow for extensions") enables a new KMSAM warning [1] I think the bug is actually older, because the following intruction only occurred if ndm->ndm_flags had NTF_PROXY set. pn->flags = ndm->ndm_flags; Let's clear all pneigh_entry fields at alloc time. [1] BUG: KMSAN: uninit-value in pneigh_fill_info+0x986/0xb30 net/core/neighbour.c:2593 pneigh_fill_info+0x986/0xb30 net/core/neighbour.c:2593 pneigh_dump_table net/core/neighbour.c:2715 [inline] neigh_dump_info+0x1e3f/0x2c60 net/core/neighbour.c:2832 netlink_dump+0xaca/0x16a0 net/netlink/af_netlink.c:2265 __netlink_dump_start+0xd1c/0xee0 net/netlink/af_netlink.c:2370 netlink_dump_start include/linux/netlink.h:254 [inline] rtnetlink_rcv_msg+0x181b/0x18c0 net/core/rtnetlink.c:5534 netlink_rcv_skb+0x447/0x800 net/netlink/af_netlink.c:2491 rtnetlink_rcv+0x50/0x60 net/core/rtnetlink.c:5589 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline] netlink_unicast+0x1095/0x1360 net/netlink/af_netlink.c:1345 netlink_sendmsg+0x16f3/0x1870 net/netlink/af_netlink.c:1916 sock_sendmsg_nosec net/socket.c:704 [inline] sock_sendmsg net/socket.c:724 [inline] sock_write_iter+0x594/0x690 net/socket.c:1057 call_write_iter include/linux/fs.h:2162 [inline] new_sync_write fs/read_write.c:503 [inline] vfs_write+0x1318/0x2030 fs/read_write.c:590 ksys_write+0x28c/0x520 fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __x64_sys_write+0xdb/0x120 fs/read_write.c:652 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae Uninit was created at: slab_post_alloc_hook mm/slab.h:524 [inline] slab_alloc_node mm/slub.c:3251 [inline] slab_alloc mm/slub.c:3259 [inline] __kmalloc+0xc3c/0x12d0 mm/slub.c:4437 kmalloc include/linux/slab.h:595 [inline] pneigh_lookup+0x60f/0xd70 net/core/neighbour.c:766 arp_req_set_public net/ipv4/arp.c:1016 [inline] arp_req_set+0x430/0x10a0 net/ipv4/arp.c:1032 arp_ioctl+0x8d4/0xb60 net/ipv4/arp.c:1232 inet_ioctl+0x4ef/0x820 net/ipv4/af_inet.c:947 sock_do_ioctl net/socket.c:1118 [inline] sock_ioctl+0xa3f/0x13e0 net/socket.c:1235 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl+0x2df/0x4a0 fs/ioctl.c:860 __x64_sys_ioctl+0xd8/0x110 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x44/0xae CPU: 1 PID: 20001 Comm: syz-executor.0 Not tainted 5.16.0-rc3-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Fixes: 62dd93181aaa ("[IPV6] NDISC: Set per-entry is_router flag in Proxy NA.") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Roopa Prabhu <roopa@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20211206165329.1049835-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge tag 'linux-can-next-for-5.17-20211208' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== can-next 2021-12-08 The first patch is by Vincent Mailhol and replaces the custom CAN units with generic one form linux/units.h. The next 3 patches are by Evgeny Boger and add Allwinner R40 support to the sun4i CAN driver. Andy Shevchenko contributes 4 patches to the hi311x CAN driver, consisting of cleanups and converting the driver to the device property API. * tag 'linux-can-next-for-5.17-20211208' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: can: hi311x: hi3110_can_probe(): convert to use dev_err_probe() can: hi311x: hi3110_can_probe(): make use of device property API can: hi311x: hi3110_can_probe(): try to get crystal clock rate from property can: hi311x: hi3110_can_probe(): use devm_clk_get_optional() to get the input clock ARM: dts: sun8i: r40: add node for CAN controller can: sun4i_can: add support for R40 CAN controller dt-bindings: net: can: add support for Allwinner R40 CAN controller can: bittiming: replace CAN units with the generic ones from linux/units.h ==================== Link: https://lore.kernel.org/r/20211208125055.223141-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfJakub Kicinski
Pablo Neira Ayuso says: ==================== Netfilter fixes for net 1) Fix bogus compilter warning in nfnetlink_queue, from Florian Westphal. 2) Don't run conntrack on vrf with !dflt qdisc, from Nicolas Dichtel. 3) Fix nft_pipapo bucket load in AVX2 lookup routine for six 8-bit groups, from Stefano Brivio. 4) Break rule evaluation on malformed TCP options. 5) Use socat instead of nc in selftests/netfilter/nft_zones_many.sh, also from Florian 6) Fix KCSAN data-race in conntrack timeout updates, from Eric Dumazet. * git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf: netfilter: conntrack: annotate data-races around ct->timeout selftests: netfilter: switch zone stress to socat netfilter: nft_exthdr: break evaluation if setting TCP option fails selftests: netfilter: Add correctness test for mac,net set type nft_set_pipapo: Fix bucket load in AVX2 lookup routine for six 8-bit groups vrf: don't run conntrack on vrf with !dflt qdisc netfilter: nfnetlink_queue: silence bogus compiler warning ==================== Link: https://lore.kernel.org/r/20211209000847.102598-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2021-12-08 Yahui adds re-initialization of Flow Director for VF reset. Paul restores interrupts when enabling VFs. Dave re-adds bandwidth check for DCBNL and moves DSCP mode check earlier in the function. Jesse prevents reporting of dropped packets that occur during initialization and fixes reporting of statistics which could occur with frequent reads. Michal corrects setting of protocol type for UDP header and fixes lack of differentiation when adding filters for tunnels. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue: ice: safer stats processing ice: fix adding different tunnels ice: fix choosing UDP header type ice: ignore dropped packets during init ice: Fix problems with DSCP QoS implementation ice: rearm other interrupt cause register after enabling VFs ice: fix FDIR init missing when reset VF ==================== Link: https://lore.kernel.org/r/20211208211144.2629867-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski
Daniel Borkmann says: ==================== bpf 2021-12-08 We've added 12 non-merge commits during the last 22 day(s) which contain a total of 29 files changed, 659 insertions(+), 80 deletions(-). The main changes are: 1) Fix an off-by-two error in packet range markings and also add a batch of new tests for coverage of these corner cases, from Maxim Mikityanskiy. 2) Fix a compilation issue on MIPS JIT for R10000 CPUs, from Johan Almbladh. 3) Fix two functional regressions and a build warning related to BTF kfunc for modules, from Kumar Kartikeya Dwivedi. 4) Fix outdated code and docs regarding BPF's migrate_disable() use on non- PREEMPT_RT kernels, from Sebastian Andrzej Siewior. 5) Add missing includes in order to be able to detangle cgroup vs bpf header dependencies, from Jakub Kicinski. 6) Fix regression in BPF sockmap tests caused by missing detachment of progs from sockets when they are removed from the map, from John Fastabend. 7) Fix a missing "no previous prototype" warning in x86 JIT caused by BPF dispatcher, from Björn Töpel. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Add selftests to cover packet access corner cases bpf: Fix the off-by-two error in range markings treewide: Add missing includes masked by cgroup -> bpf dependency tools/resolve_btfids: Skip unresolved symbol warning for empty BTF sets bpf: Fix bpf_check_mod_kfunc_call for built-in modules bpf: Make CONFIG_DEBUG_INFO_BTF depend upon CONFIG_BPF_SYSCALL mips, bpf: Fix reference to non-existing Kconfig symbol bpf: Make sure bpf_disable_instrumentation() is safe vs preemption. Documentation/locking/locktypes: Update migrate_disable() bits. bpf, sockmap: Re-evaluate proto ops when psock is removed from sockmap bpf, sockmap: Attach map progs to psock early for feature probes bpf, x86: Fix "no previous prototype" warning ==================== Link: https://lore.kernel.org/r/20211208155125.11826-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: mv88e6xxx: fix "don't use PHY_DETECT on internal PHY's"Russell King (Oracle)
This commit fixes a misunderstanding in commit 4a3e0aeddf09 ("net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's"). For Marvell DSA switches with the PHY_DETECT bit (for non-6250 family devices), controls whether the PPU polls the PHY to retrieve the link, speed, duplex and pause status to update the port configuration. This applies for both internal and external PHYs. For some switches such as 88E6352 and 88E6390X, PHY_DETECT has an additional function of enabling auto-media mode between the internal PHY and SERDES blocks depending on which first gains link. The original intention of commit 5d5b231da7ac (net: dsa: mv88e6xxx: use PHY_DETECT in mac_link_up/mac_link_down) was to allow this bit to be used to detect when this propagation is enabled, and allow software to update the port configuration. This has found to be necessary for some switches which do not automatically propagate status from the SERDES to the port, which includes the 88E6390. However, commit 4a3e0aeddf09 ("net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's") breaks this assumption. Maarten Zanders has confirmed that the issue he was addressing was for an 88E6250 switch, which does not have a PHY_DETECT bit in bit 12, but instead a link status bit. Therefore, mv88e6xxx_port_ppu_updates() does not report correctly. This patch resolves the above issues by reverting Maarten's change and instead making mv88e6xxx_port_ppu_updates() indicate whether the port is internal for the 88E6250 family of switches. Yes, you're right, I'm targeting the 6250 family. And yes, your suggestion would solve my case and is a better implementation for the other devices (as far as I can see). Fixes: 4a3e0aeddf09 ("net: dsa: mv88e6xxx: don't use PHY_DETECT on internal PHY's") Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk> Tested-by: Maarten Zanders <maarten.zanders@mind.be> Link: https://lore.kernel.org/r/E1muXm7-00EwJB-7n@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08Merge branch 'rework-dsa-bridge-tx-forwarding-offload-api'Jakub Kicinski
Vladimir Oltean says: ==================== Rework DSA bridge TX forwarding offload API This change set is preparation work for DSA support of bridge FDB isolation. It replaces struct net_device *dp->bridge_dev with a struct dsa_bridge *dp->bridge that contains some extra information about that bridge, like a unique number kept by DSA. Up until now we computed that number only with the bridge TX forwarding offload feature, but it will be needed for other features too, like for isolation of FDB entries belonging to different bridges. Hardware implementations vary, but one common pattern seems to be the presence of a FID field which can be associated with that bridge number kept by DSA. The idea was outlined here: https://patchwork.kernel.org/project/netdevbpf/patch/20210818120150.892647-16-vladimir.oltean@nxp.com/ (the difference being that with this new proposal, drivers would not need to call dsa_bridge_num_find, instead the bridge_num would be part of the struct dsa_bridge :: num passed as argument). No functional change is intended for drivers that don't already make use of the bridge TX forwarding offload. I've tested the changes on the felix, sja1105 and mv88e6xxx drivers, but nonetheless I'm copying all DSA driver maintainers due to API changes that are taking place. Compared to v1 and v2, the amount of patches is larger, but the contents is mostly the same, just split up hopefully a bit better for review. ==================== Link: https://lore.kernel.org/r/20211206165758.1553882-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: eliminate dsa_switch_ops :: port_bridge_tx_fwd_{,un}offloadVladimir Oltean
We don't really need new switch API for these, and with new switches which intend to add support for this feature, it will become cumbersome to maintain. The change consists in restructuring the two drivers that implement this offload (sja1105 and mv88e6xxx) such that the offload is enabled and disabled from the ->port_bridge_{join,leave} methods instead of the old ->port_bridge_tx_fwd_{,un}offload. The only non-trivial change is that mv88e6xxx_map_virtual_bridge_to_pvt() has been moved to avoid a forward declaration, and the mv88e6xxx_reg_lock() calls from inside it have been removed, since locking is now done from mv88e6xxx_port_bridge_{join,leave}. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: add a "tx_fwd_offload" argument to ->port_bridge_joinVladimir Oltean
This is a preparation patch for the removal of the DSA switch methods ->port_bridge_tx_fwd_offload() and ->port_bridge_tx_fwd_unoffload(). The plan is for the switch to report whether it offloads TX forwarding directly as a response to the ->port_bridge_join() method. This change deals with the noisy portion of converting all existing function prototypes to take this new boolean pointer argument. The bool is placed in the cross-chip notifier structure for bridge join, and a reference to it is provided to drivers. In the next change, DSA will then actually look at this value instead of calling ->port_bridge_tx_fwd_offload(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: keep the bridge_dev and bridge_num as part of the same structureVladimir Oltean
The main desire behind this is to provide coherent bridge information to the fast path without locking. For example, right now we set dp->bridge_dev and dp->bridge_num from separate code paths, it is theoretically possible for a packet transmission to read these two port properties consecutively and find a bridge number which does not correspond with the bridge device. Another desire is to start passing more complex bridge information to dsa_switch_ops functions. For example, with FDB isolation, it is expected that drivers will need to be passed the bridge which requested an FDB/MDB entry to be offloaded, and along with that bridge_dev, the associated bridge_num should be passed too, in case the driver might want to implement an isolation scheme based on that number. We already pass the {bridge_dev, bridge_num} pair to the TX forwarding offload switch API, however we'd like to remove that and squash it into the basic bridge join/leave API. So that means we need to pass this pair to the bridge join/leave API. During dsa_port_bridge_leave, first we unset dp->bridge_dev, then we call the driver's .port_bridge_leave with what used to be our dp->bridge_dev, but provided as an argument. When bridge_dev and bridge_num get folded into a single structure, we need to preserve this behavior in dsa_port_bridge_leave: we need a copy of what used to be in dp->bridge. Switch drivers check bridge membership by comparing dp->bridge_dev with the provided bridge_dev, but now, if we provide the struct dsa_bridge as a pointer, they cannot keep comparing dp->bridge to the provided pointer, since this only points to an on-stack copy. To make this obvious and prevent driver writers from forgetting and doing stupid things, in this new API, the struct dsa_bridge is provided as a full structure (not very large, contains an int and a pointer) instead of a pointer. An explicit comparison function needs to be used to determine bridge membership: dsa_port_offloads_bridge(). Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: export bridging offload helpers to driversVladimir Oltean
Move the static inline helpers from net/dsa/dsa_priv.h to include/net/dsa.h, so that drivers can call functions such as dsa_port_offloads_bridge_dev(), which will be necessary after the transition to a more complex bridge structure. More functions than are needed right now are being moved, but this is done for uniformity. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: rename dsa_port_offloads_bridge to dsa_port_offloads_bridge_devVladimir Oltean
Currently the majority of dsa_port_bridge_dev_get() calls in drivers is just to check whether a port is under the bridge device provided as argument by the DSA API. We'd like to change that DSA API so that a more complex structure is provided as argument. To keep things more generic, and considering that the new complex structure will be provided by value and not by reference, direct comparisons between dp->bridge and the provided bridge will be broken. The generic way to do the checking would simply be to do something like dsa_port_offloads_bridge(dp, &bridge). But there's a problem, we already have a function named that way, which actually takes a bridge_dev net_device as argument. Rename it so that we can use dsa_port_offloads_bridge for something else. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: hide dp->bridge_dev and dp->bridge_num in drivers behind helpersVladimir Oltean
The location of the bridge device pointer and number is going to change. It is not going to be kept individually per port, but in a common structure allocated dynamically and which will have lockdep validation. Use the helpers to access these elements so that we have a migration path to the new organization. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: hide dp->bridge_dev and dp->bridge_num in the core behind helpersVladimir Oltean
The location of the bridge device pointer and number is going to change. It is not going to be kept individually per port, but in a common structure allocated dynamically and which will have lockdep validation. Create helpers to access these elements so that we have a migration path to the new organization. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: mv88e6xxx: compute port vlan membership based on dp->bridge_dev ↵Vladimir Oltean
comparison The goal of this change is to reduce mv88e6xxx_port_vlan() to a form where dsa_port_bridge_same() can be used, since the dp->bridge_dev pointer will be hidden in a future change. To do that, we observe that the "br" pointer is deduced from a dp->bridge_dev in both cases (of a physical switch port as well as a virtual bridge). So instead of keeping the "br" pointer, we can just keep the "dp" pointer from which "br" gets derived. In the last iteration over switch ports, we must use another iterator variable, "other_dp"since now we use the "dp" variable to keep an indirect reference to the bridge. While at it, the old code used to filter only the ports which were part of the same switch as "ds". There exists a dedicated DSA port iterator for that: dsa_switch_for_each_port (which skips the ports in the tree that belong to non-local switches), so we can just use that. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: mv88e6xxx: iterate using dsa_switch_for_each_user_port in ↵Vladimir Oltean
mv88e6xxx_port_check_hw_vlan Avoid a plethora of dsa_to_port() calls (some hidden behind dsa_is_*_port and some in plain sight) by keeping two struct dsa_port references: one to the port passed as argument, and another to the other ports of the switch that we're iterating over. This isn't called from the DSA initialization path, so there is no risk that we have user ports without a dp->slave populated. So the combined checks that a port isn't a DSA port, a CPU port, or doesn't have a slave net device (therefore is unused), are strictly equivalent to the simple check that the port is a user port. This is already handled by the DSA iterator. i gets replaced by other_dp->index, dsa_is_*_port calls get replaced by dsa_port_is_*, and dsa_to_port gets replaced by the respective pointer directly. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: mt7530: iterate using dsa_switch_for_each_user_port in bridging opsVladimir Oltean
Avoid repeated calls to dsa_to_port() (some hidden behind dsa_is_user_port and some in plain sight) by keeping two struct dsa_port references: one to the port passed as argument, and another to the other ports of the switch that we're iterating over. dsa_to_port(ds, i) gets replaced by other_dp, i gets replaced by other_port which is derived from other_dp->index, dsa_is_user_port is handled by the DSA iterator. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: assign a bridge number even without TX forwarding offloadVladimir Oltean
The service where DSA assigns a unique bridge number for each forwarding domain is useful even for drivers which do not implement the TX forwarding offload feature. For example, drivers might use the dp->bridge_num for FDB isolation. So rename ds->num_fwd_offloading_bridges to ds->max_num_bridges, and calculate a unique bridge_num for all drivers that set this value. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08net: dsa: make dp->bridge_num one-basedVladimir Oltean
I have seen too many bugs already due to the fact that we must encode an invalid dp->bridge_num as a negative value, because the natural tendency is to check that invalid value using (!dp->bridge_num). Latest example can be seen in commit 1bec0f05062c ("net: dsa: fix bridge_num not getting cleared after ports leaving the bridge"). Convert the existing users to assume that dp->bridge_num == 0 is the encoding for invalid, and valid bridge numbers start from 1. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-08perf/bpf_counter: Use bpf_map_create instead of bpf_create_mapSong Liu
bpf_create_map is deprecated. Replace it with bpf_map_create. Also add a __weak bpf_map_create() so that when older version of libbpf is linked as a shared library, it falls back to bpf_create_map(). Fixes: 992c4225419a ("libbpf: Unify low-level map creation APIs w/ new bpf_map_create()") Signed-off-by: Song Liu <song@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20211207232340.2561471-1-song@kernel.org
2021-12-08ice: safer stats processingJesse Brandeburg
The driver was zeroing live stats that could be fetched by ndo_get_stats64 at any time. This could result in inconsistent statistics, and the telltale sign was when reading stats frequently from /proc/net/dev, the stats would go backwards. Fix by collecting stats into a local, and delaying when we write to the structure so it's not incremental. Fixes: fcea6f3da546 ("ice: Add stats and ethtool support") Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Tested-by: Gurucharan G <gurucharanx.g@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2021-12-08HID: Ignore battery for Elan touchscreen on Asus UX550VEHans de Goede
Battery status is reported for the Asus UX550VE touchscreen even though it does not have a battery. Prevent it from always reporting the battery as low. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1897823 Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2021-12-08bpf: Add selftests to cover packet access corner casesMaxim Mikityanskiy
This commit adds BPF verifier selftests that cover all corner cases by packet boundary checks. Specifically, 8-byte packet reads are tested at the beginning of data and at the beginning of data_meta, using all kinds of boundary checks (all comparison operators: <, >, <=, >=; both permutations of operands: data + length compared to end, end compared to data + length). For each case there are three tests: 1. Length is just enough for an 8-byte read. Length is either 7 or 8, depending on the comparison. 2. Length is increased by 1 - should still pass the verifier. These cases are useful, because they failed before commit 2fa7d94afc1a ("bpf: Fix the off-by-two error in range markings"). 3. Length is decreased by 1 - should be rejected by the verifier. Some existing tests are just renamed to avoid duplication. Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20211207081521.41923-1-maximmi@nvidia.com
2021-12-08can: hi311x: hi3110_can_probe(): convert to use dev_err_probe()Andy Shevchenko
When deferred the reason is saved for further debugging. Besides that, it's fine to call dev_err_probe() in ->probe() when error code is known. Convert the driver to use dev_err_probe(). Link: https://lore.kernel.org/all/20211206165542.69887-4-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-08can: hi311x: hi3110_can_probe(): make use of device property APIAndy Shevchenko
Make use of device property API in this driver so that both OF based system and ACPI based system can use this driver. Link: https://lore.kernel.org/all/20211206165542.69887-3-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-08can: hi311x: hi3110_can_probe(): try to get crystal clock rate from propertyAndy Shevchenko
In some configurations, mainly ACPI-based, the clock frequency of the device is supplied by very well established 'clock-frequency' property. Hence, try to get it from the property at last if no other providers are available. Link: https://lore.kernel.org/all/20211206165542.69887-2-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-08can: hi311x: hi3110_can_probe(): use devm_clk_get_optional() to get the ↵Andy Shevchenko
input clock It's not clear what was the intention of redundant usage of IS_ERR() around the clock pointer since with the error check of devm_clk_get() followed by bailout it can't be invalid, Simplify the code which fetches the input clock by using devm_clk_get_optional(). It will allow to switch to device properties approach in the future. Link: https://lore.kernel.org/all/20211206165542.69887-1-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-08ARM: dts: sun8i: r40: add node for CAN controllerEvgeny Boger
Allwinner R40 (also known as A40i, T3, V40) has a CAN controller. The controller is the same as in earlier A10 and A20 SoCs, but needs reset line to be deasserted before use. This patch adds a CAN node and the corresponding pinctrl descriptions. Link: https://lore.kernel.org/all/20211122104616.537156-4-boger@wirenboard.com Signed-off-by: Evgeny Boger <boger@wirenboard.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-08can: sun4i_can: add support for R40 CAN controllerEvgeny Boger
Allwinner R40 (also known as A40i, T3, V40) has a CAN controller. The controller is the same as in earlier A10 and A20 SoCs, but needs reset line to be deasserted before use. This patch adds a new compatible for R40 CAN controller. Depending on the compatible, reset line can be requested from DT. Link: https://lore.kernel.org/all/20211122104616.537156-3-boger@wirenboard.com Signed-off-by: Evgeny Boger <boger@wirenboard.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-08dt-bindings: net: can: add support for Allwinner R40 CAN controllerEvgeny Boger
Allwinner R40 (also known as A40i, T3, V40) has a CAN controller. The controller is the same as in earlier A10 and A20 SoCs, but needs reset line to be deasserted before use. This patch Introduces new compatible for R40 CAN controller with required resets property. Link: https://lore.kernel.org/all/20211122104616.537156-2-boger@wirenboard.com Signed-off-by: Evgeny Boger <boger@wirenboard.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-08can: bittiming: replace CAN units with the generic ones from linux/units.hVincent Mailhol
In [1], we introduced a set of units in linux/can/bittiming.h. Since then, generic SI prefixes were added to linux/units.h in [2]. Those new prefixes can perfectly replace CAN specific ones. This patch replaces all occurrences of the CAN units with their corresponding prefix (from linux/units) and the unit (as a comment) according to below table. CAN units SI metric prefix (from linux/units) + unit (as a comment) ------------------------------------------------------------------------ CAN_KBPS KILO /* BPS */ CAN_MBPS MEGA /* BPS */ CAM_MHZ MEGA /* Hz */ The definition are then removed from linux/can/bittiming.h [1] commit 1d7750760b70 ("can: bittiming: add CAN_KBPS, CAN_MBPS and CAN_MHZ macros") [2] commit 26471d4a6cf8 ("units: Add SI metric prefix definitions") Link: https://lore.kernel.org/all/20211124014536.782550-1-mailhol.vincent@wanadoo.fr Suggested-by: Jimmy Assarsson <extja@kvaser.com> Suggested-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2021-12-07Merge branch 's390-net-updates-2021-12-06'Jakub Kicinski
Alexandra Winter says: ==================== s390/net: updates 2021-12-06 This brings some maintenance improvements and removes some unnecessary code checks. ==================== Link: https://lore.kernel.org/r/20211207090452.1155688-1-wintera@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07s390/qeth: remove check for packing mode in qeth_check_outbound_queue()Julian Wiedmann
If qeth_check_outbound_queue() finds a partially filled TX buffer on the queue and flushes it, then the queue _must_ have been in packing mode. Remove the redundant check when updating the relevant statistics. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07s390/qeth: fine-tune .ndo_select_queue()Julian Wiedmann
Avoid a conditional branch for L2 devices when selecting the TX queue, and have shared logic for OSA devices. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07s390/qeth: don't offer .ndo_bridge_* ops for OSA devicesJulian Wiedmann
qeth_l2_detect_dev2br_support() will only set brport_hw_features for IQD devices. So qeth_l2_bridge_getlink() and qeth_l2_bridge_setlink() will always return -EOPNOTSUPP on OSA devices. Just don't offer these callbacks instead. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07s390/qeth: split up L2 netdev_opsJulian Wiedmann
Splitting up the netdev_ops allows for fine-tuning some of the ndo's in subsequent patches. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07s390/qeth: simplify qeth_receive_skb()Julian Wiedmann
Now that the OSN code is gone, we don't need the second switch statement in the RX path. And getting rid of its (unreachable) default case is a nice simplification. Also don't pass in the full HW header, all we still need is a flag to indicate whether the skb can use CSO. This we can already obtain during the first peek at the HW header. Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com> Reviewed-by: Alexandra Winter <wintera@linux.ibm.com> Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com> Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07hv_sock: Extract hvs_send_data() helper that takes only headerKees Cook
When building under -Warray-bounds, the compiler is especially conservative when faced with casts from a smaller object to a larger object. While this has found many real bugs, there are some cases that are currently false positives (like here). With this as one of the last few instances of the warning in the kernel before -Warray-bounds can be enabled globally, rearrange the functions so that there is a header-only version of hvs_send_data(). Silences this warning: net/vmw_vsock/hyperv_transport.c: In function 'hvs_shutdown_lock_held.constprop': net/vmw_vsock/hyperv_transport.c:231:32: warning: array subscript 'struct hvs_send_buf[0]' is partly outside array bounds of 'struct vmpipe_proto_header[1]' [-Warray-bounds] 231 | send_buf->hdr.pkt_type = 1; | ~~~~~~~~~~~~~~~~~~~~~~~^~~ net/vmw_vsock/hyperv_transport.c:465:36: note: while referencing 'hdr' 465 | struct vmpipe_proto_header hdr; | ^~~ This change results in no executable instruction differences. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20211207063217.2591451-1-keescook@chromium.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07net: dsa: felix: use kmemdup() to replace kmalloc + memcpyYihao Han
Fix following coccicheck warning: /drivers/net/dsa/ocelot/felix_vsc9959.c:1627:13-20: WARNING opportunity for kmemdup /drivers/net/dsa/ocelot/felix_vsc9959.c:1506:16-23: WARNING opportunity for kmemdup Signed-off-by: Yihao Han <hanyihao@vivo.com> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://lore.kernel.org/r/20211207064419.38632-1-hanyihao@vivo.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-07Merge branch 'prepare-ocelot-for-external-interface-control'Jakub Kicinski
Colin Foster says: ==================== prepare ocelot for external interface control This patch set is derived from an attempt to include external control for a VSC751[1234] chip via SPI. That patch set has grown large and is getting unwieldy for reviewers and the developers... me. I'm breaking out the changes from that patch set. Some are trivial net: dsa: ocelot: remove unnecessary pci_bar variables net: dsa: ocelot: felix: Remove requirement for PCS in felix devices some are required for SPI net: dsa: ocelot: felix: add interface for custom regmaps and some are just to expose code to be shared net: mscc: ocelot: split register definitions to a separate file The entirety of this patch set should have essentially no impact on the system performance. ==================== Link: https://lore.kernel.org/r/20211207170030.1406601-1-colin.foster@in-advantage.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>