path: root/net
Age | Commit message | Author
2020-07-28 | svcrdma: CM event handler clean up | Chuck Lever
Now that there's a core tracepoint that reports these events, there's no need to maintain dprintk() call sites in each arm of the switch statements. We also refresh the documenting comments. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-28 | svcrdma: Remove transport reference counting | Chuck Lever
Jason tells me that a ULP cannot rely on getting an ESTABLISHED and DISCONNECTED event pair for each connection, so transport reference counting in the CM event handler will never be reliable. Now that we have ib_drain_qp(), svcrdma should no longer need to hold transport references while Sends and Receives are posted. So remove the get/put call sites in the CM event handlers. This eliminates a significant source of locked memory bus traffic. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-28 | svcrdma: Fix another Receive buffer leak | Chuck Lever
During a connection tear down, the Receive queue is flushed before the device resources are freed. Typically, all the Receives flush with IB_WC_WR_FLUSH_ERR. However, any pending successful Receives flush with IB_WC_SUCCESS, and the server automatically posts a fresh Receive to replace the completing one. This happens even after the connection has closed and the RQ is drained. Receives that are posted after the RQ is drained appear never to complete, causing a Receive resource leak. The leaked Receive buffer is left DMA-mapped. To prevent these late-posted recv_ctxt's from leaking, block new Receive posting after XPT_CLOSE is set. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2020-07-28 | xdp: Prevent kernel-infoleak in xsk_getsockopt() | Peilin Ye
xsk_getsockopt() is copying uninitialized stack memory to userspace when 'extra_stats' is 'false'. Fix it.

Doing '= {};' is sufficient since currently 'struct xdp_statistics' is defined as follows:

    struct xdp_statistics {
        __u64 rx_dropped;
        __u64 rx_invalid_descs;
        __u64 tx_invalid_descs;
        __u64 rx_ring_full;
        __u64 rx_fill_ring_empty_descs;
        __u64 tx_ring_empty_descs;
    };

When being copied to the userspace, 'stats' will not contain any uninitialized 'holes' between struct fields.

Fixes: 8aa5a33578e9 ("xsk: Add new statistics")
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Acked-by: Song Liu <songliubraving@fb.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/bpf/20200728053604.404631-1-yepeilin.cs@gmail.com
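A minimal user-space sketch of the pattern described above, not the kernel code itself: the struct mirrors the six-counter layout quoted in the commit message, while `xdp_statistics_sketch` and `fill_stats_sketch()` are hypothetical names. It uses `= {0}` (the portable spelling of the zero initializer) so that the counters skipped when `extra_stats` is false are copied out as zeros rather than stale stack contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same shape as the struct quoted in the commit: six 64-bit fields, no padding. */
struct xdp_statistics_sketch {
	uint64_t rx_dropped;
	uint64_t rx_invalid_descs;
	uint64_t tx_invalid_descs;
	uint64_t rx_ring_full;
	uint64_t rx_fill_ring_empty_descs;
	uint64_t tx_ring_empty_descs;
};

/* Hypothetical stand-in for the getsockopt path; memcpy() plays copy_to_user(). */
size_t fill_stats_sketch(void *out, int extra_stats)
{
	/* Zero-initialize so fields we never assign are not stack garbage. */
	struct xdp_statistics_sketch stats = {0};

	assert(out != NULL);
	stats.rx_dropped = 1;
	stats.rx_invalid_descs = 2;
	stats.tx_invalid_descs = 3;
	if (extra_stats) {
		/* Only the extended path fills the last three counters. */
		stats.rx_ring_full = 4;
		stats.rx_fill_ring_empty_descs = 5;
		stats.tx_ring_empty_descs = 6;
	}
	memcpy(out, &stats, sizeof(stats));
	return sizeof(stats);
}
```

Because every member is a 64-bit integer, there is no padding between fields, which is why initializing the members is enough here; a struct with padding holes would need an explicit memset() of the whole object instead.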
2020-07-28 | Bluetooth: Return NOTIFY_DONE for hci_suspend_notifier | Max Chou
The original return value is NOTIFY_STOP, but that makes notifier_call_chain() stop calling any further notifiers registered through register_pm_notifier(), even those registered by other kernel modules with the same priority (zero). Signed-off-by: Max Chou <max.chou@realtek.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
2020-07-28 | Bluetooth: btusb: Fix and detect most of the Chinese Bluetooth controllers | Ismael Ferreras Morezuelas
For some reason they tend to squat on the very first CSR/ Cambridge Silicon Radio VID/PID instead of paying fees. This is an extremely common problem; the issue goes as far back as 2013 and these devices are only getting more popular, even rebranded by reputable vendors and sold by retailers everywhere. So, at this point in time there are hundreds of modern dongles reusing the ID of what originally was an early Bluetooth 1.1 controller. Linux is the only place where they don't work due to spotty checks in our detection code. It only covered a minimum subset. So what's the big idea? Take advantage of the fact that all CSR chips report the same internal version as both the LMP sub-version and HCI revision number. It always matches; couple that with the manufacturer code, which rarely lies, and we now have a good idea of who is who. Additionally, by compiling a list of user-reported HCI/lsusb dumps, and searching around for legit CSR dongles in similar product ranges we can find what CSR BlueCore firmware supported which Bluetooth versions. That way we can narrow down ranges of fakes for each of them. e.g. Real CSR dongles with LMP subversion 0x73 are old enough that they support BT 1.1 only; so it's a dead giveaway when some third-party BT 4.0 dongle reuses it. So, to sum things up: there are multiple classes of fake controllers reusing the same 0A12:0001 VID/PID. This has been broken for a while. Known 'fake' bcdDevices: 0x0100, 0x0134, 0x1915, 0x2520, 0x7558, 0x8891 IC markings on 0x7558: FR3191AHAL 749H15143 (???)
https://bugzilla.kernel.org/show_bug.cgi?id=60824 Fixes: 81cac64ba258ae (Deal with USB devices that are faking CSR vendor) Reported-by: Michał Wiśniewski <brylozketrzyn@gmail.com> Tested-by: Mike Johnson <yuyuyak@gmail.com> Tested-by: Ricardo Rodrigues <ekatonb@gmail.com> Tested-by: M.Hanny Sabbagh <mhsabbagh@outlook.com> Tested-by: Oussama BEN BRAHIM <b.brahim.oussama@gmail.com> Tested-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com> Signed-off-by: Ismael Ferreras Morezuelas <swyterzone@gmail.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
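The detection idea in this message can be sketched roughly as follows. This is an illustration under stated assumptions, not the btusb code: the struct, the function name and the single 0x73 rule are hypothetical, the manufacturer value 10 is the Bluetooth SIG company identifier for Cambridge Silicon Radio, and the HCI version codes (0x01 for Bluetooth 1.1, 0x06 for 4.0) follow the Bluetooth assigned numbers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CSR_MANUFACTURER_SKETCH 10   /* company identifier for CSR */
#define HCI_VER_BT_1_1_SKETCH   0x01 /* HCI version code for Bluetooth 1.1 */

/* Hypothetical subset of the "read local version" reply. */
struct local_version_sketch {
	uint16_t manufacturer;
	uint16_t hci_rev;
	uint16_t lmp_subver;
	uint8_t  hci_ver;
};

/* Returns true when the reply is consistent with a genuine CSR chip. */
bool looks_like_genuine_csr_sketch(const struct local_version_sketch *v)
{
	assert(v != NULL);

	/* The manufacturer code "rarely lies". */
	if (v->manufacturer != CSR_MANUFACTURER_SKETCH)
		return false;

	/* Genuine CSR chips report the same value in both fields. */
	if (v->hci_rev != v->lmp_subver)
		return false;

	/* Example from the message: firmware 0x73 only ever spoke BT 1.1. */
	if (v->lmp_subver == 0x73 && v->hci_ver > HCI_VER_BT_1_1_SKETCH)
		return false;

	return true;
}
```

A real implementation would carry a table of firmware ranges and the Bluetooth versions each supported; only the one example quoted in the message is encoded here.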
2020-07-28 | xfrm: esp6: fix the location of the transport header with encapsulation | Sabrina Dubroca
commit 17175d1a27c6 ("xfrm: esp6: fix encapsulation header offset computation") changed esp6_input_done2 to correctly find the size of the IPv6 header that precedes the TCP/UDP encapsulation header, but didn't adjust the final call to skb_set_transport_header, which I assumed was correct in using skb_network_header_len. Xiumei Mu reported that when we create xfrm states that include port numbers in the selector, traffic from the user sockets is dropped. It turns out that we get a state mismatch in __xfrm_policy_check, because we end up trying to compare the encapsulation header's ports with the selector that's based on user traffic ports. Fixes: 0146dca70b87 ("xfrm: add support for UDPv6 encapsulation of ESP") Fixes: 26333c37fc28 ("xfrm: add IPv6 support for espintcp") Reported-by: Xiumei Mu <xmu@redhat.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2020-07-27 | fix a braino in cmsghdr_from_user_compat_to_kern() | Al Viro
commit 547ce4cfb34c ("switch cmsghdr_from_user_compat_to_kern() to copy_from_user()") missed one of the places where ucmlen should've been replaced with cmsg.cmsg_len, now that we are fetching the entire struct rather than doing it field-by-field. As a result, compat sendmsg() with several different-sized cmsgs attached started to fail with EINVAL. Trivial to fix, fortunately. Fixes: 547ce4cfb34c ("switch cmsghdr_from_user_compat_to_kern() to copy_from_user()") Reported-by: Nick Bowler <nbowler@draconx.ca> Tested-by: Nick Bowler <nbowler@draconx.ca> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 | net: prp: enhance debugfs to display PRP info | Murali Karicheri
Print PRP specific information from node table as part of debugfs node table display. Also display the node as DAN-H or DAN-P depending on the info from node table. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 | net: prp: add packet handling support | Murali Karicheri
DAN-P (Dual Attached Nodes PRP) nodes are expected to receive traditional IP packets as well as PRP (Parallel Redundancy Protocol) tagged (trailer) packets. The PRP trailer is a 6-byte protocol unit called the Redundancy Control Trailer (RCT), similar to the HSR tag. A PRP network can have traditional devices such as bridges/switches or PCs attached to it, and these should be able to communicate. Regular Ethernet devices treat the RCT as pads. This patch adds logic to format L2 frames from the network stack to add a trailer (RCT) and send them as duplicates over the slave interfaces when the protocol is PRP as per IEC 62439-3. At the ingress, it strips the trailer, performs duplicate detection and rejection, and forwards the stripped frame up the network stack. A PRP device should accept frames from Singly Attached Nodes (SAN), and thus the driver marks the link where the frame came from in the node table. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
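To make the 6-byte trailer concrete, here is a small user-space sketch of appending an RCT to a frame buffer. The field layout used (16-bit sequence number, 4-bit LAN identifier plus 12-bit LSDU size, 16-bit suffix 0x88FB, all big-endian) is my reading of IEC 62439-3 and is an assumption as far as this log is concerned; the function and macro names are hypothetical and the caller supplies the LSDU size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PRP_RCT_LEN_SKETCH 6      /* trailer size stated in the commit */
#define PRP_SUFFIX_SKETCH  0x88FB /* assumed PRP suffix value */

/*
 * Append an RCT after the first `len` bytes of `frame`, which can hold
 * `cap` bytes in total. Returns the new frame length, or 0 if the
 * trailer does not fit.
 */
size_t prp_append_rct_sketch(uint8_t *frame, size_t len, size_t cap,
			     uint16_t seq, uint8_t lan_id, uint16_t lsdu_size)
{
	uint16_t lan_and_size;
	uint8_t *rct;

	assert(frame != NULL);
	if (cap < len || cap - len < PRP_RCT_LEN_SKETCH)
		return 0;

	/* Upper 4 bits: LAN identifier; lower 12 bits: LSDU size. */
	lan_and_size = (uint16_t)(((lan_id & 0xF) << 12) | (lsdu_size & 0x0FFF));

	rct = frame + len;
	rct[0] = (uint8_t)(seq >> 8);
	rct[1] = (uint8_t)(seq & 0xFF);
	rct[2] = (uint8_t)(lan_and_size >> 8);
	rct[3] = (uint8_t)(lan_and_size & 0xFF);
	rct[4] = (uint8_t)(PRP_SUFFIX_SKETCH >> 8);
	rct[5] = (uint8_t)(PRP_SUFFIX_SKETCH & 0xFF);

	return len + PRP_RCT_LEN_SKETCH;
}
```

Because the unit sits at the end of the frame, a device that knows nothing about PRP simply sees six extra pad bytes, which is the interoperability property the message describes.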
2020-07-27 | net: hsr: define and use proto_ops ptrs to handle hsr specific frames | Murali Karicheri
As a preparatory patch to introduce PRP, refactor the code specific to handling HSR frames into separate functions and call them through proto_ops function pointers. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 | net: prp: add supervision frame generation utility function | Murali Karicheri
Add support for generation of PRP supervision frames. For PRP, the supervision frame format is similar to HSR version 0, but has a PRP Redundancy Control Trailer (RCT) added and uses a different message type, PRP_TLV_LIFE_CHECK_DD. Also update is_supervision_frame() to include the new message type used for PRP supervision frames. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 | net: hsr: introduce protocol specific function pointers | Murali Karicheri
As a preparatory patch to introduce support for PRP protocol, add a protocol ops ptr in the private hsr structure to hold function pointers, as some of the protocol-level packet handling functions differ between HSR and PRP. It is expected that PRP will add its own set of functions for protocol handling. Sending a supervision frame is expected to be different for PRP, so introduce an ops function ptr, send_sv_frame(), initialize it to send_hsr_supervision_frame(), and modify the existing hsr_announce() function to call proto_ops->send_sv_frame() to send the supervision frame for HSR. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
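The dispatch pattern described here can be sketched in plain C as below. Only the member name `send_sv_frame` and the idea of a `proto_ops` pointer come from the message; the struct layouts, the counters and every `_sketch` name are hypothetical, chosen so the example can run outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

struct hsr_priv_sketch;

/* Table of protocol-specific operations, one instance per protocol. */
struct hsr_proto_ops_sketch {
	void (*send_sv_frame)(struct hsr_priv_sketch *priv);
};

/* Stand-in for the driver's private structure. */
struct hsr_priv_sketch {
	const struct hsr_proto_ops_sketch *proto_ops;
	int hsr_sv_sent; /* counters used only to observe the dispatch */
	int prp_sv_sent;
};

void send_hsr_sv_sketch(struct hsr_priv_sketch *priv)
{
	priv->hsr_sv_sent++;
}

void send_prp_sv_sketch(struct hsr_priv_sketch *priv)
{
	priv->prp_sv_sent++;
}

const struct hsr_proto_ops_sketch hsr_ops_sketch = {
	.send_sv_frame = send_hsr_sv_sketch,
};

const struct hsr_proto_ops_sketch prp_ops_sketch = {
	.send_sv_frame = send_prp_sv_sketch,
};

/* The announce path no longer knows which protocol it is serving. */
void announce_sketch(struct hsr_priv_sketch *priv)
{
	assert(priv != NULL && priv->proto_ops != NULL);
	assert(priv->proto_ops->send_sv_frame != NULL);
	priv->proto_ops->send_sv_frame(priv);
}
```

Selecting the protocol then amounts to pointing `proto_ops` at the right table when the device is created, which keeps the common announce code free of per-protocol branches.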
2020-07-27 | net: hsr: introduce common code for skb initialization | Murali Karicheri
As a preparatory patch to introduce PRP protocol support in the driver, refactor the skb init code to a separate function. Signed-off-by: Murali Karicheri <m-karicheri2@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 | hsr: enhance netlink socket interface to support PRP | Murali Karicheri
Parallel Redundancy Protocol (PRP) is another redundancy protocol introduced by the IEC 62439-3 standard. It is similar to HSR in many aspects:

- Uses a pair of Ethernet interfaces to create the PRP device.
- Uses a 6-byte redundancy protocol part (RCT, Redundancy Control Trailer) similar to the HSR tag.
- Has a Link Redundancy Entity (LRE) that works with the RCT to implement redundancy.

The key difference is that the protocol unit is a trailer instead of a prefix as in HSR. That makes it interoperable with traditional network components such as bridges/switches, which treat it as pad bytes, whereas HSR nodes require some kind of translator (called a redbox) to talk to regular network devices. This feature allows a regular Linux box to be converted to a DAN-P box. DAN-P stands for Dual Attached Node - PRP, similar to DAN-H (Dual Attached Node - HSR).

Add a comment in the header/source code to explicitly state that the driver files also handle the PRP protocol.

Signed-off-by: Murali Karicheri <m-karicheri2@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 | mptcp: fix joined subflows with unblocking sk | Matthieu Baerts
Unblocking sockets used for outgoing connections were not containing inet info about the initial connection due to a typo there: the value of "err" variable is negative in the kernelspace. This fixes the creation of additional subflows where the remote port has to be reused if the other host didn't announce another one. This also fixes inet_diag showing blank info about MPTCP sockets from unblocking sockets doing a connect(). Fixes: 41be81a8d3d0 ("mptcp: fix unblocking connect()") Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net> Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
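The sign convention behind this typo can be illustrated with a short sketch. It assumes, as is the kernel convention, that in-kernel functions report errors as negative errno values; the two function names are hypothetical and only stand in for the connect path and the check that decides whether to copy the inet info.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical connect helper using the kernel convention: 0 or -errno. */
int connect_sketch(int nonblocking)
{
	return nonblocking ? -EINPROGRESS : 0;
}

/*
 * A non-blocking connect that is still in progress is not a failure,
 * so the inet info should be copied for both 0 and -EINPROGRESS.
 * Comparing against the positive EINPROGRESS would never match.
 */
int should_copy_inet_info_sketch(int err)
{
	return err == 0 || err == -EINPROGRESS;
}
```

With a positive `EINPROGRESS` in the comparison, only the blocking case (return value 0) would pass the check, which matches the symptom described for unblocking sockets.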
2020-07-27 | net: Removed the device type check to add mpls support for devices | Martin Varghese
MPLS has no dependency with the device type of underlying devices. Hence the device type check to add mpls support for devices can be avoided. Signed-off-by: Martin Varghese <martin.varghese@nokia.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 | ipmr: Copy option to correct variable | Ido Schimmel
Cited commit mistakenly copied provided option to 'val' instead of to 'mfc':

```
-	if (copy_from_user(&mfc, optval, sizeof(mfc))) {
+	if (copy_from_sockptr(&val, optval, sizeof(val))) {
```

Fix this by copying the option to 'mfc'.

selftest router_multicast.sh before:

    $ ./router_multicast.sh
    smcroutectl: Unknown or malformed IPC message 'a' from client.
    smcroutectl: failed removing multicast route, does not exist.
    TEST: mcast IPv4 [FAIL]
        Multicast not received on first host
    TEST: mcast IPv6 [ OK ]
    smcroutectl: Unknown or malformed IPC message 'a' from client.
    smcroutectl: failed removing multicast route, does not exist.
    TEST: RPF IPv4 [FAIL]
        Multicast not received on first host
    TEST: RPF IPv6 [ OK ]

selftest router_multicast.sh after:

    $ ./router_multicast.sh
    TEST: mcast IPv4 [ OK ]
    TEST: mcast IPv6 [ OK ]
    TEST: RPF IPv4 [ OK ]
    TEST: RPF IPv6 [ OK ]

Fixes: 01ccb5b48f08 ("net/ipv4: switch ip_mroute_setsockopt to sockptr_t")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 | net/smc: unique reason code for exceeded max dmb count | Karsten Graul
When the maximum dmb buffer limit for an ism device is reached, no more dmb buffers can be registered. When this happens the reason code is set to SMC_CLC_DECL_MEM, indicating out-of-memory. This is the same reason code that is used when no memory could be allocated for the new dmb buffer. This is confusing for users when they see this error but there is more memory available. To solve this, set a separate new reason code when the maximum dmb limit is exceeded. Reviewed-by: Ursula Braun <ubraun@linux.ibm.com> Signed-off-by: Karsten Graul <kgraul@linux.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25 | bpf, xdp: Remove XDP_QUERY_PROG and XDP_QUERY_PROG_HW XDP commands | Andrii Nakryiko
Now that BPF program/link management is centralized in generic net_device code, kernel code never queries program id from drivers, so XDP_QUERY_PROG/XDP_QUERY_PROG_HW commands are unnecessary. This patch removes all the implementations of those commands in kernel, along with xdp_attachment_query(). This patch was compile-tested on allyesconfig. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200722064603.3350758-10-andriin@fb.com
2020-07-25 | bpf: Implement BPF XDP link-specific introspection APIs | Andrii Nakryiko
Implement XDP link-specific show_fdinfo and link_info to emit ifindex. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200722064603.3350758-7-andriin@fb.com
2020-07-25 | bpf, xdp: Implement LINK_UPDATE for BPF XDP link | Andrii Nakryiko
Add support for LINK_UPDATE command for BPF XDP link to enable reliable replacement of underlying BPF program. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200722064603.3350758-6-andriin@fb.com
2020-07-25 | bpf, xdp: Add bpf_link-based XDP attachment API | Andrii Nakryiko
Add bpf_link-based API (bpf_xdp_link) to attach BPF XDP program through BPF_LINK_CREATE command. bpf_xdp_link is mutually exclusive with direct BPF program attachment, previous BPF program should be detached prior to attempting to create a new bpf_xdp_link attachment (for a given XDP mode). Once BPF link is attached, it can't be replaced by other BPF program attachment or link attachment. It will be detached only when the last BPF link FD is closed. bpf_xdp_link will be auto-detached when net_device is shutdown, similarly to how other BPF links behave (cgroup, flow_dissector). At that point bpf_link will become defunct, but won't be destroyed until last FD is closed. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200722064603.3350758-5-andriin@fb.com
2020-07-25 | bpf, xdp: Extract common XDP program attachment logic | Andrii Nakryiko
Further refactor XDP attachment code. dev_change_xdp_fd() is split into two parts: getting bpf_progs from FDs and attachment logic, working with bpf_progs. This makes attachment logic a bit more straightforward and prepares code for bpf_xdp_link inclusion, which will share the common logic. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200722064603.3350758-4-andriin@fb.com
2020-07-25 | bpf, xdp: Maintain info on attached XDP BPF programs in net_device | Andrii Nakryiko
Instead of delegating to drivers, maintain information about which BPF programs are attached in which XDP modes (generic/skb, driver, or hardware) locally in net_device. This effectively obsoletes the XDP_QUERY_PROG command. Such re-organization already simplifies the existing code. But it also makes it possible to add bpf_link-based XDP attachments without drivers having to know about any of this at all, which seems like a good setup. XDP_SETUP_PROG/XDP_SETUP_PROG_HW are just low-level commands to the driver to install/uninstall the active BPF program. All the higher-level concerns about prog/link interaction will be contained within generic driver-agnostic logic. All the XDP_QUERY_PROG calls to the driver in dev_xdp_uninstall() were removed. It's not clear to me why dev_xdp_uninstall() was passing previous prog_flags when resetting installed programs. That seems unnecessary, plus most drivers don't populate prog_flags anyway. Having XDP_SETUP_PROG vs XDP_SETUP_PROG_HW should be enough of an indicator of what is required of the driver to correctly reset the active BPF program. dev_xdp_uninstall() is also generalized as an iteration over all three supported modes. Signed-off-by: Andrii Nakryiko <andriin@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200722064603.3350758-3-andriin@fb.com
2020-07-25 | bpf: Implement bpf iterator for sock local storage map | Yonghong Song
The bpf iterator for bpf sock local storage map is implemented. User space interacts with sock local storage map with fd as a key and storage value. In kernel, passing fd to the bpf program does not really make sense. In this case, the sock itself is passed to bpf program. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184116.590602-1-yhs@fb.com
2020-07-25 | bpf: Refactor to provide aux info to bpf_iter_init_seq_priv_t | Yonghong Song
This patch refactored target bpf_iter_init_seq_priv_t callback function to accept additional information. This will be needed in later patches for map element targets since a particular map should be passed to traverse elements for that particular map. In the future, other information may be passed to target as well, e.g., pid, cgroup id, etc. to customize the iterator. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184110.590156-1-yhs@fb.com
2020-07-25 | bpf: Refactor bpf_iter_reg to have separate seq_info member | Yonghong Song
There is no functionality change for this patch. Struct bpf_iter_reg is used to register a bpf_iter target, which includes information for prog_load, link_create and seq_file creation. This patch puts fields related to seq_file creation into a different structure. This will be useful for the map elements iterator, where one iterator covers different map types and different map types may have different seq_ops, init/fini private_data function and private_data size. Signed-off-by: Yonghong Song <yhs@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Link: https://lore.kernel.org/bpf/20200723184109.590030-1-yhs@fb.com
2020-07-25 | udp: Don't discard reuseport selection when group has connections | Jakub Sitnicki
When BPF socket lookup prog selects a socket that belongs to a reuseport group, and the reuseport group has connected sockets in it, the socket selected by reuseport will be discarded, and socket returned by BPF socket lookup will be used instead. Modify this behavior so that the socket selected by reuseport running after BPF socket lookup always gets used. Ignore the fact that the reuseport group might have connections because it is only relevant when scoring sockets during regular hashtable-based lookup. Fixes: 72f7e9440e9b ("udp: Run SK_LOOKUP BPF program on socket lookup") Fixes: 6d4201b1386b ("udp6: Run SK_LOOKUP BPF program on socket lookup") Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp> Link: https://lore.kernel.org/bpf/20200722161720.940831-2-jakub@cloudflare.com
2020-07-25 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | David S. Miller
The UDP reuseport conflict was a little bit tricky. The net-next code, via bpf-next, extracted the reuseport handling into a helper so that the BPF sk lookup code could invoke it. At the same time, the logic for reuseport handling of unconnected sockets changed via commit efc6b6f6c3113e8b203b9debfb72d81e0f3dcace which changed the logic to carry on the reuseport result into the rest of the lookup loop if we do not return immediately. This requires moving the reuseport_has_conns() logic into the callers. While we are here, get rid of inline directives as they do not belong in foo.c files. The other changes were cases of more straightforward overlapping modifications. Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-25 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net into master | Linus Torvalds
Pull networking fixes from David Miller:

 1) Fix RCU locking in iwlwifi, from Johannes Berg.
 2) mt76 can access uninitialized NAPI struct, from Felix Fietkau.
 3) Fix race in updating pause settings in bnxt_en, from Vasundhara Volam.
 4) Propagate error return properly during unbind failures in ax88172a, from George Kennedy.
 5) Fix memleak in adf7242_probe, from Liu Jian.
 6) smc_drv_probe() can leak, from Wang Hai.
 7) Don't muck with the carrier state if register_netdevice() fails in the bonding driver, from Taehee Yoo.
 8) Fix memleak in dpaa_eth_probe, from Liu Jian.
 9) Need to check skb_put_padto() return value in hsr_fill_tag(), from Murali Karicheri.
 10) Don't lose ionic RSS hash settings across FW update, from Shannon Nelson.
 11) Fix clobbered SKB control block in act_ct, from Wen Xu.
 12) Missing newline in "tx_timeout" sysfs output, from Xiongfeng Wang.
 13) IS_UDPLITE cleanup a long time ago, incorrectly handled transformations involving UDPLITE_RECV_CC. From Miaohe Lin.
 14) Unbalanced locking in netdevsim, from Taehee Yoo.
 15) Suppress false-positive error messages in qed driver, from Alexander Lobakin.
 16) Out of bounds read in ax25_connect and ax25_sendmsg, from Peilin Ye.
 17) Missing SKB release in cxgb4's uld_send(), from Navid Emamdoost.
 18) Uninitialized value in geneve_changelink(), from Cong Wang.
 19) Fix deadlock in xen-netfront, from Andrea Righi.
 20) flush_backlog() frees skbs with IRQs disabled, so should use dev_kfree_skb_irq() instead of kfree_skb(). From Subash Abhinov Kasiviswanathan.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (111 commits)
  drivers/net/wan: lapb: Corrected the usage of skb_cow
  dev: Defer free of skbs in flush_backlog
  qrtr: orphan socket in qrtr_release()
  xen-netfront: fix potential deadlock in xennet_remove()
  flow_offload: Move rhashtable inclusion to the source file
  geneve: fix an uninitialized value in geneve_changelink()
  bonding: check return value of register_netdevice() in bond_newlink()
  tcp: allow at most one TLP probe per flight
  AX.25: Prevent integer overflows in connect and sendmsg
  cxgb4: add missing release on skb in uld_send()
  net: atlantic: fix PTP on AQC10X
  AX.25: Prevent out-of-bounds read in ax25_sendmsg()
  sctp: shrink stream outq when fails to do addstream reconf
  sctp: shrink stream outq only when new outcnt < old outcnt
  AX.25: Fix out-of-bounds read in ax25_connect()
  enetc: Remove the mdio bus on PF probe bailout
  net: ethernet: ti: add NETIF_F_HW_TC hw feature flag for taprio offload
  net: ethernet: ave: Fix error returns in ave_init
  drivers/net/wan/x25_asy: Fix to make it work
  ipvs: fix the connection sync failed in some cases
  ...
2020-07-24 | dev: Defer free of skbs in flush_backlog | Subash Abhinov Kasiviswanathan
IRQs are disabled when freeing skbs in input queue. Use the IRQ safe variant to free skbs here. Fixes: 145dd5f9c88f ("net: flush the softnet backlog in process context") Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 | qrtr: orphan socket in qrtr_release() | Cong Wang
We have to detach sock from socket in qrtr_release(), otherwise skb->sk may still reference to this socket when the skb is released in tun->queue, particularly sk->sk_wq still points to &sock->wq, which leads to a UAF. Reported-and-tested-by: syzbot+6720d64f31c081c2f708@syzkaller.appspotmail.com Fixes: 28fb4e59a47d ("net: qrtr: Expose tunneling endpoint to user space") Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Eric Dumazet <eric.dumazet@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 | l2tp: WARN_ON rather than BUG_ON in l2tp_session_free | Tom Parkin
l2tp_session_free called BUG_ON if the tunnel magic feather value wasn't correct. The intent of this was to catch lifetime bugs; for example early tunnel free due to incorrect use of reference counts. Since the tunnel magic feather being wrong indicates either early free or structure corruption, we can avoid doing more damage by simply leaving the tunnel structure alone. If the tunnel refcount isn't dropped when it should be, the tunnel instance will remain in the kernel, resulting in the tunnel structure and socket leaking. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 | l2tp: remove BUG_ON refcount value in l2tp_session_free | Tom Parkin
l2tp_session_free is only called by l2tp_session_dec_refcount when the reference count reaches zero, so it's of limited value to validate the reference count value in l2tp_session_free itself. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 | l2tp: WARN_ON rather than BUG_ON in l2tp_session_queue_purge | Tom Parkin
l2tp_session_queue_purge is used during session shutdown to drop any skbs queued for reordering purposes according to L2TP dataplane rules. The BUG_ON in this function checks the session magic feather in an attempt to catch lifetime bugs. Rather than crashing the kernel with a BUG_ON, we can simply WARN_ON and refuse to do anything more -- in the worst case this could result in a leak. However this is highly unlikely given that the session purge only occurs from codepaths which have obtained the session by means of a lookup via the parent tunnel and which check the session "dead" flag to protect against shutdown races. While we're here, have l2tp_session_queue_purge return void rather than an integer, since neither of the callsites checked the return value. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 | l2tp: don't BUG_ON seqfile checks in l2tp_ppp | Tom Parkin
checkpatch advises that WARN_ON and recovery code are preferred over BUG_ON which crashes the kernel. l2tp_ppp has a BUG_ON check of struct seq_file's private pointer in pppol2tp_seq_start prior to accessing data through that pointer. Rather than crashing, we can simply bail out early and return NULL in order to terminate the seq file processing in much the same way as we do when reaching the end of tunnel/session instances to render. Retain a WARN_ON to help trace possible bugs in this area. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 | l2tp: don't BUG_ON session magic checks in l2tp_ppp | Tom Parkin
checkpatch advises that WARN_ON and recovery code are preferred over BUG_ON which crashes the kernel.

l2tp_ppp.c's BUG_ON checks of the l2tp session structure's "magic" field occur in code paths where it's reasonably easy to recover:

 * In the case of pppol2tp_sock_to_session, we can return NULL and the caller will bail out appropriately. There is no change required to any of the callsites of this function since they already handle pppol2tp_sock_to_session returning NULL.

 * In the case of pppol2tp_session_destruct we can just avoid decrementing the reference count on the suspect session structure. In the worst case scenario this results in a memory leak, which is preferable to a crash.

Convert these uses of BUG_ON to WARN_ON accordingly.

Signed-off-by: Tom Parkin <tparkin@katalix.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
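The warn-and-recover pattern described for pppol2tp_sock_to_session can be sketched in user space as follows. Everything here is hypothetical: `warn_on_sketch()` only imitates the kernel's WARN_ON by printing to stderr and returning the condition, the magic constant is an arbitrary illustrative value, and the session struct is reduced to the two fields the example needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define SESSION_MAGIC_SKETCH 0x5E551017u /* arbitrary value for illustration */

struct session_sketch {
	unsigned int magic;
	int refcount;
};

/* Imitation of WARN_ON(): report the problem, then let the caller recover. */
int warn_on_sketch(int cond, const char *what)
{
	if (cond)
		fprintf(stderr, "WARNING: %s\n", what);
	return cond;
}

/*
 * Returns the session with a reference taken, or NULL when there is no
 * session or its magic value is wrong. Callers already handle NULL, so
 * a corrupted structure leads to an early return instead of a crash.
 */
struct session_sketch *sock_to_session_sketch(void *sk_user_data)
{
	struct session_sketch *session = sk_user_data;

	if (session == NULL)
		return NULL;

	if (warn_on_sketch(session->magic != SESSION_MAGIC_SKETCH,
			   "bad session magic"))
		return NULL; /* leave the suspect structure untouched */

	session->refcount++;
	return session;
}
```

The trade-off matches the one stated in the message: refusing to touch a suspect structure may leak it, but a leak plus a warning is easier to live with and to debug than a halted kernel.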
2020-07-24 | l2tp: remove BUG_ON in l2tp_tunnel_closeall | Tom Parkin
l2tp_tunnel_closeall is only called from l2tp_core.c, and it's easy to statically analyse the code path calling it to validate that it should never be passed a NULL tunnel pointer. Having a BUG_ON checking the tunnel pointer triggers a checkpatch warning. Since the BUG_ON is of no value, remove it to avoid the warning. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 | l2tp: remove BUG_ON in l2tp_session_queue_purge | Tom Parkin
l2tp_session_queue_purge is only called from l2tp_core.c, and it's easy to statically analyse the code paths calling it to validate that it should never be passed a NULL session pointer. Having a BUG_ON checking the session pointer triggers a checkpatch warning. Since the BUG_ON is of no value, remove it to avoid the warning. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 | l2tp: WARN_ON rather than BUG_ON in l2tp_dfs_seq_start | Tom Parkin
l2tp_dfs_seq_start had a BUG_ON to catch a possible programming error in l2tp_dfs_seq_open. Since we can easily bail out of l2tp_dfs_seq_start, prefer to do that and flag the error with a WARN_ON rather than crashing the kernel. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 | l2tp: avoid multiple assignments | Tom Parkin
checkpatch warns about multiple assignments. Update l2tp accordingly. Signed-off-by: Tom Parkin <tparkin@katalix.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 | icmp6: support rfc 4884 | Willem de Bruijn
Extend the rfc 4884 read interface introduced for ipv4 in commit eba75c587e81 ("icmp: support rfc 4884") to ipv6. Add socket option SOL_IPV6/IPV6_RECVERR_RFC4884. Changes v1->v2: - make ipv6_icmp_error_rfc4884 static (file scope) Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24 | icmp: prepare rfc 4884 for ipv6 | Willem de Bruijn
The RFC 4884 spec is largely the same between IPv4 and IPv6. Factor out the IPv4 specific parts in preparation for IPv6 support:

- icmp types supported
- icmp header size, and thus offset to original datagram start
- datagram length field offset in icmp(6)hdr.
- datagram length field word size: 4B for IPv4, 8B for IPv6.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
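The word-size difference in the last item above can be shown with a tiny helper. This is only a sketch of the unit conversion (the length field counts 4-byte words for IPv4 and 8-byte words for IPv6, as the message states); the function name and its flag parameter are hypothetical, and none of the surrounding validation is reproduced.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert the RFC 4884 "length" field to a byte count for the original
 * datagram. The field is expressed in 32-bit words for ICMPv4 and in
 * 64-bit words for ICMPv6.
 */
unsigned int rfc4884_orig_dgram_bytes_sketch(uint8_t length_field, int is_ipv6)
{
	unsigned int word_size;

	assert(is_ipv6 == 0 || is_ipv6 == 1);
	word_size = is_ipv6 ? 8u : 4u;

	return (unsigned int)length_field * word_size;
}
```

For instance, a field value of 32 describes 128 bytes of original datagram in an ICMPv4 message but 256 bytes in an ICMPv6 message, which is why the word size has to be a per-protocol parameter rather than a constant.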
2020-07-24icmp: revise rfc4884 testsWillem de Bruijn
1) Only accept packets with original datagram len field >= header len. The extension header must start after the original datagram headers. The embedded datagram len field is compared against the 128B minimum stipulated by RFC 4884. It is unlikely that headers extend beyond this. But as we know the exact header length, check explicitly.
2) Remove the check that datagram length must be <= 576B. This is a send constraint. There is no value in testing this on rx. Within private networks it may be known safe to send larger packets. Process these packets. This test was also too lax. It compared original datagram length rather than entire icmp packet length. The stand-alone fix would be:
- if (hlen + skb->len > 576)
+ if (-skb_network_offset(skb) + skb->len > 576)
Fixes: eba75c587e81 ("icmp: support rfc 4884") Signed-off-by: Willem de Bruijn <willemb@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
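The length test in point 1 can be sketched in user-space C as follows. This is a minimal illustration under stated assumptions, not the kernel's code: the helper name `demo_rfc4884_len_ok`, its parameters, and the final bounds check against the available payload are all hypothetical. The word size is 4 for IPv4 and 8 for IPv6, as the preparatory commit describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimum original-datagram length stipulated by RFC 4884, in bytes. */
#define DEMO_RFC4884_MIN_LEN 128u

/* The minimum is a whole number of both 4B (IPv4) and 8B (IPv6) words. */
static_assert(DEMO_RFC4884_MIN_LEN % 8 == 0, "minimum must be word aligned");

/*
 * Hypothetical helper, not the kernel's function.
 *   len_words: raw value of the datagram length field
 *   word_size: 4 for IPv4, 8 for IPv6
 *   hdr_len:   length of the original datagram's headers, in bytes
 *   avail:     bytes present after the ICMP header (assumed extra check)
 */
static bool demo_rfc4884_len_ok(uint8_t len_words, size_t word_size,
				size_t hdr_len, size_t avail)
{
	size_t orig_len = (size_t)len_words * word_size;

	/* The embedded datagram must meet the 128B minimum. */
	if (orig_len < DEMO_RFC4884_MIN_LEN)
		return false;
	/* The extension must start after the original datagram headers. */
	if (orig_len < hdr_len)
		return false;
	/* Assumption: the claimed length must fit in the received data. */
	return orig_len <= avail;
}
```

Note that there is deliberately no upper bound of 576B here, in line with point 2.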
2020-07-24sctp: remove redundant initialization of variable statusColin Ian King
The variable status is being initialized with a value that is never read and it is being updated later with a new value. The initialization is redundant and can be removed. Also put the variable declarations into reverse christmas tree order. Addresses-Coverity: ("Unused value") Signed-off-by: Colin Ian King <colin.king@canonical.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24net: openvswitch: fixes potential deadlock in dp cleanup codeEelco Chaudron
The previous patch introduced a deadlock. This patch fixes it by making sure the work is canceled without holding the global ovs lock. This is done by moving the reorder processing one layer up to the netns level. Fixes: eac87c413bf9 ("net: openvswitch: reorder masks array based on usage") Reported-by: syzbot+2c4ff3614695f75ce26c@syzkaller.appspotmail.com Reported-by: syzbot+bad6507e5db05017b008@syzkaller.appspotmail.com Reviewed-by: Paolo <pabeni@redhat.com> Signed-off-by: Eelco Chaudron <echaudro@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-24sctp: fix slab-out-of-bounds in SCTP_DELAYED_SACK processingChristoph Hellwig
This sockopt accepts two kinds of parameters, using struct sctp_sack_info and struct sctp_assoc_value. The mentioned commit didn't notice an implicit cast from the smaller (latter) struct to the bigger one (former) when copying the data from user space, which now leads to an attempt to write beyond the buffer (because it assumes the storing buffer is bigger than the parameter itself). Fix it by allocating a sctp_sack_info on stack and filling it out based on the small struct for the compat case. Changelog stolen from an earlier patch from Marcelo Ricardo Leitner. Fixes: ebb25defdc17 ("sctp: pass a kernel pointer to sctp_setsockopt_delayed_ack") Reported-by: syzbot+0e4699d000d8b874d8dc@syzkaller.appspotmail.com Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
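The shape of that fix can be sketched in user-space C. The `demo_` structs below are local stand-ins that only mirror the field layout of the two SCTP structs, and the helper name is hypothetical. The rule used for `sack_freq` (0 when a delay is given, 1 otherwise) is an assumption about how the short form is interpreted, not something this changelog states.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the smaller struct (assumed layout). */
struct demo_assoc_value {
	int32_t  assoc_id;
	uint32_t assoc_value;
};

/* Stand-in for the bigger struct (assumed layout). */
struct demo_sack_info {
	int32_t  sack_assoc_id;
	uint32_t sack_delay;
	uint32_t sack_freq;
};

/* The size mismatch is what made the implicit cast write out of bounds. */
static_assert(sizeof(struct demo_sack_info) > sizeof(struct demo_assoc_value),
	      "the sack_info stand-in must be the larger struct");

/*
 * Hypothetical helper: build a full-size struct on the stack from the
 * small one instead of treating the small buffer as the bigger type.
 */
static struct demo_sack_info
demo_sack_from_assoc_value(const struct demo_assoc_value *v)
{
	struct demo_sack_info p = {0};

	p.sack_assoc_id = v->assoc_id;
	p.sack_delay = v->assoc_value;
	/* Assumed rule: a zero delay selects frequency 1, otherwise 0. */
	p.sack_freq = v->assoc_value ? 0 : 1;
	return p;
}
```

Because the result is a separate object of the larger type, later code can write any of its fields without touching memory beyond the caller's small buffer.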
2020-07-24net: optimize the sockptr_t for unified kernel/user address spacesChristoph Hellwig
For architectures like x86 and arm64 we don't need the separate bit to indicate that a pointer is a kernel pointer as the address spaces are unified. That way the sockptr_t can be reduced to a union of two pointers, which leads to nicer calling conventions. The only caveat is that we need to check that users don't pass in a kernel address and thus gain access to kernel memory. Thus the USER_SOCKPTR helper is replaced with an init_user_sockptr function that does this check and returns an error if it fails. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: David S. Miller <davem@davemloft.net>
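A user-space sketch of the idea, assuming a made-up address-space split: every `demo_` name is hypothetical, and `DEMO_USER_LIMIT` merely stands in for whatever boundary the architecture really uses. It shows a union of two pointers with no tag bit, plus an initializer that rejects addresses on the kernel side of the boundary.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Assumed boundary: the upper half of the address space is "kernel". */
#define DEMO_USER_LIMIT ((uintptr_t)UINTPTR_MAX / 2 + 1)

/* No separate is-kernel bit: the union is just two views of one pointer. */
typedef union {
	void *kernel;
	void *user;
} demo_sockptr_t;

/* The union costs no more than a plain pointer. */
static_assert(sizeof(demo_sockptr_t) == sizeof(void *),
	      "union of two pointers must be pointer sized");

/*
 * Hypothetical counterpart of init_user_sockptr: refuse a "user"
 * pointer that actually lies in the kernel range.
 */
static int demo_init_user_sockptr(demo_sockptr_t *sp, void *p)
{
	if ((uintptr_t)p >= DEMO_USER_LIMIT)
		return -EFAULT;
	sp->user = p;
	return 0;
}
```

With the check done once at initialization, later accessors can use the pointer without consulting a per-pointer flag.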
2020-07-24net: pass a sockptr_t into ->setsockoptChristoph Hellwig
Rework the remaining setsockopt code to pass a sockptr_t instead of a plain user pointer. This removes the last remaining set_fs(KERNEL_DS) outside of architecture specific code. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Stefan Schmidt <stefan@datenfreihafen.org> [ieee802154] Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: David S. Miller <davem@davemloft.net>