summaryrefslogtreecommitdiff
path: root/net/mptcp
AgeCommit message (Collapse)Author
2024-09-18mptcp: pm: Fix uaf in __timer_delete_syncEdward Adam Davis
commit b4cd80b0338945a94972ac3ed54f8338d2da2076 upstream. There are two paths to access mptcp_pm_del_add_timer, result in a race condition: CPU1 CPU2 ==== ==== net_rx_action napi_poll netlink_sendmsg __napi_poll netlink_unicast process_backlog netlink_unicast_kernel __netif_receive_skb genl_rcv __netif_receive_skb_one_core netlink_rcv_skb NF_HOOK genl_rcv_msg ip_local_deliver_finish genl_family_rcv_msg ip_protocol_deliver_rcu genl_family_rcv_msg_doit tcp_v4_rcv mptcp_pm_nl_flush_addrs_doit tcp_v4_do_rcv mptcp_nl_remove_addrs_list tcp_rcv_established mptcp_pm_remove_addrs_and_subflows tcp_data_queue remove_anno_list_by_saddr mptcp_incoming_options mptcp_pm_del_add_timer mptcp_pm_del_add_timer kfree(entry) In remove_anno_list_by_saddr(running on CPU2), after leaving the critical zone protected by "pm.lock", the entry will be released, which leads to the occurrence of uaf in the mptcp_pm_del_add_timer(running on CPU1). Keeping a reference to add_timer inside the lock, and calling sk_stop_timer_sync() with this reference, instead of "entry->add_timer". Move list_del(&entry->list) to mptcp_pm_del_add_timer and inside the pm lock, do not directly access any members of the entry outside the pm lock, which can avoid similar "entry->x" uaf. Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout") Cc: stable@vger.kernel.org Reported-and-tested-by: syzbot+f3a31fb909db9b2a5c4d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=f3a31fb909db9b2a5c4d Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Edward Adam Davis <eadavis@qq.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/tencent_7142963A37944B4A74EF76CD66EA3C253609@qq.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04mptcp: pm: ADD_ADDR 0 is not a new addressMatthieu Baerts (NGI0)
commit 57f86203b41c98b322119dfdbb1ec54ce5e3369b upstream. The ADD_ADDR 0 with the address from the initial subflow should not be considered as a new address: this is not something new. If the host receives it, it simply means that the address is available again. When receiving an ADD_ADDR for the ID 0, the PM already doesn't consider it as new by not incrementing the 'add_addr_accepted' counter. But the 'accept_addr' might not be set if the limit has already been reached: this can be bypassed in this case. But before, it is important to check that this ADD_ADDR for the ID 0 is for the same address as the initial subflow. If not, it is not something that should happen, and the ADD_ADDR can be ignored. Note that if an ADD_ADDR is received while there is already a subflow opened using the same address, this ADD_ADDR is ignored as well. It means that if multiple ADD_ADDR for ID 0 are received, there will not be any duplicated subflows created by the client. Fixes: d0876b2284cf ("mptcp: add the incoming RM_ADDR support") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04mptcp: avoid duplicated SUB_CLOSED eventsMatthieu Baerts (NGI0)
commit d82809b6c5f2676b382f77a5cbeb1a5d91ed2235 upstream. The initial subflow might have already been closed, but still in the connection list. When the worker is instructed to close the subflows that have been marked as closed, it might then try to close the initial subflow again. A consequence of that is that the SUB_CLOSED event can be seen twice: # ip mptcp endpoint 1.1.1.1 id 1 subflow dev eth0 2.2.2.2 id 2 subflow dev eth1 # ip mptcp monitor & [ CREATED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 [ ESTABLISHED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 [ SF_ESTABLISHED] remid=0 locid=2 saddr4=2.2.2.2 daddr4=9.9.9.9 # ip mptcp endpoint delete id 1 [ SF_CLOSED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 [ SF_CLOSED] remid=0 locid=0 saddr4=1.1.1.1 daddr4=9.9.9.9 The first one is coming from mptcp_pm_nl_rm_subflow_received(), and the second one from __mptcp_close_subflow(). To avoid doing the post-closed processing twice, the subflow is now marked as closed the first time. Note that it is not enough to check if we are dealing with the first subflow and check its sk_state: the subflow might have been reset or closed before calling mptcp_close_ssk(). Fixes: b911c97c7dc7 ("mptcp: add netlink event support") Cc: stable@vger.kernel.org Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04mptcp: pm: fix ID 0 endp usage after multiple re-creationsMatthieu Baerts (NGI0)
commit 9366922adc6a71378ca01f898c41be295309f044 upstream. 'local_addr_used' and 'add_addr_accepted' are decremented for addresses not related to the initial subflow (ID0), because the source and destination addresses of the initial subflows are known from the beginning: they don't count as "additional local address being used" or "ADD_ADDR being accepted". It is then required not to increment them when the entrypoint used by the initial subflow is removed and re-added during a connection. Without this modification, this entrypoint cannot be removed and re-added more than once. Reported-by: Arınç ÜNAL <arinc.unal@arinc9.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/512 Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Reported-by: syzbot+455d38ecd5f655fc45cf@syzkaller.appspotmail.com Closes: https://lore.kernel.org/00000000000049861306209237f4@google.com Cc: stable@vger.kernel.org Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04mptcp: pm: do not remove already closed subflowsMatthieu Baerts (NGI0)
commit 58e1b66b4e4b8a602d3f2843e8eba00a969ecce2 upstream. It is possible to have in the list already closed subflows, e.g. the initial subflow has been already closed, but still in the list. No need to try to close it again, and increments the related counters again. Fixes: 0ee4261a3681 ("mptcp: implement mptcp_pm_remove_subflow") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04mptcp: pm: fix RM_ADDR ID for the initial subflowMatthieu Baerts (NGI0)
commit 87b5896f3f7848130095656739b05881904e2697 upstream. The initial subflow has a special local ID: 0. When an endpoint is being deleted, it is then important to check if its address is not linked to the initial subflow to send the right ID. If there was an endpoint linked to the initial subflow, msk's mpc_endpoint_id field will be set. We can then use this info when an endpoint is being removed to see if it is linked to the initial subflow. So now, the correct IDs are passed to mptcp_pm_nl_rm_addr_or_subflow(), it is no longer needed to use mptcp_local_id_match(). Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04mptcp: pm: send ACK on an active subflowMatthieu Baerts (NGI0)
commit c07cc3ed895f9bfe0c53b5ed6be710c133b4271c upstream. Taking the first one on the list doesn't work in some cases, e.g. if the initial subflow is being removed. Pick another one instead of not sending anything. Fixes: 84dfe3677a6f ("mptcp: send out dedicated ADD_ADDR packet") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04mptcp: pm: reset MPC endp ID when re-addedMatthieu Baerts (NGI0)
commit dce1c6d1e92535f165219695a826caedcca4e9b9 upstream. The initial subflow has a special local ID: 0. It is specific per connection. When a global endpoint is deleted and re-added later, it can have a different ID -- most services managing the endpoints automatically don't force the ID to be the same as before. It is then important to track these modifications to be consistent with the ID being used for the address used by the initial subflow, not to confuse the other peer or to send the ID 0 for the wrong address. Now when removing an endpoint, msk->mpc_endpoint_id is reset if it corresponds to this endpoint. When adding a new endpoint, the same variable is updated if the address match the one of the initial subflow. Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04mptcp: pm: skip connecting to already established sfMatthieu Baerts (NGI0)
commit bc19ff57637ff563d2bdf2b385b48c41e6509e0d upstream. The lookup_subflow_by_daddr() helper checks if there is already a subflow connected to this address. But there could be a subflow that is closing, but taking time due to some reasons: latency, losses, data to process, etc. If an ADD_ADDR is received while the endpoint is being closed, it is better to try connecting to it, instead of rejecting it: the peer which has sent the ADD_ADDR will not be notified that the ADD_ADDR has been rejected for this reason, and the expected subflow will not be created at the end. This helper should then only look for subflows that are established, or going to be, but not the ones being closed. Fixes: d84ad04941c3 ("mptcp: skip connecting the connected address") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04mptcp: pm: reuse ID 0 after delete and re-addMatthieu Baerts (NGI0)
commit 8b8ed1b429f8fa7ebd5632555e7b047bc0620075 upstream. When the endpoint used by the initial subflow is removed and re-added later, the PM has to force the ID 0, it is a special case imposed by the MPTCP specs. Note that the endpoint should then need to be re-added reusing the same ID. Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04mptcp: pr_debug: add missing \n at the endMatthieu Baerts (NGI0)
commit cb41b195e634d3f1ecfcd845314e64fd4bb3c7aa upstream. pr_debug() have been added in various places in MPTCP code to help developers to debug some situations. With the dynamic debug feature, it is easy to enable all or some of them, and asks users to reproduce issues with extra debug. Many of these pr_debug() don't end with a new line, while no 'pr_cont()' are used in MPTCP code. So the goal was not to display multiple debug messages on one line: they were then not missing the '\n' on purpose. Not having the new line at the end causes these messages to be printed with a delay, when something else needs to be printed. This issue is not visible when many messages need to be printed, but it is annoying and confusing when only specific messages are expected, e.g. # echo "func mptcp_pm_add_addr_echoed +fmp" \ > /sys/kernel/debug/dynamic_debug/control # ./mptcp_join.sh "signal address"; \ echo "$(awk '{print $1}' /proc/uptime) - end"; \ sleep 5s; \ echo "$(awk '{print $1}' /proc/uptime) - restart"; \ ./mptcp_join.sh "signal address" 013 signal address (...) 10.75 - end 15.76 - restart 013 signal address [ 10.367935] mptcp:mptcp_pm_add_addr_echoed: MPTCP: msk=(...) (...) => a delay of 5 seconds: printed with a 10.36 ts, but after 'restart' which was printed at the 15.76 ts. The 'Fixes' tag here below points to the first pr_debug() used without '\n' in net/mptcp. This patch could be split in many small ones, with different Fixes tag, but it doesn't seem worth it, because it is easy to re-generate this patch with this simple 'sed' command: git grep -l pr_debug -- net/mptcp | xargs sed -i "s/\(pr_debug(\".*[^n]\)\(\"[,)]\)/\1\\\n\2/g" So in case of conflicts, simply drop the modifications, and launch this command. Fixes: f870fa0b5768 ("mptcp: Add MPTCP socket stubs") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-4-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04mptcp: sched: check both backup in retransMatthieu Baerts (NGI0)
commit 2a1f596ebb23eadc0f9b95a8012e18ef76295fc8 upstream. The 'mptcp_subflow_context' structure has two items related to the backup flags: - 'backup': the subflow has been marked as backup by the other peer - 'request_bkup': the backup flag has been set by the host Looking only at the 'backup' flag can make sense in some cases, but it is not the behaviour of the default packet scheduler when selecting paths. As explained in the commit b6a66e521a20 ("mptcp: sched: check both directions for backup"), the packet scheduler should look at both flags, because that was the behaviour from the beginning: the 'backup' flag was set by accident instead of the 'request_bkup' one. Now that the latter has been fixed, get_retrans() needs to be adapted as well. Fixes: b6a66e521a20 ("mptcp: sched: check both directions for backup") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-3-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-09-04mptcp: close subflow when receiving TCP+FINMatthieu Baerts (NGI0)
commit f09b0ad55a1196f5891663f8888463c0541059cb upstream. When a peer decides to close one subflow in the middle of a connection having multiple subflows, the receiver of the first FIN should accept that, and close the subflow on its side as well. If not, the subflow will stay half closed, and would even continue to be used until the end of the MPTCP connection or a reset from the network. The issue has not been seen before, probably because the in-kernel path-manager always sends a RM_ADDR before closing the subflow. Upon the reception of this RM_ADDR, the other peer will initiate the closure on its side as well. On the other hand, if the RM_ADDR is lost, or if the path-manager of the other peer only closes the subflow without sending a RM_ADDR, the subflow would switch to TCP_CLOSE_WAIT, but that's it, leaving the subflow half-closed. So now, when the subflow switches to the TCP_CLOSE_WAIT state, and if the MPTCP connection has not been closed before with a DATA_FIN, the kernel owning the subflow schedules its worker to initiate the closure on its side as well. This issue can be easily reproduced with packetdrill, as visible in [1], by creating an additional subflow, injecting a FIN+ACK before sending the DATA_FIN, and expecting a FIN+ACK in return. Fixes: 40947e13997a ("mptcp: schedule worker when subflow is closed") Cc: stable@vger.kernel.org Link: https://github.com/multipath-tcp/packetdrill/pull/154 [1] Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-1-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29mptcp: pm: avoid possible UaF when selecting endpMatthieu Baerts (NGI0)
commit 48e50dcbcbaaf713d82bf2da5c16aeced94ad07d upstream. select_local_address() and select_signal_address() both select an endpoint entry from the list inside an RCU protected section, but return a reference to it, to be read later on. If the entry is dereferenced after the RCU unlock, reading info could cause a Use-after-Free. A simple solution is to copy the required info while inside the RCU protected section to avoid any risk of UaF later. The address ID might need to be modified later to handle the ID0 case later, so a copy seems OK to deal with. Reported-by: Paolo Abeni <pabeni@redhat.com> Closes: https://lore.kernel.org/45cd30d3-7710-491c-ae4d-a1368c00beb1@redhat.com Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240819-net-mptcp-pm-reusing-id-v1-14-38035d40de5b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29mptcp: pm: fullmesh: select the right ID laterMatthieu Baerts (NGI0)
commit 09355f7abb9fbfc1a240be029837921ea417bf4f upstream. When reacting upon the reception of an ADD_ADDR, the in-kernel PM first looks for fullmesh endpoints. If there are some, it will pick them, using their entry ID. It should set the ID 0 when using the endpoint corresponding to the initial subflow, it is a special case imposed by the MPTCP specs. Note that msk->mpc_endpoint_id might not be set when receiving the first ADD_ADDR from the server. So better to compare the addresses. Fixes: 1a0d6136c5f0 ("mptcp: local addresses fullmesh") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240819-net-mptcp-pm-reusing-id-v1-12-38035d40de5b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29mptcp: pm: only in-kernel cannot have entries with ID 0Matthieu Baerts (NGI0)
commit ca6e55a703ca2894611bb5c5bca8bfd2290fd91e upstream. The ID 0 is specific per MPTCP connections. The per netns entries cannot have this special ID 0 then. But that's different for the userspace PM where the entries are per connection, they can then use this special ID 0. Fixes: f40be0db0b76 ("mptcp: unify pm get_flags_and_ifindex_by_id") Cc: stable@vger.kernel.org Acked-by: Geliang Tang <geliang@kernel.org> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240819-net-mptcp-pm-reusing-id-v1-11-38035d40de5b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29mptcp: pm: check add_addr_accept_max before accepting new ADD_ADDRMatthieu Baerts (NGI0)
commit 0137a3c7c2ea3f9df8ebfc65d78b4ba712a187bb upstream. The limits might have changed in between, it is best to check them before accepting new ADD_ADDR. Fixes: d0876b2284cf ("mptcp: add the incoming RM_ADDR support") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240819-net-mptcp-pm-reusing-id-v1-10-38035d40de5b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29mptcp: pm: only decrement add_addr_accepted for MPJ reqMatthieu Baerts (NGI0)
commit 1c1f721375989579e46741f59523e39ec9b2a9bd upstream. Adding the following warning ... WARN_ON_ONCE(msk->pm.add_addr_accepted == 0) ... before decrementing the add_addr_accepted counter helped to find a bug when running the "remove single subflow" subtest from the mptcp_join.sh selftest. Removing a 'subflow' endpoint will first trigger a RM_ADDR, then the subflow closure. Before this patch, and upon the reception of the RM_ADDR, the other peer will then try to decrement this add_addr_accepted. That's not correct because the attached subflows have not been created upon the reception of an ADD_ADDR. A way to solve that is to decrement the counter only if the attached subflow was an MP_JOIN to a remote id that was not 0, and initiated by the host receiving the RM_ADDR. Fixes: d0876b2284cf ("mptcp: add the incoming RM_ADDR support") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240819-net-mptcp-pm-reusing-id-v1-9-38035d40de5b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29mptcp: pm: only mark 'subflow' endp as availableMatthieu Baerts (NGI0)
commit 322ea3778965da72862cca2a0c50253aacf65fe6 upstream. Adding the following warning ... WARN_ON_ONCE(msk->pm.local_addr_used == 0) ... before decrementing the local_addr_used counter helped to find a bug when running the "remove single address" subtest from the mptcp_join.sh selftests. Removing a 'signal' endpoint will trigger the removal of all subflows linked to this endpoint via mptcp_pm_nl_rm_addr_or_subflow() with rm_type == MPTCP_MIB_RMSUBFLOW. This will decrement the local_addr_used counter, which is wrong in this case because this counter is linked to 'subflow' endpoints, and here it is a 'signal' endpoint that is being removed. Now, the counter is decremented, only if the ID is being used outside of mptcp_pm_nl_rm_addr_or_subflow(), only for 'subflow' endpoints, and if the ID is not 0 -- local_addr_used is not taking into account these ones. This marking of the ID as being available, and the decrement is done no matter if a subflow using this ID is currently available, because the subflow could have been closed before. Fixes: 06faa2271034 ("mptcp: remove multi addresses and subflows in PM") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240819-net-mptcp-pm-reusing-id-v1-8-38035d40de5b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29mptcp: pm: remove mptcp_pm_remove_subflow()Matthieu Baerts (NGI0)
commit f448451aa62d54be16acb0034223c17e0d12bc69 upstream. This helper is confusing. It is in pm.c, but it is specific to the in-kernel PM and it cannot be used by the userspace one. Also, it simply calls one in-kernel specific function with the PM lock, while the similar mptcp_pm_remove_addr() helper requires the PM lock. What's left is the pr_debug(), which is not that useful, because a similar one is present in the only function called by this helper: mptcp_pm_nl_rm_subflow_received() After these modifications, this helper can be marked as 'static', and the lock can be taken only once in mptcp_pm_flush_addrs_and_subflows(). Note that it is not a bug fix, but it will help backporting the following commits. Fixes: 0ee4261a3681 ("mptcp: implement mptcp_pm_remove_subflow") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240819-net-mptcp-pm-reusing-id-v1-7-38035d40de5b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29mptcp: pm: re-using ID of unused flushed subflowsMatthieu Baerts (NGI0)
commit ef34a6ea0cab1800f4b3c9c3c2cefd5091e03379 upstream. If no subflows are attached to the 'subflow' endpoints that are being flushed, the corresponding addr IDs will not be marked as available again. Mark all ID as being available when flushing all the 'subflow' endpoints, and reset local_addr_used counter to cover these cases. Note that mptcp_pm_remove_addrs_and_subflows() helper is only called for flushing operations, not to remove a specific set of addresses and subflows. Fixes: 06faa2271034 ("mptcp: remove multi addresses and subflows in PM") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240819-net-mptcp-pm-reusing-id-v1-5-38035d40de5b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29mptcp: pm: re-using ID of unused removed subflowsMatthieu Baerts (NGI0)
commit edd8b5d868a4d459f3065493001e293901af758d upstream. If no subflow is attached to the 'subflow' endpoint that is being removed, the addr ID will not be marked as available again. Mark the linked ID as available when removing the 'subflow' endpoint if no subflow is attached to it. While at it, the local_addr_used counter is decremented if the ID was marked as being used to reflect the reality, but also to allow adding new endpoints after that. Fixes: b6c08380860b ("mptcp: remove addr and subflow in PM netlink") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240819-net-mptcp-pm-reusing-id-v1-3-38035d40de5b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29mptcp: pm: re-using ID of unused removed ADD_ADDRMatthieu Baerts (NGI0)
commit e255683c06df572ead96db5efb5d21be30c0efaa upstream. If no subflow is attached to the 'signal' endpoint that is being removed, the addr ID will not be marked as available again. Mark the linked ID as available when removing the address entry from the list to cover this case. Fixes: b6c08380860b ("mptcp: remove addr and subflow in PM netlink") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240819-net-mptcp-pm-reusing-id-v1-1-38035d40de5b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29mptcp: correct MPTCP_SUBFLOW_ATTR_SSN_OFFSET reserved sizeEugene Syromiatnikov
[ Upstream commit 655111b838cdabdb604f3625a9ff08c5eedb11da ] ssn_offset field is u32 and is placed into the netlink response with nla_put_u32(), but only 2 bytes are reserved for the attribute payload in subflow_get_info_size() (even though it makes no difference in the end, as it is aligned up to 4 bytes). Supply the correct argument to the relevant nla_total_size() call to make it less confusing. Fixes: 5147dfb50832 ("mptcp: allow dumping subflow context to userspace") Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240812065024.GA19719@asgard.redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-14mptcp: pm: do not ignore 'subflow' if 'signal' flag is also setMatthieu Baerts (NGI0)
commit 85df533a787bf07bf4367ce2a02b822ff1fba1a3 upstream. Up to the 'Fixes' commit, having an endpoint with both the 'signal' and 'subflow' flags, resulted in the creation of a subflow and an address announcement using the address linked to this endpoint. After this commit, only the address announcement was done, ignoring the 'subflow' flag. That's because the same bitmap is used for the two flags. It is OK to keep this single bitmap, the already selected local endpoint simply have to be re-used, but not via select_local_address() not to look at the just modified bitmap. Note that it is unusual to set the two flags together: creating a new subflow using a new local address will implicitly advertise it to the other peer. So in theory, no need to advertise it explicitly as well. Maybe there are use-cases -- the subflow might not reach the other peer that way, we can ask the other peer to try initiating the new subflow without delay -- or very likely the user is confused, and put both flags "just to be sure at least the right one is set". Still, if it is allowed, the kernel should do what has been asked: using this endpoint to announce the address and to create a new subflow from it. An alternative is to forbid the use of the two flags together, but that's probably too late, there are maybe use-cases, and it was working before. This patch will avoid people complaining subflows are not created using the endpoint they added with the 'subflow' and 'signal' flag. Note that with the current patch, the subflow might not be created in some corner cases, e.g. if the 'subflows' limit was reached when sending the ADD_ADDR, but changed later on. It is probably not worth splitting id_avail_bitmap per target ('signal', 'subflow'), which will add another large field to the msk "just" to track (again) endpoints. Anyway, currently when the limits are changed, the kernel doesn't check if new subflows can be created or removed, because we would need to keep track of the received ADD_ADDR, and more. It sounds OK to assume that the limits should be properly configured before establishing new connections. Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-signal-v1-5-c8a9b036493b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-14mptcp: pm: don't try to create sf if alloc failedMatthieu Baerts (NGI0)
commit cd7c957f936f8cb80d03e5152f4013aae65bd986 upstream. It sounds better to avoid wasting cycles and / or put extreme memory pressure on the system by trying to create new subflows if it was not possible to add a new item in the announce list. While at it, a warning is now printed if the entry was already in the list as it should not happen with the in-kernel path-manager. With this PM, mptcp_pm_alloc_anno_list() should only fail in case of memory pressure. Fixes: b6c08380860b ("mptcp: remove addr and subflow in PM netlink") Cc: stable@vger.kernel.org Suggested-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-signal-v1-4-c8a9b036493b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-14mptcp: pm: reduce indentation blocksMatthieu Baerts (NGI0)
commit c95eb32ced823a00be62202b43966b07b2f20b7f upstream. That will simplify the following commits. No functional changes intended. Suggested-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-signal-v1-3-c8a9b036493b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Stable-dep-of: cd7c957f936f ("mptcp: pm: don't try to create sf if alloc failed") Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-14mptcp: pm: deny endp with signal + subflow + portMatthieu Baerts (NGI0)
commit 8af1f11865f259c882cce71d32f85ee9004e2660 upstream. As mentioned in the 'Fixes' commit, the port flag is only supported by the 'signal' flag, and not by the 'subflow' one. Then if both the 'signal' and 'subflow' flags are set, the problem is the same: the feature cannot work with the 'subflow' flag. Technically, if both the 'signal' and 'subflow' flags are set, it will be possible to create the listening socket, but not to establish a subflow using this source port. So better to explicitly deny it, not to create some confusions because the expected behaviour is not possible. Fixes: 09f12c3ab7a5 ("mptcp: allow to use port and non-signal in set_flags") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-signal-v1-2-c8a9b036493b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-14mptcp: fully established after ADD_ADDR echo on MPJMatthieu Baerts (NGI0)
commit d67c5649c1541dc93f202eeffc6f49220a4ed71d upstream. Before this patch, receiving an ADD_ADDR echo on the just connected MP_JOIN subflow -- initiator side, after the MP_JOIN 3WHS -- was resulting in an MP_RESET. That's because only ACKs with a DSS or ADD_ADDRs without the echo bit were allowed. Not allowing the ADD_ADDR echo after an MP_CAPABLE 3WHS makes sense, as we are not supposed to send an ADD_ADDR before because it requires to be in full established mode first. For the MP_JOIN 3WHS, that's different: the ADD_ADDR can be sent on a previous subflow, and the ADD_ADDR echo can be received on the recently created one. The other peer will already be in fully established, so it is allowed to send that. We can then relax the conditions here to accept the ADD_ADDR echo for MPJ subflows. Fixes: 67b12f792d5e ("mptcp: full fully established support after ADD_ADDR") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240731-upstream-net-20240731-mptcp-endp-subflow-signal-v1-1-c8a9b036493b@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-11mptcp: fix duplicate data handlingPaolo Abeni
commit 68cc924729ffcfe90d0383177192030a9aeb2ee4 upstream. When a subflow receives and discards duplicate data, the mptcp stack assumes that the consumed offset inside the current skb is zero. With multiple subflows receiving data simultaneously such assertion does not held true. As a result the subflow-level copied_seq will be incorrectly increased and later on the same subflow will observe a bad mapping, leading to subflow reset. Address the issue taking into account the skb consumed offset in mptcp_subflow_discard_data(). Fixes: 04e4cd4f7ca4 ("mptcp: cleanup mptcp_subflow_discard_data()") Cc: stable@vger.kernel.org Link: https://github.com/multipath-tcp/mptcp_net-next/issues/501 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-11mptcp: pm: only set request_bkup flag when sending MP_PRIOMatthieu Baerts (NGI0)
commit 4258b94831bb7ff28ab80e3c8d94db37db930728 upstream. The 'backup' flag from mptcp_subflow_context structure is supposed to be set only when the other peer flagged a subflow as backup, not the opposite. Fixes: 067065422fcd ("mptcp: add the outgoing MP_PRIO support") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-11mptcp: pm: fix backup support in signal endpointsMatthieu Baerts (NGI0)
commit 6834097fc38c5416701c793da94558cea49c0a1f upstream. There was a support for signal endpoints, but only when the endpoint's flag was changed during a connection. If an endpoint with the signal and backup was already present, the MP_JOIN reply was not containing the backup flag as expected. That's confusing to have this inconsistent behaviour. On the other hand, the infrastructure to set the backup flag in the SYN + ACK + MP_JOIN was already there, it was just never set before. Now when requesting the local ID from the path-manager, the backup status is also requested. Note that when the userspace PM is used, the backup flag can be set if the local address was already used before with a backup flag, e.g. if the address was announced with the 'backup' flag, or a subflow was created with the 'backup' flag. Fixes: 4596a2c1b7f5 ("mptcp: allow creating non-backup subflows") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/507 Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-11mptcp: fix bad RCVPRUNED mib accountingPaolo Abeni
commit 0a567c2a10033bf04ed618368d179bce6977984b upstream. Since its introduction, the mentioned MIB accounted for the wrong event: wake-up being skipped as not-needed on some edge condition instead of incoming skb being dropped after landing in the (subflow) receive queue. Move the increment in the correct location. Fixes: ce599c516386 ("mptcp: properly account bulk freed memory") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-11mptcp: mib: count MPJ with backup flagMatthieu Baerts (NGI0)
commit 4dde0d72ccec500c60c798e036b852e013d6e124 upstream. Without such counters, it is difficult to easily debug issues with MPJ not having the backup flags on production servers. This is not strictly a fix, but it eases to validate the following patches without requiring to take packet traces, to query ongoing connections with Netlink with admin permissions, or to guess by looking at the behaviour of the packet scheduler. Also, the modification is self contained, isolated, well controlled, and the increments are done just after others, there from the beginning. It looks then safe, and helpful to backport this. Fixes: 4596a2c1b7f5 ("mptcp: allow creating non-backup subflows") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-11mptcp: fix NL PM announced address accountingPaolo Abeni
commit 4b317e0eb287bd30a1b329513531157c25e8b692 upstream. Currently the per connection announced address counter is never decreased. As a consequence, after connection establishment, if the NL PM deletes an endpoint and adds a new/different one, no additional subflow is created for the new endpoint even if the current limits allow that. Address the issue properly updating the signaled address counter every time the NL PM removes such addresses. Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-11mptcp: distinguish rcv vs sent backup flag in requestsMatthieu Baerts (NGI0)
commit efd340bf3d7779a3a8ec954d8ec0fb8a10f24982 upstream. When sending an MP_JOIN + SYN + ACK, it is possible to mark the subflow as 'backup' by setting the flag with the same name. Before this patch, the backup was set if the other peer set it in its MP_JOIN + SYN request. It is not correct: the backup flag should be set in the MPJ+SYN+ACK only if the host asks for it, and not mirroring what was done by the other peer. It is then required to have a dedicated bit for each direction, similar to what is done in the subflow context. Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-11mptcp: fix user-space PM announced address accountingPaolo Abeni
commit 167b93258d1e2230ee3e8a97669b4db4cc9e90aa upstream. Currently the per-connection announced address counter is never decreased. When the user-space PM is in use, this just affect the information exposed via diag/sockopt, but it could still foul the PM to wrong decision. Add the missing accounting for the user-space PM's sake. Fixes: 8b1c94da1e48 ("mptcp: only send RM_ADDR in nl_cmd_remove") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-11mptcp: sched: check both directions for backupMatthieu Baerts (NGI0)
commit b6a66e521a2032f7fcba2af5a9bcbaeaa19b7ca3 upstream. The 'mptcp_subflow_context' structure has two items related to the backup flags: - 'backup': the subflow has been marked as backup by the other peer - 'request_bkup': the backup flag has been set by the host Before this patch, the scheduler was only looking at the 'backup' flag. That can make sense in some cases, but it looks like that's not what we wanted for the general use, because either the path-manager was setting both of them when sending an MP_PRIO, or the receiver was duplicating the 'backup' flag in the subflow request. Note that the use of these two flags in the path-manager are going to be fixed in the next commits, but this change here is needed not to modify the behaviour. Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-10mptcp: pm: update add_addr counters after connectYonglongLi
The creation of new subflows can fail for different reasons. If no subflow have been created using the received ADD_ADDR, the related counters should not be updated, otherwise they will never be decremented for events related to this ID later on. For the moment, the number of accepted ADD_ADDR is only decremented upon the reception of a related RM_ADDR, and only if the remote address ID is currently being used by at least one subflow. In other words, if no subflow can be created with the received address, the counter will not be decremented. In this case, it is then important not to increment pm.add_addr_accepted counter, and not to modify pm.accept_addr bit. Note that this patch does not modify the behaviour in case of failures later on, e.g. if the MP Join is dropped or rejected. The "remove invalid addresses" MP Join subtest has been modified to validate this case. The broadcast IP address is added before the "valid" address that will be used to successfully create a subflow, and the limit is decreased by one: without this patch, it was not possible to create the last subflow, because: - the broadcast address would have been accepted even if it was not usable: the creation of a subflow to this address results in an error, - the limit of 2 accepted ADD_ADDR would have then been reached. Fixes: 01cacb00b35c ("mptcp: add netlink-based PM") Cc: stable@vger.kernel.org Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-3-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-10mptcp: pm: inc RmAddr MIB counter once per RM_ADDR IDYonglongLi
The RmAddr MIB counter is supposed to be incremented once when a valid RM_ADDR has been received. Before this patch, it could have been incremented as many times as the number of subflows connected to the linked address ID, so it could have been 0, 1 or more than 1. The "RmSubflow" is incremented after a local operation. In this case, it is normal to tied it with the number of subflows that have been actually removed. The "remove invalid addresses" MP Join subtest has been modified to validate this case. A broadcast IP address is now used instead: the client will not be able to create a subflow to this address. The consequence is that when receiving the RM_ADDR with the ID attached to this broadcast IP address, no subflow linked to this ID will be found. Fixes: 7a7e52e38a40 ("mptcp: add RM_ADDR related mibs") Cc: stable@vger.kernel.org Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-2-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-10mptcp: ensure snd_una is properly initialized on connectPaolo Abeni
This is strictly related to commit fb7a0d334894 ("mptcp: ensure snd_nxt is properly initialized on connect"). It turns out that syzkaller can trigger the retransmit after fallback and before processing any other incoming packet - so that snd_una is still left uninitialized. Address the issue explicitly initializing snd_una together with snd_nxt and write_seq. Suggested-by: Mat Martineau <martineau@kernel.org> Fixes: 8fd738049ac3 ("mptcp: fallback in case of simultaneous connect") Cc: stable@vger.kernel.org Reported-by: Christoph Paasch <cpaasch@apple.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/485 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://lore.kernel.org/r/20240607-upstream-net-20240607-misc-fixes-v1-1-1ab9ddfa3d00@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-05mptcp: count CLOSE-WAIT sockets for MPTCP_MIB_CURRESTABJason Xing
Like previous patch does in TCP, we need to adhere to RFC 1213: "tcpCurrEstab OBJECT-TYPE ... The number of TCP connections for which the current state is either ESTABLISHED or CLOSE- WAIT." So let's consider CLOSE-WAIT sockets. The logic of counting When we increment the counter? a) Only if we change the state to ESTABLISHED. When we decrement the counter? a) if the socket leaves ESTABLISHED and will never go into CLOSE-WAIT, say, on the client side, changing from ESTABLISHED to FIN-WAIT-1. b) if the socket leaves CLOSE-WAIT, say, on the server side, changing from CLOSE-WAIT to LAST-ACK. Fixes: d9cd27b8cd19 ("mptcp: add CurrEstab MIB counter support") Signed-off-by: Jason Xing <kernelxing@tencent.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2024-05-18Merge tag 'net-accept-more-20240515' of git://git.kernel.dk/linuxLinus Torvalds
Pull more io_uring updates from Jens Axboe: "This adds support for IORING_CQE_F_SOCK_NONEMPTY for io_uring accept requests. This is very similar to previous work that enabled the same hint for doing receives on sockets. By far the majority of the work here is refactoring to enable the networking side to pass back whether or not the socket had more pending requests after accepting the current one, the last patch just wires it up for io_uring. Not only does this enable applications to know whether there are more connections to accept right now, it also enables smarter logic for io_uring multishot accept on whether to retry immediately or wait for a poll trigger" * tag 'net-accept-more-20240515' of git://git.kernel.dk/linux: io_uring/net: wire up IORING_CQE_F_SOCK_NONEMPTY for accept net: pass back whether socket was empty post accept net: have do_accept() take a struct proto_accept_arg argument net: change proto and proto_ops accept type
2024-05-13mptcp: include inet_common in mib.hMatthieu Baerts (NGI0)
So this file is now self-contained: it can be compiled alone with analytic tools. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-9-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-13mptcp: move mptcp_pm_gen.h's includeMatthieu Baerts (NGI0)
Nothing from protocol.h depends on mptcp_pm_gen.h, only code from pm_netlink.c and pm_userspace.c depends on it. So this include can be moved where it is needed to avoid a "unused includes" warning. Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-8-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-13mptcp: remove unnecessary else statementsMatthieu Baerts (NGI0)
The 'else' statements are not needed here, because their previous 'if' block ends with a 'return'. This fixes CheckPatch warnings: WARNING: else is not generally useful after a break or return Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-7-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-13mptcp: prefer strscpy over strcpyMatthieu Baerts (NGI0)
strcpy() performs no bounds checking on the destination buffer. This could result in linear overflows beyond the end of the buffer, leading to all kinds of misbehaviors. The safe replacement is strscpy() [1]. This is in preparation of a possible future step where all strcpy() uses will be removed in favour of strscpy() [2]. This fixes CheckPatch warnings: WARNING: Prefer strscpy over strcpy Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy [1] Link: https://github.com/KSPP/linux/issues/88 [2] Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-6-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-13mptcp: add net.mptcp.available_schedulersGregory Detal
The sysctl lists the available schedulers that can be set using net.mptcp.scheduler similarly to net.ipv4.tcp_available_congestion_control. Signed-off-by: Gregory Detal <gregory.detal@gmail.com> Reviewed-by: Mat Martineau <martineau@kernel.org> Tested-by: Geliang Tang <geliang@kernel.org> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-5-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-13mptcp: sockopt: info: stop early if no bufferMatthieu Baerts (NGI0)
Up to recently, it has been recommended to use getsockopt(MPTCP_INFO) to check if a fallback to TCP happened, or if the client requested to use MPTCP. In this case, the userspace app is only interested by the returned value of the getsocktop() call, and can then give 0 for the option length, and NULL for the buffer address. An easy optimisation is then to stop early, and avoid filling a local buffer -- which now requires two different locks -- if it is not needed. Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-4-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-13mptcp: fix full TCP keep-alive supportMatthieu Baerts (NGI0)
SO_KEEPALIVE support has been added a while ago, as part of a series "adding SOL_SOCKET" support. To have a full control of this keep-alive feature, it is important to also support TCP_KEEP* socket options at the SOL_TCP level. Supporting them on the setsockopt() part is easy, it is just a matter of remembering each value in the MPTCP sock structure, and calling tcp_sock_set_keep*() helpers on each subflow. If the value is not modified (0), calling these helpers will not do anything. For the getsockopt() part, the corresponding value from the MPTCP sock structure or the default one is simply returned. All of this is very similar to other TCP_* socket options supported by MPTCP. It looks important for kernels supporting SO_KEEPALIVE, to also support TCP_KEEP* options as well: some apps seem to (wrongly) consider that if the former is supported, the latter ones will be supported as well. But also, not having this simple and isolated change is preventing MPTCP support in some apps, and libraries like GoLang [1]. This is why this patch is seen as a fix. Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/383 Fixes: 1b3e7ede1365 ("mptcp: setsockopt: handle SO_KEEPALIVE and SO_PRIORITY") Link: https://github.com/golang/go/issues/56539 [1] Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Mat Martineau <martineau@kernel.org> Link: https://lore.kernel.org/r/20240514011335.176158-3-martineau@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>