Age | Commit message (Collapse) | Author |
|
The registration of mptcp_genl_family is useful for both the in-kernel
and the userspace PM. It should then be done in pm_netlink.c.
On the other hand, the registration of the in-kernel pernet subsystem is
specific to the in-kernel PM, and should stay there in pm_kernel.c.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-1-f4e4a88efc50@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Before this patch, the PM code was dispersed in different places:
- pm.c had common code for all PMs, but also Netlink specific code that
will not be needed with the future BPF path-managers.
- pm_netlink.c had common Netlink code.
To clarify the code, a reorganisation is suggested here, only by moving
code around, and small helper renaming to avoid confusions:
- pm_netlink.c now only contains common PM Netlink code:
- PM events: this code was already there
- shared helpers around Netlink code that were already there as well
- shared Netlink commands code from pm.c
- pm.c now no longer contain Netlink specific code.
- protocol.h has been updated accordingly:
- mptcp_nl_fill_addr() no longer need to be exported.
The code around the PM is now less confusing, which should help for the
maintenance in the long term.
This will certainly impact future backports, but because other cleanups
have already done recently, and more are coming to ease the addition of
a new path-manager controlled with BPF (struct_ops), doing that now
seems to be a good time. Also, many issues around the PM have been fixed
a few months ago while increasing the code coverage in the selftests, so
such big reorganisation can be done with more confidence now.
No behavioural changes intended.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-15-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Before this patch, the PM code was dispersed in different places:
- pm.c had common code for all PMs
- pm_netlink.c was supposed to be about the in-kernel PM, but also had
exported common Netlink helpers, NL events for PM userspace daemons,
etc. quite confusing.
To clarify the code, a reorganisation is suggested here, only by moving
code around to avoid confusions:
- pm_netlink.c now only contains common PM Netlink code:
- PM events: this code was already there
- shared helpers around Netlink code that were already there as well
- more shared Netlink commands code from pm.c will come after
- pm_kernel.c now contains only code that is specific to the in-kernel
PM. Now all functions are either called from:
- pm.c: events coming from the core, when this PM is being used
- pm_netlink.c: for shared Netlink commands
- mptcp_pm_gen.c: for Netlink commands specific to the in-kernel PM
- sockopt.c: for the exported counters per netns
- (while at it, a useless 'return;' spot by checkpatch at the end of
mptcp_pm_nl_set_flags_all, has been removed)
The code around the PM is now less confusing, which should help for the
maintenance in the long term.
This will certainly impact future backports, but because other cleanups
have already done recently, and more are coming to ease the addition of
a new path-manager controlled with BPF (struct_ops), doing that now
seems to be a good time. Also, many issues around the PM have been fixed
a few months ago while increasing the code coverage in the selftests, so
such big reorganisation can be done with more confidence now.
No behavioural changes intended.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-14-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Before this patch, the PM code was dispersed in different places:
- pm.c had common code for all PMs
- pm_netlink.c was supposed to be about the in-kernel PM, but also had
exported common helpers, callbacks used by the different PMs, NL
events for PM userspace daemon, etc. quite confusing.
- pm_userspace.c had userspace PM only code, but using specific
in-kernel PM helpers
To clarify the code, a reorganisation is suggested here, only by moving
code around, and (un)exporting functions:
- helpers used from both PMs and not linked to Netlink
- callbacks used by different PMs, e.g. ADD_ADDR management
- some helpers have been marked as 'static'
- protocol.h has been updated accordingly
- (while at it, a needless if before a kfree(), spot by checkpatch in
mptcp_remove_anno_list_by_saddr(), has been removed)
The code around the PM is now less confusing, which should help for the
maintenance in the long term.
This will certainly impact future backports, but because other cleanups
have already done recently, and more are coming to ease the addition of
a new path-manager controlled with BPF (struct_ops), doing that now
seems to be a good time. Also, many issues around the PM have been fixed
a few months ago while increasing the code coverage in the selftests, so
such big reorganisation can be done with more confidence now.
No behavioural changes intended.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-13-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In a following commit, the 'remote_address' helper will need to be used
from different files.
It is then exported, and prefixed with 'mptcp_', similar to
'mptcp_local_address'.
No behavioural changes intended.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-11-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To make it clear what actions are in-kernel PM specific and which ones
are not and done for all PMs, e.g. sending ADD_ADDR and close associated
subflows when a RM_ADDR is received.
The behavioural is changed a bit: MPTCP_PM_ADD_ADDR_RECEIVED is now
treated after MPTCP_PM_ADD_ADDR_SEND_ACK and MPTCP_PM_RM_ADDR_RECEIVED,
but that should not change anything in practice.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-10-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. Here, '_pm' was missing from 'mptcp_nl_set_flags'.
Add '_pm' to be similar to others, and add '_all' to avoid confusions
witih the global 'mptcp_pm_nl_set_flags'.
No behavioural changes intended.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-8-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. But here 'mptcp_pm_nl_is_init_remote_addr' is not
specific to this PM: it is called from pm.c for both the in-kernel and
userspace PMs.
To avoid confusions, the '_nl' bit has been removed from the name.
No behavioural changes intended.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-7-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. But here 'mptcp_pm_nl_subflow_chk_stale' is not specific
to this PM: it is called from pm.c for both the in-kernel and userspace
PMs.
To avoid confusions, the '_nl' bit has been removed from the name.
No behavioural changes intended.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-6-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. But here 'mptcp_pm_nl_rm_addr_received' is not specific
to this PM: it is called from the PM worker, and used by both the
in-kernel and userspace PMs. The helper has been renamed to
'mptcp_pm_rm_addr_recv' instead of '_received' to avoid confusions with
the one from pm.c.
mptcp_pm_nl_rm_addr_or_subflow', and 'mptcp_pm_nl_rm_subflow_received'
have been updated too for the same reason.
To avoid confusions, the '_nl' bit has been removed from the name.
While at it, the in-kernel PM specific code has been move from
mptcp_pm_rm_addr_or_subflow to a new dedicated helper, clearer.
No behavioural changes intended.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-5-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. But here 'mptcp_pm_nl_work' is not specific to this PM:
it is called from the core to call helpers, some of them needed by both
the in-kernel and userspace PMs.
To avoid confusions, the '_nl' bit has been removed from the name.
Also used 'worker' instead of 'work', similar to protocol.c's worker.
No behavioural changes intended.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-4-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. But here 'mptcp_pm_nl_mp_prio_send_ack()' is not
specific to this PM: it is used by both the in-kernel and userspace PMs.
To avoid confusions, the '_nl' bit has been removed from the name.
No behavioural changes intended.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-3-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently, in-kernel PM specific helpers are prefixed with
'mptcp_pm_nl_'. But here 'mptcp_pm_nl_addr_send_ack()' is not specific
to this PM: it is used by both the in-kernel and userspace PMs.
To avoid confusions, the '_nl' bit has been removed from the name.
No behavioural changes intended.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-2-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The following code in mptcp_userspace_pm_get_local_id() that assigns "skc"
to "new_entry" is not allowed in BPF if we use the same code to implement
the get_local_id() interface of a BFP path manager:
memset(&new_entry, 0, sizeof(struct mptcp_pm_addr_entry));
new_entry.addr = *skc;
new_entry.addr.id = 0;
new_entry.flags = MPTCP_PM_ADDR_FLAG_IMPLICIT;
To solve the issue, this patch moves this assignment to "new_entry" forward
to mptcp_pm_get_local_id(), and then passing "new_entry" as a parameter to
both mptcp_pm_nl_get_local_id() and mptcp_userspace_pm_get_local_id().
No behavioural changes intended.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250307-net-next-mptcp-pm-reorg-v1-1-abef20ada03b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.14-rc6).
Conflicts:
net/ethtool/cabletest.c
2bcf4772e45a ("net: ethtool: try to protect all callback with netdev instance lock")
637399bf7e77 ("net: ethtool: netlink: Allow NULL nlattrs when getting a phy_device")
No Adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If multiple connection requests attempt to create an implicit mptcp
endpoint in parallel, more than one caller may end up in
mptcp_pm_nl_append_new_local_addr because none found the address in
local_addr_list during their call to mptcp_pm_nl_get_local_id. In this
case, the concurrent new_local_addr calls may delete the address entry
created by the previous caller. These deletes use synchronize_rcu, but
this is not permitted in some of the contexts where this function may be
called. During packet recv, the caller may be in a rcu read critical
section and have preemption disabled.
An example stack:
BUG: scheduling while atomic: swapper/2/0/0x00000302
Call Trace:
<IRQ>
dump_stack_lvl (lib/dump_stack.c:117 (discriminator 1))
dump_stack (lib/dump_stack.c:124)
__schedule_bug (kernel/sched/core.c:5943)
schedule_debug.constprop.0 (arch/x86/include/asm/preempt.h:33 kernel/sched/core.c:5970)
__schedule (arch/x86/include/asm/jump_label.h:27 include/linux/jump_label.h:207 kernel/sched/features.h:29 kernel/sched/core.c:6621)
schedule (arch/x86/include/asm/preempt.h:84 kernel/sched/core.c:6804 kernel/sched/core.c:6818)
schedule_timeout (kernel/time/timer.c:2160)
wait_for_completion (kernel/sched/completion.c:96 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:148)
__wait_rcu_gp (include/linux/rcupdate.h:311 kernel/rcu/update.c:444)
synchronize_rcu (kernel/rcu/tree.c:3609)
mptcp_pm_nl_append_new_local_addr (net/mptcp/pm_netlink.c:966 net/mptcp/pm_netlink.c:1061)
mptcp_pm_nl_get_local_id (net/mptcp/pm_netlink.c:1164)
mptcp_pm_get_local_id (net/mptcp/pm.c:420)
subflow_check_req (net/mptcp/subflow.c:98 net/mptcp/subflow.c:213)
subflow_v4_route_req (net/mptcp/subflow.c:305)
tcp_conn_request (net/ipv4/tcp_input.c:7216)
subflow_v4_conn_request (net/mptcp/subflow.c:651)
tcp_rcv_state_process (net/ipv4/tcp_input.c:6709)
tcp_v4_do_rcv (net/ipv4/tcp_ipv4.c:1934)
tcp_v4_rcv (net/ipv4/tcp_ipv4.c:2334)
ip_protocol_deliver_rcu (net/ipv4/ip_input.c:205 (discriminator 1))
ip_local_deliver_finish (include/linux/rcupdate.h:813 net/ipv4/ip_input.c:234)
ip_local_deliver (include/linux/netfilter.h:314 include/linux/netfilter.h:308 net/ipv4/ip_input.c:254)
ip_sublist_rcv_finish (include/net/dst.h:461 net/ipv4/ip_input.c:580)
ip_sublist_rcv (net/ipv4/ip_input.c:640)
ip_list_rcv (net/ipv4/ip_input.c:675)
__netif_receive_skb_list_core (net/core/dev.c:5583 net/core/dev.c:5631)
netif_receive_skb_list_internal (net/core/dev.c:5685 net/core/dev.c:5774)
napi_complete_done (include/linux/list.h:37 include/net/gro.h:449 include/net/gro.h:444 net/core/dev.c:6114)
igb_poll (drivers/net/ethernet/intel/igb/igb_main.c:8244) igb
__napi_poll (net/core/dev.c:6582)
net_rx_action (net/core/dev.c:6653 net/core/dev.c:6787)
handle_softirqs (kernel/softirq.c:553)
__irq_exit_rcu (kernel/softirq.c:588 kernel/softirq.c:427 kernel/softirq.c:636)
irq_exit_rcu (kernel/softirq.c:651)
common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 14))
</IRQ>
This problem seems particularly prevalent if the user advertises an
endpoint that has a different external vs internal address. In the case
where the external address is advertised and multiple connections
already exist, multiple subflow SYNs arrive in parallel which tends to
trigger the race during creation of the first local_addr_list entries
which have the internal address instead.
Fix by skipping the replacement of an existing implicit local address if
called via mptcp_pm_nl_get_local_id.
Fixes: d045b9eb95a9 ("mptcp: introduce implicit endpoints")
Cc: stable@vger.kernel.org
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250303-net-mptcp-fix-sched-while-atomic-v1-1-f6a216c5a74c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The number of parameters in mptcp_nl_set_flags() can be reduced.
Only need to pass a "local" parameter to it instead of "local->addr"
and "local->flags".
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-4-f933c4275676@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In mptcp_pm_nl_set_flags(), "entry" is copied to "local" when pernet->lock
is held to avoid direct access to entry without pernet->lock.
Therefore, "local->flags" should be passed to mptcp_nl_set_flags instead
of "entry->flags" when pernet->lock is not held, so as to avoid access to
entry.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Fixes: 145dc6cc4abd ("mptcp: pm: change to fullmesh only for 'subflow'")
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250228-net-next-mptcp-coverage-small-opti-v1-3-f933c4275676@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.14-rc5).
Conflicts:
drivers/net/ethernet/cadence/macb_main.c
fa52f15c745c ("net: cadence: macb: Synchronize stats calculations")
75696dd0fd72 ("net: cadence: macb: Convert to get_stats64")
https://lore.kernel.org/20250224125848.68ee63e5@canb.auug.org.au
Adjacent changes:
drivers/net/ethernet/intel/ice/ice_sriov.c
79990cf5e7ad ("ice: Fix deinitializing VF in error path")
a203163274a4 ("ice: simplify VF MSI-X managing")
net/ipv4/tcp.c
18912c520674 ("tcp: devmem: don't write truncated dmabuf CMSGs to userspace")
297d389e9e5b ("net: prefix devmem specific helpers")
net/mptcp/subflow.c
8668860b0ad3 ("mptcp: reset when MPTCP opts are dropped after join")
c3349a22c200 ("mptcp: consolidate subflow cleanup")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Syzkaller reported a lockdep splat in the PM control path:
WARNING: CPU: 0 PID: 6693 at ./include/net/sock.h:1711 sock_owned_by_me include/net/sock.h:1711 [inline]
WARNING: CPU: 0 PID: 6693 at ./include/net/sock.h:1711 msk_owned_by_me net/mptcp/protocol.h:363 [inline]
WARNING: CPU: 0 PID: 6693 at ./include/net/sock.h:1711 mptcp_pm_nl_addr_send_ack+0x57c/0x610 net/mptcp/pm_netlink.c:788
Modules linked in:
CPU: 0 UID: 0 PID: 6693 Comm: syz.0.205 Not tainted 6.14.0-rc2-syzkaller-00303-gad1b832bf1cf #0
Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
RIP: 0010:sock_owned_by_me include/net/sock.h:1711 [inline]
RIP: 0010:msk_owned_by_me net/mptcp/protocol.h:363 [inline]
RIP: 0010:mptcp_pm_nl_addr_send_ack+0x57c/0x610 net/mptcp/pm_netlink.c:788
Code: 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 ca 7b d3 f5 eb b9 e8 c3 7b d3 f5 90 0f 0b 90 e9 dd fb ff ff e8 b5 7b d3 f5 90 <0f> 0b 90 e9 3e fb ff ff 44 89 f1 80 e1 07 38 c1 0f 8c eb fb ff ff
RSP: 0000:ffffc900034f6f60 EFLAGS: 00010283
RAX: ffffffff8bee3c2b RBX: 0000000000000001 RCX: 0000000000080000
RDX: ffffc90004d42000 RSI: 000000000000a407 RDI: 000000000000a408
RBP: ffffc900034f7030 R08: ffffffff8bee37f6 R09: 0100000000000000
R10: dffffc0000000000 R11: ffffed100bcc62e4 R12: ffff88805e6316e0
R13: ffff88805e630c00 R14: dffffc0000000000 R15: ffff88805e630c00
FS: 00007f7e9a7e96c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000001b2fd18ff8 CR3: 0000000032c24000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
mptcp_pm_remove_addr+0x103/0x1d0 net/mptcp/pm.c:59
mptcp_pm_remove_anno_addr+0x1f4/0x2f0 net/mptcp/pm_netlink.c:1486
mptcp_nl_remove_subflow_and_signal_addr net/mptcp/pm_netlink.c:1518 [inline]
mptcp_pm_nl_del_addr_doit+0x118d/0x1af0 net/mptcp/pm_netlink.c:1629
genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0xb1f/0xec0 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x206/0x480 net/netlink/af_netlink.c:2543
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline]
netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1348
netlink_sendmsg+0x8de/0xcb0 net/netlink/af_netlink.c:1892
sock_sendmsg_nosec net/socket.c:718 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:733
____sys_sendmsg+0x53a/0x860 net/socket.c:2573
___sys_sendmsg net/socket.c:2627 [inline]
__sys_sendmsg+0x269/0x350 net/socket.c:2659
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f7e9998cde9
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f7e9a7e9038 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f7e99ba5fa0 RCX: 00007f7e9998cde9
RDX: 000000002000c094 RSI: 0000400000000000 RDI: 0000000000000007
RBP: 00007f7e99a0e2a0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f7e99ba5fa0 R15: 00007fff49231088
Indeed the PM can try to send a RM_ADDR over a msk without acquiring
first the msk socket lock.
The bugged code-path comes from an early optimization: when there
are no subflows, the PM should (usually) not send RM_ADDR
notifications.
The above statement is incorrect, as without locks another process
could concurrent create a new subflow and cause the RM_ADDR generation.
Additionally the supposed optimization is not very effective even
performance-wise, as most mptcp sockets should have at least one
subflow: the MPC one.
Address the issue removing the buggy code path, the existing "slow-path"
will handle correctly even the edge case.
Fixes: b6c08380860b ("mptcp: remove addr and subflow in PM netlink")
Cc: stable@vger.kernel.org
Reported-by: syzbot+cd3ce3d03a3393ae9700@syzkaller.appspotmail.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/546
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250224-net-mptcp-misc-fixes-v1-1-f550f636b435@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use ipv6_addr_equal() to check whether two IPv6 addresses are equal in
mptcp_addresses_equal().
This is more appropriate than using !ipv6_addr_cmp().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250221-net-next-mptcp-pm-misc-cleanup-3-v1-7-2b70ab1cee79@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In mptcp_event_add_subflow(), mptcp_event_pm_listener() and
mptcp_nl_find_ssk(), 'issk' has already been got through inet_sk().
No need to use inet6_sk() to get 'ipv6_pinfo' again, just use
issk->pinet6 instead. This patch also drops these 'ipv6_pinfo'
variables.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250221-net-next-mptcp-pm-misc-cleanup-3-v1-6-2b70ab1cee79@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To save some redundant code in dump_addr() interfaces of both the
netlink PM and userspace PM, the code that calls netlink message
helpers (genlmsg_put/cancel/end) and mptcp_nl_fill_addr() is wrapped
into a new helper mptcp_pm_genl_fill_addr().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250221-net-next-mptcp-pm-misc-cleanup-3-v1-4-2b70ab1cee79@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If an endpoint doesn't have the 'subflow' flag -- in fact, has no type,
so not 'subflow', 'signal', nor 'implicit' -- there are then no subflows
created from this local endpoint to at least the initial destination
address. In this case, no need to call mptcp_pm_nl_fullmesh() which is
there to recreate the subflows to reflect the new value of the fullmesh
attribute.
Similarly, there is then no need to iterate over all connections to do
nothing, if only the 'fullmesh' flag has been changed, and the endpoint
doesn't have the 'subflow' one. So stop early when dealing with this
specific case.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250221-net-next-mptcp-pm-misc-cleanup-3-v1-2-2b70ab1cee79@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The returned value is not used, it can then be dropped.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250221-net-next-mptcp-pm-misc-cleanup-3-v1-1-2b70ab1cee79@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch updates the interfaces set_flags to reduce repetitive
code, adds a new parameter 'local' for them.
The local address is parsed in public helper mptcp_pm_nl_set_flags_doit(),
then pass it to mptcp_pm_nl_set_flags() and mptcp_userspace_pm_set_flags().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The first parameter 'skb' in mptcp_pm_nl_set_flags() is only used to
obtained the network namespace, which can also be obtained through the
second parameters 'info' by using genl_info_net() helper.
This patch drops these useless parameters 'skb' in all three set_flags()
interfaces.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The netlink messages are sent both in mptcp_pm_nl_get_addr() and
mptcp_userspace_pm_get_addr(), this makes the code somewhat repetitive.
This is because the netlink PM and userspace PM use different locks to
protect the address entry that needs to be sent via the netlink message.
The former uses rcu read lock, and the latter uses msk->pm.lock.
The current get_addr() flow looks like this:
lock();
entry = get_entry();
send_nlmsg(entry);
unlock();
After holding the lock, get the entry from the list, send the entry, and
finally release the lock.
This patch changes the process by getting the entry while holding the lock,
then making a copy of the entry so that the lock can be released. Finally,
the copy of the entry is sent without locking:
lock();
entry = get_entry();
*copy = *entry;
unlock();
send_nlmsg(copy);
This way we can reuse the send_nlmsg() code in get_addr() interfaces
between the netlink PM and userspace PM. They only need to implement their
own get_addr() interfaces to hold the different locks, get the entry from
the different lists, then release the locks.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The address id is parsed both in mptcp_pm_nl_get_addr() and
mptcp_userspace_pm_get_addr(), this makes the code somewhat repetitive.
So this patch adds a new parameter 'id' for all get_addr() interfaces.
The address id is only parsed in mptcp_pm_nl_get_addr_doit(), then pass
it to both mptcp_pm_nl_get_addr() and mptcp_userspace_pm_get_addr().
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The first parameters 'skb' of get_addr() interfaces are now useless
since mptcp_userspace_pm_get_sock() helper is used. This patch drops
these useless parameters of them.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Three netlink functions:
mptcp_pm_nl_get_addr_doit()
mptcp_pm_nl_get_addr_dumpit()
mptcp_pm_nl_set_flags_doit()
are generic, implemented for each PM, in-kernel PM and userspace PM. It's
clearer to move them from pm_netlink.c to pm.c.
And the linked three path manager wrappers
mptcp_pm_get_addr()
mptcp_pm_dump_addr()
mptcp_pm_set_flags()
can be changed as static functions, no need to export them in protocol.h.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Instead of only returning a text message with GENL_SET_ERR_MSG(),
NL_SET_ERR_MSG_ATTR() can help the userspace developers by also
reporting which attribute is faulty.
When the error is specific to an attribute, NL_SET_ERR_MSG_ATTR() is now
used. The error messages have not been modified in this commit.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
mptcp_pm_parse_entry() will check if the given attribute is defined. If
not, it will return a generic error: "missing address info".
It might then not be clear for the userspace developer which attribute
is missing, especially when the command takes multiple addresses.
By using GENL_REQ_ATTR_CHECK(), the userspace will get a hint about
which attribute is missing, making thing clearer. Note that this is what
was already done for most of the other MPTCP NL commands, this patch
simply adds the missing ones.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Some error messages were:
- too generic: "missing input", "invalid request"
- not precise enough: "limit greater than maximum" but what's the max?
- missing: subflow not found, or connect error.
This can be easily improved by being more precise, or adding new error
messages.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
With the in-kernel path-manager, it is possible to change the 'fullmesh'
flag. The code in mptcp_pm_nl_fullmesh() expects to change it only on
'subflow' endpoints, to recreate more or less subflows using the linked
address.
Unfortunately, the set_flags() hook was a bit more permissive, and
allowed 'implicit' endpoints to get the 'fullmesh' flag while it is not
allowed before.
That's what syzbot found, triggering the following warning:
WARNING: CPU: 0 PID: 6499 at net/mptcp/pm_netlink.c:1496 __mark_subflow_endp_available net/mptcp/pm_netlink.c:1496 [inline]
WARNING: CPU: 0 PID: 6499 at net/mptcp/pm_netlink.c:1496 mptcp_pm_nl_fullmesh net/mptcp/pm_netlink.c:1980 [inline]
WARNING: CPU: 0 PID: 6499 at net/mptcp/pm_netlink.c:1496 mptcp_nl_set_flags net/mptcp/pm_netlink.c:2003 [inline]
WARNING: CPU: 0 PID: 6499 at net/mptcp/pm_netlink.c:1496 mptcp_pm_nl_set_flags+0x974/0xdc0 net/mptcp/pm_netlink.c:2064
Modules linked in:
CPU: 0 UID: 0 PID: 6499 Comm: syz.1.413 Not tainted 6.13.0-rc5-syzkaller-00172-gd1bf27c4e176 #0
Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
RIP: 0010:__mark_subflow_endp_available net/mptcp/pm_netlink.c:1496 [inline]
RIP: 0010:mptcp_pm_nl_fullmesh net/mptcp/pm_netlink.c:1980 [inline]
RIP: 0010:mptcp_nl_set_flags net/mptcp/pm_netlink.c:2003 [inline]
RIP: 0010:mptcp_pm_nl_set_flags+0x974/0xdc0 net/mptcp/pm_netlink.c:2064
Code: 01 00 00 49 89 c5 e8 fb 45 e8 f5 e9 b8 fc ff ff e8 f1 45 e8 f5 4c 89 f7 be 03 00 00 00 e8 44 1d 0b f9 eb a0 e8 dd 45 e8 f5 90 <0f> 0b 90 e9 17 ff ff ff 89 d9 80 e1 07 38 c1 0f 8c c9 fc ff ff 48
RSP: 0018:ffffc9000d307240 EFLAGS: 00010293
RAX: ffffffff8bb72e03 RBX: 0000000000000000 RCX: ffff88807da88000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc9000d307430 R08: ffffffff8bb72cf0 R09: 1ffff1100b842a5e
R10: dffffc0000000000 R11: ffffed100b842a5f R12: ffff88801e2e5ac0
R13: ffff88805c214800 R14: ffff88805c2152e8 R15: 1ffff1100b842a5d
FS: 00005555619f6500(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020002840 CR3: 00000000247e6000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0xb14/0xec0 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x1e3/0x430 net/netlink/af_netlink.c:2542
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1321 [inline]
netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1347
netlink_sendmsg+0x8e4/0xcb0 net/netlink/af_netlink.c:1891
sock_sendmsg_nosec net/socket.c:711 [inline]
__sock_sendmsg+0x221/0x270 net/socket.c:726
____sys_sendmsg+0x52a/0x7e0 net/socket.c:2583
___sys_sendmsg net/socket.c:2637 [inline]
__sys_sendmsg+0x269/0x350 net/socket.c:2669
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5fe8785d29
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff571f5558 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007f5fe8975fa0 RCX: 00007f5fe8785d29
RDX: 0000000000000000 RSI: 0000000020000480 RDI: 0000000000000007
RBP: 00007f5fe8801b08 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f5fe8975fa0 R14: 00007f5fe8975fa0 R15: 00000000000011f4
</TASK>
Here, syzbot managed to set the 'fullmesh' flag on an 'implicit' and
used -- according to 'id_avail_bitmap' -- endpoint, causing the PM to
try decrement the local_addr_used counter which is only incremented for
the 'subflow' endpoint.
Note that 'no type' endpoints -- not 'subflow', 'signal', 'implicit' --
are fine, because their ID will not be marked as used in the 'id_avail'
bitmap, and setting 'fullmesh' can help forcing the creation of subflow
when receiving an ADD_ADDR.
Fixes: 73c762c1f07d ("mptcp: set fullmesh flag in pm_netlink")
Cc: stable@vger.kernel.org
Reported-by: syzbot+cd16e79c1e45f3fe0377@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/6786ac51.050a0220.216c54.00a6.GAE@google.com
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/540
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250123-net-mptcp-syzbot-issues-v1-2-af73258a726f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since mptcp_pm_remove_addrs() is only called from the userspace PM, this
patch moves it into pm_userspace.c.
For this, lookup_subflow_by_saddr() and remove_anno_list_by_saddr()
helpers need to be exported in protocol.h. Also add "mptcp_" prefix for
these helpers.
Here, mptcp_pm_remove_addrs() is not changed to a static function because
it will be used in BPF Path Manager.
This patch doesn't change the behaviour of the code, just refactoring.
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241213-net-next-mptcp-pm-misc-cleanup-v1-4-ddb6d00109a8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The helper __lookup_addr() can be used in mptcp_pm_nl_get_local_id()
and mptcp_pm_nl_is_backup() to simplify the code, and avoid code
duplication.
Co-developed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241115-net-next-mptcp-pm-lockless-dump-v1-2-f4a1bcb4ca2c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To return an endpoint to the userspace via Netlink, and to dump all of
them, the endpoint list was iterated while holding the pernet->lock, but
only to read the content of the list.
In these cases, the spin locks can be replaced by RCU read ones, and use
the _rcu variants to iterate over the entries list in a lockless way.
Note that the __lookup_addr_by_id() helper has been modified to use the
_rcu variants of list_for_each_entry(), but with an extra conditions, so
it can be called either while the RCU read lock is held, or when the
associated pernet->lock is held.
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241115-net-next-mptcp-pm-lockless-dump-v1-1-f4a1bcb4ca2c@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.12-rc8).
Conflicts:
tools/testing/selftests/net/.gitignore
252e01e68241 ("selftests: net: add netlink-dumps to .gitignore")
be43a6b23829 ("selftests: ncdevmem: Move ncdevmem under drivers/net/hw")
https://lore.kernel.org/all/20241113122359.1b95180a@canb.auug.org.au/
drivers/net/phy/phylink.c
671154f174e0 ("net: phylink: ensure PHY momentary link-fails are handled")
7530ea26c810 ("net: phylink: remove "using_mac_select_pcs"")
Adjacent changes:
drivers/net/ethernet/stmicro/stmmac/dwmac-intel-plat.c
5b366eae7193 ("stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines")
e96321fad3ad ("net: ethernet: Switch back to struct platform_driver::remove()")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In mptcp_pm_create_subflow_or_signal_addr(), rcu_read_(un)lock() are
used as expected to iterate over the list of local addresses, but
list_for_each_entry() was used instead of list_for_each_entry_rcu() in
__lookup_addr(). It is important to use this variant which adds the
required READ_ONCE() (and diagnostic checks if enabled).
Because __lookup_addr() is also used in mptcp_pm_nl_set_flags() where it
is called under the pernet->lock and not rcu_read_lock(), an extra
condition is then passed to help the diagnostic checks making sure
either the associated spin lock or the RCU lock is held.
Fixes: 86e39e04482b ("mptcp: keep track of local endpoint still available for each msk")
Cc: stable@vger.kernel.org
Reviewed-by: Geliang Tang <geliang@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241112-net-mptcp-misc-6-12-pm-v1-3-b835580cefa8@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If the subflow is considered as "staled", it is better to avoid it to
send an ACK carrying an ADD_ADDR or RM_ADDR. Another subflow, if any,
will then be selected.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241021-net-next-mptcp-misc-6-13-v1-1-1ef02746504a@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Syzkaller reported this splat:
==================================================================
BUG: KASAN: slab-use-after-free in mptcp_pm_nl_rm_addr_or_subflow+0xb44/0xcc0 net/mptcp/pm_netlink.c:881
Read of size 4 at addr ffff8880569ac858 by task syz.1.2799/14662
CPU: 0 UID: 0 PID: 14662 Comm: syz.1.2799 Not tainted 6.12.0-rc2-syzkaller-00307-g36c254515dc6 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:377 [inline]
print_report+0xc3/0x620 mm/kasan/report.c:488
kasan_report+0xd9/0x110 mm/kasan/report.c:601
mptcp_pm_nl_rm_addr_or_subflow+0xb44/0xcc0 net/mptcp/pm_netlink.c:881
mptcp_pm_nl_rm_subflow_received net/mptcp/pm_netlink.c:914 [inline]
mptcp_nl_remove_id_zero_address+0x305/0x4a0 net/mptcp/pm_netlink.c:1572
mptcp_pm_nl_del_addr_doit+0x5c9/0x770 net/mptcp/pm_netlink.c:1603
genl_family_rcv_msg_doit+0x202/0x2f0 net/netlink/genetlink.c:1115
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0x565/0x800 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x165/0x410 net/netlink/af_netlink.c:2551
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
netlink_unicast+0x53c/0x7f0 net/netlink/af_netlink.c:1357
netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1901
sock_sendmsg_nosec net/socket.c:729 [inline]
__sock_sendmsg net/socket.c:744 [inline]
____sys_sendmsg+0x9ae/0xb40 net/socket.c:2607
___sys_sendmsg+0x135/0x1e0 net/socket.c:2661
__sys_sendmsg+0x117/0x1f0 net/socket.c:2690
do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
__do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386
do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411
entry_SYSENTER_compat_after_hwframe+0x84/0x8e
RIP: 0023:0xf7fe4579
Code: b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 03 74 d8 01 00 00 00 00 00 00 00 00 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
RSP: 002b:00000000f574556c EFLAGS: 00000296 ORIG_RAX: 0000000000000172
RAX: ffffffffffffffda RBX: 000000000000000b RCX: 0000000020000140
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000296 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
Allocated by task 5387:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
kasan_save_track+0x14/0x30 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:377 [inline]
__kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:394
kmalloc_noprof include/linux/slab.h:878 [inline]
kzalloc_noprof include/linux/slab.h:1014 [inline]
subflow_create_ctx+0x87/0x2a0 net/mptcp/subflow.c:1803
subflow_ulp_init+0xc3/0x4d0 net/mptcp/subflow.c:1956
__tcp_set_ulp net/ipv4/tcp_ulp.c:146 [inline]
tcp_set_ulp+0x326/0x7f0 net/ipv4/tcp_ulp.c:167
mptcp_subflow_create_socket+0x4ae/0x10a0 net/mptcp/subflow.c:1764
__mptcp_subflow_connect+0x3cc/0x1490 net/mptcp/subflow.c:1592
mptcp_pm_create_subflow_or_signal_addr+0xbda/0x23a0 net/mptcp/pm_netlink.c:642
mptcp_pm_nl_fully_established net/mptcp/pm_netlink.c:650 [inline]
mptcp_pm_nl_work+0x3a1/0x4f0 net/mptcp/pm_netlink.c:943
mptcp_worker+0x15a/0x1240 net/mptcp/protocol.c:2777
process_one_work+0x958/0x1b30 kernel/workqueue.c:3229
process_scheduled_works kernel/workqueue.c:3310 [inline]
worker_thread+0x6c8/0xf00 kernel/workqueue.c:3391
kthread+0x2c1/0x3a0 kernel/kthread.c:389
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Freed by task 113:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
kasan_save_track+0x14/0x30 mm/kasan/common.c:68
kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:579
poison_slab_object mm/kasan/common.c:247 [inline]
__kasan_slab_free+0x51/0x70 mm/kasan/common.c:264
kasan_slab_free include/linux/kasan.h:230 [inline]
slab_free_hook mm/slub.c:2342 [inline]
slab_free mm/slub.c:4579 [inline]
kfree+0x14f/0x4b0 mm/slub.c:4727
kvfree+0x47/0x50 mm/util.c:701
kvfree_rcu_list+0xf5/0x2c0 kernel/rcu/tree.c:3423
kvfree_rcu_drain_ready kernel/rcu/tree.c:3563 [inline]
kfree_rcu_monitor+0x503/0x8b0 kernel/rcu/tree.c:3632
kfree_rcu_shrink_scan+0x245/0x3a0 kernel/rcu/tree.c:3966
do_shrink_slab+0x44f/0x11c0 mm/shrinker.c:435
shrink_slab+0x32b/0x12a0 mm/shrinker.c:662
shrink_one+0x47e/0x7b0 mm/vmscan.c:4818
shrink_many mm/vmscan.c:4879 [inline]
lru_gen_shrink_node mm/vmscan.c:4957 [inline]
shrink_node+0x2452/0x39d0 mm/vmscan.c:5937
kswapd_shrink_node mm/vmscan.c:6765 [inline]
balance_pgdat+0xc19/0x18f0 mm/vmscan.c:6957
kswapd+0x5ea/0xbf0 mm/vmscan.c:7226
kthread+0x2c1/0x3a0 kernel/kthread.c:389
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Last potentially related work creation:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
__kasan_record_aux_stack+0xba/0xd0 mm/kasan/generic.c:541
kvfree_call_rcu+0x74/0xbe0 kernel/rcu/tree.c:3810
subflow_ulp_release+0x2ae/0x350 net/mptcp/subflow.c:2009
tcp_cleanup_ulp+0x7c/0x130 net/ipv4/tcp_ulp.c:124
tcp_v4_destroy_sock+0x1c5/0x6a0 net/ipv4/tcp_ipv4.c:2541
inet_csk_destroy_sock+0x1a3/0x440 net/ipv4/inet_connection_sock.c:1293
tcp_done+0x252/0x350 net/ipv4/tcp.c:4870
tcp_rcv_state_process+0x379b/0x4f30 net/ipv4/tcp_input.c:6933
tcp_v4_do_rcv+0x1ad/0xa90 net/ipv4/tcp_ipv4.c:1938
sk_backlog_rcv include/net/sock.h:1115 [inline]
__release_sock+0x31b/0x400 net/core/sock.c:3072
__tcp_close+0x4f3/0xff0 net/ipv4/tcp.c:3142
__mptcp_close_ssk+0x331/0x14d0 net/mptcp/protocol.c:2489
mptcp_close_ssk net/mptcp/protocol.c:2543 [inline]
mptcp_close_ssk+0x150/0x220 net/mptcp/protocol.c:2526
mptcp_pm_nl_rm_addr_or_subflow+0x2be/0xcc0 net/mptcp/pm_netlink.c:878
mptcp_pm_nl_rm_subflow_received net/mptcp/pm_netlink.c:914 [inline]
mptcp_nl_remove_id_zero_address+0x305/0x4a0 net/mptcp/pm_netlink.c:1572
mptcp_pm_nl_del_addr_doit+0x5c9/0x770 net/mptcp/pm_netlink.c:1603
genl_family_rcv_msg_doit+0x202/0x2f0 net/netlink/genetlink.c:1115
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0x565/0x800 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x165/0x410 net/netlink/af_netlink.c:2551
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1331 [inline]
netlink_unicast+0x53c/0x7f0 net/netlink/af_netlink.c:1357
netlink_sendmsg+0x8b8/0xd70 net/netlink/af_netlink.c:1901
sock_sendmsg_nosec net/socket.c:729 [inline]
__sock_sendmsg net/socket.c:744 [inline]
____sys_sendmsg+0x9ae/0xb40 net/socket.c:2607
___sys_sendmsg+0x135/0x1e0 net/socket.c:2661
__sys_sendmsg+0x117/0x1f0 net/socket.c:2690
do_syscall_32_irqs_on arch/x86/entry/common.c:165 [inline]
__do_fast_syscall_32+0x73/0x120 arch/x86/entry/common.c:386
do_fast_syscall_32+0x32/0x80 arch/x86/entry/common.c:411
entry_SYSENTER_compat_after_hwframe+0x84/0x8e
The buggy address belongs to the object at ffff8880569ac800
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 88 bytes inside of
freed 512-byte region [ffff8880569ac800, ffff8880569aca00)
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x569ac
head: order:2 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0x4fff00000000040(head|node=1|zone=1|lastcpupid=0x7ff)
page_type: f5(slab)
raw: 04fff00000000040 ffff88801ac42c80 dead000000000100 dead000000000122
raw: 0000000000000000 0000000080100010 00000001f5000000 0000000000000000
head: 04fff00000000040 ffff88801ac42c80 dead000000000100 dead000000000122
head: 0000000000000000 0000000080100010 00000001f5000000 0000000000000000
head: 04fff00000000002 ffffea00015a6b01 ffffffffffffffff 0000000000000000
head: 0000000000000004 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 2, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 10238, tgid 10238 (kworker/u32:6), ts 597403252405, free_ts 597177952947
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x2d1/0x350 mm/page_alloc.c:1537
prep_new_page mm/page_alloc.c:1545 [inline]
get_page_from_freelist+0x101e/0x3070 mm/page_alloc.c:3457
__alloc_pages_noprof+0x223/0x25a0 mm/page_alloc.c:4733
alloc_pages_mpol_noprof+0x2c9/0x610 mm/mempolicy.c:2265
alloc_slab_page mm/slub.c:2412 [inline]
allocate_slab mm/slub.c:2578 [inline]
new_slab+0x2ba/0x3f0 mm/slub.c:2631
___slab_alloc+0xd1d/0x16f0 mm/slub.c:3818
__slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3908
__slab_alloc_node mm/slub.c:3961 [inline]
slab_alloc_node mm/slub.c:4122 [inline]
__kmalloc_cache_noprof+0x2c5/0x310 mm/slub.c:4290
kmalloc_noprof include/linux/slab.h:878 [inline]
kzalloc_noprof include/linux/slab.h:1014 [inline]
mld_add_delrec net/ipv6/mcast.c:743 [inline]
igmp6_leave_group net/ipv6/mcast.c:2625 [inline]
igmp6_group_dropped+0x4ab/0xe40 net/ipv6/mcast.c:723
__ipv6_dev_mc_dec+0x281/0x360 net/ipv6/mcast.c:979
addrconf_leave_solict net/ipv6/addrconf.c:2253 [inline]
__ipv6_ifa_notify+0x3f6/0xc30 net/ipv6/addrconf.c:6283
addrconf_ifdown.isra.0+0xef9/0x1a20 net/ipv6/addrconf.c:3982
addrconf_notify+0x220/0x19c0 net/ipv6/addrconf.c:3781
notifier_call_chain+0xb9/0x410 kernel/notifier.c:93
call_netdevice_notifiers_info+0xbe/0x140 net/core/dev.c:1996
call_netdevice_notifiers_extack net/core/dev.c:2034 [inline]
call_netdevice_notifiers net/core/dev.c:2048 [inline]
dev_close_many+0x333/0x6a0 net/core/dev.c:1589
page last free pid 13136 tgid 13136 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1108 [inline]
free_unref_page+0x5f4/0xdc0 mm/page_alloc.c:2638
stack_depot_save_flags+0x2da/0x900 lib/stackdepot.c:666
kasan_save_stack+0x42/0x60 mm/kasan/common.c:48
kasan_save_track+0x14/0x30 mm/kasan/common.c:68
unpoison_slab_object mm/kasan/common.c:319 [inline]
__kasan_slab_alloc+0x89/0x90 mm/kasan/common.c:345
kasan_slab_alloc include/linux/kasan.h:247 [inline]
slab_post_alloc_hook mm/slub.c:4085 [inline]
slab_alloc_node mm/slub.c:4134 [inline]
kmem_cache_alloc_noprof+0x121/0x2f0 mm/slub.c:4141
skb_clone+0x190/0x3f0 net/core/skbuff.c:2084
do_one_broadcast net/netlink/af_netlink.c:1462 [inline]
netlink_broadcast_filtered+0xb11/0xef0 net/netlink/af_netlink.c:1540
netlink_broadcast+0x39/0x50 net/netlink/af_netlink.c:1564
uevent_net_broadcast_untagged lib/kobject_uevent.c:331 [inline]
kobject_uevent_net_broadcast lib/kobject_uevent.c:410 [inline]
kobject_uevent_env+0xacd/0x1670 lib/kobject_uevent.c:608
device_del+0x623/0x9f0 drivers/base/core.c:3882
snd_card_disconnect.part.0+0x58a/0x7c0 sound/core/init.c:546
snd_card_disconnect+0x1f/0x30 sound/core/init.c:495
snd_usx2y_disconnect+0xe9/0x1f0 sound/usb/usx2y/usbusx2y.c:417
usb_unbind_interface+0x1e8/0x970 drivers/usb/core/driver.c:461
device_remove drivers/base/dd.c:569 [inline]
device_remove+0x122/0x170 drivers/base/dd.c:561
That's because 'subflow' is used just after 'mptcp_close_ssk(subflow)',
which will initiate the release of its memory. Even if it is very likely
the release and the re-utilisation will be done later on, it is of
course better to avoid any issues and read the content of 'subflow'
before closing it.
Fixes: 1c1f72137598 ("mptcp: pm: only decrement add_addr_accepted for MPJ req")
Cc: stable@vger.kernel.org
Reported-by: syzbot+3c8b7a8e7df6a2a226ca@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/670d7337.050a0220.4cbc0.004f.GAE@google.com
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/20241015-net-mptcp-uaf-pm-rm-v1-1-c4ee5d987a64@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Syzkaller reported a lockdep splat:
============================================
WARNING: possible recursive locking detected
6.11.0-rc6-syzkaller-00019-g67784a74e258 #0 Not tainted
--------------------------------------------
syz-executor364/5113 is trying to acquire lock:
ffff8880449f1958 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
ffff8880449f1958 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328
but task is already holding lock:
ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(k-slock-AF_INET);
lock(k-slock-AF_INET);
*** DEADLOCK ***
May be due to missing lock nesting notation
7 locks held by syz-executor364/5113:
#0: ffff8880449f0e18 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1607 [inline]
#0: ffff8880449f0e18 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg+0x153/0x1b10 net/mptcp/protocol.c:1806
#1: ffff88803fe39ad8 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1607 [inline]
#1: ffff88803fe39ad8 (k-sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_sendmsg_fastopen+0x11f/0x530 net/mptcp/protocol.c:1727
#2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
#2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
#2: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: __ip_queue_xmit+0x5f/0x1b80 net/ipv4/ip_output.c:470
#3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
#3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
#3: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: ip_finish_output2+0x45f/0x1390 net/ipv4/ip_output.c:228
#4: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: local_lock_acquire include/linux/local_lock_internal.h:29 [inline]
#4: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: process_backlog+0x33b/0x15b0 net/core/dev.c:6104
#5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
#5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
#5: ffffffff8e938320 (rcu_read_lock){....}-{1:2}, at: ip_local_deliver_finish+0x230/0x5f0 net/ipv4/ip_input.c:232
#6: ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:351 [inline]
#6: ffff88803fe3cb58 (k-slock-AF_INET){+.-.}-{2:2}, at: sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328
stack backtrace:
CPU: 0 UID: 0 PID: 5113 Comm: syz-executor364 Not tainted 6.11.0-rc6-syzkaller-00019-g67784a74e258 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Call Trace:
<IRQ>
__dump_stack lib/dump_stack.c:93 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
check_deadlock kernel/locking/lockdep.c:3061 [inline]
validate_chain+0x15d3/0x5900 kernel/locking/lockdep.c:3855
__lock_acquire+0x137a/0x2040 kernel/locking/lockdep.c:5142
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5759
__raw_spin_lock include/linux/spinlock_api_smp.h:133 [inline]
_raw_spin_lock+0x2e/0x40 kernel/locking/spinlock.c:154
spin_lock include/linux/spinlock.h:351 [inline]
sk_clone_lock+0x2cd/0xf40 net/core/sock.c:2328
mptcp_sk_clone_init+0x32/0x13c0 net/mptcp/protocol.c:3279
subflow_syn_recv_sock+0x931/0x1920 net/mptcp/subflow.c:874
tcp_check_req+0xfe4/0x1a20 net/ipv4/tcp_minisocks.c:853
tcp_v4_rcv+0x1c3e/0x37f0 net/ipv4/tcp_ipv4.c:2267
ip_protocol_deliver_rcu+0x22e/0x440 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x341/0x5f0 net/ipv4/ip_input.c:233
NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
NF_HOOK+0x3a4/0x450 include/linux/netfilter.h:314
__netif_receive_skb_one_core net/core/dev.c:5661 [inline]
__netif_receive_skb+0x2bf/0x650 net/core/dev.c:5775
process_backlog+0x662/0x15b0 net/core/dev.c:6108
__napi_poll+0xcb/0x490 net/core/dev.c:6772
napi_poll net/core/dev.c:6841 [inline]
net_rx_action+0x89b/0x1240 net/core/dev.c:6963
handle_softirqs+0x2c4/0x970 kernel/softirq.c:554
do_softirq+0x11b/0x1e0 kernel/softirq.c:455
</IRQ>
<TASK>
__local_bh_enable_ip+0x1bb/0x200 kernel/softirq.c:382
local_bh_enable include/linux/bottom_half.h:33 [inline]
rcu_read_unlock_bh include/linux/rcupdate.h:908 [inline]
__dev_queue_xmit+0x1763/0x3e90 net/core/dev.c:4450
dev_queue_xmit include/linux/netdevice.h:3105 [inline]
neigh_hh_output include/net/neighbour.h:526 [inline]
neigh_output include/net/neighbour.h:540 [inline]
ip_finish_output2+0xd41/0x1390 net/ipv4/ip_output.c:235
ip_local_out net/ipv4/ip_output.c:129 [inline]
__ip_queue_xmit+0x118c/0x1b80 net/ipv4/ip_output.c:535
__tcp_transmit_skb+0x2544/0x3b30 net/ipv4/tcp_output.c:1466
tcp_rcv_synsent_state_process net/ipv4/tcp_input.c:6542 [inline]
tcp_rcv_state_process+0x2c32/0x4570 net/ipv4/tcp_input.c:6729
tcp_v4_do_rcv+0x77d/0xc70 net/ipv4/tcp_ipv4.c:1934
sk_backlog_rcv include/net/sock.h:1111 [inline]
__release_sock+0x214/0x350 net/core/sock.c:3004
release_sock+0x61/0x1f0 net/core/sock.c:3558
mptcp_sendmsg_fastopen+0x1ad/0x530 net/mptcp/protocol.c:1733
mptcp_sendmsg+0x1884/0x1b10 net/mptcp/protocol.c:1812
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x1a6/0x270 net/socket.c:745
____sys_sendmsg+0x525/0x7d0 net/socket.c:2597
___sys_sendmsg net/socket.c:2651 [inline]
__sys_sendmmsg+0x3b2/0x740 net/socket.c:2737
__do_sys_sendmmsg net/socket.c:2766 [inline]
__se_sys_sendmmsg net/socket.c:2763 [inline]
__x64_sys_sendmmsg+0xa0/0xb0 net/socket.c:2763
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f04fb13a6b9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 01 1a 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffd651f42d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000133
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f04fb13a6b9
RDX: 0000000000000001 RSI: 0000000020000d00 RDI: 0000000000000004
RBP: 00007ffd651f4310 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000020000080 R11: 0000000000000246 R12: 00000000000f4240
R13: 00007f04fb187449 R14: 00007ffd651f42f4 R15: 00007ffd651f4300
</TASK>
As noted by Cong Wang, the splat is false positive, but the code
path leading to the report is an unexpected one: a client is
attempting an MPC handshake towards the in-kernel listener created
by the in-kernel PM for a port based signal endpoint.
Such connection will be never accepted; many of them can make the
listener queue full and preventing the creation of MPJ subflow via
such listener - its intended role.
Explicitly detect this scenario at initial-syn time and drop the
incoming MPC request.
Fixes: 1729cf186d8a ("mptcp: create the listening socket for new port")
Cc: stable@vger.kernel.org
Reported-by: syzbot+f4aacdfef2c6a6529c3e@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f4aacdfef2c6a6529c3e
Cc: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241014-net-mptcp-mpc-port-endp-v2-1-7faea8e6b6ae@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In a previous fix, the in-kernel path-manager has been modified not to
retrigger the removal of a subflow if it was already closed, e.g. when
the initial subflow is removed, but kept in the subflows list.
To be complete, this fix should also skip the subflows that are in any
closing state: mptcp_close_ssk() will initiate the closure, but the
switch to the TCP_CLOSE state depends on the other peer.
Fixes: 58e1b66b4e4b ("mptcp: pm: do not remove already closed subflows")
Cc: stable@vger.kernel.org
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20241008-net-mptcp-fallback-fixes-v1-4-c6fb8e93e551@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts (sort of) and no adjacent changes.
This merge reverts commit b3c9e65eb227 ("net: hsr: remove seqnr_lock")
from net, as it was superseded by
commit 430d67bdcb04 ("net: hsr: Use the seqnr lock for frames received via interlink port.")
in net-next.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are two paths to access mptcp_pm_del_add_timer, result in a race
condition:
CPU1 CPU2
==== ====
net_rx_action
napi_poll netlink_sendmsg
__napi_poll netlink_unicast
process_backlog netlink_unicast_kernel
__netif_receive_skb genl_rcv
__netif_receive_skb_one_core netlink_rcv_skb
NF_HOOK genl_rcv_msg
ip_local_deliver_finish genl_family_rcv_msg
ip_protocol_deliver_rcu genl_family_rcv_msg_doit
tcp_v4_rcv mptcp_pm_nl_flush_addrs_doit
tcp_v4_do_rcv mptcp_nl_remove_addrs_list
tcp_rcv_established mptcp_pm_remove_addrs_and_subflows
tcp_data_queue remove_anno_list_by_saddr
mptcp_incoming_options mptcp_pm_del_add_timer
mptcp_pm_del_add_timer kfree(entry)
In remove_anno_list_by_saddr(running on CPU2), after leaving the critical
zone protected by "pm.lock", the entry will be released, which leads to the
occurrence of uaf in the mptcp_pm_del_add_timer(running on CPU1).
Keeping a reference to add_timer inside the lock, and calling
sk_stop_timer_sync() with this reference, instead of "entry->add_timer".
Move list_del(&entry->list) to mptcp_pm_del_add_timer and inside the pm lock,
do not directly access any members of the entry outside the pm lock, which
can avoid similar "entry->x" uaf.
Fixes: 00cfd77b9063 ("mptcp: retransmit ADD_ADDR when timeout")
Cc: stable@vger.kernel.org
Reported-and-tested-by: syzbot+f3a31fb909db9b2a5c4d@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=f3a31fb909db9b2a5c4d
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/tencent_7142963A37944B4A74EF76CD66EA3C253609@qq.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
__mptcp_subflow_connect() is currently called from the path-managers,
which have all the required information to create subflows. No need to
call the PM again to re-iterate over the list of entries with RCU lock
to get more info.
Instead, it is possible to pass a mptcp_pm_addr_entry structure, instead
of a mptcp_addr_info one. The former contains the ifindex and the flags
that are required when creating the new subflow.
This is a partial revert of commit ee285257a9c1 ("mptcp: drop flags and
ifindex arguments").
While at it, the local ID can also be set if it is known and 0, to avoid
having to set it in the 'rebuild_header' hook, which will cause a new
iteration of the endpoint entries.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240902-net-next-mptcp-mib-mpjtx-misc-v1-2-d3e0f3773b90@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rename all the helpers specific to the flushing operations to make it
clear that the intention is to flush all created subflows, and remove
all announced addresses, not just a specific selection.
That way, it is easier to understand why the id_avail_bitmap and
local_addr_used are reset at the end.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240902-net-next-mptcp-mib-mpjtx-misc-v1-1-d3e0f3773b90@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ADD_ADDR 0 with the address from the initial subflow should not be
considered as a new address: this is not something new. If the host
receives it, it simply means that the address is available again.
When receiving an ADD_ADDR for the ID 0, the PM already doesn't consider
it as new by not incrementing the 'add_addr_accepted' counter. But the
'accept_addr' might not be set if the limit has already been reached:
this can be bypassed in this case. But before, it is important to check
that this ADD_ADDR for the ID 0 is for the same address as the initial
subflow. If not, it is not something that should happen, and the
ADD_ADDR can be ignored.
Note that if an ADD_ADDR is received while there is already a subflow
opened using the same address, this ADD_ADDR is ignored as well. It
means that if multiple ADD_ADDR for ID 0 are received, there will not be
any duplicated subflows created by the client.
Fixes: d0876b2284cf ("mptcp: add the incoming RM_ADDR support")
Cc: stable@vger.kernel.org
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
'local_addr_used' and 'add_addr_accepted' are decremented for addresses
not related to the initial subflow (ID0), because the source and
destination addresses of the initial subflows are known from the
beginning: they don't count as "additional local address being used" or
"ADD_ADDR being accepted".
It is then required not to increment them when the entrypoint used by
the initial subflow is removed and re-added during a connection. Without
this modification, this entrypoint cannot be removed and re-added more
than once.
Reported-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/512
Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking")
Reported-by: syzbot+455d38ecd5f655fc45cf@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/00000000000049861306209237f4@google.com
Cc: stable@vger.kernel.org
Tested-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|