Age | Commit message (Collapse) | Author |
|
Guillaume says:
> I believe commit 5f7fc5d69f6e ("SUNRPC: Resupply rq_pages from
> node-local memory") in Linux 6.5+ is incorrect. It passes
> unconditionally rq_pool->sp_id as the NUMA node.
>
> While the comment in the svc_pool declaration in sunrpc/svc.h says
> that sp_id is also the NUMA node id, it might not be the case if
> the svc is created using svc_create_pooled(). svc_created_pooled()
> can use the per-cpu pool mode therefore in this case sp_id would
> be the cpu id.
Fix this by reverting now. At a later point this minor optimization,
and the deceptive labeling of the sp_id field, can be revisited.
Reported-by: Guillaume Morin <guillaume@morinfr.org>
Closes: https://lore.kernel.org/linux-nfs/ZYC9rsno8qYggVt9@bender.morinfr.org/T/#u
Fixes: 5f7fc5d69f6e ("SUNRPC: Resupply rq_pages from node-local memory")
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
|
|
With the introduction of the rcu_replace_pointer_rtnl helper,
cleanup the rtnl_unregister_* functions to use the helper instead
of open coding it.
Signed-off-by: Pedro Tammela <pctammela@mojatatu.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
W=1 builds warn on missing MODULE_DESCRIPTION, add them here in MPTCP.
Only two were missing: two modules with different KUnit tests for MPTCP.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The netlink PM can race with fastopen self-connect attempts, shutting
down the first subflow via:
MPTCP_PM_CMD_DEL_ADDR -> mptcp_nl_remove_id_zero_address ->
mptcp_pm_nl_rm_subflow_received -> mptcp_close_ssk
and transitioning such subflow to FIN_WAIT1 status before the syn-ack
packet is processed. The MPTCP code does not react to such state change,
leaving the connection in not-fallback status and the subflow handshake
uncompleted, triggering the following splat:
WARNING: CPU: 0 PID: 10630 at net/mptcp/subflow.c:1405 subflow_data_ready+0x39f/0x690 net/mptcp/subflow.c:1405
Modules linked in:
CPU: 0 PID: 10630 Comm: kworker/u4:11 Not tainted 6.6.0-syzkaller-14500-g1c41041124bd #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
Workqueue: bat_events batadv_nc_worker
RIP: 0010:subflow_data_ready+0x39f/0x690 net/mptcp/subflow.c:1405
Code: 18 89 ee e8 e3 d2 21 f7 40 84 ed 75 1f e8 a9 d7 21 f7 44 89 fe bf 07 00 00 00 e8 0c d3 21 f7 41 83 ff 07 74 07 e8 91 d7 21 f7 <0f> 0b e8 8a d7 21 f7 48 89 df e8 d2 b2 ff ff 31 ff 89 c5 89 c6 e8
RSP: 0018:ffffc90000007448 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff888031efc700 RCX: ffffffff8a65baf4
RDX: ffff888043222140 RSI: ffffffff8a65baff RDI: 0000000000000005
RBP: 0000000000000000 R08: 0000000000000005 R09: 0000000000000007
R10: 000000000000000b R11: 0000000000000000 R12: 1ffff92000000e89
R13: ffff88807a534d80 R14: ffff888021c11a00 R15: 000000000000000b
FS: 0000000000000000(0000) GS:ffff8880b9800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa19a0ffc81 CR3: 000000007a2db000 CR4: 00000000003506f0
DR0: 000000000000d8dd DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Call Trace:
<IRQ>
tcp_data_ready+0x14c/0x5b0 net/ipv4/tcp_input.c:5128
tcp_data_queue+0x19c3/0x5190 net/ipv4/tcp_input.c:5208
tcp_rcv_state_process+0x11ef/0x4e10 net/ipv4/tcp_input.c:6844
tcp_v4_do_rcv+0x369/0xa10 net/ipv4/tcp_ipv4.c:1929
tcp_v4_rcv+0x3888/0x3b30 net/ipv4/tcp_ipv4.c:2329
ip_protocol_deliver_rcu+0x9f/0x480 net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x2e4/0x510 net/ipv4/ip_input.c:233
NF_HOOK include/linux/netfilter.h:314 [inline]
NF_HOOK include/linux/netfilter.h:308 [inline]
ip_local_deliver+0x1b6/0x550 net/ipv4/ip_input.c:254
dst_input include/net/dst.h:461 [inline]
ip_rcv_finish+0x1c4/0x2e0 net/ipv4/ip_input.c:449
NF_HOOK include/linux/netfilter.h:314 [inline]
NF_HOOK include/linux/netfilter.h:308 [inline]
ip_rcv+0xce/0x440 net/ipv4/ip_input.c:569
__netif_receive_skb_one_core+0x115/0x180 net/core/dev.c:5527
__netif_receive_skb+0x1f/0x1b0 net/core/dev.c:5641
process_backlog+0x101/0x6b0 net/core/dev.c:5969
__napi_poll.constprop.0+0xb4/0x540 net/core/dev.c:6531
napi_poll net/core/dev.c:6600 [inline]
net_rx_action+0x956/0xe90 net/core/dev.c:6733
__do_softirq+0x21a/0x968 kernel/softirq.c:553
do_softirq kernel/softirq.c:454 [inline]
do_softirq+0xaa/0xe0 kernel/softirq.c:441
</IRQ>
<TASK>
__local_bh_enable_ip+0xf8/0x120 kernel/softirq.c:381
spin_unlock_bh include/linux/spinlock.h:396 [inline]
batadv_nc_purge_paths+0x1ce/0x3c0 net/batman-adv/network-coding.c:471
batadv_nc_worker+0x9b1/0x10e0 net/batman-adv/network-coding.c:722
process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
process_scheduled_works kernel/workqueue.c:2703 [inline]
worker_thread+0x8b9/0x1290 kernel/workqueue.c:2784
kthread+0x33c/0x440 kernel/kthread.c:388
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:242
</TASK>
To address the issue, catch the racing subflow state change and
use it to cause the MPTCP fallback. Such fallback is also used to
cause the first subflow state propagation to the msk socket via
mptcp_set_connected(). After this change, the first subflow can
additionally propagate the TCP_FIN_WAIT1 state, so rename the
helper accordingly.
Finally, if the state propagation is delayed to the msk release
callback, the first subflow can change to a different state in between.
Cache the relevant target state in a new msk-level field and use
such value to update the msk state at release time.
Fixes: 1e777f39b4d7 ("mptcp: add MSG_FASTOPEN sendmsg flag support")
Cc: stable@vger.kernel.org
Reported-by: <syzbot+c53d4d3ddb327e80bc51@syzkaller.appspotmail.com>
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/458
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts <matttbe@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order to address the issues encountered with commit 1effe8ca4e34
("skbuff: fix coalescing for page_pool fragment recycling"), the
combination of the following condition was excluded from skb coalescing:
from->pp_recycle = 1
from->cloned = 1
to->pp_recycle = 1
However, with page pool environments, the aforementioned combination can
be quite common(ex. NetworkMananger may lead to the additional
packet_type being registered, thus the cloning). In scenarios with a
higher number of small packets, it can significantly affect the success
rate of coalescing. For example, considering packets of 256 bytes size,
our comparison of coalescing success rate is as follows:
Without page pool: 70%
With page pool: 13%
Consequently, this has an impact on performance:
Without page pool: 2.57 Gbits/sec
With page pool: 2.26 Gbits/sec
Therefore, it seems worthwhile to optimize this scenario and enable
coalescing of this particular combination. To achieve this, we need to
ensure the correct increment of the "from" SKB page's page pool
reference count (pp_ref_count).
Following this optimization, the success rate of coalescing measured in
our environment has improved as follows:
With page pool: 60%
This success rate is approaching the rate achieved without using page
pool, and the performance has also been improved:
With page pool: 2.52 Gbits/sec
Below is the performance comparison for small packets before and after
this optimization. We observe no impact to packets larger than 4K.
packet size before after improved
(bytes) (Gbits/sec) (Gbits/sec)
128 1.19 1.27 7.13%
256 2.26 2.52 11.75%
512 4.13 4.81 16.50%
1024 6.17 6.73 9.05%
2048 14.54 15.47 6.45%
4096 25.44 27.87 9.52%
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Suggested-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Wrap code for checking if a page is a page_pool page into a
function for better readability and ease of reuse.
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Mina Almasry <almarsymina@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Up to now, we were only subtracting from the number of used page fragments
to figure out when a page could be freed or recycled. A following patch
introduces support for multiple users referencing the same fragment. So
reduce the initial page fragments value to half to avoid overflowing.
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Mina Almasry <almarsymina@google.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In commit 1580ab63fc9a ("tcp/dccp: better use of ephemeral ports in connect()")
we added an heuristic to select even ports for connect() and odd ports for bind().
This was nice because no applications changes were needed.
But it added more costs when all even ports are in use,
when there are few listeners and many active connections.
Since then, IP_LOCAL_PORT_RANGE has been added to permit an application
to partition ephemeral port range at will.
This patch extends the idea so that if IP_LOCAL_PORT_RANGE is set on
a socket before accept(), port selection no longer favors even ports.
This means that connect() can find a suitable source port faster,
and applications can use a different split between connect() and bind()
users.
This should give more entropy to Toeplitz hash used in RSS: Using even
ports was wasting one bit from the 16bit sport.
A similar change can be done in inet_csk_find_open_port() if needed.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://lore.kernel.org/r/20231214192939.1962891-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Change inet_sk_get_local_port_range() to return a boolean,
telling the callers if the port range was provided by
IP_LOCAL_PORT_RANGE socket option.
Adds documentation while we are at it.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20231214192939.1962891-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Ensure the various dtor functions match their prototype and retain
their CFI signatures, since they don't have their address taken, they
are prone to not getting CFI, making them impossible to call
indirectly.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231215092707.799451071@infradead.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
BPF struct_ops uses __arch_prepare_bpf_trampoline() to write
trampolines for indirect function calls. These tramplines much have
matching CFI.
In order to obtain the correct CFI hash for the various methods, add a
matching structure that contains stub functions, the compiler will
generate correct CFI which we can pilfer for the trampolines.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231215092707.566977112@infradead.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
This code is rarely (never?) enabled by distros, and it hasn't caught
anything in decades. Let's kill off this legacy debug code.
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This can cause a race with bt_sock_ioctl() because
bt_sock_recvmsg() gets the skb from sk->sk_receive_queue
and then frees it without holding lock_sock.
A use-after-free for a skb occurs with the following flow.
```
bt_sock_recvmsg() -> skb_recv_datagram() -> skb_free_datagram()
bt_sock_ioctl() -> skb_peek()
```
Add lock_sock to bt_sock_recvmsg() to fix this issue.
Cc: stable@vger.kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
When we are slave role and receives l2cap conn req when encryption has
started, we should check the enc key size to avoid KNOB attack or BLUFFS
attack.
From SIG recommendation, implementations are advised to reject
service-level connections on an encrypted baseband link with key
strengths below 7 octets.
A simple and clear way to achieve this is to place the enc key size
check in hci_cc_read_enc_key_size()
The btmon log below shows the case that lacks enc key size check.
> HCI Event: Connect Request (0x04) plen 10
Address: BB:22:33:44:55:99 (OUI BB-22-33)
Class: 0x480104
Major class: Computer (desktop, notebook, PDA, organizers)
Minor class: Desktop workstation
Capturing (Scanner, Microphone)
Telephony (Cordless telephony, Modem, Headset)
Link type: ACL (0x01)
< HCI Command: Accept Connection Request (0x01|0x0009) plen 7
Address: BB:22:33:44:55:99 (OUI BB-22-33)
Role: Peripheral (0x01)
> HCI Event: Command Status (0x0f) plen 4
Accept Connection Request (0x01|0x0009) ncmd 2
Status: Success (0x00)
> HCI Event: Connect Complete (0x03) plen 11
Status: Success (0x00)
Handle: 1
Address: BB:22:33:44:55:99 (OUI BB-22-33)
Link type: ACL (0x01)
Encryption: Disabled (0x00)
...
> HCI Event: Encryption Change (0x08) plen 4
Status: Success (0x00)
Handle: 1 Address: BB:22:33:44:55:99 (OUI BB-22-33)
Encryption: Enabled with E0 (0x01)
< HCI Command: Read Encryption Key Size (0x05|0x0008) plen 2
Handle: 1 Address: BB:22:33:44:55:99 (OUI BB-22-33)
> HCI Event: Command Complete (0x0e) plen 7
Read Encryption Key Size (0x05|0x0008) ncmd 2
Status: Success (0x00)
Handle: 1 Address: BB:22:33:44:55:99 (OUI BB-22-33)
Key size: 6
// We should check the enc key size
...
> ACL Data RX: Handle 1 flags 0x02 dlen 12
L2CAP: Connection Request (0x02) ident 3 len 4
PSM: 25 (0x0019)
Source CID: 64
< ACL Data TX: Handle 1 flags 0x00 dlen 16
L2CAP: Connection Response (0x03) ident 3 len 8
Destination CID: 64
Source CID: 64
Result: Connection pending (0x0001)
Status: Authorization pending (0x0002)
> HCI Event: Number of Completed Packets (0x13) plen 5
Num handles: 1
Handle: 1 Address: BB:22:33:44:55:99 (OUI BB-22-33)
Count: 1
#35: len 16 (25 Kb/s)
Latency: 5 msec (2-7 msec ~4 msec)
< ACL Data TX: Handle 1 flags 0x00 dlen 16
L2CAP: Connection Response (0x03) ident 3 len 8
Destination CID: 64
Source CID: 64
Result: Connection successful (0x0000)
Status: No further information available (0x0000)
Cc: stable@vger.kernel.org
Signed-off-by: Alex Lu <alex_lu@realsil.com.cn>
Signed-off-by: Max Chou <max.chou@realtek.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
If two Bluetooth devices both support BR/EDR and BLE, and also
support Secure Connections, then they only need to pair once.
The LTK generated during the LE pairing process may be converted
into a BR/EDR link key for BR/EDR transport, and conversely, a
link key generated during the BR/EDR SSP pairing process can be
converted into an LTK for LE transport. Hence, the link type of
the link key and LTK is not fixed, they can be either an LE LINK
or an ACL LINK.
Currently, in the mgmt_new_irk/ltk/crsk/link_key functions, the
link type is fixed, which could lead to incorrect address types
being reported to the application layer. Therefore, it is necessary
to add link_type/addr_type to the smp_irk/ltk/crsk and link_key,
to ensure the generation of the correct address type.
SMP over BREDR:
Before Fix:
> ACL Data RX: Handle 11 flags 0x02 dlen 12
BR/EDR SMP: Identity Address Information (0x09) len 7
Address: F8:7D:76:F2:12:F3 (OUI F8-7D-76)
@ MGMT Event: New Identity Resolving Key (0x0018) plen 30
Random address: 00:00:00:00:00:00 (Non-Resolvable)
LE Address: F8:7D:76:F2:12:F3 (OUI F8-7D-76)
@ MGMT Event: New Long Term Key (0x000a) plen 37
LE Address: F8:7D:76:F2:12:F3 (OUI F8-7D-76)
Key type: Authenticated key from P-256 (0x03)
After Fix:
> ACL Data RX: Handle 11 flags 0x02 dlen 12
BR/EDR SMP: Identity Address Information (0x09) len 7
Address: F8:7D:76:F2:12:F3 (OUI F8-7D-76)
@ MGMT Event: New Identity Resolving Key (0x0018) plen 30
Random address: 00:00:00:00:00:00 (Non-Resolvable)
BR/EDR Address: F8:7D:76:F2:12:F3 (OUI F8-7D-76)
@ MGMT Event: New Long Term Key (0x000a) plen 37
BR/EDR Address: F8:7D:76:F2:12:F3 (OUI F8-7D-76)
Key type: Authenticated key from P-256 (0x03)
SMP over LE:
Before Fix:
@ MGMT Event: New Identity Resolving Key (0x0018) plen 30
Random address: 5F:5C:07:37:47:D5 (Resolvable)
LE Address: F8:7D:76:F2:12:F3 (OUI F8-7D-76)
@ MGMT Event: New Long Term Key (0x000a) plen 37
LE Address: F8:7D:76:F2:12:F3 (OUI F8-7D-76)
Key type: Authenticated key from P-256 (0x03)
@ MGMT Event: New Link Key (0x0009) plen 26
BR/EDR Address: F8:7D:76:F2:12:F3 (OUI F8-7D-76)
Key type: Authenticated Combination key from P-256 (0x08)
After Fix:
@ MGMT Event: New Identity Resolving Key (0x0018) plen 30
Random address: 5E:03:1C:00:38:21 (Resolvable)
LE Address: F8:7D:76:F2:12:F3 (OUI F8-7D-76)
@ MGMT Event: New Long Term Key (0x000a) plen 37
LE Address: F8:7D:76:F2:12:F3 (OUI F8-7D-76)
Key type: Authenticated key from P-256 (0x03)
@ MGMT Event: New Link Key (0x0009) plen 26
Store hint: Yes (0x01)
LE Address: F8:7D:76:F2:12:F3 (OUI F8-7D-76)
Key type: Authenticated Combination key from P-256 (0x08)
Cc: stable@vger.kernel.org
Signed-off-by: Xiao Yao <xiaoyao@rock-chips.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
L2CAP/COS/CED/BI-02-C PTS test send a malformed L2CAP signaling packet
with 2 commands in it (a connection request and an unknown command) and
expect to get a connection response packet and a command reject packet.
The second is currently not sent.
Cc: stable@vger.kernel.org
Signed-off-by: Frédéric Danis <frederic.danis@collabora.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Turning on -Wstringop-overflow globally exposed a misleading compiler
warning in bluetooth:
net/bluetooth/hci_event.c: In function 'hci_cc_read_class_of_dev':
net/bluetooth/hci_event.c:524:9: error: 'memcpy' writing 3 bytes into a
region of size 0 overflows the destination [-Werror=stringop-overflow=]
524 | memcpy(hdev->dev_class, rp->dev_class, 3);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The problem here is the check for hdev being NULL in bt_dev_dbg() that
leads the compiler to conclude that hdev->dev_class might be an invalid
pointer access.
Add another explicit check for the same condition to make sure gcc sees
this cannot happen.
Fixes: a9de9248064b ("[Bluetooth] Switch from OGF+OCF to using only opcodes")
Fixes: 1b56c90018f0 ("Makefile: Enable -Wstringop-overflow globally")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Before setting HCI_INQUIRY bit check if HCI_OP_INQUIRY was really sent
otherwise the controller maybe be generating invalid events or, more
likely, it is a result of fuzzing tools attempting to test the right
behavior of the stack when unexpected events are generated.
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218151
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Some layers such as SMP depend on getting notified about encryption
changes immediately as they only allow certain PDU to be transmitted
over an encrypted link which may cause SMP implementation to reject
valid PDUs received thus causing pairing to fail when it shouldn't.
Fixes: 7aca0ac4792e ("Bluetooth: Wait for HCI_OP_WRITE_AUTH_PAYLOAD_TO to complete")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
We assume in handful of places that the name of the spec is
the same as the name of the family. We could fix that but
it seems like a fair assumption to make. Rename the MPTCP
spec instead.
Reviewed-by: Mat Martineau <martineau@kernel.org>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzbot found an interesting netdev refcounting issue in
net/rose/af_rose.c, thanks to CONFIG_NET_DEV_REFCNT_TRACKER=y [1]
Problem is that rose_kill_by_device() can change rose->device
while other threads do not expect the pointer to be changed.
We have to first collect sockets in a temporary array,
then perform the changes while holding the socket
lock and rose_list_lock spinlock (in this order)
Change rose_release() to also acquire rose_list_lock
before releasing the netdev refcount.
[1]
[ 1185.055088][ T7889] ref_tracker: reference already released.
[ 1185.061476][ T7889] ref_tracker: allocated in:
[ 1185.066081][ T7889] rose_bind+0x4ab/0xd10
[ 1185.070446][ T7889] __sys_bind+0x1ec/0x220
[ 1185.074818][ T7889] __x64_sys_bind+0x72/0xb0
[ 1185.079356][ T7889] do_syscall_64+0x40/0x110
[ 1185.083897][ T7889] entry_SYSCALL_64_after_hwframe+0x63/0x6b
[ 1185.089835][ T7889] ref_tracker: freed in:
[ 1185.094088][ T7889] rose_release+0x2f5/0x570
[ 1185.098629][ T7889] __sock_release+0xae/0x260
[ 1185.103262][ T7889] sock_close+0x1c/0x20
[ 1185.107453][ T7889] __fput+0x270/0xbb0
[ 1185.111467][ T7889] task_work_run+0x14d/0x240
[ 1185.116085][ T7889] get_signal+0x106f/0x2790
[ 1185.120622][ T7889] arch_do_signal_or_restart+0x90/0x7f0
[ 1185.126205][ T7889] exit_to_user_mode_prepare+0x121/0x240
[ 1185.131846][ T7889] syscall_exit_to_user_mode+0x1e/0x60
[ 1185.137293][ T7889] do_syscall_64+0x4d/0x110
[ 1185.141783][ T7889] entry_SYSCALL_64_after_hwframe+0x63/0x6b
[ 1185.148085][ T7889] ------------[ cut here ]------------
WARNING: CPU: 1 PID: 7889 at lib/ref_tracker.c:255 ref_tracker_free+0x61a/0x810 lib/ref_tracker.c:255
Modules linked in:
CPU: 1 PID: 7889 Comm: syz-executor.2 Not tainted 6.7.0-rc4-syzkaller-00162-g65c95f78917e #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
RIP: 0010:ref_tracker_free+0x61a/0x810 lib/ref_tracker.c:255
Code: 00 44 8b 6b 18 31 ff 44 89 ee e8 21 62 f5 fc 45 85 ed 0f 85 a6 00 00 00 e8 a3 66 f5 fc 48 8b 34 24 48 89 ef e8 27 5f f1 05 90 <0f> 0b 90 bb ea ff ff ff e9 52 fd ff ff e8 84 66 f5 fc 4c 8d 6d 44
RSP: 0018:ffffc90004917850 EFLAGS: 00010202
RAX: 0000000000000201 RBX: ffff88802618f4c0 RCX: 0000000000000000
RDX: 0000000000000202 RSI: ffffffff8accb920 RDI: 0000000000000001
RBP: ffff8880269ea5b8 R08: 0000000000000001 R09: fffffbfff23e35f6
R10: ffffffff91f1afb7 R11: 0000000000000001 R12: 1ffff92000922f0c
R13: 0000000005a2039b R14: ffff88802618f4d8 R15: 00000000ffffffff
FS: 00007f0a720ef6c0(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f43a819d988 CR3: 0000000076c64000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
netdev_tracker_free include/linux/netdevice.h:4127 [inline]
netdev_put include/linux/netdevice.h:4144 [inline]
netdev_put include/linux/netdevice.h:4140 [inline]
rose_kill_by_device net/rose/af_rose.c:195 [inline]
rose_device_event+0x25d/0x330 net/rose/af_rose.c:218
notifier_call_chain+0xb6/0x3b0 kernel/notifier.c:93
call_netdevice_notifiers_info+0xbe/0x130 net/core/dev.c:1967
call_netdevice_notifiers_extack net/core/dev.c:2005 [inline]
call_netdevice_notifiers net/core/dev.c:2019 [inline]
__dev_notify_flags+0x1f5/0x2e0 net/core/dev.c:8646
dev_change_flags+0x122/0x170 net/core/dev.c:8682
dev_ifsioc+0x9ad/0x1090 net/core/dev_ioctl.c:529
dev_ioctl+0x224/0x1090 net/core/dev_ioctl.c:786
sock_do_ioctl+0x198/0x270 net/socket.c:1234
sock_ioctl+0x22e/0x6b0 net/socket.c:1339
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:871 [inline]
__se_sys_ioctl fs/ioctl.c:857 [inline]
__x64_sys_ioctl+0x18f/0x210 fs/ioctl.c:857
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f0a7147cba9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f0a720ef0c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f0a7159bf80 RCX: 00007f0a7147cba9
RDX: 0000000020000040 RSI: 0000000000008914 RDI: 0000000000000004
RBP: 00007f0a714c847a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f0a7159bf80 R15: 00007ffc8bb3a5f8
</TASK>
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Bernard Pidoux <f6bvp@free.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
optmem_max being used in tx zerocopy,
we want to be able to control it on a netns basis.
Following patch changes two tests.
Tested:
oqq130:~# cat /proc/sys/net/core/optmem_max
131072
oqq130:~# echo 1000000 >/proc/sys/net/core/optmem_max
oqq130:~# cat /proc/sys/net/core/optmem_max
1000000
oqq130:~# unshare -n
oqq130:~# cat /proc/sys/net/core/optmem_max
131072
oqq130:~# exit
logout
oqq130:~# cat /proc/sys/net/core/optmem_max
1000000
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For many years, /proc/sys/net/core/optmem_max default value
on a 64bit kernel has been 20 KB.
Regular usage of TCP tx zerocopy needs a bit more.
Google has used 128KB as the default value for 7 years without
any problem.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
ife_decode() calls pskb_may_pull() two times, we need to reload
ifehdr after the second one, or risk use-after-free as reported
by syzbot:
BUG: KASAN: slab-use-after-free in __ife_tlv_meta_valid net/ife/ife.c:108 [inline]
BUG: KASAN: slab-use-after-free in ife_tlv_meta_decode+0x1d1/0x210 net/ife/ife.c:131
Read of size 2 at addr ffff88802d7300a4 by task syz-executor.5/22323
CPU: 0 PID: 22323 Comm: syz-executor.5 Not tainted 6.7.0-rc3-syzkaller-00804-g074ac38d5b95 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:364 [inline]
print_report+0xc4/0x620 mm/kasan/report.c:475
kasan_report+0xda/0x110 mm/kasan/report.c:588
__ife_tlv_meta_valid net/ife/ife.c:108 [inline]
ife_tlv_meta_decode+0x1d1/0x210 net/ife/ife.c:131
tcf_ife_decode net/sched/act_ife.c:739 [inline]
tcf_ife_act+0x4e3/0x1cd0 net/sched/act_ife.c:879
tc_act include/net/tc_wrapper.h:221 [inline]
tcf_action_exec+0x1ac/0x620 net/sched/act_api.c:1079
tcf_exts_exec include/net/pkt_cls.h:344 [inline]
mall_classify+0x201/0x310 net/sched/cls_matchall.c:42
tc_classify include/net/tc_wrapper.h:227 [inline]
__tcf_classify net/sched/cls_api.c:1703 [inline]
tcf_classify+0x82f/0x1260 net/sched/cls_api.c:1800
hfsc_classify net/sched/sch_hfsc.c:1147 [inline]
hfsc_enqueue+0x315/0x1060 net/sched/sch_hfsc.c:1546
dev_qdisc_enqueue+0x3f/0x230 net/core/dev.c:3739
__dev_xmit_skb net/core/dev.c:3828 [inline]
__dev_queue_xmit+0x1de1/0x3d30 net/core/dev.c:4311
dev_queue_xmit include/linux/netdevice.h:3165 [inline]
packet_xmit+0x237/0x350 net/packet/af_packet.c:276
packet_snd net/packet/af_packet.c:3081 [inline]
packet_sendmsg+0x24aa/0x5200 net/packet/af_packet.c:3113
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0xd5/0x180 net/socket.c:745
__sys_sendto+0x255/0x340 net/socket.c:2190
__do_sys_sendto net/socket.c:2202 [inline]
__se_sys_sendto net/socket.c:2198 [inline]
__x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7fe9acc7cae9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fe9ada450c8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007fe9acd9bf80 RCX: 00007fe9acc7cae9
RDX: 000000000000fce0 RSI: 00000000200002c0 RDI: 0000000000000003
RBP: 00007fe9accc847a R08: 0000000020000140 R09: 0000000000000014
R10: 0000000000000004 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007fe9acd9bf80 R15: 00007ffd5427ae78
</TASK>
Allocated by task 22323:
kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
____kasan_kmalloc mm/kasan/common.c:374 [inline]
__kasan_kmalloc+0xa2/0xb0 mm/kasan/common.c:383
kasan_kmalloc include/linux/kasan.h:198 [inline]
__do_kmalloc_node mm/slab_common.c:1007 [inline]
__kmalloc_node_track_caller+0x5a/0x90 mm/slab_common.c:1027
kmalloc_reserve+0xef/0x260 net/core/skbuff.c:582
__alloc_skb+0x12b/0x330 net/core/skbuff.c:651
alloc_skb include/linux/skbuff.h:1298 [inline]
alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
packet_alloc_skb net/packet/af_packet.c:2930 [inline]
packet_snd net/packet/af_packet.c:3024 [inline]
packet_sendmsg+0x1e2a/0x5200 net/packet/af_packet.c:3113
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0xd5/0x180 net/socket.c:745
__sys_sendto+0x255/0x340 net/socket.c:2190
__do_sys_sendto net/socket.c:2202 [inline]
__se_sys_sendto net/socket.c:2198 [inline]
__x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
Freed by task 22323:
kasan_save_stack+0x33/0x50 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:522
____kasan_slab_free mm/kasan/common.c:236 [inline]
____kasan_slab_free+0x15b/0x1b0 mm/kasan/common.c:200
kasan_slab_free include/linux/kasan.h:164 [inline]
slab_free_hook mm/slub.c:1800 [inline]
slab_free_freelist_hook+0x114/0x1e0 mm/slub.c:1826
slab_free mm/slub.c:3809 [inline]
__kmem_cache_free+0xc0/0x180 mm/slub.c:3822
skb_kfree_head net/core/skbuff.c:950 [inline]
skb_free_head+0x110/0x1b0 net/core/skbuff.c:962
pskb_expand_head+0x3c5/0x1170 net/core/skbuff.c:2130
__pskb_pull_tail+0xe1/0x1830 net/core/skbuff.c:2655
pskb_may_pull_reason include/linux/skbuff.h:2685 [inline]
pskb_may_pull include/linux/skbuff.h:2693 [inline]
ife_decode+0x394/0x4f0 net/ife/ife.c:82
tcf_ife_decode net/sched/act_ife.c:727 [inline]
tcf_ife_act+0x43b/0x1cd0 net/sched/act_ife.c:879
tc_act include/net/tc_wrapper.h:221 [inline]
tcf_action_exec+0x1ac/0x620 net/sched/act_api.c:1079
tcf_exts_exec include/net/pkt_cls.h:344 [inline]
mall_classify+0x201/0x310 net/sched/cls_matchall.c:42
tc_classify include/net/tc_wrapper.h:227 [inline]
__tcf_classify net/sched/cls_api.c:1703 [inline]
tcf_classify+0x82f/0x1260 net/sched/cls_api.c:1800
hfsc_classify net/sched/sch_hfsc.c:1147 [inline]
hfsc_enqueue+0x315/0x1060 net/sched/sch_hfsc.c:1546
dev_qdisc_enqueue+0x3f/0x230 net/core/dev.c:3739
__dev_xmit_skb net/core/dev.c:3828 [inline]
__dev_queue_xmit+0x1de1/0x3d30 net/core/dev.c:4311
dev_queue_xmit include/linux/netdevice.h:3165 [inline]
packet_xmit+0x237/0x350 net/packet/af_packet.c:276
packet_snd net/packet/af_packet.c:3081 [inline]
packet_sendmsg+0x24aa/0x5200 net/packet/af_packet.c:3113
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0xd5/0x180 net/socket.c:745
__sys_sendto+0x255/0x340 net/socket.c:2190
__do_sys_sendto net/socket.c:2202 [inline]
__se_sys_sendto net/socket.c:2198 [inline]
__x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
The buggy address belongs to the object at ffff88802d730000
which belongs to the cache kmalloc-8k of size 8192
The buggy address is located 164 bytes inside of
freed 8192-byte region [ffff88802d730000, ffff88802d732000)
The buggy address belongs to the physical page:
page:ffffea0000b5cc00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2d730
head:ffffea0000b5cc00 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
flags: 0xfff00000000840(slab|head|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000000840 ffff888013042280 dead000000000122 0000000000000000
raw: 0000000000000000 0000000080020002 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 22323, tgid 22320 (syz-executor.5), ts 950317230369, free_ts 950233467461
set_page_owner include/linux/page_owner.h:31 [inline]
post_alloc_hook+0x2d0/0x350 mm/page_alloc.c:1544
prep_new_page mm/page_alloc.c:1551 [inline]
get_page_from_freelist+0xa28/0x3730 mm/page_alloc.c:3319
__alloc_pages+0x22e/0x2420 mm/page_alloc.c:4575
alloc_pages_mpol+0x258/0x5f0 mm/mempolicy.c:2133
alloc_slab_page mm/slub.c:1870 [inline]
allocate_slab mm/slub.c:2017 [inline]
new_slab+0x283/0x3c0 mm/slub.c:2070
___slab_alloc+0x979/0x1500 mm/slub.c:3223
__slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3322
__slab_alloc_node mm/slub.c:3375 [inline]
slab_alloc_node mm/slub.c:3468 [inline]
__kmem_cache_alloc_node+0x131/0x310 mm/slub.c:3517
__do_kmalloc_node mm/slab_common.c:1006 [inline]
__kmalloc_node_track_caller+0x4a/0x90 mm/slab_common.c:1027
kmalloc_reserve+0xef/0x260 net/core/skbuff.c:582
__alloc_skb+0x12b/0x330 net/core/skbuff.c:651
alloc_skb include/linux/skbuff.h:1298 [inline]
alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331
sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780
packet_alloc_skb net/packet/af_packet.c:2930 [inline]
packet_snd net/packet/af_packet.c:3024 [inline]
packet_sendmsg+0x1e2a/0x5200 net/packet/af_packet.c:3113
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0xd5/0x180 net/socket.c:745
__sys_sendto+0x255/0x340 net/socket.c:2190
page last free stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1144 [inline]
free_unref_page_prepare+0x53c/0xb80 mm/page_alloc.c:2354
free_unref_page+0x33/0x3b0 mm/page_alloc.c:2494
__unfreeze_partials+0x226/0x240 mm/slub.c:2655
qlink_free mm/kasan/quarantine.c:168 [inline]
qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187
kasan_quarantine_reduce+0x18e/0x1d0 mm/kasan/quarantine.c:294
__kasan_slab_alloc+0x65/0x90 mm/kasan/common.c:305
kasan_slab_alloc include/linux/kasan.h:188 [inline]
slab_post_alloc_hook mm/slab.h:763 [inline]
slab_alloc_node mm/slub.c:3478 [inline]
slab_alloc mm/slub.c:3486 [inline]
__kmem_cache_alloc_lru mm/slub.c:3493 [inline]
kmem_cache_alloc_lru+0x219/0x6f0 mm/slub.c:3509
alloc_inode_sb include/linux/fs.h:2937 [inline]
ext4_alloc_inode+0x28/0x650 fs/ext4/super.c:1408
alloc_inode+0x5d/0x220 fs/inode.c:261
new_inode_pseudo fs/inode.c:1006 [inline]
new_inode+0x22/0x260 fs/inode.c:1032
__ext4_new_inode+0x333/0x5200 fs/ext4/ialloc.c:958
ext4_symlink+0x5d7/0xa20 fs/ext4/namei.c:3398
vfs_symlink fs/namei.c:4464 [inline]
vfs_symlink+0x3e5/0x620 fs/namei.c:4448
do_symlinkat+0x25f/0x310 fs/namei.c:4490
__do_sys_symlinkat fs/namei.c:4506 [inline]
__se_sys_symlinkat fs/namei.c:4503 [inline]
__x64_sys_symlinkat+0x97/0xc0 fs/namei.c:4503
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:82
Fixes: d57493d6d1be ("net: sched: ife: check on metadata length")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Alexander Aring <aahringo@redhat.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The following NULL pointer dereference issue occurred:
BUG: kernel NULL pointer dereference, address: 0000000000000000
<...>
RIP: 0010:ccid_hc_tx_send_packet net/dccp/ccid.h:166 [inline]
RIP: 0010:dccp_write_xmit+0x49/0x140 net/dccp/output.c:356
<...>
Call Trace:
<TASK>
dccp_sendmsg+0x642/0x7e0 net/dccp/proto.c:801
inet_sendmsg+0x63/0x90 net/ipv4/af_inet.c:846
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0x83/0xe0 net/socket.c:745
____sys_sendmsg+0x443/0x510 net/socket.c:2558
___sys_sendmsg+0xe5/0x150 net/socket.c:2612
__sys_sendmsg+0xa6/0x120 net/socket.c:2641
__do_sys_sendmsg net/socket.c:2650 [inline]
__se_sys_sendmsg net/socket.c:2648 [inline]
__x64_sys_sendmsg+0x45/0x50 net/socket.c:2648
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x43/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
sk_wait_event() returns an error (-EPIPE) if disconnect() is called on the
socket waiting for the event. However, sk_stream_wait_connect() returns
success, i.e. zero, even if sk_wait_event() returns -EPIPE, so a function
that waits for a connection with sk_stream_wait_connect() may misbehave.
In the case of the above DCCP issue, dccp_sendmsg() is waiting for the
connection. If disconnect() is called in concurrently, the above issue
occurs.
This patch fixes the issue by returning error from sk_stream_wait_connect()
if sk_wait_event() fails.
Fixes: 419ce133ab92 ("tcp: allow again tcp_disconnect() when threads are waiting")
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reported-by: syzbot+c71bc336c5061153b502@syzkaller.appspotmail.com
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Send credit update message when SO_RCVLOWAT is updated and it is bigger
than number of bytes in rx queue. It is needed, because 'poll()' will
wait until number of bytes in rx queue will be not smaller than
O_RCVLOWAT, so kick sender to send more data. Otherwise mutual hungup
for tx/rx is possible: sender waits for free space and receiver is
waiting data in 'poll()'.
Rename 'set_rcvlowat' callback to 'notify_set_rcvlowat' and set
'sk->sk_rcvlowat' only in one place (i.e. 'vsock_set_rcvlowat'), so the
transport doesn't need to do it.
Fixes: b89d882dc9fc ("vsock/virtio: reduce credit update messages")
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add one more condition for sending credit update during dequeue from
stream socket: when number of bytes in the rx queue is smaller than
SO_RCVLOWAT value of the socket. This is actual for non-default value
of SO_RCVLOWAT (e.g. not 1) - idea is to "kick" peer to continue data
transmission, because we need at least SO_RCVLOWAT bytes in our rx
queue to wake up user for reading data (in corner case it is also
possible to stuck both tx and rx sides, this is why 'Fixes' is used).
Fixes: b89d882dc9fc ("vsock/virtio: reduce credit update messages")
Signed-off-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order to support IP_PKTINFO on those packets, we need to call
ipv4_pktinfo_prepare.
When sending mrouted/pimd daemons a cache report IGMP msg, it is
unnecessary to set dst on the newly created skb.
It used to be necessary on older versions until
commit d826eb14ecef ("ipv4: PKTINFO doesnt need dst reference") which
changed the way IP_PKTINFO struct is been retrieved.
Changes from v1:
1. Undo changes in ipv4_pktinfo_prepare function. use it directly
and copy the control block.
Fixes: d826eb14ecef ("ipv4: PKTINFO doesnt need dst reference")
Signed-off-by: Leone Fernando <leone4fernando@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
While disassociating from a PAN ourselves, let's set the maximum number
of associations temporarily to zero to be sure no new device tries to
associate with us.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/linux-wpan/20231128111655.507479-6-miquel.raynal@bootlin.com
|
|
Once associated with any device, we are part of a PAN (with a specific
PAN ID), and we are expected to be present on a particular
channel. Let's avoid confusing other devices by preventing any PAN
ID/channel change once associated.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/linux-wpan/20231128111655.507479-5-miquel.raynal@bootlin.com
|
|
It is not very clear in the specification whether simple coordinators
are allowed or not to answer to association requests themselves. As
there is no synchronization mechanism, it is probably best to rely on
the relay feature of these coordinators and avoid processing them in
this case.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/linux-wpan/20231128111655.507479-4-miquel.raynal@bootlin.com
|
|
ACKs come with the source and destination address empty, this has been
clarified already. But there is something else: if the destination
address is empty but the source address is valid, it may be a way to
reach the PAN coordinator. Either the device receiving this frame is the
PAN coordinator itself and should process what it just received
(PACKET_HOST) or it is not and may, if supported, relay the packet as it
is targeted to another device in the network.
Right now we do not support relaying so the packet should be dropped in
the first place, but the stamping looks more accurate this way.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/linux-wpan/20231128111655.507479-3-miquel.raynal@bootlin.com
|
|
Sending a beacon is a way to advertise a PAN, but also ourselves as
coordinator in the PAN. There is only one PAN coordinator in a PAN, this
is the device without parent (not associated itself to an "upper"
coordinator). Instead of blindly saying that we are the PAN coordinator,
let's actually use our internal information to fill this field.
Signed-off-by: Miquel Raynal <miquel.raynal@bootlin.com>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Acked-by: Alexander Aring <aahringo@redhat.com>
Link: https://lore.kernel.org/linux-wpan/20231128111655.507479-2-miquel.raynal@bootlin.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless
Johannes Berg says:
====================
* add (and fix) certificate for regdb handover to Chen-Yu Tsai
* fix rfkill GPIO handling
* a few driver (iwlwifi, mt76) crash fixes
* logic fixes in the stack
* tag 'wireless-2023-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless:
wifi: cfg80211: fix certs build to not depend on file order
wifi: mt76: fix crash with WED rx support enabled
wifi: iwlwifi: pcie: avoid a NULL pointer dereference
wifi: mac80211: mesh_plink: fix matches_local logic
wifi: mac80211: mesh: check element parsing succeeded
wifi: mac80211: check defragmentation succeeded
wifi: mac80211: don't re-add debugfs during reconfig
net: rfkill: gpio: set GPIO direction
wifi: mac80211: check if the existing link config remains unchanged
wifi: cfg80211: Add my certificate
wifi: iwlwifi: pcie: add another missing bh-disable for rxq->lock
wifi: ieee80211: don't require protected vendor action frames
====================
Link: https://lore.kernel.org/r/20231214111515.60626-3-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/intel/iavf/iavf_ethtool.c
3a0b5a2929fd ("iavf: Introduce new state machines for flow director")
95260816b489 ("iavf: use iavf_schedule_aq_request() helper")
https://lore.kernel.org/all/84e12519-04dc-bd80-bc34-8cf50d7898ce@intel.com/
drivers/net/ethernet/broadcom/bnxt/bnxt.c
c13e268c0768 ("bnxt_en: Fix HWTSTAMP_FILTER_ALL packet timestamp logic")
c2f8063309da ("bnxt_en: Refactor RX VLAN acceleration logic.")
a7445d69809f ("bnxt_en: Add support for new RX and TPA_START completion types for P7")
1c7fd6ee2fe4 ("bnxt_en: Rename some macros for the P5 chips")
https://lore.kernel.org/all/20231211110022.27926ad9@canb.auug.org.au/
drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c
bd6781c18cb5 ("bnxt_en: Fix wrong return value check in bnxt_close_nic()")
84793a499578 ("bnxt_en: Skip nic close/open when configuring tstamp filters")
https://lore.kernel.org/all/20231214113041.3a0c003c@canb.auug.org.au/
drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c
3d7a3f2612d7 ("net/mlx5: Nack sync reset request when HotPlug is enabled")
cecf44ea1a1f ("net/mlx5: Allow sync reset flow when BF MGT interface device is present")
https://lore.kernel.org/all/20231211110328.76c925af@canb.auug.org.au/
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This commit adds an unstable kfunc helper to access internal xfrm_state
associated with an SA. This is intended to be used for the upcoming
IPsec pcpu work to assign special pcpu SAs to a particular CPU. In other
words: for custom software RSS.
That being said, the function that this kfunc wraps is fairly generic
and used for a lot of xfrm tasks. I'm sure people will find uses
elsewhere over time.
This commit also adds a corresponding bpf_xdp_xfrm_state_release() kfunc
to release the refcnt acquired by bpf_xdp_get_xfrm_state(). The verifier
will require that all acquired xfrm_state's are released.
Co-developed-by: Antony Antony <antony.antony@secunet.com>
Signed-off-by: Antony Antony <antony.antony@secunet.com>
Acked-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Link: https://lore.kernel.org/r/a29699c42f5fad456b875c98dd11c6afc3ffb707.1702593901.git.dxu@dxuuu.xyz
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
Correct run-on sentences by changing "," to ";".
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: linux-wireless@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Link: https://msgid.link/20231213054809.23475-1-rdunlap@infradead.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Correct a run-on sentence by changing "," to ";".
Add a subject in one sentence.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: Kalle Valo <kvalo@kernel.org>
Cc: linux-wireless@vger.kernel.org
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Link: https://msgid.link/20231213054800.22561-1-rdunlap@infradead.org
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
The build can become unreproducible if the list of files
found by $(wildcard ...) differs. Sort the list to avoid
this.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Because atalk_ioctl() accesses sk->sk_receive_queue
without holding a sk->sk_receive_queue.lock, it can
cause a race with atalk_recvmsg().
A use-after-free for skb occurs with the following flow.
```
atalk_ioctl() -> skb_peek()
atalk_recvmsg() -> skb_recv_datagram() -> skb_free_datagram()
```
Add sk->sk_receive_queue.lock to atalk_ioctl() to fix this issue.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Hyunwoo Kim <v4bel@theori.io>
Link: https://lore.kernel.org/r/20231213041056.GA519680@v4bel-B760M-AORUS-ELITE-AX
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The file for the new certificate (Chen-Yu Tsai's) didn't
end with a comma, so depending on the file order in the
build rule, we'd end up with invalid C when concatenating
the (now two) certificates. Fix that.
Cc: stable@vger.kernel.org
Reported-by: Biju Das <biju.das.jz@bp.renesas.com>
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Fixes: fb768d3b13ff ("wifi: cfg80211: Add my certificate")
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
|
|
Symmetric RSS hash functions are beneficial in applications that monitor
both Tx and Rx packets of the same flow (IDS, software firewalls, ..etc).
Getting all traffic of the same flow on the same RX queue results in
higher CPU cache efficiency.
A NIC that supports "symmetric-xor" can achieve this RSS hash symmetry
by XORing the source and destination fields and pass the values to the
RSS hash algorithm.
The user may request RSS hash symmetry for a specific algorithm, via:
# ethtool -X eth0 hfunc <hash_alg> symmetric-xor
or turn symmetry off (asymmetric) by:
# ethtool -X eth0 hfunc <hash_alg>
The specific fields for each flow type should then be specified as usual
via:
# ethtool -N|-U eth0 rx-flow-hash <flow_type> s|d|f|n
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://lore.kernel.org/r/20231213003321.605376-4-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the RSS context parameters to struct ethtool_rxfh_param and use the
get/set_rxfh to handle the RSS contexts as well.
This is part 2/2 of the fix suggested in [1]:
- Add a rss_context member to the argument struct and a capability
like cap_link_lanes_supported to indicate whether driver supports
rss contexts, then you can remove *et_rxfh_context functions,
and instead call *et_rxfh() with a non-zero rss_context.
Link: https://lore.kernel.org/netdev/20231121152906.2dd5f487@kernel.org/ [1]
CC: Jesse Brandeburg <jesse.brandeburg@intel.com>
CC: Tony Nguyen <anthony.l.nguyen@intel.com>
CC: Marcin Wojtas <mw@semihalf.com>
CC: Russell King <linux@armlinux.org.uk>
CC: Sunil Goutham <sgoutham@marvell.com>
CC: Geetha sowjanya <gakula@marvell.com>
CC: Subbaraya Sundeep <sbhatta@marvell.com>
CC: hariprasad <hkelam@marvell.com>
CC: Saeed Mahameed <saeedm@nvidia.com>
CC: Leon Romanovsky <leon@kernel.org>
CC: Edward Cree <ecree.xilinx@gmail.com>
CC: Martin Habets <habetsm.xilinx@gmail.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://lore.kernel.org/r/20231213003321.605376-3-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The get/set_rxfh ethtool ops currently takes the rxfh (RSS) parameters
as direct function arguments. This will force us to change the API (and
all drivers' functions) every time some new parameters are added.
This is part 1/2 of the fix, as suggested in [1]:
- First simplify the code by always providing a pointer to all params
(indir, key and func); the fact that some of them may be NULL seems
like a weird historic thing or a premature optimization.
It will simplify the drivers if all pointers are always present.
- Then make the functions take a dev pointer, and a pointer to a
single struct wrapping all arguments. The set_* should also take
an extack.
Link: https://lore.kernel.org/netdev/20231121152906.2dd5f487@kernel.org/ [1]
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Suggested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://lore.kernel.org/r/20231213003321.605376-2-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Releasing the DMA mapping will be useful for other types
of pages, so factor it out. Make sure compiler inlines it,
to avoid any regressions.
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
To support multiple users referencing the same fragment,
'pp_frag_count' is renamed to 'pp_ref_count', transitioning pp pages
from fragment management to reference count management after draining
based on the suggestion from [1].
The idea is that the concept of fragmenting exists before the page is
drained, and all related functions retain their current names.
However, once the page is drained, its management shifts to being
governed by 'pp_ref_count'. Therefore, all functions associated with
that lifecycle stage of a pp page are renamed.
[1]
http://lore.kernel.org/netdev/f71d9448-70c8-8793-dc9a-0eb48a570300@huawei.com
Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>
Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://lore.kernel.org/r/20231212044614.42733-2-liangchen.linux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
For some reason sctp_poll() generates EPOLLERR if sk->sk_error_queue
is not empty but recvmsg() can not drain the error queue yet.
This is needed to better support timestamping.
I had to export inet_recv_error(), since sctp
can be compiled as a module.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Willem de Bruijn <willemb@google.com>
Acked-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20231212145550.3872051-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Once again syzbot is able to crash the kernel in skb_segment() [1]
GSO_BY_FRAGS is a forbidden value, but unfortunately the following
computation in skb_segment() can reach it quite easily :
mss = mss * partial_segs;
65535 = 3 * 5 * 17 * 257, so many initial values of mss can lead to
a bad final result.
Make sure to limit segmentation so that the new mss value is smaller
than GSO_BY_FRAGS.
[1]
general protection fault, probably for non-canonical address 0xdffffc000000000e: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
CPU: 1 PID: 5079 Comm: syz-executor993 Not tainted 6.7.0-rc4-syzkaller-00141-g1ae4cd3cbdd0 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/10/2023
RIP: 0010:skb_segment+0x181d/0x3f30 net/core/skbuff.c:4551
Code: 83 e3 02 e9 fb ed ff ff e8 90 68 1c f9 48 8b 84 24 f8 00 00 00 48 8d 78 70 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 8a 21 00 00 48 8b 84 24 f8 00
RSP: 0018:ffffc900043473d0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000010046 RCX: ffffffff886b1597
RDX: 000000000000000e RSI: ffffffff886b2520 RDI: 0000000000000070
RBP: ffffc90004347578 R08: 0000000000000005 R09: 000000000000ffff
R10: 000000000000ffff R11: 0000000000000002 R12: ffff888063202ac0
R13: 0000000000010000 R14: 000000000000ffff R15: 0000000000000046
FS: 0000555556e7e380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020010000 CR3: 0000000027ee2000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
udp6_ufo_fragment+0xa0e/0xd00 net/ipv6/udp_offload.c:109
ipv6_gso_segment+0x534/0x17e0 net/ipv6/ip6_offload.c:120
skb_mac_gso_segment+0x290/0x610 net/core/gso.c:53
__skb_gso_segment+0x339/0x710 net/core/gso.c:124
skb_gso_segment include/net/gso.h:83 [inline]
validate_xmit_skb+0x36c/0xeb0 net/core/dev.c:3626
__dev_queue_xmit+0x6f3/0x3d60 net/core/dev.c:4338
dev_queue_xmit include/linux/netdevice.h:3134 [inline]
packet_xmit+0x257/0x380 net/packet/af_packet.c:276
packet_snd net/packet/af_packet.c:3087 [inline]
packet_sendmsg+0x24c6/0x5220 net/packet/af_packet.c:3119
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0xd5/0x180 net/socket.c:745
__sys_sendto+0x255/0x340 net/socket.c:2190
__do_sys_sendto net/socket.c:2202 [inline]
__se_sys_sendto net/socket.c:2198 [inline]
__x64_sys_sendto+0xe0/0x1b0 net/socket.c:2198
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f8692032aa9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 d1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fff8d685418 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f8692032aa9
RDX: 0000000000010048 RSI: 00000000200000c0 RDI: 0000000000000003
RBP: 00000000000f4240 R08: 0000000020000540 R09: 0000000000000014
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff8d685480
R13: 0000000000000001 R14: 00007fff8d685480 R15: 0000000000000003
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:skb_segment+0x181d/0x3f30 net/core/skbuff.c:4551
Code: 83 e3 02 e9 fb ed ff ff e8 90 68 1c f9 48 8b 84 24 f8 00 00 00 48 8d 78 70 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 8a 21 00 00 48 8b 84 24 f8 00
RSP: 0018:ffffc900043473d0 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: 0000000000010046 RCX: ffffffff886b1597
RDX: 000000000000000e RSI: ffffffff886b2520 RDI: 0000000000000070
RBP: ffffc90004347578 R08: 0000000000000005 R09: 000000000000ffff
R10: 000000000000ffff R11: 0000000000000002 R12: ffff888063202ac0
R13: 0000000000010000 R14: 000000000000ffff R15: 0000000000000046
FS: 0000555556e7e380(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020010000 CR3: 0000000027ee2000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Fixes: 3953c46c3ac7 ("sk_buff: allow segmenting based on frag sizes")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20231212164621.4131800-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Follow up commit 9690ae604290 ("ethtool: add header/data split
indication") and add the set part of Ethtool's header split, i.e.
ability to enable/disable header split via the Ethtool Netlink
interface. This might be helpful to optimize the setup for particular
workloads, for example, to avoid XDP frags, and so on.
A driver should advertise ``ETHTOOL_RING_USE_TCP_DATA_SPLIT`` in its
ops->supported_ring_params to allow doing that. "Unknown" passed from
the userspace when the header split is supported means the driver is
free to choose the preferred state.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20231212142752.935000-2-aleksander.lobakin@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
We need to do signed arithmetic if we expect condition
`if (bytes < 0)` to be possible
Found by Linux Verification Center (linuxtesting.org) with SVACE
Fixes: 06a8fc78367d ("VSOCK: Introduce virtio_vsock_common.ko")
Signed-off-by: Nikolay Kuratov <kniv@yandex-team.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/r/20231211162317.4116625-1-kniv@yandex-team.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|