summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-11-09net: wwan: iosm: fix kernel test robot reported errorsM Chetan Kumar
Include linux/vmalloc.h in iosm_ipc_coredump.c & iosm_ipc_devlink.c to resolve kernel test robot errors. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-09net: wwan: iosm: fix invalid mux header typeM Chetan Kumar
Data stall seen during peak DL throughput test & packets are dropped by mux layer due to invalid header type in datagram. During initlization Mux aggregration protocol is set to default UL/DL size and TD count of Mux lite protocol. This configuration mismatch between device and driver is resulting in data stall/packet drops. Override the UL/DL size and TD count for Mux aggregation protocol. Fixes: 1f52d7b62285 ("net: wwan: iosm: Enable M.2 7360 WWAN card support") Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-09net: wwan: iosm: fix driver not working with INTEL_IOMMU disabledM Chetan Kumar
With INTEL_IOMMU disable config or by forcing intel_iommu=off from grub some of the features of IOSM driver like browsing, flashing & coredump collection is not working. When driver calls DMA API - dma_map_single() for tx transfers. It is resulting in dma mapping error. Set the device DMA addressing capabilities using dma_set_mask() and remove the INTEL_IOMMU dependency in kconfig so that driver follows the platform config either INTEL_IOMMU enable or disable. Fixes: f7af616c632e ("net: iosm: infrastructure") Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-09net: wwan: iosm: fix memory leak in ipc_pcie_read_bios_cfgM Chetan Kumar
ipc_pcie_read_bios_cfg() is using the acpi_evaluate_dsm() to obtain the wwan power state configuration from BIOS but is not freeing the acpi_object. The acpi_evaluate_dsm() returned acpi_object to be freed. Free the acpi_object after use. Fixes: 7e98d785ae61 ("net: iosm: entry point") Signed-off-by: M Chetan Kumar <m.chetan.kumar@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-09net/core: Allow live renaming when an interface is upAndy Ren
Allow a network interface to be renamed when the interface is up. As described in the netconsole documentation [1], when netconsole is used as a built-in, it will bring up the specified interface as soon as possible. As a result, user space will not be able to rename the interface since the kernel disallows renaming of interfaces that are administratively up unless the 'IFF_LIVE_RENAME_OK' private flag was set by the kernel. The original solution [2] to this problem was to add a new parameter to the netconsole configuration parameters that allows renaming of the interface used by netconsole while it is administratively up. However, during the discussion that followed, it became apparent that we have no reason to keep the current restriction and instead we should allow user space to rename interfaces regardless of their administrative state: 1. The restriction was put in place over 20 years ago when renaming was only possible via IOCTL and before rtnetlink started notifying user space about such changes like it does today. 2. The 'IFF_LIVE_RENAME_OK' flag was added over 3 years ago in version 5.2 and no regressions were reported. 3. In-kernel listeners to 'NETDEV_CHANGENAME' do not seem to care about the administrative state of interface. Therefore, allow user space to rename running interfaces by removing the restriction and the associated 'IFF_LIVE_RENAME_OK' flag. Help in possible triage by emitting a message to the kernel log that an interface was renamed while UP. [1] https://www.kernel.org/doc/Documentation/networking/netconsole.rst [2] https://lore.kernel.org/netdev/20221102002420.2613004-1-andy.ren@getcruise.com/ Signed-off-by: Andy Ren <andy.ren@getcruise.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-09Merge branch 'dsa-microchip-checking'David S. Miller
Rakesh Sankaranarayanan says: ==================== net: dsa: microchip: ksz_pwrite status check for lan937x and irq and error checking updates for ksz series This patch series include following changes, - Add KSZ9563 inside ksz_switch_chips. As per current structure, KSZ9893 is reused inside ksz_switch_chips structure, but since there is a mismatch in number of irq's, new member added for KSZ9563 and sku detected based on Global Chip ID 4 Register. Compatible string from device tree mapped to KSZ9563 for spi and i2c mode probes. - Assign device interrupt during i2c probe operation. - Add error checking for ksz_pwrite inside lan937x_change_mtu. After v6.0, ksz_pwrite updated to have return type int instead of void, and lan937x_change_mtu still uses ksz_pwrite without status verification. - Add port_nirq as 3 for KSZ8563 switch family. - Use dev_err_probe() instead of dev_err() to have more standardized error formatting and logging. v1 -> v2: - Removed regmap validation patch from the series, planning to take up in future after checking for any better approach and studying the actual need for this change. - Resolved error reported in ksz8863_smi.c file. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-09net: dsa: microchip: add dev_err_probe in probe functionsRakesh Sankaranarayanan
Probe functions uses normal dev_err() to check error conditions and print messages. Replace dev_err() with dev_err_probe() to have more standardized format and error logging. Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-09net: dsa: microchip: ksz8563: Add number of port irqRakesh Sankaranarayanan
KSZ8563 have three port interrupts: PTP, PHY and ACL. Add port_nirq as 3 for KSZ8563 inside ksz_chip_data. Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-09net: dsa: microchip: add error checking for ksz_pwriteRakesh Sankaranarayanan
Add status validation for port register write inside lan937x_change_mtu. ksz_pwrite and ksz_pread api's are updated with return type int (Reference patch mentioned below). Update lan937x_change_mtu with status validation for ksz_pwrite16(). Link: https://patchwork.kernel.org/project/netdevbpf/patch/20220826105634.3855578-6-o.rempel@pengutronix.de/ Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-09net: dsa: microchip: add irq in i2c probeRakesh Sankaranarayanan
add device irq in i2c probe function. Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-09net: dsa: microchip: add ksz9563 in ksz_switch_ops and select based on ↵Rakesh Sankaranarayanan
compatible string Add KSZ9563 inside ksz_switch_chips structure with port_nirq as 3. KSZ9563 use KSZ9893 switch parameters but port_nirq count is 3 for KSZ9563 whereas 2 for KSZ9893. Add KSZ9563 inside ksz_switch_chips as a separate member and from device tree map compatible string into KSZ9563 inside ksz_spi.c and ksz9477_i2c.c. Global Chip ID 1 and 2 registers read value 9893, select sku based on Global Chip ID 4 Register which read 0x1c for KSZ9563. Signed-off-by: Rakesh Sankaranarayanan <rakesh.sankaranarayanan@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-09selftests: netfilter: Fix and review rpath.shPhil Sutter
Address a few problems with the initial test script version: * On systems with ip6tables but no ip6tables-legacy, testing for ip6tables was disabled by accident. * Firewall setup phase did not respect possibly unavailable tools. * Consistently call nft via '$nft'. Fixes: 6e31ce831c63b ("selftests: netfilter: Test reverse path filtering") Signed-off-by: Phil Sutter <phil@nwl.cc> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-11-08net: ethernet: renesas: rswitch: Fix endless loop in error pathsYoshihiro Shimoda
Coverity reported that the error path in rswitch_gwca_queue_alloc_skb() has an issue to cause endless loop. So, fix the issue by changing variables' types from u32 to int. After changed the types, rswitch_tx_free() should use rswitch_get_num_cur_queues() to calculate number of current queues. Reported-by: coverity-bot <keescook+coverity-bot@chromium.org> Addresses-Coverity-ID: 1527147 ("Control flow issues") Fixes: 3590918b5d07 ("net: ethernet: renesas: Add support for "Ethernet Switch"") Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Link: https://lore.kernel.org/r/20221107081021.2955122-1-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-08ibmveth: Reduce default tx queues to 8Nick Child
Previously, the default number of transmit queues was 16. Due to resource concerns, set to 8 queues instead. Still allow the user to set more queues (max 16) if they like. Since the driver is virtualized away from the physical NIC, the purpose of multiple queues is purely to allow for parallel calls to the hypervisor. Therefore, there is no noticeable effect on performance by reducing queue count to 8. Fixes: d926793c1de9 ("ibmveth: Implement multi queue on xmit") Reported-by: Dave Taht <dave.taht@gmail.com> Signed-off-by: Nick Child <nnac123@linux.ibm.com> Link: https://lore.kernel.org/r/20221107203215.58206-1-nnac123@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-08net: nixge: disable napi when enable interrupts failed in nixge_open()Zhengchao Shao
When failed to enable interrupts in nixge_open() for opening device, napi isn't disabled. When open nixge device next time, it will reports a invalid opcode issue. Fix it. Only be compiled, not be tested. Fixes: 492caffa8a1a ("net: ethernet: nixge: Add support for National Instruments XGE netdev") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20221107101443.120205-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-08net: tun: call napi_schedule_prep() to ensure we own a napiEric Dumazet
A recent patch exposed another issue in napi_get_frags() caught by syzbot [1] Before feeding packets to GRO, and calling napi_complete() we must first grab NAPI_STATE_SCHED. [1] WARNING: CPU: 0 PID: 3612 at net/core/dev.c:6076 napi_complete_done+0x45b/0x880 net/core/dev.c:6076 Modules linked in: CPU: 0 PID: 3612 Comm: syz-executor408 Not tainted 6.1.0-rc3-syzkaller-00175-g1118b2049d77 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 RIP: 0010:napi_complete_done+0x45b/0x880 net/core/dev.c:6076 Code: c1 ea 03 0f b6 14 02 4c 89 f0 83 e0 07 83 c0 03 38 d0 7c 08 84 d2 0f 85 24 04 00 00 41 89 5d 1c e9 73 fc ff ff e8 b5 53 22 fa <0f> 0b e9 82 fe ff ff e8 a9 53 22 fa 48 8b 5c 24 08 31 ff 48 89 de RSP: 0018:ffffc90003c4f920 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000030 RCX: 0000000000000000 RDX: ffff8880251c0000 RSI: ffffffff875a58db RDI: 0000000000000007 RBP: 0000000000000001 R08: 0000000000000007 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff888072d02628 R13: ffff888072d02618 R14: ffff888072d02634 R15: 0000000000000000 FS: 0000555555f13300(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055c44d3892b8 CR3: 00000000172d2000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> napi_complete include/linux/netdevice.h:510 [inline] tun_get_user+0x206d/0x3a60 drivers/net/tun.c:1980 tun_chr_write_iter+0xdb/0x200 drivers/net/tun.c:2027 call_write_iter include/linux/fs.h:2191 [inline] do_iter_readv_writev+0x20b/0x3b0 fs/read_write.c:735 do_iter_write+0x182/0x700 fs/read_write.c:861 vfs_writev+0x1aa/0x630 fs/read_write.c:934 do_writev+0x133/0x2f0 fs/read_write.c:977 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f37021a3c19 Fixes: 1118b2049d77 ("net: tun: Fix memory leaks of napi_get_frags") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Wang Yufen <wangyufen@huawei.com> Link: https://lore.kernel.org/r/20221107180011.188437-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-08net: marvell: prestera: fix memory leak in prestera_rxtx_switch_init()Zhengchao Shao
When prestera_sdma_switch_init() failed, the memory pointed to by sw->rxtx isn't released. Fix it. Only be compiled, not be tested. Fixes: 501ef3066c89 ("net: marvell: prestera: Add driver for Prestera family ASIC devices") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Reviewed-by: Vadym Kochan <vadym.kochan@plvision.eu> Link: https://lore.kernel.org/r/20221108025607.338450-1-shaozhengchao@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-08Merge tag 'linux-can-fixes-for-6.1-20221107' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== can 2022-11-07 The first patch is by Chen Zhongjin and adds a missing dev_remove_pack() to the AF_CAN protocol. Zhengchao Shao's patch fixes a potential NULL pointer deref in AF_CAN's can_rx_register(). The next patch is by Oliver Hartkopp and targets the CAN ISO-TP protocol, and fixes the state handling for echo TX processing. Oliver Hartkopp's patch for the j1939 protocol adds a missing initialization of the CAN headers inside outgoing skbs. Another patch by Oliver Hartkopp fixes an out of bounds read in the check for invalid CAN frames in the xmit callback of virtual CAN devices. This touches all non virtual device drivers as we decided to rename the function requiring that netdev_priv points to a struct can_priv. (Note: This patch will create a merge conflict with net-next where the pch_can driver has removed.) The last patch is by Geert Uytterhoeven and adds the missing ECC error checks for the channels 2-7 in the rcar_canfd driver. * tag 'linux-can-fixes-for-6.1-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: can: rcar_canfd: Add missing ECC error checks for channels 2-7 can: dev: fix skb drop check can: j1939: j1939_send_one(): fix missing CAN header initialization can: isotp: fix tx state handling for echo tx processing can: af_can: fix NULL pointer dereference in can_rx_register() can: af_can: can_exit(): add missing dev_remove_pack() of canxl_packet ==================== Link: https://lore.kernel.org/r/20221107133217.59861-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-08lib: Fix some kernel-doc commentsYang Li
Make the description of @policy to @p in nla_policy_len() to clear the below warnings: lib/nlattr.c:660: warning: Function parameter or member 'p' not described in 'nla_policy_len' lib/nlattr.c:660: warning: Excess function parameter 'policy' description in 'nla_policy_len' Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=2736 Reported-by: Abaci Robot <abaci@linux.alibaba.com> Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20221107062623.6709-1-yang.lee@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-08netfilter: Cleanup nft_net->module_list from nf_tables_exit_net()Shigeru Yoshida
syzbot reported a warning like below [1]: WARNING: CPU: 3 PID: 9 at net/netfilter/nf_tables_api.c:10096 nf_tables_exit_net+0x71c/0x840 Modules linked in: CPU: 2 PID: 9 Comm: kworker/u8:0 Tainted: G W 6.1.0-rc3-00072-g8e5423e991e8 #47 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014 Workqueue: netns cleanup_net RIP: 0010:nf_tables_exit_net+0x71c/0x840 ... Call Trace: <TASK> ? __nft_release_table+0xfc0/0xfc0 ops_exit_list+0xb5/0x180 cleanup_net+0x506/0xb10 ? unregister_pernet_device+0x80/0x80 process_one_work+0xa38/0x1730 ? pwq_dec_nr_in_flight+0x2b0/0x2b0 ? rwlock_bug.part.0+0x90/0x90 ? _raw_spin_lock_irq+0x46/0x50 worker_thread+0x67e/0x10e0 ? process_one_work+0x1730/0x1730 kthread+0x2e5/0x3a0 ? kthread_complete_and_exit+0x40/0x40 ret_from_fork+0x1f/0x30 </TASK> In nf_tables_exit_net(), there is a case where nft_net->commit_list is empty but nft_net->module_list is not empty. Such a case occurs with the following scenario: 1. nfnetlink_rcv_batch() is called 2. nf_tables_newset() returns -EAGAIN and NFNL_BATCH_FAILURE bit is set to status 3. nf_tables_abort() is called with NFNL_ABORT_AUTOLOAD (nft_net->commit_list is released, but nft_net->module_list is not because of NFNL_ABORT_AUTOLOAD flag) 4. Jump to replay label 5. netlink_skb_clone() fails and returns from the function (this is caused by fault injection in the reproducer of syzbot) This patch fixes this issue by calling __nf_tables_abort() when nft_net->module_list is not empty in nf_tables_exit_net(). Fixes: eb014de4fd41 ("netfilter: nf_tables: autoload modules from the abort path") Link: https://syzkaller.appspot.com/bug?id=802aba2422de4218ad0c01b46c9525cc9d4e4aa3 [1] Reported-by: syzbot+178efee9e2d7f87f5103@syzkaller.appspotmail.com Signed-off-by: Shigeru Yoshida <syoshida@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de>
2022-11-08netfilter: nfnetlink: fix potential dead lock in nfnetlink_rcv_msg()Ziyang Xuan
When type is NFNL_CB_MUTEX and -EAGAIN error occur in nfnetlink_rcv_msg(), it does not execute nfnl_unlock(). That would trigger potential dead lock. Fixes: 50f2db9e368f ("netfilter: nfnetlink: consolidate callback types") Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com> Signed-off-by: Florian Westphal <fw@strlen.de>
2022-11-08Merge tag 'audit-pr-20221107' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit fix from Paul Moore: "A small audit patch to fix an instance of undefined behavior in a shift operator caused when shifting a signed value too far, the same case as the lsm patch merged previously. While the fix is trivial and I can't imagine it causing a problem in a backport, I'm not explicitly marking it for stable on the off chance that there is some system out there which is relying on some wonky unexpected behavior which this patch could break; *if* it does break, IMO it's better that to happen in a minor or -rcX release and not in a stable backport" * tag 'audit-pr-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: fix undefined behavior in bit shift for AUDIT_BIT
2022-11-08Merge tag 'lsm-pr-20221107' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull lsm fix from Paul Moore: "A small capability patch to fix an instance of undefined behavior in a shift operator caused when shifting a signed value too far. While the fix is trivial and I can't imagine it causing a problem in a backport, I'm not explicitly marking it for stable on the off chance that there is some system out there which is relying on some wonky unexpected behavior which this patch could break; *if* it does break, IMO it's better that to happen in a minor or -rcX release and not in a stable backport" * tag 'lsm-pr-20221107' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: capabilities: fix undefined behavior in bit shift for CAP_TO_MASK
2022-11-08rxrpc: Allocate an skcipher each time needed rather than reusingDavid Howells
In the rxkad security class, allocate the skcipher used to do packet encryption and decription rather than allocating one up front and reusing it for each packet. Reusing the skcipher precludes doing crypto in parallel. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Fix congestion managementDavid Howells
rxrpc has a problem in its congestion management in that it saves the congestion window size (cwnd) from one call to another, but if this is 0 at the time is saved, then the next call may not actually manage to ever transmit anything. To this end: (1) Don't save cwnd between calls, but rather reset back down to the initial cwnd and re-enter slow-start if data transmission is idle for more than an RTT. (2) Preserve ssthresh instead, as that is a handy estimate of pipe capacity. Knowing roughly when to stop slow start and enter congestion avoidance can reduce the tendency to overshoot and drop larger amounts of packets when probing. In future, cwind growth also needs to be constrained when the window isn't being filled due to being application limited. Reported-by: Simon Wilkinson <sxw@auristor.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Remove the rxtx ringDavid Howells
The Rx/Tx ring is no longer used, so remove it. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Save last ACK's SACK table rather than marking txbufsDavid Howells
Improve the tracking of which packets need to be transmitted by saving the last ACK packet that we receive that has a populated soft-ACK table rather than marking packets. Then we can step through the soft-ACK table and look at the packets we've transmitted beyond that to determine which packets we might want to retransmit. We also look at the highest serial number that has been acked to try and guess which packets we've transmitted the peer is likely to have seen. If necessary, we send a ping to retrieve that number. One downside that might be a problem is that we can't then compare the previous acked/unacked state so easily in rxrpc_input_soft_acks() - which is a potential problem for the slow-start algorithm. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Remove call->lockDavid Howells
call->lock is no longer necessary, so remove it. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Don't use a ring buffer for call Tx queueDavid Howells
Change the way the Tx queueing works to make the following ends easier to achieve: (1) The filling of packets, the encryption of packets and the transmission of packets can be handled in parallel by separate threads, rather than rxrpc_sendmsg() allocating, filling, encrypting and transmitting each packet before moving onto the next one. (2) Get rid of the fixed-size ring which sets a hard limit on the number of packets that can be retained in the ring. This allows the number of packets to increase without having to allocate a very large ring or having variable-sized rings. [Note: the downside of this is that it's then less efficient to locate a packet for retransmission as we then have to step through a list and examine each buffer in the list.] (3) Allow the filler/encrypter to run ahead of the transmission window. (4) Make it easier to do zero copy UDP from the packet buffers. (5) Make it easier to do zero copy from userspace to the packet buffers - and thence to UDP (only if for unauthenticated connections). To that end, the following changes are made: (1) Use the new rxrpc_txbuf struct instead of sk_buff for keeping packets to be transmitted in. This allows them to be placed on multiple queues simultaneously. An sk_buff isn't really necessary as it's never passed on to lower-level networking code. (2) Keep the transmissable packets in a linked list on the call struct rather than in a ring. As a consequence, the annotation buffer isn't used either; rather a flag is set on the packet to indicate ackedness. (3) Use the RXRPC_CALL_TX_LAST flag to indicate that the last packet to be transmitted has been queued. Add RXRPC_CALL_TX_ALL_ACKED to indicate that all packets up to and including the last got hard acked. (4) Wire headers are now stored in the txbuf rather than being concocted on the stack and they're stored immediately before the data, thereby allowing zerocopy of a single span. (5) Don't bother with instant-resend on transmission failure; rather, leave it for a timer or an ACK packet to trigger. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Get rid of the Rx ringDavid Howells
Get rid of the Rx ring and replace it with a pair of queues instead. One queue gets the packets that are in-sequence and are ready for processing by recvmsg(); the other queue gets the out-of-sequence packets for addition to the first queue as the holes get filled. The annotation ring is removed and replaced with a SACK table. The SACK table has the bits set that correspond exactly to the sequence number of the packet being acked. The SACK ring is copied when an ACK packet is being assembled and rotated so that the first ACK is in byte 0. Flow control handling is altered so that packets that are moved to the in-sequence queue are hard-ACK'd even before they're consumed - and then the Rx window size in the ACK packet (rsize) is shrunk down to compensate (even going to 0 if the window is full). Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Clone received jumbo subpackets and queue separatelyDavid Howells
Split up received jumbo packets into separate skbuffs by cloning the original skbuff for each subpacket and setting the offset and length of the data in that subpacket in the skbuff's private data. The subpackets are then placed on the recvmsg queue separately. The security class then gets to revise the offset and length to remove its metadata. If we fail to clone a packet, we just drop it and let the peer resend it. The original packet gets used for the final subpacket. This should make it easier to handle parallel decryption of the subpackets. It also simplifies the handling of lost or misordered packets in the queuing/buffering loop as the possibility of overlapping jumbo packets no longer needs to be considered. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Split the rxrpc_recvmsg tracepointDavid Howells
Split the rxrpc_recvmsg tracepoint so that the tracepoints that are about data packet processing (and which have extra pieces of information) are separate from the tracepoint that shows the general flow of recvmsg(). Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Clean up ACK handlingDavid Howells
Clean up the rxrpc_propose_ACK() function. If deferred PING ACK proposal is split out, it's only really needed for deferred DELAY ACKs. All other ACKs, bar terminal IDLE ACK are sent immediately. The deferred IDLE ACK submission can be handled by conversion of a DELAY ACK into an IDLE ACK if there's nothing to be SACK'd. Also, because there's a delay between an ACK being generated and being transmitted, it's possible that other ACKs of the same type will be generated during that interval. Apart from the ACK time and the serial number responded to, most of the ACK body, including window and SACK parameters, are not filled out till the point of transmission - so we can avoid generating a new ACK if there's one pending that will cover the SACK data we need to convey. Therefore, don't propose a new DELAY or IDLE ACK for a call if there's one already pending. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Allocate ACK records at proposal and queue for transmissionDavid Howells
Allocate rxrpc_txbuf records for ACKs and put onto a queue for the transmitter thread to dispatch. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Define rxrpc_txbuf struct to carry data to be transmittedDavid Howells
Define a struct, rxrpc_txbuf, to carry data to be transmitted instead of a socket buffer so that it can be placed onto multiple queues at once. This also allows the data buffer to be in the same allocation as the internal data. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Remove call->tx_phaseDavid Howells
Remove call->tx_phase as it's only ever set. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Remove the flags from the rxrpc_skb tracepointDavid Howells
Remove the flags from the rxrpc_skb tracepoint as we're no longer going to be using this for the transmission buffers and so marking which are transmission buffers isn't going to be necessary. Note that this also remove the rxrpc skb flag that indicates if this is a transmission buffer and so the count is not updated for the moment. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Remove unnecessary header inclusionsDavid Howells
Remove a bunch of unnecessary header inclusions. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Call udp_sendmsg() directlyDavid Howells
Call udp_sendmsg() and udpv6_sendmsg() directly rather than calling kernel_sendmsg() as the latter assumes we want a kvec-class iterator. However, zerocopy explicitly doesn't work with such an iterator. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Use the core ICMP/ICMP6 parsersDavid Howells
Make rxrpc_encap_rcv_err() pass the ICMP/ICMP6 skbuff to ip_icmp_error() or ipv6_icmp_error() as appropriate to do the parsing rather than trying to do it in rxrpc. This pushes an error report onto the UDP socket's error queue and calls ->sk_error_report() from which point rxrpc can pick it up. It would be preferable to steal the packet directly from ip*_icmp_error() rather than letting it get queued, but this is probably good enough. Also note that __udp4_lib_err() calls sk_error_report() twice in some cases. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08net: Change the udp encap_err_rcv to allow use of {ip,ipv6}_icmp_error()David Howells
Change the udp encap_err_rcv signature to match ip_icmp_error() and ipv6_icmp_error() so that those can be used from the called function and export them. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2022-11-08rxrpc: Fix ack.bufferSize to be 0 when generating an ackDavid Howells
ack.bufferSize should be set to 0 when generating an ack. Fixes: 8d94aa381dab ("rxrpc: Calls shouldn't hold socket refs") Reported-by: Jeffrey Altman <jaltman@auristor.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Record stats for why the REQUEST-ACK flag is being setDavid Howells
Record stats for why the REQUEST-ACK flag is being set. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Record statistics about ACK typesDavid Howells
Record statistics about the different types of ACKs that have been transmitted and received and the number of ACKs that have been filled out and transmitted or that have been skipped. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Add stats procfile and DATA packet statsDavid Howells
Add a procfile, /proc/net/rxrpc/stats, to display some statistics about what rxrpc has been doing. Writing a blank line to the stats file will clear the increment-only counters. Allocated resource counters don't get cleared. Add some counters to count various things about DATA packets, including the number created, transmitted and retransmitted and the number received, the number of ACK-requests markings and the number of jumbo packets received. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Track highest acked serialDavid Howells
Keep track of the highest DATA serial number that has been acked by the peer for future purposes. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Split call timer-expiration from call timer-set tracepointDavid Howells
Split the tracepoint for call timer-set to separate out the call timer-expiration event Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08rxrpc: Trace setting of the request-ack flagDavid Howells
Add a tracepoint to log why the request-ack flag is set on an outgoing DATA packet, allowing debugging as to why. Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org
2022-11-08net, proc: Provide PROC_FS=n fallback for proc_create_net_single_write()David Howells
Provide a CONFIG_PROC_FS=n fallback for proc_create_net_single_write(). Also provide a fallback for proc_create_net_data_write(). Fixes: 564def71765c ("proc: Add a way to make network proc files writable") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: netdev@vger.kernel.org
2022-11-08drivers: net: xgene: disable napi when register irq failed in xgene_enet_open()Zhengchao Shao
When failed to register irq in xgene_enet_open() for opening device, napi isn't disabled. When open xgene device next time, it will reports a invalid opcode issue. Fix it. Only be compiled, not be tested. Fixes: aeb20b6b3f4e ("drivers: net: xgene: fix: ifconfig up/down crash") Signed-off-by: Zhengchao Shao <shaozhengchao@huawei.com> Link: https://lore.kernel.org/r/20221107043032.357673-1-shaozhengchao@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>