summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-05-10net: lan966x: Add TC support for ES0 VCAPHoratiu Vultur
Enable the TC command to use the lan966x ES0 VCAP. Currently support only one action which is vlan pop, other will be added later. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10net: lan966x: Add ES0 VCAP keyset configuration for lan966xHoratiu Vultur
Add ES0 VCAP port keyset configuration for lan966x and also update debugfs to show the keyset configuration. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10net: lan966x: Add ES0 VCAP modelHoratiu Vultur
Provide ES0 (egress stage 0) VCAP model for lan966x. This provides rewriting functionality in the gress path. Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10netlink: annotate accesses to nlk->cb_runningEric Dumazet
Both netlink_recvmsg() and netlink_native_seq_show() read nlk->cb_running locklessly. Use READ_ONCE() there. Add corresponding WRITE_ONCE() to netlink_dump() and __netlink_dump_start() syzbot reported: BUG: KCSAN: data-race in __netlink_dump_start / netlink_recvmsg write to 0xffff88813ea4db59 of 1 bytes by task 28219 on cpu 0: __netlink_dump_start+0x3af/0x4d0 net/netlink/af_netlink.c:2399 netlink_dump_start include/linux/netlink.h:308 [inline] rtnetlink_rcv_msg+0x70f/0x8c0 net/core/rtnetlink.c:6130 netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2577 rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6192 netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline] netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365 netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1942 sock_sendmsg_nosec net/socket.c:724 [inline] sock_sendmsg net/socket.c:747 [inline] sock_write_iter+0x1aa/0x230 net/socket.c:1138 call_write_iter include/linux/fs.h:1851 [inline] new_sync_write fs/read_write.c:491 [inline] vfs_write+0x463/0x760 fs/read_write.c:584 ksys_write+0xeb/0x1a0 fs/read_write.c:637 __do_sys_write fs/read_write.c:649 [inline] __se_sys_write fs/read_write.c:646 [inline] __x64_sys_write+0x42/0x50 fs/read_write.c:646 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd read to 0xffff88813ea4db59 of 1 bytes by task 28222 on cpu 1: netlink_recvmsg+0x3b4/0x730 net/netlink/af_netlink.c:2022 sock_recvmsg_nosec+0x4c/0x80 net/socket.c:1017 ____sys_recvmsg+0x2db/0x310 net/socket.c:2718 ___sys_recvmsg net/socket.c:2762 [inline] do_recvmmsg+0x2e5/0x710 net/socket.c:2856 __sys_recvmmsg net/socket.c:2935 [inline] __do_sys_recvmmsg net/socket.c:2958 [inline] __se_sys_recvmmsg net/socket.c:2951 [inline] __x64_sys_recvmmsg+0xe2/0x160 net/socket.c:2951 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0x00 -> 0x01 Fixes: 16b304f3404f ("netlink: Eliminate kmalloc in netlink dump operation.") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10Merge branch 'bonding-overflow'David S. Miller
Hangbin Liu says: ==================== bonding: fix send_peer_notif overflow Bonding send_peer_notif was defined as u8. But the value is num_peer_notif multiplied by peer_notif_delay, which is u8 * u32. This would cause the send_peer_notif overflow. Before the fix: TEST: num_grat_arp (active-backup miimon num_grat_arp 10) [ OK ] TEST: num_grat_arp (active-backup miimon num_grat_arp 20) [ OK ] 4 garp packets sent on active slave eth1 TEST: num_grat_arp (active-backup miimon num_grat_arp 30) [FAIL] 24 garp packets sent on active slave eth1 TEST: num_grat_arp (active-backup miimon num_grat_arp 50) [FAIL] After the fix: TEST: num_grat_arp (active-backup miimon num_grat_arp 10) [ OK ] TEST: num_grat_arp (active-backup miimon num_grat_arp 20) [ OK ] TEST: num_grat_arp (active-backup miimon num_grat_arp 30) [ OK ] TEST: num_grat_arp (active-backup miimon num_grat_arp 50) [ OK ] ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10kselftest: bonding: add num_grat_arp testHangbin Liu
TEST: num_grat_arp (active-backup miimon num_grat_arp 10) [ OK ] TEST: num_grat_arp (active-backup miimon num_grat_arp 20) [ OK ] TEST: num_grat_arp (active-backup miimon num_grat_arp 30) [ OK ] TEST: num_grat_arp (active-backup miimon num_grat_arp 50) [ OK ] Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10selftests: forwarding: lib: add netns support for tc rule handle stats getHangbin Liu
When run the test in netns, it's not easy to get the tc stats via tc_rule_handle_stats_get(). With the new netns parameter, we can get stats from specific netns like num=$(tc_rule_handle_stats_get "dev eth0 ingress" 101 ".packets" "-n ns") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10Documentation: bonding: fix the doc of peer_notif_delayHangbin Liu
Bonding only supports setting peer_notif_delay with miimon set. Fixes: 0307d589c4d6 ("bonding: add documentation for peer_notif_delay") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10bonding: fix send_peer_notif overflowHangbin Liu
Bonding send_peer_notif was defined as u8. Since commit 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications"). the bond->send_peer_notif will be num_peer_notif multiplied by peer_notif_delay, which is u8 * u32. This would cause the send_peer_notif overflow easily. e.g. ip link add bond0 type bond mode 1 miimon 100 num_grat_arp 30 peer_notify_delay 1000 To fix the overflow, let's set the send_peer_notif to u32 and limit peer_notif_delay to 300s. Reported-by: Liang Li <liali@redhat.com> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2090053 Fixes: 07a4ddec3ce9 ("bonding: add an option to specify a delay between peer notifications") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10net: ethernet: mtk_eth_soc: fix NULL pointer dereferenceDaniel Golle
Check for NULL pointer to avoid kernel crashing in case of missing WO firmware in case only a single WEDv2 device has been initialized, e.g. on MT7981 which can connect just one wireless frontend. Fixes: 86ce0d09e424 ("net: ethernet: mtk_eth_soc: use WO firmware for MT7981") Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10net: ipconfig: Allow DNS to be overwritten by DHCPACKMartin Wetterwald
Some DHCP server implementations only send the important requested DHCP options in the final BOOTP reply (DHCPACK). One example is systemd-networkd. However, RFC2131, in section 4.3.1 states: > The server MUST return to the client: > [...] > o Parameters requested by the client, according to the following > rules: > > -- IF the server has been explicitly configured with a default > value for the parameter, the server MUST include that value > in an appropriate option in the 'option' field, ELSE I've reported the issue here: https://github.com/systemd/systemd/issues/27471 Linux PNP DHCP client implementation only takes into account the DNS servers received in the first BOOTP reply (DHCPOFFER). This usually isn't an issue as servers are required to put the same values in the DHCPOFFER and DHCPACK. However, RFC2131, in section 4.3.2 states: > Any configuration parameters in the DHCPACK message SHOULD NOT > conflict with those in the earlier DHCPOFFER message to which the > client is responding. The client SHOULD use the parameters in the > DHCPACK message for configuration. When making Linux PNP DHCP client (cmdline ip=dhcp) interact with systemd-networkd DHCP server, an interesting "protocol misunderstanding" happens: Because DNS servers were only specified in the DHCPACK and not in the DHCPOFFER, Linux will not catch the correct DNS servers: in the first BOOTP reply (DHCPOFFER), it sees that there is no DNS, and sets as fallback the IP of the DHCP server itself. When the second BOOTP reply comes (DHCPACK), it's already too late: the kernel will not overwrite the fallback setting it has set previously. This patch makes the kernel overwrite its DNS fallback by DNS servers specified in the DHCPACK if any. Signed-off-by: Martin Wetterwald <martin@wetterwald.eu> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10selftests: nft_flowtable.sh: check ingress/egress chain tooFlorian Westphal
Make sure flowtable interacts correctly with ingress and egress chains, i.e. those get handled before and after flow table respectively. Adds three more tests: 1. repeat flowtable test, but with 'ip dscp set cs3' done in inet forward chain. Expect that some packets have been mangled (before flowtable offload became effective) while some pass without mangling (after offload succeeds). 2. repeat flowtable test, but with 'ip dscp set cs3' done in veth0:ingress. Expect that all packets pass with cs3 dscp field. 3. same as 2, but use veth1:egress. Expect the same outcome. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10selftests: nft_flowtable.sh: monitor result file sizesBoris Sukholitko
When running nft_flowtable.sh in VM on a busy server we've found that the time of the netcat file transfers vary wildly. Therefore replace hardcoded 3 second sleep with the loop checking for a change in the file sizes. Once no change in detected we test the results. Nice side effect is that we shave 1 second sleep in the fast case (hard-coded 3 second sleep vs two 1 second sleeps). Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10selftests: nft_flowtable.sh: wait for specific nc pidsBoris Sukholitko
Doing wait with no parameters may interfere with some of the tests having their own background processes. Although no such test is currently present, the cleanup is useful to rely on the nft_flowtable.sh for local development (e.g. running background tcpdump command during the tests). Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10selftests: nft_flowtable.sh: no need for ps -x optionBoris Sukholitko
Some ps commands (e.g. busybox derived) have no -x option. For the purposes of hash calculation of the list of processes this option is inessential. Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10selftests: nft_flowtable.sh: use /proc for pid checkingBoris Sukholitko
Some ps commands (e.g. busybox derived) have no -p option. Use /proc for pid existence check. Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10netfilter: conntrack: fix possible bug_on with enable_hooks=1Florian Westphal
I received a bug report (no reproducer so far) where we trip over 712 rcu_read_lock(); 713 ct_hook = rcu_dereference(nf_ct_hook); 714 BUG_ON(ct_hook == NULL); // here In nf_conntrack_destroy(). First turn this BUG_ON into a WARN. I think it was triggered via enable_hooks=1 flag. When this flag is turned on, the conntrack hooks are registered before nf_ct_hook pointer gets assigned. This opens a short window where packets enter the conntrack machinery, can have skb->_nfct set up and a subsequent kfree_skb might occur before nf_ct_hook is set. Call nf_conntrack_init_end() to set nf_ct_hook before we register the pernet ops. Fixes: ba3fbe663635 ("netfilter: nf_conntrack: provide modparam to always register conntrack hooks") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10netfilter: nf_tables: always release netdev hooks from notifierFlorian Westphal
This reverts "netfilter: nf_tables: skip netdev events generated on netns removal". The problem is that when a veth device is released, the veth release callback will also queue the peer netns device for removal. Its possible that the peer netns is also slated for removal. In this case, the device memory is already released before the pre_exit hook of the peer netns runs: BUG: KASAN: slab-use-after-free in nf_hook_entry_head+0x1b8/0x1d0 Read of size 8 at addr ffff88812c0124f0 by task kworker/u8:1/45 Workqueue: netns cleanup_net Call Trace: nf_hook_entry_head+0x1b8/0x1d0 __nf_unregister_net_hook+0x76/0x510 nft_netdev_unregister_hooks+0xa0/0x220 __nft_release_hook+0x184/0x490 nf_tables_pre_exit_net+0x12f/0x1b0 .. Order is: 1. First netns is released, veth_dellink() queues peer netns device for removal 2. peer netns is queued for removal 3. peer netns device is released, unreg event is triggered 4. unreg event is ignored because netns is going down 5. pre_exit hook calls nft_netdev_unregister_hooks but device memory might be free'd already. Fixes: 68a3765c659f ("netfilter: nf_tables: skip netdev events generated on netns removal") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-09net: phy: bcm7xx: Correct read from expansion registerFlorian Fainelli
Since the driver works in the "legacy" addressing mode, we need to write to the expansion register (0x17) with bits 11:8 set to 0xf to properly select the expansion register passed as argument. Fixes: f68d08c437f9 ("net: phy: bcm7xxx: Add EPHY entry for 72165") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230508231749.1681169-1-f.fainelli@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: veth: rely on napi_build_skb in veth_convert_skb_to_xdp_buffLorenzo Bianconi
Since veth_convert_skb_to_xdp_buff routine runs in veth_poll() NAPI, rely on napi_build_skb() instead of build_skb() to reduce skb allocation cost. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com> Link: https://lore.kernel.org/r/0f822c0b72f8b71555c11745cb8fb33399d02de9.1683578488.git.lorenzo@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: Fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().Kuniyuki Iwashima
KCSAN found a data race in sock_recv_cmsgs() where the read access to sk->sk_stamp needs READ_ONCE(). BUG: KCSAN: data-race in packet_recvmsg / packet_recvmsg write (marked) to 0xffff88803c81f258 of 8 bytes by task 19171 on cpu 0: sock_write_timestamp include/net/sock.h:2670 [inline] sock_recv_cmsgs include/net/sock.h:2722 [inline] packet_recvmsg+0xb97/0xd00 net/packet/af_packet.c:3489 sock_recvmsg_nosec net/socket.c:1019 [inline] sock_recvmsg+0x11a/0x130 net/socket.c:1040 sock_read_iter+0x176/0x220 net/socket.c:1118 call_read_iter include/linux/fs.h:1845 [inline] new_sync_read fs/read_write.c:389 [inline] vfs_read+0x5e0/0x630 fs/read_write.c:470 ksys_read+0x163/0x1a0 fs/read_write.c:613 __do_sys_read fs/read_write.c:623 [inline] __se_sys_read fs/read_write.c:621 [inline] __x64_sys_read+0x41/0x50 fs/read_write.c:621 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc read to 0xffff88803c81f258 of 8 bytes by task 19183 on cpu 1: sock_recv_cmsgs include/net/sock.h:2721 [inline] packet_recvmsg+0xb64/0xd00 net/packet/af_packet.c:3489 sock_recvmsg_nosec net/socket.c:1019 [inline] sock_recvmsg+0x11a/0x130 net/socket.c:1040 sock_read_iter+0x176/0x220 net/socket.c:1118 call_read_iter include/linux/fs.h:1845 [inline] new_sync_read fs/read_write.c:389 [inline] vfs_read+0x5e0/0x630 fs/read_write.c:470 ksys_read+0x163/0x1a0 fs/read_write.c:613 __do_sys_read fs/read_write.c:623 [inline] __se_sys_read fs/read_write.c:621 [inline] __x64_sys_read+0x41/0x50 fs/read_write.c:621 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x72/0xdc value changed: 0xffffffffc4653600 -> 0x0000000000000000 Reported by Kernel Concurrency Sanitizer on: CPU: 1 PID: 19183 Comm: syz-executor.5 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Fixes: 6c7c98bad488 ("sock: avoid dirtying sk_stamp, if possible") Reported-by: syzbot <syzkaller@googlegroups.com> Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://lore.kernel.org/r/20230508175543.55756-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: stmmac: xgmac: add ethtool per-queue irq statistic supportTeoh Ji Sheng
Commit af9bf70154eb ("net: stmmac: add ethtool per-queue irq statistic support") introduced ethtool per-queue statistics support to display number of interrupts generated by DMA tx and DMA rx for DWMAC4 core. This patch extend the support to XGMAC core. Signed-off-by: Teoh Ji Sheng <ji.sheng.teoh@intel.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230508144339.3014402-1-ji.sheng.teoh@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09Merge branch 'net-stmmac-convert-to-platform-remove-callback-returning-void'Jakub Kicinski
Uwe Kleine-König says: ==================== net: stmmac: Convert to platform remove callback returning void (implicit) v1 of this series is available at https://lore.kernel.org/netdev/20230402143025.2524443-1-u.kleine-koenig@pengutronix.de Changes since then: - Added various Reviewed-by: and Acked-by: tags received for v1 - Removed a variable in an earlier patch to make all intermediate steps compilable, spotted by Simon Horman - Rebased to v6.4-rc1 (which needed a slight adaption to cope for 4bd3bb7b4526 ("net: stmmac: Add glue layer for StarFive JH7110 SoC")) ==================== Link: https://lore.kernel.org/r/20230508142637.1449363-1-u.kleine-koenig@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: stmmac: dwmac-tegra: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is (mostly) ignored and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new() which already returns void. Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: stmmac: dwmac-sun8i: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is (mostly) ignored and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new() which already returns void. Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: stmmac: dwmac-stm32: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is (mostly) ignored and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new() which already returns void. Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: stmmac: dwmac-sti: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is (mostly) ignored and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new() which already returns void. Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: stmmac: dwmac-rk: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is (mostly) ignored and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new() which already returns void. Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: stmmac: dwmac-qcom-ethqos: Convert to platform remove callback ↵Uwe Kleine-König
returning void The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is (mostly) ignored and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new() which already returns void. Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Bhupesh Sharma <bhupesh.sharma@linaro.org> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: stmmac: dwmac-dwc-qos-eth: Convert to platform remove callback ↵Uwe Kleine-König
returning void The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is (mostly) ignored and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new() which already returns void. Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: stmmac: dwmac-visconti: Convert to platform remove callback returning voidUwe Kleine-König
The .remove() callback for a platform driver returns an int which makes many driver authors wrongly assume it's possible to do error handling by returning an error code. However the value returned is (mostly) ignored and this typically results in resource leaks. To improve here there is a quest to make the remove callback return void. In the first step of this quest all drivers are converted to .remove_new() which already returns void. Trivially convert this driver from always returning zero in the remove callback to the void returning variant. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: stmmac: dwmac-qcom-ethqos: Drop an if with an always false conditionUwe Kleine-König
The remove callback is only ever called after .probe() returned successfully. After that get_stmmac_bsp_priv() always return non-NULL. Side note: The early exit would also be a bug because the return value of qcom_ethqos_remove() is ignored by the device core and the device is unbound unconditionally. So exiting early resulted in a dangerous resource leak as all devm allocated resources (some memory and the register mappings) are freed but the network device stays around. Using the network device afterwards probably oopses. Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: stmmac: dwmac-visconti: Make visconti_eth_clock_remove() return voidUwe Kleine-König
The function returns zero unconditionally. Change it to return void instead which simplifies one caller as error handing becomes unnecessary. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Simon Horman <simon.horman@corigine.com> Acked-by: Nobuhiro Iwamatsu <nobuhiro1.iwamatsu@toshiba.co.jp> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09net: stmmac: Make stmmac_pltfr_remove() return voidUwe Kleine-König
The function returns zero unconditionally. Change it to return void instead which simplifies some callers as error handing becomes unnecessary. The function is also used for some drivers as remove callback. Switch these to the .remove_new() callback. For some others no error can happen in the remove callback now, convert them to .remove_new(), too. Acked-by: Jernej Skrabec <jernej.skrabec@gmail.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09Merge branch 'virtio_net-refactor-xdp-codes'Jakub Kicinski
Xuan Zhuo says: ==================== virtio_net: refactor xdp codes Due to historical reasons, the implementation of XDP in virtio-net is relatively chaotic. For example, the processing of XDP actions has two copies of similar code. Such as page, xdp_page processing, etc. The purpose of this patch set is to refactor these code. Reduce the difficulty of subsequent maintenance. Subsequent developers will not introduce new bugs because of some complex logical relationships. In addition, the supporting to AF_XDP that I want to submit later will also need to reuse the logic of XDP, such as the processing of actions, I don't want to introduce a new similar code. In this way, I can reuse these codes in the future. ==================== Link: https://lore.kernel.org/r/20230508061417.65297-1-xuanzhuo@linux.alibaba.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: introduce virtnet_build_skb()Xuan Zhuo
This logic is used in multiple places, now we separate it into a helper. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: introduce receive_small_build_xdpXuan Zhuo
Simplifying receive_small() function. Bringing the logic relating to build_skb together. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: small: remove skip_xdpXuan Zhuo
Because the skb build code is not shared between xdp and non-xdp, and the xdp code in receive_small() is simpler, so "skip_xdp" is not needed. We can remove it. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: small: avoid code duplication in xdp scenariosXuan Zhuo
Avoid the problem that some variables(headroom and so on) will repeat the calculation when process xdp. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: small: remove the deltaXuan Zhuo
In the case of XDP-PASS, skb_reserve uses the "delta" to compatible non-XDP, now that is not shared between xdp and non-xdp, so we can remove this logic. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: introduce receive_small_xdp()Xuan Zhuo
The purpose of this patch is to simplify the receive_small(). Separate all the logic of XDP of small into a function. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: merge: remove skip_xdpXuan Zhuo
Now, the logic of merge xdp process is simple, we can remove the skip_xdp. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: introduce receive_mergeable_xdp()Xuan Zhuo
The purpose of this patch is to simplify the receive_mergeable(). Separate all the logic of XDP into a function. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: virtnet_build_xdp_buff_mrg() auto release xdp shinfoXuan Zhuo
virtnet_build_xdp_buff_mrg() auto release xdp shinfo then the caller no need to careful the xdp shinfo. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: separate the logic of freeing the rest mergeable bufXuan Zhuo
This patch introduce a new function that frees the rest mergeable buf. The subsequent patch will reuse this function. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: separate the logic of freeing xdp shinfoXuan Zhuo
This patch introduce a new function that releases the xdp shinfo. The subsequent patch will reuse this function. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: introduce virtnet_xdp_handler() to seprate the logic of run xdpXuan Zhuo
At present, we have two similar logic to perform the XDP prog. Therefore, this patch separates the code of executing XDP, which is conducive to later maintenance. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: optimize mergeable_xdp_get_buf()Xuan Zhuo
The previous patch, in order to facilitate review, I do not do any modification. This patch has made some optimization on the top. * remove some repeated logics in this function. * add fast check for passing without any alloc. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: introduce mergeable_xdp_get_buf()Xuan Zhuo
Separating the logic of preparation for xdp from receive_mergeable. The purpose of this is to simplify the logic of execution of XDP. The main logic here is that when headroom is insufficient, we need to allocate a new page and calculate offset. It should be noted that if there is new page, the variable page will refer to the new page. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09virtio_net: mergeable xdp: put old page immediatelyXuan Zhuo
In the xdp implementation of virtio-net mergeable, it always checks whether two page is used and a page is selected to release. This is complicated for the processing of action, and be careful. In the entire process, we have such principles: * If xdp_page is used (PASS, TX, Redirect), then we release the old page. * If it is a drop case, we will release two. The old page obtained from buf is release inside err_xdp, and xdp_page needs be relased by us. But in fact, when we allocate a new page, we can release the old page immediately. Then just one is using, we just need to release the new page for drop case. On the drop path, err_xdp will release the variable "page", so we only need to let "page" point to the new xdp_page in advance. Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>