summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-05-24net/handshake: Fix handshake_dup() ref countingChuck Lever
If get_unused_fd_flags() fails, we ended up calling fput(sock->file) twice. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Suggested-by: Paolo Abeni <pabeni@redhat.com> Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24net/handshake: Remove unneeded check from handshake_dup()Chuck Lever
handshake_req_submit() now verifies that the socket has a file. Fixes: 3b3009ea8abb ("net/handshake: Create a NETLINK service for handling handshake requests") Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2023-05-24 We've added 19 non-merge commits during the last 10 day(s) which contain a total of 20 files changed, 738 insertions(+), 448 deletions(-). The main changes are: 1) Batch of BPF sockmap fixes found when running against NGINX TCP tests, from John Fastabend. 2) Fix a memleak in the LRU{,_PERCPU} hash map when bucket locking fails, from Anton Protopopov. 3) Init the BPF offload table earlier than just late_initcall, from Jakub Kicinski. 4) Fix ctx access mask generation for 32-bit narrow loads of 64-bit fields, from Will Deacon. 5) Remove a now unsupported __fallthrough in BPF samples, from Andrii Nakryiko. 6) Fix a typo in pkg-config call for building sign-file, from Jeremy Sowden. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf, sockmap: Test progs verifier error with latest clang bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer with drops bpf, sockmap: Test FIONREAD returns correct bytes in rx buffer bpf, sockmap: Test shutdown() correctly exits epoll and recv()=0 bpf, sockmap: Build helper to create connected socket pair bpf, sockmap: Pull socket helpers out of listen test for general use bpf, sockmap: Incorrectly handling copied_seq bpf, sockmap: Wake up polling after data copy bpf, sockmap: TCP data stall on recv before accept bpf, sockmap: Handle fin correctly bpf, sockmap: Improved check for empty queue bpf, sockmap: Reschedule is now done through backlog bpf, sockmap: Convert schedule_work into delayed_work bpf, sockmap: Pass skb ownership through read_skb bpf: fix a memory leak in the LRU and LRU_PERCPU hash maps bpf: Fix mask generation for 32-bit narrow loads of 64-bit fields samples/bpf: Drop unnecessary fallthrough bpf: netdev: init the offload table earlier selftests/bpf: Fix pkg-config call building sign-file ==================== Link: https://lore.kernel.org/r/20230524170839.13905-1-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24Merge branch 'libbpf: capability for resizing datasec maps'Andrii Nakryiko
JP Kobryn says: ==================== Due to the way the datasec maps like bss, data, rodata are memory mapped, they cannot be resized with bpf_map__set_value_size() like non-datasec maps can. This series offers a way to allow the resizing of datasec maps, by having the mapped regions resized as needed and also adjusting associated BTF info if possible. The thought behind this is to allow for use cases where a given datasec needs to scale to for example the number of CPU's present. A bpf program can have a global array in a data section with an initial length and before loading the bpf program, the array length could be extended to match the CPU count. The selftests included in this series perform this scaling to an arbitrary value to demonstrate how it can work. ==================== Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2023-05-24libbpf: Selftests for resizing datasec mapsJP Kobryn
This patch adds test coverage for resizing datasec maps. The first two subtests resize the bss and custom data sections. In both cases, an initial array (of length one) has its element set to one. After resizing the rest of the array is filled with ones as well. A BPF program is then run to sum the respective arrays and back on the userspace side the sum is checked to be equal to the number of elements. The third subtest attempts to perform resizing under conditions that will result in either the resize failing or the BTF info being cleared. Signed-off-by: JP Kobryn <inwardvessel@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230524004537.18614-3-inwardvessel@gmail.com
2023-05-24libbpf: Add capability for resizing datasec mapsJP Kobryn
This patch updates bpf_map__set_value_size() so that if the given map is memory mapped, it will attempt to resize the mapped region. Initial contents of the mapped region are preserved. BTF is not required, but after the mapping is resized an attempt is made to adjust the associated BTF information if the following criteria is met: - BTF info is present - the map is a datasec - the final variable in the datasec is an array ... the resulting BTF info will be updated so that the final array variable is associated with a new BTF array type sized to cover the requested size. Note that the initial resizing of the memory mapped region can succeed while the subsequent BTF adjustment can fail. In this case, BTF info is dropped from the map by clearing the key and value type. Signed-off-by: JP Kobryn <inwardvessel@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/bpf/20230524004537.18614-2-inwardvessel@gmail.com
2023-05-24Merge tag 'spi-fix-v6.4-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A collection of fixes that came in since the merge window, plus an update to MAINTAINERS. The Cadence fixes are coming from the addition of device mode support, they required a couple of incremental updates in order to get something that works robustly for both device and controller modes" * tag 'spi-fix-v6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-cadence: Interleave write of TX and read of RX FIFO spi: dw: Replace spi->chip_select references with function calls spi: MAINTAINERS: drop Krzysztof Kozlowski from Samsung SPI spi: spi-cadence: Only overlap FIFO transactions in slave mode spi: spi-cadence: Avoid read of RX FIFO before its ready spi: spi-geni-qcom: Select FIFO mode for chip select
2023-05-24Merge tag 'regulator-fix-v6.4-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "Some fixes that came in since the merge window, nothing terribly exciting - a couple of driver specific fixes and a fix for the error handling when setting up the debugfs for the devices" * tag 'regulator-fix-v6.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: mt6359: add read check for PMIC MT6359 regulator: Fix error checking for debugfs_create_dir regulator: pca9450: Fix BUCK2 enable_mask
2023-05-24Merge tag 'mmc-v6.4-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc Pull MMC fixes from Ulf Hansson: "MMC core: - Fix error propagation for the non-block-device I/O paths MMC host: - sdhci-cadence: Fix an error path during probe - sdhci-esdhc-imx: Fix support for the 'no-mmc-hs400' DT property" * tag 'mmc-v6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc: mmc: sdhci-esdhc-imx: make "no-mmc-hs400" works mmc: sdhci-cadence: Fix an error handling path in sdhci_cdns_probe() mmc: block: ensure error propagation for non-blk
2023-05-24Merge branch 'net-pcs-xpcs-cleanups-for-clause-73-support'Jakub Kicinski
Russell King says: ==================== net: pcs: xpcs: cleanups for clause 73 support This series cleans up xpcs code, moving much of the clause 73 code out of the driver into places where others can make use of it. Specifically, we add a helper to convert a clause 73 advertisement to ethtool link modes to mdio.h, and a helper to resolve the clause 73 negotiation state to phylink, which includes the pause modes. In doing this cleanup, several issues were identified with the original xpcs implementation: 1) it masks the link partner advertisement with its own advertisement so userspace can't see what the full link partner advertisement was. 2) it was always setting pause modes irrespective of the advertisements on either end of the link. 3) it was reading the STAT1 registers multiple times. Reading STAT1 has the side effect of unlatching the link-down status, so multiple reads should be avoided. This patch series addresses the first two first by addressing the issues, and then by moving over to the new helpers. The third issue is solved by restructuring the xpcs code. ==================== Link: https://lore.kernel.org/r/ZGyR/jDyYTYzRklg@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24net: pcs: xpcs: avoid reading STAT1 more than onceRussell King (Oracle)
Avoid reading the STAT1 registers more than once while getting the PCS state, as this register contains latching-low bits that are lost after the first read. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24net: pcs: xpcs: use phylink_resolve_c73() helperRussell King (Oracle)
Use phylink_resolve_c73() to resolve the clause 73 autonegotiation result. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24net: pcs: xpcs: correct pause resolutionRussell King (Oracle)
xpcs was indicating symmetric pause should be enabled regardless of the advertisements by either party. Fix this to use linkmode_resolve_pause() now that we're no longer obliterating the link partner's advertisement by logically anding it with our own. This is transitional, the function will be entirely replaced with phylink_resolve_c73() in the following patch. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24net: pcs: xpcs: correct lp_advertising contentsRussell King (Oracle)
lp_advertising is supposed to reflect the link partner's advertisement unmodified by the local advertisement, but xpcs bitwise ands it with the local advertisement prior to calculating the resolution of the negotiation. Fix this by moving the bitwise and to xpcs_resolve_lpa_c73() so it can place the results in a temporary bitmap before passing that to ixpcs_get_max_usxgmii_speed(). Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24net: pcs: xpcs: use mii_c73_to_linkmode() helperRussell King (Oracle)
Convert xpcs clause 73 reading to use the newly introduced mii_c73_to_linkmode() helper to translate the link partner advertisement to an ethtool bitmap. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24net: pcs: xpcs: clean up reading clause 73 link partner advertisementRussell King (Oracle)
Read the clause 73 link partner advertisement in a loop and then translate to the ethtool modes. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24net: phylink: add function to resolve clause 73 negotiationRussell King (Oracle)
Add a function to resolve clause 73 negotiation according to the priority resolution function described in clause 73.3.6. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24net: phylink: remove duplicated linkmode pause resolutionRussell King (Oracle)
Phylink had two chunks of code virtually the same for resolving the negotiated pause modes. Factor this down to one function. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24net: mdio: add clause 73 to ethtool conversion helperRussell King (Oracle)
Add a helper to convert a clause 73 advertisement to an ethtool bitmap. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-24ALSA: hda/realtek: Enable headset onLenovo M70/M90Bin Li
Lenovo M70/M90 Gen4 are equipped with ALC897, and they need ALC897_FIXUP_HEADSET_MIC_PIN quirk to make its headset mic work. The previous quirk for M70/M90 is for Gen3. Signed-off-by: Bin Li <bin.li@canonical.com> Cc: <stable@vger.kernel.org> Link: https://lore.kernel.org/r/20230524113755.1346928-1-bin.li@canonical.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2023-05-24Merge tag 'asoc-fix-v6.4-rc3' of ↵Takashi Iwai
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v6.4 A collection of fixes for v6.4, mostly driver specific but there's also one fix for DPCM to avoid incorrectly repeated calls to prepare() which can trigger issues on some systems.
2023-05-24Merge branch 'devlink-port_del-new-cleanup'David S. Miller
Jiri Pirko says: ==================== devlink: small port_new/del() cleanup This patchset cleans up couple of leftovers after recent devlink locking changes. Previously, both port_new/dev() commands were called without holding instance lock. Currently all devlink commands are called with instance lock held. The first patch just removes redundant port notification. The second one removes couple of outdated comments. The last patch changes port_dev() to have devlink_port pointer as an arg instead of port_index, which makes it similar to the rest of port related ops. ==================== Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24devlink: pass devlink_port pointer to ops->port_del() instead of indexJiri Pirko
Historically there was a reason why port_dev() along with for example port_split() did get port_index instead of the devlink_port pointer. With the locking changes that were done which ensured devlink instance mutex is hold for every command, the port ops could get devlink_port pointer directly. Change the forgotten port_dev() op to be as others and pass devlink_port pointer instead of port_index. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24devlink: remove no longer true locking comment from port_new/del()Jiri Pirko
All commands are called holding instance lock. Remove the outdated comment that says otherwise. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24devlink: remove duplicate port notificationJiri Pirko
The notification about created port is send from devl_port_register() function called from ops->port_new(). No need to send it again here, so remove the call and the helper function. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24Merge branch 'tools-ynl-byteorder'David S. Miller
Donald Hunter says: ==================== tools: ynl: Add byte-order support for struct members This patchset adds support to ynl for handling byte-order in struct members. The first patch is a refactor to use predefined Struct() objects instead of generating byte-order specific formats on the fly. The second patch adds byte-order handling for struct members. ==================== Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24tools: ynl: Handle byte-order in struct membersDonald Hunter
Add support for byte-order in struct members in the genetlink-legacy spec. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24tools: ynl: Use dict of predefined Structs to decode scalar typesDonald Hunter
Use a dict of predefined Struct() objects to decode scalar types in native, big or little endian format. This removes the repetitive code for the scalar variants and ensures all the signed variants are supported. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24ipv6: Fix out-of-bounds access in ipv6_find_tlv()Gavrilov Ilia
optlen is fetched without checking whether there is more than one byte to parse. It can lead to out-of-bounds access. Found by InfoTeCS on behalf of Linux Verification Center (linuxtesting.org) with SVACE. Fixes: c61a40432509 ("[IPV6]: Find option offset by type.") Signed-off-by: Gavrilov Ilia <Ilia.Gavrilov@infotecs.ru> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24Merge tag 'mlx5-fixes-2023-05-22' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-fixes-2023-05-22 This series provides bug fixes for the mlx5 driver. Please pull and let me know if there is any problem. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24net: phy: avoid kernel warning dump when stopping an errored PHYRussell King (Oracle)
When taking a network interface down (or removing a SFP module) after the PHY has encountered an error, phy_stop() complains incorrectly that it was called from HALTED state. The reason this is incorrect is that the network driver will have called phy_start() when the interface was brought up, and the fact that the PHY has a problem bears no relationship to the administrative state of the interface. Taking the interface administratively down (which calls phy_stop()) is always the right thing to do after a successful phy_start() call, whether or not the PHY has encountered an error. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24Merge branch 'RTO_ONLINK'David S. Miller
Guillaume Nault says: ==================== ipv4: Remove RTO_ONLINK from udp, ping and raw sockets. udp_sendmsg(), ping_v4_sendmsg() and raw_sendmsg() use similar patterns for restricting their route lookup to on-link hosts. Although they use slightly different code, they all use RTO_ONLINK to override the least significant bit of their tos value. RTO_ONLINK is used to restrict the route scope even when the scope is set to RT_SCOPE_UNIVERSE. Therefore it isn't necessary: we can properly set the scope to RT_SCOPE_LINK instead. Removing RTO_ONLINK will allow to convert .flowi4_tos to dscp_t in the future, thus allowing to properly separate the DSCP from the ECN bits in the networking stack. This patch series defines a common helper to figure out what's the scope of the route lookup. This unifies the way udp, ping and raw sockets get their routing scope and removes their dependency on RTO_ONLINK. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24udp: Stop using RTO_ONLINK.Guillaume Nault
Use ip_sendmsg_scope() to properly initialise the scope in flowi4_init_output(), instead of overriding tos with the RTO_ONLINK flag. The objective is to eventually remove RTO_ONLINK, which will allow converting .flowi4_tos to dscp_t. Now that the scope is determined by ip_sendmsg_scope(), we need to check its result to set the 'connected' variable. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24raw: Stop using RTO_ONLINK.Guillaume Nault
Use ip_sendmsg_scope() to properly initialise the scope in flowi4_init_output(), instead of overriding tos with the RTO_ONLINK flag. The objective is to eventually remove RTO_ONLINK, which will allow converting .flowi4_tos to dscp_t. The MSG_DONTROUTE and SOCK_LOCALROUTE cases were already handled by raw_sendmsg() (SOCK_LOCALROUTE was handled by the RT_CONN_FLAGS*() macros called by get_rtconn_flags()). However, opt.is_strictroute wasn't taken into account. Therefore, a side effect of this patch is to now honour opt.is_strictroute, and thus align raw_sendmsg() with ping_v4_sendmsg() and udp_sendmsg(). Since raw_sendmsg() was the only user of get_rtconn_flags(), we can now remove this function. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24ping: Stop using RTO_ONLINK.Guillaume Nault
Define a new helper to figure out the correct route scope to use on TX, depending on socket configuration, ancillary data and send flags. Use this new helper to properly initialise the scope in flowi4_init_output(), instead of overriding tos with the RTO_ONLINK flag. The objective is to eventually remove RTO_ONLINK, which will allow converting .flowi4_tos to dscp_t. Signed-off-by: Guillaume Nault <gnault@redhat.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-24net: ethernet: mtk_eth_soc: fix QoS on DSA MAC on non MTK_NETSYS_V2 SoCsArınç ÜNAL
The commit c6d96df9fa2c ("net: ethernet: mtk_eth_soc: drop generic vlan rx offload, only use DSA untagging") makes VLAN RX offloading to be only used on the SoCs without the MTK_NETSYS_V2 ability (which are not just MT7621 and MT7622). The commit disables the proper handling of special tagged (DSA) frames, added with commit 87e3df4961f4 ("net-next: ethernet: mediatek: add CDM able to recognize the tag for DSA"), for non MTK_NETSYS_V2 SoCs when it finds a MAC that does not use DSA. So if the other MAC uses DSA, the CDMQ component transmits DSA tagged frames to the CPU improperly. This issue can be observed on frames with TCP, for example, a TCP speed test using iperf3 won't work. The commit disables the proper handling of special tagged (DSA) frames because it assumes that these SoCs don't use more than one MAC, which is wrong. Although I made Frank address this false assumption on the patch log when they sent the patch on behalf of Felix, the code still made changes with this assumption. Therefore, the proper handling of special tagged (DSA) frames must be kept enabled in all circumstances as it doesn't affect non DSA tagged frames. Hardware DSA untagging, introduced with the commit 2d7605a72906 ("net: ethernet: mtk_eth_soc: enable hardware DSA untagging"), and VLAN RX offloading are operations on the two CDM components of the frame engine, CDMP and CDMQ, which connect to Packet DMA (PDMA) and QoS DMA (QDMA) and are between the MACs and the CPU. These operations apply to all MACs of the SoC so if one MAC uses DSA and the other doesn't, the hardware DSA untagging operation will cause the CDMP component to transmit non DSA tagged frames to the CPU improperly. Since the VLAN RX offloading feature configuration was dropped, VLAN RX offloading can only be used along with hardware DSA untagging. So, for the case above, we need to disable both features and leave it to the CPU, therefore software, to untag the DSA and VLAN tags. So the correct way to handle this is: For all SoCs: Enable the proper handling of special tagged (DSA) frames (MTK_CDMQ_IG_CTRL). For non MTK_NETSYS_V2 SoCs: Enable hardware DSA untagging (MTK_CDMP_IG_CTRL). Enable VLAN RX offloading (MTK_CDMP_EG_CTRL). When a non MTK_NETSYS_V2 SoC MAC does not use DSA: Disable hardware DSA untagging (MTK_CDMP_IG_CTRL). Disable VLAN RX offloading (MTK_CDMP_EG_CTRL). Fixes: c6d96df9fa2c ("net: ethernet: mtk_eth_soc: drop generic vlan rx offload, only use DSA untagging") Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-23docs: netdev: document the existence of the mail botJakub Kicinski
We had a good run, but after 4 weeks of use we heard someone asking about pw-bot commands. Let's explain its existence in the docs. It's not a complete documentation but hopefully it's enough for the casual contributor. The project and scope are in flux so the details would likely become out of date, if we were to document more in depth. Link: https://lore.kernel.org/all/20230522140057.GB18381@nucnuc.mle/ Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230522230903.1853151-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23gve: Support IPv6 Big TCP on DQCoco Li
Add support for using IPv6 Big TCP on DQ which can handle large TSO/GRO packets. See https://lwn.net/Articles/895398/. This can improve the throughput and CPU usage. Perf test result: ip -d link show $DEV gso_max_size 185000 gso_max_segs 65535 tso_max_size 262143 tso_max_segs 65535 gro_max_size 185000 For performance, tested with neper using 9k MTU on hardware that supports 200Gb/s line rate. In single streams when line rate is not saturated, we expect throughput improvements. When the networking is performing at line rate, we expect cpu usage improvements. Tcp_stream (unidirectional stream test, T=thread, F=flow): skb=180kb, T=1, F=1, no zerocopy: throughput average=64576.88 Mb/s, sender stime=8.3, receiver stime=10.68 skb=64kb, T=1, F=1, no zerocopy: throughput average=64862.54 Mb/s, sender stime=9.96, receiver stime=12.67 skb=180kb, T=1, F=1, yes zerocopy: throughput average=146604.97 Mb/s, sender stime=10.61, receiver stime=5.52 skb=64kb, T=1, F=1, yes zerocopy: throughput average=131357.78 Mb/s, sender stime=12.11, receiver stime=12.25 skb=180kb, T=20, F=100, no zerocopy: throughput average=182411.37 Mb/s, sender stime=41.62, receiver stime=79.4 skb=64kb, T=20, F=100, no zerocopy: throughput average=182892.02 Mb/s, sender stime=57.39, receiver stime=72.69 skb=180kb, T=20, F=100, yes zerocopy: throughput average=182337.65 Mb/s, sender stime=27.94, receiver stime=39.7 skb=64kb, T=20, F=100, yes zerocopy: throughput average=182144.20 Mb/s, sender stime=47.06, receiver stime=39.01 Signed-off-by: Ziwei Xiao <ziweixiao@google.com> Signed-off-by: Coco Li <lixiaoyan@google.com> Reviewed-by: Simon Horman <simon.horman@corigine.com> Link: https://lore.kernel.org/r/20230522201552.3585421-1-ziweixiao@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23net: fix skb leak in __skb_tstamp_tx()Pratyush Yadav
Commit 50749f2dd685 ("tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp.") added a call to skb_orphan_frags_rx() to fix leaks with zerocopy skbs. But it ended up adding a leak of its own. When skb_orphan_frags_rx() fails, the function just returns, leaking the skb it just cloned. Free it before returning. This bug was discovered and resolved using Coverity Static Analysis Security Testing (SAST) by Synopsys, Inc. Fixes: 50749f2dd685 ("tcp/udp: Fix memleaks of sk and zerocopy skbs with TX timestamp.") Signed-off-by: Pratyush Yadav <ptyadav@amazon.de> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230522153020.32422-1-ptyadav@amazon.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23Merge branch 'splice-net-replace-sendpage-with-sendmsg-msg_splice_pages-part-1'Jakub Kicinski
David Howells says: ==================== splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES), part 1 Here's the first tranche of patches towards providing a MSG_SPLICE_PAGES internal sendmsg flag that is intended to replace the ->sendpage() op with calls to sendmsg(). MSG_SPLICE_PAGES is a hint that tells the protocol that it should splice the pages supplied if it can and copy them if not. This will allow splice to pass multiple pages in a single call and allow certain parts of higher protocols (e.g. sunrpc, iwarp) to pass an entire message in one go rather than having to send them piecemeal. This should also make it easier to handle the splicing of multipage folios. A helper, skb_splice_from_iter() is provided to do the work of splicing or copying data from an iterator. If a page is determined to be unspliceable (such as being in the slab), then the helper will give an error. Note that this facility is not made available to userspace and does not provide any sort of callback. This set consists of the following parts: (1) Define the MSG_SPLICE_PAGES flag and prevent sys_sendmsg() from being able to set it. (2) Add an extra argument to skb_append_pagefrags() so that something other than MAX_SKB_FRAGS can be used (sysctl_max_skb_frags for example). (3) Add the skb_splice_from_iter() helper to handle splicing pages into skbuffs for MSG_SPLICE_PAGES that can be shared by TCP, IP/UDP and AF_UNIX. (4) Implement MSG_SPLICE_PAGES support in TCP. (5) Make do_tcp_sendpages() just wrap sendmsg() and then fold it in to its various callers. (6) Implement MSG_SPLICE_PAGES support in IP and make udp_sendpage() just a wrapper around sendmsg(). (7) Implement MSG_SPLICE_PAGES support in IP6/UDP6. (8) Implement MSG_SPLICE_PAGES support in AF_UNIX. (9) Make AF_UNIX copy unspliceable pages. Link: https://lore.kernel.org/r/20230316152618.711970-1-dhowells@redhat.com/ # v1 Link: https://lore.kernel.org/r/20230329141354.516864-1-dhowells@redhat.com/ # v2 Link: https://lore.kernel.org/r/20230331160914.1608208-1-dhowells@redhat.com/ # v3 Link: https://lore.kernel.org/r/20230405165339.3468808-1-dhowells@redhat.com/ # v4 Link: https://lore.kernel.org/r/20230406094245.3633290-1-dhowells@redhat.com/ # v5 Link: https://lore.kernel.org/r/20230411160902.4134381-1-dhowells@redhat.com/ # v6 Link: https://lore.kernel.org/r/20230515093345.396978-1-dhowells@redhat.com/ # v7 Link: https://lore.kernel.org/r/20230518113453.1350757-1-dhowells@redhat.com/ # v8 ==================== Link: https://lore.kernel.org/r/20230522121125.2595254-1-dhowells@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23unix: Convert unix_stream_sendpage() to use MSG_SPLICE_PAGESDavid Howells
Convert unix_stream_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> cc: Kuniyuki Iwashima <kuniyu@amazon.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23af_unix: Support MSG_SPLICE_PAGESDavid Howells
Make AF_UNIX sendmsg() support MSG_SPLICE_PAGES, splicing in pages from the source iterator if possible and copying the data in otherwise. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> cc: Kuniyuki Iwashima <kuniyu@amazon.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23ip: Remove ip_append_page()David Howells
ip_append_page() is no longer used with the removal of udp_sendpage(), so remove it. Signed-off-by: David Howells <dhowells@redhat.com> cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> cc: David Ahern <dsahern@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23udp: Convert udp_sendpage() to use MSG_SPLICE_PAGESDavid Howells
Convert udp_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than directly splicing in the pages itself. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> cc: David Ahern <dsahern@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23ip6, udp6: Support MSG_SPLICE_PAGESDavid Howells
Make IP6/UDP6 sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator if possible, copying the data if not. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> cc: David Ahern <dsahern@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23ip, udp: Support MSG_SPLICE_PAGESDavid Howells
Make IP/UDP sendmsg() support MSG_SPLICE_PAGES. This causes pages to be spliced from the source iterator. This allows ->sendpage() to be replaced by something that can handle multiple multipage folios in a single transaction. Signed-off-by: David Howells <dhowells@redhat.com> cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com> cc: David Ahern <dsahern@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked()David Howells
Fold do_tcp_sendpages() into its last remaining caller, tcp_sendpage_locked(). Signed-off-by: David Howells <dhowells@redhat.com> cc: David Ahern <dsahern@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23siw: Inline do_tcp_sendpages()David Howells
do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it, allowing do_tcp_sendpages() to be removed. This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Bernard Metzler <bmt@zurich.ibm.com> Reviewed-by: Tom Talpey <tom@talpey.com> cc: Jason Gunthorpe <jgg@ziepe.ca> cc: Leon Romanovsky <leon@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23tls: Inline do_tcp_sendpages()David Howells
do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it, allowing do_tcp_sendpages() to be removed. This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells <dhowells@redhat.com> cc: Boris Pismenny <borisp@nvidia.com> cc: John Fastabend <john.fastabend@gmail.com> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-23espintcp: Inline do_tcp_sendpages()David Howells
do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(), so inline it, allowing do_tcp_sendpages() to be removed. This is part of replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set. Signed-off-by: David Howells <dhowells@redhat.com> cc: Steffen Klassert <steffen.klassert@secunet.com> cc: Herbert Xu <herbert@gondor.apana.org.au> cc: David Ahern <dsahern@kernel.org> cc: Jens Axboe <axboe@kernel.dk> cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>