summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-08-29selftests: mptcp: join: no extra msg if no counterMatthieu Baerts (NGI0)
The checksum and fail counters might not be available. Then no need to display an extra message with missing info. While at it, fix the indentation around, which is wrong since the same commit. Fixes: 47867f0a7e83 ("selftests: mptcp: join: skip check if MIB counter not supported") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-29selftests: mptcp: join: check re-adding init endp with != idMatthieu Baerts (NGI0)
The initial subflow has a special local ID: 0. It is specific per connection. When a global endpoint is deleted and re-added later, it can have a different ID, but the kernel should still use the ID 0 if it corresponds to the initial address. This test validates this behaviour: the endpoint linked to the initial subflow is removed, and re-added with a different ID. Note that removing the initial subflow will not decrement the 'subflows' counters, which corresponds to the *additional* subflows. On the other hand, when the same endpoint is re-added, it will increment this counter, as it will be seen as an additional subflow this time. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-29mptcp: pm: reset MPC endp ID when re-addedMatthieu Baerts (NGI0)
The initial subflow has a special local ID: 0. It is specific per connection. When a global endpoint is deleted and re-added later, it can have a different ID -- most services managing the endpoints automatically don't force the ID to be the same as before. It is then important to track these modifications to be consistent with the ID being used for the address used by the initial subflow, not to confuse the other peer or to send the ID 0 for the wrong address. Now when removing an endpoint, msk->mpc_endpoint_id is reset if it corresponds to this endpoint. When adding a new endpoint, the same variable is updated if the address match the one of the initial subflow. Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-29mptcp: pm: skip connecting to already established sfMatthieu Baerts (NGI0)
The lookup_subflow_by_daddr() helper checks if there is already a subflow connected to this address. But there could be a subflow that is closing, but taking time due to some reasons: latency, losses, data to process, etc. If an ADD_ADDR is received while the endpoint is being closed, it is better to try connecting to it, instead of rejecting it: the peer which has sent the ADD_ADDR will not be notified that the ADD_ADDR has been rejected for this reason, and the expected subflow will not be created at the end. This helper should then only look for subflows that are established, or going to be, but not the ones being closed. Fixes: d84ad04941c3 ("mptcp: skip connecting the connected address") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-29mptcp: pm: send ACK on an active subflowMatthieu Baerts (NGI0)
Taking the first one on the list doesn't work in some cases, e.g. if the initial subflow is being removed. Pick another one instead of not sending anything. Fixes: 84dfe3677a6f ("mptcp: send out dedicated ADD_ADDR packet") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-29selftests: mptcp: join: check removing ID 0 endpointMatthieu Baerts (NGI0)
Removing the endpoint linked to the initial subflow should trigger a RM_ADDR for the right ID, and the removal of the subflow. That's what is now being verified in the "delete and re-add" test. Note that removing the initial subflow will not decrement the 'subflows' counters, which corresponds to the *additional* subflows. On the other hand, when the same endpoint is re-added, it will increment this counter, as it will be seen as an additional subflow this time. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-29mptcp: pm: fix RM_ADDR ID for the initial subflowMatthieu Baerts (NGI0)
The initial subflow has a special local ID: 0. When an endpoint is being deleted, it is then important to check if its address is not linked to the initial subflow to send the right ID. If there was an endpoint linked to the initial subflow, msk's mpc_endpoint_id field will be set. We can then use this info when an endpoint is being removed to see if it is linked to the initial subflow. So now, the correct IDs are passed to mptcp_pm_nl_rm_addr_or_subflow(), it is no longer needed to use mptcp_local_id_match(). Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-29mptcp: pm: reuse ID 0 after delete and re-addMatthieu Baerts (NGI0)
When the endpoint used by the initial subflow is removed and re-added later, the PM has to force the ID 0, it is a special case imposed by the MPTCP specs. Note that the endpoint should then need to be re-added reusing the same ID. Fixes: 3ad14f54bd74 ("mptcp: more accurate MPC endpoint tracking") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-29selftests: netfilter: nft_queue.sh: reduce test file size for debug buildFlorian Westphal
The sctp selftest is very slow on debug kernels. Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20240826192500.32efa22c@kernel.org/ Fixes: 4e97d521c2be ("selftests: netfilter: nft_queue.sh: sctp coverage") Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://patch.msgid.link/20240827090023.8917-1-fw@strlen.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-08-29Merge tag 'random-6.11-rc6-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator fix from Jason Donenfeld: "Reject invalid flags passed to vgetrandom() in the same way that getrandom() does, so that the behavior is the same, from Yann. The flags argument to getrandom() only has a behavioral effect on the function if the RNG isn't initialized yet, so vgetrandom() falls back to the syscall in that case. But if the RNG is initialized, all of the flags behave the same way, so vgetrandom() didn't bother checking them, and just ignored them entirely. But that doesn't account for invalid flags passed in, which need to be rejected so we can use them later" * tag 'random-6.11-rc6-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: random: vDSO: reject unknown getrandom() flags
2024-08-28Merge branch 'net-hisilicon-minor-fixes'Jakub Kicinski
Krzysztof Kozlowski says: ==================== net: hisilicon: minor fixes Minor fixes for hisilicon ethernet driver which look too trivial to be considered for current RC. ==================== Link: https://patch.msgid.link/20240827144421.52852-1-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28net: hisilicon: hns_mdio: fix OF node leak in probe()Krzysztof Kozlowski
Driver is leaking OF node reference from of_parse_phandle_with_fixed_args() in probe(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240827144421.52852-4-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28net: hisilicon: hns_dsaf_mac: fix OF node leak in hns_mac_get_info()Krzysztof Kozlowski
Driver is leaking OF node reference from of_parse_phandle_with_fixed_args() in hns_mac_get_info(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240827144421.52852-3-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28net: hisilicon: hip04: fix OF node leak in probe()Krzysztof Kozlowski
Driver is leaking OF node reference from of_parse_phandle_with_fixed_args() in probe(). Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240827144421.52852-2-krzysztof.kozlowski@linaro.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28net: dsa: mv88e6xxx: Remove stale commentAndy Shevchenko
GPIOF_DIR_* definitions are legacy and subject to remove. Taking this into account, remove stale comment. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20240827171005.2301845-1-andriy.shevchenko@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28net: busy-poll: use ktime_get_ns() instead of local_clock()Eric Dumazet
Typically, busy-polling durations are below 100 usec. When/if the busy-poller thread migrates to another cpu, local_clock() can be off by +/-2msec or more for small values of HZ, depending on the platform. Use ktimer_get_ns() to ensure deterministic behavior, which is the whole point of busy-polling. Fixes: 060212928670 ("net: add low latency socket poll") Fixes: 9a3c71aa8024 ("net: convert low latency sockets to sched_clock()") Fixes: 37089834528b ("sched, net: Fixup busy_loop_us_clock()") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Mina Almasry <almasrymina@google.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20240827114916.223377-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28net: ethtool: cable-test: Release RTNL when the PHY isn't foundMaxime Chevallier
Use the correct logic to check for the presence of a PHY device, and jump to a label that correctly releases RTNL in case of an error, as we are holding RTNL at that point. Fixes: 3688ff3077d3 ("net: ethtool: cable-test: Target the command to the requested PHY") Closes: https://lore.kernel.org/netdev/20240827104825.5cbe0602@fedora-3.home/T/#m6bc49cdcc5cfab0d162516b92916b944a01c833f Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240827092314.2500284-1-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28net: netvsc: Update default VMBus channelsErni Sri Satya Vennela
Change VMBus channels macro (VRSS_CHANNEL_DEFAULT) in Linux netvsc from 8 to 16 to align with Azure Windows VM and improve networking throughput. For VMs having less than 16 vCPUS, the channels depend on number of vCPUs. For greater than 16 vCPUs, set the channels to maximum of VRSS_CHANNEL_DEFAULT and number of physical cores / 2 which is returned by netif_get_num_default_rss_queues() as a way to optimize CPU resource utilization and scale for high-end processors with many cores. Maximum number of channels are by default set to 64. Based on this change the channel creation would change as follows: ----------------------------------------------------------------- | No. of vCPU | dev_info->num_chn | channels created | ----------------------------------------------------------------- | 1-16 | 16 | vCPU | | >16 | max(16,#cores/2) | min(64 , max(16,#cores/2)) | ----------------------------------------------------------------- Performance tests showed significant improvement in throughput: - 0.54% for 16 vCPUs - 0.83% for 32 vCPUs - 0.86% for 48 vCPUs - 9.72% for 64 vCPUs - 13.57% for 96 vCPUs Signed-off-by: Erni Sri Satya Vennela <ernis@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://patch.msgid.link/1724735791-22815-1-git-send-email-ernis@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28net: ftgmac100: Get link speed and duplex for NC-SIJacky Chou
The ethtool of this driver uses the phy API of ethtool to get the link information from PHY driver. Because the NC-SI is forced on 100Mbps and full duplex, the driver connect a fixed-link phy driver for NC-SI. The ethtool will get the link information from the fixed-link phy driver. Signed-off-by: Jacky Chou <jacky_chou@aspeedtech.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240827030513.481469-1-jacky_chou@aspeedtech.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28Merge branch 'tcp-take-better-care-of-tw_substate-and-tw_rcv_nxt'Jakub Kicinski
Eric Dumazet says: ==================== tcp: take better care of tw_substate and tw_rcv_nxt While reviewing Jason Xing recent commit (0d9e5df4a257 "tcp: avoid reusing FIN_WAIT2 when trying to find port in connect() process") I saw we could remove the volatile qualifier for tw_substate field, and I also added missing data-race annotations around tcptw->tw_rcv_nxt. ==================== Link: https://patch.msgid.link/20240827015250.3509197-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28tcp: annotate data-races around tcptw->tw_rcv_nxtEric Dumazet
No lock protects tcp tw fields. tcptw->tw_rcv_nxt can be changed from twsk_rcv_nxt_update() while other threads might read this field. Add READ_ONCE()/WRITE_ONCE() annotations, and make sure tcp_timewait_state_process() reads tcptw->tw_rcv_nxt only once. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20240827015250.3509197-3-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28tcp: remove volatile qualifier on tw_substateEric Dumazet
Using a volatile qualifier for a specific struct field is unusual. Use instead READ_ONCE()/WRITE_ONCE() where necessary. tcp_timewait_state_process() can change tw_substate while other threads are reading this field. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20240827015250.3509197-2-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28net/xen-netback: prevent UAF in xenvif_flush_hash()Jeongjun Park
During the list_for_each_entry_rcu iteration call of xenvif_flush_hash, kfree_rcu does not exist inside the rcu read critical section, so if kfree_rcu is called when the rcu grace period ends during the iteration, UAF occurs when accessing head->next after the entry becomes free. Therefore, to solve this, you need to change it to list_for_each_entry_safe. Signed-off-by: Jeongjun Park <aha310510@gmail.com> Link: https://patch.msgid.link/20240822181109.2577354-1-aha310510@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-28Merge tag 'wireless-2024-08-28' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Regressions: * wfx: fix for open network connection * iwlwifi: fix for hibernate (due to fast resume feature) * iwlwifi: fix for a few warnings that were recently added (had previously been messages not warnings) Previously broken: * mwifiex: fix static structures used for per-device data * iwlwifi: some harmless FW related messages were tagged too high priority * iwlwifi: scan buffers weren't checked correctly * mac80211: SKB leak on beacon error path * iwlwifi: fix ACPI table interop with certain BIOSes * iwlwifi: fix locking for link selection * mac80211: fix SSID comparison in beacon validation * tag 'wireless-2024-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: iwlwifi: clear trans->state earlier upon error wifi: wfx: repair open network AP mode wifi: mac80211: free skb on error path in ieee80211_beacon_get_ap() wifi: iwlwifi: mvm: don't wait for tx queues if firmware is dead wifi: iwlwifi: mvm: allow 6 GHz channels in MLO scan wifi: iwlwifi: mvm: pause TCM when the firmware is stopped wifi: iwlwifi: fw: fix wgds rev 3 exact size wifi: iwlwifi: mvm: take the mutex before running link selection wifi: iwlwifi: mvm: fix iwl_mvm_max_scan_ie_fw_cmd_room() wifi: iwlwifi: mvm: fix iwl_mvm_scan_fits() calculation wifi: iwlwifi: lower message level for FW buffer destination wifi: iwlwifi: mvm: fix hibernation wifi: mac80211: fix beacon SSID mismatch handling wifi: mwifiex: duplicate static structs used in driver instances ==================== Link: https://patch.msgid.link/20240828100151.23662-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-29Merge tag 'loongarch-fixes-6.11-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson Pull LoongArch fixes from Huacai Chen: "Remove the unused dma-direct.h, and some bug & warning fixes" * tag 'loongarch-fixes-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: LoongArch: KVM: Invalidate guest steal time address on vCPU reset LoongArch: Add ifdefs to fix LSX and LASX related warnings LoongArch: Define ARCH_IRQ_INIT_FLAGS as IRQ_NOPROBE LoongArch: Remove the unused dma-direct.h
2024-08-29Merge tag 'platform-drivers-x86-v6.11-5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform drivers fixes from Ilpo Järvinen: - platform/x86/amd/pmc: AMD 1Ah model 60h series support (2nd attempt) - asus-wmi: Prevent spurious rfkill on Asus Zenbook Duo - x86-android-tablets: Relax DMI match to cover another model * tag 'platform-drivers-x86-v6.11-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: x86-android-tablets: Make Lenovo Yoga Tab 3 X90F DMI match less strict platform/x86: asus-wmi: Fix spurious rfkill on UX8406MA platform/x86/amd/pmc: Extend support for PMC features on new AMD platform platform/x86/amd/pmc: Fix SMU command submission path on new AMD platform
2024-08-29Merge tag 'nfsd-6.11-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix a number of crashers - Update email address for an NFSD reviewer * tag 'nfsd-6.11-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: fs/nfsd: fix update of inode attrs in CB_GETATTR nfsd: fix potential UAF in nfsd4_cb_getattr_release nfsd: hold reference to delegation when updating it for cb_getattr MAINTAINERS: Update Olga Kornievskaia's email address nfsd: prevent panic for nfsv4.0 closed files in nfs4_show_open nfsd: ensure that nfsd4_fattr_args.context is zeroed out
2024-08-29Merge tag 'for-6.11-rc5-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix use-after-free when submitting bios for read, after an error and partially submitted bio the original one is freed while it can be still be accessed again - fix fstests case btrfs/301, with enabled quotas wait for delayed iputs when flushing delalloc - fix periodic block group reclaim, an unitialized value can be returned if there are no block groups to reclaim - fix build warning (-Wmaybe-uninitialized) * tag 'for-6.11-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix uninitialized return value from btrfs_reclaim_sweep() btrfs: fix a use-after-free when hitting errors inside btrfs_submit_chunk() btrfs: initialize last_extent_end to fix -Wmaybe-uninitialized warning in extent_fiemap() btrfs: run delayed iputs when flushing delalloc
2024-08-28Merge tag 'v6.11-rc5-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - two RDMA/smbdirect fixes and a minor cleanup - punch hole fix * tag 'v6.11-rc5-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Fix FALLOC_FL_PUNCH_HOLE support smb/client: fix rdma usage in smb2_async_writev() smb/client: remove unused rq_iter_size from struct smb_rqst smb/client: avoid dereferencing rdata=NULL in smb2_new_read_req()
2024-08-28Merge tag 'tpmdd-next-6.11-rc6' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull TPM fix from Jarkko Sakkinen: "A bug fix for tpm_ibmvtpm driver so that it will take the bus encryption into use" * tag 'tpmdd-next-6.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: tpm: ibmvtpm: Call tpm2_sessions_init() to initialize session support
2024-08-27Merge branch '100GbE' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2024-08-26 (ice) This series contains updates to ice driver only. Jake implements and uses rd32_poll_timeout to replace a jiffies loop for calling ice_sq_done. The rd32_poll_timeout() function is designed to allow simplifying other places in the driver where we need to read a register until it matches a known value. Jake, Bruce, and Przemek update ice_debug_cq() to be more robust, and more useful for tracing control queue messages sent and received by the device driver. Jake rewords several commands in the ice_control.c file which previously referred to the "Admin queue" when they were actually generic functions usable on any control queue. Jake removes the unused and unnecessary cmd_buf array allocation for send queues. This logic originally was going to be useful if we ever implemented asynchronous completion of transmit messages. This support is unlikely to materialize, so the overhead of allocating a command buffer is unnecessary. Sergey improves the log messages when the ice driver reports that the NVM version on the device is not supported by the driver. Now, these messages include both the discovered NVM version and the requested/expected NVM version. Aleksandr Mishin corrects overallocation of memory related to adding scheduler nodes. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: Adjust over allocation of memory in ice_sched_add_root_node() and ice_sched_add_node() ice: Report NVM version numbers on mismatch during load ice: remove unnecessary control queue cmd_buf arrays ice: reword comments referring to control queues ice: stop intermixing AQ commands/responses debug dumps ice: do not clutter debug logs with unused data ice: improve debug print for control queue messages ice: implement and use rd32_poll_timeout for ice_sq_done timeout ==================== Link: https://patch.msgid.link/20240826224655.133847-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27Merge branch 'net-dsa-microchip-add-ksz8895-ksz8864-switch-support'Jakub Kicinski
Tristram Ha says: ==================== net: dsa: microchip: Add KSZ8895/KSZ8864 switch support This series of patches is to add KSZ8895/KSZ8864 switch support to the KSZ DSA driver. ==================== Link: https://patch.msgid.link/BYAPR11MB3558B8A089C88DFFFC09B067EC8B2@BYAPR11MB3558.namprd11.prod.outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27net: dsa: microchip: Add KSZ8895/KSZ8864 switch supportTristram Ha
KSZ8895/KSZ8864 is a switch family between KSZ8863/73 and KSZ8795, so it shares some registers and functions in those switches already implemented in the KSZ DSA driver. Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Tested-by: Pieter Van Trappen <pieter.van.trappen@cern.ch> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27dt-bindings: net: dsa: microchip: Add KSZ8895/KSZ8864 switch supportTristram Ha
KSZ8895/KSZ8864 is a switch family developed before KSZ8795 and after KSZ8863, so it shares some registers and functions in those switches. KSZ8895 has 5 ports and so is more similar to KSZ8795. KSZ8864 is a 4-port version of KSZ8895. The first port is removed while port 5 remains as a host port. Signed-off-by: Tristram Ha <tristram.ha@microchip.com> Acked-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org> Link: https://patch.msgid.link/BYAPR11MB3558FD0717772263FAD86846EC8B2@BYAPR11MB3558.namprd11.prod.outlook.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27net/handshake: use sockfd_put() helperA K M Fazla Mehrab
Replace fput() with sockfd_put() in handshake_nl_done_doit(). Signed-off-by: A K M Fazla Mehrab <a.mehrab@bytedance.com> Reviewed-by: Chuck Lever <chuck.lever@oracle.com> Link: https://patch.msgid.link/20240826182652.2449359-1-a.mehrab@bytedance.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27net: mana: Implement get_ringparam/set_ringparam for manaShradha Gupta
Currently the values of WQs for RX and TX queues for MANA devices are hardcoded to default sizes. Allow configuring these values for MANA devices as ringparam configuration(get/set) through ethtool_ops. Pre-allocate buffers at the beginning of this operation, to prevent complete network loss in low-memory conditions. Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://patch.msgid.link/1724688461-12203-1-git-send-email-shradhagupta@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27sctp: fix association labeling in the duplicate COOKIE-ECHO caseOndrej Mosnacek
sctp_sf_do_5_2_4_dupcook() currently calls security_sctp_assoc_request() on new_asoc, but as it turns out, this association is always discarded and the LSM labels never get into the final association (asoc). This can be reproduced by having two SCTP endpoints try to initiate an association with each other at approximately the same time and then peel off the association into a new socket, which exposes the unitialized labels and triggers SELinux denials. Fix it by calling security_sctp_assoc_request() on asoc instead of new_asoc. Xin Long also suggested limit calling the hook only to cases A, B, and D, since in cases C and E the COOKIE ECHO chunk is discarded and the association doesn't enter the ESTABLISHED state, so rectify that as well. One related caveat with SELinux and peer labeling: When an SCTP connection is set up simultaneously in this way, we will end up with an association that is initialized with security_sctp_assoc_request() on both sides, so the MLS component of the security context of the association will get swapped between the peers, instead of just one side setting it to the other's MLS component. However, at that point security_sctp_assoc_request() had already been called on both sides in sctp_sf_do_unexpected_init() (on a temporary association) and thus if the exchange didn't fail before due to MLS, it won't fail now either (most likely both endpoints have the same MLS range). Tested by: - reproducer from https://src.fedoraproject.org/tests/selinux/pull-request/530 - selinux-testsuite (https://github.com/SELinuxProject/selinux-testsuite/) - sctp-tests (https://github.com/sctp/sctp-tests) - no tests failed that wouldn't fail also without the patch applied Fixes: c081d53f97a1 ("security: pass asoc to sctp_assoc_request and sctp_sk_clone") Suggested-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Ondrej Mosnacek <omosnace@redhat.com> Acked-by: Xin Long <lucien.xin@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> (LSM/SELinux) Link: https://patch.msgid.link/20240826130711.141271-1-omosnace@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27Merge branch 'mptcp-close-subflow-when-receiving-tcp-fin-and-misc'Jakub Kicinski
Matthieu Baerts says: ==================== mptcp: close subflow when receiving TCP+FIN and misc. Here are different fixes: Patch 1 closes the subflow after having received a FIN, instead of leaving it half-closed until the end of the MPTCP connection. A fix for v5.12. Patch 2 validates the previous patch. Patch 3 is a fix for a recent fix to check both directions for the backup flag. It can follow the 'Fixes' commit and be backported up to v5.7. Patch 4 adds a missing \n at the end of pr_debug(), causing debug messages to be displayed with a delay, which confuses the debugger. A fix for v5.6. ==================== Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-0-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27mptcp: pr_debug: add missing \n at the endMatthieu Baerts (NGI0)
pr_debug() have been added in various places in MPTCP code to help developers to debug some situations. With the dynamic debug feature, it is easy to enable all or some of them, and asks users to reproduce issues with extra debug. Many of these pr_debug() don't end with a new line, while no 'pr_cont()' are used in MPTCP code. So the goal was not to display multiple debug messages on one line: they were then not missing the '\n' on purpose. Not having the new line at the end causes these messages to be printed with a delay, when something else needs to be printed. This issue is not visible when many messages need to be printed, but it is annoying and confusing when only specific messages are expected, e.g. # echo "func mptcp_pm_add_addr_echoed +fmp" \ > /sys/kernel/debug/dynamic_debug/control # ./mptcp_join.sh "signal address"; \ echo "$(awk '{print $1}' /proc/uptime) - end"; \ sleep 5s; \ echo "$(awk '{print $1}' /proc/uptime) - restart"; \ ./mptcp_join.sh "signal address" 013 signal address (...) 10.75 - end 15.76 - restart 013 signal address [ 10.367935] mptcp:mptcp_pm_add_addr_echoed: MPTCP: msk=(...) (...) => a delay of 5 seconds: printed with a 10.36 ts, but after 'restart' which was printed at the 15.76 ts. The 'Fixes' tag here below points to the first pr_debug() used without '\n' in net/mptcp. This patch could be split in many small ones, with different Fixes tag, but it doesn't seem worth it, because it is easy to re-generate this patch with this simple 'sed' command: git grep -l pr_debug -- net/mptcp | xargs sed -i "s/\(pr_debug(\".*[^n]\)\(\"[,)]\)/\1\\\n\2/g" So in case of conflicts, simply drop the modifications, and launch this command. Fixes: f870fa0b5768 ("mptcp: Add MPTCP socket stubs") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang <geliang@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-4-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27mptcp: sched: check both backup in retransMatthieu Baerts (NGI0)
The 'mptcp_subflow_context' structure has two items related to the backup flags: - 'backup': the subflow has been marked as backup by the other peer - 'request_bkup': the backup flag has been set by the host Looking only at the 'backup' flag can make sense in some cases, but it is not the behaviour of the default packet scheduler when selecting paths. As explained in the commit b6a66e521a20 ("mptcp: sched: check both directions for backup"), the packet scheduler should look at both flags, because that was the behaviour from the beginning: the 'backup' flag was set by accident instead of the 'request_bkup' one. Now that the latter has been fixed, get_retrans() needs to be adapted as well. Fixes: b6a66e521a20 ("mptcp: sched: check both directions for backup") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-3-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27selftests: mptcp: join: cannot rm sf if closedMatthieu Baerts (NGI0)
Thanks to the previous commit, the MPTCP subflows are now closed on both directions even when only the MPTCP path-manager of one peer asks for their closure. In the two tests modified here -- "userspace pm add & remove address" and "userspace pm create destroy subflow" -- one peer is controlled by the userspace PM, and the other one by the in-kernel PM. When the userspace PM sends a RM_ADDR notification, the in-kernel PM will automatically react by closing all subflows using this address. Now, thanks to the previous commit, the subflows are properly closed on both directions, the userspace PM can then no longer closes the same subflows if they are already closed. Before, it was OK to do that, because the subflows were still half-opened, still OK to send a RM_ADDR. In other words, thanks to the previous commit closing the subflows, an error will be returned to the userspace if it tries to close a subflow that has already been closed. So no need to run this command, which mean that the linked counters will then not be incremented. These tests are then no longer sending both a RM_ADDR, then closing the linked subflow just after. The test with the userspace PM on the server side is now removing one subflow linked to one address, then sending a RM_ADDR for another address. The test with the userspace PM on the client side is now only removing the subflow that was previously created. Fixes: 4369c198e599 ("selftests: mptcp: test userspace pm out of transfer") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-2-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27mptcp: close subflow when receiving TCP+FINMatthieu Baerts (NGI0)
When a peer decides to close one subflow in the middle of a connection having multiple subflows, the receiver of the first FIN should accept that, and close the subflow on its side as well. If not, the subflow will stay half closed, and would even continue to be used until the end of the MPTCP connection or a reset from the network. The issue has not been seen before, probably because the in-kernel path-manager always sends a RM_ADDR before closing the subflow. Upon the reception of this RM_ADDR, the other peer will initiate the closure on its side as well. On the other hand, if the RM_ADDR is lost, or if the path-manager of the other peer only closes the subflow without sending a RM_ADDR, the subflow would switch to TCP_CLOSE_WAIT, but that's it, leaving the subflow half-closed. So now, when the subflow switches to the TCP_CLOSE_WAIT state, and if the MPTCP connection has not been closed before with a DATA_FIN, the kernel owning the subflow schedules its worker to initiate the closure on its side as well. This issue can be easily reproduced with packetdrill, as visible in [1], by creating an additional subflow, injecting a FIN+ACK before sending the DATA_FIN, and expecting a FIN+ACK in return. Fixes: 40947e13997a ("mptcp: schedule worker when subflow is closed") Cc: stable@vger.kernel.org Link: https://github.com/multipath-tcp/packetdrill/pull/154 [1] Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20240826-net-mptcp-close-extra-sf-fin-v1-1-905199fe1172@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27tcp: fix forever orphan socket caused by tcp_abortXueming Feng
We have some problem closing zero-window fin-wait-1 tcp sockets in our environment. This patch come from the investigation. Previously tcp_abort only sends out reset and calls tcp_done when the socket is not SOCK_DEAD, aka orphan. For orphan socket, it will only purging the write queue, but not close the socket and left it to the timer. While purging the write queue, tp->packets_out and sk->sk_write_queue is cleared along the way. However tcp_retransmit_timer have early return based on !tp->packets_out and tcp_probe_timer have early return based on !sk->sk_write_queue. This caused ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 not being resched and socket not being killed by the timers, converting a zero-windowed orphan into a forever orphan. This patch removes the SOCK_DEAD check in tcp_abort, making it send reset to peer and close the socket accordingly. Preventing the timer-less orphan from happening. According to Lorenzo's email in the v1 thread, the check was there to prevent force-closing the same socket twice. That situation is handled by testing for TCP_CLOSE inside lock, and returning -ENOENT if it is already closed. The -ENOENT code comes from the associate patch Lorenzo made for iproute2-ss; link attached below, which also conform to RFC 9293. At the end of the patch, tcp_write_queue_purge(sk) is removed because it was already called in tcp_done_with_error(). p.s. This is the same patch with v2. Resent due to mis-labeled "changes requested" on patchwork.kernel.org. Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send-email-lorenzo@google.com/ Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.") Signed-off-by: Xueming Feng <kuro@kuroa.me> Tested-by: Lorenzo Colitti <lorenzo@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@kuroa.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27net: phy: vitesse: implement MDI-X configuration in vsc73xxPawel Dembicki
This commit introduces MDI-X configuration support in vsc73xx phys. Vsc73xx supports only auto mode or forced MDI. Vsc73xx have auto MDI-X disabled by default in forced speed mode. This commit enables it. Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20240826093710.511837-1-paweldembicki@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27Merge branch 'net-fix-module-autoloading'Jakub Kicinski
Liao Chen says: ==================== net: fix module autoloading This patchset aims to enable autoloading of some net modules. By registering MDT, the kernel is allowed to automatically bind modules to devices that match the specified compatible strings. ==================== Link: https://patch.msgid.link/20240826091858.369910-1-liaochen4@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27net: airoha: fix module autoloadingLiao Chen
Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded based on the alias from of_device_id table. Signed-off-by: Liao Chen <liaochen4@huawei.com> Acked-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://patch.msgid.link/20240826091858.369910-4-liaochen4@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27net: ag71xx: fix module autoloadingLiao Chen
Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded based on the alias from of_device_id table. Signed-off-by: Liao Chen <liaochen4@huawei.com> Link: https://patch.msgid.link/20240826091858.369910-3-liaochen4@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27net: dm9051: fix module autoloadingLiao Chen
Add MODULE_DEVICE_TABLE(), so modules could be properly autoloaded based on the alias from of_device_id table. Signed-off-by: Liao Chen <liaochen4@huawei.com> Link: https://patch.msgid.link/20240826091858.369910-2-liaochen4@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27net: txgbe: use pci_dev_id() helperYu Liao
PCI core API pci_dev_id() can be used to get the BDF number for a PCI device. We don't need to compose it manually. Use pci_dev_id() to simplify the code a little bit. Signed-off-by: Yu Liao <liaoyu15@huawei.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240826012100.3975175-1-liaoyu15@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-27gtp: fix a potential NULL pointer dereferenceCong Wang
When sockfd_lookup() fails, gtp_encap_enable_socket() returns a NULL pointer, but its callers only check for error pointers thus miss the NULL pointer case. Fix it by returning an error pointer with the error code carried from sockfd_lookup(). (I found this bug during code inspection.) Fixes: 1e3a3abd8b28 ("gtp: make GTP sockets in gtp_newlink optional") Cc: Andreas Schultz <aschultz@tpip.net> Cc: Harald Welte <laforge@gnumonks.org> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Pablo Neira Ayuso <pablo@netfilter.org> Link: https://patch.msgid.link/20240825191638.146748-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>