summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-02-25ixgbe: fix media cage present detection for E610 devicePiotr Kwapulinski
The commit 23c0e5a16bcc ("ixgbe: Add link management support for E610 device") introduced incorrect checking of media cage presence for E610 device. Fix it. Fixes: 23c0e5a16bcc ("ixgbe: Add link management support for E610 device") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/e7d73b32-f12a-49d1-8b60-1ef83359ec13@stanley.mountain/ Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Piotr Kwapulinski <piotr.kwapulinski@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Bharath R <bharath.r@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250224190647.3601930-6-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25iavf: fix circular lock dependency with netdev_lockJacob Keller
We have recently seen reports of lockdep circular lock dependency warnings when loading the iAVF driver: [ 1504.790308] ====================================================== [ 1504.790309] WARNING: possible circular locking dependency detected [ 1504.790310] 6.13.0 #net_next_rt.c2933b2befe2.el9 Not tainted [ 1504.790311] ------------------------------------------------------ [ 1504.790312] kworker/u128:0/13566 is trying to acquire lock: [ 1504.790313] ffff97d0e4738f18 (&dev->lock){+.+.}-{4:4}, at: register_netdevice+0x52c/0x710 [ 1504.790320] [ 1504.790320] but task is already holding lock: [ 1504.790321] ffff97d0e47392e8 (&adapter->crit_lock){+.+.}-{4:4}, at: iavf_finish_config+0x37/0x240 [iavf] [ 1504.790330] [ 1504.790330] which lock already depends on the new lock. [ 1504.790330] [ 1504.790330] [ 1504.790330] the existing dependency chain (in reverse order) is: [ 1504.790331] [ 1504.790331] -> #1 (&adapter->crit_lock){+.+.}-{4:4}: [ 1504.790333] __lock_acquire+0x52d/0xbb0 [ 1504.790337] lock_acquire+0xd9/0x330 [ 1504.790338] mutex_lock_nested+0x4b/0xb0 [ 1504.790341] iavf_finish_config+0x37/0x240 [iavf] [ 1504.790347] process_one_work+0x248/0x6d0 [ 1504.790350] worker_thread+0x18d/0x330 [ 1504.790352] kthread+0x10e/0x250 [ 1504.790354] ret_from_fork+0x30/0x50 [ 1504.790357] ret_from_fork_asm+0x1a/0x30 [ 1504.790361] [ 1504.790361] -> #0 (&dev->lock){+.+.}-{4:4}: [ 1504.790364] check_prev_add+0xf1/0xce0 [ 1504.790366] validate_chain+0x46a/0x570 [ 1504.790368] __lock_acquire+0x52d/0xbb0 [ 1504.790370] lock_acquire+0xd9/0x330 [ 1504.790371] mutex_lock_nested+0x4b/0xb0 [ 1504.790372] register_netdevice+0x52c/0x710 [ 1504.790374] iavf_finish_config+0xfa/0x240 [iavf] [ 1504.790379] process_one_work+0x248/0x6d0 [ 1504.790381] worker_thread+0x18d/0x330 [ 1504.790383] kthread+0x10e/0x250 [ 1504.790385] ret_from_fork+0x30/0x50 [ 1504.790387] ret_from_fork_asm+0x1a/0x30 [ 1504.790389] [ 1504.790389] other info that might help us debug this: [ 1504.790389] [ 1504.790389] Possible unsafe locking scenario: [ 1504.790389] [ 1504.790390] CPU0 CPU1 [ 1504.790391] ---- ---- [ 1504.790391] lock(&adapter->crit_lock); [ 1504.790393] lock(&dev->lock); [ 1504.790394] lock(&adapter->crit_lock); [ 1504.790395] lock(&dev->lock); [ 1504.790397] [ 1504.790397] *** DEADLOCK *** This appears to be caused by the change in commit 5fda3f35349b ("net: make netdev_lock() protect netdev->reg_state"), which added a netdev_lock() in register_netdevice. The iAVF driver calls register_netdevice() from iavf_finish_config(), as a final stage of its state machine post-probe. It currently takes the RTNL lock, then the netdev lock, and then the device critical lock. This pattern is used throughout the driver. Thus there is a strong dependency that the crit_lock should not be acquired before the net device lock. The change to register_netdevice creates an ABBA lock order violation because the iAVF driver is holding the crit_lock while calling register_netdevice, which then takes the netdev_lock. It seems likely that future refactors could result in netdev APIs which hold the netdev_lock while calling into the driver. This means that we should not re-order the locks so that netdev_lock is acquired after the device private crit_lock. Instead, notice that we already release the netdev_lock prior to calling the register_netdevice. This flow only happens during the early driver initialization as we transition through the __IAVF_STARTUP, __IAVF_INIT_VERSION_CHECK, __IAVF_INIT_GET_RESOURCES, etc. Analyzing the places where we take crit_lock in the driver there are two sources: a) several of the work queue tasks including adminq_task, watchdog_task, reset_task, and the finish_config task. b) various callbacks which ultimately stem back to .ndo operations or ethtool operations. The latter cannot be triggered until after the netdevice registration is completed successfully. The iAVF driver uses alloc_ordered_workqueue, which is an unbound workqueue that has a max limit of 1, and thus guarantees that only a single work item on the queue is executing at any given time, so none of the other work threads could be executing due to the ordered workqueue guarantees. The iavf_finish_config() function also does not do anything else after register_netdevice, unless it fails. It seems unlikely that the driver private crit_lock is protecting anything that register_netdevice() itself touches. Thus, to fix this ABBA lock violation, lets simply release the adapter->crit_lock as well as netdev_lock prior to calling register_netdevice(). We do still keep holding the RTNL lock as required by the function. If we do fail to register the netdevice, then we re-acquire the adapter critical lock to finish the transition back to __IAVF_INIT_CONFIG_ADAPTER. This ensures every call where both netdev_lock and the adapter->crit_lock are acquired under the same ordering. Fixes: afc664987ab3 ("eth: iavf: extend the netdev_lock usage") Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250224190647.3601930-5-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25ice: Avoid setting default Rx VSI twice in switchdev setupMarcin Szycik
As part of switchdev environment setup, uplink VSI is configured as default for both Tx and Rx. Default Rx VSI is also used by promiscuous mode. If promisc mode is enabled and an attempt to enter switchdev mode is made, the setup will fail because Rx VSI is already configured as default (rule exists). Reproducer: devlink dev eswitch set $PF1_PCI mode switchdev ip l s $PF1 up ip l s $PF1 promisc on echo 1 > /sys/class/net/$PF1/device/sriov_numvfs In switchdev setup, use ice_set_dflt_vsi() instead of plain ice_cfg_dflt_vsi(), which avoids repeating setting default VSI for Rx if it's already configured. Fixes: 50d62022f455 ("ice: default Tx rule instead of to queue") Reported-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Closes: https://lore.kernel.org/intel-wired-lan/PH0PR11MB50138B635F2E5CEB7075325D961F2@PH0PR11MB5013.namprd11.prod.outlook.com Reviewed-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250224190647.3601930-3-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25ice: Fix deinitializing VF in error pathMarcin Szycik
If ice_ena_vfs() fails after calling ice_create_vf_entries(), it frees all VFs without removing them from snapshot PF-VF mailbox list, leading to list corruption. Reproducer: devlink dev eswitch set $PF1_PCI mode switchdev ip l s $PF1 up ip l s $PF1 promisc on sleep 1 echo 1 > /sys/class/net/$PF1/device/sriov_numvfs sleep 1 echo 1 > /sys/class/net/$PF1/device/sriov_numvfs Trace (minimized): list_add corruption. next->prev should be prev (ffff8882e241c6f0), but was 0000000000000000. (next=ffff888455da1330). kernel BUG at lib/list_debug.c:29! RIP: 0010:__list_add_valid_or_report+0xa6/0x100 ice_mbx_init_vf_info+0xa7/0x180 [ice] ice_initialize_vf_entry+0x1fa/0x250 [ice] ice_sriov_configure+0x8d7/0x1520 [ice] ? __percpu_ref_switch_mode+0x1b1/0x5d0 ? __pfx_ice_sriov_configure+0x10/0x10 [ice] Sometimes a KASAN report can be seen instead with a similar stack trace: BUG: KASAN: use-after-free in __list_add_valid_or_report+0xf1/0x100 VFs are added to this list in ice_mbx_init_vf_info(), but only removed in ice_free_vfs(). Move the removing to ice_free_vf_entries(), which is also being called in other places where VFs are being removed (including ice_free_vfs() itself). Fixes: 8cd8a6b17d27 ("ice: move VF overflow message count into struct ice_mbx_vf_info") Reported-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Closes: https://lore.kernel.org/intel-wired-lan/PH0PR11MB50138B635F2E5CEB7075325D961F2@PH0PR11MB5013.namprd11.prod.outlook.com Reviewed-by: Martyna Szapar-Mudlaw <martyna.szapar-mudlaw@linux.intel.com> Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20250224190647.3601930-2-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25enic: add dependency on Page PoolJohn Daley
Driver was not configured to select page_pool, causing a compile error if page pool module was not already selected. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202502211253.3XRosM9I-lkp@intel.com/ Fixes: d24cb52b2d8a ("enic: Use the Page Pool API for RX") Reviewed-by: Nelson Escobar <neescoba@cisco.com> Signed-off-by: John Daley <johndale@cisco.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250224234350.23157-2-johndale@cisco.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25Merge branch 'mlx5-next' of ↵Jakub Kicinski
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux Tariq Toukan says: ==================== mlx5-next updates 2025-02-24 The following pull-request contains common mlx5 updates for your *net-next* tree. * 'mlx5-next' of git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux: net/mlx5: Change POOL_NEXT_SIZE define value and make it global net/mlx5: Add new health syndrome error and crr bit offset ==================== Link: https://patch.msgid.link/20250224212446.523259-1-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25Merge branch 'mptcp-misc-fixes'Jakub Kicinski
Matthieu Baerts says: ==================== mptcp: misc. fixes Here are two unrelated fixes, plus an extra patch: - Patch 1: prevent a warning by removing an unneeded and incorrect small optimisation in the path-manager. A fix for v5.10. - Patch 2: reset a subflow when MPTCP opts have been dropped after having correctly added a new path. A fix for v5.19. - Patch 3: add a safety check to prevent issues like the one fixed by the second patch. ==================== Link: https://patch.msgid.link/20250224-net-mptcp-misc-fixes-v1-0-f550f636b435@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25mptcp: safety check before fallbackMatthieu Baerts (NGI0)
Recently, some fallback have been initiated, while the connection was not supposed to fallback. Add a safety check with a warning to detect when an wrong attempt to fallback is being done. This should help detecting any future issues quicker. Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250224-net-mptcp-misc-fixes-v1-3-f550f636b435@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25mptcp: reset when MPTCP opts are dropped after joinMatthieu Baerts (NGI0)
Before this patch, if the checksum was not used, the subflow was only reset if map_data_len was != 0. If there were no MPTCP options or an invalid mapping, map_data_len was not set to the data len, and then the subflow was not reset as it should have been, leaving the MPTCP connection in a wrong fallback mode. This map_data_len condition has been introduced to handle the reception of the infinite mapping. Instead, a new dedicated mapping error could have been returned and treated as a special case. However, the commit 31bf11de146c ("mptcp: introduce MAPPING_BAD_CSUM") has been introduced by Paolo Abeni soon after, and backported later on to stable. It better handle the csum case, and it means the exception for valid_csum_seen in subflow_can_fallback(), plus this one for the infinite mapping in subflow_check_data_avail(), are no longer needed. In other words, the code can be simplified there: a fallback should only be done if msk->allow_infinite_fallback is set. This boolean is set to false once MPTCP-specific operations acting on the whole MPTCP connection vs the initial path have been done, e.g. a second path has been created, or an MPTCP re-injection -- yes, possible even with a single subflow. The subflow_can_fallback() helper can then be dropped, and replaced by this single condition. This also makes the code clearer: a fallback should only be done if it is possible to do so. While at it, no need to set map_data_len to 0 in get_mapping_status() for the infinite mapping case: it will be set to skb->len just after, at the end of subflow_check_data_avail(), and not read in between. Fixes: f8d4bcacff3b ("mptcp: infinite mapping receiving") Cc: stable@vger.kernel.org Reported-by: Chester A. Unal <chester.a.unal@xpedite-tech.com> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/544 Acked-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Tested-by: Chester A. Unal <chester.a.unal@xpedite-tech.com> Link: https://patch.msgid.link/20250224-net-mptcp-misc-fixes-v1-2-f550f636b435@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25mptcp: always handle address removal under msk socket lockPaolo Abeni
Syzkaller reported a lockdep splat in the PM control path: WARNING: CPU: 0 PID: 6693 at ./include/net/sock.h:1711 sock_owned_by_me include/net/sock.h:1711 [inline] WARNING: CPU: 0 PID: 6693 at ./include/net/sock.h:1711 msk_owned_by_me net/mptcp/protocol.h:363 [inline] WARNING: CPU: 0 PID: 6693 at ./include/net/sock.h:1711 mptcp_pm_nl_addr_send_ack+0x57c/0x610 net/mptcp/pm_netlink.c:788 Modules linked in: CPU: 0 UID: 0 PID: 6693 Comm: syz.0.205 Not tainted 6.14.0-rc2-syzkaller-00303-gad1b832bf1cf #0 Hardware name: Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024 RIP: 0010:sock_owned_by_me include/net/sock.h:1711 [inline] RIP: 0010:msk_owned_by_me net/mptcp/protocol.h:363 [inline] RIP: 0010:mptcp_pm_nl_addr_send_ack+0x57c/0x610 net/mptcp/pm_netlink.c:788 Code: 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc e8 ca 7b d3 f5 eb b9 e8 c3 7b d3 f5 90 0f 0b 90 e9 dd fb ff ff e8 b5 7b d3 f5 90 <0f> 0b 90 e9 3e fb ff ff 44 89 f1 80 e1 07 38 c1 0f 8c eb fb ff ff RSP: 0000:ffffc900034f6f60 EFLAGS: 00010283 RAX: ffffffff8bee3c2b RBX: 0000000000000001 RCX: 0000000000080000 RDX: ffffc90004d42000 RSI: 000000000000a407 RDI: 000000000000a408 RBP: ffffc900034f7030 R08: ffffffff8bee37f6 R09: 0100000000000000 R10: dffffc0000000000 R11: ffffed100bcc62e4 R12: ffff88805e6316e0 R13: ffff88805e630c00 R14: dffffc0000000000 R15: ffff88805e630c00 FS: 00007f7e9a7e96c0(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000001b2fd18ff8 CR3: 0000000032c24000 CR4: 00000000003526f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> mptcp_pm_remove_addr+0x103/0x1d0 net/mptcp/pm.c:59 mptcp_pm_remove_anno_addr+0x1f4/0x2f0 net/mptcp/pm_netlink.c:1486 mptcp_nl_remove_subflow_and_signal_addr net/mptcp/pm_netlink.c:1518 [inline] mptcp_pm_nl_del_addr_doit+0x118d/0x1af0 net/mptcp/pm_netlink.c:1629 genl_family_rcv_msg_doit net/netlink/genetlink.c:1115 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0xb1f/0xec0 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x206/0x480 net/netlink/af_netlink.c:2543 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1322 [inline] netlink_unicast+0x7f6/0x990 net/netlink/af_netlink.c:1348 netlink_sendmsg+0x8de/0xcb0 net/netlink/af_netlink.c:1892 sock_sendmsg_nosec net/socket.c:718 [inline] __sock_sendmsg+0x221/0x270 net/socket.c:733 ____sys_sendmsg+0x53a/0x860 net/socket.c:2573 ___sys_sendmsg net/socket.c:2627 [inline] __sys_sendmsg+0x269/0x350 net/socket.c:2659 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f7e9998cde9 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f7e9a7e9038 EFLAGS: 00000246 ORIG_RAX: 000000000000002e RAX: ffffffffffffffda RBX: 00007f7e99ba5fa0 RCX: 00007f7e9998cde9 RDX: 000000002000c094 RSI: 0000400000000000 RDI: 0000000000000007 RBP: 00007f7e99a0e2a0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 0000000000000000 R14: 00007f7e99ba5fa0 R15: 00007fff49231088 Indeed the PM can try to send a RM_ADDR over a msk without acquiring first the msk socket lock. The bugged code-path comes from an early optimization: when there are no subflows, the PM should (usually) not send RM_ADDR notifications. The above statement is incorrect, as without locks another process could concurrent create a new subflow and cause the RM_ADDR generation. Additionally the supposed optimization is not very effective even performance-wise, as most mptcp sockets should have at least one subflow: the MPC one. Address the issue removing the buggy code path, the existing "slow-path" will handle correctly even the edge case. Fixes: b6c08380860b ("mptcp: remove addr and subflow in PM netlink") Cc: stable@vger.kernel.org Reported-by: syzbot+cd3ce3d03a3393ae9700@syzkaller.appspotmail.com Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/546 Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250224-net-mptcp-misc-fixes-v1-1-f550f636b435@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: wangxun: fix LIBWX dependenciesArnd Bergmann
Selecting LIBWX requires that its dependencies are met first: WARNING: unmet direct dependencies detected for LIBWX Depends on [m]: NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_WANGXUN [=y] && PTP_1588_CLOCK_OPTIONAL [=m] Selected by [y]: - TXGBE [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_WANGXUN [=y] && PCI [=y] && COMMON_CLK [=y] && I2C_DESIGNWARE_PLATFORM [=y] ld.lld-21: error: undefined symbol: ptp_schedule_worker >>> referenced by wx_ptp.c:747 (/home/arnd/arm-soc/drivers/net/ethernet/wangxun/libwx/wx_ptp.c:747) >>> drivers/net/ethernet/wangxun/libwx/wx_ptp.o:(wx_ptp_reset) in archive vmlinux.a Add the smae dependency on PTP_1588_CLOCK_OPTIONAL to the two driver using this library module. Fixes: 06e75161b9d4 ("net: wangxun: Add support for PTP clock") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20250224140516.1168214-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25Merge branch 'symmetric-or-xor-rss-hash'Jakub Kicinski
Gal Pressman says: ==================== Symmetric OR-XOR RSS hash Add support for a new type of input_xfrm: Symmetric OR-XOR. Symmetric OR-XOR performs hash as follows: (SRC_IP | DST_IP, SRC_IP ^ DST_IP, SRC_PORT | DST_PORT, SRC_PORT ^ DST_PORT) Configuration is done through ethtool -x/X command. For mlx5, the default is already symmetric hash, this patch now exposes this to userspace and allows enabling/disabling of the feature. v5: https://lore.kernel.org/20250220113435.417487-1-gal@nvidia.com v4: https://lore.kernel.org/20250216182453.226325-1-gal@nvidia.com v3: https://lore.kernel.org/20250205135341.542720-1-gal@nvidia.com v2: https://lore.kernel.org/20250203150039.519301-1-gal@nvidia.com ==================== Link: https://patch.msgid.link/20250224174416.499070-1-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25selftests: drv-net-hw: Add a test for symmetric RSS hashGal Pressman
Add a selftest that verifies symmetric RSS hash is working as intended. The test runs iterations of traffic, swapping the src/dst UDP ports, and verifies that the same RX queue is receiving the traffic in both cases. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250224174416.499070-5-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25selftests: drv-net: Make rand_port() get a port more reliablyGal Pressman
Instead of guessing a port and checking whether it's available, get an available port from the OS. Reviewed-by: Nimrod Oren <noren@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250224174416.499070-4-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net/mlx5e: Symmetric OR-XOR RSS hash controlGal Pressman
Allow control over the symmetric RSS hash, which was previously set to enabled by default by the driver. Symmetric OR-XOR RSS can now be queried and controlled using the 'ethtool -x/X' command. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Link: https://patch.msgid.link/20250224174416.499070-3-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25ethtool: Symmetric OR-XOR RSS hashGal Pressman
Add an additional type of symmetric RSS hash type: OR-XOR. The "Symmetric-OR-XOR" algorithm transforms the input as follows: (SRC_IP | DST_IP, SRC_IP ^ DST_IP, SRC_PORT | DST_PORT, SRC_PORT ^ DST_PORT) Change 'cap_rss_sym_xor_supported' to 'supported_input_xfrm', a bitmap of supported RXH_XFRM_* types. Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/20250224174416.499070-2-gal@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25drivers: net: xgene: Don't use "proxy" headersAndy Shevchenko
Update header inclusions to follow IWYU (Include What You Use) principle. In this case replace *gpio.h, which are subject to remove by the GPIOLIB subsystem, with the respective headers that are being used by the driver. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://patch.msgid.link/20250224120037.3801609-1-andriy.shevchenko@linux.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25Merge branch 'net-stmmac-dwc-qos-clean-up-clock-initialisation'Jakub Kicinski
Russell King says: ==================== net: stmmac: dwc-qos: clean up clock initialisation My single v1 patch has become two patches as a result of the build error, as it appears this code uses "data" differently from others. v2 still produced build warnings despite local builds being clean, so v3 addresses those. The first patch brings some consistency with other drivers, naming local variables that refer to struct plat_stmmacenet_data as "plat_dat", as is used elsewhere in this driver. ==================== Link: https://patch.msgid.link/Z7yj_BZa6yG02KcI@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: stmmac: dwc-qos: clean up clock initialisationRussell King (Oracle)
Clean up the clock initialisation by providing a helper to find a named clock in the bulk clocks, and provide the name of the stmmac clock in match data so we can locate the stmmac clock in generic code. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/E1tmbhj-004vSz-Pt@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: stmmac: dwc-qos: name struct plat_stmmacenet_data consistentlyRussell King (Oracle)
Most of the stmmac driver uses "plat_dat" to name the platform data, and "data" is used for the glue driver data. make dwc-qos follow this pattern to avoid silly mistakes. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Thierry Reding <treding@nvidia.com> Link: https://patch.msgid.link/E1tmbhe-004vSt-M3@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25tcp: devmem: don't write truncated dmabuf CMSGs to userspaceStanislav Fomichev
Currently, we report -ETOOSMALL (err) only on the first iteration (!sent). When we get put_cmsg error after a bunch of successful put_cmsg calls, we don't signal the error at all. This might be confusing on the userspace side which will see truncated CMSGs but no MSG_CTRUNC signal. Consider the following case: - sizeof(struct cmsghdr) = 16 - sizeof(struct dmabuf_cmsg) = 24 - total cmsg size (CMSG_LEN) = 40 (16+24) When calling recvmsg with msg_controllen=60, the userspace will receive two(!) dmabuf_cmsg(s), the first one will be a valid one and the second one will be silently truncated. There is no easy way to discover the truncation besides doing something like "cm->cmsg_len != CMSG_LEN(sizeof(dmabuf_cmsg))". Introduce new put_devmem_cmsg wrapper that reports an error instead of doing the truncation. Mina suggests that it's the intended way this API should work. Note that we might now report MSG_CTRUNC when the users (incorrectly) call us with msg_control == NULL. Fixes: 8f0b3cc9a4c1 ("tcp: RX path for devmem TCP") Reviewed-by: Mina Almasry <almasrymina@google.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20250224174401.3582695-1-sdf@fomichev.me Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25Add OVN to `rtnetlink.h`Jonas Gottlieb
- The Open Virtual Network (OVN) routing netlink handler uses ID 84 - Will also add to `/etc/iproute2/rt_protos` once this is accepted - For more information: https://github.com/ovn-org/ovn Signed-off-by: Jonas Gottlieb <jonas.gottlieb@stackit.cloud> Link: https://patch.msgid.link/Z7w_e7cfA3xmHDa6@SIT-SDELAP4051.int.lidl.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25selftests/net: ensure mptcp is enabled in netnsHangbin Liu
Some distributions may not enable MPTCP by default. All other MPTCP tests source mptcp_lib.sh to ensure MPTCP is enabled before testing. However, the ip_local_port_range test is the only one that does not include this step. Let's also ensure MPTCP is enabled in netns for ip_local_port_range so that it passes on all distributions. Suggested-by: Davide Caratti <dcaratti@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250224094013.13159-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25net: ethernet: ti: am65-cpsw: select PAGE_POOLSascha Hauer
am65-cpsw uses page_pool_dev_alloc_pages(), thus needs PAGE_POOL selected to avoid linker errors. This is missing since the driver started to use page_pool helpers in 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support") Fixes: 8acacc40f733 ("net: ethernet: ti: am65-cpsw: Add minimal XDP support") Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/20250224-net-am654-nuss-kconfig-v2-1-c124f4915c92@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25Octeontx2-af: RPM: Register driver with PCI subsys IDsHariprasad Kelam
Although the PCI device ID and Vendor ID for the RPM (MAC) block have remained the same across Octeon CN10K and the next-generation CN20K silicon, Hardware architecture has changed (NIX mapped RPMs and RFOE Mapped RPMs). Add PCI Subsystem IDs to the device table to ensure that this driver can be probed from NIX mapped RPM devices only. Signed-off-by: Hariprasad Kelam <hkelam@marvell.com> Link: https://patch.msgid.link/20250224035603.1220913-1-hkelam@marvell.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull rdma fixes from Jason Gunthorpe: - Fix a mlx5 malfunction if the UMR QP gets an error - Return the correct port number to userspace for a mlx5 DCT - Don't cause a UMR QP error if DMABUF teardown races with invalidation - Fix a WARN splat when unregisering so mlx5 device memory MR types - Use the correct alignment for the mana doorbell so that two processes do not share the same physical page on non-4k page systems - MAINTAINERS updates for MANA - Retry failed HNS FW commands because some can take a long time - Cast void * handle to the correct type in bnxt to fix corruption - Avoid a NULL pointer crash in bnxt_re - Fix skipped ib_device_unregsiter() for bnxt_re due to some earlier rework - Correctly detect if the bnxt supports extended statistics - Fix refcount leak in mlx5 odp introduced by a previous fix - Map the FW result for the port rate to the userspace values properly in mlx5, returns correct values for newer 800G ports - Don't wrongly destroy counters objects that were not automatically created during mlx5 bind qp - Set page size/shift members of kernel owned SRQs to fix a crash in nvme target * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: RDMA/bnxt_re: Fix the page details for the srq created by kernel consumers RDMA/mlx5: Fix bind QP error cleanup flow RDMA/mlx5: Fix AH static rate parsing RDMA/mlx5: Fix implicit ODP hang on parent deregistration RDMA/bnxt_re: Fix the statistics for Gen P7 VF RDMA/bnxt_re: Fix issue in the unload path RDMA/bnxt_re: Add sanity checks on rdev validity RDMA/bnxt_re: Fix an issue in bnxt_re_async_notifier RDMA/hns: Fix mbox timing out by adding retry mechanism MAINTAINERS: update maintainer for Microsoft MANA RDMA driver RDMA/mana_ib: Allocate PAGE aligned doorbell index RDMA/mlx5: Fix a WARN during dereg_mr for DM type RDMA/mlx5: Fix a race for DMABUF MR which can lead to CQE with error IB/mlx5: Set and get correct qp_num for a DCT QP RDMA/mlx5: Fix the recovery flow of the UMR QP
2025-02-25Merge tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix tools/ quiet build Makefile infrastructure that was broken when working on tools/perf/ without testing on other tools/ living utilities. * tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: tools: Remove redundant quiet setup tools: Unify top-level quiet infrastructure
2025-02-25lsm,nfs: fix memory leak of lsm_contextStephen Smalley
commit b530104f50e8 ("lsm: lsm_context in security_dentry_init_security") did not preserve the lsm id for subsequent release calls, which results in a memory leak. Fix it by saving the lsm id in the nfs4_label and providing it on the subsequent release call. Fixes: b530104f50e8 ("lsm: lsm_context in security_dentry_init_security") Signed-off-by: Stephen Smalley <stephen.smalley.work@gmail.com> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-02-25sunrpc: suppress warnings for unused procfs functionsArnd Bergmann
There is a warning about unused variables when building with W=1 and no procfs: net/sunrpc/cache.c:1660:30: error: 'cache_flush_proc_ops' defined but not used [-Werror=unused-const-variable=] 1660 | static const struct proc_ops cache_flush_proc_ops = { | ^~~~~~~~~~~~~~~~~~~~ net/sunrpc/cache.c:1622:30: error: 'content_proc_ops' defined but not used [-Werror=unused-const-variable=] 1622 | static const struct proc_ops content_proc_ops = { | ^~~~~~~~~~~~~~~~ net/sunrpc/cache.c:1598:30: error: 'cache_channel_proc_ops' defined but not used [-Werror=unused-const-variable=] 1598 | static const struct proc_ops cache_channel_proc_ops = { | ^~~~~~~~~~~~~~~~~~~~~~ These are used inside of an #ifdef, so replacing that with an IS_ENABLED() check lets the compiler see how they are used while still dropping them during dead code elimination. Fixes: dbf847ecb631 ("knfsd: allow cache_register to return error on failure") Reviewed-by: Jeff Layton <jlayton@kernel.org> Acked-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Anna Schumaker <anna.schumaker@oracle.com>
2025-02-25sched_ext: Fix pick_task_scx() picking non-queued tasks when it's called ↵Tejun Heo
without balance() a6250aa251ea ("sched_ext: Handle cases where pick_task_scx() is called without preceding balance_scx()") added a workaround to handle the cases where pick_task_scx() is called without prececing balance_scx() which is due to a fair class bug where pick_taks_fair() may return NULL after a true return from balance_fair(). The workaround detects when pick_task_scx() is called without preceding balance_scx() and emulates SCX_RQ_BAL_KEEP and triggers kicking to avoid stalling. Unfortunately, the workaround code was testing whether @prev was on SCX to decide whether to keep the task running. This is incorrect as the task may be on SCX but no longer runnable. This could lead to a non-runnable task to be returned from pick_task_scx() which cause interesting confusions and failures. e.g. A common failure mode is the task ending up with (!on_rq && on_cpu) state which can cause potential wakers to busy loop, which can easily lead to deadlocks. Fix it by testing whether @prev has SCX_TASK_QUEUED set. This makes @prev_on_scx only used in one place. Open code the usage and improve the comment while at it. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Pat Cody <patcody@meta.com> Fixes: a6250aa251ea ("sched_ext: Handle cases where pick_task_scx() is called without preceding balance_scx()") Cc: stable@vger.kernel.org # v6.12+ Acked-by: Andrea Righi <arighi@nvidia.com>
2025-02-25ASoC: Intel: don't check number of sdw links when setMark Brown
Merge series from Bard Liao <yung-chuan.liao@linux.intel.com>: Currently, we assume that the PCH DMIC pins are pin-muxed with SoundWire links. However, we do see a HW design that use PCH DMIC along with 3 SoundWire links. Remove the check and add warning to let users know that SoundWire MIC and PCH DMIC are both present and they could overwrite it with kernel params.
2025-02-25Merge tag 'for-6.14-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - extent map shrinker fixes: - fix potential use after free accessing an inode to reach fs_info, the shrinker could do iput() in the meantime - skip unnecessary scanning of inodes without extent maps - do direct iput(), no need for indirection via workqueue - in block < page mode, fix race when extending i_size in buffered mode - fix minor memory leak in selftests - print descriptive error message when seeding device is not found * tag 'for-6.14-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix data overwriting bug during buffered write when block size < page size btrfs: output an error message if btrfs failed to find the seed fsid btrfs: do regular iput instead of delayed iput during extent map shrinking btrfs: skip inodes without loaded extent maps when shrinking extent maps btrfs: fix use-after-free on inode when scanning root during em shrinking btrfs: selftests: fix btrfs_test_delayed_refs() leak of transaction
2025-02-25Merge tag 'vfs-6.14-rc5.fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - Use __readahead_folio() in fuse again to fix a UAF issue when using splice - Remove d_op->d_delete method from pidfs - Remove d_op->d_delete method from nsfs - Simplify iomap_dio_bio_iter() - Fix a UAF in ovl_dentry_update_reval - Fix a miscalulated file range for filemap_fdatawrite_range_kick() - Don't skip skip dirty page in folio_unmap_invalidate() * tag 'vfs-6.14-rc5.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iomap: Minor code simplification in iomap_dio_bio_iter() nsfs: remove d_op->d_delete pidfs: remove d_op->d_delete mm/truncate: don't skip dirty page in folio_unmap_invalidate() mm/filemap: fix miscalculated file range for filemap_fdatawrite_range_kick() fuse: don't truncate cached, mutated symlink ovl: fix UAF in ovl_dentry_update_reval by moving dput() in ovl_link_up fuse: revert back to __readahead_folio() for readahead
2025-02-25ALSA: hda/realtek: Fix wrong mic setup for ASUS VivoBook 15Takashi Iwai
ASUS VivoBook 15 with SSID 1043:1460 took an incorrect quirk via the pin pattern matching for ASUS (ALC256_FIXUP_ASUS_MIC), resulting in the two built-in mic pins (0x13 and 0x1b). This had worked without problems casually in the past because the right pin (0x1b) was picked up as the primary device. But since we fixed the pin enumeration for other bugs, the bogus one (0x13) is picked up as the primary device, hence the bug surfaced now. For addressing the regression, this patch explicitly specifies the quirk entry with ALC256_FIXUP_ASUS_MIC_NO_PRESENCE, which sets up only the headset mic pin. Fixes: 3b4309546b48 ("ALSA: hda: Fix headset detection failure due to unstable sort") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219807 Link: https://patch.msgid.link/20250225154540.13543-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-02-25ASoC: cs35l56: Prevent races when soft-resetting using SPI controlRichard Fitzgerald
When SPI is used for control, the driver must hold the SPI bus lock while issuing the sequence of writes to perform a soft reset. >From the time the driver writes the SYSTEM_RESET command until the driver does a write to terminate the reset, there must not be any activity on the SPI bus lines. If there is any SPI activity during the soft-reset, another soft-reset will be triggered. The state of the SPI chip select is irrelevant. A repeated soft-reset does not in itself cause any problems, and it is not an infinite loop. The problem is a race between these resets and the driver polling for boot completion. There is a time window between soft resets where the driver could read HALO_STATE as 2 (fully booted) while the chip is actually soft-resetting. Although this window is small, it is long enough that it is possible to hit it in normal operation. To prevent this race and ensure the chip really is fully booted, the driver calls spi_bus_lock() to prevent other activity while resetting. It then issues the SYSTEM_RESET mailbox command. After allowing sufficient time for reset to take effect, the driver issues a PING mailbox command, which will force completion of the full soft-reset sequence. The SPI bus lock can then be released. The mailbox is checked for any boot or wakeup response from the firmware, before the value in HALO_STATE will be trusted. This does not affect SoundWire or I2C control. Fixes: 8a731fd37f8b ("ASoC: cs35l56: Move utility functions to shared file") Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20250225131843.113752-3-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-02-25firmware: cs_dsp: Remove async regmap writesRichard Fitzgerald
Change calls to async regmap write functions to use the normal blocking writes so that the cs35l56 driver can use spi_bus_lock() to gain exclusive access to the SPI bus. As this is part of a fix, it makes only the minimal change to swap the functions to the blocking equivalents. There's no need to risk reworking the buffer allocation logic that is now partially redundant. The async writes are a 12-year-old workaround for inefficiency of synchronous writes in the SPI subsystem. The SPI subsystem has since been changed to avoid the overheads, so this workaround should not be necessary. The cs35l56 driver needs to use spi_bus_lock() prevent bus activity while it is soft-resetting the cs35l56. But spi_bus_lock() is incompatible with spi_async() calls, which will fail with -EBUSY. Fixes: 8a731fd37f8b ("ASoC: cs35l56: Move utility functions to shared file") Signed-off-by: Richard Fitzgerald <rf@opensource.cirrus.com> Link: https://patch.msgid.link/20250225131843.113752-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-02-25sctp: Remove unused payload from sctp_idatahdrThorsten Blum
Remove the unused payload array from the struct sctp_idatahdr. Cc: Kees Cook <kees@kernel.org> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://patch.msgid.link/20250223204505.2499-3-thorsten.blum@linux.dev Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-25ASoC: Intel: sof_sdw: warn both sdw and pch dmic are usedBard Liao
Typically, SoundWire MIC and PCH DMIC will not coexist. However, we may want to use both of them in some special cases. Add a warning to let users know that SoundWire MIC and PCH DMIC are both present and they could overwrite it with kernel params. Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://patch.msgid.link/20250225093716.67240-3-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-02-25ASoC: SOF: Intel: don't check number of sdw links when set dmic_fixupBard Liao
Currently, we assume that the PCH DMIC pins are pin-muxed with SoundWire links. However, we do see a HW design that use PCH DMIC along with 3 SoundWire links. Remove the check now. With this change the PCM DMIC will be presented if it is reported by the BIOS irrespective of whether there are SDW links present or not. Signed-off-by: Bard Liao <yung-chuan.liao@linux.intel.com> Reviewed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Reviewed-by: Péter Ujfalusi <peter.ujfalusi@linux.intel.com> Link: https://patch.msgid.link/20250225093716.67240-2-yung-chuan.liao@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-02-25MAINTAINERS: socket timestamping: add Jason Xing as reviewerWillem de Bruijn
Jason has been helping as reviewer for this area already, and has contributed various features directly, notably BPF timestamping. Also extend coverage to all timestamping tests, including those new with BPF timestamping. Link: https://lore.kernel.org/netdev/20250220072940.99994-1-kerneljasonxing@gmail.com/ Signed-off-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://patch.msgid.link/20250222172839.642079-1-willemdebruijn.kernel@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-25ipvs: Always clear ipvs_property flag in skb_scrub_packet()Philo Lu
We found an issue when using bpf_redirect with ipvs NAT mode after commit ff70202b2d1a ("dev_forward_skb: do not scrub skb mark within the same name space"). Particularly, we use bpf_redirect to return the skb directly back to the netif it comes from, i.e., xnet is false in skb_scrub_packet(), and then ipvs_property is preserved and SNAT is skipped in the rx path. ipvs_property has been already cleared when netns is changed in commit 2b5ec1a5f973 ("netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed"). This patch just clears it in spite of netns. Fixes: 2b5ec1a5f973 ("netfilter/ipvs: clear ipvs_property flag when SKB net namespace changed") Signed-off-by: Philo Lu <lulie@linux.alibaba.com> Acked-by: Julian Anastasov <ja@ssi.bg> Link: https://patch.msgid.link/20250222033518.126087-1-lulie@linux.alibaba.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-25Merge branch 'eth-fbnic-update-fbnic-driver'Paolo Abeni
Mohsin Bashir says: ==================== eth: fbnic: Update fbnic driver This patchset makes following trivial changes to the fbnic driver: 1) Add coverage for PCIe CSRs in the ethtool register dump. 2) Consolidate the PUL_USER CSR section, update the end boundary, and remove redundant definition of the end boundary. 3) Update the return value in kdoc for fbnic_netdev_alloc(). ==================== Link: https://patch.msgid.link/20250221201813.2688052-1-mohsin.bashr@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-25eth: fbnic: Update return value in kdocMohsin Bashir
Fix return value in kdoc for fbnic_netdev_alloc() Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-25eth: fbnic: Consolidate PUL_USER CSR sectionMohsin Bashir
Move PUL_USER CSRs in the relevant section, update the end boundary address, and remove the redundant definition of end boundary. Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-25eth: fbnic: Add PCIe registers dumpMohsin Bashir
Provide coverage to PCIe registers in ethtool register dump Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-25iomap: Minor code simplification in iomap_dio_bio_iter()John Garry
Combine 'else' and 'if' conditional statements onto a single line and drop unrequired braces, as is standard coding style. The code had been like this since commit c3b0e880bbfa ("iomap: support REQ_OP_ZONE_APPEND"). Signed-off-by: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20250224154538.548028-1-john.g.garry@oracle.com Reviewed-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-02-24net: phy: add phylib-internal.hHeiner Kallweit
This patch is a starting point for moving phylib-internal declarations to a private header file. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/082eacd2-a888-4716-8797-b3491ce02820@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-24Merge branch 'mptcp-pm-misc-cleanups-part-3'Jakub Kicinski
Matthieu Baerts says: ==================== mptcp: pm: misc cleanups, part 3 These cleanups lead the way to the unification of the path-manager interfaces, and allow future extensions. The following patches are not all linked to each others, but are all related to the path-managers, except the last three. - Patch 1: remove unused returned value in mptcp_nl_set_flags(). - Patch 2: new flag: avoid iterating over all connections if not needed. - Patch 3: add a build check making sure there is enough space in cb-ctx. - Patch 4: new mptcp_pm_genl_fill_addr helper to reduce duplicated code. - Patch 5: simplify userspace_pm_append_new_local_addr helper. - Patch 6: drop unneeded inet6_sk(). - Patch 7: use ipv6_addr_equal() instead of !ipv6_addr_cmp() - Patch 8: scheduler: split an interface in two. - Patch 9: scheduler: save 64 bytes of currently unused data. - Patch 10: small optimisation to exit early in case of retransmissions. ==================== Link: https://patch.msgid.link/20250221-net-next-mptcp-pm-misc-cleanup-3-v1-0-2b70ab1cee79@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-24mptcp: blackhole: avoid checking the state twiceMatthieu Baerts (NGI0)
A small cleanup, reordering the conditions to avoid checking things twice. The code here is called in case of timeout on a TCP connection, before triggering a retransmission. But it only acts on SYN + MPC packets. So the conditions can be re-order to exit early in case of non-MPTCP SYN + MPC. This also reduce the indentation levels. Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250221-net-next-mptcp-pm-misc-cleanup-3-v1-10-2b70ab1cee79@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-24mptcp: sched: reduce size for unused dataMatthieu Baerts (NGI0)
Thanks for the previous commit ("mptcp: sched: split get_subflow interface into two"), the mptcp_sched_data structure is now currently unused. This structure has been added to allow future extensions that are not ready yet. At the end, this structure will not even be used at all when mptcp_subflow bpf_iter will be supported [1]. Here is a first step to save 64 bytes on the stack for each scheduling operation. The structure is not removed yet not to break the WIP work on these extensions, but will be done when [1] will be ready and applied. Link: https://lore.kernel.org/6645ad6e-8874-44c5-8730-854c30673218@linux.dev [1] Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20250221-net-next-mptcp-pm-misc-cleanup-3-v1-9-2b70ab1cee79@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>