linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2025-04-25	io_uring/zcrx: selftests: set hds_thresh to 0	David Wei
	Setting hds_thresh to 0 is required for queue reset. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250425022049.3474590-3-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-25	io_uring/zcrx: selftests: switch to using defer() for cleanup	David Wei
	Switch to using defer() for putting the NIC back to the original state prior to running the selftest. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250425022049.3474590-2-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	Merge branch 'fix-netdevim-to-correctly-mark-napi-ids'	Jakub Kicinski
	Joe Damato says: ==================== Fix netdevim to correctly mark NAPI IDs This series fixes netdevsim to correctly set the NAPI ID on the skb. This is helpful for writing tests around features that use SO_INCOMING_NAPI_ID. In addition to the netdevsim fix in patch 1, patches 2 & 3 do some self test refactoring and add a test for NAPI IDs. The test itself (patch 3) introduces a C helper because apparently python doesn't have socket.SO_INCOMING_NAPI_ID. v3: https://lore.kernel.org/20250418013719.12094-1-jdamato@fastly.com v2: https://lore.kernel.org/20250417013301.39228-1-jdamato@fastly.com rfcv1: https://lore.kernel.org/20250329000030.39543-1-jdamato@fastly.com ==================== Link: https://patch.msgid.link/20250424002746.16891-1-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	selftests: drv-net: Test that NAPI ID is non-zero	Joe Damato
	Test that the SO_INCOMING_NAPI_ID of a network file descriptor is non-zero. This ensures that either the core networking stack or, in some cases like netdevsim, the driver correctly sets the NAPI ID. Signed-off-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250424002746.16891-4-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	selftests: drv-net: Factor out ksft C helpers	Joe Damato
	Factor ksft C helpers to a header so they can be used by other C-based tests. Signed-off-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250424002746.16891-3-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	netdevsim: Mark NAPI ID on skb in nsim_rcv	Joe Damato
	Previously, nsim_rcv was not marking the NAPI ID on the skb, leading to applications seeing a napi ID of 0 when using SO_INCOMING_NAPI_ID. To add to the userland confusion, netlink appears to correctly report the NAPI IDs for netdevsim queues but the resulting file descriptor from a call to accept() was reporting a NAPI ID of 0. Signed-off-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250424002746.16891-2-jdamato@fastly.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	tools: ynl: fix the header guard name for OVPN	Jakub Kicinski
	Thorsten reports that after upgrading system headers from linux-next the YNL build breaks. I typo'ed the header guard, _H is missing. Reported-by: Thorsten Leemhuis <linux@leemhuis.info> Link: https://lore.kernel.org/59ba7a94-17b9-485f-aa6d-14e4f01a7a39@leemhuis.info Fixes: 12b196568a3a ("tools: ynl: add missing header deps") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Thorsten Leemhuis <linux@leemhuis.info> Reviewed-by: Joe Damato <jdamato@fastly.com> Link: https://patch.msgid.link/20250423220231.1035931-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	net: ethernet: mtk_wed: annotate RCU release in attach()	Johannes Berg
	There are some sparse warnings in wifi, and it seems that it's actually possible to annotate a function pointer with __releases(), making the sparse warnings go away. In a way that also serves as documentation that rcu_read_unlock() must be called in the attach method, so add that annotation. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://patch.msgid.link/20250423150811.456205-2-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	Merge branch 'tcp-fastopen-observability'	Jakub Kicinski
	Jeremy Harris says: ==================== tcp: fastopen: observability Whether TCP Fast Open was used for a connection is not reliably observable by an accepting application when the SYN passed no data. Fix this by noting during SYN receive processing that an acceptable Fast Open option was used, and provide this to userland via getsockopt TCP_INFO. ==================== Link: https://patch.msgid.link/20250423124334.4916-1-jgh@exim.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	tcp: fastopen: pass TFO child indication through getsockopt	Jeremy Harris
	tcp: fastopen: pass TFO child indication through getsockopt Note that this uses up the last bit of a field in struct tcp_info Signed-off-by: Jeremy Harris <jgh@exim.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20250423124334.4916-3-jgh@exim.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	tcp: fastopen: note that a child socket was created	Jeremy Harris
	tcp: fastopen: note that a child socket was created This uses up the last bit in a field of tcp_sock. Signed-off-by: Jeremy Harris <jgh@exim.org> Reviewed-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20250423124334.4916-2-jgh@exim.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	net: ip_gre: Fix spelling mistake "demultiplexor" -> "demultiplexer"	Colin Ian King
	There is a spelling mistake in a pr_info message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://patch.msgid.link/20250423113719.173539-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	rxrpc: rxgk: Fix some reference count leaks	Dan Carpenter
	These paths should call rxgk_put(gk) but they don't. In the rxgk_construct_response() function the "goto error;" will free the "response" skb as well calling rxgk_put() so that's a bonus. Fixes: 9d1d2b59341f ("rxrpc: rxgk: Implement the yfs-rxgk security class (GSSAPI)") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/aAikCbsnnzYtVmIA@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	net: ethernet: mtk_eth_soc: convert cap_bit in mtk_eth_muxc struct to u64	Bo-Cun Chen
	With commit 51a4df60db5c2 ("net: ethernet: mtk_eth_soc: convert caps in mtk_soc_data struct to u64") the capabilities bitfield was converted to a 64-bit value, but a cap_bit in struct mtk_eth_muxc which is used to store a full bitfield (rather than the bit number, as the name would suggest) still holds only a 32-bit value. Change the type of cap_bit to u64 in order to avoid truncating the bitfield which results in path selection to not work with capabilities above the 32-bit limit. The values currently stored in the cap_bit field are MTK_ETH_MUX_GDM1_TO_GMAC1_ESW: BIT_ULL(18) \| BIT_ULL(5) MTK_ETH_MUX_GMAC2_GMAC0_TO_GEPHY: BIT_ULL(19) \| BIT_ULL(5) \| BIT_ULL(6) MTK_ETH_MUX_U3_GMAC2_TO_QPHY: BIT_ULL(20) \| BIT_ULL(5) \| BIT_ULL(6) MTK_ETH_MUX_GMAC1_GMAC2_TO_SGMII_RGMII: BIT_ULL(20) \| BIT_ULL(5) \| BIT_ULL(7) MTK_ETH_MUX_GMAC12_TO_GEPHY_SGMII: BIT_ULL(21) \| BIT_ULL(5) While all those values are currently still within 32-bit boundaries, the addition of new capabilities of MT7988 as well as future SoC's like MT7987 will exceed them. Also, the use of a 32-bit 'int' type to store the result of a BIT_ULL(...) is misleading. Signed-off-by: Bo-Cun Chen <bc-bocun.chen@mediatek.com> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/ded98b0d716c3203017a7a92151516ec2bf1abee.1745369249.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	rxrpc: Remove deadcode	Dr. David Alan Gilbert
	Remove three functions that are no longer used. rxrpc_get_txbuf() last use was removed by 2020's commit 5e6ef4f1017c ("rxrpc: Make the I/O thread take over the call and local processor work") rxrpc_kernel_get_epoch() last use was removed by 2020's commit 44746355ccb1 ("afs: Don't get epoch from a server because it may be ambiguous") rxrpc_kernel_set_max_life() last use was removed by 2023's commit db099c625b13 ("rxrpc: Fix timeout of a call that hasn't yet been granted a channel") Both of the rxrpc_kernel_* functions were documented. Remove that documentation as well as the code. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Acked-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/20250422235147.146460-1-linux@treblig.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	Merge branch 'net-bcmasp-add-v3-0-and-remove-v2-0'	Jakub Kicinski
	Justin Chen says: ==================== net: bcmasp: Add v3.0 and remove v2.0 asp-v2.0 had one supported SoC that never saw the light of day. Given that it was the first iteration of the HW, it ended up with some one off HW design decisions that were changed in futher iterations of the HW. We remove support to simplify the code and make it easier to add future revisions. Add support for asp-v3.0. asp-v3.0 reduces the feature set for cost savings. We reduce the number of channel/network filters. And also remove some features and statistics. ==================== Link: https://patch.msgid.link/20250422233645.1931036-1-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	net: phy: mdio-bcm-unimac: Add asp-v3.0	Justin Chen
	Add mdio compat string for asp-v3.0 ethernet driver. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250422233645.1931036-9-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	net: bcmasp: Add support for asp-v3.0	Justin Chen
	The asp-v3.0 is a major HW revision that reduced the number of channels and filters. The goal was to save cost by reducing the feature set. Changes for asp-v3.0 - Number of network filters were reduced. - Number of channels were reduced. - EDPKT stats were removed. - Fix a bug with csum offload. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250422233645.1931036-8-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	dt-bindings: net: brcm,unimac-mdio: Add asp-v3.0	Justin Chen
	The asp-v3.0 Ethernet controller uses a brcm unimac like its predecessor. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250422233645.1931036-7-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	dt-bindings: net: brcm,asp-v2.0: Add asp-v3.0	Justin Chen
	Add asp-v3.0 support. v3.0 is a major revision that reduces the feature set for cost savings. We have a reduced amount of channels and network filters. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250422233645.1931036-6-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	net: phy: mdio-bcm-unimac: Remove asp-v2.0	Justin Chen
	Remove asp-v2.0 which will no longer be supported. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250422233645.1931036-5-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	net: bcmasp: Remove support for asp-v2.0	Justin Chen
	The SoC that supported asp-v2.0 never saw the light of day. asp-v2.0 has quirks that makes the logic overly complicated. For example, asp-v2.0 is the only revision that has a different wake up IRQ hook up. Remove asp-v2.0 support to make supporting future HW revisions cleaner. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250422233645.1931036-4-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	dt-bindings: net: brcm,unimac-mdio: Remove asp-v2.0	Justin Chen
	Remove asp-v2.0 which was only supported on one SoC that never saw the light of day. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250422233645.1931036-3-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	dt-bindings: net: brcm,asp-v2.0: Remove asp-v2.0	Justin Chen
	Remove asp-v2.0 which was only supported on one SoC that never saw the light of day. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20250422233645.1931036-2-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	selftests: iou-zcrx: Get the page size at runtime	Haiyue Wang
	Use the API `sysconf()` to query page size at runtime, instead of using hard code number 4096. And use `posix_memalign` to allocate the page size aligned momory. Signed-off-by: Haiyue Wang <haiyuewa@163.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250419141044.10304-1-haiyuewa@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	Cross-merge networking fixes after downstream PR (net-6.15-rc4). This pull includes wireless and a fix to vxlan which isn't in Linus's tree just yet. The latter creates with a silent conflict / build breakage, so merging it now to avoid causing problems. drivers/net/vxlan/vxlan_vnifilter.c 094adad91310 ("vxlan: Use a single lock to protect the FDB table") 087a9eb9e597 ("vxlan: vnifilter: Fix unlocked deletion of default FDB entry") https://lore.kernel.org/20250423145131.513029-1-idosch@nvidia.com No "normal" conflicts, or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	vxlan: vnifilter: Fix unlocked deletion of default FDB entry	Ido Schimmel
	When a VNI is deleted from a VXLAN device in 'vnifilter' mode, the FDB entry associated with the default remote (assuming one was configured) is deleted without holding the hash lock. This is wrong and will result in a warning [1] being generated by the lockdep annotation that was added by commit ebe642067455 ("vxlan: Create wrappers for FDB lookup"). Reproducer: # ip link add vx0 up type vxlan dstport 4789 external vnifilter local 192.0.2.1 # bridge vni add vni 10010 remote 198.51.100.1 dev vx0 # bridge vni del vni 10010 dev vx0 Fix by acquiring the hash lock before the deletion and releasing it afterwards. Blame the original commit that introduced the issue rather than the one that exposed it. [1] WARNING: CPU: 3 PID: 392 at drivers/net/vxlan/vxlan_core.c:417 vxlan_find_mac+0x17f/0x1a0 [...] RIP: 0010:vxlan_find_mac+0x17f/0x1a0 [...] Call Trace: <TASK> __vxlan_fdb_delete+0xbe/0x560 vxlan_vni_delete_group+0x2ba/0x940 vxlan_vni_del.isra.0+0x15f/0x580 vxlan_process_vni_filter+0x38b/0x7b0 vxlan_vnifilter_process+0x3bb/0x510 rtnetlink_rcv_msg+0x2f7/0xb70 netlink_rcv_skb+0x131/0x360 netlink_unicast+0x426/0x710 netlink_sendmsg+0x75a/0xc20 __sock_sendmsg+0xc1/0x150 ____sys_sendmsg+0x5aa/0x7b0 ___sys_sendmsg+0xfc/0x180 __sys_sendmsg+0x121/0x1b0 do_syscall_64+0xbb/0x1d0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 Fixes: f9c4bb0b245c ("vxlan: vni filtering support on collect metadata device") Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20250423145131.513029-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	Merge tag 'wireless-2025-04-24' of ↵	Jakub Kicinski
	https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless Johannes Berg says: ==================== Some more fixes, notably: * iwlwifi: various regression and iwlmld fixes * mac80211: fix TX frames in monitor mode * brcmfmac: error handling for firmware load * tag 'wireless-2025-04-24' of https://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless: wifi: iwlwifi: restore missing initialization of async_handlers_list wifi: brcm80211: fmac: Add error handling for brcmf_usb_dl_writeimage() wifi: plfxlc: Remove erroneous assert in plfxlc_mac_release wifi: iwlwifi: fix the check for the SCRATCH register upon resume wifi: iwlwifi: don't warn if the NIC is gone in resume wifi: iwlwifi: mld: fix BAID validity check wifi: iwlwifi: back off on continuous errors wifi: iwlwifi: mld: only create debugfs symlink if it does not exist wifi: iwlwifi: mld: inform trans on init failure wifi: iwlwifi: mld: properly handle async notification in op mode start Revert "wifi: iwlwifi: make no_160 more generic" Revert "wifi: iwlwifi: add support for BE213" wifi: mac80211: restore monitor for outgoing frames ==================== Link: https://patch.msgid.link/20250424120535.56499-3-johannes@sipsolutions.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-24	Merge tag 'net-6.15-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "No fixes from any subtree. Current release - regressions: - net: fix the missing unlock for detached devices Previous releases - regressions: - sched: fix UAF vulnerability in HFSC qdisc - lwtunnel: disable BHs when required - mptcp: pm: defer freeing of MPTCP userspace path manager entries - tipc: fix NULL pointer dereference in tipc_mon_reinit_self() - eth: virtio-net: disable delayed refill when pausing rx Previous releases - always broken: - phylink: fix suspend/resume with WoL enabled and link down - eth: - mlx5: fix null-ptr-deref in mlx5_create_{inner_,}ttc_table() - xen-netfront: handle NULL returned by xdp_convert_buff_to_frame() - enetc: fix frame corruption on bpf_xdp_adjust_head/tail() and XDP_PASS - stmmac: fix dwmac1000 ptp timestamp status offset - pds_core: prevent possible adminq overflow/stuck condition Misc: - a bunch of MAINTAINERS updates" * tag 'net-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (32 commits) net: stmmac: fix multiplication overflow when reading timestamp net: stmmac: fix dwmac1000 ptp timestamp status offset net: dp83822: Fix OF_MDIO config check pds_core: make wait_context part of q_info pds_core: Remove unnecessary check in pds_client_adminq_cmd() pds_core: handle unsupported PDS_CORE_CMD_FW_CONTROL result pds_core: Prevent possible adminq overflow/stuck condition net: dsa: mt7530: sync driver-specific behavior of MT7531 variants selftests/tc-testing: Add test for HFSC queue emptying during peek operation net_sched: hfsc: Fix a potential UAF in hfsc_dequeue() too net_sched: hfsc: Fix a UAF vulnerability in class handling selftests: mptcp: diag: use mptcp_lib_get_info_value mptcp: pm: Defer freeing of MPTCP userspace path manager entries net: ethernet: mtk_eth_soc: net: revise NETSYSv3 hardware configuration tipc: fix NULL pointer dereference in tipc_mon_reinit_self() virtio-net: disable delayed refill when pausing rx net: phy: leds: fix memory leak net: phylink: mac_link_(up\|down)() clarifications net: phylink: fix suspend/resume with WoL enabled and link down net: lwtunnel: disable BHs when required ...
2025-04-24	Merge tag 'v6.15-p5' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto fixes from Herbert Xu: - Revert acomp multibuffer tests which were buggy - Fix off-by-one regression in new scomp code - Lower quality setting on atmel-sha204a as it may not be random * tag 'v6.15-p5' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: atmel-sha204a - Set hwrng quality to lowest possible crypto: scomp - Fix off-by-one bug when calculating last page Revert "crypto: testmgr - Add multibuffer acomp testing"
2025-04-24	net: phy: marvell-88q2xxx: Enable temperature sensor for mv88q211x	Niklas Söderlund
	The temperature sensor enabled for mv88q222x devices also functions for mv88q211x based devices. Unify the two devices probe functions to enable the sensors for all devices supported by this driver. The same oddity as for mv88q222x devices exists, the PHY link must be up for a correct temperature reading to be reported. # cat /sys/class/hwmon/hwmon9/temp1_input -75000 # ifconfig end5 up # cat /sys/class/hwmon/hwmon9/temp1_input 59000 Worth noting is that while the temperature register offsets and layout are the same between mv88q211x and mv88q222x devices their names in the datasheets are different. This change keeps the mv88q222x names for the mv88q211x support. Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Dimitri Fedrau <dima.fedrau@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20250418145800.2420751-1-niklas.soderlund+renesas@ragnatech.se Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	Merge branch 'net-stmmac-fix-timestamp-snapshots-on-dwmac1000'	Paolo Abeni
	Alexis Lothore says: ==================== net: stmmac: fix timestamp snapshots on dwmac1000 this is the v2 of a small series containing two small fixes for the timestamp snapshot feature on stmmac, especially on dwmac1000 version. Those issues have been detected on a socfpga (Cyclone V) platform. They kind of follow the big rework sent by Maxime at the end of last year to properly split this feature support between different versions of the DWMAC IP. v1: https://lore.kernel.org/r/20250422-stmmac_ts-v1-0-b59c9f406041@bootlin.com Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> ==================== Link: https://patch.msgid.link/20250423-stmmac_ts-v2-0-e2cf2bbd61b1@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	net: stmmac: fix multiplication overflow when reading timestamp	Alexis Lothoré
	The current way of reading a timestamp snapshot in stmmac can lead to integer overflow, as the computation is done on 32 bits. The issue has been observed on a dwmac-socfpga platform returning chaotic timestamp values due to this overflow. The corresponding multiplication is done with a MUL instruction, which returns 32 bit values. Explicitly casting the value to 64 bits replaced the MUL with a UMLAL, which computes and returns the result on 64 bits, and so returns correctly the timestamps. Prevent this overflow by explicitly casting the intermediate value to u64 to make sure that the whole computation is made on u64. While at it, apply the same cast on the other dwmac variant (GMAC4) method for snapshot retrieval. Fixes: 477c3e1f6363 ("net: stmmac: Introduce dwmac1000 timestamping operations") Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250423-stmmac_ts-v2-2-e2cf2bbd61b1@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	net: stmmac: fix dwmac1000 ptp timestamp status offset	Alexis Lothore
	When a PTP interrupt occurs, the driver accesses the wrong offset to learn about the number of available snapshots in the FIFO for dwmac1000: it should be accessing bits 29..25, while it is currently reading bits 19..16 (those are bits about the auxiliary triggers which have generated the timestamps). As a consequence, it does not compute correctly the number of available snapshots, and so possibly do not generate the corresponding clock events if the bogus value ends up being 0. Fix clock events generation by reading the correct bits in the timestamp register for dwmac1000. Fixes: 477c3e1f6363 ("net: stmmac: Introduce dwmac1000 timestamping operations") Signed-off-by: Alexis Lothoré <alexis.lothore@bootlin.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250423-stmmac_ts-v2-1-e2cf2bbd61b1@bootlin.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	net: dp83822: Fix OF_MDIO config check	Johannes Schneider
	When CONFIG_OF_MDIO is set to be a module the code block is not compiled. Use the IS_ENABLED macro that checks for both built in as well as module. Fixes: 5dc39fd5ef35 ("net: phy: DP83822: Add ability to advertise Fiber connection") Signed-off-by: Johannes Schneider <johannes.schneider@leica-geosystems.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20250423044724.1284492-1-johannes.schneider@leica-geosystems.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	Merge branch 'ipv6-no-rtnl-for-ipv6-routing-table'	Paolo Abeni
	Kuniyuki Iwashima says: ==================== ipv6: No RTNL for IPv6 routing table. IPv6 routing tables are protected by each table's lock and work in the interrupt context, which means we basically don't need RTNL to modify an IPv6 routing table itself. Currently, the control paths require RTNL because we may need to perform device and nexthop lookups; we must prevent dev/nexthop from going away from the netns. This, however, can be achieved by RCU as well. If we are in the RCU critical section while adding an IPv6 route, synchronize_net() in __dev_change_net_namespace() and unregister_netdevice_many_notify() guarantee that the dev will not be moved to another netns or removed. Also, nexthop is guaranteed not to be freed during the RCU grace period. If we care about a race between nexthop removal and IPv6 route addition, we can get rid of RTNL from the control paths. Patch 1 moves a validation for RTA_MULTIPATH earlier. Patch 2 removes RTNL for SIOCDELRT and RTM_DELROUTE. Patch 3 ~ 11 moves validation and memory allocation earlier. Patch 12 prevents a race between two requests for the same table. Patch 13 & 14 prevents the nexthop race mentioned above. Patch 15 removes RTNL for SIOCADDRT and RTM_NEWROUTE. Test: The script [0] lets each CPU-X create 100000 routes on table-X in a batch. On c7a.metal-48xl EC2 instance with 192 CPUs, without this series: $ sudo ./route_test.sh start adding routes added 19200000 routes (100000 routes * 192 tables). total routes: 19200006 Time elapsed: 191577 milliseconds. with this series: $ sudo ./route_test.sh start adding routes added 19200000 routes (100000 routes * 192 tables). total routes: 19200006 Time elapsed: 62854 milliseconds. I changed the number of routes (1000 ~ 100000 per CPU/table) and consistently saw it finish 3x faster with this series. [0] mkdir tmp NS="test" ip netns add $NS ip -n $NS link add veth0 type veth peer veth1 ip -n $NS link set veth0 up ip -n $NS link set veth1 up TABLES=() for i in $(seq $(nproc)); do TABLES+=("$i") done ROUTES=() for i in {1..100}; do for j in {1..1000}; do ROUTES+=("2001:$i:$j::/64") done done for TABLE in "${TABLES[@]}"; do ( FILE="./tmp/batch-table-$TABLE.txt" > $FILE for ROUTE in "${ROUTES[@]}"; do echo "route add $ROUTE dev veth0 table $TABLE" >> $FILE done ) & done wait echo "start adding routes" START_TIME=$(date +%s%3N) for TABLE in "${TABLES[@]}"; do ip -n $NS -6 -batch "./tmp/batch-table-$TABLE.txt" & done wait END_TIME=$(date +%s%3N) ELAPSED_TIME=$((END_TIME - START_TIME)) echo "added $((${#ROUTES[@]} * ${#TABLES[@]})) routes (${#ROUTES[@]} routes * ${#TABLES[@]} tables)." echo "total routes: $(ip -n $NS -6 route show table all \| wc -l)" # Just for debug echo "Time elapsed: ${ELAPSED_TIME} milliseconds." ip netns del $NS rm -fr ./tmp/ v2: https://lore.kernel.org/netdev/20250409011243.26195-1-kuniyu@amazon.com/ v1: https://lore.kernel.org/netdev/20250321040131.21057-1-kuniyu@amazon.com/ ==================== Link: https://patch.msgid.link/20250418000443.43734-1-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Get rid of RTNL for SIOCADDRT and RTM_NEWROUTE.	Kuniyuki Iwashima
	Now we are ready to remove RTNL from SIOCADDRT and RTM_NEWROUTE. The remaining things to do are 1. pass false to lwtunnel_valid_encap_type_attr() 2. use rcu_dereference_rtnl() in fib6_check_nexthop() 3. place rcu_read_lock() before ip6_route_info_create_nh(). Let's complete the RTNL-free conversion. When each CPU-X adds 100000 routes on table-X in a batch concurrently on c7a.metal-48xl EC2 instance with 192 CPUs, without this series: $ sudo ./route_test.sh ... added 19200000 routes (100000 routes * 192 tables). time elapsed: 191577 milliseconds. with this series: $ sudo ./route_test.sh ... added 19200000 routes (100000 routes * 192 tables). time elapsed: 62854 milliseconds. I changed the number of routes in each table (1000 ~ 100000) and consistently saw it finish 3x faster with this series. Note that now every caller of lwtunnel_valid_encap_type() passes false as the last argument, and this can be removed later. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250418000443.43734-16-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Protect nh->f6i_list with spinlock and flag.	Kuniyuki Iwashima
	We will get rid of RTNL from RTM_NEWROUTE and SIOCADDRT. Then, we may be going to add a route tied to a dying nexthop. The nexthop itself is not freed during the RCU grace period, but if we link a route after __remove_nexthop_fib() is called for the nexthop, the route will be leaked. To avoid the race between IPv6 route addition under RCU vs nexthop deletion under RTNL, let's add a dead flag and protect it and nh->f6i_list with a spinlock. __remove_nexthop_fib() acquires the nexthop's spinlock and sets false to nh->dead, then calls ip6_del_rt() for the linked route one by one without the spinlock because fib6_purge_rt() acquires it later. While adding an IPv6 route, fib6_add() acquires the nexthop lock and checks the dead flag just before inserting the route. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250418000443.43734-15-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Defer fib6_purge_rt() in fib6_add_rt2node() to fib6_add().	Kuniyuki Iwashima
	The next patch adds per-nexthop spinlock which protects nh->f6i_list. When rt->nh is not NULL, fib6_add_rt2node() will be called under the lock. fib6_add_rt2node() could call fib6_purge_rt() for another route, which could holds another nexthop lock. Then, deadlock could happen between two nexthops. Let's defer fib6_purge_rt() after fib6_add_rt2node(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/20250418000443.43734-14-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Protect fib6_link_table() with spinlock.	Kuniyuki Iwashima
	We will get rid of RTNL from RTM_NEWROUTE and SIOCADDRT. If the request specifies a new table ID, fib6_new_table() is called to create a new routing table. Two concurrent requests could specify the same table ID, so we need a lock to protect net->ipv6.fib_table_hash[h]. Let's add a spinlock to protect the hash bucket linkage. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/20250418000443.43734-13-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Factorise ip6_route_multipath_add().	Kuniyuki Iwashima
	We will get rid of RTNL from RTM_NEWROUTE and SIOCADDRT and rely on RCU to guarantee dev and nexthop lifetime. Then, the RCU section will start before ip6_route_info_create_nh() in ip6_route_multipath_add(), but ip6_route_info_create() is called in the same loop and will sleep. Let's split the loop into ip6_route_mpath_info_create() and ip6_route_mpath_info_create_nh(). Note that ip6_route_info_append() is now integrated into ip6_route_mpath_info_create_nh() because we need to call different free functions for nexthops that passed ip6_route_info_create_nh(). In case of failure, the remaining nexthops that ip6_route_info_create_nh() has not been called for will be freed by ip6_route_mpath_info_cleanup(). OTOH, if a nexthop passes ip6_route_info_create_nh(), it will be linked to a local temporary list, which will be spliced back to rt6_nh_list. In case of failure, these nexthops will be released by fib6_info_release() in ip6_route_multipath_add(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250418000443.43734-12-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Rename rt6_nh.next to rt6_nh.list.	Kuniyuki Iwashima
	ip6_route_multipath_add() allocates struct rt6_nh for each config of multipath routes to link them to a local list rt6_nh_list. struct rt6_nh.next is the list node of each config, so the name is quite misleading. Let's rename it to list. Suggested-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/netdev/c9bee472-c94e-4878-8cc2-1512b2c54db5@redhat.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250418000443.43734-11-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Don't pass net to ip6_route_info_append().	Kuniyuki Iwashima
	net is not used in ip6_route_info_append() after commit 36f19d5b4f99 ("net/ipv6: Remove extra call to ip6_convert_metrics for multipath case"). Let's remove the argument. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/20250418000443.43734-10-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Preallocate nhc_pcpu_rth_output in ip6_route_info_create().	Kuniyuki Iwashima
	ip6_route_info_create_nh() will be called under RCU. It calls fib_nh_common_init() and allocates nhc->nhc_pcpu_rth_output. As with the reason for rt->fib6_nh->rt6i_pcpu, we want to avoid GFP_ATOMIC allocation for nhc->nhc_pcpu_rth_output under RCU. Let's preallocate it in ip6_route_info_create(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250418000443.43734-9-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Preallocate rt->fib6_nh->rt6i_pcpu in ip6_route_info_create().	Kuniyuki Iwashima
	ip6_route_info_create_nh() will be called under RCU. Then, fib6_nh_init() is also under RCU, but per-cpu memory allocation is very likely to fail with GFP_ATOMIC while bulk-adding IPv6 routes and we would see a bunch of this message in dmesg. percpu: allocation failed, size=8 align=8 atomic=1, atomic alloc failed, no space left percpu: allocation failed, size=8 align=8 atomic=1, atomic alloc failed, no space left Let's preallocate rt->fib6_nh->rt6i_pcpu in ip6_route_info_create(). If something fails before the original memory allocation in fib6_nh_init(), ip6_route_info_create_nh() calls fib6_info_release(), which releases the preallocated per-cpu memory. Note that rt->fib6_nh->rt6i_pcpu is not preallocated when called via ipv6_stub, so we still need alloc_percpu_gfp() in fib6_nh_init(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250418000443.43734-8-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Split ip6_route_info_create().	Kuniyuki Iwashima
	We will get rid of RTNL from RTM_NEWROUTE and SIOCADDRT and rely on RCU to guarantee dev and nexthop lifetime. Then, we want to allocate as much as possible before entering the RCU section. The RCU section will start in the middle of ip6_route_info_create(), and this is problematic for ip6_route_multipath_add() that calls ip6_route_info_create() multiple times. Let's split ip6_route_info_create() into two parts; one for memory allocation and another for nexthop setup. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/20250418000443.43734-7-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Move nexthop_find_by_id() after fib6_info_alloc().	Kuniyuki Iwashima
	We will get rid of RTNL from RTM_NEWROUTE and SIOCADDRT. Then, we must perform two lookups for nexthop and dev under RCU to guarantee their lifetime. ip6_route_info_create() calls nexthop_find_by_id() first if RTA_NH_ID is specified, and then allocates struct fib6_info. nexthop_find_by_id() must be called under RCU, but we do not want to use GFP_ATOMIC for memory allocation here, which will be likely to fail in ip6_route_multipath_add(). Let's move nexthop_find_by_id() after the memory allocation so that we can later split ip6_route_info_create() into two parts: the sleepable part and the RCU part. Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/20250418000443.43734-6-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Check GATEWAY in rtm_to_fib6_multipath_config().	Kuniyuki Iwashima
	In ip6_route_multipath_add(), we call rt6_qualify_for_ecmp() for each entry. If it returns false, the request fails. rt6_qualify_for_ecmp() returns false if either of the conditions below is true: 1. f6i->fib6_flags has RTF_ADDRCONF 2. f6i->nh is not NULL 3. f6i->fib6_nh->fib_nh_gw_family is AF_UNSPEC 1 is unnecessary because rtm_to_fib6_config() never sets RTF_ADDRCONF to cfg->fc_flags. 2. is equivalent with cfg->fc_nh_id. 3. can be replaced by checking RTF_GATEWAY in the base and each multipath entry because AF_INET6 is set to f6i->fib6_nh->fib_nh_gw_family only when cfg.fc_is_fdb is true or RTF_GATEWAY is set, but the former is always false. These checks do not require RCU and can be done earlier. Let's perform the equivalent checks in rtm_to_fib6_multipath_config(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250418000443.43734-5-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Move some validation from ip6_route_info_create() to rtm_to_fib6_config().	Kuniyuki Iwashima
	ip6_route_info_create() is called from 3 functions: * ip6_route_add() * ip6_route_multipath_add() * addrconf_f6i_alloc() addrconf_f6i_alloc() does not need validation for struct fib6_config in ip6_route_info_create(). ip6_route_multipath_add() calls ip6_route_info_create() for multiple routes with slightly different fib6_config instances, which is copied from the base config passed from userspace. So, we need not validate the same config repeatedly. Let's move such validation into rtm_to_fib6_config(). Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://patch.msgid.link/20250418000443.43734-4-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-24	ipv6: Get rid of RTNL for SIOCDELRT and RTM_DELROUTE.	Kuniyuki Iwashima
	Basically, removing an IPv6 route does not require RTNL because the IPv6 routing tables are protected by per table lock. inet6_rtm_delroute() calls nexthop_find_by_id() to check if the nexthop specified by RTA_NH_ID exists. nexthop uses rbtree and the top-down walk can be safely performed under RCU. ip6_route_del() already relies on RCU and the table lock, but we need to extend the RCU critical section a bit more to cover __ip6_del_rt(). For example, nexthop_for_each_fib6_nh() and inet6_rt_notify() needs RCU. Let's call nexthop_find_by_id() and __ip6_del_rt() under RCU and get rid of RTNL from inet6_rtm_delroute() and SIOCDELRT. Even if the nexthop is removed after rcu_read_unlock() in inet6_rtm_delroute(), __remove_nexthop_fib() cleans up the routes tied to the nexthop, and ip6_route_del() returns -ESRCH. So the request was at least valid as of nexthop_find_by_id(), and it's just a matter of timing. Note that we need to pass false to lwtunnel_valid_encap_type_attr(). The following patches also use the newroute bool. Note also that fib6_get_table() does not require RCU because once allocated fib6_table is not freed until netns dismantle. I will post a follow-up series to convert such callers to RCU-lockless version. [0] Link: https://lore.kernel.org/netdev/20250417174557.65721-1-kuniyu@amazon.com/ #[0] Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250418000443.43734-3-kuniyu@amazon.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>