summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-04-22vxlan: Use linked list to traverse FDB entriesIdo Schimmel
In preparation for removing the fixed size hash table, convert FDB entry traversal to use the newly added FDB linked list. No functional changes intended. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250415121143.345227-9-idosch@nvidia.com Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-22vxlan: Add a linked list of FDB entriesIdo Schimmel
Currently, FDB entries are stored in a hash table with a fixed number of buckets. The table is used for both lookups and entry traversal. Subsequent patches will convert the table to rhashtable which is not suitable for entry traversal. In preparation for this conversion, add FDB entries to a linked list. Subsequent patches will convert the driver to use this list when traversing entries during dump, flush, etc. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250415121143.345227-8-idosch@nvidia.com Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-22vxlan: Use a single lock to protect the FDB tableIdo Schimmel
Currently, the VXLAN driver stores FDB entries in a hash table with a fixed number of buckets (256). Subsequent patches are going to convert this table to rhashtable with a linked list for entry traversal, as rhashtable is more scalable. In preparation for this conversion, move from a per-bucket spin lock to a single spin lock that protects the entire FDB table. The per-bucket spin locks were introduced by commit fe1e0713bbe8 ("vxlan: Use FDB_HASH_SIZE hash_locks to reduce contention") citing "huge contention when inserting/deleting vxlan_fdbs into the fdb_head". It is not clear from the commit message which code path was holding the spin lock for long periods of time, but the obvious suspect is the FDB cleanup routine (vxlan_cleanup()) that periodically traverses the entire table in order to delete aged-out entries. This will be solved by subsequent patches that will convert the FDB cleanup routine to traverse the linked list of FDB entries using RCU, only acquiring the spin lock when deleting an aged-out entry. The change reduces the size of the VXLAN device structure from 3600 bytes to 2576 bytes. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250415121143.345227-7-idosch@nvidia.com Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-22vxlan: Relocate assignment of default remote deviceIdo Schimmel
The default FDB entry can be associated with a net device if a physical device (i.e., 'dev PHYS_DEV') was specified during the creation of the VXLAN device. The assignment of the net device pointer to 'dst->remote_dev' logically belongs in the if block that resolves the pointer from the specified ifindex, so move it there. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250415121143.345227-6-idosch@nvidia.com Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-22vxlan: Unsplit default FDB entry creation and notificationIdo Schimmel
Commit 0241b836732f ("vxlan: fix default fdb entry netlink notify ordering during netdev create") split the creation of the default FDB entry from its notification to avoid sending a RTM_NEWNEIGH notification before RTM_NEWLINK. Previous patches restructured the code so that the default FDB entry is created after registering the VXLAN device and the notification about the new entry immediately follows its creation. Therefore, simplify the code and revert back to vxlan_fdb_update() which takes care of both creating the FDB entry and notifying user space about it. Hold the FDB hash lock when calling vxlan_fdb_update() like it expects. A subsequent patch will add a lockdep assertion to make sure this is indeed the case. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250415121143.345227-5-idosch@nvidia.com Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-22vxlan: Insert FDB into hash table in vxlan_fdb_create()Ido Schimmel
Commit 7c31e54aeee5 ("vxlan: do not destroy fdb if register_netdevice() is failed") split the insertion of FDB entries into the FDB hash table from the function where they are created. This was done in order to work around a problem that is no longer possible after the previous patch. Simplify the code and move the body of vxlan_fdb_insert() back into vxlan_fdb_create(). Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250415121143.345227-4-idosch@nvidia.com Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-22vxlan: Simplify creation of default FDB entryIdo Schimmel
There is asymmetry in how the default FDB entry (all-zeroes) is created and destroyed in the VXLAN driver. It is created as part of the driver's newlink() routine, but destroyed as part of its ndo_uninit() routine. This caused multiple problems in the past. First, commit 0241b836732f ("vxlan: fix default fdb entry netlink notify ordering during netdev create") split the notification about the entry from its creation so that it will not be notified to user space before the VXLAN device is registered. Then, commit 6db924687139 ("vxlan: Fix error path in __vxlan_dev_create()") made the error path in __vxlan_dev_create() asymmetric by destroying the FDB entry before unregistering the net device. Otherwise, the FDB entry would have been freed twice: By ndo_uninit() as part of unregister_netdevice() and by vxlan_fdb_destroy() in the error path. Finally, commit 7c31e54aeee5 ("vxlan: do not destroy fdb if register_netdevice() is failed") split the insertion of the FDB entry into the hash table from its creation, moving the insertion after the registration of the net device. Otherwise, like before, the FDB entry would have been freed twice: By ndo_uninit() as part of register_netdevice()'s error path and by vxlan_fdb_destroy() in the error path of __vxlan_dev_create(). The end result is that the code is unnecessarily complex. In addition, the fixed size hash table cannot be converted to rhashtable as vxlan_fdb_insert() cannot fail, which will no longer be true with rhashtable. Solve this by making the addition and deletion of the default FDB entry completely symmetric. Namely, as part of newlink() routine, create the entry, insert it into to the hash table and send a notification to user space after the net device was registered. Note that at this stage the net device is still administratively down and cannot transmit / receive packets. Move the deletion from ndo_uninit() to the dellink routine(): Flush the default entry together with all the other entries, before unregistering the net device. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250415121143.345227-3-idosch@nvidia.com Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-22vxlan: Add RCU read-side critical sections in the Tx pathIdo Schimmel
The Tx path does not run from an RCU read-side critical section which makes the current lockless accesses to FDB entries invalid. As far as I am aware, this has not been a problem in practice, but traces will be generated once we transition the FDB lookup to rhashtable_lookup(). Add rcu_read_{lock,unlock}() around the handling of FDB entries in the Tx path. Remove the RCU read-side critical section from vxlan_xmit_nh() as now the function is always called from an RCU read-side critical section. Reviewed-by: Petr Machata <petrm@nvidia.com> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250415121143.345227-2-idosch@nvidia.com Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-21Merge tag 'for-netdev' of ↵Jakub Kicinski
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Martin KaFai Lau says: ==================== pull-request: bpf-next 2025-04-17 We've added 12 non-merge commits during the last 9 day(s) which contain a total of 18 files changed, 1748 insertions(+), 19 deletions(-). The main changes are: 1) bpf qdisc support, from Amery Hung. A qdisc can be implemented in bpf struct_ops programs and can be used the same as other existing qdiscs in the "tc qdisc" command. 2) Add xsk tail adjustment tests, from Tushar Vyavahare. * tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: selftests/bpf: Test attaching bpf qdisc to mq and non root selftests/bpf: Add a bpf fq qdisc to selftest selftests/bpf: Add a basic fifo qdisc test libbpf: Support creating and destroying qdisc bpf: net_sched: Disable attaching bpf qdisc to non root bpf: net_sched: Support updating bstats bpf: net_sched: Add a qdisc watchdog timer bpf: net_sched: Add basic bpf qdisc kfuncs bpf: net_sched: Support implementation of Qdisc_ops in bpf bpf: Prepare to reuse get_ctx_arg_idx selftests/xsk: Add tail adjustment tests and support check selftests/xsk: Add packet stream replacement function ==================== Link: https://patch.msgid.link/20250417184338.3152168-1-martin.lau@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21Merge branch 'bnxt_en-update-for-net-next'Jakub Kicinski
Michael Chan says: ==================== bnxt_en: Update for net-next The first patch changes the FW message timeout threshold for a warning message. The second patch adjusts the ethtool -w coredump length to suppress a warning. The last 2 patches are small cleanup patches for the bnxt_ulp RoCE auxbus code. v1: https://lore.kernel.org/netdev/20250415174818.1088646-1-michael.chan@broadcom.com/ ==================== Link: https://patch.msgid.link/20250417172448.1206107-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21bnxt_en: Remove unused macros in bnxt_ulp.hKalesh AP
BNXT_ROCE_ULP and BNXT_MAX_ULP are no longer used. Remove them to clean up the code. Reviewed-by: Shruti Parab <shruti.parab@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250417172448.1206107-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21bnxt_en: Remove unused field "ref_count" in struct bnxt_ulpKalesh AP
The "ref_count" field in struct bnxt_ulp is unused after commit a43c26fa2e6c ("RDMA/bnxt_re: Remove the sriov config callback"). So we can just remove it now. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250417172448.1206107-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21bnxt_en: Report the ethtool coredump length after copying the coredumpShruti Parab
ethtool first calls .get_dump_flags() to get the dump length. For coredump, the driver calls the FW to get the coredump length (L1). The min. of L1 and the user specified length is then passed to .get_dump_data() (L2) to get the coredump. The actual coredump length retrieved by the FW (L3) during .get_dump_data() may be smaller than L1. This length discrepancy will trigger a WARN_ON() in ethtool_get_dump_data(). ethtool has already vzalloc'ed a buffer with size L1. Just report the coredump length as L2 even though the actual coredump length L3 may be smaller. The extra zero padding does not matter. This will prevent the warning that may alarm the user. For correctness, only do the final length update if there is no error. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Shruti Parab <shruti.parab@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250417172448.1206107-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21bnxt_en: Change FW message timeout warningMichael Chan
The firmware advertises a "hwrm_cmd_max_timeout" value to the driver for NVRAM and coredump related functions that can take tens of seconds to complete. The driver polls for the operation to complete under mutex and may trigger hung task watchdog warning if the wait is too long. To warn the user about this, the driver currently prints a warning if this advertised value exceeds 40 seconds: Device requests max timeout of %d seconds, may trigger hung task watchdog Initially, we chose 40 seconds, well below the kernel's default CONFIG_DEFAULT_HUNG_TASK_TIMEOUT (120 seconds) to avoid triggering the hung task watchdog. But 60 seconds is the timeout on most production FW and cannot be reduced further. Change the driver's warning threshold to 60 seconds to avoid triggering this warning on all production devices. We also print the warning if the value exceeds CONFIG_DEFAULT_HUNG_TASK_TIMEOUT which may be set to architecture specific defaults as low as 10 seconds. Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20250417172448.1206107-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21Merge branch 'net-stmmac-socfpga-fix-init-ordering-and-cleanups'Jakub Kicinski
Russell King says: ==================== net: stmmac: socfpga: fix init ordering and cleanups This series fixes the init ordering of the socfpga probe function. The standard rule is to do all setup before publishing any device, and socfpga violates that. I can see no reason for this, but these patches have not been tested on hardware. Address this by moving the initialisation of dwmac->stmmac_rst along with all the other dwmac initialisers - there's no reason for this to be late as plat_dat->stmmac_rst has already been populated. Next, replace the call to ops->set_phy_mode() with an init function socfpga_dwmac_init() which will then be linked in to plat_dat->init. Then, add this to plat_dat->init, and switch to stmmac_pltfr_pm_ops from the private ops. The runtime suspend/resume socfpga implementations are identical to the platform ones, but misses the noirq versions which this will add. Before we swap the order of socfpga_dwmac_init() and stmmac_dvr_probe(), we need to change the way the interface is obtained, as that uses driver data and the struct net_device which haven't been initialised. Save a pointer to plat_dat in the socfpga private data, and use that to get the interface mode. We can then swap the order of the init and probe functions. Finally, convert to devm_stmmac_pltfr_probe() by moving the call to ops->set_phy_mode() into an init function appropriately populating plat_dat->init. ==================== Link: https://patch.msgid.link/aAE2tKlImhwKySq_@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21net: stmmac: socfpga: convert to devm_stmmac_pltfr_probe()Russell King (Oracle)
Convert socfpga to use devm_stmmac_pltfr_probe() to further simplify the probe function, wrapping the call to the set_phy_mode() method into socfpga_dwmac_init() which can be called from the plat_dat->init() method. Also call this from socfpga_dwmac_resume() thereby simplifying that function. Using the devm variant also means we can remove the call to stmmac_pltfr_remove(). Unfortunately, we can't also convert to stmmac_pltfr_pm_ops as there is extra work done in socfpga_dwmac_resume(). Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/E1u5Sns-001IJw-OY@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21net: stmmac: socfpga: call set_phy_mode() before registrationRussell King (Oracle)
Initialisation/setup after registration is a bug. This is the second of two patches fixing this in socfpga. The set_phy_mode() functions do various hardware setup that would interfere with a netdev that has been published, and thus available to be opened by the kernel/userspace. However, set_phy_mode() relies upon the netdev having been initialised to get at the plat_stmmacenet_data structure, which is probably why it was placed after stmmac_drv_probe(). We can remove that need by storing a pointer to struct plat_stmmacenet_data in struct socfpga_dwmac. Move the call to set_phy_mode() before calling stmmac_dvr_probe(). This also simplifies the probe function as there is no need to unregister the netdev if set_phy_mode() fails. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/E1u5Snn-001IJq-L0@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21net: stmmac: socfpga: convert to stmmac_pltfr_pm_opsRussell King (Oracle)
Convert socfpga to use the generic stmmac_pltfr_pm_ops, which can be achieved by adding an appropriate plat_dat->init function to do the setup. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/E1u5Sni-001IJk-Gi@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21net: stmmac: socfpga: provide init functionRussell King (Oracle)
Both the resume and probe path needs to configure the phy mode, so provide a common function to do this which can later be hooked into plat_dat->init. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/E1u5Snd-001IJe-Cx@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21net: stmmac: socfpga: init dwmac->stmmac_rst before registrationRussell King (Oracle)
Initialisation/setup after registration is a bug. This is the first of two patches fixing this in socfpga. dwmac->stmmac_rst is initialised from the stmmac plat_dat's stmmac_rst member, which is itself initialised by devm_stmmac_probe_config_dt(). Therefore, this can be initialised before we call stmmac_dvr_probe(). Move it there. dwmac->stmmac_rst is used by the set_phy_mode() method. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/E1u5SnY-001IJY-90@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21Merge branch 'net-adopting-nlmsg_payload-final-series'Jakub Kicinski
Breno Leitao says: ==================== net: Adopting nlmsg_payload() (final series) This patchset marks the final step in converting users to the new nlmsg_payload() function. It addresses the last two files that were not converted in previous series, specifically updating the following functions: neigh_valid_dump_req rtnl_valid_dump_ifinfo_req rtnl_valid_getlink_req valid_fdb_get_strict valid_bridge_getlink_req rtnl_valid_stats_req rtnl_mdb_valid_dump_req I would like to extend a big thank you to Kuniyuki Iwashima for his invaluable help and review of this effort. ==================== Link: https://patch.msgid.link/20250417-nlmsg_v3-v1-0-9b09d9d7e61d@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21net: Use nlmsg_payload in rtnetlink fileBreno Leitao
Leverage the new nlmsg_payload() helper to avoid checking for message size and then reading the nlmsg data. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250417-nlmsg_v3-v1-2-9b09d9d7e61d@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21net: Use nlmsg_payload in neighbour fileBreno Leitao
Leverage the new nlmsg_payload() helper to avoid checking for message size and then reading the nlmsg data. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250417-nlmsg_v3-v1-1-9b09d9d7e61d@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21s390: ism: Pass string literal as format argument of dev_set_name()Simon Horman
GCC 14.2.0 reports that passing a non-string literal as the format argument of dev_set_name() is potentially insecure. drivers/s390/net/ism_drv.c: In function 'ism_probe': drivers/s390/net/ism_drv.c:615:2: warning: format not a string literal and no format arguments [-Wformat-security] 615 | dev_set_name(&ism->dev, dev_name(&pdev->dev)); | ^~~~~~~~~~~~ It seems to me that as pdev is a PCIE device then the dev_name call above should always return the device's BDF, e.g. 00:12.0. That this should not contain format escape sequences. And thus the current usage is safe. But, it seems better to be safe than sorry. And, as a bonus, compiler output becomes less verbose by addressing this issue. Compile tested only. No functional change intended. Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250417-ism-str-fmt-v1-1-9818b029874d@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21net/mlx5: Fix spelling mistakes in mlx5_core_dbg message and commentsColin Ian King
There is a spelling mistake in a mlx5_core_dbg and two spelling mistakes in comment blocks. Fix them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250418135703.542722-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21net: axienet: Fix spelling mistake "archecture" -> "architecture"Colin Ian King
There is a spelling mistake in a dev_error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://patch.msgid.link/20250418112447.533746-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21tools: ynl: add missing header depsJakub Kicinski
Various new families and my recent work on rtnetlink missed adding dependencies on C headers. If the system headers are up to date or don't include a given header at all this doesn't make a difference. But if the system headers are in place but stale - compilation will break. Reported-by: Kory Maincent <kory.maincent@bootlin.com> Fixes: 29d34a4d785b ("tools: ynl: generate code for rt-addr and add a sample") Link: https://lore.kernel.org/20250418190431.69c10431@kmaincent-XPS-13-7390 Acked-by: Stanislav Fomichev <sdf@fomichev.me> Tested-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250418234942.2344036-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: add UAPI to the header guard in various network headersJakub Kicinski
fib_rule, ip6_tunnel, and a whole lot of if_* headers lack the customary _UAPI in the header guard. Without it YNL build can't protect from in tree and system headers both getting included. YNL doesn't need most of these but it's annoying to have to fix them one by one. Note that header installation strips this _UAPI prefix so this should result in no change to the end user. Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250416200840.1338195-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17trace: tcp: Add const qualifier to skb parameter in tcp_probe eventBreno Leitao
Change the tcp_probe tracepoint to accept a const struct sk_buff parameter instead of a non-const one. This improves type safety and better reflects that the skb is not modified within the tracepoint implementation. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250416-tcp_probe-v1-1-1edc3c5a1cb8@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: Delete the outer () duplicated of macro SOCK_SKB_CB_OFFSET definitionZijun Hu
For macro SOCK_SKB_CB_OFFSET definition, Delete the outer () duplicated. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250416-fix_net-v1-1-d544c9f3f169@quicinc.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: stmmac: mediatek: stop initialising plat->mac_interfaceRussell King (Oracle)
Mediatek doesn't make use of mac_interface, and none of the in-tree DT files use the mac-mode property. Therefore, mac_interface already follows phy_interface. Remove this unnecessary assignment. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1u4zyh-000xVE-PG@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: stmmac: dwc-qos: use PHY clock-stop capabilityRussell King (Oracle)
Use the PHY clock-stop capability when programming the MAC LPI mode, which allows the transmit clock to the PHY to be gated. Tested on the Jetson Xavier NX platform. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1u4zi1-000xHh-57@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17netdev: fix the locking for netdev notificationsJakub Kicinski
Kuniyuki reports that the assert for netdev lock fires when there are netdev event listeners (otherwise we skip the netlink event generation). Correct the locking when coming from the notifier. The NETDEV_XDP_FEAT_CHANGE notifier is already fully locked, it's the documentation that's incorrect. Fixes: 99e44f39a8f7 ("netdev: depend on netdev->lock for xdp features") Reported-by: syzkaller <syzkaller@googlegroups.com> Reported-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/20250410171019.62128-1-kuniyu@amazon.com Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250416030447.1077551-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net/mlx5e: ethtool: Fix formatting of ptp_rq0_csum_complete_tail_slowKees Cook
The new GCC 15 warning -Wunterminated-string-initialization reports: In file included from drivers/net/ethernet/mellanox/mlx5/core/en.h:55, from drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:34: drivers/net/ethernet/mellanox/mlx5/core/en_stats.h:57:46: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 57 | #define MLX5E_DECLARE_PTP_RQ_STAT(type, fld) "ptp_rq%d_"#fld, offsetof(type, fld) | ^~~~~~~~~~~ drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:2279:11: note: in expansion of macro 'MLX5E_DECLARE_PTP_RQ_STAT' 2279 | { MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, csum_complete_tail_slow) }, | ^~~~~~~~~~~~~~~~~~~~~~~~~ This stat string is being used in ethtool_sprintf(), so it must be a valid NUL-terminated string. Currently the string lacks the final NUL byte (as GCC warns), but by absolute luck, the next byte in memory is a space (decimal 32) followed by a NUL. "format" is immediately followed by little-endian size_t: struct counter_desc { char format[32]; /* 0 32 */ size_t offset; /* 32 8 */ }; The "offset" member is populated by the stats member offset: #define MLX5E_DECLARE_PTP_RQ_STAT(type, fld) "ptp_rq%d_"#fld, offsetof(type, fld) which for this struct mlx5e_rq_stats member, csum_complete_tail_slow, is 32, or space, and then the rest of the "offset" bytes are NULs. struct mlx5e_rq_stats { ... u64 csum_complete_tail_slow; /* 32 8 */ The use of vsnprintf(), within ethtool_sprintf(), reads past the end of "format" and sees the format string as "ptp_rq%d_csum_complete_tail_slow ", with %d getting resolved by MLX5E_PTP_CHANNEL_IX (value 0): ethtool_sprintf(data, ptp_rq_stats_desc[i].format, MLX5E_PTP_CHANNEL_IX); With an output result of "ptp_rq0_csum_complete_tail_slow", which gets precisely truncated to 31 characters with a trailing NUL. So, instead of accidentally getting this correct due to the NUL bytes at the end of the size_t that happens to follow the format string, just make the string initializer 1 byte shorter by replacing "%d" with "0", since MLX5E_PTP_CHANNEL_IX is already hard-coded. This results in no initializer truncation and no need to call sprintf(). Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Link: https://patch.msgid.link/20250416020109.work.297-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: ethtool: Adjust exactly ETH_GSTRING_LEN-long stats to use memcpyKees Cook
Many drivers populate the stats buffer using C-String based APIs (e.g. ethtool_sprintf() and ethtool_puts()), usually when building up the list of stats individually (i.e. with a for() loop). This, however, requires that the source strings be populated in such a way as to have a terminating NUL byte in the source. Other drivers populate the stats buffer directly using one big memcpy() of an entire array of strings. No NUL termination is needed here, as the bytes are being directly passed through. Yet others will build up the stats buffer individually, but also use memcpy(). This, too, does not need NUL termination of the source strings. However, there are cases where the strings that populate the source stats strings are exactly ETH_GSTRING_LEN long, and GCC 15's -Wunterminated-string-initialization option complains that the trailing NUL byte has been truncated. This situation is fine only if the driver is using the memcpy() approach. If the C-String APIs are used, the destination string name will have its final byte truncated by the required trailing NUL byte applied by the C-string API. For drivers that are already using memcpy() but have initializers that truncate the NUL terminator, mark their source strings as __nonstring to silence the GCC warnings. For drivers that have initializers that truncate the NUL terminator and are using the C-String APIs, switch to memcpy() to avoid destination string truncation and mark their source strings as __nonstring to silence the GCC warnings. (Also introduce ethtool_cpy() as a helper to make this an easy replacement). Specifically the following warnings were investigated and addressed: ../drivers/net/ethernet/chelsio/cxgb/cxgb2.c:364:9: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 364 | "TxFramesAbortedDueToXSCollisions", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/freescale/enetc/enetc_ethtool.c:165:33: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 165 | { ENETC_PM_R1523X(0), "MAC rx 1523 to max-octet packets" }, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/freescale/enetc/enetc_ethtool.c:190:33: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 190 | { ENETC_PM_T1523X(0), "MAC tx 1523 to max-octet packets" }, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/google/gve/gve_ethtool.c:76:9: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 76 | "adminq_dcfg_device_resources_cnt", "adminq_set_driver_parameter_cnt", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:117:53: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 117 | STMMAC_STAT(ptp_rx_msg_type_pdelay_follow_up), | ^ ../drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:46:12: note: in definition of macro 'STMMAC_STAT' 46 | { #m, sizeof_field(struct stmmac_extra_stats, m), \ | ^ ../drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c:328:24: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 328 | .str = "a_mac_control_frames_transmitted", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c:340:24: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 340 | .str = "a_pause_mac_ctrl_frames_received", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Petr Machata <petrm@nvidia.com> # for mlxsw Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250416010210.work.904-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17r8169: add RTL_GIGA_MAC_VER_LAST to facilitate adding support for new chip ↵Heiner Kallweit
versions Add a new mac_version enum value RTL_GIGA_MAC_VER_LAST. Benefit is that when adding support for a new chip version we have to touch less code, except something changes fundamentally. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/06991f47-2aec-4aa2-8918-2c6e79332303@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17r8169: refactor chip version detectionHeiner Kallweit
Refactor chip version detection and merge both configuration tables. Apart from reducing the code by a third, this paves the way for merging chip version handling if only difference is the firmware. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/1fea533a-dd5a-4198-a9e2-895e11083947@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17Merge branch 'net-stmmac-sunxi-cleanups'Jakub Kicinski
Russell King says: ==================== net: stmmac: sunxi cleanups This series cleans up the sunxi (sun7i) code in two ways: 1. it converts to use the new set_clk_tx_rate() method, even though we don't use clk_tx_i. In doing so, I reformat the function to read better, but with no changes to the code. 2. convert from stmmac_dvr_probe() to stmmac_pltfr_probe(), and then to its devm variant, which allows code simplification. ==================== Link: https://patch.msgid.link/Z_5WT_jOBgubjWQg@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: stmmac: sunxi: use devm_stmmac_pltfr_probe()Russell King (Oracle)
Using devm_stmmac_pltfr_probe() simplifies the probe function. This will not only call plat_dat->init (sun7i_dwmac_init), but also plat_dat->exit (sun7i_dwmac_exit) appropriately if stmmac_dvr_probe() fails. This results in an overall simplification of the glue driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u4fre-000nMr-FT@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: stmmac: sunxi: use stmmac_pltfr_probe()Russell King (Oracle)
Rather than open-coding the calls to sun7i_gmac_init() and sun7i_gmac_exit() in the probe function, use stmmac_pltfr_probe() which will automatically call the plat_dat->init() and plat_dat->exit() methods appropriately. This simplifies the code. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u4frZ-000nMl-BB@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: stmmac: sunxi: convert to set_clk_tx_rate()Russell King (Oracle)
Convert sunxi to use the set_clk_tx_rate() callback rather than the fix_mac_speed() callback. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u4frU-000nMf-6o@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.15-rc3). No conflicts. Adjacent changes: tools/net/ynl/pyynl/ynl_gen_c.py 4d07bbf2d456 ("tools: ynl-gen: don't declare loop iterator in place") 7e8ba0c7de2b ("tools: ynl: don't use genlmsghdr in classic netlink") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17Merge tag 'net-6.15-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Bluetooth, CAN and Netfilter. Current release - regressions: - two fixes for the netdev per-instance locking - batman-adv: fix double-hold of meshif when getting enabled Current release - new code bugs: - Bluetooth: increment TX timestamping tskey always for stream sockets - wifi: static analysis and build fixes for the new Intel sub-driver Previous releases - regressions: - net: fib_rules: fix iif / oif matching on L3 master (VRF) device - ipv6: add exception routes to GC list in rt6_insert_exception() - netfilter: conntrack: fix erroneous removal of offload bit - Bluetooth: - fix sending MGMT_EV_DEVICE_FOUND for invalid address - l2cap: process valid commands in too long frame - btnxpuart: Revert baudrate change in nxp_shutdown Previous releases - always broken: - ethtool: fix memory corruption during SFP FW flashing - eth: - hibmcge: fixes for link and MTU handling, pause frames etc - igc: fixes for PTM (PCIe timestamping) - dsa: b53: enable BPDU reception for management port Misc: - fixes for Netlink protocol schemas" * tag 'net-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits) net: ethernet: mtk_eth_soc: revise QDMA packet scheduler settings net: ethernet: mtk_eth_soc: correct the max weight of the queue limit for 100Mbps net: ethernet: mtk_eth_soc: reapply mdc divider on reset net: ti: icss-iep: Fix possible NULL pointer dereference for perout request net: ti: icssg-prueth: Fix possible NULL pointer dereference inside emac_xmit_xdp_frame() net: ti: icssg-prueth: Fix kernel warning while bringing down network interface netfilter: conntrack: fix erronous removal of offload bit net: don't try to ops lock uninitialized devs ptp: ocp: fix start time alignment in ptp_ocp_signal_set net: dsa: avoid refcount warnings when ds->ops->tag_8021q_vlan_del() fails net: dsa: free routing table on probe failure net: dsa: clean up FDB, MDB, VLAN entries on unbind net: dsa: mv88e6xxx: fix -ENOENT when deleting VLANs and MST is unsupported net: dsa: mv88e6xxx: avoid unregistering devlink regions which were never registered net: txgbe: fix memory leak in txgbe_probe() error path net: bridge: switchdev: do not notify new brentries as changed net: b53: enable BPDU reception for management port netlink: specs: rt-neigh: prefix struct nfmsg members with ndm netlink: specs: rt-link: adjust mctp attribute naming netlink: specs: rtnetlink: attribute naming corrections ...
2025-04-17Merge branch 'bpf-qdisc'Martin KaFai Lau
Amery Hung says: ==================== bpf qdisc Hi all, This patchset aims to support implementing qdisc using bpf struct_ops. This version takes a step back and only implements the minimum support for bpf qdisc. 1) support of adding skb to bpf_list and bpf_rbtree directly and 2) classful qdisc are deferred to future patchsets. In addition, we only allow attaching bpf qdisc to root or mq for now. This is to prevent accidentally breaking exisiting classful qdiscs that rely on data in a child qdisc. This limit may be lifted in the future after careful inspection. * Overview * This series supports implementing qdisc using bpf struct_ops. bpf qdisc aims to be a flexible and easy-to-use infrastructure that allows users to quickly experiment with different scheduling algorithms/policies. It only requires users to implement core qdisc logic using bpf and implements the mundane part for them. In addition, the ability to easily communicate between qdisc and other components will also bring new opportunities for new applications and optimizations. * Performance of bpf qdisc * This patchset includes two qdisc examples, bpf_fifo and bpf_fq, for __testing__ purposes. For performance test, we compare selftests and their kernel counterparts to give you a sense of the performance of qdisc implemented in bpf. The implementation of bpf_fq is fairly complex and slightly different from fq so later we only compare the two fifo qdiscs. bpf_fq implements a scheduling algorithm similar to fq before commit 29f834aa326e ("net_sched: sch_fq: add 3 bands and WRR scheduling") was introduced. bpf_fifo uses a single bpf_list as a queue instead of three queues for different priorities in pfifo_fast. The time complexity of fifo however should be similar since the queue selection time is negligible. Test setup: client -> qdisc -------------> server ~~~~~~~~~~~~~~~ ~~~~~~ nested VM1 @ DC1 VM2 @ DC2 Throghput: iperf3 -t 600, 5 times Qdisc Average (GBits/sec) ---------- ------------------- pfifo_fast 12.52 ± 0.26 bpf_fifo 11.72 ± 0.32 fq 10.24 ± 0.13 bpf_fq 11.92 ± 0.64 Latency: sockperf pp --tcp -t 600, 5 times Qdisc Average (usec) ---------- -------------- pfifo_fast 244.58 ± 7.93 bpf_fifo 244.92 ± 15.22 fq 234.30 ± 19.25 bpf_fq 221.34 ± 10.76 Looking at the two fifo qdiscs, the 6.4% drop in throughput in the bpf implementatioin is consistent with previous observation (v8 throughput test on a loopback device). This should be able to be mitigated by supporting adding skb to bpf_list or bpf_rbtree directly in the future. * Clean up skb in bpf qdisc during reset * The current implementation relies on bpf qdisc implementors to correctly release skbs in queues (bpf graphs or maps) in .reset, which might not be a safe thing to do. The solution as Martin has suggested would be supporting private data in struct_ops. This can also help simplifying implementation of qdisc that works with mq. For examples, qdiscs in the selftest mostly use global data. Therefore, even if user add multiple qdisc instances under mq, they would still share the same queue. ==================== Link: https://patch.msgid.link/20250409214606.2000194-1-ameryhung@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2025-04-17selftests/bpf: Test attaching bpf qdisc to mq and non rootAmery Hung
Until we are certain that existing classful qdiscs work with bpf qdisc, make sure we don't allow attaching a bpf qdisc to non root. Meanwhile, attaching to mq is allowed. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-11-ameryhung@gmail.com
2025-04-17selftests/bpf: Add a bpf fq qdisc to selftestAmery Hung
This test implements a more sophisticated qdisc using bpf. The bpf fair- queueing (fq) qdisc gives each flow an equal chance to transmit data. It also respects the timestamp of skb for rate limiting. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-10-ameryhung@gmail.com
2025-04-17selftests/bpf: Add a basic fifo qdisc testAmery Hung
This selftest includes a bare minimum fifo qdisc, which simply enqueues sk_buffs into the back of a bpf list and dequeues from the front of the list. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-9-ameryhung@gmail.com
2025-04-17libbpf: Support creating and destroying qdiscAmery Hung
Extend struct bpf_tc_hook with handle, qdisc name and a new attach type, BPF_TC_QDISC, to allow users to add or remove any qdisc specified in addition to clsact. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-8-ameryhung@gmail.com
2025-04-17bpf: net_sched: Disable attaching bpf qdisc to non rootAmery Hung
Do not allow users to attach bpf qdiscs to classful qdiscs. This is to prevent accidentally breaking existings classful qdiscs if they rely on some data in the child qdisc. This restriction can potentially be lifted in the future. Note that, we still allow bpf qdisc to be attached to mq. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-7-ameryhung@gmail.com
2025-04-17bpf: net_sched: Support updating bstatsAmery Hung
Add a kfunc to update Qdisc bstats when an skb is dequeued. The kfunc is only available in .dequeue programs. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-6-ameryhung@gmail.com