2025-04-21Merge branch 'net-adopting-nlmsg_payload-final-series'Jakub Kicinski
Breno Leitao says:

====================
net: Adopting nlmsg_payload() (final series)

This patchset marks the final step in converting users to the new nlmsg_payload() function. It addresses the last two files that were not converted in previous series, specifically updating the following functions:

  neigh_valid_dump_req
  rtnl_valid_dump_ifinfo_req
  rtnl_valid_getlink_req
  valid_fdb_get_strict
  valid_bridge_getlink_req
  rtnl_valid_stats_req
  rtnl_mdb_valid_dump_req

I would like to extend a big thank you to Kuniyuki Iwashima for his invaluable help and review of this effort.
====================

Link: https://patch.msgid.link/20250417-nlmsg_v3-v1-0-9b09d9d7e61d@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21net: Use nlmsg_payload in rtnetlink fileBreno Leitao
Leverage the new nlmsg_payload() helper to avoid checking for message size and then reading the nlmsg data. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250417-nlmsg_v3-v1-2-9b09d9d7e61d@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
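To illustrate what "avoid checking for message size and then reading the nlmsg data" means, here is a minimal userspace sketch of the pattern. It is not the kernel implementation: struct demo_nlmsghdr, DEMO_HDRLEN and demo_nlmsg_payload() are hypothetical stand-ins, and the 4-byte alignment is an assumption modelled on netlink's usual header alignment.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct nlmsghdr; nlmsg_len covers header + payload. */
struct demo_nlmsghdr {
	uint32_t nlmsg_len;
	uint16_t nlmsg_type;
	uint16_t nlmsg_flags;
	uint32_t nlmsg_seq;
	uint32_t nlmsg_pid;
};

/* Header size rounded up to 4 bytes (assumed alignment for this sketch). */
#define DEMO_HDRLEN ((sizeof(struct demo_nlmsghdr) + 3) & ~(size_t)3)

/*
 * One call replaces the two-step "is the message long enough?" check
 * followed by "fetch the data pointer": the caller gets the payload,
 * or NULL if the message cannot hold `len` bytes of payload.
 */
static inline void *demo_nlmsg_payload(struct demo_nlmsghdr *nlh, size_t len)
{
	if (nlh->nlmsg_len < DEMO_HDRLEN + len)
		return NULL;
	return (unsigned char *)nlh + DEMO_HDRLEN;
}
```

A caller then only needs a single NULL check on the returned pointer before using the payload, which is what removes the separate length test from each validation function.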
2025-04-21net: Use nlmsg_payload in neighbour fileBreno Leitao
Leverage the new nlmsg_payload() helper to avoid checking for message size and then reading the nlmsg data. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250417-nlmsg_v3-v1-1-9b09d9d7e61d@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21s390: ism: Pass string literal as format argument of dev_set_name()Simon Horman
GCC 14.2.0 reports that passing a non-string literal as the format argument of dev_set_name() is potentially insecure.

drivers/s390/net/ism_drv.c: In function 'ism_probe':
drivers/s390/net/ism_drv.c:615:2: warning: format not a string literal and no format arguments [-Wformat-security]
  615 | dev_set_name(&ism->dev, dev_name(&pdev->dev));
      | ^~~~~~~~~~~~

It seems to me that, as pdev is a PCIe device, the dev_name() call above should always return the device's BDF, e.g. 00:12.0, which should not contain format escape sequences. Thus the current usage is safe. But it seems better to be safe than sorry. And, as a bonus, compiler output becomes less verbose by addressing this issue.

Compile tested only. No functional change intended.

Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250417-ism-str-fmt-v1-1-9818b029874d@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
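The fix pattern described here (pass a literal "%s" and hand the name over as an argument) can be sketched in plain C. demo_set_name() is a hypothetical printf-style stand-in for dev_set_name(), built on vsnprintf(); it is not the kernel function.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical printf-style setter, standing in for dev_set_name(). */
static inline int demo_set_name(char *dst, size_t size, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = vsnprintf(dst, size, fmt, ap);
	va_end(ap);
	return ret;
}

/*
 * Safe pattern: the format is a string literal and the name is plain data,
 * so any '%' characters inside the name are copied rather than interpreted.
 */
static inline int demo_set_name_from(char *dst, size_t size, const char *name)
{
	return demo_set_name(dst, size, "%s", name);
}
```

With this shape, a name such as "%d%s" ends up in the buffer byte for byte, whereas passing it directly as the format would make the formatter look for arguments that were never supplied.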
2025-04-21net/mlx5: Fix spelling mistakes in mlx5_core_dbg message and commentsColin Ian King
There is a spelling mistake in a mlx5_core_dbg and two spelling mistakes in comment blocks. Fix them. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Mark Bloch <mbloch@nvidia.com> Link: https://patch.msgid.link/20250418135703.542722-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21net: axienet: Fix spelling mistake "archecture" -> "architecture"Colin Ian King
There is a spelling mistake in a dev_error message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://patch.msgid.link/20250418112447.533746-1-colin.i.king@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-21tools: ynl: add missing header depsJakub Kicinski
Various new families and my recent work on rtnetlink missed adding dependencies on C headers. If the system headers are up to date or don't include a given header at all this doesn't make a difference. But if the system headers are in place but stale - compilation will break. Reported-by: Kory Maincent <kory.maincent@bootlin.com> Fixes: 29d34a4d785b ("tools: ynl: generate code for rt-addr and add a sample") Link: https://lore.kernel.org/20250418190431.69c10431@kmaincent-XPS-13-7390 Acked-by: Stanislav Fomichev <sdf@fomichev.me> Tested-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20250418234942.2344036-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: add UAPI to the header guard in various network headersJakub Kicinski
fib_rule, ip6_tunnel, and a whole lot of if_* headers lack the customary _UAPI in the header guard. Without it YNL build can't protect from in tree and system headers both getting included. YNL doesn't need most of these but it's annoying to have to fix them one by one. Note that header installation strips this _UAPI prefix so this should result in no change to the end user. Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250416200840.1338195-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
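A small sketch of why the guard name matters. It assumes (this is an assumption, not something stated in the commit message) that a build blocks the installed copy of a header by pre-defining that copy's guard, and then pulls in the in-tree copy. All names below are hypothetical, and the two header bodies are inlined so the example is a single file.

```c
/* Step 1: pre-define the guard the installed (system) copy would use. */
#define _LINUX_IF_DEMO_H

/* Step 2: in-tree copy. Its guard carries the _UAPI prefix, so it is parsed. */
#ifndef _UAPI_LINUX_IF_DEMO_H
#define _UAPI_LINUX_IF_DEMO_H
#define IF_DEMO_SOURCE 2	/* 2 = definitions came from the in-tree header */
#endif

/* Step 3: installed copy. Its guard has the prefix stripped, so it is skipped. */
#ifndef _LINUX_IF_DEMO_H
#define _LINUX_IF_DEMO_H
#define IF_DEMO_SOURCE 1	/* 1 = definitions came from the system header */
#endif

/* Reports which copy supplied the definitions. */
static inline int if_demo_source(void)
{
	return IF_DEMO_SOURCE;
}
```

If the in-tree header used the un-prefixed guard as well, step 1 would suppress both copies and nothing would be defined, which is the situation the _UAPI prefix avoids in this sketch.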
2025-04-17trace: tcp: Add const qualifier to skb parameter in tcp_probe eventBreno Leitao
Change the tcp_probe tracepoint to accept a const struct sk_buff parameter instead of a non-const one. This improves type safety and better reflects that the skb is not modified within the tracepoint implementation. Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250416-tcp_probe-v1-1-1edc3c5a1cb8@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: Delete the outer () duplicated of macro SOCK_SKB_CB_OFFSET definitionZijun Hu
Delete the duplicated outer parentheses in the definition of the SOCK_SKB_CB_OFFSET macro. Signed-off-by: Zijun Hu <quic_zijuhu@quicinc.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://patch.msgid.link/20250416-fix_net-v1-1-d544c9f3f169@quicinc.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: stmmac: mediatek: stop initialising plat->mac_interfaceRussell King (Oracle)
Mediatek doesn't make use of mac_interface, and none of the in-tree DT files use the mac-mode property. Therefore, mac_interface already follows phy_interface. Remove this unnecessary assignment. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1u4zyh-000xVE-PG@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: stmmac: dwc-qos: use PHY clock-stop capabilityRussell King (Oracle)
Use the PHY clock-stop capability when programming the MAC LPI mode, which allows the transmit clock to the PHY to be gated. Tested on the Jetson Xavier NX platform. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/E1u4zi1-000xHh-57@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17netdev: fix the locking for netdev notificationsJakub Kicinski
Kuniyuki reports that the assert for netdev lock fires when there are netdev event listeners (otherwise we skip the netlink event generation). Correct the locking when coming from the notifier. The NETDEV_XDP_FEAT_CHANGE notifier is already fully locked, it's the documentation that's incorrect. Fixes: 99e44f39a8f7 ("netdev: depend on netdev->lock for xdp features") Reported-by: syzkaller <syzkaller@googlegroups.com> Reported-by: Kuniyuki Iwashima <kuniyu@amazon.com> Link: https://lore.kernel.org/20250410171019.62128-1-kuniyu@amazon.com Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250416030447.1077551-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net/mlx5e: ethtool: Fix formatting of ptp_rq0_csum_complete_tail_slowKees Cook
The new GCC 15 warning -Wunterminated-string-initialization reports: In file included from drivers/net/ethernet/mellanox/mlx5/core/en.h:55, from drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:34: drivers/net/ethernet/mellanox/mlx5/core/en_stats.h:57:46: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 57 | #define MLX5E_DECLARE_PTP_RQ_STAT(type, fld) "ptp_rq%d_"#fld, offsetof(type, fld) | ^~~~~~~~~~~ drivers/net/ethernet/mellanox/mlx5/core/en_stats.c:2279:11: note: in expansion of macro 'MLX5E_DECLARE_PTP_RQ_STAT' 2279 | { MLX5E_DECLARE_PTP_RQ_STAT(struct mlx5e_rq_stats, csum_complete_tail_slow) }, | ^~~~~~~~~~~~~~~~~~~~~~~~~ This stat string is being used in ethtool_sprintf(), so it must be a valid NUL-terminated string. Currently the string lacks the final NUL byte (as GCC warns), but by absolute luck, the next byte in memory is a space (decimal 32) followed by a NUL. "format" is immediately followed by little-endian size_t: struct counter_desc { char format[32]; /* 0 32 */ size_t offset; /* 32 8 */ }; The "offset" member is populated by the stats member offset: #define MLX5E_DECLARE_PTP_RQ_STAT(type, fld) "ptp_rq%d_"#fld, offsetof(type, fld) which for this struct mlx5e_rq_stats member, csum_complete_tail_slow, is 32, or space, and then the rest of the "offset" bytes are NULs. struct mlx5e_rq_stats { ... u64 csum_complete_tail_slow; /* 32 8 */ The use of vsnprintf(), within ethtool_sprintf(), reads past the end of "format" and sees the format string as "ptp_rq%d_csum_complete_tail_slow ", with %d getting resolved by MLX5E_PTP_CHANNEL_IX (value 0): ethtool_sprintf(data, ptp_rq_stats_desc[i].format, MLX5E_PTP_CHANNEL_IX); With an output result of "ptp_rq0_csum_complete_tail_slow", which gets precisely truncated to 31 characters with a trailing NUL. 
So, instead of accidentally getting this correct due to the NUL bytes at the end of the size_t that happens to follow the format string, just make the string initializer 1 byte shorter by replacing "%d" with "0", since MLX5E_PTP_CHANNEL_IX is already hard-coded. This results in no initializer truncation and no need to call sprintf(). Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Link: https://patch.msgid.link/20250416020109.work.297-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
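The byte counting behind this fix can be checked at compile time. The sketch below uses a hypothetical demo_counter_desc that mirrors the 32-byte format[] plus size_t layout quoted in the message; only the two string literals are taken from the commit text.

```c
#include <stddef.h>

#define DEMO_GSTRING_LEN 32	/* same size as format[] in the quoted struct */

struct demo_counter_desc {
	char format[DEMO_GSTRING_LEN];
	size_t offset;
};

/* sizeof on a string literal includes the terminating NUL byte. */
_Static_assert(sizeof("ptp_rq%d_csum_complete_tail_slow") == 33,
	       "the %d form needs 33 bytes and cannot fit with its NUL");
_Static_assert(sizeof("ptp_rq0_csum_complete_tail_slow") == 32,
	       "the hard-coded 0 form fits exactly, NUL included");

static const struct demo_counter_desc demo_desc = {
	.format = "ptp_rq0_csum_complete_tail_slow",
	.offset = 32,
};

/* True when the last byte of format[] is the NUL terminator. */
static inline int demo_format_is_terminated(void)
{
	return demo_desc.format[DEMO_GSTRING_LEN - 1] == '\0';
}
```

Replacing the two-character "%d" with the single character "0" is what frees the one byte needed for the terminator.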
2025-04-17net: ethtool: Adjust exactly ETH_GSTRING_LEN-long stats to use memcpyKees Cook
Many drivers populate the stats buffer using C-String based APIs (e.g. ethtool_sprintf() and ethtool_puts()), usually when building up the list of stats individually (i.e. with a for() loop). This, however, requires that the source strings be populated in such a way as to have a terminating NUL byte in the source. Other drivers populate the stats buffer directly using one big memcpy() of an entire array of strings. No NUL termination is needed here, as the bytes are being directly passed through. Yet others will build up the stats buffer individually, but also use memcpy(). This, too, does not need NUL termination of the source strings. However, there are cases where the strings that populate the source stats strings are exactly ETH_GSTRING_LEN long, and GCC 15's -Wunterminated-string-initialization option complains that the trailing NUL byte has been truncated. This situation is fine only if the driver is using the memcpy() approach. If the C-String APIs are used, the destination string name will have its final byte truncated by the required trailing NUL byte applied by the C-string API. For drivers that are already using memcpy() but have initializers that truncate the NUL terminator, mark their source strings as __nonstring to silence the GCC warnings. For drivers that have initializers that truncate the NUL terminator and are using the C-String APIs, switch to memcpy() to avoid destination string truncation and mark their source strings as __nonstring to silence the GCC warnings. (Also introduce ethtool_cpy() as a helper to make this an easy replacement). 
Specifically the following warnings were investigated and addressed: ../drivers/net/ethernet/chelsio/cxgb/cxgb2.c:364:9: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 364 | "TxFramesAbortedDueToXSCollisions", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/freescale/enetc/enetc_ethtool.c:165:33: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 165 | { ENETC_PM_R1523X(0), "MAC rx 1523 to max-octet packets" }, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/freescale/enetc/enetc_ethtool.c:190:33: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 190 | { ENETC_PM_T1523X(0), "MAC tx 1523 to max-octet packets" }, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/google/gve/gve_ethtool.c:76:9: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 76 | "adminq_dcfg_device_resources_cnt", "adminq_set_driver_parameter_cnt", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:117:53: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 117 | STMMAC_STAT(ptp_rx_msg_type_pdelay_follow_up), | ^ ../drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c:46:12: note: in definition of macro 'STMMAC_STAT' 46 | { #m, sizeof_field(struct stmmac_extra_stats, m), \ | ^ ../drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c:328:24: warning: 
initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 328 | .str = "a_mac_control_frames_transmitted", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c:340:24: warning: initializer-string for array of 'char' truncates NUL terminator but destination lacks 'nonstring' attribute (33 chars into 32 available) [-Wunterminated-string-initialization] 340 | .str = "a_pause_mac_ctrl_frames_received", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Signed-off-by: Kees Cook <kees@kernel.org> Reviewed-by: Petr Machata <petrm@nvidia.com> # for mlxsw Reviewed-by: Harshitha Ramamurthy <hramamurthy@google.com> Link: https://patch.msgid.link/20250416010210.work.904-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
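The difference between the C-string path and the memcpy() path for a name that is exactly 32 bytes long can be sketched as follows. The helper names are hypothetical; demo_copy_fixed() only illustrates the idea of a fixed-width copy and makes no claim about how ethtool_cpy() is actually defined.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_GSTRING_LEN 32

/*
 * C-string style copy: snprintf() always reserves the last destination byte
 * for a NUL, so a 32-character name loses its final character. The return
 * value is the length that would have been needed (truncation indicator).
 */
static inline int demo_copy_cstring(char dst[DEMO_GSTRING_LEN], const char *src)
{
	return snprintf(dst, DEMO_GSTRING_LEN, "%.*s", DEMO_GSTRING_LEN, src);
}

/* Fixed-width copy: moves all 32 bytes, whether or not a NUL is present. */
static inline void demo_copy_fixed(char dst[DEMO_GSTRING_LEN],
				   const char src[DEMO_GSTRING_LEN])
{
	memcpy(dst, src, DEMO_GSTRING_LEN);
}
```

For the 32-character name "TxFramesAbortedDueToXSCollisions" from the warnings above, the first helper stores only the first 31 characters followed by a NUL, while the second keeps the trailing "s". That is why drivers with exactly ETH_GSTRING_LEN-long names are moved to the memcpy() approach.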
2025-04-17r8169: add RTL_GIGA_MAC_VER_LAST to facilitate adding support for new chip versionsHeiner Kallweit
Add a new mac_version enum value RTL_GIGA_MAC_VER_LAST. The benefit is that when adding support for a new chip version we have to touch less code, unless something changes fundamentally. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/06991f47-2aec-4aa2-8918-2c6e79332303@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
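A sketch of the "last value" enum idiom this describes. The enum below is hypothetical (three made-up versions), and placing the alias right after a NONE terminator is an assumption about the layout, not a copy of the driver's enum.

```c
enum demo_mac_version {
	DEMO_MAC_VER_01,
	DEMO_MAC_VER_02,
	DEMO_MAC_VER_03,
	/* New chip versions are added above this line. */
	DEMO_MAC_NONE,
	/* Alias for the newest real version; it moves automatically. */
	DEMO_MAC_VER_LAST = DEMO_MAC_NONE - 1
};

/*
 * Range check written against the alias: it does not need to be edited
 * when another DEMO_MAC_VER_xx entry is appended.
 */
static inline int demo_is_supported(int ver)
{
	return ver >= DEMO_MAC_VER_01 && ver <= DEMO_MAC_VER_LAST;
}
```

Code that previously named the newest version explicitly can refer to the alias instead, so adding a version touches only the enum and the version-specific tables.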
2025-04-17r8169: refactor chip version detectionHeiner Kallweit
Refactor chip version detection and merge both configuration tables. Apart from reducing the code by a third, this paves the way for merging chip version handling if the only difference is the firmware. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/1fea533a-dd5a-4198-a9e2-895e11083947@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
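One way a merged detection table can look is a single array that carries the match criteria together with the per-chip configuration. This is only an illustrative sketch: the field names, the mask/value matching scheme, and every number and string below are hypothetical, not taken from the driver.

```c
#include <stddef.h>
#include <stdint.h>

struct demo_chip_info {
	uint16_t mask;		/* bits of the hardware ID to compare */
	uint16_t val;		/* expected value of those bits */
	int version;		/* driver-internal version number */
	const char *name;
	const char *fw_name;	/* NULL when no firmware is needed */
};

/* Most specific entries first; the all-zero mask at the end matches anything. */
static const struct demo_chip_info demo_chips[] = {
	{ 0x7cf, 0x641, 3, "demo-chip-c", "demo/c.fw" },
	{ 0x7c8, 0x640, 2, "demo-chip-b", "demo/b.fw" },
	{ 0x7c8, 0x600, 1, "demo-chip-a", NULL },
	{ 0x000, 0x000, 0, "unknown", NULL },
};

/* First-match lookup; always terminates because of the catch-all entry. */
static inline const struct demo_chip_info *demo_detect(uint16_t xid)
{
	const struct demo_chip_info *p = demo_chips;

	while ((xid & p->mask) != p->val)
		p++;
	return p;
}
```

Keeping name and firmware in the same row as the match rule means two chips that differ only in firmware could share most of their handling and differ in one table field.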
2025-04-17Merge branch 'net-stmmac-sunxi-cleanups'Jakub Kicinski
Russell King says: ==================== net: stmmac: sunxi cleanups This series cleans up the sunxi (sun7i) code in two ways: 1. it converts to use the new set_clk_tx_rate() method, even though we don't use clk_tx_i. In doing so, I reformat the function to read better, but with no changes to the code. 2. convert from stmmac_dvr_probe() to stmmac_pltfr_probe(), and then to its devm variant, which allows code simplification. ==================== Link: https://patch.msgid.link/Z_5WT_jOBgubjWQg@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: stmmac: sunxi: use devm_stmmac_pltfr_probe()Russell King (Oracle)
Using devm_stmmac_pltfr_probe() simplifies the probe function. This will not only call plat_dat->init (sun7i_gmac_init), but also plat_dat->exit (sun7i_gmac_exit) appropriately if stmmac_dvr_probe() fails. This results in an overall simplification of the glue driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u4fre-000nMr-FT@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: stmmac: sunxi: use stmmac_pltfr_probe()Russell King (Oracle)
Rather than open-coding the calls to sun7i_gmac_init() and sun7i_gmac_exit() in the probe function, use stmmac_pltfr_probe() which will automatically call the plat_dat->init() and plat_dat->exit() methods appropriately. This simplifies the code. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u4frZ-000nMl-BB@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
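The shape of "a probe helper that calls init() and exit() for you" can be sketched in userspace C. Everything here is hypothetical (demo_plat_data, demo_pltfr_probe() and the demo callbacks); it shows the control flow only and is not the stmmac code.

```c
#include <stddef.h>

struct demo_plat_data {
	int (*init)(void *priv);
	void (*exit)(void *priv);
	void *priv;
};

/*
 * Hypothetical wrapper: run the platform init hook, then the core probe.
 * If the core probe fails, undo the init via the exit hook, so glue code
 * no longer has to open-code that error path.
 */
static inline int demo_pltfr_probe(struct demo_plat_data *plat,
				   int (*core_probe)(struct demo_plat_data *))
{
	int ret = 0;

	if (plat->init)
		ret = plat->init(plat->priv);
	if (ret)
		return ret;

	ret = core_probe(plat);
	if (ret && plat->exit)
		plat->exit(plat->priv);
	return ret;
}

/* Demo callbacks: priv points at a counter of currently held resources. */
static inline int demo_init(void *priv) { ++*(int *)priv; return 0; }
static inline void demo_exit(void *priv) { --*(int *)priv; }
static inline int demo_core_ok(struct demo_plat_data *p) { (void)p; return 0; }
static inline int demo_core_fail(struct demo_plat_data *p) { (void)p; return -1; }
```

With the wrapper in place, a glue driver only fills in the init and exit hooks and makes one call; the resource counter returns to zero on the failure path without any driver-specific cleanup code.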
2025-04-17net: stmmac: sunxi: convert to set_clk_tx_rate()Russell King (Oracle)
Convert sunxi to use the set_clk_tx_rate() callback rather than the fix_mac_speed() callback. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u4frU-000nMf-6o@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.15-rc3). No conflicts. Adjacent changes: tools/net/ynl/pyynl/ynl_gen_c.py 4d07bbf2d456 ("tools: ynl-gen: don't declare loop iterator in place") 7e8ba0c7de2b ("tools: ynl: don't use genlmsghdr in classic netlink") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17Merge tag 'net-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netLinus Torvalds
Pull networking fixes from Jakub Kicinski:
 "Including fixes from Bluetooth, CAN and Netfilter.

  Current release - regressions:
   - two fixes for the netdev per-instance locking
   - batman-adv: fix double-hold of meshif when getting enabled

  Current release - new code bugs:
   - Bluetooth: increment TX timestamping tskey always for stream sockets
   - wifi: static analysis and build fixes for the new Intel sub-driver

  Previous releases - regressions:
   - net: fib_rules: fix iif / oif matching on L3 master (VRF) device
   - ipv6: add exception routes to GC list in rt6_insert_exception()
   - netfilter: conntrack: fix erroneous removal of offload bit
   - Bluetooth:
      - fix sending MGMT_EV_DEVICE_FOUND for invalid address
      - l2cap: process valid commands in too long frame
      - btnxpuart: Revert baudrate change in nxp_shutdown

  Previous releases - always broken:
   - ethtool: fix memory corruption during SFP FW flashing
   - eth:
      - hibmcge: fixes for link and MTU handling, pause frames etc
      - igc: fixes for PTM (PCIe timestamping)
      - dsa: b53: enable BPDU reception for management port

  Misc:
   - fixes for Netlink protocol schemas"

* tag 'net-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits)
  net: ethernet: mtk_eth_soc: revise QDMA packet scheduler settings
  net: ethernet: mtk_eth_soc: correct the max weight of the queue limit for 100Mbps
  net: ethernet: mtk_eth_soc: reapply mdc divider on reset
  net: ti: icss-iep: Fix possible NULL pointer dereference for perout request
  net: ti: icssg-prueth: Fix possible NULL pointer dereference inside emac_xmit_xdp_frame()
  net: ti: icssg-prueth: Fix kernel warning while bringing down network interface
  netfilter: conntrack: fix erronous removal of offload bit
  net: don't try to ops lock uninitialized devs
  ptp: ocp: fix start time alignment in ptp_ocp_signal_set
  net: dsa: avoid refcount warnings when ds->ops->tag_8021q_vlan_del() fails
  net: dsa: free routing table on probe failure
  net: dsa: clean up FDB, MDB, VLAN entries on unbind
  net: dsa: mv88e6xxx: fix -ENOENT when deleting VLANs and MST is unsupported
  net: dsa: mv88e6xxx: avoid unregistering devlink regions which were never registered
  net: txgbe: fix memory leak in txgbe_probe() error path
  net: bridge: switchdev: do not notify new brentries as changed
  net: b53: enable BPDU reception for management port
  netlink: specs: rt-neigh: prefix struct nfmsg members with ndm
  netlink: specs: rt-link: adjust mctp attribute naming
  netlink: specs: rtnetlink: attribute naming corrections
  ...
2025-04-17Merge branch 'bpf-qdisc'Martin KaFai Lau
Amery Hung says:

====================
bpf qdisc

Hi all,

This patchset aims to support implementing qdisc using bpf struct_ops. This version takes a step back and only implements the minimum support for bpf qdisc. 1) support of adding skb to bpf_list and bpf_rbtree directly and 2) classful qdisc are deferred to future patchsets. In addition, we only allow attaching bpf qdisc to root or mq for now. This is to prevent accidentally breaking existing classful qdiscs that rely on data in a child qdisc. This limit may be lifted in the future after careful inspection.

* Overview *

This series supports implementing qdisc using bpf struct_ops. bpf qdisc aims to be a flexible and easy-to-use infrastructure that allows users to quickly experiment with different scheduling algorithms/policies. It only requires users to implement core qdisc logic using bpf and implements the mundane part for them. In addition, the ability to easily communicate between qdisc and other components will also bring new opportunities for new applications and optimizations.

* Performance of bpf qdisc *

This patchset includes two qdisc examples, bpf_fifo and bpf_fq, for __testing__ purposes. For performance test, we compare selftests and their kernel counterparts to give you a sense of the performance of qdisc implemented in bpf.

The implementation of bpf_fq is fairly complex and slightly different from fq so later we only compare the two fifo qdiscs. bpf_fq implements a scheduling algorithm similar to fq before commit 29f834aa326e ("net_sched: sch_fq: add 3 bands and WRR scheduling") was introduced. bpf_fifo uses a single bpf_list as a queue instead of three queues for different priorities in pfifo_fast. The time complexity of fifo however should be similar since the queue selection time is negligible.

Test setup:

    client -> qdisc ------------->  server
    ~~~~~~~~~~~~~~~                 ~~~~~~
    nested VM1 @ DC1                VM2 @ DC2

Throughput: iperf3 -t 600, 5 times

    Qdisc        Average (GBits/sec)
    ----------   -------------------
    pfifo_fast   12.52 ± 0.26
    bpf_fifo     11.72 ± 0.32
    fq           10.24 ± 0.13
    bpf_fq       11.92 ± 0.64

Latency: sockperf pp --tcp -t 600, 5 times

    Qdisc        Average (usec)
    ----------   --------------
    pfifo_fast   244.58 ± 7.93
    bpf_fifo     244.92 ± 15.22
    fq           234.30 ± 19.25
    bpf_fq       221.34 ± 10.76

Looking at the two fifo qdiscs, the 6.4% drop in throughput in the bpf implementation is consistent with previous observation (v8 throughput test on a loopback device). This should be able to be mitigated by supporting adding skb to bpf_list or bpf_rbtree directly in the future.

* Clean up skb in bpf qdisc during reset *

The current implementation relies on bpf qdisc implementors to correctly release skbs in queues (bpf graphs or maps) in .reset, which might not be a safe thing to do. The solution as Martin has suggested would be supporting private data in struct_ops. This can also help simplify the implementation of qdiscs that work with mq. For example, qdiscs in the selftest mostly use global data. Therefore, even if a user adds multiple qdisc instances under mq, they would still share the same queue.
====================

Link: https://patch.msgid.link/20250409214606.2000194-1-ameryhung@gmail.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2025-04-17selftests/bpf: Test attaching bpf qdisc to mq and non rootAmery Hung
Until we are certain that existing classful qdiscs work with bpf qdisc, make sure we don't allow attaching a bpf qdisc to non root. Meanwhile, attaching to mq is allowed. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-11-ameryhung@gmail.com
2025-04-17selftests/bpf: Add a bpf fq qdisc to selftestAmery Hung
This test implements a more sophisticated qdisc using bpf. The bpf fair-queueing (fq) qdisc gives each flow an equal chance to transmit data. It also respects the timestamp of skb for rate limiting. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-10-ameryhung@gmail.com
2025-04-17selftests/bpf: Add a basic fifo qdisc testAmery Hung
This selftest includes a bare minimum fifo qdisc, which simply enqueues sk_buffs into the back of a bpf list and dequeues from the front of the list. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-9-ameryhung@gmail.com
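The queueing discipline itself (append at the back, take from the front) can be sketched in plain C. This is not the BPF selftest: the selftest works on sk_buffs held in a bpf list, while the sketch below uses a hypothetical demo_pkt and an ordinary singly linked list purely to show the ordering behaviour.

```c
#include <stddef.h>

struct demo_pkt {
	int id;
	struct demo_pkt *next;
};

struct demo_fifo {
	struct demo_pkt *head;	/* dequeue side (front) */
	struct demo_pkt *tail;	/* enqueue side (back) */
	unsigned int qlen;
};

/* Append a packet at the back of the queue. */
static inline void demo_enqueue(struct demo_fifo *q, struct demo_pkt *pkt)
{
	pkt->next = NULL;
	if (q->tail)
		q->tail->next = pkt;
	else
		q->head = pkt;
	q->tail = pkt;
	q->qlen++;
}

/* Remove and return the packet at the front, or NULL if the queue is empty. */
static inline struct demo_pkt *demo_dequeue(struct demo_fifo *q)
{
	struct demo_pkt *pkt = q->head;

	if (!pkt)
		return NULL;
	q->head = pkt->next;
	if (!q->head)
		q->tail = NULL;
	pkt->next = NULL;
	q->qlen--;
	return pkt;
}
```

Packets leave in exactly the order they arrived, and both operations are constant time, which is the baseline behaviour the fifo selftest is described as providing.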
2025-04-17libbpf: Support creating and destroying qdiscAmery Hung
Extend struct bpf_tc_hook with handle, qdisc name and a new attach type, BPF_TC_QDISC, to allow users to add or remove any qdisc specified in addition to clsact. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-8-ameryhung@gmail.com
2025-04-17bpf: net_sched: Disable attaching bpf qdisc to non rootAmery Hung
Do not allow users to attach bpf qdiscs to classful qdiscs. This is to prevent accidentally breaking existing classful qdiscs if they rely on some data in the child qdisc. This restriction can potentially be lifted in the future. Note that we still allow bpf qdisc to be attached to mq. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-7-ameryhung@gmail.com
2025-04-17bpf: net_sched: Support updating bstatsAmery Hung
Add a kfunc to update Qdisc bstats when an skb is dequeued. The kfunc is only available in .dequeue programs. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-6-ameryhung@gmail.com
2025-04-17bpf: net_sched: Add a qdisc watchdog timerAmery Hung
Add a watchdog timer to bpf qdisc. The watchdog can be used to schedule the execution of qdisc through a kfunc, bpf_qdisc_schedule(). It can be useful for building traffic-shaping scheduling algorithms, where the time the next packet will be dequeued is known. The implementation relies on struct_ops gen_prologue/epilogue to patch bpf programs provided by users. Operator-specific prologue/epilogue kfuncs are introduced instead of watchdog kfuncs so that it is easier to extend prologue/epilogue in the future (writing C vs BPF bytecode). Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-5-ameryhung@gmail.com
2025-04-17bpf: net_sched: Add basic bpf qdisc kfuncsAmery Hung
Add basic kfuncs for working on skb in qdisc. Both bpf_qdisc_skb_drop() and bpf_kfree_skb() can be used to release a reference to an skb. However, bpf_qdisc_skb_drop() can only be called in .enqueue where a to_free skb list is available from kernel to defer the release. bpf_kfree_skb() should be used elsewhere. It is also used in bpf_obj_free_fields() when cleaning up skb in maps and collections. bpf_skb_get_hash() returns the flow hash of an skb, which can be used to build flow-based queueing algorithms. Finally, allow users to create read-only dynptr via bpf_dynptr_from_skb(). Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-4-ameryhung@gmail.com
2025-04-17bpf: net_sched: Support implementation of Qdisc_ops in bpfAmery Hung
Recent advancements in bpf, such as allocated objects, bpf list and bpf rbtree, have provided powerful and flexible building blocks to realize sophisticated packet scheduling algorithms. As struct_ops now supports core operators in Qdisc_ops, start allowing qdisc to be implemented using bpf struct_ops with this patch. Users can implement Qdisc_ops.{enqueue, dequeue, init, reset, destroy} in bpf and register the qdisc dynamically into the kernel. Co-developed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-3-ameryhung@gmail.com
2025-04-17bpf: Prepare to reuse get_ctx_arg_idxAmery Hung
Rename get_ctx_arg_idx to bpf_ctx_arg_idx, and allow others to call it. No functional change. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-2-ameryhung@gmail.com
2025-04-17Merge tag 'for-linus-6.15a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tipLinus Torvalds
Pull xen fix from Juergen Gross: "Just a single fix for the Xen multicall driver avoiding a percpu variable referencing initdata by its initializer" * tag 'for-linus-6.15a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: fix multicall debug feature
2025-04-17Merge tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdmaLinus Torvalds
Pull fwctl fixes from Jason Gunthorpe:
 "Three small changes from further build testing:
   - Don't rely on the userspace uuid.h for the uapi header
   - Fix sparse warnings in pds
   - Typo in log message"

* tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  fwctl: Fix repeated device word in log message
  pds_fwctl: Fix type and endian complaints
  fwctl/cxl: Fix uuid_t usage in uapi
2025-04-17Merge tag 'sound-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/soundLinus Torvalds
Pull sound fixes from Takashi Iwai:
 "A collection of small fixes. All are device-specific like quirks, new IDs, and other safe (or rather boring) changes"

* tag 'sound-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  firmware: cs_dsp: test_bin_error: Fix uninitialized data used as fw version
  ASoC: codecs: Add of_match_table for aw888081 driver
  ASoC: fsl: fsl_qmc_audio: Reset audio data pointers on TRIGGER_START event
  mailmap: Add entry for Srinivas Kandagatla
  MAINTAINERS: use kernel.org alias
  ASoC: cs42l43: Reset clamp override on jack removal
  ALSA: hda/realtek - Fixed ASUS platform headset Mic issue
  ALSA: hda/cirrus_scodec_test: Don't select dependencies
  ALSA: azt2320: Replace deprecated strcpy() with strscpy()
  ASoC: hdmi-codec: use RTD ID instead of DAI ID for ELD entry
  ASoC: Intel: avs: Constrain path based on BE capabilities
  ALSA: hda/tas2781: Remove unnecessary NULL check before release_firmware()
  ASoC: Intel: avs: Fix null-ptr-deref in avs_component_probe()
  ASoC: fsl_asrc_dma: get codec or cpu dai from backend
  ASoC: qcom: Fix sc7280 lpass potential buffer overflow
  ASoC: dwc: always enable/disable i2s irqs
  ASoC: Intel: sof_sdw: Add quirk for Asus Zenbook S16
  ASoC: codecs:lpass-wsa-macro: Fix logic of enabling vi channels
  ASoC: codecs:lpass-wsa-macro: Fix vi feedback rate
2025-04-17Merge tag 'platform-drivers-x86-v6.15-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform drivers fixes from Ilpo Järvinen:
 "Fixes:
  - amd/pmf: Fix STT limits
  - asus-laptop: Fix an uninitialized variable
  - intel_pmc_ipc: Allow building without ACPI
  - mlxbf-bootctl: Use sysfs_emit_at() in secure_boot_fuse_state_show()
  - msi-wmi-platform: Add locking to workaround ACPI firmware bug
  New HW support:
  - alienware-wmi-wmax:
     - Extended thermal control support to:
        - Alienware Area-51m R2
        - Alienware m16 R1
        - Alienware m16 R2
        - Dell G16 7630
        - Dell G5 5505 SE
     - G-Mode support to Alienware m16 R1
  - x86-android-tablets: Add Vexia Edu Atla 10 tablet 5V data"
* tag 'platform-drivers-x86-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/x86: msi-wmi-platform: Workaround a ACPI firmware bug
  platform/x86: msi-wmi-platform: Rename "data" variable
  platform/x86: alienware-wmi-wmax: Extend support to more laptops
  platform/x86: alienware-wmi-wmax: Add G-Mode support to Alienware m16 R1
  platform/x86: amd: pmf: Fix STT limits
  mlxbf-bootctl: use sysfs_emit_at() in secure_boot_fuse_state_show()
  platform/x86: x86-android-tablets: Add Vexia Edu Atla 10 tablet 5V data
  platform/x86: x86-android-tablets: Add "9v" to Vexia EDU ATLA 10 tablet symbols
  asus-laptop: Fix an uninitialized variable
  platform/x86: intel_pmc_ipc: add option to build without ACPI
2025-04-17Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Small drivers fixes, except for ufs which has two large updates, one for exposing the device level feature, which is a new addition to the device spec and the other reworking the exynos driver to fix coherence issues on some android phones"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: megaraid_sas: Driver version update to 07.734.00.00-rc1
  scsi: megaraid_sas: Block zero-length ATA VPD inquiry
  scsi: scsi_transport_srp: Replace min/max nesting with clamp()
  scsi: ufs: core: Add device level exception support
  scsi: ufs: core: Rename ufshcd_wb_presrv_usrspc_keep_vcc_on()
  scsi: smartpqi: Use is_kdump_kernel() to check for kdump
  scsi: pm80xx: Set phy_attached to zero when device is gone
  scsi: ufs: exynos: gs101: Put UFS device in reset on .suspend()
  scsi: ufs: exynos: Move phy calls to .exit() callback
  scsi: ufs: exynos: Enable PRDT pre-fetching with UFSHCD_CAP_CRYPTO
  scsi: ufs: exynos: Ensure consistent phy reference counts
  scsi: ufs: exynos: Disable iocc if dma-coherent property isn't set
  scsi: ufs: exynos: Move UFS shareability value to drvdata
  scsi: ufs: exynos: Ensure pre_link() executes before exynos_ufs_phy_init()
  scsi: iscsi: Fix missing scsi_host_put() in error path
  scsi: ufs: core: Fix a race condition related to device commands
  scsi: hisi_sas: Fix I/O errors caused by hardware port ID changes
  scsi: hisi_sas: Enable force phy when SATA disk directly connected
2025-04-17Merge tag 'ata-6.15-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fix from Damien Le Moal:
 - Fix how sense data from the sense data for successful NCQ commands log page is used to fully initialize the result_tf of a completed command, so that the sense data returned to the scsi layer is fully initialized with all the device provided information (from Niklas)
* tag 'ata-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux:
  ata: libata-sata: Save all fields from sense data descriptor
2025-04-17Merge tag 'xfs-fixes-6.15-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull XFS fixes from Carlos Maiolino:
 "This mostly includes fixes and documentation for the zoned allocator feature merged during previous merge window, but it also adds a sysfs tunable for the zone garbage collector.
  There is also a fix for a regression to the RT device that we'd like to fix ASAP now that we're getting more users on the RT zoned allocator"
* tag 'xfs-fixes-6.15-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: document zoned rt specifics in admin-guide
  xfs: fix fsmap for internal zoned devices
  xfs: Fix spelling mistake "drity" -> "dirty"
  xfs: compute buffer address correctly in xmbuf_map_backing_mem
  xfs: add tunable threshold parameter for triggering zone GC
  xfs: mark xfs_buf_free as might_sleep()
  xfs: remove the leftover xfs_{set,clear}_li_failed infrastructure
2025-04-17Merge tag 'for-6.15-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
 - handle encoded read ioctl returning EAGAIN so it does not mistakenly free the work structure
 - escape subvolume path in mount option list so it cannot be wrongly parsed when the path contains ","
 - remove folio size assertions when writing super block to device with enabled large folios
* tag 'for-6.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: remove folio order ASSERT()s in super block writeback path
  btrfs: correctly escape subvol in btrfs_show_options()
  btrfs: ioctl: don't free iov when btrfs_encoded_read() returns -EAGAIN
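The subvolume escaping fix concerns a comma-separated option list: a path that itself contains "," would otherwise be read back as two options. A minimal userspace sketch of the idea follows. The helper name escape_opt_value, the escape set (comma and backslash) and the octal form are assumptions for illustration only; the kernel's own helpers decide the exact characters and are not reproduced here.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper, not a kernel API: copy src into dst, replacing
 * every ',' and '\\' with a three-digit octal escape so the value cannot
 * be mistaken for an option separator. Returns the output length, or -1
 * if dst is too small.
 */
static int escape_opt_value(char *dst, size_t size, const char *src)
{
	size_t out = 0;

	if (size == 0)
		return -1;

	for (; *src; src++) {
		unsigned char c = (unsigned char)*src;
		int special = (c == ',' || c == '\\');
		size_t need = special ? 4 : 1;

		if (out + need >= size)		/* keep room for the final NUL */
			return -1;
		if (special)
			out += (size_t)snprintf(dst + out, size - out,
						"\\%03o", (unsigned int)c);
		else
			dst[out++] = (char)c;
	}
	dst[out] = '\0';
	return (int)out;
}
```

With this sketch, a subvolume named "vol,a" is emitted as "vol\054a", so a parser that splits on "," sees a single value.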
2025-04-17Merge tag 'slab-for-6.15-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab fix from Vlastimil Babka:
 - Stable fix adding zero initialization of slab->obj_exts to prevent crashes with allocation profiling (Suren Baghdasaryan)
* tag 'slab-for-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  slab: ensure slab->obj_exts is clear in a newly allocated slab page
2025-04-17net: ethernet: mtk_eth_soc: revise QDMA packet scheduler settingsBo-Cun Chen
The QDMA packet scheduler suffers from a performance issue. Fix this by picking up changes from MediaTek's SDK which change to use Token Bucket instead of Leaky Bucket and fix the SPEED_1000 configuration. Fixes: 160d3a9b1929 ("net: ethernet: mtk_eth_soc: introduce MTK_NETSYS_V2 support") Signed-off-by: Bo-Cun Chen <bc-bocun.chen@mediatek.com> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/18040f60f9e2f5855036b75b28c4332a2d2ebdd8.1744764277.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
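For readers unfamiliar with the two shaping schemes named above: a leaky bucket drains at a constant rate, while a token bucket banks unused credit and so permits bursts up to the bucket size while still bounding the average rate. The commit message does not describe the hardware configuration, so the following is only a generic software model; the struct and function names are hypothetical and do not correspond to QDMA registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of a token bucket (not the QDMA layout). */
struct token_bucket {
	uint64_t rate;		/* tokens (bytes) added per tick */
	uint64_t burst;		/* bucket capacity in bytes */
	uint64_t tokens;	/* tokens currently available */
};

/*
 * Add tokens for the elapsed ticks, saturating at the burst size.
 * Overflow of rate * ticks is ignored in this sketch.
 */
static void tb_refill(struct token_bucket *tb, uint64_t ticks)
{
	uint64_t added = tb->rate * ticks;

	if (added > tb->burst - tb->tokens)
		tb->tokens = tb->burst;
	else
		tb->tokens += added;
}

/* Send only if enough tokens are banked; idle time is what allows bursts. */
static bool tb_try_send(struct token_bucket *tb, uint64_t pkt_bytes)
{
	if (pkt_bytes > tb->tokens)
		return false;
	tb->tokens -= pkt_bytes;
	return true;
}
```

In this model a queue that has been idle long enough can send a full burst at once, which a constant-drain leaky bucket would spread out over time.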
2025-04-17net: ethernet: mtk_eth_soc: correct the max weight of the queue limit for ↵Bo-Cun Chen
100Mbps Without this patch, the maximum weight of the queue limit will be incorrect when linked at 100Mbps due to an apparent typo. Fixes: f63959c7eec31 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues") Signed-off-by: Bo-Cun Chen <bc-bocun.chen@mediatek.com> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/74111ba0bdb13743313999ed467ce564e8189006.1744764277.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: ethernet: mtk_eth_soc: reapply mdc divider on resetBo-Cun Chen
In the current method, the MDC divider was reset to the default setting of 2.5MHz after the NETSYS SER. Therefore, we need to reapply the MDC divider configuration function in mtk_hw_init() after reset. Fixes: c0a440031d431 ("net: ethernet: mtk_eth_soc: set MDIO bus clock frequency") Signed-off-by: Bo-Cun Chen <bc-bocun.chen@mediatek.com> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/8ab7381447e6cdcb317d5b5a6ddd90a1734efcb0.1744764277.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17Merge tag 'nf-25-04-17' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf Pablo Neira Ayuso says: ==================== Netfilter fix for net The following batch contains one Netfilter fix for net: 1) conntrack offload bit is erroneously unset in a race scenario, from Florian Westphal. netfilter pull request 25-04-17 * tag 'nf-25-04-17' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf: netfilter: conntrack: fix erronous removal of offload bit ==================== Link: https://patch.msgid.link/20250417102847.16640-1-pablo@netfilter.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-17Merge tag 'for-net-2025-04-16' of ↵Paolo Abeni
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - l2cap: Process valid commands in too long frame - vhci: Avoid needless snprintf() calls * tag 'for-net-2025-04-16' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: vhci: Avoid needless snprintf() calls Bluetooth: l2cap: Process valid commands in too long frame ==================== Link: https://patch.msgid.link/20250416210126.2034212-1-luiz.dentz@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-17Merge branch 'net-pktgen-fix-checkpatch-code-style-errors-warnings'Paolo Abeni
Peter Seiderer says: ==================== net: pktgen: fix checkpatch code style errors/warnings Fix the code style errors/warnings that checkpatch detects in the file net/core/pktgen.c (remaining checkpatch checks will be addressed in a follow up patch set). ==================== Link: https://patch.msgid.link/20250415112916.113455-1-ps.report@gmx.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-04-17net: pktgen: fix code style (WARNING: Prefer strscpy over strcpy)Peter Seiderer
Fix checkpatch code style warnings:
  WARNING: Prefer strscpy over strcpy - see: https://github.com/KSPP/linux/issues/88
  #1423: FILE: net/core/pktgen.c:1423:
  + strcpy(pkt_dev->dst_min, buf);
  WARNING: Prefer strscpy over strcpy - see: https://github.com/KSPP/linux/issues/88
  #1444: FILE: net/core/pktgen.c:1444:
  + strcpy(pkt_dev->dst_max, buf);
  WARNING: Prefer strscpy over strcpy - see: https://github.com/KSPP/linux/issues/88
  #1554: FILE: net/core/pktgen.c:1554:
  + strcpy(pkt_dev->src_min, buf);
  WARNING: Prefer strscpy over strcpy - see: https://github.com/KSPP/linux/issues/88
  #1575: FILE: net/core/pktgen.c:1575:
  + strcpy(pkt_dev->src_max, buf);
  WARNING: Prefer strscpy over strcpy - see: https://github.com/KSPP/linux/issues/88
  #3231: FILE: net/core/pktgen.c:3231:
  + strcpy(pkt_dev->result, "Starting");
  WARNING: Prefer strscpy over strcpy - see: https://github.com/KSPP/linux/issues/88
  #3235: FILE: net/core/pktgen.c:3235:
  + strcpy(pkt_dev->result, "Error starting");
  WARNING: Prefer strscpy over strcpy - see: https://github.com/KSPP/linux/issues/88
  #3849: FILE: net/core/pktgen.c:3849:
  + strcpy(pkt_dev->odevname, ifname);
While at it squash memset/strcpy pattern into single strscpy_pad call.
Signed-off-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20250415112916.113455-4-ps.report@gmx.net Signed-off-by: Paolo Abeni <pabeni@redhat.com>
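The memset/strcpy to strscpy_pad squash mentioned above can be pictured with a userspace stand-in. The kernel helper itself is not available outside the kernel, so sketch_strscpy_pad below is a hypothetical reimplementation of the behaviour as generally documented (bounded copy, guaranteed NUL termination, zero padding, -E2BIG on truncation); it is not the kernel source, and the patch itself is not reproduced here.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/*
 * Userspace stand-in for strscpy_pad(); the name is hypothetical.
 * Copies at most size - 1 bytes, always NUL-terminates, zero-fills the
 * rest of dst, and returns the copied length or -E2BIG when src had to
 * be truncated.
 */
static ssize_t sketch_strscpy_pad(char *dst, const char *src, size_t size)
{
	size_t len;

	if (size == 0)
		return -E2BIG;

	len = strnlen(src, size);	/* never scans past size bytes */
	if (len == size) {		/* src does not fit: truncate */
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -E2BIG;
	}

	memcpy(dst, src, len);
	memset(dst + len, 0, size - len);	/* terminator plus padding */
	return (ssize_t)len;
}

/*
 * Before: memset(buf, 0, sizeof(buf)); strcpy(buf, name);
 * After:  sketch_strscpy_pad(buf, name, sizeof(buf));
 */
```

Unlike strcpy(), the single call cannot write past the destination, and the caller can detect truncation from the return value.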