summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-04-17r8169: add RTL_GIGA_MAC_VER_LAST to facilitate adding support for new chip ↵Heiner Kallweit
versions Add a new mac_version enum value RTL_GIGA_MAC_VER_LAST. Benefit is that when adding support for a new chip version we have to touch less code, except something changes fundamentally. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/06991f47-2aec-4aa2-8918-2c6e79332303@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17r8169: refactor chip version detectionHeiner Kallweit
Refactor chip version detection and merge both configuration tables. Apart from reducing the code by a third, this paves the way for merging chip version handling if only difference is the firmware. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://patch.msgid.link/1fea533a-dd5a-4198-a9e2-895e11083947@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17Merge branch 'net-stmmac-sunxi-cleanups'Jakub Kicinski
Russell King says: ==================== net: stmmac: sunxi cleanups This series cleans up the sunxi (sun7i) code in two ways: 1. it converts to use the new set_clk_tx_rate() method, even though we don't use clk_tx_i. In doing so, I reformat the function to read better, but with no changes to the code. 2. convert from stmmac_dvr_probe() to stmmac_pltfr_probe(), and then to its devm variant, which allows code simplification. ==================== Link: https://patch.msgid.link/Z_5WT_jOBgubjWQg@shell.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: stmmac: sunxi: use devm_stmmac_pltfr_probe()Russell King (Oracle)
Using devm_stmmac_pltfr_probe() simplifies the probe function. This will not only call plat_dat->init (sun7i_dwmac_init), but also plat_dat->exit (sun7i_dwmac_exit) appropriately if stmmac_dvr_probe() fails. This results in an overall simplification of the glue driver. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u4fre-000nMr-FT@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: stmmac: sunxi: use stmmac_pltfr_probe()Russell King (Oracle)
Rather than open-coding the calls to sun7i_gmac_init() and sun7i_gmac_exit() in the probe function, use stmmac_pltfr_probe() which will automatically call the plat_dat->init() and plat_dat->exit() methods appropriately. This simplifies the code. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u4frZ-000nMl-BB@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: stmmac: sunxi: convert to set_clk_tx_rate()Russell King (Oracle)
Convert sunxi to use the set_clk_tx_rate() callback rather than the fix_mac_speed() callback. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/E1u4frU-000nMf-6o@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17Merge tag 'pci-v6.15-fixes-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fix from Bjorn Helgaas: - Revert a reset patch that broke VFIO passthrough because devices ended up with no available reset mechanisms (Alex Williamson) * tag 'pci-v6.15-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: Revert "PCI: Avoid reset when disabled via sysfs"
2025-04-18Merge tag 'drm-misc-fixes-2025-04-17' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: dma-buf: - Correctly decrement refcounter on errors gem: - Fix test for imported buffers ivpu: - Fix debugging - Fixes to frequency - Support firmware API 3.28.3 - Flush jobs upon reset mgag200: - Set vblank start to correct values v3d: - Fix Indirect Dispatch Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250417084043.GA365738@linux.fritz.box
2025-04-18Merge tag 'drm-intel-fixes-2025-04-17' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes drm/i915 fixes for v6.15-rc3: - Fix DP DSC configurations that require 3 DSC engines per pipe Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/87fri7p8tp.fsf@intel.com
2025-04-17Merge tag 'bcachefs-2025-04-17' of git://evilpiepirate.org/bcachefsLinus Torvalds
Pull bcachefs fixes from Kent Overstreet: "Usual set of small fixes/logging improvements. One bigger user reported fix, for inode <-> dirent inconsistencies reported in fsck, after moving a subvolume that had been snapshotted" * tag 'bcachefs-2025-04-17' of git://evilpiepirate.org/bcachefs: bcachefs: Fix snapshotting a subvolume, then renaming it bcachefs: Add missing READ_ONCE() for metadata replicas bcachefs: snapshot_node_missing is now autofix bcachefs: Log message when incompat version requested but not enabled bcachefs: Print version_incompat_allowed on startup bcachefs: Silence extent_poisoned error messages bcachefs: btree_root_unreadable_and_scan_found_nothing now AUTOFIX bcachefs: fix bch2_dev_usage_full_read_fast() bcachefs: Don't print data read retry success on non-errors bcachefs: Add missing error handling bcachefs: Prevent granting write refs when filesystem is read-only
2025-04-17Merge tag 'vfio-v6.15-rc3' of https://github.com/awilliam/linux-vfioLinus Torvalds
Pull vfio fix from Alex Williamson: - Include devices where the platform indicates PCI INTx is not routed by setting pdev->irq to zero in the expanded virtualization of the PCI pin register. This provides consistency in the INFO and SET_IRQS ioctls (Alex Williamson) * tag 'vfio-v6.15-rc3' of https://github.com/awilliam/linux-vfio: vfio/pci: Virtualize zero INTx PIN if no pdev->irq
2025-04-17Merge tag 'spi-fix-v6.15-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "A few more device specific fixes plus one trivial quirk. There's a couple of patches for Tegra which avoid some fairly spectacular log spam if the hardware breaks in ways which were actually seen in production, plus a fix for the i.MX driver to propagate errors properly when setting up the hardware. We also have a trivial patch marking the sun4i driver as being compatible with GPIO chip selects" * tag 'spi-fix-v6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: spi-imx: Add check for spi_imx_setupxfer() spi: tegra210-quad: add rate limiting and simplify timeout error message spi: tegra210-quad: use WARN_ON_ONCE instead of WARN_ON for timeouts spi: sun4i: add support for GPIO chip select lines
2025-04-17Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.15-rc3). No conflicts. Adjacent changes: tools/net/ynl/pyynl/ynl_gen_c.py 4d07bbf2d456 ("tools: ynl-gen: don't declare loop iterator in place") 7e8ba0c7de2b ("tools: ynl: don't use genlmsghdr in classic netlink") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17ftrace: Fix type of ftrace_graph_ent_entry.depthIlya Leoshkevich
ftrace_graph_ent.depth is int, but ftrace_graph_ent_entry.depth is unsigned long. This confuses trace-cmd on 64-bit big-endian systems and makes it print a huge amount of spaces. Fix this by using unsigned int, which has a matching size, instead. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Link: https://lore.kernel.org/20250412221847.17310-2-iii@linux.ibm.com Fixes: ff5c9c576e75 ("ftrace: Add support for function argument to graph tracer") Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-17ftrace: fix incorrect hash size in register_ftrace_direct()Menglong Dong
The maximum of the ftrace hash bits is made fls(32) in register_ftrace_direct(), which seems illogical. So, we fix it by making the max hash bits FTRACE_HASH_MAX_BITS instead. Link: https://lore.kernel.org/20250413014444.36724-1-dongml2@chinatelecom.cn Fixes: d05cb470663a ("ftrace: Fix modification of direct_function hash while in use") Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-17ftrace: Free ftrace hashes after they are replaced in the subops codeSteven Rostedt
The subops processing creates new hashes when adding and removing subops. There were some places that the old hashes that were replaced were not freed and this caused some memory leaks. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250417135939.245b128d@gandalf.local.home Fixes: 0ae6b8ce200d ("ftrace: Fix accounting of subop hashes") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-17ftrace: Reinitialize hash to EMPTY_HASH after freeingSteven Rostedt
There's several locations that free a ftrace hash pointer but may be referenced again. Reset them to EMPTY_HASH so that a u-a-f bug doesn't happen. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250417110933.20ab718b@gandalf.local.home Fixes: 0ae6b8ce200d ("ftrace: Fix accounting of subop hashes") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-17ftrace: Initialize variables for ftrace_startup/shutdown_subops()Steven Rostedt
The reworking to fix and simplify the ftrace_startup_subops() and the ftrace_shutdown_subops() made it possible for the filter_hash and notrace_hash variables to be used uninitialized in a way that the compiler did not catch it. Initialize both filter_hash and notrace_hash to the EMPTY_HASH as that is what they should be if they never are used. Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Link: https://lore.kernel.org/20250417104017.3aea66c2@gandalf.local.home Reported-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Fixes: 0ae6b8ce200d ("ftrace: Fix accounting of subop hashes") Closes: https://lore.kernel.org/all/1db64a42-626d-4b3a-be08-c65e47333ce2@linux.ibm.com/ Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2025-04-17Merge tag 'net-6.15-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from Bluetooth, CAN and Netfilter. Current release - regressions: - two fixes for the netdev per-instance locking - batman-adv: fix double-hold of meshif when getting enabled Current release - new code bugs: - Bluetooth: increment TX timestamping tskey always for stream sockets - wifi: static analysis and build fixes for the new Intel sub-driver Previous releases - regressions: - net: fib_rules: fix iif / oif matching on L3 master (VRF) device - ipv6: add exception routes to GC list in rt6_insert_exception() - netfilter: conntrack: fix erroneous removal of offload bit - Bluetooth: - fix sending MGMT_EV_DEVICE_FOUND for invalid address - l2cap: process valid commands in too long frame - btnxpuart: Revert baudrate change in nxp_shutdown Previous releases - always broken: - ethtool: fix memory corruption during SFP FW flashing - eth: - hibmcge: fixes for link and MTU handling, pause frames etc - igc: fixes for PTM (PCIe timestamping) - dsa: b53: enable BPDU reception for management port Misc: - fixes for Netlink protocol schemas" * tag 'net-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (81 commits) net: ethernet: mtk_eth_soc: revise QDMA packet scheduler settings net: ethernet: mtk_eth_soc: correct the max weight of the queue limit for 100Mbps net: ethernet: mtk_eth_soc: reapply mdc divider on reset net: ti: icss-iep: Fix possible NULL pointer dereference for perout request net: ti: icssg-prueth: Fix possible NULL pointer dereference inside emac_xmit_xdp_frame() net: ti: icssg-prueth: Fix kernel warning while bringing down network interface netfilter: conntrack: fix erronous removal of offload bit net: don't try to ops lock uninitialized devs ptp: ocp: fix start time alignment in ptp_ocp_signal_set net: dsa: avoid refcount warnings when ds->ops->tag_8021q_vlan_del() fails net: dsa: free routing table on probe failure net: dsa: clean up FDB, MDB, VLAN entries on unbind net: dsa: mv88e6xxx: fix -ENOENT when deleting VLANs and MST is unsupported net: dsa: mv88e6xxx: avoid unregistering devlink regions which were never registered net: txgbe: fix memory leak in txgbe_probe() error path net: bridge: switchdev: do not notify new brentries as changed net: b53: enable BPDU reception for management port netlink: specs: rt-neigh: prefix struct nfmsg members with ndm netlink: specs: rt-link: adjust mctp attribute naming netlink: specs: rtnetlink: attribute naming corrections ...
2025-04-17bcachefs: Fix snapshotting a subvolume, then renaming itKent Overstreet
Subvolume roots and the dirents that point to them are special; they don't obey the normal snapshot versioning rules because they cross snapshot boundaries. We don't keep around older versions of subvolume dirents on rename - we don't need to, because subvolume dirents are only visible in the parent subvolume, and we wouldn't be able to match up the different dirent and inode versions due to crossing the snapshot ID boundary. That means that when we rename a subvolume, that's been snapshotted, the older version of the subvolume root will become dangling - it won't have a dirent that points to it. That's expected, we just need to tell fsck that this is ok. Fixes: https://github.com/koverstreet/bcachefs/issues/856 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-17io_uring/rsrc: ensure segments counts are correct on kbuf buffersJens Axboe
kbuf imports have the front offset adjusted and segments removed, but the tail segments are still included in the segment count that gets passed in the iov_iter. As the segments aren't necessarily all the same size, move importing to a separate helper and iterate the mapped length to get an exact count. Reviewed-by: Nitesh Shetty <nj.shetty@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-17Merge branch 'bpf-qdisc'Martin KaFai Lau
Amery Hung says: ==================== bpf qdisc Hi all, This patchset aims to support implementing qdisc using bpf struct_ops. This version takes a step back and only implements the minimum support for bpf qdisc. 1) support of adding skb to bpf_list and bpf_rbtree directly and 2) classful qdisc are deferred to future patchsets. In addition, we only allow attaching bpf qdisc to root or mq for now. This is to prevent accidentally breaking exisiting classful qdiscs that rely on data in a child qdisc. This limit may be lifted in the future after careful inspection. * Overview * This series supports implementing qdisc using bpf struct_ops. bpf qdisc aims to be a flexible and easy-to-use infrastructure that allows users to quickly experiment with different scheduling algorithms/policies. It only requires users to implement core qdisc logic using bpf and implements the mundane part for them. In addition, the ability to easily communicate between qdisc and other components will also bring new opportunities for new applications and optimizations. * Performance of bpf qdisc * This patchset includes two qdisc examples, bpf_fifo and bpf_fq, for __testing__ purposes. For performance test, we compare selftests and their kernel counterparts to give you a sense of the performance of qdisc implemented in bpf. The implementation of bpf_fq is fairly complex and slightly different from fq so later we only compare the two fifo qdiscs. bpf_fq implements a scheduling algorithm similar to fq before commit 29f834aa326e ("net_sched: sch_fq: add 3 bands and WRR scheduling") was introduced. bpf_fifo uses a single bpf_list as a queue instead of three queues for different priorities in pfifo_fast. The time complexity of fifo however should be similar since the queue selection time is negligible. Test setup: client -> qdisc -------------> server ~~~~~~~~~~~~~~~ ~~~~~~ nested VM1 @ DC1 VM2 @ DC2 Throghput: iperf3 -t 600, 5 times Qdisc Average (GBits/sec) ---------- ------------------- pfifo_fast 12.52 ± 0.26 bpf_fifo 11.72 ± 0.32 fq 10.24 ± 0.13 bpf_fq 11.92 ± 0.64 Latency: sockperf pp --tcp -t 600, 5 times Qdisc Average (usec) ---------- -------------- pfifo_fast 244.58 ± 7.93 bpf_fifo 244.92 ± 15.22 fq 234.30 ± 19.25 bpf_fq 221.34 ± 10.76 Looking at the two fifo qdiscs, the 6.4% drop in throughput in the bpf implementatioin is consistent with previous observation (v8 throughput test on a loopback device). This should be able to be mitigated by supporting adding skb to bpf_list or bpf_rbtree directly in the future. * Clean up skb in bpf qdisc during reset * The current implementation relies on bpf qdisc implementors to correctly release skbs in queues (bpf graphs or maps) in .reset, which might not be a safe thing to do. The solution as Martin has suggested would be supporting private data in struct_ops. This can also help simplifying implementation of qdisc that works with mq. For examples, qdiscs in the selftest mostly use global data. Therefore, even if user add multiple qdisc instances under mq, they would still share the same queue. ==================== Link: https://patch.msgid.link/20250409214606.2000194-1-ameryhung@gmail.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2025-04-17selftests/bpf: Test attaching bpf qdisc to mq and non rootAmery Hung
Until we are certain that existing classful qdiscs work with bpf qdisc, make sure we don't allow attaching a bpf qdisc to non root. Meanwhile, attaching to mq is allowed. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-11-ameryhung@gmail.com
2025-04-17selftests/bpf: Add a bpf fq qdisc to selftestAmery Hung
This test implements a more sophisticated qdisc using bpf. The bpf fair- queueing (fq) qdisc gives each flow an equal chance to transmit data. It also respects the timestamp of skb for rate limiting. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-10-ameryhung@gmail.com
2025-04-17selftests/bpf: Add a basic fifo qdisc testAmery Hung
This selftest includes a bare minimum fifo qdisc, which simply enqueues sk_buffs into the back of a bpf list and dequeues from the front of the list. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-9-ameryhung@gmail.com
2025-04-17libbpf: Support creating and destroying qdiscAmery Hung
Extend struct bpf_tc_hook with handle, qdisc name and a new attach type, BPF_TC_QDISC, to allow users to add or remove any qdisc specified in addition to clsact. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-8-ameryhung@gmail.com
2025-04-17bpf: net_sched: Disable attaching bpf qdisc to non rootAmery Hung
Do not allow users to attach bpf qdiscs to classful qdiscs. This is to prevent accidentally breaking existings classful qdiscs if they rely on some data in the child qdisc. This restriction can potentially be lifted in the future. Note that, we still allow bpf qdisc to be attached to mq. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-7-ameryhung@gmail.com
2025-04-17bpf: net_sched: Support updating bstatsAmery Hung
Add a kfunc to update Qdisc bstats when an skb is dequeued. The kfunc is only available in .dequeue programs. Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-6-ameryhung@gmail.com
2025-04-17bpf: net_sched: Add a qdisc watchdog timerAmery Hung
Add a watchdog timer to bpf qdisc. The watchdog can be used to schedule the execution of qdisc through kfunc, bpf_qdisc_schedule(). It can be useful for building traffic shaping scheduling algorithm, where the time the next packet will be dequeued is known. The implementation relies on struct_ops gen_prologue/epilogue to patch bpf programs provided by users. Operator specific prologue/epilogue kfuncs are introduced instead of watchdog kfuncs so that it is easier to extend prologue/epilogue in the future (writing C vs BPF bytecode). Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-5-ameryhung@gmail.com
2025-04-17bpf: net_sched: Add basic bpf qdisc kfuncsAmery Hung
Add basic kfuncs for working on skb in qdisc. Both bpf_qdisc_skb_drop() and bpf_kfree_skb() can be used to release a reference to an skb. However, bpf_qdisc_skb_drop() can only be called in .enqueue where a to_free skb list is available from kernel to defer the release. bpf_kfree_skb() should be used elsewhere. It is also used in bpf_obj_free_fields() when cleaning up skb in maps and collections. bpf_skb_get_hash() returns the flow hash of an skb, which can be used to build flow-based queueing algorithms. Finally, allow users to create read-only dynptr via bpf_dynptr_from_skb(). Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-4-ameryhung@gmail.com
2025-04-17bpf: net_sched: Support implementation of Qdisc_ops in bpfAmery Hung
The recent advancement in bpf such as allocated objects, bpf list and bpf rbtree has provided powerful and flexible building blocks to realize sophisticated packet scheduling algorithms. As struct_ops now supports core operators in Qdisc_ops, start allowing qdisc to be implemented using bpf struct_ops with this patch. Users can implement Qdisc_ops.{enqueue, dequeue, init, reset, destroy} in bpf and register the qdisc dynamically into the kernel. Co-developed-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Cong Wang <cong.wang@bytedance.com> Signed-off-by: Amery Hung <amery.hung@bytedance.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-3-ameryhung@gmail.com
2025-04-17bpf: Prepare to reuse get_ctx_arg_idxAmery Hung
Rename get_ctx_arg_idx to bpf_ctx_arg_idx, and allow others to call it. No functional change. Signed-off-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Link: https://patch.msgid.link/20250409214606.2000194-2-ameryhung@gmail.com
2025-04-17cgroup/cpuset-v1: Add missing support for cpuset_v2_modeT.J. Mercier
Android has mounted the v1 cpuset controller using filesystem type "cpuset" (not "cgroup") since 2015 [1], and depends on the resulting behavior where the controller name is not added as a prefix for cgroupfs files. [2] Later, a problem was discovered where cpu hotplug onlining did not affect the cpuset/cpus files, which Android carried an out-of-tree patch to address for a while. An attempt was made to upstream this patch, but the recommendation was to use the "cpuset_v2_mode" mount option instead. [3] An effort was made to do so, but this fails with "cgroup: Unknown parameter 'cpuset_v2_mode'" because commit e1cba4b85daa ("cgroup: Add mount flag to enable cpuset to use v2 behavior in v1 cgroup") did not update the special cased cpuset_mount(), and only the cgroup (v1) filesystem type was updated. Add parameter parsing to the cpuset filesystem type so that cpuset_v2_mode works like the cgroup filesystem type: $ mkdir /dev/cpuset $ mount -t cpuset -ocpuset_v2_mode none /dev/cpuset $ mount|grep cpuset none on /dev/cpuset type cgroup (rw,relatime,cpuset,noprefix,cpuset_v2_mode,release_agent=/sbin/cpuset_release_agent) [1] https://cs.android.com/android/_/android/platform/system/core/+/b769c8d24fd7be96f8968aa4c80b669525b930d3 [2] https://cs.android.com/android/platform/superproject/main/+/main:system/core/libprocessgroup/setup/cgroup_map_write.cpp;drc=2dac5d89a0f024a2d0cc46a80ba4ee13472f1681;l=192 [3] https://lore.kernel.org/lkml/f795f8be-a184-408a-0b5a-553d26061385@redhat.com/T/ Fixes: e1cba4b85daa ("cgroup: Add mount flag to enable cpuset to use v2 behavior in v1 cgroup") Signed-off-by: T.J. Mercier <tjmercier@google.com> Acked-by: Waiman Long <longman@redhat.com> Reviewed-by: Kamalesh Babulal <kamalesh.babulal@oracle.com> Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-04-17Merge tag 'for-linus-6.15a-rc3-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fix from Juergen Gross: "Just a single fix for the Xen multicall driver avoiding a percpu variable referencing initdata by its initializer" * tag 'for-linus-6.15a-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen: fix multicall debug feature
2025-04-17Merge tag 'for-linus-fwctl' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull fwctl fixes from Jason Gunthorpe: "Three small changes from further build testing: - Don't rely on the userspace uuid.h for the uapi header - Fix sparse warnings in pds - Typo in log message" * tag 'for-linus-fwctl' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: fwctl: Fix repeated device word in log message pds_fwctl: Fix type and endian complaints fwctl/cxl: Fix uuid_t usage in uapi
2025-04-17Merge tag 'sound-6.15-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes. All are device-specific like quirks, new IDs, and other safe (or rather boring) changes" * tag 'sound-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: firmware: cs_dsp: test_bin_error: Fix uninitialized data used as fw version ASoC: codecs: Add of_match_table for aw888081 driver ASoC: fsl: fsl_qmc_audio: Reset audio data pointers on TRIGGER_START event mailmap: Add entry for Srinivas Kandagatla MAINTAINERS: use kernel.org alias ASoC: cs42l43: Reset clamp override on jack removal ALSA: hda/realtek - Fixed ASUS platform headset Mic issue ALSA: hda/cirrus_scodec_test: Don't select dependencies ALSA: azt2320: Replace deprecated strcpy() with strscpy() ASoC: hdmi-codec: use RTD ID instead of DAI ID for ELD entry ASoC: Intel: avs: Constrain path based on BE capabilities ALSA: hda/tas2781: Remove unnecessary NULL check before release_firmware() ASoC: Intel: avs: Fix null-ptr-deref in avs_component_probe() ASoC: fsl_asrc_dma: get codec or cpu dai from backend ASoC: qcom: Fix sc7280 lpass potential buffer overflow ASoC: dwc: always enable/disable i2s irqs ASoC: Intel: sof_sdw: Add quirk for Asus Zenbook S16 ASoC: codecs:lpass-wsa-macro: Fix logic of enabling vi channels ASoC: codecs:lpass-wsa-macro: Fix vi feedback rate
2025-04-17Merge tag 'platform-drivers-x86-v6.15-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform drivers fixes from Ilpo Järvinen: "Fixes: - amd/pmf: Fix STT limits - asus-laptop: Fix an uninitialized variable - intel_pmc_ipc: Allow building without ACPI - mlxbf-bootctl: Use sysfs_emit_at() in secure_boot_fuse_state_show() - msi-wmi-platform: Add locking to workaround ACPI firmware bug New HW support: - alienware-wmi-wmax: - Extended thermal control support to: - Alienware Area-51m R2 - Alienware m16 R1 - Alienware m16 R2 - Dell G16 7630 - Dell G5 5505 SE - G-Mode support to Alienware m16 R1 - x86-android-tablets: Add Vexia Edu Atla 10 tablet 5V data" * tag 'platform-drivers-x86-v6.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: msi-wmi-platform: Workaround a ACPI firmware bug platform/x86: msi-wmi-platform: Rename "data" variable platform/x86: alienware-wmi-wmax: Extend support to more laptops platform/x86: alienware-wmi-wmax: Add G-Mode support to Alienware m16 R1 platform/x86: amd: pmf: Fix STT limits mlxbf-bootctl: use sysfs_emit_at() in secure_boot_fuse_state_show() platform/x86: x86-android-tablets: Add Vexia Edu Atla 10 tablet 5V data platform/x86: x86-android-tablets: Add "9v" to Vexia EDU ATLA 10 tablet symbols asus-laptop: Fix an uninitialized variable platform/x86: intel_pmc_ipc: add option to build without ACPI
2025-04-17Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Small drivers fixes, except for ufs which has two large updates, one for exposing the device level feature, which is a new addition to the device spec and the other reworking the exynos driver to fix coherence issues on some android phones" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: megaraid_sas: Driver version update to 07.734.00.00-rc1 scsi: megaraid_sas: Block zero-length ATA VPD inquiry scsi: scsi_transport_srp: Replace min/max nesting with clamp() scsi: ufs: core: Add device level exception support scsi: ufs: core: Rename ufshcd_wb_presrv_usrspc_keep_vcc_on() scsi: smartpqi: Use is_kdump_kernel() to check for kdump scsi: pm80xx: Set phy_attached to zero when device is gone scsi: ufs: exynos: gs101: Put UFS device in reset on .suspend() scsi: ufs: exynos: Move phy calls to .exit() callback scsi: ufs: exynos: Enable PRDT pre-fetching with UFSHCD_CAP_CRYPTO scsi: ufs: exynos: Ensure consistent phy reference counts scsi: ufs: exynos: Disable iocc if dma-coherent property isn't set scsi: ufs: exynos: Move UFS shareability value to drvdata scsi: ufs: exynos: Ensure pre_link() executes before exynos_ufs_phy_init() scsi: iscsi: Fix missing scsi_host_put() in error path scsi: ufs: core: Fix a race condition related to device commands scsi: hisi_sas: Fix I/O errors caused by hardware port ID changes scsi: hisi_sas: Enable force phy when SATA disk directly connected
2025-04-17Merge tag 'ata-6.15-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux Pull ata fix from Damien Le Moal: - Fix how sense data from the sense data for successfull NCQ commands log page is used to fully initialize the result_tf of a completed command, so that the sense data returned to the scsi layer is fully initialized with all the device provided information (from Niklas) * tag 'ata-6.15-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: libata-sata: Save all fields from sense data descriptor
2025-04-17cgroup: Fix compilation issue due to cgroup_mutex not being exportedgaoxu
When adding folio_memcg function call in the zram module for Android16-6.12, the following error occurs during compilation: ERROR: modpost: "cgroup_mutex" [../soc-repo/zram.ko] undefined! This error is caused by the indirect call to lockdep_is_held(&cgroup_mutex) within folio_memcg. The export setting for cgroup_mutex is controlled by the CONFIG_PROVE_RCU macro. If CONFIG_LOCKDEP is enabled while CONFIG_PROVE_RCU is not, this compilation error will occur. To resolve this issue, add a parallel macro CONFIG_LOCKDEP control to ensure cgroup_mutex is properly exported when needed. Signed-off-by: gao xu <gaoxu2@honor.com> Acked-by: Michal Koutný <mkoutny@suse.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-04-17Merge tag 'xfs-fixes-6.15-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull XFS fixes from Carlos Maiolino: "This mostly includes fixes and documentation for the zoned allocator feature merged during previous merge window, but it also adds a sysfs tunable for the zone garbage collector. There is also a fix for a regression to the RT device that we'd like to fix ASAP now that we're getting more users on the RT zoned allocator" * tag 'xfs-fixes-6.15-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: document zoned rt specifics in admin-guide xfs: fix fsmap for internal zoned devices xfs: Fix spelling mistake "drity" -> "dirty" xfs: compute buffer address correctly in xmbuf_map_backing_mem xfs: add tunable threshold parameter for triggering zone GC xfs: mark xfs_buf_free as might_sleep() xfs: remove the leftover xfs_{set,clear}_li_failed infrastructure
2025-04-17Merge tag 'for-6.15-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - handle encoded read ioctl returning EAGAIN so it does not mistakenly free the work structure - escape subvolume path in mount option list so it cannot be wrongly parsed when the path contains "," - remove folio size assertions when writing super block to device with enabled large folios * tag 'for-6.15-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: remove folio order ASSERT()s in super block writeback path btrfs: correctly escape subvol in btrfs_show_options() btrfs: ioctl: don't free iov when btrfs_encoded_read() returns -EAGAIN
2025-04-17Merge tag 'slab-for-6.15-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab fix from Vlastimil Babka: - Stable fix adding zero initialization of slab->obj_ext to prevent crashes with allocation profiling (Suren Baghdasaryan) * tag 'slab-for-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: slab: ensure slab->obj_exts is clear in a newly allocated slab page
2025-04-17Merge tag 'amd-pstate-v6.15-2025-04-15' of ↵Rafael J. Wysocki
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux Merge amd-pstate content for 6.15 (4/15/25) from Mario Limonciello: "Add a fix for X3D processors where depending upon what BIOS was set initially rankings might be set improperly. Add a fix for changing min/max limits while on the performance governor." * tag 'amd-pstate-v6.15-2025-04-15' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: cpufreq/amd-pstate: Enable ITMT support after initializing core rankings cpufreq/amd-pstate: Fix min_limit perf and freq updation for performance governor
2025-04-17cpufreq: Avoid using inconsistent policy->min and policy->maxRafael J. Wysocki
Since cpufreq_driver_resolve_freq() can run in parallel with cpufreq_set_policy() and there is no synchronization between them, the former may access policy->min and policy->max while the latter is updating them and it may see intermediate values of them due to the way the update is carried out. Also the compiler is free to apply any optimizations it wants both to the stores in cpufreq_set_policy() and to the loads in cpufreq_driver_resolve_freq() which may result in additional inconsistencies. To address this, use WRITE_ONCE() when updating policy->min and policy->max in cpufreq_set_policy() and use READ_ONCE() for reading them in cpufreq_driver_resolve_freq(). Moreover, rearrange the update in cpufreq_set_policy() to avoid storing intermediate values in policy->min and policy->max with the help of the observation that their new values are expected to be properly ordered upfront. Also modify cpufreq_driver_resolve_freq() to take the possible reverse ordering of policy->min and policy->max, which may happen depending on the ordering of operations when this function and cpufreq_set_policy() run concurrently, into account by always honoring the max when it turns out to be less than the min (in case it comes from thermal throttling or similar). Fixes: 151717690694 ("cpufreq: Make policy min/max hard requirements") Cc: 5.16+ <stable@vger.kernel.org> # 5.16+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://patch.msgid.link/5907080.DvuYhMxLoT@rjwysocki.net
2025-04-17cpufreq/sched: Set need_freq_update in ignore_dl_rate_limit()Rafael J. Wysocki
Notice that ignore_dl_rate_limit() need not piggy back on the limits_changed handling to achieve its goal (which is to enforce a frequency update before its due time). Namely, if sugov_should_update_freq() is updated to check sg_policy->need_freq_update and return 'true' if it is set when sg_policy->limits_changed is not set, ignore_dl_rate_limit() may set the former directly instead of setting the latter, so it can avoid hitting the memory barrier in sugov_should_update_freq(). Update the code accordingly. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/10666429.nUPlyArG6x@rjwysocki.net
2025-04-17cpufreq/sched: Explicitly synchronize limits_changed flag handlingRafael J. Wysocki
The handling of the limits_changed flag in struct sugov_policy needs to be explicitly synchronized to ensure that cpufreq policy limits updates will not be missed in some cases. Without that synchronization it is theoretically possible that the limits_changed update in sugov_should_update_freq() will be reordered with respect to the reads of the policy limits in cpufreq_driver_resolve_freq() and in that case, if the limits_changed update in sugov_limits() clobbers the one in sugov_should_update_freq(), the new policy limits may not take effect for a long time. Likewise, the limits_changed update in sugov_limits() may theoretically get reordered with respect to the updates of the policy limits in cpufreq_set_policy() and if sugov_should_update_freq() runs between them, the policy limits change may be missed. To ensure that the above situations will not take place, add memory barriers preventing the reordering in question from taking place and add READ_ONCE() and WRITE_ONCE() annotations around all of the limits_changed flag updates to prevent the compiler from messing up with that code. Fixes: 600f5badb78c ("cpufreq: schedutil: Don't skip freq update when limits change") Cc: 5.3+ <stable@vger.kernel.org> # 5.3+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/3376719.44csPzL39Z@rjwysocki.net
2025-04-17cpufreq/sched: Fix the usage of CPUFREQ_NEED_UPDATE_LIMITSRafael J. Wysocki
Commit 8e461a1cb43d ("cpufreq: schedutil: Fix superfluous updates caused by need_freq_update") modified sugov_should_update_freq() to set the need_freq_update flag only for drivers with CPUFREQ_NEED_UPDATE_LIMITS set, but that flag generally needs to be set when the policy limits change because the driver callback may need to be invoked for the new limits to take effect. However, if the return value of cpufreq_driver_resolve_freq() after applying the new limits is still equal to the previously selected frequency, the driver callback needs to be invoked only in the case when CPUFREQ_NEED_UPDATE_LIMITS is set (which means that the driver specifically wants its callback to be invoked every time the policy limits change). Update the code accordingly to avoid missing policy limits changes for drivers without CPUFREQ_NEED_UPDATE_LIMITS. Fixes: 8e461a1cb43d ("cpufreq: schedutil: Fix superfluous updates caused by need_freq_update") Closes: https://lore.kernel.org/lkml/Z_Tlc6Qs-tYpxWYb@linaro.org/ Reported-by: Stephan Gerhold <stephan.gerhold@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/3010358.e9J7NaK4W3@rjwysocki.net
2025-04-17net: ethernet: mtk_eth_soc: revise QDMA packet scheduler settingsBo-Cun Chen
The QDMA packet scheduler suffers from a performance issue. Fix this by picking up changes from MediaTek's SDK which change to use Token Bucket instead of Leaky Bucket and fix the SPEED_1000 configuration. Fixes: 160d3a9b1929 ("net: ethernet: mtk_eth_soc: introduce MTK_NETSYS_V2 support") Signed-off-by: Bo-Cun Chen <bc-bocun.chen@mediatek.com> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/18040f60f9e2f5855036b75b28c4332a2d2ebdd8.1744764277.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-17net: ethernet: mtk_eth_soc: correct the max weight of the queue limit for ↵Bo-Cun Chen
100Mbps Without this patch, the maximum weight of the queue limit will be incorrect when linked at 100Mbps due to an apparent typo. Fixes: f63959c7eec31 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues") Signed-off-by: Bo-Cun Chen <bc-bocun.chen@mediatek.com> Signed-off-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/74111ba0bdb13743313999ed467ce564e8189006.1744764277.git.daniel@makrotopia.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>