summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-12-02selftests/bpf: remove test_flow_dissector.shAlexis Lothoré (eBPF Foundation)
Now that test_flow_dissector.sh has been converted to test_progs, remove the legacy test. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-14-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: migrate bpf flow dissectors tests to test_progsAlexis Lothoré (eBPF Foundation)
test_flow_dissector.sh loads flow_dissector program and subprograms, creates and configured relevant tunnels and interfaces, and ensure that the bpf dissection is actually performed correctly. Similar tests exist in test_progs (thanks to flow_dissector.c) and run the same programs, but those are only executed with BPF_PROG_RUN: those tests are then missing some coverage (eg: coverage for flow keys manipulated when the configured flower uses a port range, which has a dedicated test in test_flow_dissector.sh) Convert test_flow_dissector.sh into test_progs so that the corresponding tests are also run in CI. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-13-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: add network helpers to generate udp checksumsAlexis Lothoré (eBPF Foundation)
network_helpers.c provides some helpers to generate ip checksums or ip pseudo-header checksums, but not for upper layers (eg: udp checksums) Add helpers for udp checksum to allow manually building udp packets. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-12-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: use the same udp and tcp headers in tests under test_progsAlexis Lothoré (eBPF Foundation)
Trying to add udp-dedicated helpers in network_helpers involves including some udp header, which makes multiple test_progs tests build fail: In file included from ./progs/test_cls_redirect.h:13, from [...]/prog_tests/cls_redirect.c:15: [...]/usr/include/linux/udp.h:23:8: error: redefinition of ‘struct udphdr’ 23 | struct udphdr { | ^~~~~~ In file included from ./network_helpers.h:17, from [...]/prog_tests/cls_redirect.c:13: [...]/usr/include/netinet/udp.h:55:8: note: originally defined here 55 | struct udphdr | ^~~~~~ This error is due to struct udphdr being defined in both <linux/udp.h> and <netinet/udp.h>. Use only <netinet/udp.h> in every test. While at it, perform the same for tcp.h. For some tests, the change needs to be done in the eBPF program part as well, because of some headers sharing between both sides. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-11-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: document pseudo-header checksum helpersAlexis Lothoré (eBPF Foundation)
network_helpers.h provides helpers to compute checksum for pseudo headers but no helpers to compute the global checksums. Before adding those, clarify csum_tcpudp_magic and csum_ipv6_magic purpose by adding some documentation. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-10-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: move ip checksum helper to network helpersAlexis Lothoré (eBPF Foundation)
xdp_metadata test has a small helper computing ipv4 checksums to allow manually building packets. Move this helper to network_helpers to share it with other tests. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-9-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: Enable generic tc actions in selftests configAlexis Lothoré (eBPF Foundation)
Enable CONFIG_NET_ACT_GACT to allow adding simple actions with tc filters. This is for example needed to migrate test_flow_dissector into the automated testing performed in CI. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-8-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: migrate flow_dissector namespace exclusivity testAlexis Lothoré (eBPF Foundation)
Commit a11c397c43d5 ("bpf/flow_dissector: add mode to enforce global BPF flow dissector") is currently tested in test_flow_dissector.sh, which is not part of test_progs. Add the corresponding test to flow_dissector.c, which is part of test_progs. The new test reproduces the behavior implemented in its shell script counterpart: - attach a flow dissector program to the root net namespace, ensure that we can not attach another flow dissector in any non-root net namespace - attach a flow dissector program to a non-root net namespace, ensure that we can not attach another flow dissector in root namespace Since the new test is performing operations in the root net namespace, make sure to set it as a "serial" test to make sure not to conflict with any other test. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-7-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: add gre packets testing to flow_dissectorAlexis Lothoré (eBPF Foundation)
The bpf_flow program is able to handle GRE headers in IP packets. Add a few test data input simulating those GRE packets, with 2 different cases: - parse GRE and the encapsulated packet - parse GRE only Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-6-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: expose all subtests from flow_dissectorAlexis Lothoré (eBPF Foundation)
The flow_dissector test integrated in test_progs actually runs a wide matrix of tests over different packets types and bpf programs modes, but exposes only 3 main tests, preventing tests users from running specific subtests with a specific input only. Expose all subtests executed by flow_dissector by using test__start_subtest(). Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-5-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: re-split main function into dedicated testsAlexis Lothoré (eBPF Foundation)
The flow_dissector runs plenty of tests over diffent kind of packets, grouped into three categories: skb mode, non-skb mode with direct attach, and non-skb with indirect attach. Re-split the main function into dedicated tests. Each test now must have its own setup/teardown, but for the advantage of being able to run them separately. While at it, make sure that tests attaching the bpf programs are run in a dedicated ns. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-4-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: replace CHECK calls with ASSERT macros in flow_dissector testAlexis Lothoré (eBPF Foundation)
The flow dissector test currently relies on generic CHECK macros to perform tests. Update those to newer, more-specific ASSERT macros. This update allows to get rid of the global duration variable, which was needed by the CHECK macros Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-3-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: use ASSERT_MEMEQ to compare bpf flow keysAlexis Lothoré (eBPF Foundation)
The flow_dissector program currently compares flow keys returned by bpf program with the expected one thanks to a custom macro using memcmp. Use the new ASSERT_MEMEQ macro to perform this comparision. This update also allows to get rid of the unused bpf_test_run_opts variable in run_tests_skb_less (it was only used by the CHECK macro for its duration field) Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-2-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-02selftests/bpf: add a macro to compare raw memoryAlexis Lothoré (eBPF Foundation)
We sometimes need to compare whole structures in an assert. It is possible to use the existing macros on each field, but when the whole structure has to be checked, it is more convenient to simply compare the whole structure memory Add a dedicated assert macro, ASSERT_MEMEQ, to allow bare memory comparision The output generated by this new macro looks like the following: [...] run_tests_skb_less:FAIL:returned flow keys unexpected memory mismatch actual: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 expected: 0E 00 3E 00 DD 86 01 01 00 06 86 DD 50 00 90 1F 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 00 [...] Acked-by: Stanislav Fomichev <sdf@fomichev.me> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20241120-flow_dissector-v3-1-45b46494f937@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-12-01Get rid of 'remove_new' relic from platform driver structLinus Torvalds
The continual trickle of small conversion patches is grating on me, and is really not helping. Just get rid of the 'remove_new' member function, which is just an alias for the plain 'remove', and had a comment to that effect: /* * .remove_new() is a relic from a prototype conversion of .remove(). * New drivers are supposed to implement .remove(). Once all drivers are * converted to not use .remove_new any more, it will be dropped. */ This was just a tree-wide 'sed' script that replaced '.remove_new' with '.remove', with some care taken to turn a subsequent tab into two tabs to make things line up. I did do some minimal manual whitespace adjustment for places that used spaces to line things up. Then I just removed the old (sic) .remove_new member function, and this is the end result. No more unnecessary conversion noise. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-12-01Linux 6.13-rc1v6.13-rc1Linus Torvalds
2024-12-01Merge tag 'i2c-for-6.13-rc1-part3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c component probing support from Wolfram Sang: "Add OF component probing. Some devices are designed and manufactured with some components having multiple drop-in replacement options. These components are often connected to the mainboard via ribbon cables, having the same signals and pin assignments across all options. These may include the display panel and touchscreen on laptops and tablets, and the trackpad on laptops. Sometimes which component option is used in a particular device can be detected by some firmware provided identifier, other times that information is not available, and the kernel has to try to probe each device. Instead of a delicate dance between drivers and device tree quirks, this change introduces a simple I2C component probe function. For a given class of devices on the same I2C bus, it will go through all of them, doing a simple I2C read transfer and see which one of them responds. It will then enable the device that responds" * tag 'i2c-for-6.13-rc1-part3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: MAINTAINERS: fix typo in I2C OF COMPONENT PROBER of: base: Document prefix argument for of_get_next_child_with_prefix() i2c: Fix whitespace style issue arm64: dts: mediatek: mt8173-elm-hana: Mark touchscreens and trackpads as fail platform/chrome: Introduce device tree hardware prober i2c: of-prober: Add GPIO support to simple helpers i2c: of-prober: Add simple helpers for regulator support i2c: Introduce OF component probe function of: base: Add for_each_child_of_node_with_prefix() of: dynamic: Add of_changeset_update_prop_string
2024-12-01Merge tag 'trace-printf-v6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull bprintf() removal from Steven Rostedt: - Remove unused bprintf() function, that was added with the rest of the "bin-printf" functions. These are functions that are used by trace_printk() that allows to quickly save the format and arguments into the ring buffer without the expensive processing of converting numbers to ASCII. Then on output, at a much later time, the ring buffer is read and the string processing occurs then. The bprintf() was added for consistency but was never used. It can be safely removed. * tag 'trace-printf-v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: printf: Remove unused 'bprintf'
2024-12-01Merge tag 'timers_urgent_for_v6.13_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Borislav Petkov: - Fix a case where posix timers with a thread-group-wide target would miss signals if some of the group's threads are exiting - Fix a hang caused by ndelay() calling the wrong delay function __udelay() - Fix a wrong offset calculation in adjtimex(2) when using ADJ_MICRO (microsecond resolution) and a negative offset * tag 'timers_urgent_for_v6.13_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: posix-timers: Target group sigqueue to current task only if not exiting delay: Fix ndelay() spuriously treated as udelay() ntp: Remove invalid cast in time offset math
2024-12-01Merge tag 'irq_urgent_for_v6.13_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Move the ->select callback to the correct ops structure in irq-mvebu-sei to fix some Marvell Armada platforms - Add a workaround for Hisilicon ITS erratum 162100801 which can cause some virtual interrupts to get lost - More platform_driver::remove() conversion * tag 'irq_urgent_for_v6.13_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: irqchip: Switch back to struct platform_driver::remove() irqchip/gicv3-its: Add workaround for hip09 ITS erratum 162100801 irqchip/irq-mvebu-sei: Move misplaced select() callback to SEI CP domain
2024-12-01Merge tag 'x86_urgent_for_v6.13_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Add a terminating zero end-element to the array describing AMD CPUs affected by erratum 1386 so that the matching loop actually terminates instead of going off into the weeds - Update the boot protocol documentation to mention the fact that the preferred address to load the kernel to is considered in the relocatable kernel case too - Flush the memory buffer containing the microcode patch after applying microcode on AMD Zen1 and Zen2, to avoid unnecessary slowdowns - Make sure the PPIN CPU feature flag is cleared on all CPUs if PPIN has been disabled * tag 'x86_urgent_for_v6.13_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/CPU/AMD: Terminate the erratum_1386_microcode array x86/Documentation: Update algo in init_size description of boot protocol x86/microcode/AMD: Flush patch buffer mapping after application x86/mm: Carve out INVLPG inline asm for use by others x86/cpu: Fix PPIN initialization
2024-12-01strscpy: write destination buffer only onceLinus Torvalds
The point behind strscpy() was to once and for all avoid all the problems with 'strncpy()' and later broken "fixed" versions like strlcpy() that just made things worse. So strscpy not only guarantees NUL-termination (unlike strncpy), it also doesn't do unnecessary padding at the destination. But at the same time also avoids byte-at-a-time reads and writes by _allowing_ some extra NUL writes - within the size, of course - so that the whole copy can be done with word operations. It is also stable in the face of a mutable source string: it explicitly does not read the source buffer multiple times (so an implementation using "strnlen()+memcpy()" would be wrong), and does not read the source buffer past the size (like the mis-design that is strlcpy does). Finally, the return value is designed to be simple and unambiguous: if the string cannot be copied fully, it returns an actual negative error, making error handling clearer and simpler (and the caller already knows the size of the buffer). Otherwise it returns the string length of the result. However, there was one final stability issue that can be important to callers: the stability of the destination buffer. In particular, the same way we shouldn't read the source buffer more than once, we should avoid doing multiple writes to the destination buffer: first writing a potentially non-terminated string, and then terminating it with NUL at the end does not result in a stable result buffer. Yes, it gives the right result in the end, but if the rule for the destination buffer was that it is _always_ NUL-terminated even when accessed concurrently with updates, the final byte of the buffer needs to always _stay_ as a NUL byte. [ Note that "final byte is NUL" here is literally about the final byte in the destination array, not the terminating NUL at the end of the string itself. There is no attempt to try to make concurrent reads and writes give any kind of consistent string length or contents, but we do want to guarantee that there is always at least that final terminating NUL character at the end of the destination array if it existed before ] This is relevant in the kernel for the tsk->comm[] array, for example. Even without locking (for either readers or writers), we want to know that while the buffer contents may be garbled, it is always a valid C string and always has a NUL character at 'comm[TASK_COMM_LEN-1]' (and never has any "out of thin air" data). So avoid any "copy possibly non-terminated string, and terminate later" behavior, and write the destination buffer only once. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-11-30printf: Remove unused 'bprintf'Dr. David Alan Gilbert
bprintf() is unused. Remove it. It was added in the commit 4370aa4aa753 ("vsprintf: add binary printf") but as far as I can see was never used, unlike the other two functions in that patch. Link: https://lore.kernel.org/20241002173147.210107-1-linux@treblig.org Reviewed-by: Andy Shevchenko <andy@kernel.org> Acked-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-11-30Merge tag 'turbostat-2024.11.30' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat updates from Len Brown: - assorted minor bug fixes - assorted platform specific tweaks - initial RAPL PSYS (SysWatt) support * tag 'turbostat-2024.11.30' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: tools/power turbostat: 2024.11.30 tools/power turbostat: Add RAPL psys as a built-in counter tools/power turbostat: Fix child's argument forwarding tools/power turbostat: Force --no-perf in --dump mode tools/power turbostat: Add support for /sys/class/drm/card1 tools/power turbostat: Cache graphics sysfs file descriptors during probe tools/power turbostat: Consolidate graphics sysfs access tools/power turbostat: Remove unnecessary fflush() call tools/power turbostat: Enhance platform divergence description tools/power turbostat: Add initial support for GraniteRapids-D tools/power turbostat: Remove PC3 support on Lunarlake tools/power turbostat: Rename arl_features to lnl_features tools/power turbostat: Add back PC8 support on Arrowlake tools/power turbostat: Remove PC7/PC9 support on MTL tools/power turbostat: Honor --show CPU, even when even when num_cpus=1 tools/power turbostat: Fix trailing '\n' parsing tools/power turbostat: Allow using cpu device in perf counters on hybrid platforms tools/power turbostat: Fix column printing for PMT xtal_time counters tools/power turbostat: fix GCC9 build regression
2024-11-30Merge tag 'pci-v6.13-fixes-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI fix from Bjorn Helgaas: - When removing a PCI device, only look up and remove a platform device if there is an associated device node for which there could be a platform device, to fix a merge window regression (Brian Norris) * tag 'pci-v6.13-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI/pwrctrl: Unregister platform device only if one actually exists
2024-11-30Merge tag 'lsm-pr-20241129' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull ima fix from Paul Moore: "One small patch to fix a function parameter / local variable naming snafu that went up to you in the current merge window" * tag 'lsm-pr-20241129' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: ima: uncover hidden variable in ima_match_rules()
2024-11-30Merge tag 'block-6.13-20242901' of git://git.kernel.dk/linuxLinus Torvalds
Pull more block updates from Jens Axboe: - NVMe pull request via Keith: - Use correct srcu list traversal (Breno) - Scatter-gather support for metadata (Keith) - Fabrics shutdown race condition fix (Nilay) - Persistent reservations updates (Guixin) - Add the required bits for MD atomic write support for raid0/1/10 - Correct return value for unknown opcode in ublk - Fix deadlock with zone revalidation - Fix for the io priority request vs bio cleanups - Use the correct unsigned int type for various limit helpers - Fix for a race in loop - Cleanup blk_rq_prep_clone() to prevent uninit-value warning and make it easier for actual humans to read - Fix potential UAF when iterating tags - A few fixes for bfq-iosched UAF issues - Fix for brd discard not decrementing the allocated page count - Various little fixes and cleanups * tag 'block-6.13-20242901' of git://git.kernel.dk/linux: (36 commits) brd: decrease the number of allocated pages which discarded block, bfq: fix bfqq uaf in bfq_limit_depth() block: Don't allow an atomic write be truncated in blkdev_write_iter() mq-deadline: don't call req_get_ioprio from the I/O completion handler block: Prevent potential deadlock in blk_revalidate_disk_zones() block: Remove extra part pointer NULLify in blk_rq_init() nvme: tuning pr code by using defined structs and macros nvme: introduce change ptpl and iekey definition block: return bool from get_disk_ro and bdev_read_only block: remove a duplicate definition for bdev_read_only block: return bool from blk_rq_aligned block: return unsigned int from blk_lim_dma_alignment_and_pad block: return unsigned int from queue_dma_alignment block: return unsigned int from bdev_io_opt block: req->bio is always set in the merge code block: don't bother checking the data direction for merges block: blk-mq: fix uninit-value in blk_rq_prep_clone and refactor Revert "block, bfq: merge bfq_release_process_ref() into bfq_put_cooperator()" md/raid10: Atomic write support md/raid1: Atomic write support ...
2024-11-30Merge tag 'io_uring-6.13-20242901' of git://git.kernel.dk/linuxLinus Torvalds
Pull more io_uring updates from Jens Axboe: - Remove a leftover struct from when the cqwait registered waiting was transitioned to regions. - Fix for an issue introduced in this merge window, where nop->fd might be used uninitialized. Ensure it's always set. - Add capping of the task_work run in local task_work mode, to prevent bursty and long chains from adding too much latency. - Work around xa_store() leaving ->head non-NULL if it encounters an allocation error during storing. Just a debug trigger, and can go away once xa_store() behaves in a more expected way for this condition. Not a major thing as it basically requires fault injection to trigger it. - Fix a few mapping corner cases - Fix KCSAN complaint on reading the table size post unlock. Again not a "real" issue, but it's easy to silence by just keeping the reading inside the lock that protects it. * tag 'io_uring-6.13-20242901' of git://git.kernel.dk/linux: io_uring/tctx: work around xa_store() allocation error issue io_uring: fix corner case forgetting to vunmap io_uring: fix task_work cap overshooting io_uring: check for overflows in io_pin_pages io_uring/nop: ensure nop->fd is always initialized io_uring: limit local tw done io_uring: add io_local_work_pending() io_uring/region: return negative -E2BIG in io_create_region() io_uring: protect register tracing io_uring: remove io_uring_cqwait_reg_arg
2024-11-30Merge tag 'dma-mapping-6.13-2024-11-30' of ↵Linus Torvalds
git://git.infradead.org/users/hch/dma-mapping Pull dma-mapping fix from Christoph Hellwig: - fix physical address calculation for struct dma_debug_entry (Fedor Pchelkin) * tag 'dma-mapping-6.13-2024-11-30' of git://git.infradead.org/users/hch/dma-mapping: dma-debug: fix physical address calculation for struct dma_debug_entry
2024-11-30Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull more kvm updates from Paolo Bonzini: - ARM fixes - RISC-V Svade and Svadu (accessed and dirty bit) extension support for host and guest * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: riscv: selftests: Add Svade and Svadu Extension to get-reg-list test RISC-V: KVM: Add Svade and Svadu Extensions Support for Guest/VM dt-bindings: riscv: Add Svade and Svadu Entries RISC-V: Add Svade and Svadu Extensions Support KVM: arm64: Use MDCR_EL2.HPME to evaluate overflow of hyp counters KVM: arm64: Ignore PMCNTENSET_EL0 while checking for overflow status KVM: arm64: Mark set_sysreg_masks() as inline to avoid build failure KVM: arm64: vgic-its: Add stronger type-checking to the ITS entry sizes KVM: arm64: vgic: Kill VGIC_MAX_PRIVATE definition KVM: arm64: vgic: Make vgic_get_irq() more robust KVM: arm64: vgic-v3: Sanitise guest writes to GICR_INVLPIR
2024-11-30Merge tag 'sh-for-v6.13-tag1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux Pull sh updates from John Paul Adrian Glaubitz: "Two small fixes. The first one by Huacai Chen addresses a runtime warning when CONFIG_CPUMASK_OFFSTACK and CONFIG_DEBUG_PER_CPU_MAPS are selected which occurs because the cpuinfo code on sh incorrectly uses NR_CPUS when iterating CPUs instead of the runtime limit nr_cpu_ids. A second fix by Dan Carpenter fixes a use-after-free bug in register_intc_controller() which occurred as a result of improper error handling in the interrupt controller driver code when registering an interrupt controller during plat_irq_setup() on sh" * tag 'sh-for-v6.13-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/glaubitz/sh-linux: sh: intc: Fix use-after-free bug in register_intc_controller() sh: cpuinfo: Fix a warning for CONFIG_CPUMASK_OFFSTACK
2024-11-30Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Catalin Marinas: - Deselect ARCH_CORRECT_STACKTRACE_ON_KRETPROBE so that tests depending on it don't run (and fail) on arm64 - Fix lockdep assert in the Arm SMMUv3 PMU driver - Fix the port and device ID bits setting in the Arm CMN perf driver * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: perf/arm-cmn: Ensure port and device id bits are set properly perf/arm-smmuv3: Fix lockdep assert in ->event_init() arm64: disable ARCH_CORRECT_STACKTRACE_ON_KRETPROBE tests
2024-11-30tools/power turbostat: 2024.11.30Len Brown
since 2024.07.26: assorted minor bug fixes assorted platform specific tweaks initial RAPL PSYS (SysWatt) support Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Add RAPL psys as a built-in counterPatryk Wlazlyn
Introduce the counter as a part of global, platform counters structure. We open the counter for only one cpu, but otherwise treat it as an ordinary RAPL counter, allowing for grouped perf read. The counter is disabled by default, because it's interpretation may require additional, platform specific information, making it unsuitable for general use. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Fix child's argument forwardingPatryk Wlazlyn
Add '+' to optstring when early scanning for --no-msr and --no-perf. It causes option processing to stop as soon as a nonoption argument is encountered, effectively skipping child's arguments. Fixes: 3e4048466c39 ("tools/power turbostat: Add --no-msr option") Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Force --no-perf in --dump modePatryk Wlazlyn
Force the --no-perf early to prevent using it as a source. User asks for raw values, but perf returns them relative to the opening of the file descriptor. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Add support for /sys/class/drm/card1Zhang Rui
On some machines, the graphics device is enumerated as /sys/class/drm/card1 instead of /sys/class/drm/card0. The current implementation does not handle this scenario, resulting in the loss of graphics C6 residency and frequency information. Add support for /sys/class/drm/card1, ensuring that turbostat can retrieve and display the graphics columns for these platforms. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Cache graphics sysfs file descriptors during probeZhang Rui
Snapshots of the graphics sysfs knobs are taken based on file descriptors. To optimize this process, open the files and cache the file descriptors during the graphics probe phase. As a result, the previously cached pathnames become redundant and are removed. This change aims to streamline the code without altering its functionality. No functional change intended. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Consolidate graphics sysfs accessZhang Rui
Currently, there is an inconsistency in how graphics sysfs knobs are accessed: graphics residency sysfs knobs are opened and closed for each read, while graphics frequency sysfs knobs are opened once and remain open until turbostat exits. This inconsistency is confusing and adds unnecessary code complexity. Consolidate the access method by opening the sysfs files once and reusing the file pointers for subsequent accesses. This approach simplifies the code and ensures a consistent method for accessing graphics sysfs knobs. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Remove unnecessary fflush() callZhang Rui
The graphics sysfs knobs are read-only, making the use of fflush() before reading them redundant. Remove the unnecessary fflush() call. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Enhance platform divergence descriptionZhang Rui
In various generations, platforms often share a majority of features, diverging only in a few specific aspects. The current approach of using hardcoded values in 'platform_features' structure fails to effectively represent these divergences. To improve the description of platform divergence: 1. Each newly introduced 'platform_features' structure must have a base, typically derived from the previous generation. 2. Platform feature values should be inherited from the base structure rather than being hardcoded. This approach ensures a more accurate and maintainable representation of platform-specific features across different generations. Converts `adl_features` and `lnl_features` to follow this new scheme. No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Add initial support for GraniteRapids-DZhang Rui
Add initial support for GraniteRapids-D. It shares the same features with SapphireRapids. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Remove PC3 support on LunarlakeZhang Rui
Lunarlake supports CC1/CC6/CC7/PC2/PC6/PC10. Remove PC3 support on Lunarlake. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Rename arl_features to lnl_featuresZhang Rui
As ARL shares the same features with ADL/RPL/MTL, now 'arl_features' is used by Lunarlake platform only. Rename 'arl_features' to 'lnl_features'. No functional change. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Add back PC8 support on ArrowlakeZhang Rui
Similar to ADL/RPL/MTL, ARL supports CC1/CC6/CC7/PC2/PC3/PC6/PC8/PC10. Add back PC8 support on Arrowlake. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Remove PC7/PC9 support on MTLZhang Rui
Similar to ADL/RPL, MTL support CC1/CC6/CC7/PC2/PC3/PC6/PC8/CP10. Remove PC7/PC9 support on MTL. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Honor --show CPU, even when even when num_cpus=1Patryk Wlazlyn
Honor --show CPU and --show Core when "topo.num_cpus == 1". Previously turbostat assumed that on a 1-CPU system, these columns should never appear. Honoring these flags makes it easier for several programs that parse turbostat output. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Fix trailing '\n' parsingZhang Rui
parse_cpu_string() parses the string input either from command line or from /sys/fs/cgroup/cpuset.cpus.effective to get a list of CPUs that turbostat can run with. The cpu string returned by /sys/fs/cgroup/cpuset.cpus.effective contains a trailing '\n', but strtoul() fails to treat this as an error. That says, for the code below val = ("\n", NULL, 10); val returns 0, and errno is also not set. As a result, CPU0 is erroneously considered as allowed CPU and this causes failures when turbostat tries to run on CPU0. get_counters: Could not migrate to CPU 0 ... turbostat: re-initialized with num_cpus 8, allowed_cpus 5 get_counters: Could not migrate to CPU 0 Add a check to return immediately if '\n' or '\0' is detected. Fixes: 8c3dd2c9e542 ("tools/power/turbostat: Abstrct function for parsing cpu string") Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Allow using cpu device in perf counters on hybrid ↵Patryk Wlazlyn
platforms Intel hybrid platforms expose different perf devices for P and E cores. Instead of one, "/sys/bus/event_source/devices/cpu" device, there are "/sys/bus/event_source/devices/{cpu_core,cpu_atom}". This, however makes it more complicated for the user, because most of the counters are available on both and had to be handled manually. This patch allows users to use "virtual" cpu device that is seemingly translated to cpu_core and cpu_atom perf devices, depending on the type of a CPU we are opening the counter for. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
2024-11-30tools/power turbostat: Fix column printing for PMT xtal_time countersPatryk Wlazlyn
If the very first printed column was for a PMT counter of type xtal_time we would misalign the column header, because we were always printing the delimiter. Signed-off-by: Patryk Wlazlyn <patryk.wlazlyn@linux.intel.com> Signed-off-by: Len Brown <len.brown@intel.com>