summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2025-05-02perf hist: Add struct he_mem_statNamhyung Kim
The 'struct he_mem_stat' is to save detailed information about memory instruction. It'll be used to show breakdown of various data from PERF_SAMPLE_DATA_SRC. Note that this structure is generic and the contents will be different depending on actual data it'll use later. The information about the actual data will be saved in 'struct hists' and its length is in nr_mem_stats. This commit just adds ground works and does nothing since hists->nr_mem_stats is 0 for now. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20250430205548.789750-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02perf hist: Support multi-line headerNamhyung Kim
This is a preparation to support multi-line headers in 'perf mem report'. Normal sort keys and output fields that don't have contents for multi- line will print the header string at the last line only. As we don't use multi-line headers normally, it should not have any changes in the output. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20250430205548.789750-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02perf record: Add --sample-mem-info optionNamhyung Kim
There's no way to enable PERF_SAMPLE_DATA_SRC without PERF_SAMPLE_ADDR which brings a lot of overhead due to the number of MMAP[2] records. Let's add a new option to enable this information separately. Committer testing: # perf record -a --sample-mem-info ^C[ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 1.815 MB perf.data (2637 samples) ] # # perf evlist -v cycles:P: type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER|DATA_SRC, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 2, sample_id_all: 1 dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER|DATA_SRC, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1 # # perf report -D |& grep -w PERF_RECORD_SAMPLE -A3 -m1 0 44675164447282 0x1a7590 [0x40]: PERF_RECORD_SAMPLE(IP, 0x4001): 107299/107299: 0xffffffffac4a5e11 period: 144 addr: 0 . data_src: 0x229080142 ... thread: perf:107299 ...... dso: /lib/modules/6.15.0-rc4+/build/vmlinux # Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20250430205548.789750-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02perf hist: Remove output field from sort-list properlyNamhyung Kim
When it removes an output format for cancelled children or latency, it should delete itself from the sort list as well. Otherwise assertion in fmt_free() will fire. $ perf report -H --stdio perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed. Aborted (core dumped) Also convert to perf_hpp__column_unregister() for the same open codes. Committer notes: Before this patch: # perf test hierarchy 83: perf report --hierarchy : FAILED! # perf test -v hierarchy --- start --- test child forked, pid 102242 perf report --hierarchy Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.025 MB /tmp/perf-test-report.HX0N85TlPq/perf-report-hierarchy-perf.data (6 samples) ] perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed. /home/acme/libexec/perf-core/tests/shell/perf-report-hierarchy.sh: line 34: 102250 Aborted (core dumped) perf report --hierarchy > /dev/null --- Cleaning up --- ---- end(-1) ---- 83: perf report --hierarchy : FAILED! # After: # perf test hierarchy 83: perf report --hierarchy : Ok # Fixes: dbd11b6bdab12f60 ("perf hist: Remove formats in hierarchy when cancel children") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250430180321.736939-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02perf test perf-report-hierarchy: Add new testArnaldo Carvalho de Melo
Super simple test to check that at least we're not segfaulting when trying to use 'perf report --hierarchy', more subtests should be added to make sure the output is the expected one. This is being merged right before a fix for that that this test detects: # perf test hierarchy 83: perf report --hierarchy : FAILED! # perf test -v hierarchy --- start --- test child forked, pid 102242 perf report --hierarchy Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.025 MB /tmp/perf-test-report.HX0N85TlPq/perf-report-hierarchy-perf.data (6 samples) ] perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed. /home/acme/libexec/perf-core/tests/shell/perf-report-hierarchy.sh: line 34: 102250 Aborted (core dumped) perf report --hierarchy > /dev/null --- Cleaning up --- ---- end(-1) ---- 83: perf report --hierarchy : FAILED! # Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/lkml/20250430180321.736939-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02Merge tag 'block-6.15-20250502' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - NVMe pull request via Christoph: - fix queue unquiesce check on PCI slot_reset (Keith Busch) - fix premature queue removal and I/O failover in nvme-tcp (Michael Liang) - don't restore null sk_state_change (Alistair Francis) - select CONFIG_TLS where needed (Alistair Francis) - always free derived key data (Hannes Reinecke) - more quirks (Wentao Guan) - ublk zero copy fix - ublk selftest fix for UBLK_F_NEED_GET_DATA * tag 'block-6.15-20250502' of git://git.kernel.dk/linux: nvmet-auth: always free derived key data nvmet-tcp: don't restore null sk_state_change nvmet-tcp: select CONFIG_TLS from CONFIG_NVME_TARGET_TCP_TLS nvme-tcp: select CONFIG_TLS from CONFIG_NVME_TCP_TLS nvme-tcp: fix premature queue removal and I/O failover nvme-pci: add quirks for WDC Blue SN550 15b7:5009 nvme-pci: add quirks for device 126f:1001 nvme-pci: fix queue unquiesce check on slot_reset ublk: remove the check of ublk_need_req_ref() from __ublk_check_and_get_req ublk: enhance check for register/unregister io buffer command ublk: decouple zero copy from user copy selftests: ublk: fix UBLK_F_NEED_GET_DATA
2025-05-01Merge tag 'net-6.15-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Happy May Day. Things have calmed down on our end (knock on wood), no outstanding investigations. Including fixes from Bluetooth and WiFi. Current release - fix to a fix: - igc: fix lock order in igc_ptp_reset Current release - new code bugs: - Revert "wifi: iwlwifi: make no_160 more generic", fixes regression to Killer line of devices reported by a number of people - Revert "wifi: iwlwifi: add support for BE213", initial FW is too buggy - number of fixes for mld, the new Intel WiFi subdriver Previous releases - regressions: - wifi: mac80211: restore monitor for outgoing frames - drv: vmxnet3: fix malformed packet sizing in vmxnet3_process_xdp - eth: bnxt_en: fix timestamping FIFO getting out of sync on reset, delivering stale timestamps - use sock_gen_put() in the TCP fraglist GRO heuristic, don't assume every socket is a full socket Previous releases - always broken: - sched: adapt qdiscs for reentrant enqueue cases, fix list corruptions - xsk: fix race condition in AF_XDP generic RX path, shared UMEM can't be protected by a per-socket lock - eth: mtk-star-emac: fix spinlock recursion issues on rx/tx poll - btusb: avoid NULL pointer dereference in skb_dequeue() - dsa: felix: fix broken taprio gate states after clock jump" * tag 'net-6.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (83 commits) net: vertexcom: mse102x: Fix RX error handling net: vertexcom: mse102x: Add range check for CMD_RTS net: vertexcom: mse102x: Fix LEN_MASK net: vertexcom: mse102x: Fix possible stuck of SPI interrupt net: hns3: defer calling ptp_clock_register() net: hns3: fixed debugfs tm_qset size net: hns3: fix an interrupt residual problem net: hns3: store rx VLAN tag offload state for VF octeon_ep: Fix host hang issue during device reboot net: fec: ERR007885 Workaround for conventional TX net: lan743x: Fix memleak issue when GSO enabled ptp: ocp: Fix NULL dereference in Adva board SMA sysfs operations net: use sock_gen_put() when sk_state is TCP_TIME_WAIT bnxt_en: fix module unload sequence bnxt_en: Fix ethtool -d byte order for 32-bit values bnxt_en: Fix out-of-bound memcpy() during ethtool -w bnxt_en: Fix coredump logic to free allocated buffer bnxt_en: delay pci_alloc_irq_vectors() in the AER path bnxt_en: call pci_alloc_irq_vectors() after bnxt_reserve_rings() bnxt_en: Add missing skb_mark_for_recycle() in bnxt_rx_vlan() ...
2025-05-01ASoC: stm32: sai: fix kernel rate configurationMark Brown
Merge series from Olivier Moysan <olivier.moysan@foss.st.com>: This patchset adds some checks on kernel minimum rate requirements. This avoids potential clock rate misconfiguration, when setting the kernel frequency on STM32MP2 SoCs.
2025-04-29perf test amd ibs: Add sample period unit testRavi Bangoria
IBS Fetch and IBS Op PMUs has various constraints on supported sample periods. Add perf unit tests to test those. Running it in parallel with other tests causes intermittent failures. Mark it exclusive to force it to run sequentially. Sample output on a Zen5 machine: Without kernel fixes: $ sudo ./perf test -vv 112 112: AMD IBS sample period: --- start --- test child forked, pid 8774 Using CPUID AuthenticAMD-26-2-1 IBS config tests: ----------------- Fetch PMU tests: 0xffff : Ok (nr samples: 1078) 0x1000 : Ok (nr samples: 17030) 0xff : Ok (nr samples: 41068) 0x1 : Ok (nr samples: 40543) 0x0 : Ok 0x10000 : Ok Op PMU tests: 0x0 : Ok 0x1 : Fail 0x8 : Fail 0x9 : Ok (nr samples: 40543) 0xf : Ok (nr samples: 40543) 0x1000 : Ok (nr samples: 18736) 0xffff : Ok (nr samples: 1168) 0x10000 : Ok 0x100000 : Fail (nr samples: 14) 0xf00000 : Fail (nr samples: 1) 0xf0ffff : Fail (nr samples: 1) 0x1f0ffff : Fail (nr samples: 1) 0x7f0ffff : Fail (nr samples: 0) 0x8f0ffff : Ok 0x17f0ffff : Ok IBS sample period constraint tests: ----------------------------------- Fetch PMU test: freq 0, sample_freq 0: Ok freq 0, sample_freq 1: Fail freq 0, sample_freq 15: Fail freq 0, sample_freq 16: Ok (nr samples: 1604) freq 0, sample_freq 17: Ok (nr samples: 1604) freq 0, sample_freq 143: Ok (nr samples: 1604) freq 0, sample_freq 144: Ok (nr samples: 1604) freq 0, sample_freq 145: Ok (nr samples: 1604) freq 0, sample_freq 1234: Ok (nr samples: 1566) freq 0, sample_freq 4103: Ok (nr samples: 1119) freq 0, sample_freq 65520: Ok (nr samples: 2264) freq 0, sample_freq 65535: Ok (nr samples: 2263) freq 0, sample_freq 65552: Ok (nr samples: 1166) freq 0, sample_freq 8388607: Ok (nr samples: 268) freq 0, sample_freq 268435455: Ok (nr samples: 8) freq 1, sample_freq 0: Ok freq 1, sample_freq 1: Ok (nr samples: 4) freq 1, sample_freq 15: Ok (nr samples: 4) freq 1, sample_freq 16: Ok (nr samples: 4) freq 1, sample_freq 17: Ok (nr samples: 4) freq 1, sample_freq 143: Ok (nr samples: 5) freq 1, sample_freq 144: Ok (nr samples: 5) freq 1, sample_freq 145: Ok (nr samples: 5) freq 1, sample_freq 1234: Ok (nr samples: 7) freq 1, sample_freq 4103: Ok (nr samples: 35) freq 1, sample_freq 65520: Ok (nr samples: 642) freq 1, sample_freq 65535: Ok (nr samples: 636) freq 1, sample_freq 65552: Ok (nr samples: 651) freq 1, sample_freq 8388607: Ok Op PMU test: freq 0, sample_freq 0: Ok freq 0, sample_freq 1: Fail freq 0, sample_freq 15: Fail freq 0, sample_freq 16: Fail freq 0, sample_freq 17: Fail freq 0, sample_freq 143: Fail freq 0, sample_freq 144: Ok (nr samples: 1604) freq 0, sample_freq 145: Ok (nr samples: 1604) freq 0, sample_freq 1234: Ok (nr samples: 1604) freq 0, sample_freq 4103: Ok (nr samples: 1604) freq 0, sample_freq 65520: Ok (nr samples: 2227) freq 0, sample_freq 65535: Ok (nr samples: 2296) freq 0, sample_freq 65552: Ok (nr samples: 2213) freq 0, sample_freq 8388607: Ok (nr samples: 250) freq 0, sample_freq 268435455: Ok (nr samples: 8) freq 1, sample_freq 0: Ok freq 1, sample_freq 1: Fail (nr samples: 4) freq 1, sample_freq 15: Fail (nr samples: 4) freq 1, sample_freq 16: Fail (nr samples: 4) freq 1, sample_freq 17: Fail (nr samples: 4) freq 1, sample_freq 143: Fail (nr samples: 5) freq 1, sample_freq 144: Fail (nr samples: 5) freq 1, sample_freq 145: Fail (nr samples: 5) freq 1, sample_freq 1234: Fail (nr samples: 8) freq 1, sample_freq 4103: Fail (nr samples: 33) freq 1, sample_freq 65520: Fail (nr samples: 546) freq 1, sample_freq 65535: Fail (nr samples: 544) freq 1, sample_freq 65552: Fail (nr samples: 555) freq 1, sample_freq 8388607: Ok IBS ioctl() tests: ------------------ Fetch PMU tests ioctl(period = 0x0 ): Ok ioctl(period = 0x1 ): Fail ioctl(period = 0xf ): Fail ioctl(period = 0x10 ): Ok ioctl(period = 0x11 ): Fail ioctl(period = 0x1f ): Fail ioctl(period = 0x20 ): Ok ioctl(period = 0x80 ): Ok ioctl(period = 0x8f ): Fail ioctl(period = 0x90 ): Ok ioctl(period = 0x91 ): Fail ioctl(period = 0x100 ): Ok ioctl(period = 0xfff0 ): Ok ioctl(period = 0xffff ): Fail ioctl(period = 0x10000 ): Ok ioctl(period = 0x1fff0 ): Ok ioctl(period = 0x1fff5 ): Fail ioctl(freq = 0x0 ): Ok ioctl(freq = 0x1 ): Ok ioctl(freq = 0xf ): Ok ioctl(freq = 0x10 ): Ok ioctl(freq = 0x11 ): Ok ioctl(freq = 0x1f ): Ok ioctl(freq = 0x20 ): Ok ioctl(freq = 0x80 ): Ok ioctl(freq = 0x8f ): Ok ioctl(freq = 0x90 ): Ok ioctl(freq = 0x91 ): Ok ioctl(freq = 0x100 ): Ok Op PMU tests ioctl(period = 0x0 ): Ok ioctl(period = 0x1 ): Fail ioctl(period = 0xf ): Fail ioctl(period = 0x10 ): Fail ioctl(period = 0x11 ): Fail ioctl(period = 0x1f ): Fail ioctl(period = 0x20 ): Fail ioctl(period = 0x80 ): Fail ioctl(period = 0x8f ): Fail ioctl(period = 0x90 ): Ok ioctl(period = 0x91 ): Fail ioctl(period = 0x100 ): Ok ioctl(period = 0xfff0 ): Ok ioctl(period = 0xffff ): Fail ioctl(period = 0x10000 ): Ok ioctl(period = 0x1fff0 ): Ok ioctl(period = 0x1fff5 ): Fail ioctl(freq = 0x0 ): Ok ioctl(freq = 0x1 ): Ok ioctl(freq = 0xf ): Ok ioctl(freq = 0x10 ): Ok ioctl(freq = 0x11 ): Ok ioctl(freq = 0x1f ): Ok ioctl(freq = 0x20 ): Ok ioctl(freq = 0x80 ): Ok ioctl(freq = 0x8f ): Ok ioctl(freq = 0x90 ): Ok ioctl(freq = 0x91 ): Ok ioctl(freq = 0x100 ): Ok IBS freq (negative) tests: -------------------------- freq 1, sample_freq 200000: Fail IBS L3MissOnly test: (takes a while) -------------------- Fetch L3MissOnly: Fail (nr_samples: 1213) Op L3MissOnly: Ok (nr_samples: 1193) ---- end(-1) ---- 112: AMD IBS sample period : FAILED! With kernel fixes: $ sudo ./perf test -vv 112 112: AMD IBS sample period: --- start --- test child forked, pid 6939 Using CPUID AuthenticAMD-26-2-1 IBS config tests: ----------------- Fetch PMU tests: 0xffff : Ok (nr samples: 969) 0x1000 : Ok (nr samples: 15540) 0xff : Ok (nr samples: 40555) 0x1 : Ok (nr samples: 40543) 0x0 : Ok 0x10000 : Ok Op PMU tests: 0x0 : Ok 0x1 : Ok 0x8 : Ok 0x9 : Ok (nr samples: 40543) 0xf : Ok (nr samples: 40543) 0x1000 : Ok (nr samples: 19156) 0xffff : Ok (nr samples: 1169) 0x10000 : Ok 0x100000 : Ok (nr samples: 1151) 0xf00000 : Ok (nr samples: 76) 0xf0ffff : Ok (nr samples: 73) 0x1f0ffff : Ok (nr samples: 33) 0x7f0ffff : Ok (nr samples: 10) 0x8f0ffff : Ok 0x17f0ffff : Ok IBS sample period constraint tests: ----------------------------------- Fetch PMU test: freq 0, sample_freq 0: Ok freq 0, sample_freq 1: Ok freq 0, sample_freq 15: Ok freq 0, sample_freq 16: Ok (nr samples: 1203) freq 0, sample_freq 17: Ok (nr samples: 1604) freq 0, sample_freq 143: Ok (nr samples: 1604) freq 0, sample_freq 144: Ok (nr samples: 1604) freq 0, sample_freq 145: Ok (nr samples: 1604) freq 0, sample_freq 1234: Ok (nr samples: 1604) freq 0, sample_freq 4103: Ok (nr samples: 1343) freq 0, sample_freq 65520: Ok (nr samples: 2254) freq 0, sample_freq 65535: Ok (nr samples: 2136) freq 0, sample_freq 65552: Ok (nr samples: 1158) freq 0, sample_freq 8388607: Ok (nr samples: 257) freq 0, sample_freq 268435455: Ok (nr samples: 8) freq 1, sample_freq 0: Ok freq 1, sample_freq 1: Ok (nr samples: 4) freq 1, sample_freq 15: Ok (nr samples: 4) freq 1, sample_freq 16: Ok (nr samples: 4) freq 1, sample_freq 17: Ok (nr samples: 4) freq 1, sample_freq 143: Ok (nr samples: 5) freq 1, sample_freq 144: Ok (nr samples: 5) freq 1, sample_freq 145: Ok (nr samples: 5) freq 1, sample_freq 1234: Ok (nr samples: 8) freq 1, sample_freq 4103: Ok (nr samples: 34) freq 1, sample_freq 65520: Ok (nr samples: 458) freq 1, sample_freq 65535: Ok (nr samples: 628) freq 1, sample_freq 65552: Ok (nr samples: 396) freq 1, sample_freq 8388607: Ok Op PMU test: freq 0, sample_freq 0: Ok freq 0, sample_freq 1: Ok freq 0, sample_freq 15: Ok freq 0, sample_freq 16: Ok freq 0, sample_freq 17: Ok freq 0, sample_freq 143: Ok freq 0, sample_freq 144: Ok (nr samples: 1604) freq 0, sample_freq 145: Ok (nr samples: 1604) freq 0, sample_freq 1234: Ok (nr samples: 1604) freq 0, sample_freq 4103: Ok (nr samples: 1604) freq 0, sample_freq 65520: Ok (nr samples: 2250) freq 0, sample_freq 65535: Ok (nr samples: 2158) freq 0, sample_freq 65552: Ok (nr samples: 2296) freq 0, sample_freq 8388607: Ok (nr samples: 243) freq 0, sample_freq 268435455: Ok (nr samples: 6) freq 1, sample_freq 0: Ok freq 1, sample_freq 1: Ok (nr samples: 4) freq 1, sample_freq 15: Ok (nr samples: 4) freq 1, sample_freq 16: Ok (nr samples: 4) freq 1, sample_freq 17: Ok (nr samples: 4) freq 1, sample_freq 143: Ok (nr samples: 4) freq 1, sample_freq 144: Ok (nr samples: 5) freq 1, sample_freq 145: Ok (nr samples: 4) freq 1, sample_freq 1234: Ok (nr samples: 6) freq 1, sample_freq 4103: Ok (nr samples: 27) freq 1, sample_freq 65520: Ok (nr samples: 542) freq 1, sample_freq 65535: Ok (nr samples: 550) freq 1, sample_freq 65552: Ok (nr samples: 552) freq 1, sample_freq 8388607: Ok IBS ioctl() tests: ------------------ Fetch PMU tests ioctl(period = 0x0 ): Ok ioctl(period = 0x1 ): Ok ioctl(period = 0xf ): Ok ioctl(period = 0x10 ): Ok ioctl(period = 0x11 ): Ok ioctl(period = 0x1f ): Ok ioctl(period = 0x20 ): Ok ioctl(period = 0x80 ): Ok ioctl(period = 0x8f ): Ok ioctl(period = 0x90 ): Ok ioctl(period = 0x91 ): Ok ioctl(period = 0x100 ): Ok ioctl(period = 0xfff0 ): Ok ioctl(period = 0xffff ): Ok ioctl(period = 0x10000 ): Ok ioctl(period = 0x1fff0 ): Ok ioctl(period = 0x1fff5 ): Ok ioctl(freq = 0x0 ): Ok ioctl(freq = 0x1 ): Ok ioctl(freq = 0xf ): Ok ioctl(freq = 0x10 ): Ok ioctl(freq = 0x11 ): Ok ioctl(freq = 0x1f ): Ok ioctl(freq = 0x20 ): Ok ioctl(freq = 0x80 ): Ok ioctl(freq = 0x8f ): Ok ioctl(freq = 0x90 ): Ok ioctl(freq = 0x91 ): Ok ioctl(freq = 0x100 ): Ok Op PMU tests ioctl(period = 0x0 ): Ok ioctl(period = 0x1 ): Ok ioctl(period = 0xf ): Ok ioctl(period = 0x10 ): Ok ioctl(period = 0x11 ): Ok ioctl(period = 0x1f ): Ok ioctl(period = 0x20 ): Ok ioctl(period = 0x80 ): Ok ioctl(period = 0x8f ): Ok ioctl(period = 0x90 ): Ok ioctl(period = 0x91 ): Ok ioctl(period = 0x100 ): Ok ioctl(period = 0xfff0 ): Ok ioctl(period = 0xffff ): Ok ioctl(period = 0x10000 ): Ok ioctl(period = 0x1fff0 ): Ok ioctl(period = 0x1fff5 ): Ok ioctl(freq = 0x0 ): Ok ioctl(freq = 0x1 ): Ok ioctl(freq = 0xf ): Ok ioctl(freq = 0x10 ): Ok ioctl(freq = 0x11 ): Ok ioctl(freq = 0x1f ): Ok ioctl(freq = 0x20 ): Ok ioctl(freq = 0x80 ): Ok ioctl(freq = 0x8f ): Ok ioctl(freq = 0x90 ): Ok ioctl(freq = 0x91 ): Ok ioctl(freq = 0x100 ): Ok IBS freq (negative) tests: -------------------------- freq 1, sample_freq 200000: Ok IBS L3MissOnly test: (takes a while) -------------------- Fetch L3MissOnly: Ok (nr_samples: 1301) Op L3MissOnly: Ok (nr_samples: 1590) ---- end(0) ---- 112: AMD IBS sample period : Ok Committer notes: Avoid using PAGE_SIZE as that define is also in sys/user.h Make it a variable not to call sysconf() multiple times. Also cast func to void * when passing it as the first arg to memcpy to avoid this with some versions of clang: arch/x86/tests/amd-ibs-period.c:81:3: error: no matching function for call to 'memcpy' memcpy(func, insn1, sizeof(insn1)); ^~~~~~ /usr/include/string.h:27:7: note: candidate function not viable: no known conversion from 'int (*)(void)' to 'void *' for 1st argument void *memcpy (void *__restrict, const void *__restrict, size_t); ^ /usr/include/fortify/string.h:40:27: note: candidate function not viable: no known conversion from 'int (*)(void)' to 'void *const' for 1st argument _FORTIFY_FN(memcpy) void *memcpy(void * _FORTIFY_POS0 __od, ^ arch/x86/tests/amd-ibs-period.c:87:3: error: no matching function for call to 'memcpy' This one, for instance: Alpine clang version 19.1.4 Target: x86_64-alpine-linux-musl Thread model: posix InstalledDir: /usr/lib/llvm19/bin Configuration file: /etc/clang19/x86_64-alpine-linux-musl.cfg System configuration file directory: /etc/clang19 Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Mario <jmario@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20250429035938.1301-5-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29perf mem/c2c amd: Add ldlat supportRavi Bangoria
'perf mem/c2c' uses IBS Op PMU on AMD platforms. IBS Op PMU on Zen5 uarch has added support for Load Latency filtering. Implement 'perf mem/c2c' --ldlat using IBS Op Load Latency filtering capability. Some subtle differences between AMD and other arch: o --ldlat is disabled by default on AMD o Supported values are 128 to 2048. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Mario <jmario@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20250429035938.1301-4-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29perf amd ibs: Incorporate Zen5 DTLB and PageSize informationRavi Bangoria
IBS Op PMU on Zen5 reports DTLB and page size information differently compared to prior generation. IBS_OP_DATA3 Zen3/4 Zen5 ---------------------------------------------------------------- 19 IbsDcL2TlbHit1G Reserved ---------------------------------------------------------------- 6 IbsDcL2tlbHit2M Reserved ---------------------------------------------------------------- 5 IbsDcL1TlbHit1G PageSize: 4 IbsDcL1TlbHit2M 0 - 4K 1 - 2M 2 - 1G 3 - Reserved Valid only if IbsDcPhyAddrValid = 1 ---------------------------------------------------------------- 3 IbsDcL2TlbMiss IbsDcL2TlbMiss Valid only if IbsDcPhyAddrValid = 1 ---------------------------------------------------------------- 2 IbsDcL1tlbMiss IbsDcL1tlbMiss Valid only if IbsDcPhyAddrValid = 1 ---------------------------------------------------------------- Kernel expose this change as "dtlb_pgsize" capability in PMU sysfs. Change IBS register raw-dump logic according to new bit definitions. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Mario <jmario@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20250429035938.1301-3-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29perf amd ibs: Add Load Latency bits in raw dumpRavi Bangoria
IBS OP PMU on Zen5 supports Load Latency filtering. Decode and dump Load Latency filtering related bits into perf script raw dump. Also add oneliner example in the perf-amd-ibs man page. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Mario <jmario@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20250429035938.1301-2-ravi.bangoria@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29perf symbols: Handle 'u' and 'l' symbols in /proc/kallsymsArnaldo Carvalho de Melo
I started seeing this in recent Fedora 42 kernels: # uname -a Linux number 6.14.3-300.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Apr 20 16:08:39 UTC 2025 x86_64 GNU/Linux # # perf test vmlinux 1: vmlinux symtab matches kallsyms : FAILED! # Where we have Rust enabled: # grep CONFIG_RUST /boot/config-6.14.3-300.fc42.x86_64 CONFIG_RUSTC_VERSION=108600 CONFIG_RUST_IS_AVAILABLE=y CONFIG_RUSTC_LLVM_VERSION=200101 CONFIG_RUSTC_HAS_COERCE_POINTEE=y CONFIG_RUST=y CONFIG_RUSTC_VERSION_TEXT="rustc 1.86.0 (05f9846f8 2025-03-31) (Fedora 1.86.0-1.fc42)" CONFIG_RUST_FW_LOADER_ABSTRACTIONS=y CONFIG_RUST_PHYLIB_ABSTRACTIONS=y # CONFIG_RUST_DEBUG_ASSERTIONS is not set CONFIG_RUST_OVERFLOW_CHECKS=y # CONFIG_RUST_BUILD_ASSERT_ALLOW is not set # Looking at the reason for the failure: # perf test -v vmlinux |& grep ^ERR ERR : 0xffffffff99efc7d0: __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ not on kallsyms ERR : 0xffffffff99efc7e0: _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ not on kallsyms # But: # grep -w u /proc/kallsyms ffffffff99efc7d0 u __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ ffffffff99efc7e0 u _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ # The test checks that "vmlinux symtab matches kallsyms", so it finds those two symbols in vmlinux: # pahole --running_kernel_vmlinux /usr/lib/debug/lib/modules/6.14.3-300.fc42.x86_64/vmlinux # # readelf -sW /usr/lib/debug/lib/modules/6.14.3-300.fc42.x86_64/vmlinux | grep -Ew '(__pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_|_RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_)' 81844: ffffffff81efc7e0 524 FUNC LOCAL DEFAULT 1 _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ 144259: ffffffff81efc7d0 16 FUNC LOCAL DEFAULT 1 __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ # It is there. From the nm documentation we can see that: "U" The symbol is undefined. "u" The symbol is a unique global symbol. This is a GNU extension to the standard set of ELF symbol bindings. For such a symbol the dynamic linker will make sure that in the entire process there is just one symbol with this name and type in use. So lets consider 'u' symbols in /proc/kallsyms when loading it to cover this case. Fedora:40 shows this as a 'l' symbol, so consider that as well. With this patch 'perf test 1' is happy again: # perf test vmlinux 1: vmlinux symtab matches kallsyms : Ok # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/aBE_n0PGl3g6h-cS@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29selftests: net: tc_taprio: new testVladimir Oltean
Add a forwarding path test for tc-taprio, based on isochron. This is specifically intended for NICs with an offloaded data path (switchdev/DSA) and requires taprio 'flags 2'. Also, $h1 and $h2 must support hardware timestamping, and $h1 tc-etf offload, for isochron to work. Packets received by a switch while the egress port has a taprio schedule with an open gate for the traffic class must be sent right away. Packets received by the switch while the traffic class gate must be delayed until it opens. Packets received by the switch must be dropped if the gate for the traffic class never opens. Packets should pass if the maximum SDU for the traffic class allows it, and should be dropped otherwise. The schedule should auto-update itself if clock jumps take place while taprio is installed. Repeat most of the above tests after forcing two clock jumps, one backwards (in Jan 1970) and one back into the present. Symlink it from tools/testing/selftests/drivers/net/dsa, because usually DSA ports have the same MAC address, and we need STABLE_MAC_ADDRS=yes from its forwarding.config for the test to run successfully. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250426144859.3128352-5-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-29selftests: net: tsn_lib: add window_size argument to isochron_do()Vladimir Oltean
Make out-of-band testing (send a packet when its traffic class gate is closed, expecting it to be delayed) more predictable by allowing the window size to be customized by isochron_do(). From man isochron-send, the window size alters the advance time (the delta between the transmission time of the packet, and its expected TX time when using SO_TXTIME or tc-taprio on the sender). In absence of the argument, isochron-send defaults to maximizing the advance time (making it equal to the cycle length). The default behavior is exactly what is problematic. An advance time that is too large will make packets intended to be out-of-band still be potentially in-band with an open gate from the schedule's previous cycle. We need to allow that advance time to be reduced. Perhaps a bit confusingly, isochron_do() has a shift_time argument currently, but that does not help here. The shift time shifts both the user space wakeup time and the expected TX time by equal amounts, it is unable of bringing them closer to one another. Set the window size properly for the Ocelot PSFP selftest as well. That used to work due to a very carefully chosen SHIFT_TIME_NS. I've re-tested that the test still works properly. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250426144859.3128352-4-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-29selftests: net: tsn_lib: create common helper for counting received packetsVladimir Oltean
This snippet will be necessary for a future isochron-based test, so provide a simpler high-level interface for counting the received packets. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20250426144859.3128352-3-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-29perf tools: Fix in-source libperf buildJames Clark
When libperf is built alone in-source, $(OUTPUT) isn't set. This causes the generated uapi path to resolve to '/../arch' which results in a permissions error: mkdir: cannot create directory '/../arch': Permission denied Fix it by removing the preceding '/..' which means that it gets generated either in the tools/lib/perf part of the tree or the OUTPUT folder. Some other rules that rely on OUTPUT further refine this conditionally depending on whether it's an in-source or out-of-source build, but I don't think we need the extra complexity here. And this rule is slightly different to others because the header is needed by both libperf and Perf. This is further complicated by the fact that Perf always passes O=... to libperf even for in source builds, meaning that OUTPUT isn't set consistently between projects. Because we're no longer going one level up to try to generate the file in the tools/ folder, Perf's include rule needs to descend into libperf. Also fix the clean rule while we're here. Reported-by: Thorsten Leemhuis <linux@leemhuis.info> Closes: https://lore.kernel.org/linux-perf-users/7703f88e-ccb7-4c98-9da4-8aad224e780f@leemhuis.info/ Fixes: bfb713ea53c7 ("perf tools: Fix arm64 build by generating unistd_64.h") Signed-off-by: James Clark <james.clark@linaro.org> Tested-by: Thorsten Leemhuis <linux@leemhuis.info> Link: https://lore.kernel.org/r/20250429-james-perf-fix-libperf-in-source-build-v1-1-a1a827ac15e5@linaro.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-04-29Merge tag 'fsnotify_for_v6.15-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify fix from Jan Kara: "A fix for the recently merged mount notification support" * tag 'fsnotify_for_v6.15-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: selftests/fs/mount-notify: test also remove/flush of mntns marks fanotify: fix flush of mntns marks
2025-04-29Merge tag 'fixes-2025-04-29' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock fixes from Mike Rapoport: "Fixes for nid setting in memmap_init_reserved_pages(): - pass 'size' rather than 'end' to memblock_set_node() as that function expects - fix a corner case when memblock.reserved is doubled at memmap_init_reserved_pages() and the newly reserved block won't have nid assigned" * tag 'fixes-2025-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock tests: add test for memblock_set_node mm/memblock: repeat setting reserved region nid if array is doubled mm/memblock: pass size instead of end to memblock_set_node()
2025-04-29perf test probe_vfs_getname: Skip if no suitable line detectedJakub Brnak
In some cases when calling function add_probe_vfs_getname, line number can't be detected by 'perf probe -L getname_flags': 78 atomic_set(&result->refcnt, 1); // one of the following lines should have line number // but sometimes it does not because of optimization result->uptr = filename; result->aname = NULL; 81 audit_getname(result); To prevent false failures, skip the affected tests if no suitable line numbers can be detected. Signed-off-by: Jakub Brnak <jbrnak@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tomas Glozar <tglozar@redhat.com> Link: https://lore.kernel.org/r/20250324144523.597557-1-jbrnak@redhat.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29perf lock contention: Symbolize zone->lock using BTFNamhyung Kim
The struct zone is embedded in struct pglist_data which can be allocated for each NUMA node early in the boot process. As it's not a slab object nor a global lock, this was not symbolized. Since the zone->lock is often contended, it'd be nice if we can symbolize it. On NUMA systems, node_data array will have pointers for struct pglist_data. By following the pointer, it can calculate the address of each zone and its lock using BTF. On UMA, it can just use contig_page_data and its zones. The following example shows the zone lock contention at the end. $ sudo ./perf lock con -abl -E 5 -- ./perf bench sched messaging # Running 'sched/messaging' benchmark: # 20 sender and receiver processes per group # 10 groups == 400 processes run Total time: 0.038 [sec] contended total wait max wait avg wait address symbol 5167 18.17 ms 10.27 us 3.52 us ffff953340052d00 &kmem_cache_node (spinlock) 38 11.75 ms 465.49 us 309.13 us ffff95334060c480 &sock_inode_cache (spinlock) 3916 10.13 ms 10.43 us 2.59 us ffff953342aecb40 &kmem_cache_node (spinlock) 2963 10.02 ms 13.75 us 3.38 us ffff9533d2344098 &kmalloc-rnd-08-2k (spinlock) 216 5.05 ms 99.49 us 23.39 us ffff9542bf7d65d0 zone_lock (spinlock) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephane Eranian <eranian@google.com> Cc: bpf@vger.kernel.org Cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/20250401063055.7431-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29selftests: ublk: fix UBLK_F_NEED_GET_DATAMing Lei
Commit 57e13a2e8cd2 ("selftests: ublk: support user recovery") starts to support UBLK_F_NEED_GET_DATA for covering recovery feature, however the ublk utility implementation isn't done correctly. Fix it by supporting UBLK_F_NEED_GET_DATA correctly. Also add test generic_07 for covering UBLK_F_NEED_GET_DATA. Reviewed-by: Caleb Sander Mateos <csander@purestorage.com> Fixes: 57e13a2e8cd2 ("selftests: ublk: support user recovery") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250429022941.1718671-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-04-28selftests: tc-testing: Add TDC tests that exercise reentrant enqueue behaviourVictor Nogueira
Add 5 TDC tests that exercise the reentrant enqueue behaviour in drr, ets, qfq, and hfsc: - Test DRR's enqueue reentrant behaviour with netem (which caused a double list add) - Test ETS's enqueue reentrant behaviour with netem (which caused a double list add) - Test QFQ's enqueue reentrant behaviour with netem (which caused a double list add) - Test HFSC's enqueue reentrant behaviour with netem (which caused a UAF) - Test nested DRR's enqueue reentrant behaviour with netem (which caused a double list add) Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20250425220710.3964791-6-victor@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-28perf test: Add perf trace summary testNamhyung Kim
$ sudo ./perf test -vv 'trace summary' 109: perf trace summary: --- start --- test child forked, pid 3501572 testing: perf trace -s -- true testing: perf trace -S -- true testing: perf trace -s --summary-mode=thread -- true testing: perf trace -S --summary-mode=total -- true testing: perf trace -as --summary-mode=thread --no-bpf-summary -- true testing: perf trace -as --summary-mode=total --no-bpf-summary -- true testing: perf trace -as --summary-mode=thread --bpf-summary -- true testing: perf trace -as --summary-mode=total --bpf-summary -- true testing: perf trace -aS --summary-mode=total --bpf-summary -- true ---- end(0) ---- 109: perf trace summary : Ok Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20250326044001.3503432-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-28perf trace: Implement syscall summary in BPFNamhyung Kim
When -s/--summary option is used, it doesn't need (augmented) arguments of syscalls. Let's skip the augmentation and load another small BPF program to collect the statistics in the kernel instead of copying the data to the ring-buffer to calculate the stats in userspace. This will be much more light-weight than the existing approach and remove any lost events. Let's add a new option --bpf-summary to control this behavior. I cannot make it default because there's no way to get e_machine in the BPF which is needed for detecting different ABIs like 32-bit compat mode. No functional changes intended except for no more LOST events. :) $ sudo ./perf trace -as --summary-mode=total --bpf-summary sleep 1 Summary of events: total, 6194 events syscall calls errors total min avg max stddev (msec) (msec) (msec) (msec) (%) --------------- -------- ------ -------- --------- --------- --------- ------ epoll_wait 561 0 4530.843 0.000 8.076 520.941 18.75% futex 693 45 4317.231 0.000 6.230 500.077 21.98% poll 300 0 1040.109 0.000 3.467 120.928 17.02% clock_nanosleep 1 0 1000.172 1000.172 1000.172 1000.172 0.00% ppoll 360 0 872.386 0.001 2.423 253.275 41.91% epoll_pwait 14 0 384.349 0.001 27.453 380.002 98.79% pselect6 14 0 108.130 7.198 7.724 8.206 0.85% nanosleep 39 0 43.378 0.069 1.112 10.084 44.23% ... Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250326044001.3503432-1-namhyung@kernel.org [ Added fixup sent from Namhyung in response to my report to make it also dependent on CONFIG_TRACE ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-28Merge tag 'hyperv-fixes-signed-20250427' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv fixes from Wei Liu: - Bug fixes for the Hyper-V driver and kvp_daemon * tag 'hyperv-fixes-signed-20250427' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: Drivers: hv: Fix bad ref to hv_synic_eventring_tail when CPU goes offline tools/hv: update route parsing in kvp daemon Drivers: hv: Fix bad pointer dereference in hv_get_partition_id
2025-04-26Merge tag 'pci-v6.15-fixes-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI fixes from Bjorn Helgaas: - When releasing a start-aligned resource, e.g., a bridge window, save start/end/flags for the next assignment attempt; fixes a v6.15-rc1 regression (Ilpo Järvinen) - Move set_pcie_speed.sh from TEST_PROGS to TEST_FILE; fixes a bwctrl selftest v6.15-rc1 regression (Ilpo Järvinen) - Add Manivannan Sadhasivam as maintainer of native host bridge and endpoint drivers (Manivannan Sadhasivam) - In endpoint test driver, defer IRQ allocation from .probe() until ioctl() to fix a regression on platforms where the Vendor/Device ID match doesn't include driver_data (Niklas Cassel) * tag 'pci-v6.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: misc: pci_endpoint_test: Defer IRQ allocation until ioctl(PCITEST_SET_IRQTYPE) MAINTAINERS: Move Manivannan Sadhasivam as PCI Native host bridge and endpoint maintainer selftests/pcie_bwctrl: Fix test progs list PCI: Restore assigned resources fully after release
2025-04-26Merge tag 'x86-urgent-2025-04-26' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull misc x86 fixes from Ingo Molnar: - Fix 32-bit kernel boot crash if passed physical memory with more than 32 address bits - Fix Xen PV crash - Work around build bug in certain limited build environments - Fix CTEST instruction decoding in insn_decoder_test * tag 'x86-urgent-2025-04-26' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/insn: Fix CTEST instruction decoding x86/boot: Work around broken busybox 'truncate' tool x86/mm: Fix _pgd_alloc() for Xen PV mode x86/e820: Discard high memory that can't be addressed by 32-bit systems
2025-04-26Merge tag 'move-lib-kunit-v6.15-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kunit fix from Kees Cook: "A single fix for the kunit lib/tests/ relocation: - Ensure prime numbers tests are included in KUnit test runs (Mark Brown)" * tag 'move-lib-kunit-v6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: lib: Ensure prime numbers tests are included in KUnit test runs
2025-04-25selftests: net: bridge_vlan_aware: test untagged/8021p-tagged with and ↵Vladimir Oltean
without PVID Recent discussions around commit ad1afb003939 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") have sparked the question what happens with the DSA (and possibly other switchdev) data path when the bridge says that ports should have no PVID VLAN, but the 8021q module, as the result of a NETDEV_UP event, decides it should add VID 0 to the RX filter of those bridge ports. Do those bridge ports receive packets tagged with VID 0 or not, now? We don't know, there is no test. In the veth realm, this passes trivially, because veth is not VLAN filtering and this, the 8021q module lacks the instinct to add VID 0 in the first place. In the realm of VLAN filtering NICs with no switchdev offload, this should also pass, because the VLAN groups of the software bridge are consulted, where it can clearly be seen that a PVID is missing, even though the packet was initially accepted by the NIC. The test only poses a challenge for switchdev drivers, which usually have to program to hardware both VLANs from RX filtering, as well as from switchdev. Especially when a switchdev port joins a VLAN-aware bridge, it is unavoidable that it gains the NETIF_F_HW_VLAN_CTAG_FILTER feature, i.e. any 8021q uppers that the bridge port may have must also be committed to the RX filtering table of the interface. When a VLAN-tagged packet is physically received by the port, it is initially indistinguishable whether it will reach the bridge data path or the 8021q upper data path. That is rather the final step of the new tests that we introduce. We need to build context up to that stage, which means the following: - we need to test that 802.1p (VID 0) tagged traffic is received in the first place (on bridge ports with a valid PVID). This is the "8021p" test. - we need to test that the usual paths of reaching a configuration with no PVID on a bridge port are all covered and they all reach the same state. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20250424223734.3096202-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-25Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfLinus Torvalds
Pull bpf fixes from Alexei Starovoitov: - Add namespace to BPF internal symbols (Alexei Starovoitov) - Fix possible endless loop in BPF map iteration (Brandon Kammerdiener) - Fix compilation failure for samples/bpf on LoongArch (Haoran Jiang) - Disable a part of sockmap_ktls test (Ihor Solodrai) - Correct typo in __clang_major__ macro (Peilin Ye) * tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: selftests/bpf: Correct typo in __clang_major__ macro samples/bpf: Fix compilation failure for samples/bpf on LoongArch Fedora bpf: Add namespace to BPF internal symbols selftests/bpf: add test for softlock when modifying hashmap while iterating bpf: fix possible endless loop in BPF map iteration selftests/bpf: Mitigate sockmap_ktls disconnect_after_delete failure
2025-04-25selftests/bpf: Correct typo in __clang_major__ macroPeilin Ye
Make sure that CAN_USE_BPF_ST test (compute_live_registers/store) is enabled when __clang_major__ >= 18. Fixes: 2ea8f6a1cda7 ("selftests/bpf: test cases for compute_live_registers()") Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/20250425213712.1542077-1-yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-25Merge tag 'cxl-fixes-6.15-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dave Jiang: "The fixes address global persistent flush (GPF) changes and CXL Features support changes that went in the 6.15 merge window. And also a fix to an issue observed on CXL 1.1 platform during device enumeration. Summary: - Fix using the wrong GPF DVSEC location: - Fix caching of dport GPF DVSEC from the first endpoint - Ensure that the GPF phase timeout is only updated once by first endpoint - Drop is_port parameter for cxl_gpf_get_dvsec() - Fix the devm_* call host device for CXL fwctl setup - Set the out_len in Set Features failure case - Fix RCD initialization by skipping unneeded mem_en check" * tag 'cxl-fixes-6.15-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/core/regs.c: Skip Memory Space Enable check for RCD and RCH Ports cxl/feature: Update out_len in set feature failure case cxl: Fix devm host device for CXL fwctl initialization cxl/pci: Drop the parameter is_port of cxl_gpf_get_dvsec() cxl/pci: Update Port GPF timeout only when the first EP attaching cxl/core: Fix caching dport GPF DVSEC issue
2025-04-25Merge tag 'block-6.15-20250424' of git://git.kernel.dk/linuxLinus Torvalds
Pull block fixes from Jens Axboe: - Fix autoloading of drivers from stat*(2) - Fix losing read-ahead setting one suspend/resume, when a device is re-probed. - Fix race between setting the block size and page cache updates. Includes a helper that a coming XFS fix will use as well. - ublk cancelation fixes. - ublk selftest additions and fixes. - NVMe pull via Christoph: - fix an out-of-bounds access in nvmet_enable_port (Richard Weinberger) * tag 'block-6.15-20250424' of git://git.kernel.dk/linux: ublk: fix race between io_uring_cmd_complete_in_task and ublk_cancel_cmd ublk: call ublk_dispatch_req() for handling UBLK_U_IO_NEED_GET_DATA block: don't autoload drivers on blk-cgroup configuration block: don't autoload drivers on stat block: remove the backing_inode variable in bdev_statx block: move blkdev_{get,put} _no_open prototypes out of blkdev.h block: never reduce ra_pages in blk_apply_bdi_limits selftests: ublk: common: fix _get_disk_dev_t for pre-9.0 coreutils selftests: ublk: remove useless 'delay_us' from 'struct dev_ctx' selftests: ublk: fix recover test block: hoist block size validation code to a separate function block: fix race between set_blocksize and read paths nvmet: fix out-of-bounds access in nvmet_enable_port
2025-04-25selftests/bpf: add test for softlock when modifying hashmap while iteratingBrandon Kammerdiener
Add test that modifies the map while it's being iterated in such a way that hangs the kernel thread unless the _safe fix is applied to bpf_for_each_hash_elem. Signed-off-by: Brandon Kammerdiener <brandon.kammerdiener@intel.com> Link: https://lore.kernel.org/r/20250424153246.141677-3-brandon.kammerdiener@intel.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Hou Tao <houtao1@huawei.com>
2025-04-25perf vendor events arm64: Drop hip08 PublicDescription if same as ↵Junhao He
BriefDescription If BriefDescription and PublicDescription are the same, only BriefDescription is needed. It will be used for both long and short format outputs. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250418070812.3771441-3-hejunhao3@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf vendor events arm64: Fill up Desc field for Hisi hip08 hha pmuJunhao He
In the same PMU, when some JSON events have the "BriefDescription" field populated while others do not, the cmp_sevent() function will split these two types of events into separate groups. As a result, when using perf list to display events, the two types of events cannot be grouped together in the output. before patch: $ perf list pmu ... uncore hha: hisi_sccl1_hha2/sdir-hit/ hisi_sccl1_hha2/sdir-lookup/ ... uncore hha: edir-hit [Count of The number of HHA E-Dir hit operations. Unit: hisi_sccl1_hha2] ... after patch: $ perf list pmu ... uncore hha: edir-hit [Count of The number of HHA E-Dir hit operations. Unit: hisi_sccl1_hha2] sdir-hit [Count of The number of HHA S-Dir hit operations. Unit: hisi_sccl1_hha2] sdir-lookup [Count of the number of HHA S-Dir lookup operations. Unit: hisi_sccl1_hha2] ... Reviewed-by: James Clark <james.clark@linaro.org> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Yicong Yang <yangyicong@hisilicon.com> Signed-off-by: Junhao He <hejunhao3@huawei.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250418070812.3771441-2-hejunhao3@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf bench evlist-open-close: Reduce scope of 2 variablesIan Rogers
Make 2 global variables local. Reduces ELF binary size by removing relocations. For a no flags build, the perf binary size is reduced by 4,144 bytes on x86-64. Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Hao Ge <gehao@kylinos.cn> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Levi Yun <yeoreum.yun@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tengda Wu <wutengda@huaweicloud.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20250410173631.1713627-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf tests record: Cleanup improvementsIan Rogers
Remove the script output file. Add a trap debug message. Minor style consistency changes. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250410173631.1713627-2-irogers@google.com Cc: Mark Rutland <mark.rutland@arm.com> Cc: Levi Yun <yeoreum.yun@arm.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Tengda Wu <wutengda@huaweicloud.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Hao Ge <gehao@kylinos.cn> Cc: James Clark <james.clark@linaro.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Xu Yang <xu.yang_2@nxp.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: bpf@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf tests metric-only perf stat: Fix tests 84 and 86 s390Thomas Richter
On s390x KVM and z/VM machines the CPU Measurement Facility is not available. Events cycles and instructions do not exist. Running above tests on s390 KVM and z/VM guests always fail with this error: # ./perf test 84 86 84: perf stat JSON output linter : FAILED! 86: perf stat STD output linter : FAILED! # Root cause is command: # perf stat -j --metric-only -e instructions,cycles -- true {"metric-value" : "none"} # Which fails due to unsupported events and returns "none". Do not execute this test case on s390 KVM and z/VM machines. Output after: # ./perf test 84 86 84: perf stat JSON output linter : Ok 86: perf stat STD output linter : Ok # Fixes: 45a86d017adf4d6c ("perf test: Add --metric-only to perf stat output tests") Suggested-by: Heiko Carstens <hca@linux.ibm.com> Suggested-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Vasily Gorbik <gor@linux.ibm.com> Link: https://lore.kernel.org/r/20250424133310.37452-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf tool_pmu: Fix aggregation on duration_timeIan Rogers
evsel__count_has_error() fails counters when the enabled or running time are 0. The duration_time event reads 0 when the cpu_map_idx != 0 to avoid aggregating time over CPUs. Change the enable and running time to always have a ratio of 100% so that evsel__count_has_error won't fail. Before: ``` $ sudo /tmp/perf/perf stat --per-core -a -M UNCORE_FREQ sleep 1 Performance counter stats for 'system wide': S0-D0-C0 1 2,615,819,485 UNC_CLOCK.SOCKET # 2.61 UNCORE_FREQ S0-D0-C0 2 <not counted> duration_time 1.002111784 seconds time elapsed ``` After: ``` $ perf stat --per-core -a -M UNCORE_FREQ sleep 1 Performance counter stats for 'system wide': S0-D0-C0 1 758,160,296 UNC_CLOCK.SOCKET # 0.76 UNCORE_FREQ S0-D0-C0 2 1,003,438,246 duration_time 1.002486017 seconds time elapsed ``` Note: the metric reads the value a different way and isn't impacted. Fixes: 240505b2d0adcdc8 ("perf tool_pmu: Factor tool events into their own PMU") Reported-by: Stephane Eranian <eranian@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Link: https://lore.kernel.org/r/20250423050358.94310-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf session: Skip unsupported new event typesChun-Tse Shao
`perf report` currently halts with an error when encountering unsupported new event types (`event.type >= PERF_RECORD_HEADER_MAX`). This patch modifies the behavior to skip these samples and continue processing the remaining events. Additionally, stops reporting if the new event size is not 8-byte aligned. Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org> Suggested-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Chun-Tse Shao <ctshao@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250414173921.2905822-1-ctshao@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf hist: Allow custom output fields in hierarchy modeNamhyung Kim
Now it can handle multiple output fields and sort keys in separate levels, so it should be ok to use it in the hierarchy mode. This allows fully customized output format. $ perf report -F latency,comm,parallelism -H --stdio ... # Latency Command / Parallelism # ........... ..................... # 31.84% cc1 29.96% 5 1.24% 4 0.37% 6 0.26% 3 0.02% 2 24.68% as 22.39% 5 1.12% 2 0.98% 4 0.12% 3 0.07% 6 ... Committer testing: Before: $ perf report -F latency,comm,parallelism -H --stdio Error: --hierarchy and --fields options cannot be used together Usage: perf report [<options>] -F, --fields <key[,keys...]> output field(s): overhead latency overhead_sys overhead_us overhead_guest_sys overhead_guest_us overhead_children latency_children sample period weight1 weight2 weight3 <SNIP> -H, --hierarchy Show entries in a hierarchy $ After: $ perf report -F latency,comm,parallelism -H --stdio # Total Lost Samples: 0 # # Samples: 1K of event 'cycles:Pu' # Event count (approx.): 1581450138 # # Latency Command / Parallelism # ........... ..................... # 97.66% git 96.95% 1 0.55% 2 0.04% 5 0.03% 8 0.03% 4 0.02% 3 0.01% 9 0.01% 7 0.01% 6 0.01% 10 0.00% 12 2.34% git-remote-http 2.24% 1 0.07% 5 0.02% 2 0.00% 4 # # (Tip: To analyze particular parallelism levels, try: perf report --latency --parallelism=32-64) Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250331073722.4695-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf hist: Set levels in output_field_add()Namhyung Kim
It turns out that the output fields didn't consider the hierarchy mode and put all the fields in the same level. To support hierarchy, each non-output field should be in a separate level. Pass a pointer to level to output_field_add() and make it increase the level when it sees non-output fields. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250331073722.4695-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf hist: Remove formats in hierarchy when cancel latencyNamhyung Kim
Likewise, it should remove latency output fields in hierarchy list. Pass evlist to perf_hpp__cancel_latency() to handle them properly. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250331073722.4695-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf hist: Remove formats in hierarchy when cancel childrenNamhyung Kim
This is to support hierarchy options with custom output fields. Currently perf_hpp__cancel_cumulate() only removes accumulated overhead and latency fields from the global perf_hpp_list. This is not used in the hierarchy mode because each evsel's hist has its own separate hpp_list. So it needs to remove the fields from the lists too. Pass evlist to the function so that it can iterate the evsels. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250331073722.4695-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf record: Retirement latency cleanup in evsel__configIan Rogers
'perf record' will fail with retirement latency events as the open doesn't do a perf_event_open system call. Use evsel__config() to set up such events for recording by removing the flag and enabling sample weights - the sample weights containing the retirement latency. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-17-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf pmu-events: Add retirement latency to JSON events inside of perfIan Rogers
The updated Intel vendor events add retirement latency for graniterapids: https://lore.kernel.org/lkml/20250322063403.364981-14-irogers@google.com/ This change makes those values available within an alias/event within a PMU and saves them into the evsel at event parse time. When no TPEBS data is available the default values are substituted in for TMA metrics that are using retirement latency events - currently just those on graniterapids. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-16-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf stat: Add mean, min, max and last --tpebs-mode optionsIan Rogers
Add command line configuration option for how retirement latency events are combined. The default "mean" gives the average of retirement latency. "min" or "max" give the smallest or largest retirment latency times respectively. "last" uses the last retirment latency sample's time. Committer notes: Enclose parse_tpebs_mode() under HAVE_ARCH_X86_64_SUPPORT to match the ifdef block where it is used, fixing the build in systems like: 20 5.60 debian:experimental-x-mips : FAIL gcc version 14.2.0 (Debian 14.2.0-1) builtin-stat.c:2330:12: error: 'parse_tpebs_mode' defined but not used [-Werror=unused-function] 2330 | static int parse_tpebs_mode(const struct option *opt, const char *str, | ^~~~~~~~~~~~~~~~ Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf intel-tpebs: Use stats for retirement latency statisticsIan Rogers
struct stats provides access to mean, min and max. It also provides uniformity with statistics code used elsewhere in perf. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>