linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2023-02-03	tools/power/x86/intel-speed-select: Remove unused non_block flag	Zhang Rui
	variable 'non_block' is always 0, thus remove the variable and the handling for "non_block != 0" case. Signed-off-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-03	tools/power/x86/intel-speed-select: Remove wrong check in set_isst_id()	Zhang Rui
	struct isst_id *id is a pointer, comparing it with less than zero is wrong. The check is there to make sure the id->pkg and id->die is set to -1, when it is illegal or unavailable. Here comparing with MAX_PACKAGE_COUNT and MAX_DIE_PER_PACKAGE is sufficient. Hence remove the wrong check. Signed-off-by: Zhang Rui <rui.zhang@intel.com> [srinivas.pandruvada@linux.intel.com: Subject and changelog edits] Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-02-02	kselftest: vm: add tests for memory-deny-write-execute	Kees Cook
	Add some tests to cover the new PR_SET_MDWE prctl. Link: https://lkml.kernel.org/r/20230119160344.54358-3-joey.gouly@arm.com Co-developed-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Joey Gouly <joey.gouly@arm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Cc: Shuah Khan <shuah@kernel.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jeremy Linton <jeremy.linton@arm.com> Cc: Lennart Poettering <lennart@poettering.net> Cc: Mark Brown <broonie@kernel.org> Cc: nd <nd@arm.com> Cc: Szabolcs Nagy <szabolcs.nagy@arm.com> Cc: Topi Miettinen <toiwoton@gmail.com> Cc: Zbigniew Jędrzejewski-Szmek <zbyszek@in.waw.pl> Cc: David Hildenbrand <david@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02	tools/mm: allow users to provide additional cflags/ldflags	Herton R. Krzesinski
	Right now there is no way to provide additional cflags/ldflags when building tools/vm binaries. And using eg. make CFLAGS=<options> will override the CFLAGS being set in the Makefile, making the build fail since it requires the include of the ../lib dir (for libapi). This change then allows you to specify: CFLAGS=<options> LDFLAGS=<options> make V=1 -C tools/vm And the options will be correctly appended as can be seen from the make output. Link: https://lkml.kernel.org/r/20230116224921.4106324-1-herton@redhat.com Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Cc: Don Zickus <dzickus@redhat.com> Cc: Justin Forbes <jforbes@redhat.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Scott Weaver <scweaver@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02	mm: discard __GFP_ATOMIC	NeilBrown
	__GFP_ATOMIC serves little purpose. Its main effect is to set ALLOC_HARDER which adds a few little boosts to increase the chance of an allocation succeeding, one of which is to lower the water-mark at which it will succeed. It is always paired with __GFP_HIGH which sets ALLOC_HIGH which also adjusts this watermark. It is probable that other users of __GFP_HIGH should benefit from the other little bonuses that __GFP_ATOMIC gets. __GFP_ATOMIC also gives a warning if used with __GFP_DIRECT_RECLAIM. There is little point to this. We already get a might_sleep() warning if __GFP_DIRECT_RECLAIM is set. __GFP_ATOMIC allows the "watermark_boost" to be side-stepped. It is probable that testing ALLOC_HARDER is a better fit here. __GFP_ATOMIC is used by tegra-smmu.c to check if the allocation might sleep. This should test __GFP_DIRECT_RECLAIM instead. This patch: - removes __GFP_ATOMIC - allows __GFP_HIGH allocations to ignore watermark boosting as well as GFP_ATOMIC requests. - makes other adjustments as suggested by the above. The net result is not change to GFP_ATOMIC allocations. Other allocations that use __GFP_HIGH will benefit from a few different extra privileges. This affects: xen, dm, md, ntfs3 the vermillion frame buffer hibernation ksm swap all of which likely produce more benefit than cost if these selected allocation are more likely to succeed quickly. [mgorman: Minor adjustments to rework on top of a series] Link: https://lkml.kernel.org/r/163712397076.13692.4727608274002939094@noble.neil.brown.name Link: https://lkml.kernel.org/r/20230113111217.14134-7-mgorman@techsingularity.net Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Mel Gorman <mgorman@techsingularity.net> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02	maple_tree: remove the parameter entry of mas_preallocate	Vernon Yang
	The parameter entry of mas_preallocate is not used, so drop it. Link: https://lkml.kernel.org/r/20230110154211.1758562-1-vernon2gm@gmail.com Signed-off-by: Vernon Yang <vernon2gm@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02	selftests/damon/debugfs_rm_non_contexts: hide expected write error messages	SeongJae Park
	A selftest case for DAMON debugfs interface has a test for expected failure. To make the test output clean, hide the expected failure error message. Link: https://lkml.kernel.org/r/20230110190400.119388-9-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02	selftests/damon/sysfs: hide expected write failures	SeongJae Park
	DAMON selftests for sysfs (sysfs.sh) tests if some writes to DAMON sysfs interface files fails as expected. It makes the test results noisy with the failure error message because it tests a number of such failures. Redirect the expected failure error messages to /dev/null to make the results clean. Link: https://lkml.kernel.org/r/20230110190400.119388-8-sj@kernel.org Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-02-02	bpftool: profile online CPUs instead of possible	Tonghao Zhang
	The number of online cpu may be not equal to possible cpu. "bpftool prog profile" can not create pmu event on possible but on online cpu. $ dmidecode -s system-product-name PowerEdge R620 $ cat /sys/devices/system/cpu/possible 0-47 $ cat /sys/devices/system/cpu/online 0-31 Disable cpu dynamically: $ echo 0 > /sys/devices/system/cpu/cpuX/online If one cpu is offline, perf_event_open will return ENODEV. To fix this issue: * check value returned and skip offline cpu. * close pmu_fd immediately on error path, avoid fd leaking. Fixes: 47c09d6a9f67 ("bpftool: Introduce "prog profile" command") Signed-off-by: Tonghao Zhang <tong@infragraf.org> Cc: Quentin Monnet <quentin@isovalent.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Martin KaFai Lau <martin.lau@linux.dev> Cc: Song Liu <song@kernel.org> Cc: Yonghong Song <yhs@fb.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Stanislav Fomichev <sdf@google.com> Cc: Hao Luo <haoluo@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Acked-by: John Fastabend <john.fastabend@gmail.com> Link: https://lore.kernel.org/r/20230202131701.29519-1-tong@infragraf.org Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-02-03	Merge tag 'iio-for-6.3a' of ↵	Greg Kroah-Hartman
	https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next Jonathan writes: 1st set of IIO new device support, features and cleanups for the 6.3 cycle The usual mixed bag. So far this has been a quiet cycle for IIO. New device support * adi,ad8686 - Add support for the AD5337 DAC - ID and 8 bit channel support. * maxim,max5522 - New driver for this 2 channel DAC. * nxp,imx93-adc - New driver for this SoC ADC which is a fresh IP that will probably turn up in additional SoCs going forwards. * st,magn - Add support for magnetometer part of LSM303C which is very similar to standalone LIS3MDL already supported. * ti,ads7924 - New driver for this 4 channel, 12-bit I2C ADC. * ti,lmp92064 - New driver for this 12 bit SPI ADC. * ti,tmag5273 - New driver for this 3D Hall-Effect Sensor. Features * core - Add a standard structure for the value pairs in IIO_VAL_INT_PLUS_MICRO available attributes and similar. * cirrus,ep93xx - Add DT binding docs and convert driver to DT based probing. - Enable testing building with CONFIG_COMPILE_TEST. * st,stm32-dfsdm - Enable ID register support for discovery of hardware capabilities on some devices. Cleanups and minor fixes * core - Drop the custom iio_sysfs_match_string_with_gaps(). The special ability of this function to skip gaps in an array was never used by any upstream driver. - Sort headers whilst touching this file. * tools - Fix memory leak in iio_utils.c * various - leftover i2c probe_new() conversions. - scnprintf() -> sysfs_emit() cleanups. - hand rolled devm enables -> devm_regulator[_bulk]_get_enable() - typo fixes - dt-binding cleanup (whitespace, excess quotes and similar) * adi,ad7746 - Set variable without pointless conditional. * fsl,mma9551 - Squash false positives about use of uninitialized variable where garbage undergoes an endian conversion before being ignored. * measspec,ms5611 - Switch to fully devm_ managed probe() and so drop explicit remove() * qcom,spmi-adc - Use dev_err_probe() to suppress deferred print. * qcom,spmi-adc5 - Define a missing channel used for battery identification. * qcom,spmi-iadc - Document a compatible seen in wild. * semtech,sx9360 - Fix units on semtech,resolution dt-binding. * sensiron,scd30 - dev_err_probe() usage to simplify error paths a little. * st,lsm6dsx - Add missing mount matrix for the gyro IIO device. * taos,tsl2563 - Respect firmware configured interrupt polarity if present. - Use i2c_smbus_write_word_data() in a few cases not previously covered. - Factor out duplicated interrupt configuration. - Switch to GENMASK() / BIT() from hand coded equivalents. - Tidy up unused definitions. - Use dev_err_probe() as appropriate. - Drop platform_data as no in kernel users and there are better ways to do equivalent if any are added. - Add local struct device variable to tidy up code. - Avoid dance via i2c_client to get the drvdata. - Tidy up headers ordering and Makefile ordering. * ti,adc128s052 - Use new spi_get_device_match_data(). - Drop ACPI_PTR() protection. - Sort headers whilst here. - Use asm instead of incorrect include of asm-generic/unaligned.h * vishay,vcn4000 - Interrupt support for vcnl4040 (lots of refactoring needed) * xilinx,ams - Use fwnode_device_is_compatible() instead of open coding it. * tag 'iio-for-6.3a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (71 commits) iio: adc: ad7291: Fix indentation error by adding extra spaces iio: accel: mma9551_core: Prevent uninitialized variable in mma9551_read_config_word() iio: accel: mma9551_core: Prevent uninitialized variable in mma9551_read_status_word() dt-bindings: iio/proximity: semtech,sx9360: Fix 'semtech,resolution' type iio: imu: fix spdx format iio: adc: imx93: Fix spelling mistake "geting" -> "getting" dt-bindings: iio: cleanup examples - indentation dt-bindings: iio: use lowercase hex in examples dt-bindings: iio: correct node names in examples dt-bindings: iio: minor whitespace cleanups dt-bindings: iio: drop unneeded quotes dt-bindings: iio: adc: Add NXP IMX93 ADC iio: adc: add imx93 adc support dt-bindings: iio: adc: add Texas Instruments ADS7924 iio: adc: ti-ads7924: add Texas Instruments ADS7924 driver iio: imu: st_lsm6dsx: add 'mount_matrix' sysfs entry to gyro channel. iio: imu: st_lsm6dsx: fix naming of 'struct iio_info' in st_lsm6dsx_shub.c. iio: light: vcnl4000: Add interrupt support for vcnl4040 iio: light: vcnl4000: Make irq handling more generic iio: light: vcnl4000: Prepare for more generic setup ...
2023-02-02	selftests/bpf: introduce XDP compliance test tool	Lorenzo Bianconi
	Introduce xdp_features tool in order to test XDP features supported by the NIC and match them against advertised ones. In order to test supported/advertised XDP features, xdp_features must run on the Device Under Test (DUT) and on a Tester device. xdp_features opens a control TCP channel between DUT and Tester devices to send control commands from Tester to the DUT and a UDP data channel where the Tester sends UDP 'echo' packets and the DUT is expected to reply back with the same packet. DUT installs multiple XDP programs on the NIC to test XDP capabilities and reports back to the Tester some XDP stats. Currently xdp_features supports the following XDP features: - XDP_DROP - XDP_ABORTED - XDP_PASS - XDP_TX - XDP_REDIRECT - XDP_NDO_XMIT Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Acked-by: Stanislav Fomichev <sdf@google.com> Link: https://lore.kernel.org/r/7c1af8e7e6ef0614cf32fa9e6bdaa2d8d605f859.1675245258.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02	selftests/bpf: add test for bpf_xdp_query xdp-features support	Lorenzo Bianconi
	Introduce a self-test to verify libbpf bpf_xdp_query capability to dump the xdp-features supported by the device (lo and veth in this case). Acked-by: Stanislav Fomichev <sdf@google.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/534550318a2c883e174811683909544c63632f05.1675245258.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02	libbpf: add API to get XDP/XSK supported features	Lorenzo Bianconi
	Extend bpf_xdp_query routine in order to get XDP/XSK supported features of netdev over route netlink interface. Extend libbpf netlink implementation in order to support netlink_generic protocol. Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Co-developed-by: Marek Majtyka <alardam@gmail.com> Signed-off-by: Marek Majtyka <alardam@gmail.com> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/a72609ef4f0de7fee5376c40dbf54ad7f13bfb8d.1675245258.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02	libbpf: add the capability to specify netlink proto in libbpf_netlink_send_recv	Lorenzo Bianconi
	This is a preliminary patch in order to introduce netlink_generic protocol support to libbpf. Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Link: https://lore.kernel.org/r/7878a54667e74afeec3ee519999c044bd514b44c.1675245258.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02	netdev-genl: create a simple family for netdev stuff	Jakub Kicinski
	Add a Netlink spec-compatible family for netdevs. This is a very simple implementation without much thought going into it. It allows us to reap all the benefits of Netlink specs, one can use the generic client to issue the commands: $ ./cli.py --spec netdev.yaml --dump dev_get [{'ifindex': 1, 'xdp-features': set()}, {'ifindex': 2, 'xdp-features': {'basic', 'ndo-xmit', 'redirect'}}, {'ifindex': 3, 'xdp-features': {'rx-sg'}}] the generic python library does not have flags-by-name support, yet, but we also don't have to carry strings in the messages, as user space can get the names from the spec. Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Co-developed-by: Marek Majtyka <alardam@gmail.com> Signed-off-by: Marek Majtyka <alardam@gmail.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/327ad9c9868becbe1e601b580c962549c8cd81f2.1675245258.git.lorenzo@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02	selftests/bpf: Use semicolon instead of comma in test_verifier.c	Tiezhu Yang
	Just silence the following checkpatch warning: WARNING: Possible comma where semicolon could be used Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/1675319486-27744-3-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02	tools/bpf: Use tab instead of white spaces to sync bpf.h	Tiezhu Yang
	Just silence the following build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h' Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Link: https://lore.kernel.org/r/1675319486-27744-2-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02	selftests/bpf: Initialize tc in xdp_synproxy	Ilya Leoshkevich
	xdp_synproxy/xdp fails in CI with: Error: bpf_tc_hook_create: File exists The XDP version of the test should not be calling bpf_tc_hook_create(); the reason it's happening anyway is that if we don't specify --tc on the command line, tc variable remains uninitialized. Fixes: 784d5dc0efc2 ("selftests/bpf: Add selftests for raw syncookie helpers in TC mode") Reported-by: Alexei Starovoitov <ast@kernel.org> Reported-by: Joanne Koong <joannelkoong@gmail.com> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20230202235335.3403781-1-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski
	net/core/gro.c 7d2c89b32587 ("skb: Do mix page pool and page referenced frags in GRO") b1a78b9b9886 ("net: add support for ipv4 big tcp") https://lore.kernel.org/all/20230203094454.5766f160@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-02	Merge tag 'net-6.2-rc7' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf, can and netfilter. Current release - regressions: - phy: fix null-deref in phy_attach_direct - mac802154: fix possible double free upon parsing error Previous releases - regressions: - bpf: preserve reg parent/live fields when copying range info, prevent mis-verification of programs as safe - ip6: fix GRE tunnels not generating IPv6 link local addresses - phy: dp83822: fix null-deref on DP83825/DP83826 devices - sctp: do not check hb_timer.expires when resetting hb_timer - eth: mtk_sock: fix SGMII configuration after phylink conversion Previous releases - always broken: - eth: xdp: execute xdp_do_flush() before napi_complete_done() - skb: do not mix page pool and page referenced frags in GRO - bpf: - fix a possible task gone issue with bpf_send_signal[_thread]() - fix an off-by-one bug in bpf_mem_cache_idx() to select the right cache - add missing btf_put to register_btf_id_dtor_kfuncs - sockmap: fon't let sock_map_{close,destroy,unhash} call itself - gso: fix null-deref in skb_segment_list() - mctp: purge receive queues on sk destruction - fix UaF caused by accept on already connected socket in exotic socket families - tls: don't treat list head as an entry in tls_is_tx_ready() - netfilter: br_netfilter: disable sabotage_in hook after first suppression - wwan: t7xx: fix runtime PM implementation Misc: - MAINTAINERS: spring cleanup of networking maintainers" * tag 'net-6.2-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits) mtk_sgmii: enable PCS polling to allow SFP work net: mediatek: sgmii: fix duplex configuration net: mediatek: sgmii: ensure the SGMII PHY is powered down on configuration MAINTAINERS: update SCTP maintainers MAINTAINERS: ipv6: retire Hideaki Yoshifuji mailmap: add John Crispin's entry MAINTAINERS: bonding: move Veaceslav Falico to CREDITS net: openvswitch: fix flow memory leak in ovs_flow_cmd_new net: ethernet: mtk_eth_soc: disable hardware DSA untagging for second MAC virtio-net: Keep stop() to follow mirror sequence of open() selftests: net: udpgso_bench_tx: Cater for pending datagrams zerocopy benchmarking selftests: net: udpgso_bench: Fix racing bug between the rx/tx programs selftests: net: udpgso_bench_rx/tx: Stop when wrong CLI args are provided selftests: net: udpgso_bench_rx: Fix 'used uninitialized' compiler warning can: mcp251xfd: mcp251xfd_ring_set_ringparam(): assign missing tx_obj_num_coalesce_irq can: isotp: split tx timer into transmission and timeout can: isotp: handle wait_event_interruptible() return values can: raw: fix CAN FD frame transmissions over CAN XL devices can: j1939: fix errant WARN_ON_ONCE in j1939_session_deactivate hv_netvsc: Fix missed pagebuf entries in netvsc_dma_map/unmap() ...
2023-02-02	perf pmu-events: Add separate metric from pmu_event	Ian Rogers
	Create a new pmu_metric for the metric related variables from pmu_event but that is initially just a clone of pmu_event. Add iterators for pmu_metric and use in places that metrics are desired rather than events. Make the event iterator skip metric only events, and the metric iterator skip event only events. Reviewed-by: John Garry <john.g.garry@oracle.com> Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230126233645.200509-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf jevents: Rewrite metrics in the same file with each other	Ian Rogers
	Rewrite metrics within the same file in terms of each other. For example, on Power8 other_stall_cpi is rewritten from: "PM_CMPLU_STALL / PM_RUN_INST_CMPL - PM_CMPLU_STALL_BRU_CRU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_FXU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_VSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_LSU / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NTCG_FLUSH / PM_RUN_INST_CMPL - PM_CMPLU_STALL_NO_NTF / PM_RUN_INST_CMPL" to: "stall_cpi - bru_cru_stall_cpi - fxu_stall_cpi - vsu_stall_cpi - lsu_stall_cpi - ntcg_flush_cpi - no_ntf_stall_cpi" Which more closely matches the definition on Power9. To avoid recomputation decorate the function with a cache. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230126233645.200509-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf jevents metric: Add ability to rewrite metrics in terms of others	Ian Rogers
	Add RewriteMetricsInTermsOfOthers that iterates over pairs of names and expressions trying to replace an expression, within the current expression, with its name. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230126233645.200509-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf jevents metric: Correct Function equality	Ian Rogers
	rhs may not be defined, say for source_count, so add a guard. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Florian Fischer <florian.fischer@muhq.space> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Kang Minchul <tegongkang@gmail.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Rob Herring <robh@kernel.org> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Will Deacon <will@kernel.org> Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230126233645.200509-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf session: Show branch speculation info in raw dump	Sandipan Das
	Show the branch speculation info if provided by the branch recording hardware feature. This can be useful for purposes of code optimization. E.g. $ perf record -j any,u ./test_branch $ perf report --dump-raw-trace Before: [...] 8380958377610 0x40b178 [0x1b0]: PERF_RECORD_SAMPLE(IP, 0x2): 7952/7952: 0x4f851a period: 48973 addr: 0 ... branch stack: nr:16 ..... 0: 00000000004b52fd -> 00000000004f82c0 0 cycles P 0 ..... 1: ffffffff8220137c -> 00000000004b52f0 0 cycles M 0 ..... 2: 000000000041d1c4 -> 00000000004b52f0 0 cycles P 0 ..... 3: 00000000004e7ead -> 000000000041d1b0 0 cycles M 0 ..... 4: 00000000004e7f91 -> 00000000004e7ead 0 cycles P 0 ..... 5: 00000000004e7ea8 -> 00000000004e7f70 0 cycles P 0 ..... 6: 00000000004e7e52 -> 00000000004e7e98 0 cycles M 0 ..... 7: 00000000004e7e1f -> 00000000004e7e40 0 cycles M 0 ..... 8: 00000000004e7f60 -> 00000000004e7df0 0 cycles P 0 ..... 9: 00000000004e7f58 -> 00000000004e7f60 0 cycles M 0 ..... 10: 000000000041d85d -> 00000000004e7f50 0 cycles P 0 ..... 11: 000000000043306a -> 000000000041d840 0 cycles P 0 ..... 12: ffffffff8220137c -> 0000000000433040 0 cycles M 0 ..... 13: 000000000041e4a1 -> 0000000000433040 0 cycles P 0 ..... 14: ffffffff8220137c -> 000000000041e490 0 cycles M 0 ..... 15: 000000000041d89b -> 000000000041e487 0 cycles P 0 ... thread: test_branch:7952 ...... dso: /data/sandipan/test_branch [...] After: [...] 8380958377610 0x40b178 [0x1b0]: PERF_RECORD_SAMPLE(IP, 0x2): 7952/7952: 0x4f851a period: 48973 addr: 0 ... branch stack: nr:16 ..... 0: 00000000004b52fd -> 00000000004f82c0 0 cycles P 0 NON_SPEC_CORRECT_PATH ..... 1: ffffffff8220137c -> 00000000004b52f0 0 cycles M 0 NON_SPEC_CORRECT_PATH ..... 2: 000000000041d1c4 -> 00000000004b52f0 0 cycles P 0 NON_SPEC_CORRECT_PATH ..... 3: 00000000004e7ead -> 000000000041d1b0 0 cycles M 0 NON_SPEC_CORRECT_PATH ..... 4: 00000000004e7f91 -> 00000000004e7ead 0 cycles P 0 NON_SPEC_CORRECT_PATH ..... 5: 00000000004e7ea8 -> 00000000004e7f70 0 cycles P 0 NON_SPEC_CORRECT_PATH ..... 6: 00000000004e7e52 -> 00000000004e7e98 0 cycles M 0 SPEC_CORRECT_PATH ..... 7: 00000000004e7e1f -> 00000000004e7e40 0 cycles M 0 NON_SPEC_CORRECT_PATH ..... 8: 00000000004e7f60 -> 00000000004e7df0 0 cycles P 0 NON_SPEC_CORRECT_PATH ..... 9: 00000000004e7f58 -> 00000000004e7f60 0 cycles M 0 NON_SPEC_CORRECT_PATH ..... 10: 000000000041d85d -> 00000000004e7f50 0 cycles P 0 NON_SPEC_CORRECT_PATH ..... 11: 000000000043306a -> 000000000041d840 0 cycles P 0 NON_SPEC_CORRECT_PATH ..... 12: ffffffff8220137c -> 0000000000433040 0 cycles M 0 NON_SPEC_CORRECT_PATH ..... 13: 000000000041e4a1 -> 0000000000433040 0 cycles P 0 NON_SPEC_CORRECT_PATH ..... 14: ffffffff8220137c -> 000000000041e490 0 cycles M 0 NON_SPEC_CORRECT_PATH ..... 15: 000000000041d89b -> 000000000041e487 0 cycles P 0 NON_SPEC_CORRECT_PATH ... thread: test_branch:7952 ...... dso: /data/sandipan/test_branch [...] With the addition of new branch flags, the "brstacksym" fields in perf script output now shows speculation information after the branch type. Change the regular expressions accordingly for the test to pass. Since branch speculation information may vary across platforms, the test does not look for specific values. E.g. $ perf test -v 110 Before: 110: Check branch stack sampling : --- start --- test child forked, pid 54154 Testing user branch stack sampling + grep -E -m1 ^brstack_bench\+[^ ]/brstack_foo\+[^ ]/IND_CALL$ /tmp/__perf_test.program.AfhUI/perf.script + cleanup + rm -rf /tmp/__perf_test.program.AfhUI test child finished with -1 ---- end ---- Check branch stack sampling: FAILED! After: 110: Check branch stack sampling : --- start --- test child forked, pid 43716 Testing user branch stack sampling + grep -E -m1 ^brstack_bench\+[^ ]/brstack_foo\+[^ ]/IND_CALL/.$ /tmp/__perf_test.program.xgzAi/perf.script brstack_bench+0x66/brstack_foo+0x0/P/-/-/0/IND_CALL/NON_SPEC_CORRECT_PATH + grep -E -m1 ^brstack_foo\+[^ ]/brstack_bar\+[^ ]/CALL/.$ /tmp/__perf_test.program.xgzAi/perf.script brstack_foo+0x1b/brstack_bar+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH + grep -E -m1 ^brstack_bench\+[^ ]/brstack_foo\+[^ ]/CALL/.$ /tmp/__perf_test.program.xgzAi/perf.script brstack_bench+0x58/brstack_foo+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH + grep -E -m1 ^brstack_bench\+[^ ]/brstack_bar\+[^ ]/CALL/.$ /tmp/__perf_test.program.xgzAi/perf.script brstack_bench+0x5d/brstack_bar+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH + grep -E -m1 ^brstack_bar\+[^ ]/brstack_foo\+[^ ]/RET/.$ /tmp/__perf_test.program.xgzAi/perf.script brstack_bar+0x31/brstack_foo+0x20/P/-/-/0/RET/NON_SPEC_CORRECT_PATH + grep -E -m1 ^brstack_foo\+[^ ]/brstack_bench\+[^ ]/RET/.$ /tmp/__perf_test.program.xgzAi/perf.script brstack_foo+0x36/brstack_bench+0x5d/P/-/-/0/RET/NON_SPEC_CORRECT_PATH + grep -E -m1 ^brstack_bench\+[^ ]/brstack_bench\+[^ ]/COND/.$ /tmp/__perf_test.program.xgzAi/perf.script brstack_bench+0x76/brstack_bench+0x7d/P/-/-/0/COND/NON_SPEC_CORRECT_PATH + grep -E -m1 ^brstack\+[^ ]/brstack\+[^ ]/UNCOND/.$ /tmp/__perf_test.program.xgzAi/perf.script brstack+0x5a/brstack+0x41/P/-/-/0/UNCOND/NON_SPEC_CORRECT_PATH + set +x Testing branch stack filtering permutation (any_call,CALL\|IND_CALL\|COND_CALL\|SYSCALL\|IRQ) Testing branch stack filtering permutation (call,CALL\|SYSCALL) Testing branch stack filtering permutation (cond,COND) Testing branch stack filtering permutation (any_ret,RET\|COND_RET\|SYSRET\|ERET) Testing branch stack filtering permutation (call,cond,CALL\|SYSCALL\|COND) Testing branch stack filtering permutation (any_call,cond,CALL\|IND_CALL\|COND_CALL\|IRQ\|SYSCALL\|COND) Testing branch stack filtering permutation (cond,any_call,any_ret,COND\|CALL\|IND_CALL\|COND_CALL\|SYSCALL\|IRQ\|RET\|COND_RET\|SYSRET\|ERET) test child finished with 0 ---- end ---- Check branch stack sampling: Ok Signed-off-by: Sandipan Das <sandipan.das@amd.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: x86@kernel.org Link: https://lore.kernel.org/r/048d67c9de3cc8e3dbf19aaa7ff718dec91364c5.1675333809.git.sandipan.das@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf script: Show branch speculation info	Sandipan Das
	Show the branch speculation info if provided by the branch recording hardware feature. This can be useful for optimizing code further. The speculation info is appended to the end of the list of fields so any existing tools that use "/" as a delimiter for access fields via an index remain unaffected. Also show "-" instead of "N/A" when speculation info is unavailable because "/" is used as the field separator. E.g. $ perf record -j any,u,save_type ./test_branch $ perf script --fields brstacksym Before: [...] check_match+0x60/strcmp+0x0/P/-/-/0/CALL do_lookup_x+0x3c5/check_match+0x0/P/-/-/0/CALL [...] After: [...] check_match+0x60/strcmp+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH do_lookup_x+0x3c5/check_match+0x0/P/-/-/0/CALL/NON_SPEC_CORRECT_PATH [...] The bitfield swapping scheme used duing sample parsing has changed because of the addition of new branch flags, namely "spec", "new_type" and "priv". Earlier, these were all part of the "reserved" field but now, each of these fields get swapped separately. Change the expected flag values accordingly for the test to pass. E.g. $ perf test -v 27 Before: 27: Sample parsing : --- start --- test child forked, pid 61979 parsing failed for sample_type 0x800 test child finished with -1 ---- end ---- Sample parsing: FAILED! After: 27: Sample parsing : --- start --- test child forked, pid 63293 test child finished with 0 ---- end ---- Sample parsing: Ok Signed-off-by: Sandipan Das <sandipan.das@amd.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ananth Narayan <ananth.narayan@amd.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Santosh Shukla <santosh.shukla@amd.com> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: x86@kernel.org Link: https://lore.kernel.org/r/56e272583552526e999ba0b536ac009ae3613966.1675333809.git.sandipan.das@amd.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf test: Add more test cases for perf lock contention	Namhyung Kim
	Check callstack filter with two different aggregation mode. $ sudo ./perf test -v contention 88: kernel lock contention analysis test : --- start --- test child forked, pid 83416 Testing perf lock record and perf lock contention Testing perf lock contention --use-bpf Testing perf lock record and perf lock contention at the same time Testing perf lock contention --threads Testing perf lock contention --lock-addr Testing perf lock contention --type-filter (w/ spinlock) Testing perf lock contention --lock-filter (w/ tasklist_lock) Testing perf lock contention --callstack-filter (w/ unix_stream) Testing perf lock contention --callstack-filter with task aggregation test child finished with 0 ---- end ---- kernel lock contention analysis test: Ok Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Hao Luo <haoluo@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230202050455.2187592-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf bench syscall: Add execve syscall benchmark	Tiezhu Yang
	This commit adds the execve syscall benchmark, more syscall benchmarks can be added in the future. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/1668052208-14047-5-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf bench syscall: Add getpgid syscall benchmark	Tiezhu Yang
	This commit adds a simple getpgid syscall benchmark, more syscall benchmarks can be added in the future. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/1668052208-14047-4-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf bench syscall: Introduce bench_syscall_common()	Tiezhu Yang
	In the current code, there is only a basic syscall benchmark via getppid, this is not enough. Introduce bench_syscall_common() so that we can add more syscalls to benchmark. This is preparation for later patch, no functionality change. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/1668052208-14047-3-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	tools x86: Keep list sorted by number in unistd_{32,64}.h	Tiezhu Yang
	It is better to keep list sorted by number in unistd_{32,64}.h, so that we can add more syscall number to a proper position. This is preparation for later patch, no functionality change. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/1668052208-14047-2-git-send-email-yangtiezhu@loongson.cn Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf test: Replace legacy `...` with $(...)	Diederik de Haas
	As detailed in https://www.shellcheck.net/wiki/SC2006: The use of `...` is legacy syntax with several issues: 1. It has a series of undefined behaviors related to quoting in POSIX. 2. It imposes a custom escaping mode with surprising results. 3. It's exceptionally hard to nest. $(...) command substitution has none of these problems, and is therefore strongly encouraged. Signed-off-by: Diederik de Haas <didi.debian@cknow.org> Acked-by: Carsten Haitzler <carsten.haitzler@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230201214945.127474-3-didi.debian@cknow.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf test: Replace 'grep \| wc -l' with 'grep -c'	Diederik de Haas
	To count the number of results from grep, use the '-c' parameter instead of piping it to 'wc'. See also https://www.shellcheck.net/wiki/SC2126 Signed-off-by: Diederik de Haas <didi.debian@cknow.org> Acked-by: Carsten Haitzler <carsten.haitzler@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230201214945.127474-2-didi.debian@cknow.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf stat: Hide invalid uncore event output for aggr mode	Namhyung Kim
	The current display code for perf stat iterates given cpus and build the aggr map to collect the event data for the aggregation mode. But uncore events have their own cpu maps and it won't guarantee that it'd match to the aggr map. For example, per-package uncore events would generate a single value for each socket. When user asks per-core aggregation mode, the output would contain 0 values for other cores. Thus it needs to check the uncore PMU's cpumask and if it matches to the current aggregation id. Before: $ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1 Performance counter stats for 'system wide': S0-D0-C0 1 3.73 Joules power/energy-pkg/ S0-D0-C1 0 <not counted> Joules power/energy-pkg/ S0-D0-C2 0 <not counted> Joules power/energy-pkg/ S0-D0-C3 0 <not counted> Joules power/energy-pkg/ 1.001404046 seconds time elapsed Some events weren't counted. Try disabling the NMI watchdog: echo 0 > /proc/sys/kernel/nmi_watchdog perf stat ... echo 1 > /proc/sys/kernel/nmi_watchdog The core 1, 2 and 3 should not be printed because the event is handled in a cpu in the core 0 only. With this change, the output becomes like below. After: $ sudo ./perf stat -a --per-core -e power/energy-pkg/ sleep 1 Performance counter stats for 'system wide': S0-D0-C0 1 2.09 Joules power/energy-pkg/ Fixes: b897613510890d6e ("perf stat: Update event skip condition for system-wide per-thread mode and merged uncore and hybrid events") Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ian Rogers <irogers@google.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230125192431.2929677-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf lock contention: Add -S/--callstack-filter option	Namhyung Kim
	The -S/--callstack-filter is to limit display entries having the given string in the callstack (not only in the caller in the output). The following example shows lock contention results if the callstack has 'net' substring somewhere. Note that the caller '__dev_queue_xmit' does not match to it, but it has 'inet6_csk_xmit' in the callstack. This applies even if you don't use -v option to show the full callstack. $ sudo ./perf lock con -abv -S net sleep 1 ... contended total wait max wait avg wait type caller 5 70.20 us 16.13 us 14.04 us spinlock __dev_queue_xmit+0xb6d 0xffffffffa5dd1c60 _raw_spin_lock+0x30 0xffffffffa5b8f6ed __dev_queue_xmit+0xb6d 0xffffffffa5cd8267 ip6_finish_output2+0x2c7 0xffffffffa5cdac14 ip6_finish_output+0x1d4 0xffffffffa5cdb477 ip6_xmit+0x457 0xffffffffa5d1fd17 inet6_csk_xmit+0xd7 0xffffffffa5c5f4aa __tcp_transmit_skb+0x54a 0xffffffffa5c6467d tcp_keepalive_timer+0x2fd Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: bpf@vger.kernel.org Link: https://lore.kernel.org/r/20230126000936.3017683-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf script: Add 'cgroup' field for output	Namhyung Kim
	There's no field for the cgroup, let's add one. To do that, users need to specify --all-cgroup option for perf record to capture the cgroup info. $ perf record --all-cgroups -- true $ perf script -F comm,pid,cgroup true 337112 /user.slice/user-657345.slice/user@657345.service/... true 337112 /user.slice/user-657345.slice/user@657345.service/... true 337112 /user.slice/user-657345.slice/user@657345.service/... true 337112 /user.slice/user-657345.slice/user@657345.service/... If it's recorded without the --all-cgroups, it'd complain. $ perf script -F comm,pid,cgroup Samples for 'cycles:u' event do not have CGROUP attribute set. Cannot print 'cgroup' field. Hint: run 'perf record --all-cgroups ...' Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20230126213610.3381147-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf tools docs: Use canonical ftrace path	Ross Zwisler
	The canonical location for the tracefs filesystem is at /sys/kernel/tracing. But, from Documentation/trace/ftrace.rst: Before 4.1, all ftrace tracing control files were within the debugfs file system, which is typically located at /sys/kernel/debug/tracing. For backward compatibility, when mounting the debugfs file system, the tracefs file system will be automatically mounted at: /sys/kernel/debug/tracing A few spots in the perf docs still refer to this older debugfs path, so let's update them to avoid confusion. Signed-off-by: Ross Zwisler <zwisler@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: linux-trace-kernel@vger.kernel.org Link: http://lore.kernel.org/lkml/20230130181915.1113313-5-zwisler@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf arm-spe: Only warn once for each unsupported address packet	Rob Herring
	Unknown address packet indexes are not an error as the Arm architecture can (and has with SPEv1.2) define new ones and implementation defined ones are also allowed. The error message for every occurrence of the packet is needlessly noisy as well. Change the message to print just once for each unknown index. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230127205546.667740-1-robh@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf symbols: Symbol lookup with kcore can fail if multiple segments match stext	Krister Johansen
	This problem was encountered on an arm64 system with a lot of memory. Without kernel debug symbols installed, and with both kcore and kallsyms available, perf managed to get confused and returned "unknown" for all of the kernel symbols that it tried to look up. On this system, stext fell within the vmalloc segment. The kcore symbol matching code tries to find the first segment that contains stext and uses that to replace the segment generated from just the kallsyms information. In this case, however, there were two: a very large vmalloc segment, and the text segment. This caused perf to get confused because multiple overlapping segments were inserted into the RB tree that holds the discovered segments. However, that alone wasn't sufficient to cause the problem. Even when we could find the segment, the offsets were adjusted in such a way that the newly generated symbols didn't line up with the instruction addresses in the trace. The most obvious solution would be to consult which segment type is text from kcore, but this information is not exposed to users. Instead, select the smallest matching segment that contains stext instead of the first matching segment. This allows us to match the text segment instead of vmalloc, if one is contained within the other. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Krister Johansen <kjlx@templeofstupid.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: David Reaver <me@davidreaver.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20230125183418.GD1963@templeofstupid.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf probe: Fix usage when libtraceevent is missing	Athira Rajeev
	While parsing the tracepoint events in parse_events_add_tracepoint() function, code checks for HAVE_LIBTRACEEVENT support. This is needed since libtraceevent is necessary for tracepoint. But while adding probe points, check for LIBTRACEEVENT is not done in case of perf probe. Hence, in environment with missing libtraceevent-devel, it is observed that adding a probe point shows below message though it can't be used via perf record. Example: Adding probe point: ./perf probe 'vfs_getname=getname_flags:72 pathname=result->name:string' Added new event: probe:vfs_getname (on getname_flags:72 with pathname=result->name:string) You can now use it in all perf tools, such as: perf record -e probe:vfs_getname -aR sleep 1 But trying perf record: ./perf record -e probe:vfs_getname -aR sleep 1 event syntax error: 'probe:vfs_getname' \___ unsupported tracepoint libtraceevent is necessary for tracepoint support Run 'perf list' for a list of valid events The builtin tool like perf record needs libtraceevent to parse tracefs. But still the probe can be used by enabling via tracefs. Patch fixes the probe usage message to the user based on presence of libtraceevent. With the fix, # ./perf probe 'pmu:myprobe=schedule' Added new event: pmu:myprobe (on schedule) perf is not linked with libtraceevent, to use the new probe you can use tracefs: cd /sys/kernel/tracing/ echo 1 > events/pmu/myprobe/enable echo 1 > tracing_on cat trace_pipe Before removing the probe, echo 0 > events/pmu/myprobe/enable Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Disha Goel <disgoel@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nageswara R Sastry <rnsastry@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: linuxppc-dev@lists.ozlabs.org Link: https://lore.kernel.org/r/20230131134748.54567-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf symbols: Get symbols for .plt.got for x86-64	Adrian Hunter
	For x86_64, determine a symbol for .plt.got entries. That requires computing the target offset and finding that in .rela.dyn, which in turn means .rela.dyn needs to be sorted by offset. Example: In this example, the GNU C Library is using .plt.got for malloc and free. Before: $ gcc --version gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ perf record -e intel_pt//u uname Linux [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.027 MB perf.data ] $ perf script --itrace=be --ns -F-event,+addr,-period,-comm,-tid,-cpu > /tmp/cmp1.txt After: $ perf script --itrace=be --ns -F-event,+addr,-period,-comm,-tid,-cpu > /tmp/cmp2.txt $ diff /tmp/cmp1.txt /tmp/cmp2.txt \| head -12 15509,15510c15509,15510 < 27046.755390907: 7f0b2943e3ab _nl_normalize_codeset+0x5b (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b29428380 offset_0x28380@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6) < 27046.755390907: 7f0b29428384 offset_0x28380@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b294a5120 malloc+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6) --- > 27046.755390907: 7f0b2943e3ab _nl_normalize_codeset+0x5b (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b29428380 malloc@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6) > 27046.755390907: 7f0b29428384 malloc@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b294a5120 malloc+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6) 15821,15822c15821,15822 < 27046.755394865: 7f0b2943850c _nl_load_locale_from_archive+0x5bc (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b29428370 offset_0x28370@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6) < 27046.755394865: 7f0b29428374 offset_0x28370@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b294a5460 cfree@GLIBC_2.2.5+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6) --- > 27046.755394865: 7f0b2943850c _nl_load_locale_from_archive+0x5bc (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b29428370 free@plt+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6) > 27046.755394865: 7f0b29428374 free@plt+0x4 (/usr/lib/x86_64-linux-gnu/libc.so.6) => 7f0b294a5460 cfree@GLIBC_2.2.5+0x0 (/usr/lib/x86_64-linux-gnu/libc.so.6) Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230131131625.6964-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	perf symbols: Start adding support for .plt.got for x86	Adrian Hunter
	For x86, .plt.got is used, for example, when the address is taken of a dynamically linked function. Start adding support by synthesizing a symbol for each entry. A subsequent patch will attempt to get a better name for the symbol. Example: Before: $ cat tstpltlib.c void fn1(void) {} void fn2(void) {} void fn3(void) {} void fn4(void) {} $ cat tstpltgot.c void fn1(void); void fn2(void); void fn3(void); void fn4(void); void callfn(void (*fn)(void)) { fn(); } int main() { fn4(); fn1(); callfn(fn3); fn2(); fn3(); return 0; } $ gcc --version gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0 Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ gcc -Wall -Wextra -shared -o libtstpltlib.so tstpltlib.c $ gcc -Wall -Wextra -o tstpltgot tstpltgot.c -L . -ltstpltlib -Wl,-rpath="$(pwd)" $ readelf -SW tstpltgot \| grep 'Name\\|plt\\|dyn' [Nr] Name Type Address Off Size ES Flg Lk Inf Al [ 6] .dynsym DYNSYM 00000000000003d8 0003d8 0000f0 18 A 7 1 8 [ 7] .dynstr STRTAB 00000000000004c8 0004c8 0000c6 00 A 0 0 1 [10] .rela.dyn RELA 00000000000005d8 0005d8 0000d8 18 A 6 0 8 [11] .rela.plt RELA 00000000000006b0 0006b0 000048 18 AI 6 24 8 [13] .plt PROGBITS 0000000000001020 001020 000040 10 AX 0 0 16 [14] .plt.got PROGBITS 0000000000001060 001060 000020 10 AX 0 0 16 [15] .plt.sec PROGBITS 0000000000001080 001080 000030 10 AX 0 0 16 [23] .dynamic DYNAMIC 0000000000003d90 002d90 000210 10 WA 7 0 8 $ perf record -e intel_pt//u --filter 'filter main @ ./tstpltgot , filter callfn @ ./tstpltgot' ./tstpltgot [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.011 MB perf.data ] $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso 28393.810326915: tr strt 0 [unknown] => 562350baa1b2 main+0x0 28393.810326915: tr end call 562350baa1ba main+0x8 => 562350baa090 fn4@plt+0x0 28393.810326917: tr strt 0 [unknown] => 562350baa1bf main+0xd 28393.810326917: tr end call 562350baa1bf main+0xd => 562350baa080 fn1@plt+0x0 28393.810326917: tr strt 0 [unknown] => 562350baa1c4 main+0x12 28393.810326917: call 562350baa1ce main+0x1c => 562350baa199 callfn+0x0 28393.810326917: tr end call 562350baa1ad callfn+0x14 => 7f607d36110f fn3+0x0 28393.810326922: tr strt 0 [unknown] => 562350baa1af callfn+0x16 28393.810326922: return 562350baa1b1 callfn+0x18 => 562350baa1d3 main+0x21 28393.810326922: tr end call 562350baa1d3 main+0x21 => 562350baa0a0 fn2@plt+0x0 28393.810326924: tr strt 0 [unknown] => 562350baa1d8 main+0x26 28393.810326924: tr end call 562350baa1d8 main+0x26 => 562350baa060 [unknown] <- call to fn3 via .plt.got 28393.810326925: tr strt 0 [unknown] => 562350baa1dd main+0x2b 28393.810326925: tr end return 562350baa1e3 main+0x31 => 7f607d029d90 __libc_start_call_main+0x80 After: $ perf script --itrace=be --ns -F+flags,-event,+addr,-period,-comm,-tid,-cpu,-dso 28393.810326915: tr strt 0 [unknown] => 562350baa1b2 main+0x0 28393.810326915: tr end call 562350baa1ba main+0x8 => 562350baa090 fn4@plt+0x0 28393.810326917: tr strt 0 [unknown] => 562350baa1bf main+0xd 28393.810326917: tr end call 562350baa1bf main+0xd => 562350baa080 fn1@plt+0x0 28393.810326917: tr strt 0 [unknown] => 562350baa1c4 main+0x12 28393.810326917: call 562350baa1ce main+0x1c => 562350baa199 callfn+0x0 28393.810326917: tr end call 562350baa1ad callfn+0x14 => 7f607d36110f fn3+0x0 28393.810326922: tr strt 0 [unknown] => 562350baa1af callfn+0x16 28393.810326922: return 562350baa1b1 callfn+0x18 => 562350baa1d3 main+0x21 28393.810326922: tr end call 562350baa1d3 main+0x21 => 562350baa0a0 fn2@plt+0x0 28393.810326924: tr strt 0 [unknown] => 562350baa1d8 main+0x26 28393.810326924: tr end call 562350baa1d8 main+0x26 => 562350baa060 offset_0x1060@plt+0x0 28393.810326925: tr strt 0 [unknown] => 562350baa1dd main+0x2b 28393.810326925: tr end return 562350baa1e3 main+0x31 => 7f607d029d90 __libc_start_call_main+0x80 Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20230131131625.6964-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-02-02	rtla/timerlat: Add auto-analysis support to timerlat top	Daniel Bristot de Oliveira
	Currently, timerlat top displays the timerlat tracer latency results, saving the intuitive timerlat trace for the developer to analyze. This patch goes a step forward in the automaton of the scheduling latency analysis by providing a summary of the root cause of a latency higher than the passed "stop tracing" parameter if the trace stops. The output is intuitive enough for non-expert users to have a general idea of the root cause by looking at each factor's contribution percentage while keeping the technical detail in the output for more expert users to start an in dept debug or to correlate a root cause with an existing one. The terminology is in line with recent industry and academic publications to facilitate the understanding of both audiences. Here is one example of tool output: ----------------------------------------- %< ----------------------------------------------------- # taskset -c 0 timerlat -a 40 -c 1-23 -q Timer Latency 0 00:00:12 \| IRQ Timer Latency (us) \| Thread Timer Latency (us) CPU COUNT \| cur min avg max \| cur min avg max 1 #12322 \| 0 0 1 15 \| 10 3 9 31 2 #12322 \| 3 0 1 12 \| 10 3 9 23 3 #12322 \| 1 0 1 21 \| 8 2 8 34 4 #12322 \| 1 0 1 17 \| 10 2 11 33 5 #12322 \| 0 0 1 12 \| 8 3 8 25 6 #12322 \| 1 0 1 14 \| 16 3 11 35 7 #12322 \| 0 0 1 14 \| 9 2 8 29 8 #12322 \| 1 0 1 22 \| 9 3 9 34 9 #12322 \| 0 0 1 14 \| 8 2 8 24 10 #12322 \| 1 0 0 12 \| 9 3 8 24 11 #12322 \| 0 0 0 15 \| 6 2 7 29 12 #12321 \| 1 0 0 13 \| 5 3 8 23 13 #12319 \| 0 0 1 14 \| 9 3 9 26 14 #12321 \| 1 0 0 13 \| 6 2 8 24 15 #12321 \| 1 0 1 15 \| 12 3 11 27 16 #12318 \| 0 0 1 13 \| 7 3 10 24 17 #12319 \| 0 0 1 13 \| 11 3 9 25 18 #12318 \| 0 0 0 12 \| 8 2 8 20 19 #12319 \| 0 0 1 18 \| 10 2 9 28 20 #12317 \| 0 0 0 20 \| 9 3 8 34 21 #12318 \| 0 0 0 13 \| 8 3 8 28 22 #12319 \| 0 0 1 11 \| 8 3 10 22 23 #12320 \| 28 0 1 28 \| 41 3 11 41 rtla timerlat hit stop tracing ## CPU 23 hit stop tracing, analyzing it ## IRQ handler delay: 27.49 us (65.52 %) IRQ latency: 28.13 us Timerlat IRQ duration: 9.59 us (22.85 %) Blocking thread: 3.79 us (9.03 %) objtool:49256 3.79 us Blocking thread stacktrace -> timerlat_irq -> __hrtimer_run_queues -> hrtimer_interrupt -> __sysvec_apic_timer_interrupt -> sysvec_apic_timer_interrupt -> asm_sysvec_apic_timer_interrupt -> _raw_spin_unlock_irqrestore -> cgroup_rstat_flush_locked -> cgroup_rstat_flush_irqsafe -> mem_cgroup_flush_stats -> mem_cgroup_wb_stats -> balance_dirty_pages -> balance_dirty_pages_ratelimited_flags -> btrfs_buffered_write -> btrfs_do_write_iter -> vfs_write -> __x64_sys_pwrite64 -> do_syscall_64 -> entry_SYSCALL_64_after_hwframe ------------------------------------------------------------------------ Thread latency: 41.96 us (100%) The system has exit from idle latency! Max timerlat IRQ latency from idle: 17.48 us in cpu 4 Saving trace to timerlat_trace.txt ----------------------------------------- >% ----------------------------------------------------- In this case, the major factor was the delay suffered by the IRQ handler that handles timerlat wakeup: 65.52 %. This can be caused by the current thread masking interrupts, which can be seen in the blocking thread stacktrace: the current thread (objtool:49256) disabled interrupts via raw spin lock operations inside mem cgroup, while doing write syscall in a btrfs file system. A simple search for the function name on Google shows that this is a legit case for disabling the interrupts: cgroup: Use irqsave in cgroup_rstat_flush_locked() lore.kernel.org/linux-mm/20220301122143.1521823-2-bigeasy@linutronix.de/ The output also prints other reasons for the latency root cause, such as: - an IRQ that happened before the IRQ handler that caused delays - The interference from NMI, IRQ, Softirq, and Threads The details about how these factors affect the scheduling latency can be found here: https://bristot.me/demystifying-the-real-time-linux-latency/ Link: https://lkml.kernel.org/r/3d45f40e630317f51ac6d678e2d96d310e495729.1675179318.git.bristot@kernel.org Cc: Daniel Bristot de Oliveira <bristot@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-02	rtla/timerlat: Add auto-analysis core	Daniel Bristot de Oliveira
	Currently, timerlat displays a summary of the timerlat tracer results saving the trace if the system hits a stop condition. While this represented a huge step forward, the root cause was not that is accessible to non-expert users. The auto-analysis fulfill this gap by parsing the trace timerlat runs, printing an intuitive auto-analysis. Link: https://lkml.kernel.org/r/1ee073822f6a2cbb33da0c817331d0d4045e837f.1675179318.git.bristot@kernel.org Cc: Daniel Bristot de Oliveira <bristot@kernel.org> Cc: Jonathan Corbet <corbet@lwn.net> Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-02-02	PM: tools: use canonical ftrace path	Ross Zwisler
	The canonical location for the tracefs filesystem is at /sys/kernel/tracing. But, from Documentation/trace/ftrace.rst: Before 4.1, all ftrace tracing control files were within the debugfs file system, which is typically located at /sys/kernel/debug/tracing. For backward compatibility, when mounting the debugfs file system, the tracefs file system will be automatically mounted at: /sys/kernel/debug/tracing A few scripts in tools/power still refer to this older debugfs path, so let's update them to avoid confusion. Signed-off-by: Ross Zwisler <zwisler@google.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-02-02	selftests: net: udpgso_bench_tx: Cater for pending datagrams zerocopy ↵	Andrei Gherzan
	benchmarking The test tool can check that the zerocopy number of completions value is valid taking into consideration the number of datagram send calls. This can catch the system into a state where the datagrams are still in the system (for example in a qdisk, waiting for the network interface to return a completion notification, etc). This change adds a retry logic of computing the number of completions up to a configurable (via CLI) timeout (default: 2 seconds). Fixes: 79ebc3c26010 ("net/udpgso_bench_tx: options to exercise TX CMSG") Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Cc: Willem de Bruijn <willemb@google.com> Cc: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230201001612.515730-4-andrei.gherzan@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-02	selftests: net: udpgso_bench: Fix racing bug between the rx/tx programs	Andrei Gherzan
	"udpgro_bench.sh" invokes udpgso_bench_rx/udpgso_bench_tx programs subsequently and while doing so, there is a chance that the rx one is not ready to accept socket connections. This racing bug could fail the test with at least one of the following: ./udpgso_bench_tx: connect: Connection refused ./udpgso_bench_tx: sendmsg: Connection refused ./udpgso_bench_tx: write: Connection refused This change addresses this by making udpgro_bench.sh wait for the rx program to be ready before firing off the tx one - up to a 10s timeout. Fixes: 3a687bef148d ("selftests: udp gso benchmark") Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230201001612.515730-3-andrei.gherzan@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-02	selftests: net: udpgso_bench_rx/tx: Stop when wrong CLI args are provided	Andrei Gherzan
	Leaving unrecognized arguments buried in the output, can easily hide a CLI/script typo. Avoid this by exiting when wrong arguments are provided to the udpgso_bench test programs. Fixes: 3a687bef148d ("selftests: udp gso benchmark") Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Cc: Willem de Bruijn <willemb@google.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230201001612.515730-2-andrei.gherzan@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-02	selftests: net: udpgso_bench_rx: Fix 'used uninitialized' compiler warning	Andrei Gherzan
	This change fixes the following compiler warning: /usr/include/x86_64-linux-gnu/bits/error.h:40:5: warning: ‘gso_size’ may be used uninitialized [-Wmaybe-uninitialized] 40 \| __error_noreturn (__status, __errnum, __format, __va_arg_pack ()); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ udpgso_bench_rx.c: In function ‘main’: udpgso_bench_rx.c:253:23: note: ‘gso_size’ was declared here 253 \| int ret, len, gso_size, budget = 256; Fixes: 3327a9c46352 ("selftests: add functionals test for UDP GRO") Signed-off-by: Andrei Gherzan <andrei.gherzan@canonical.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://lore.kernel.org/r/20230201001612.515730-1-andrei.gherzan@canonical.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-02	selftests/bpf: Remove duplicate include header in xdp_hw_metadata	Ye Xingchen
	The linux/net_tstamp.h is included more than once, thus clean it up. Signed-off-by: Ye Xingchen <ye.xingchen@zte.com.cn> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/202301311440516312161@zte.com.cn