summaryrefslogtreecommitdiff
path: root/tools
AgeCommit message (Collapse)Author
2022-02-22perf data: Don't mention --to-ctf if it's not supportedMahmoud Mandour
The option `--to-ctf` is only available when perf has libbabeltrace support. Hence, on error, we shouldn't state that user must include `--to-ctf` unless it's supported. The only user-visible change for this commit is that when `perf` is not configured to support libbabeltrace, the user is only prompted to provide the `--to-json` option instead of bothe `--to-json` and `--to-ctf`. Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220220113952.138280-1-ma.mandourr@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-22perf script: Fix error when printing 'weight' fieldGerman Gomez
In SPE traces the 'weight' field can't be printed in 'perf script' because the 'dummy:u' event doesn't have the WEIGHT attribute set. Use evsel__do_check_stype(..) to check this field, as it's done with other fields such as "phys_addr". Before: $ perf record -e arm_spe_0// -- sleep 1 $ perf script -F event,ip,weight Samples for 'dummy:u' event do not have WEIGHT attribute set. Cannot print 'weight' field. After: $ perf script -F event,ip,weight l1d-access: 12 ffffaf629d4cb320 tlb-access: 12 ffffaf629d4cb320 memory: 12 ffffaf629d4cb320 Fixes: b0fde9c6e291e528 ("perf arm-spe: Add SPE total latency as PERF_SAMPLE_WEIGHT") Signed-off-by: German Gomez <german.gomez@arm.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20220221171707.62960-1-german.gomez@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-22perf data: Adding error message if perf_data__create_dir() failsAlexey Bayduraev
Add proper return codes for all cases of data directory creation failure and add error message output based on these codes. Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220222091417.11020-1-alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-22tools arch x86: Sync the msr-index.h copy with the kernel sourcesArnaldo Carvalho de Melo
To pick up the changes in: 3915035282573c5e ("KVM: x86: SVM: move avic definitions from AMD's spec to svm.h") Addressing these tools/perf build warnings: diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h' That makes the beautification scripts to pick some new entries: $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after $ diff -u before after --- before 2022-02-22 17:35:36.996271430 -0300 +++ after 2022-02-22 17:35:46.258503347 -0300 @@ -287,6 +287,7 @@ [0xc0010114 - x86_AMD_V_KVM_MSRs_offset] = "VM_CR", [0xc0010115 - x86_AMD_V_KVM_MSRs_offset] = "VM_IGNNE", [0xc0010117 - x86_AMD_V_KVM_MSRs_offset] = "VM_HSAVE_PA", + [0xc001011b - x86_AMD_V_KVM_MSRs_offset] = "AMD64_SVM_AVIC_DOORBELL", [0xc001011e - x86_AMD_V_KVM_MSRs_offset] = "AMD64_VM_PAGE_FLUSH", [0xc001011f - x86_AMD_V_KVM_MSRs_offset] = "AMD64_VIRT_SPEC_CTRL", [0xc0010130 - x86_AMD_V_KVM_MSRs_offset] = "AMD64_SEV_ES_GHCB", $ And this gets rebuilt: CC /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o LD /tmp/build/perf/trace/beauty/tracepoints/perf-in.o LD /tmp/build/perf/trace/beauty/perf-in.o CC /tmp/build/perf/util/amd-sample-raw.o LD /tmp/build/perf/util/perf-in.o LD /tmp/build/perf/perf-in.o LINK /tmp/build/perf/perf Now one can trace systemwide asking to see backtraces to where those MSRs are being read/written with: # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=AMD64_SVM_AVIC_DOORBELL && msr<=AMD64_SEV_ES_GHCB" ^C# If we use -v (verbose mode) we can see what it does behind the scenes: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=AMD64_SVM_AVIC_DOORBELL && msr<=AMD64_SEV_ES_GHCB" Using CPUID AuthenticAMD-25-21-0 0xc001011b 0xc0010130 New filter for msr:read_msr: (msr>=0xc001011b && msr<=0xc0010130) && (common_pid != 1019953 && common_pid != 3629) 0xc001011b 0xc0010130 New filter for msr:write_msr: (msr>=0xc001011b && msr<=0xc0010130) && (common_pid != 1019953 && common_pid != 3629) mmap size 528384B ^C# Example with a frequent msr: # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2 Using CPUID AuthenticAMD-25-21-0 0x48 New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) 0x48 New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841) mmap size 528384B Looking at the vmlinux_path (8 entries long) symsrc__init: build id mismatch for vmlinux. Using /proc/kcore for kernel data Using /proc/kallsyms for symbols 0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule ([kernel.kallsyms]) futex_wait_queue_me ([kernel.kallsyms]) futex_wait ([kernel.kallsyms]) do_futex ([kernel.kallsyms]) __x64_sys_futex ([kernel.kallsyms]) do_syscall_64 ([kernel.kallsyms]) entry_SYSCALL_64_after_hwframe ([kernel.kallsyms]) __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so) 0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2) do_trace_write_msr ([kernel.kallsyms]) do_trace_write_msr ([kernel.kallsyms]) __switch_to_xtra ([kernel.kallsyms]) __switch_to ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule_idle ([kernel.kallsyms]) do_idle ([kernel.kallsyms]) cpu_startup_entry ([kernel.kallsyms]) secondary_startup_64_no_verify ([kernel.kallsyms]) # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Maxim Levitsky <mlevitsk@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Link: http://lore.kernel.org/lkml/YhVKxaft+z8rpOfy@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-22perf data: Fix double free in perf_session__delete()Alexey Bayduraev
When perf_data__create_dir() fails, it calls close_dir(), but perf_session__delete() also calls close_dir() and since dir.version and dir.nr were initialized by perf_data__create_dir(), a double free occurs. This patch moves the initialization of dir.version and dir.nr after successful initialization of dir.files, that prevents double freeing. This behavior is already implemented in perf_data__open_dir(). Fixes: 145520631130bd64 ("perf data: Add perf_data__(create_dir|close_dir) functions") Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Antonov <alexander.antonov@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexei Budankov <abudankov@huawei.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220218152341.5197-2-alexey.v.bayduraev@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-22KVM: PPC: reserve capability 210 for KVM_CAP_PPC_AIL_MODE_3Nicholas Piggin
Add KVM_CAP_PPC_AIL_MODE_3 to advertise the capability to set the AIL resource mode to 3 with the H_SET_MODE hypercall. This capability differs between processor types and KVM types (PR, HV, Nested HV), and affects guest-visible behaviour. QEMU will implement a cap-ail-mode-3 to control this behaviour[1], and use the KVM CAP if available to determine KVM support[2]. Reviewed-by: Fabiano Rosas <farosas@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-02-21tools/cgroup/slabinfo: update to work with struct slabRoman Gushchin
After the introduction of the dedicated struct slab to describe slab pages by commit d122019bf061 ("mm: Split slab into its own type") and the following removal of the corresponding struct page's fields by commit 07f910f9b729 ("mm: Remove slab from struct page") the memcg_slabinfo tool broke. An attempt to run it produces a trace like this: Traceback (most recent call last): File "/usr/bin/drgn", line 33, in <module> sys.exit(load_entry_point('drgn==0.0.16', 'console_scripts', 'drgn')()) File "/usr/lib64/python3.9/site-packages/drgn/internal/cli.py", line 133, in main runpy.run_path(args.script[0], init_globals=init_globals, run_name="__main__") File "/usr/lib64/python3.9/runpy.py", line 268, in run_path return _run_module_code(code, init_globals, run_name, File "/usr/lib64/python3.9/runpy.py", line 97, in _run_module_code _run_code(code, mod_globals, init_globals, File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code exec(code, run_globals) File "memcg_slabinfo.py", line 226, in <module> main() File "memcg_slabinfo.py", line 199, in main cache = page.slab_cache AttributeError: 'struct page' has no member 'slab_cache' The problem can be fixed by explicitly casting struct page * to struct slab * for slab pages. The tools works as expected with this fix, e.g.: cred_jar 776 776 192 21 1 : tunables 0 0 0 : slabdata 547 547 0 kmalloc-cg-32 6 6 32 128 1 : tunables 0 0 0 : slabdata 9 9 0 files_cache 3 3 832 39 8 : tunables 0 0 0 : slabdata 8 8 0 kmalloc-cg-512 1 1 512 32 4 : tunables 0 0 0 : slabdata 10 10 0 task_struct 10 10 6720 4 8 : tunables 0 0 0 : slabdata 63 63 0 mm_struct 3 3 1664 19 8 : tunables 0 0 0 : slabdata 9 9 0 kmalloc-cg-16 1 1 16 256 1 : tunables 0 0 0 : slabdata 8 8 0 pde_opener 1 1 40 102 1 : tunables 0 0 0 : slabdata 8 8 0 anon_vma_chain 375 375 64 64 1 : tunables 0 0 0 : slabdata 81 81 0 radix_tree_node 3 3 584 28 4 : tunables 0 0 0 : slabdata 419 419 0 dentry 98 98 312 26 2 : tunables 0 0 0 : slabdata 1420 1420 0 btrfs_inode 3 3 2368 13 8 : tunables 0 0 0 : slabdata 730 730 0 signal_cache 3 3 1600 20 8 : tunables 0 0 0 : slabdata 17 17 0 sighand_cache 3 3 2240 14 8 : tunables 0 0 0 : slabdata 20 20 0 filp 90 90 512 32 4 : tunables 0 0 0 : slabdata 95 95 0 anon_vma 214 214 200 20 1 : tunables 0 0 0 : slabdata 162 162 0 kmalloc-cg-1k 1 1 1024 32 8 : tunables 0 0 0 : slabdata 22 22 0 pid 10 10 256 32 2 : tunables 0 0 0 : slabdata 14 14 0 kmalloc-cg-64 2 2 64 64 1 : tunables 0 0 0 : slabdata 8 8 0 kmalloc-cg-96 3 3 96 42 1 : tunables 0 0 0 : slabdata 8 8 0 sock_inode_cache 5 5 1408 23 8 : tunables 0 0 0 : slabdata 29 29 0 UNIX 7 7 1920 17 8 : tunables 0 0 0 : slabdata 21 21 0 inode_cache 36 36 1152 28 8 : tunables 0 0 0 : slabdata 680 680 0 proc_inode_cache 26 26 1224 26 8 : tunables 0 0 0 : slabdata 64 64 0 kmalloc-cg-2k 2 2 2048 16 8 : tunables 0 0 0 : slabdata 9 9 0 v2: change naming and count_partial()/count_free()/for_each_slab() signatures to work with slabs, suggested by Matthew Wilcox Fixes: 07f910f9b729 ("mm: Remove slab from struct page") Reported-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Roman Gushchin <guro@fb.com> Tested-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: Vlastimil Babka <vbabka@suse.cz> Link: https://lore.kernel.org/linux-patches/Yg2cKKnIboNu7j+p@carbon.DHCP.thefacebook.com/
2022-02-21x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCEPeter Zijlstra (Intel)
The RETPOLINE_AMD name is unfortunate since it isn't necessarily AMD only, in fact Hygon also uses it. Furthermore it will likely be sufficient for some Intel processors. Therefore rename the thing to RETPOLINE_LFENCE to better describe what it is. Add the spectre_v2=retpoline,lfence option as an alias to spectre_v2=retpoline,amd to preserve existing setups. However, the output of /sys/devices/system/cpu/vulnerabilities/spectre_v2 will be changed. [ bp: Fix typos, massage. ] Co-developed-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2022-02-20Merge tag 'fs.mount_setattr.v5.17-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux Pull mount_setattr test/doc fixes from Christian Brauner: "This contains a fix for one of the selftests for the mount_setattr syscall to create idmapped mounts, an entry for idmapped mounts for maintainers, and missing kernel documentation for the helper we split out some time ago to get and yield write access to a mount when changing mount properties" * tag 'fs.mount_setattr.v5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: fs: add kernel doc for mnt_{hold,unhold}_writers() MAINTAINERS: add entry for idmapped mounts tests: fix idmapped mount_setattr test
2022-02-19selftests: mptcp: be more conservative with cookie MPJ limitsPaolo Abeni
Since commit 2843ff6f36db ("mptcp: remote addresses fullmesh"), an MPTCP client can attempt creating multiple MPJ subflow simultaneusly. In such scenario the server, when syncookies are enabled, could end-up accepting incoming MPJ syn even above the configured subflow limit, as the such limit can be enforced in a reliable way only after the subflow creation. In case of syncookie, only after the 3rd ack reception. As a consequence the related self-tests case sporadically fails, as it verify that the server always accept the expected number of MPJ syn. Address the issues relaxing the MPJ syn number constrain. Note that the check on the accepted number of MPJ 3rd ack still remains intact. Fixes: 2843ff6f36db ("mptcp: remote addresses fullmesh") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19selftests: mptcp: more robust signal race testPaolo Abeni
The in kernel MPTCP PM implementation can process a single incoming add address option at any given time. In the mentioned test the server can surpass such limit. Let the setup cope with that allowing a faster add_addr retransmission. Fixes: a88c9e496937 ("mptcp: do not block subflows creation on errors") Fixes: f7efc7771eac ("mptcp: drop argument port from mptcp_pm_announce_addr") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/254 Reported-and-tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19selftests: mptcp: improve 'fair usage on close' stabilityPaolo Abeni
The mentioned test has to wait for a subflow creation failure. The current code looks for TCP sockets in TW state and sometimes misses the relevant event. Switch to a more stable check, looking for the associated mib counter. Fixes: 46e967d187ed ("selftests: mptcp: add tests for subflow creation failure") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/257 Reported-and-tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19selftests: mptcp: fix diag instabilityPaolo Abeni
Instead of waiting for an arbitrary amount of time for the MPTCP MP_CAPABLE handshake to complete, explicitly wait for the relevant socket to enter into the established status. Additionally let the data transfer application use the slowest transfer mode available (-r), to cope with very slow host, or high jitter caused by hosting VMs. Fixes: df62f2ec3df6 ("selftests/mptcp: add diag interface tests") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/258 Reported-and-tested-by: Matthieu Baerts <matthieu.baerts@tessares.net> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18perf evlist: Fix failed to use cpu list for uncore eventsZhengjun Xing
The 'perf record' and 'perf stat' commands have supported the option '-C/--cpus' to count or collect only on the list of CPUs provided. Commit 1d3351e631fc34d7 ("perf tools: Enable on a list of CPUs for hybrid") add it to be supported for hybrid. For hybrid support, it checks the cpu list are available on hybrid PMU. But when we test only uncore events(or events not in cpu_core and cpu_atom), there is a bug: Before: # perf stat -C0 -e uncore_clock/clockticks/ sleep 1 failed to use cpu list 0 In this case, for uncore event, its pmu_name is not cpu_core or cpu_atom, so in evlist__fix_hybrid_cpus, perf_pmu__find_hybrid_pmu should return NULL,both events_nr and unmatched_count should be 0 ,then the cpu list check function evlist__fix_hybrid_cpus return -1 and the error "failed to use cpu list 0" will happen. Bypass "events_nr=0" case then the issue is fixed. After: # perf stat -C0 -e uncore_clock/clockticks/ sleep 1 Performance counter stats for 'CPU(s) 0': 195,476,873 uncore_clock/clockticks/ 1.004518677 seconds time elapsed When testing with at least one core event and uncore events, it has no issue. # perf stat -C0 -e cpu_core/cpu-cycles/,uncore_clock/clockticks/ sleep 1 Performance counter stats for 'CPU(s) 0': 5,993,774 cpu_core/cpu-cycles/ 301,025,912 uncore_clock/clockticks/ 1.003964934 seconds time elapsed Fixes: 1d3351e631fc34d7 ("perf tools: Enable on a list of CPUs for hybrid") Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: alexander.shishkin@intel.com Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jin Yao <yao.jin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220218093127.1844241-1-zhengjun.xing@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-18perf test: Skip failing sigtrap test for arm+aarch64John Garry
Skip the Sigtrap test for arm + arm64, same as was done for s390 in commit a840974e96fd ("perf test: Test 73 Sig_trap fails on s390"). For this, reuse BP_SIGNAL_IS_SUPPORTED - meaning that the arch can use BP to generate signals - instead of BP_ACCOUNT_IS_SUPPORTED, which is appropriate. As described by Will at [0], in the test we get stuck in a loop of handling the HW breakpoint exception and never making progress. GDB handles this by stepping over the faulting instruction, but with perf the kernel is expected to handle the step (which it doesn't for arm). Dmitry made an attempt to get this work, also mentioned in the same thread as [0], which was appreciated. But the best thing to do is skip the test for now. [0] https://lore.kernel.org/linux-perf-users/20220118124343.GC98966@leoy-ThinkPad-X240s/T/#m13b06c39d2a5100d340f009435df6f4d8ee57b5a Fixes: 5504f67944484495 ("perf test sigtrap: Add basic stress test for sigtrap handling") Signed-off-by: John Garry <john.garry@huawei.com> Tested-by: Leo Yan <leo.yan@linaro.org> Acked-by: Marco Elver <elver@google.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Marco Elver <elver@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: linux-arm-kernel@lists.infradead.org Cc: linux@armlinux.org.uk Link: https://lore.kernel.org/r/1645176813-202756-1-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-17Merge tag 'linux-kselftest-fixes-5.17-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fixes from Shuah Khan: "Fixes to ftrace, exec, and seccomp tests build, run-time and install bugs. These bugs are in the way of running the tests" * tag 'linux-kselftest-fixes-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/ftrace: Do not trace do_softirq because of PREEMPT_RT selftests/seccomp: Fix seccomp failure by adding missing headers selftests/exec: Add non-regular to TEST_GEN_PROGS
2022-02-17Merge remote-tracking branch 'torvalds/master' into perf/coreArnaldo Carvalho de Melo
To pick up fixes from perf/urgent that recently got merged. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-17Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpfJakub Kicinski
Alexei Starovoitov says: ==================== pull-request: bpf 2022-02-17 We've added 8 non-merge commits during the last 7 day(s) which contain a total of 8 files changed, 119 insertions(+), 15 deletions(-). The main changes are: 1) Add schedule points in map batch ops, from Eric. 2) Fix bpf_msg_push_data with len 0, from Felix. 3) Fix crash due to incorrect copy_map_value, from Kumar. 4) Fix crash due to out of bounds access into reg2btf_ids, from Kumar. 5) Fix a bpf_timer initialization issue with clang, from Yonghong. * https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Add schedule points in batch ops bpf: Fix crash due to out of bounds access into reg2btf_ids. selftests: bpf: Check bpf_msg_push_data return value bpf: Fix a bpf_timer initialization issue bpf: Emit bpf_timer in vmlinux BTF selftests/bpf: Add test for bpf_timer overwriting crash bpf: Fix crash due to incorrect copy_map_value bpf: Do not try bpf_msg_push_data with len 0 ==================== Link: https://lore.kernel.org/r/20220217190000.37925-1-alexei.starovoitov@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17Merge tag 'net-5.17-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from wireless and netfilter. Current release - regressions: - dsa: lantiq_gswip: fix use after free in gswip_remove() - smc: avoid overwriting the copies of clcsock callback functions Current release - new code bugs: - iwlwifi: - fix use-after-free when no FW is present - mei: fix the pskb_may_pull check in ipv4 - mei: retry mapping the shared area - mvm: don't feed the hardware RFKILL into iwlmei Previous releases - regressions: - ipv6: mcast: use rcu-safe version of ipv6_get_lladdr() - tipc: fix wrong publisher node address in link publications - iwlwifi: mvm: don't send SAR GEO command for 3160 devices, avoid FW assertion - bgmac: make idm and nicpm resource optional again - atl1c: fix tx timeout after link flap Previous releases - always broken: - vsock: remove vsock from connected table when connect is interrupted by a signal - ping: change destination interface checks to match raw sockets - crypto: af_alg - get rid of alg_memory_allocated to avoid confusing semantics (and null-deref) after SO_RESERVE_MEM was added - ipv6: make exclusive flowlabel checks per-netns - bonding: force carrier update when releasing slave - sched: limit TC_ACT_REPEAT loops - bridge: multicast: notify switchdev driver whenever MC processing gets disabled because of max entries reached - wifi: brcmfmac: fix crash in brcm_alt_fw_path when WLAN not found - iwlwifi: fix locking when "HW not ready" - phy: mediatek: remove PHY mode check on MT7531 - dsa: mv88e6xxx: flush switchdev FDB workqueue before removing VLAN - dsa: lan9303: - fix polarity of reset during probe - fix accelerated VLAN handling" * tag 'net-5.17-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits) bonding: force carrier update when releasing slave nfp: flower: netdev offload check for ip6gretap ipv6: fix data-race in fib6_info_hw_flags_set / fib6_purge_rt ipv4: fix data races in fib_alias_hw_flags_set net: dsa: lan9303: add VLAN IDs to master device net: dsa: lan9303: handle hwaccel VLAN tags vsock: remove vsock from connected table when connect is interrupted by a signal Revert "net: ethernet: bgmac: Use devm_platform_ioremap_resource_byname" ping: fix the dif and sdif check in ping_lookup net: usb: cdc_mbim: avoid altsetting toggling for Telit FN990 net: sched: limit TC_ACT_REPEAT loops tipc: fix wrong notification node addresses net: dsa: lantiq_gswip: fix use after free in gswip_remove() ipv6: per-netns exclusive flowlabel checks net: bridge: multicast: notify switchdev driver whenever MC processing gets disabled CDC-NCM: avoid overflow in sanity checking mctp: fix use after free net: mscc: ocelot: fix use-after-free in ocelot_vlan_del() bonding: fix data-races around agg_select_timer dpaa2-eth: Initialize mutex used in one step timestamping path ...
2022-02-17Merge tag 'perf-tools-fixes-for-v5.17-2022-02-17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux Pull perf tools fixes from Arnaldo Carvalho de Melo: - Fix corrupt inject files when only last branch option is enabled with ARM CoreSight ETM - Fix use-after-free for realloc(..., 0) in libsubcmd, found by gcc 12 - Defer freeing string after possible strlen() on it in the BPF loader, found by gcc 12 - Avoid early exit in 'perf trace' due SIGCHLD from non-workload processes - Fix arm64 perf_event_attr 'perf test's wrt --call-graph initialization - Fix libperf 32-bit build for 'perf test' wrt uint64_t printf - Fix perf_cpu_map__for_each_cpu macro in libperf, providing access to the CPU iterator - Sync linux/perf_event.h UAPI with the kernel sources - Update Jiri Olsa's email address in MAINTAINERS * tag 'perf-tools-fixes-for-v5.17-2022-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: perf bpf: Defer freeing string after possible strlen() on it perf test: Fix arm64 perf_event_attr tests wrt --call-graph initialization libsubcmd: Fix use-after-free for realloc(..., 0) libperf: Fix perf_cpu_map__for_each_cpu macro perf cs-etm: Fix corrupt inject files when only last branch option is enabled perf cs-etm: No-op refactor of synth opt usage libperf: Fix 32-bit build for tests uint64_t printf tools headers UAPI: Sync linux/perf_event.h with the kernel sources perf trace: Avoid early exit due SIGCHLD from non-workload processes MAINTAINERS: Update Jiri's email address
2022-02-17perf bpf: Defer freeing string after possible strlen() on itArnaldo Carvalho de Melo
This was detected by the gcc in Fedora Rawhide's gcc: 50 11.01 fedora:rawhide : FAIL gcc version 12.0.1 20220205 (Red Hat 12.0.1-0) (GCC) inlined from 'bpf__config_obj' at util/bpf-loader.c:1242:9: util/bpf-loader.c:1225:34: error: pointer 'map_opt' may be used after 'free' [-Werror=use-after-free] 1225 | *key_scan_pos += strlen(map_opt); | ^~~~~~~~~~~~~~~ util/bpf-loader.c:1223:9: note: call to 'free' here 1223 | free(map_name); | ^~~~~~~~~~~~~~ cc1: all warnings being treated as errors So do the calculations on the pointer before freeing it. Fixes: 04f9bf2bac72480c ("perf bpf-loader: Add missing '*' for key_scan_pos") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang ShaoBo <bobo.shaobowang@huawei.com> Link: https://lore.kernel.org/lkml/Yg1VtQxKrPpS3uNA@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16perf test: Fix arm64 perf_event_attr tests wrt --call-graph initializationGerman Gomez
The struct perf_event_attr is initialised differently in Arm64 when recording in call-graph fp mode, so update the relevant tests, and add two extra arm64-only tests. Before: $ perf test 17 -v 17: Setup struct perf_event_attr [...] running './tests/attr/test-record-graph-default' expected sample_type=295, got 4391 expected sample_regs_user=0, got 1073741824 FAILED './tests/attr/test-record-graph-default' - match failure test child finished with -1 ---- end ---- After: [...] running './tests/attr/test-record-graph-default-aarch64' test limitation 'aarch64' running './tests/attr/test-record-graph-fp-aarch64' test limitation 'aarch64' running './tests/attr/test-record-graph-default' test limitation '!aarch64' excluded architecture list ['aarch64'] skipped [aarch64] './tests/attr/test-record-graph-default' running './tests/attr/test-record-graph-fp' test limitation '!aarch64' excluded architecture list ['aarch64'] skipped [aarch64] './tests/attr/test-record-graph-fp' [...] Fixes: 7248e308a5758761 ("perf tools: Record ARM64 LR register automatically") Signed-off-by: German Gomez <german.gomez@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Truong <alexandre.truong@arm.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Fastabend <john.fastabend@gmail.com> Cc: KP Singh <kpsingh@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Martin KaFai Lau <kafai@fb.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Song Liu <songliubraving@fb.com> Cc: Yonghong Song <yhs@fb.com> Link: http://lore.kernel.org/lkml/20220125104435.2737-1-german.gomez@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16libsubcmd: Fix use-after-free for realloc(..., 0)Kees Cook
GCC 12 correctly reports a potential use-after-free condition in the xrealloc helper. Fix the warning by avoiding an implicit "free(ptr)" when size == 0: In file included from help.c:12: In function 'xrealloc', inlined from 'add_cmdname' at help.c:24:2: subcmd-util.h:56:23: error: pointer may be used after 'realloc' [-Werror=use-after-free] 56 | ret = realloc(ptr, size); | ^~~~~~~~~~~~~~~~~~ subcmd-util.h:52:21: note: call to 'realloc' here 52 | void *ret = realloc(ptr, size); | ^~~~~~~~~~~~~~~~~~ subcmd-util.h:58:31: error: pointer may be used after 'realloc' [-Werror=use-after-free] 58 | ret = realloc(ptr, 1); | ^~~~~~~~~~~~~~~ subcmd-util.h:52:21: note: call to 'realloc' here 52 | void *ret = realloc(ptr, size); | ^~~~~~~~~~~~~~~~~~ Fixes: 2f4ce5ec1d447beb ("perf tools: Finalize subcmd independence") Reported-by: Valdis Klētnieks <valdis.kletnieks@vt.edu> Signed-off-by: Kees Kook <keescook@chromium.org> Tested-by: Valdis Klētnieks <valdis.kletnieks@vt.edu> Tested-by: Justin M. Forbes <jforbes@fedoraproject.org> Acked-by: Josh Poimboeuf <jpoimboe@redhat.com> Cc: linux-hardening@vger.kernel.org Cc: Valdis Klētnieks <valdis.kletnieks@vt.edu> Link: http://lore.kernel.org/lkml/20220213182443.4037039-1-keescook@chromium.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16libperf: Fix perf_cpu_map__for_each_cpu macroJiri Olsa
Tzvetomir Stoyanov reported an issue with using macro perf_cpu_map__for_each_cpu using private perf_cpu object. The issue is caused by recent change that wrapped cpu in struct perf_cpu to distinguish it from cpu indexes. We need to make struct perf_cpu public. Add a simple test for using the perf_cpu_map__for_each_cpu macro. Fixes: 6d18804b963b78dc ("perf cpumap: Give CPUs their own type") Reported-by: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220215153713.31395-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16perf cs-etm: Fix corrupt inject files when only last branch option is enabledJames Clark
'perf inject' with Coresight data generates files that cannot be opened when only the last branch option is specified: perf inject -i perf.data --itrace=l -o inject.data perf script -i inject.data 0x33faa8 [0x8]: failed to process type: 9 [Bad address] This is because cs_etm__synth_instruction_sample() is called even when the sample type for instructions hasn't been setup. Last branch records are attached to instruction samples so it doesn't make sense to generate them when --itrace=i isn't specified anyway. This change disables all calls of cs_etm__synth_instruction_sample() unless --itrace=i is specified, resulting in a file with no samples if only --itrace=l is provided, rather than a bad file. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220210200620.1227232-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16perf cs-etm: No-op refactor of synth opt usageJames Clark
sample_branches and sample_instructions are already saved in the synth_opts struct. Other usages like synth_opts.last_branch don't save a value, so make this more consistent by always going through synth_opts and not saving duplicate values. Reviewed-by: Leo Yan <leo.yan@linaro.org> Signed-off-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20220210200620.1227232-1-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16libperf: Fix 32-bit build for tests uint64_t printfRob Herring
Commit a7f3713f6bf207e6 ("libperf tests: Add test_stat_multiplexing test") added printf's of 64-bit ints using %lu which doesn't work on 32-bit builds: tests/test-evlist.c:529:29: error: format ‘%lu’ expects argument of type \ ‘long unsigned int’, but argument 4 has type ‘uint64_t’ {aka ‘long long unsigned int’} [-Werror=format=] Use PRIu64 instead which works on both 32-bit and 64-bit systems. Fixes: a7f3713f6bf207e6 ("libperf tests: Add test_stat_multiplexing test") Signed-off-by: Rob Herring <robh@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com> Link: https://lore.kernel.org/r/20220201213903.699656-1-robh@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16tools headers UAPI: Sync linux/perf_event.h with the kernel sourcesArnaldo Carvalho de Melo
To pick the trivial change in: ddecd22878601a60 ("perf: uapi: Document perf_event_attr::sig_data truncation on 32 bit architectures") Just adds a comment. This silences this perf build warning: Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h' diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h Cc: Marco Elver <elver@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/lkml/ Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16perf trace: Avoid early exit due SIGCHLD from non-workload processesChangbin Du
The function trace__symbols_init() runs "perf-read-vdso32" and that ends up with a SIGCHLD delivered to 'perf'. And this SIGCHLD make perf exit early. 'perf trace' should exit only if the SIGCHLD is from our workload process. So let's use sigaction() instead of signal() to match such condition. Committer notes: Use memset to zero the 'struct sigaction' variable as the '= { 0 }' method isn't accepted in many compiler versions, e.g.: 4 34.02 alpine:3.6 : FAIL clang version 4.0.0 (tags/RELEASE_400/final) builtin-trace.c:4897:35: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] struct sigaction sigchld_act = { 0 }; ^ {} builtin-trace.c:4897:37: error: missing field 'sa_mask' initializer [-Werror,-Wmissing-field-initializers] struct sigaction sigchld_act = { 0 }; ^ 2 errors generated. 6 32.60 alpine:3.8 : FAIL gcc version 6.4.0 (Alpine 6.4.0) builtin-trace.c:4897:35: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] struct sigaction sigchld_act = { 0 }; ^ {} builtin-trace.c:4897:37: error: missing field 'sa_mask' initializer [-Werror,-Wmissing-field-initializers] struct sigaction sigchld_act = { 0 }; ^ 2 errors generated. 7 34.82 alpine:3.9 : FAIL gcc version 8.3.0 (Alpine 8.3.0) builtin-trace.c:4897:35: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces] struct sigaction sigchld_act = { 0 }; ^ {} builtin-trace.c:4897:37: error: missing field 'sa_mask' initializer [-Werror,-Wmissing-field-initializers] struct sigaction sigchld_act = { 0 }; ^ 2 errors generated. Signed-off-by: Changbin Du <changbin.du@gmail.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20220208140725.3947-1-changbin.du@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16perf report: Add "addr_from" and "addr_to" sort dimensionsStephane Eranian
With the existing symbol_from/symbol_to, branches captured in the same function would be collapsed into a single function if the latencies associated with the each branch (cycles) were all the same. That is the case on Intel Broadwell, for instance. Since Intel Skylake, the latency is captured by hardware and therefore is used to disambiguate branches. Add addr_from/addr_to sort dimensions to sort branches based on their addresses and not the function there are in. The output is still the function name but the offset within the function is provided to uniquely identify each branch. These new sort dimensions also help with annotate because they create different entries in the histogram which, in turn, generates proper branch annotations. Here is an example using AMD's branch sampling: $ perf record -a -b -c 1000037 -e cpu/branch-brs/ test_prg $ perf report Samples: 6M of event 'cpu/branch-brs/', Event count (approx.): 6901276 Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycle 99.65% test_prg test_prg [.] test_thread [.] test_thread - 0.02% test_prg [kernel.vmlinux] [k] asm_sysvec_apic_timer_interrupt [k] error_entry - $ perf report -F overhead,comm,dso,addr_from,addr_to Samples: 6M of event 'cpu/branch-brs/', Event count (approx.): 6901276 Overhead Command Shared Object Source Address Target Address 4.22% test_prg test_prg [.] test_thread+0x3c [.] test_thread+0x4 4.13% test_prg test_prg [.] test_thread+0x4 [.] test_thread+0x3a 4.09% test_prg test_prg [.] test_thread+0x3a [.] test_thread+0x6 4.08% test_prg test_prg [.] test_thread+0x2 [.] test_thread+0x3c 4.06% test_prg test_prg [.] test_thread+0x3e [.] test_thread+0x2 3.87% test_prg test_prg [.] test_thread+0x6 [.] test_thread+0x38 3.84% test_prg test_prg [.] test_thread [.] test_thread+0x3e 3.76% test_prg test_prg [.] test_thread+0x1e [.] test_thread 3.76% test_prg test_prg [.] test_thread+0x38 [.] test_thread+0x8 3.56% test_prg test_prg [.] test_thread+0x22 [.] test_thread+0x1e 3.54% test_prg test_prg [.] test_thread+0x8 [.] test_thread+0x36 3.47% test_prg test_prg [.] test_thread+0x1c [.] test_thread+0x22 3.45% test_prg test_prg [.] test_thread+0x36 [.] test_thread+0xa 3.28% test_prg test_prg [.] test_thread+0x24 [.] test_thread+0x1c 3.25% test_prg test_prg [.] test_thread+0xa [.] test_thread+0x34 3.24% test_prg test_prg [.] test_thread+0x1a [.] test_thread+0x24 3.20% test_prg test_prg [.] test_thread+0x34 [.] test_thread+0xc 3.04% test_prg test_prg [.] test_thread+0x26 [.] test_thread+0x1a 3.01% test_prg test_prg [.] test_thread+0xc [.] test_thread+0x32 2.98% test_prg test_prg [.] test_thread+0x18 [.] test_thread+0x26 2.94% test_prg test_prg [.] test_thread+0x32 [.] test_thread+0xe 2.76% test_prg test_prg [.] test_thread+0x28 [.] test_thread+0x18 2.73% test_prg test_prg [.] test_thread+0xe [.] test_thread+0x30 2.67% test_prg test_prg [.] test_thread+0x30 [.] test_thread+0x10 2.67% test_prg test_prg [.] test_thread+0x16 [.] test_thread+0x28 2.46% test_prg test_prg [.] test_thread+0x10 [.] test_thread+0x2e 2.44% test_prg test_prg [.] test_thread+0x2a [.] test_thread+0x16 2.38% test_prg test_prg [.] test_thread+0x14 [.] test_thread+0x2a 2.32% test_prg test_prg [.] test_thread+0x2e [.] test_thread+0x12 2.28% test_prg test_prg [.] test_thread+0x12 [.] test_thread+0x2c 2.16% test_prg test_prg [.] test_thread+0x2c [.] test_thread+0x14 0.02% test_prg [kernel.vmlinux] [k] asm_sysvec_apic_ti+0x5 [k] error_entry Signed-off-by: Stephane Eranian <eranian@google.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kim Phillips <kim.phillips@amd.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <songliubraving@fb.com> Link: http://lore.kernel.org/lkml/20220208211637.2221872-13-eranian@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16perf tools: Fix spelling mistake "commpressor" -> "compressor"Colin Ian King
There is a spelling mistake in a debug message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: http://lore.kernel.org/lkml/20220214093547.44590-1-colin.i.king@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16perf annotate: Remove redundant 'ret' variabletangmeng
Return the result from hist_entry_iter__add() directly instead of taking this in another redundant variable. Signed-off-by: tangmeng <tangmeng@uniontech.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220216030425.27779-1-tangmeng@uniontech.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16perf top: Remove redundant 'err' variabletangmeng
The variable 'err' in the perf_event__process_sample() is only used in the only one judgment statement, it is not used in other places. So, use the return value from hist_entry_iter__add() directly instead of taking this in another redundant variable. Signed-off-by: tangmeng <tangmeng@uniontech.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220216030425.27779-2-tangmeng@uniontech.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf test: Make metric testing more robustIan Rogers
When testing metric expressions we fake counter values from 1 going upward. For some metrics this can yield negative values that are clipped to zero, and then cause divide by zero failures. Such clipping is questionable but may be a result of tools automatically generating metrics. A workaround for this case is to try a second time with counter values going in the opposite direction. This case was seen in a metric like: event1 / max(event2 - event3, 0) But it may also happen in more sensible metrics like: event1 / (event2 + event3 - 1 - event4) Reviewed-by: John Garry <john.garry@huawei.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20211223185622.3435128-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf cs-etm: Update deduction of TRCCONFIGR register for branch broadcastJames Clark
Now that a config flag for branch broadcast has been added, take it into account when trying to deduce what the driver would have programmed the TRCCONFIGR register to. Reviewed-by: Leo Yan <leo.yan@linaro.org> Reviewed-by: Mike Leach <mike.leach@linaro.org> Reviewed-by: Suzuki Poulouse <suzuki.poulose@arm.com> Signed-off-by: James Clark <james.clark@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: John Garry <john.garry@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mathieu Poirier <mathieu.poirier@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Cc: linux-doc@vger.kernel.org Link: https://lore.kernel.org/r/20220113091056.1297982-4-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf c2c: Replace bitmap_weight() with bitmap_empty() where appropriateYury Norov
Some code in builtin-c2c.c calls bitmap_weight() to check if any bit of a given bitmap is set. It's better to use bitmap_empty() in that case because bitmap_empty() stops traversing the bitmap as soon as it finds first set bit, while bitmap_weight() counts all bits unconditionally. Signed-off-by: Yury Norov <yury.norov@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Klimov <aklimov@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: David Laight <david.laight@aculab.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Emil Renner Berthing <kernel@esmil.dk> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Joe Perches <joe@perches.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com> Cc: Michał Mirosław <mirq-linux@rere.qmqm.pl> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Link: http://lore.kernel.org/lkml/20220123183925.1052919-13-yury.norov@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf tui: Only support --tui with slangIan Rogers
Make the --tui command line flags dependent HAVE_SLANG_SUPPORT. This was reported as confusing in: https://lore.kernel.org/linux-perf-users/YevaTkzdXmFKdGpc@zx-spectrum.none/ Reported-by: xaizek <xaizek@posteo.net> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: xaizek <xaizek@posteo.net> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lore.kernel.org/lkml/20220123191849.3655855-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf intel-pt: Add documentation for Event Trace and TNT disableAdrian Hunter
Add documentation for Event Trace and TNT disable to the perf Intel PT man page. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-26-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf scripts python: export-to-postgresql.py: Export all sample flagsAdrian Hunter
Add sample flags to the PostgreSQL database definition and export. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-25-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf scripts python: export-to-sqlite.py: Export all sample flagsAdrian Hunter
Add sample flags to the SQLite database definition and export. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-24-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf scripting python: Add all sample flags to DB exportAdrian Hunter
Currently, the transaction flag (x) is kept separate from branch flags. Instead of doing the same for the interrupt disabled flags (D and t), add all flags so that new flags will not need to be handled separately in the future. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-23-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf scripts python: intel-pt-events.py: Add Event TraceAdrian Hunter
Add Event Trace to the intel-pt-events.py script. This shows how to unpack the raw data from the new sample events in a Python script. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-22-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf script: Display new D (Intr Disabled) and t (Intr Toggle) flagsAdrian Hunter
Amend the display to include D and t flags in the same way as the x flag. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-21-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf script: Display Intel PT iflag synthesized eventAdrian Hunter
Similar to other Intel PT synth events, display changes to the interrupt flag represented by the MODE.Exec packet. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-20-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf script: Display Intel PT CFE (Control Flow Event) / EVD (Event Data) ↵Adrian Hunter
synthesized event Similar to other Intel PT synth events, display Event Trace events recorded by CFE / EVD packets. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-19-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf intel-pt: Force 'quick' mode when TNT (Taken/Not-Taken packet) is disabledAdrian Hunter
It is not possible to walk the executable code without TNT packets, so force 'quick' mode when TNT is disabled, because 'quick' mode does not walk the code. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-18-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf intel-pt: Synthesize new D (Intr Disabled) and t (Intr Toggle) flagsAdrian Hunter
Update sample flags to represent the state and changes to the interrupt flag. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-17-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf intel-pt: Synthesize iflag eventAdrian Hunter
Synthesize an attribute event and sample events for changes to the interrupt flag represented by the MODE.Exec packet. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-16-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf intel-pt: Synthesize CFE (Control Flow Event) / EVD (Event Data) eventAdrian Hunter
Synthesize an attribute event and sample events for Intel PT Event Trace events represented by CFE and EVD packets. Committer notes: Make 'struct perf_synth_intel_evd evd[]' evd[0] at the end of 'struct perf_synth_intel_evt' as it is breaking the build with in many compilers with (e.g. clang version 13.0.0 (Fedora 13.0.0-3.fc35)): util/intel-pt.c:2213:31: error: field 'cfe' with variable sized type 'struct perf_synth_intel_evt' not at the end of a struct or class is a GNU extension [-Werror,-Wgnu-variable-sized-type-not-at-end] struct perf_synth_intel_evt cfe; ^ Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-15-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-15perf intel-pt: Record Event Trace capability flagAdrian Hunter
The change to the MODE.Exec packet means processing must distinguish between the old and new cases. Record the Event Trace capability flag to make that possible. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Link: https://lore.kernel.org/r/20220124084201.2699795-14-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>