Age | Commit message (Collapse) | Author |
|
commit d4a98b45fbe6d06f4b79ed90d0bb05ced8674c23 upstream.
The trace could be misleading if trace errors are not taken into
account, so display them also by adding the itrace "e" option.
Note --call-trace and --call-ret-trace already add the itrace "e"
option.
Fixes: b585ebdb5912cf14 ("perf script: Add --insn-trace for instruction decoding")
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240315071334.3478-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit bb69c912c4e8005cf1ee6c63782d2fc28838dee2 upstream.
If the --itrace option is used more than once, the options are
combined, but "i" and "y" (sub-)options can be corrupted because
itrace_do_parse_synth_opts() incorrectly overwrites the period type and
period with default values.
For example, with:
--itrace=i0ns --itrace=e
The processing of "--itrace=e", resets the "i" period from 0 nanoseconds
to the default 100 microseconds.
Fix by performing the default setting of period type and period only if
"i" or "y" are present in the currently processed --itrace value.
Fixes: f6986c95af84ff2a ("perf session: Add instruction tracing options")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240315071334.3478-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5b3cde198878b2f3269d5e7efbc0d514899b1fd8 upstream.
This reverts commit 7d1405c71df21f6c394b8a885aa8a133f749fa22.
This causes segfaults in some cases, as reported by Milian:
```
sudo /usr/bin/perf record -z --call-graph dwarf -e cycles -e
raw_syscalls:sys_enter ls
...
[ perf record: Woken up 3 times to write data ]
malloc(): invalid next size (unsorted)
Aborted
```
Backtrace with GDB + debuginfod:
```
malloc(): invalid next size (unsorted)
Thread 1 "perf" received signal SIGABRT, Aborted.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6,
no_tid=no_tid@entry=0) at pthread_kill.c:44
Downloading source file /usr/src/debug/glibc/glibc/nptl/pthread_kill.c
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO
(ret) : 0;
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>,
signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007ffff6ea8eb3 in __pthread_kill_internal (threadid=<optimized out>,
signo=6) at pthread_kill.c:78
#2 0x00007ffff6e50a30 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/
raise.c:26
#3 0x00007ffff6e384c3 in __GI_abort () at abort.c:79
#4 0x00007ffff6e39354 in __libc_message_impl (fmt=fmt@entry=0x7ffff6fc22ea
"%s\n") at ../sysdeps/posix/libc_fatal.c:132
#5 0x00007ffff6eb3085 in malloc_printerr (str=str@entry=0x7ffff6fc5850
"malloc(): invalid next size (unsorted)") at malloc.c:5772
#6 0x00007ffff6eb657c in _int_malloc (av=av@entry=0x7ffff6ff6ac0
<main_arena>, bytes=bytes@entry=368) at malloc.c:4081
#7 0x00007ffff6eb877e in __libc_calloc (n=<optimized out>,
elem_size=<optimized out>) at malloc.c:3754
#8 0x000055555569bdb6 in perf_session.do_write_header ()
#9 0x00005555555a373a in __cmd_record.constprop.0 ()
#10 0x00005555555a6846 in cmd_record ()
#11 0x000055555564db7f in run_builtin ()
#12 0x000055555558ed77 in main ()
```
Valgrind memcheck:
```
==45136== Invalid write of size 8
==45136== at 0x2B38A5: perf_event__synthesize_id_sample (in /usr/bin/perf)
==45136== by 0x157069: __cmd_record.constprop.0 (in /usr/bin/perf)
==45136== by 0x15A845: cmd_record (in /usr/bin/perf)
==45136== by 0x201B7E: run_builtin (in /usr/bin/perf)
==45136== by 0x142D76: main (in /usr/bin/perf)
==45136== Address 0x6a866a8 is 0 bytes after a block of size 40 alloc'd
==45136== at 0x4849BF3: calloc (vg_replace_malloc.c:1675)
==45136== by 0x3574AB: zalloc (in /usr/bin/perf)
==45136== by 0x1570E0: __cmd_record.constprop.0 (in /usr/bin/perf)
==45136== by 0x15A845: cmd_record (in /usr/bin/perf)
==45136== by 0x201B7E: run_builtin (in /usr/bin/perf)
==45136== by 0x142D76: main (in /usr/bin/perf)
==45136==
==45136== Syscall param write(buf) points to unaddressable byte(s)
==45136== at 0x575953D: __libc_write (write.c:26)
==45136== by 0x575953D: write (write.c:24)
==45136== by 0x35761F: ion (in /usr/bin/perf)
==45136== by 0x357778: writen (in /usr/bin/perf)
==45136== by 0x1548F7: record__write (in /usr/bin/perf)
==45136== by 0x15708A: __cmd_record.constprop.0 (in /usr/bin/perf)
==45136== by 0x15A845: cmd_record (in /usr/bin/perf)
==45136== by 0x201B7E: run_builtin (in /usr/bin/perf)
==45136== by 0x142D76: main (in /usr/bin/perf)
==45136== Address 0x6a866a8 is 0 bytes after a block of size 40 alloc'd
==45136== at 0x4849BF3: calloc (vg_replace_malloc.c:1675)
==45136== by 0x3574AB: zalloc (in /usr/bin/perf)
==45136== by 0x1570E0: __cmd_record.constprop.0 (in /usr/bin/perf)
==45136== by 0x15A845: cmd_record (in /usr/bin/perf)
==45136== by 0x201B7E: run_builtin (in /usr/bin/perf)
==45136== by 0x142D76: main (in /usr/bin/perf)
==45136==
-----
Closes: https://lore.kernel.org/linux-perf-users/23879991.0LEYPuXRzz@milian-workstation/
Reported-by: Milian Wolff <milian.wolff@kdab.com>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@kernel.org # 6.8+
Link: https://lore.kernel.org/lkml/Zl9ksOlHJHnKM70p@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit d9c5f5f94c2d356fdf3503f7fcaf254512bc032d ]
Sys events are eagerly loaded as each event has a compat option that may
mean the event is or isn't associated with the PMU.
These shouldn't be counted as loaded_json_events as that is used for
JSON events matching the CPUID that may or may not have been loaded. The
mismatch causes issues on ARM64 that uses sys events.
Fixes: e6ff1eed3584362d ("perf pmu: Lazily add JSON events")
Closes: https://lore.kernel.org/lkml/20240510024729.1075732-1-justin.he@arm.com/
Reported-by: Jia He <justin.he@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240511003601.2666907-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7b6dd7a923281a7ccb980a0f768d6926721eb3cc ]
Perf event names aren't case sensitive. For sysfs events the entire
directory of events is read then iterated comparing names in a case
insensitive way, most often to see if an event is present.
Consider:
$ perf stat -e inst_retired.any true
The event inst_retired.any may be present in any PMU, so every PMU's
sysfs events are loaded and then searched with strcasecmp to see if
any match. This event is only present on the cpu PMU as a JSON event
so a lot of events were loaded from sysfs unnecessarily just to prove
an event didn't exist there.
This change avoids loading all the events by assuming sysfs event
names are always either lower or uppercase. It uses file exists and
only loads the events when the desired event is present.
For the example above, the number of openat calls measured by 'perf
trace' on a tigerlake laptop goes from 325 down to 255. The reduction
will be larger for machines with many PMUs, particularly replicated
uncore PMUs.
Ensure pmu_aliases_parse() is called before all uses of the aliases
list, but remove some "pmu->sysfs_aliases_loaded" tests as they are now
part of the function.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240502213507.2339733-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: d9c5f5f94c2d ("perf pmu: Count sys and cpuid JSON events separately")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 67ee8e71daabb8632931b7559e5c8a4b69a427f8 ]
Add perf_pmu__name_from_config that does a reverse lookup from a
config number to an alias name. The lookup is expensive as the config
is computed for every alias by filling in a perf_event_attr, but this
is only done when verbose output is enabled. The lookup also only
considers config, and not config1, config2 or config3.
An example of the output:
$ perf stat -vv -e data_read true
...
perf_event_attr:
type 24 (uncore_imc_free_running_0)
size 136
config 0x20ff (data_read)
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
...
Committer notes:
Fix the python binding build by adding dummies for not strictly
needed perf_pmu__name_from_config() and perf_pmus__find_by_type().
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20240308001915.4060155-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: d9c5f5f94c2d ("perf pmu: Count sys and cpuid JSON events separately")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7093882067e2e2f88d3449c35c5f0f3f566c8a26 ]
When dumping a perf_event_attr, use pmus to find the PMU and its name
by the type number. This allows dynamically added PMUs to be described.
Before:
$ perf stat -vv -e data_read true
...
perf_event_attr:
type 24
size 136
config 0x20ff
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
...
After:
$ perf stat -vv -e data_read true
...
perf_event_attr:
type 24 (uncore_imc_free_running_0)
size 136
config 0x20ff
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
exclude_guest 1
...
However, it also means that when we have a PMU name we prefer it to a
hard coded name:
Before:
$ perf stat -vv -e faults true
...
perf_event_attr:
type 1 (PERF_TYPE_SOFTWARE)
size 136
config 0x2 (PERF_COUNT_SW_PAGE_FAULTS)
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
...
After:
$ perf stat -vv -e faults true
...
perf_event_attr:
type 1 (software)
size 136
config 0x2 (PERF_COUNT_SW_PAGE_FAULTS)
sample_type IDENTIFIER
read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
disabled 1
inherit 1
enable_on_exec 1
exclude_guest 1
...
It feels more consistent to do this, rather than only prefer a PMU
name when a hard coded name isn't available.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20240308001915.4060155-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: d9c5f5f94c2d ("perf pmu: Count sys and cpuid JSON events separately")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 193a9e30207f54777ff42d0d8be8389edc522277 ]
On an Intel tigerlake laptop a metric like:
{
"BriefDescription": "Test",
"MetricExpr": "imc_free_running@data_read@ + imc_free_running@data_write@",
"MetricGroup": "Test",
"MetricName": "Test",
"ScaleUnit": "6.103515625e-5MiB"
},
Will have 4 events:
uncore_imc_free_running_0/data_read/
uncore_imc_free_running_0/data_write/
uncore_imc_free_running_1/data_read/
uncore_imc_free_running_1/data_write/
If aggregration is disabled with metric-only 2 column headers are
needed:
$ perf stat -M test --metric-only -A -a sleep 1
Performance counter stats for 'system wide':
MiB Test MiB Test
CPU0 1821.0 1820.5
But when not, the counts aggregated in the metric leader and only 1
column should be shown:
$ perf stat -M test --metric-only -a sleep 1
Performance counter stats for 'system wide':
MiB Test
5909.4
1.001258915 seconds time elapsed
Achieve this by skipping events that aren't metric leaders when
printing column headers and aggregation isn't disabled.
The bug is long standing, the fixes tag is set to a refactor as that
is as far back as is reasonable to backport.
Fixes: 088519f318be3a41 ("perf stat: Move the display functions to stat-display.c")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaige Ye <ye@kaige.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240510051309.2452468-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9ef30265a483f0405e4f7b3f15cda251b9a2c7da ]
A symbol can have no samples, then accessing the annotated_source->samples
hashmap will result in a segfault.
Fixes: a3f7768bcf48281d ("perf annotate: Fix memory leak in annotated_source")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240510210452.2449944-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 09541603462c399c7408d50295db99b4b8042eaa ]
The open() function returns -1 on error.
The 'control' and 'ack' file descriptors are both initialized with
open() and further validated with 'if' statement.
'if (!control)' would evaluate to 'true' if returned value on error were
'0' but it is actually '-1'.
Fixes: edcaa47958c7438b ("perf daemon: Add 'ping' command")
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240510003424.2016914-1-samasth.norway.ananda@oracle.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 25626e19ae6df34f336f235b6b3dbd1b566d2738 ]
The linked commit updated dso__load_vmlinux() to call
dso__set_long_name() before loading the symbols. Loading the symbols may
not succeed but dso__set_long_name() takes ownership of the string. The
two callers of this function free the string themselves on failure
cases, resulting in the following error:
$ perf record -- ls
$ perf report
free(): double free detected in tcache 2
Fix it by always taking ownership of the string, even on failure. This
means the string is either freed at the very first early exit condition,
or later when the dso is deleted or the long name is replaced. Now no
special return value is needed to signify that the caller needs to
free the string.
Fixes: e59fea47f83e8a9a ("perf symbols: Fix DSO kernel load and symbol process to correctly map DSO to its long_name, type and adjust_symbols")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240507141210.195939-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f30232b20fadea8c0f2f43f764bc06e51e8cfcdf ]
When loading kcore, the main vmlinux map is updated in the same loop
that merges the remaining maps. If a map that overlaps is merged in
before kcore, the list can become unsortable when the main map addresses
are updated. This will later trigger the check_invariants() assert:
$ perf record
$ perf report
util/maps.c:96: check_invariants: Assertion `map__end(prev) <=
map__start(map) || map__start(prev) == map__start(map)' failed.
Aborted
Fix it by moving the main map update prior to the loop so that
maps__merge_in() can split it if necessary.
Fixes: 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240507141210.195939-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 9fe410a7ef483a9aca08bf620d8ddfd35ac99bc7 ]
Make the order of operations remove, update, add. Updating addresses
before the map is removed causes the ordering check to fail when the map
is removed. This can be reproduced when running Perf on an Arm system
with a static kernel and Perf uses kcore rather than other sources:
$ perf record -- ls
$ perf report
util/maps.c:96: check_invariants: Assertion `map__end(prev) <=
map__start(map) || map__start(prev) == map__start(map)' failed
Fixes: 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240507141210.195939-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c9d492378faec5c1fb8ea1534c620f7320bbb23a ]
check_allowed_ops() is used from both HAVE_DWARF_GETLOCATIONS_SUPPORT
and HAVE_DWARF_CFI_SUPPORT sections, so move it into the right place so
that it's available when either are defined. This shows up when doing
a static cross compile for arm64:
$ make ARCH=arm64 CROSS_COMPILE=aarch64-linux-gnu- LDFLAGS="-static" \
EXTRA_PERFLIBS="-lexpat"
util/dwarf-aux.c:1723:6: error: implicit declaration of function 'check_allowed_ops'
Fixes: 55442cc2f22d0727 ("perf dwarf-aux: Check allowed DWARF Ops")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240508141458.439017-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 932dcc2c39aedf54ef291bc0b4129a54f5fe1e84 ]
The die_collect_vars() is to find all variable information in the scope
including function parameters. The struct die_var_type is to save the
type of the variable with the location (reg and offset) as well as where
it's defined in the code (addr).
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240319055115.4063940-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stable-dep-of: c9d492378fae ("perf dwarf-aux: Fix build with HAVE_DWARF_CFI_SUPPORT")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 3536c2575e88a890cf696b4ccd3da36bc937853b ]
Freeing the thread on failure won't work with reference count checking,
use thread__delete().
Don't allocate the comm_str, use a stack allocation instead.
Fixes: f6005cafebab72f8 ("perf thread: Add reference count checking")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240508035301.1554434-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 45b4f402a6b782352c4bafcff682bfb01da9ca05 ]
In some cases evsel->name is lazily initialized in evsel__name(). If not
initialized passing NULL to strstr() leads to a SEGV.
Fixes: ccb17caecfbd542f ("perf report: Set PERF_SAMPLE_DATA_SRC bit for Arm SPE event")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240508035301.1554434-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 90f01afb0dfafbc9b094bb61e61a4ac297d9d0d2 ]
If the title is NULL then it can lead to a SEGV.
Fixes: 769e6a1e15bdbbaf ("perf ui browser: Don't save pointer to stack memory")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240508035301.1554434-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit a3f7768bcf48281df14d98715f076c5656571527 ]
Freeing hash map doesn't free the entries added to the hashmap, add
the missing free().
Fixes: d3e7cad6f36d9e80 ("perf annotate: Add a hashmap for symbol histogram")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 769e6a1e15bdbbaf2b0d2f37c24f2c53268bd21f ]
ui_browser__show() is capturing the input title that is stack allocated
memory in hist_browser__run().
Avoid a use after return by strdup-ing the string.
Committer notes:
Further explanation from Ian Rogers:
My command line using tui is:
$ sudo bash -c 'rm /tmp/asan.log*; export
ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a
sleep 1; /tmp/perf/perf mem report'
I then go to the perf annotate view and quit. This triggers the asan
error (from the log file):
```
==1254591==ERROR: AddressSanitizer: stack-use-after-return on address
0x7f2813331920 at pc 0x7f28180
65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10
READ of size 80 at 0x7f2813331920 thread T0
#0 0x7f2818065990 in __interceptor_strlen
../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461
#1 0x7f2817698251 in SLsmg_write_wrapped_string
(/lib/x86_64-linux-gnu/libslang.so.2+0x98251)
#2 0x7f28176984b9 in SLsmg_write_nstring
(/lib/x86_64-linux-gnu/libslang.so.2+0x984b9)
#3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60
#4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266
#5 0x55c94045c776 in ui_browser__show ui/browser.c:288
#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206
#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458
#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412
#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527
#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613
#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661
#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671
#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141
#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805
#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374
#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516
#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350
#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403
#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447
#20 0x55c9400e53ad in main tools/perf/perf.c:561
#21 0x7f28170456c9 in __libc_start_call_main
../sysdeps/nptl/libc_start_call_main.h:58
#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360
#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId:
84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93)
Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame
#0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746
This frame has 1 object(s):
[32, 192) 'title' (line 747) <== Memory access at offset 32 is
inside this variable
HINT: this may be a false positive if your program uses some custom
stack unwind mechanism, swapcontext or vfork
```
hist_browser__run isn't on the stack so the asan error looks legit.
There's no clean init/exit on struct ui_browser so I may be trading a
use-after-return for a memory leak, but that seems look a good trade
anyway.
Fixes: 05e8b0804ec4 ("perf ui browser: Stop using 'self'")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240507183545.1236093-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
one DSO
[ Upstream commit d9180e23fbfa3875424d3a6b28b71b072862a52a ]
'perf bench internals inject-build-id' suffers from the following error when
only one DSO is collected.
# perf bench internals inject-build-id -v
Collected 1 DSOs
traps: internals-injec[2305] trap divide error
ip:557566ba6394 sp:7ffd4de97fe0 error:0 in perf[557566b2a000+23d000]
Build-id injection benchmark
Iteration #1
Floating point exception
This patch removes the unnecessary minus one from the divisor which also
corrects the randomization range.
Signed-off-by: He Zhe <zhe.he@windriver.com>
Fixes: 0bf02a0d80427f26 ("perf bench: Add build-id injection benchmark")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240507065026.2652929-1-zhe.he@windriver.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e101a05f79fd4ee3e89d2f3fb716493c33a33708 ]
MemorySanitizer discovered instances where the instruction op value was
not assigned.:
WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x5581c00a76b3 in intel_pt_sample_flags tools/perf/util/intel-pt.c:1527:17
Uninitialized value was stored to memory at
#0 0x5581c005ddf8 in intel_pt_walk_insn tools/perf/util/intel-pt-decoder/intel-pt-decoder.c:1256:25
The op value is used to set branch flags for branch instructions
encountered when walking the code, so fix by setting op to
INTEL_PT_OP_OTHER in other cases.
Fixes: 4c761d805bb2d2ea ("perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type")
Reported-by: Ian Rogers <irogers@google.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Closes: https://lore.kernel.org/linux-perf-users/20240320162619.1272015-1-irogers@google.com/
Link: https://lore.kernel.org/r/20240326083223.10883-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 10b6ee3b597b1b1b4dc390aaf9d589664af31df9 ]
These tests record in a mode that includes kernel trace but look for
samples of a userspace process. This makes them sensitive to any kernel
compilation options that increase the amount of time spent in the
kernel. If the trace buffer is completely filled before userspace is
reached then the test will fail. Double the buffer size to fix this.
The other tests in the same file aren't sensitive to this for various
reasons, for example the iterate devices test filters by userspace trace
only. But in order to keep coverage of all the modes, increase the
buffer size rather than filtering by userspace for the basic tests.
Fixes: d1efa4a0a696e487 ("perf cs-etm: Add separate decode paths for timeless and per-thread modes")
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Link: https://lore.kernel.org/r/20240326113749.257250-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit eb4d27cf9aef3e6c9bcaf8fa1a1cadc2433d847b ]
Document that 'b' is used as a modifier to make an event use a BPF
counter.
Fixes: 01bd8efcec444468 ("perf stat: Introduce ':b' modifier")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Song Liu <song@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20240416170014.985191-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 645af3fb62bf12911ce1fc79efa676dae9a8289b ]
In match_var_offset(), it checks the offset range with the target type
only for non-pointer types. But it also needs to check the pointer
types with the target type.
This is because there can be more than one pointer variable located in
the same register. Let's look at the following example. It's looking
up a variable for reg3 at tcp_get_info+0x62. It found "sk" variable but
it wasn't the right one since it accesses beyond the target type (struct
'sock' in this case) size.
-----------------------------------------------------------
find data type for 0x7bc(reg3) at tcp_get_info+0x62
CU for net/ipv4/tcp.c (die:0x7b5f516)
frame base: cfa=0 fbreg=6
offset: 1980 is bigger than size: 760
check variable "sk" failed (die: 0x7b92b2c)
variable location: reg3
type='struct sock' size=0x2f8 (die:0x7b63c3a)
Actually there was another variable "tp" in the function and it's
located at the same (reg3) because it's just type-casted like below.
void tcp_get_info(struct sock *sk, struct tcp_info *info)
{
const struct tcp_sock *tp = tcp_sk(sk);
...
The 'struct tcp_sock' contains the 'struct sock' at offset 0 so it can
just use the same address as a pointer to tcp_sock. That means it
should match variables correctly by checking the offset and size.
Actually it cannot distinguish if the offset was smaller than the size
of the original struct sock. But I think it's fine as they are the same
at that part.
So let's check the target type size and retry if it doesn't match.
Now it succeeded to find the correct variable.
-----------------------------------------------------------
find data type for 0x7bc(reg3) at tcp_get_info+0x62
CU for net/ipv4/tcp.c (die:0x7b5f516)
frame base: cfa=0 fbreg=6
found "tp" in scope=1/1 (die: 0x7b92b16) type_offset=0x7bc
variable location: reg3
type='struct tcp_sock' size=0xa68 (die:0x7b81380)
Fixes: bc10db8eb8955fbc ("perf annotate-data: Support stack variables")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240412183310.2518474-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 459fee7b508231cd4622b3bd94aaa85e8e16b888 ]
bpf_program__attach_uprobe_opts will search LD_LIBRARY_PATH and so
specifying `/lib64` is unnecessary and causes failures for libc.so.6
paths like `/lib/x86_64-linux-gnu/libc.so.6`.
Fixes: 7b47623b8cae8149 ("perf bench uprobe trace_printk: Add entry attaching an BPF program that does a trace_printk")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240406040911.1603801-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 792bc998baf9ae17297b1f93b1edc3ca34a0b7e2 ]
evlist__config() might mess up the debug output consumed by test
"Test per-thread recording" in "Miscellaneous Intel PT testing".
Move it out from between the debug prints:
"perf record opening and mmapping events" and
"perf record done opening and mmapping events"
Fixes: da4062021e0e6da5 ("perf tools: Add debug messages and comments for testing")
Closes: https://lore.kernel.org/linux-perf-users/ZhVfc5jYLarnGzKa@x1/
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240411075447.17306-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit df12e21d4e15e48a5e7d12e58f1a00742c4177d0 ]
In a debug build there is validation that mmap lists are sorted when
taking a lock. In machine__update_kernel_mmap() the start and end
addresses are updated resulting in an unsorted list before the map is
removed from the list. When the map is removed, the lock is taken which
triggers the validation and the failure:
$ perf test "object code reading"
--- start ---
perf: util/maps.c:88: check_invariants: Assertion `map__start(prev) <= map__start(map)' failed.
Aborted
Fix it by updating the addresses after removal, but before insertion.
The bug depends on the ordering and type of debug info on the system and
doesn't reproduce everywhere.
Fixes: 659ad3492b913c90 ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Spoorthy S <spoorts2@in.ibm.com>
Link: https://lore.kernel.org/r/20240410103458.813656-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2dade41a533f337447b945239b87ff31a8857890 ]
PERF_PMU_CAP_EXTENDED_HW_TYPE results in multiple events being opened on
heterogeneous systems. Currently this test only sets its required
attributes on the first event. Not disabling enable_on_exec on the other
events causes the test to fail because the forked objdump processes are
sampled. No tracking event is opened so Perf only knows about its own
mappings causing the objdump samples to give the following error:
$ perf test -vvv "object code reading"
Reading object code for memory address: 0xffff9aaa55ec
thread__find_map failed
---- end(-1) ----
24: Object code reading : FAILED!
Fixes: 251aa040244a3b17 ("perf parse-events: Wildcard most "numeric" events")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Spoorthy S <spoorts2@in.ibm.com>
Link: https://lore.kernel.org/r/20240410103458.813656-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 256ef072b3842273ce703db18b603b051aca95fe ]
To prevent anyone from seeing a test failure appear as a regression and
thinking that it was caused by their code change, insert some noise into
the loop which makes it immune to sampling bias issues (errata 1694299).
The "test data symbol" test can fail with any unrelated change that
shifts the loop into an unfortunate position in the Perf binary which is
almost impossible to debug as the root cause of the test failure.
Ultimately it's caused by the referenced errata.
Fixes: 60abedb8aa902b06 ("perf test: Introduce script for data symbol testing")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Spoorthy S <spoorts2@in.ibm.com>
Link: https://lore.kernel.org/r/20240410103458.813656-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
get_srcline()
[ Upstream commit aaf494cf483a1a835c44e942861429b30a00cab0 ]
It should pass a proper address (i.e. suitable for objdump or addr2line)
to get_srcline() in order to work correctly. It used to pass an address
with map__rip_2objdump() as the second argument but later it's changed
to use notes->start. It's ok in normal cases but it can be changed when
annotate_opts.full_addr is set. So let's convert the address directly
instead of using the notes->start.
Also the last argument is an IP to print symbol offset if requested. So
it should pass symbol-relative address.
Fixes: 7d18a824b5e57ddd ("perf annotate: Toggle full address <-> offset display")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240404175716.1225482-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c2f3d7dfc7373d53286f2a5c882d3397a5070adc ]
On s390 z/VM virtual machines command 'perf list' also displays metrics:
# perf list | grep -A 20 'Metric Groups:'
Metric Groups:
No_group:
cpi
[Cycles per Instruction]
est_cpi
[Estimated Instruction Complexity CPI infinite Level 1]
finite_cpi
[Cycles per Instructions from Finite cache/memory]
l1mp
[Level One Miss per 100 Instructions]
l2p
[Percentage sourced from Level 2 cache]
l3p
[Percentage sourced from Level 3 on same chip cache]
l4lp
[Percentage sourced from Level 4 Local cache on same book]
l4rp
[Percentage sourced from Level 4 Remote cache on different book]
memp
[Percentage sourced from memory]
....
#
The command
# perf stat -M cpi -- true
event syntax error: '{CPU_CYCLES/metric-id=CPU_CYCLES/.....'
\___ Bad event or PMU
Unable to find PMU or event on a PMU of 'CPU_CYCLES'
event syntax error: '{CPU_CYCLES/metric-id=CPU_CYCLES/...'
\___ Cannot find PMU `CPU_CYCLES'.
Missing kernel support?
#
fails. 'perf stat' should not fail on metrics when the referenced CPU
Counter Measurement PMU is not available.
Output after:
# perf stat -M est_cpi -- sleep 1
Performance counter stats for 'sleep 1':
1,000,887,494 ns duration_time # 0.00 est_cpi
1.000887494 seconds time elapsed
0.000143000 seconds user
0.000662000 seconds sys
#
Fixes: 7f76b31130680fb3 ("perf list: Add IBM z16 event description for s390")
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20240404064806.1362876-2-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b74bc5a633a7d72f89141d481d835e73bda3c3ae ]
s390 introduced the Processor Activity Instrumentation (PAI) counter
facility on LPAR and virtual machines z/VM for models 3931 and 3932.
These counters are stored as raw data in the perf.data file and are
displayed with:
# perf report -i /tmp//perfout-635468 -D | grep Counter
Counter:007 <unknown> Value:0x00000000000186a0
Counter:032 <unknown> Value:0x0000000000000001
Counter:032 <unknown> Value:0x0000000000000001
Counter:032 <unknown> Value:0x0000000000000001
#
However on z/VM virtual machines, the counter names are not retrieved
from the PMU and are shown as '<unknown>'. This is caused by the CPU
string saved in the mapfile.csv for this machine:
^IBM.393[12].*3\.7.[[:xdigit:]]+$,3,cf_z16,core
This string contains the CPU Measurement facility first and second
version number and authorization level (3\.7.[[:xdigit:]]+). These
numbers do not apply to the PAI counter facility. In fact they can be
omitted.
Shorten the CPU identification string for this machine to manufacturer
and model. This is sufficient for all PMU devices.
Output after:
# perf report -i /tmp//perfout-635468 -D | grep Counter
Counter:007 km_aes_128 Value:0x00000000000186a0
Counter:032 kma_gcm_aes_256 Value:0x0000000000000001
Counter:032 kma_gcm_aes_256 Value:0x0000000000000001
Counter:032 kma_gcm_aes_256 Value:0x0000000000000001
#
Fixes: b539deafbadb2fc6 ("perf report: Add s390 raw data interpretation for PAI counters")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20240404064806.1362876-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 6e4b398770d5023eb6383da9360a23bd537c155b ]
When 'perf sched' enables the call-graph recording, sample_type of dummy
event does not have PERF_SAMPLE_CALLCHAIN, timehist_check_attr() checks
that the evsel does not have a callchain, and set show_callchain to 0.
Currently 'perf sched timehist' only saves callchain when processing the
'sched:sched_switch event', timehist_check_attr() only needs to determine
whether the event has PERF_SAMPLE_CALLCHAIN.
Before:
# perf sched record -g true
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 4.153 MB perf.data (7536 samples) ]
# perf sched timehist
Samples do not have callchains.
time cpu task name wait time sch delay run time
[tid/pid] (msec) (msec) (msec)
--------------- ------ ------------------------------ --------- --------- ---------
147851.826019 [0000] perf[285035] 0.000 0.000 0.000
147851.826029 [0000] migration/0[15] 0.000 0.003 0.009
147851.826063 [0001] perf[285035] 0.000 0.000 0.000
147851.826069 [0001] migration/1[21] 0.000 0.003 0.006
<SNIP>
After:
# perf sched record -g true
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.572 MB perf.data (822 samples) ]
# perf sched timehist
time cpu task name waittime sch delay runtime
[tid/pid] (msec) (msec) (msec)
----------- --- --------------- -------- -------- -----
4193.035164 [0] perf[277062] 0.000 0.000 0.000 __traceiter_sched_switch <- __traceiter_sched_switch <- __sched_text_start <- preempt_schedule_common <- __cond_resched <- __wait_for_common <- wait_for_completion
4193.035174 [0] migration/0[15] 0.000 0.003 0.009 __traceiter_sched_switch <- __traceiter_sched_switch <- __sched_text_start <- smpboot_thread_fn <- kthread <- ret_from_fork
4193.035207 [1] perf[277062] 0.000 0.000 0.000 __traceiter_sched_switch <- __traceiter_sched_switch <- __sched_text_start <- preempt_schedule_common <- __cond_resched <- __wait_for_common <- wait_for_completion
4193.035214 [1] migration/1[21] 0.000 0.003 0.007 __traceiter_sched_switch <- __traceiter_sched_switch <- __sched_text_start <- smpboot_thread_fn <- kthread <- ret_from_fork
<SNIP>
Fixes: 9c95e4ef06572349 ("perf evlist: Add evlist__findnew_tracking_event() helper")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20240401062724.1006010-2-yangjihong@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 374af9f1f06b5e991c810d2e4983d6f58df32136 ]
The options array in cmd_annotate() has duplicate --group options. It
only needs one and let's get rid of the other.
$ perf annotate -h 2>&1 | grep group
--group Show event group information together
--group Show event group information together
Fixes: 7ebaf4890f63eb90 ("perf annotate: Support '--group' option")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240322224313.423181-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 581037151910126a7934e369e4b6ac70eda9a703 ]
This prototype is obtained indirectly, by luck, from some other header
in probe-event.c in most systems, but recently exploded on alpine:edge:
8 13.39 alpine:edge : FAIL gcc version 13.2.1 20240309 (Alpine 13.2.1_git20240309)
util/probe-event.c: In function 'convert_exec_to_group':
util/probe-event.c:225:16: error: implicit declaration of function 'basename' [-Werror=implicit-function-declaration]
225 | ptr1 = basename(exec_copy);
| ^~~~~~~~
util/probe-event.c:225:14: error: assignment to 'char *' from 'int' makes pointer from integer without a cast [-Werror=int-conversion]
225 | ptr1 = basename(exec_copy);
| ^
cc1: all warnings being treated as errors
make[3]: *** [/git/perf-6.8.0/tools/build/Makefile.build:158: util] Error 2
Fix it by adding the libgen.h header where basename() is prototyped.
Fixes: fb7345bbf7fad9bf ("perf probe: Support basic dwarf-based operations on uprobe events")
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit e120f7091a25460a19967380725558c36bca7c6c ]
Switch from dumping err then out, to a single file descriptor for both
of them. This allows the err and output to be correctly interleaved in
verbose output.
Fixes: b482f5f8e0168f1e ("perf tests: Add option to run tests in parallel")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240301074639.2260708-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 88ce0106a1f603bf360cb397e8fe293f8298fabb ]
The session has a header in it which contains a perf env with
bpf_progs. The bpf_progs are accessed by the sideband thread and so
the sideband thread must be stopped before the session is deleted, to
avoid a use after free. This error was detected by AddressSanitizer
in the following:
==2054673==ERROR: AddressSanitizer: heap-use-after-free on address 0x61d000161e00 at pc 0x55769289de54 bp 0x7f9df36d4ab0 sp 0x7f9df36d4aa8
READ of size 8 at 0x61d000161e00 thread T1
#0 0x55769289de53 in __perf_env__insert_bpf_prog_info util/env.c:42
#1 0x55769289dbb1 in perf_env__insert_bpf_prog_info util/env.c:29
#2 0x557692bbae29 in perf_env__add_bpf_info util/bpf-event.c:483
#3 0x557692bbb01a in bpf_event__sb_cb util/bpf-event.c:512
#4 0x5576928b75f4 in perf_evlist__poll_thread util/sideband_evlist.c:68
#5 0x7f9df96a63eb in start_thread nptl/pthread_create.c:444
#6 0x7f9df9726a4b in clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
0x61d000161e00 is located 384 bytes inside of 2136-byte region [0x61d000161c80,0x61d0001624d8)
freed by thread T0 here:
#0 0x7f9dfa6d7288 in __interceptor_free libsanitizer/asan/asan_malloc_linux.cpp:52
#1 0x557692978d50 in perf_session__delete util/session.c:319
#2 0x557692673959 in __cmd_record tools/perf/builtin-record.c:2884
#3 0x55769267a9f0 in cmd_record tools/perf/builtin-record.c:4259
#4 0x55769286710c in run_builtin tools/perf/perf.c:349
#5 0x557692867678 in handle_internal_command tools/perf/perf.c:402
#6 0x557692867a40 in run_argv tools/perf/perf.c:446
#7 0x557692867fae in main tools/perf/perf.c:562
#8 0x7f9df96456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
Fixes: 657ee5531903339b ("perf evlist: Introduce side band thread")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Disha Goel <disgoel@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20240301074639.2260708-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit efae55bb78cf8722c7df01cd974197dfd13ece39 ]
It seems that a previous modification to sysreg-defs, which corrected
emitting the header to the specified output directory, exposed missing
subdir, prefix variables.
This breaks out of tree builds of perf as the file is now built into the
output directory, but still tries to descend into output directory as a
subdir.
Fixes: a29ee6aea7030786 ("perf build: Ensure sysreg-defs Makefile respects output dir")
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Reviewed-by: Tycho Andersen <tycho@tycho.pizza>
Signed-off-by: Ethan Adams <j.ethan.adams@gmail.com>
Tested-by: Tycho Andersen <tycho@tycho.pizza>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240314222012.47193-1-j.ethan.adams@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A fix for TASK_SIZE on rv64/NOMMU, to reflect the lack of user/kernel
separation
- A fix to avoid loading rv64/NOMMU kernel past the start of RAM
- A fix for RISCV_HWPROBE_EXT_ZVFHMIN on ilp32 to avoid signed integer
overflow in the bitmask
- The sud_test kselftest has been fixed to properly swizzle the syscall
number into the return register, which are not the same on RISC-V
- A fix for a build warning in the perf tools on rv32
- A fix for the CBO selftests, to avoid non-constants leaking into the
inline asm
- A pair of fixes for T-Head PBMT errata probing, which has been
renamed MAE by the vendor
* tag 'riscv-for-linus-6.9-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: selftests: cbo: Ensure asm operands match constraints, take 2
perf riscv: Fix the warning due to the incompatible type
riscv: T-Head: Test availability bit before enabling MAE errata
riscv: thead: Rename T-Head PBMT to MAE
selftests: sud_test: return correct emulated syscall value on RISC-V
riscv: hwprobe: fix invalid sign extension for RISCV_HWPROBE_EXT_ZVFHMIN
riscv: Fix loading 64-bit NOMMU kernels past the start of RAM
riscv: Fix TASK_SIZE on 64-bit NOMMU
|
|
In the 32-bit platform, the second argument of getline is expectd to be
'size_t *'(aka 'unsigned int *'), but line_sz is of type
'unsigned long *'. Therefore, declare line_sz as size_t.
Signed-off-by: Ben Zong-You Xie <ben717@andestech.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240305120501.1785084-3-ben717@andestech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
I got a report for a failure in BPF verifier on a recent kernel with
perf lock contention command. It checks task->sighand->siglock without
checking if sighand is NULL or not. Let's add one.
; if (&curr->sighand->siglock == (void *)lock)
265: (79) r1 = *(u64 *)(r0 +2624) ; frame1: R0_w=trusted_ptr_task_struct(off=0,imm=0)
; R1_w=rcu_ptr_or_null_sighand_struct(off=0,imm=0)
266: (b7) r2 = 0 ; frame1: R2_w=0
267: (0f) r1 += r2
R1 pointer arithmetic on rcu_ptr_or_null_ prohibited, null-check it first
processed 164 insns (limit 1000000) max_states_per_insn 1 total_states 15 peak_states 15 mark_read 5
-- END PROG LOAD LOG --
libbpf: prog 'contention_end': failed to load: -13
libbpf: failed to load object 'lock_contention_bpf'
libbpf: failed to load BPF skeleton 'lock_contention_bpf': -13
Failed to load lock-contention BPF skeleton
lock contention BPF setup failed
lock contention did not detect any lock contention
Fixes: 1811e82767dcc ("perf lock contention: Track and show siglock with address")
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240409225542.1870999-1-namhyung@kernel.org
|
|
The symbol__annotate2() initializes some data structures needed by TUI.
It has a logic to prevent calling it multiple times by checking if it
has the annotated source. But data type profiling uses a different
code (symbol__annotate) to allocate the annotated lines in advance.
So TUI missed to call symbol__annotate2() when it shows the annotation
browser.
Make symbol__annotate() reentrant and handle that situation properly.
This fixes a crash in the annotation browser started by perf report in
TUI like below.
$ perf report -s type,sym --tui
# and press 'a' key and then move down
Fixes: 81e57deec325 ("perf report: Support data type profiling")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240405211800.1412920-2-namhyung@kernel.org
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
- Support for various vector-accelerated crypto routines
- Hibernation is now enabled for portable kernel builds
- mmap_rnd_bits_max is larger on systems with larger VAs
- Support for fast GUP
- Support for membarrier-based instruction cache synchronization
- Support for the Andes hart-level interrupt controller and PMU
- Some cleanups around unaligned access speed probing and Kconfig
settings
- Support for ACPI LPI and CPPC
- Various cleanus related to barriers
- A handful of fixes
* tag 'riscv-for-linus-6.9-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (66 commits)
riscv: Fix syscall wrapper for >word-size arguments
crypto: riscv - add vector crypto accelerated AES-CBC-CTS
crypto: riscv - parallelize AES-CBC decryption
riscv: Only flush the mm icache when setting an exec pte
riscv: Use kcalloc() instead of kzalloc()
riscv/barrier: Add missing space after ','
riscv/barrier: Consolidate fence definitions
riscv/barrier: Define RISCV_FULL_BARRIER
riscv/barrier: Define __{mb,rmb,wmb}
RISC-V: defconfig: Enable CONFIG_ACPI_CPPC_CPUFREQ
cpufreq: Move CPPC configs to common Kconfig and add RISC-V
ACPI: RISC-V: Add CPPC driver
ACPI: Enable ACPI_PROCESSOR for RISC-V
ACPI: RISC-V: Add LPI driver
cpuidle: RISC-V: Move few functions to arch/riscv
riscv: Introduce set_compat_task() in asm/compat.h
riscv: Introduce is_compat_thread() into compat.h
riscv: add compile-time test into is_compat_task()
riscv: Replace direct thread flag check with is_compat_task()
riscv: Improve arch_get_mmap_end() macro
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from CAN, netfilter, wireguard and IPsec.
I'd like to highlight [ lowlight? - Linus ] Florian W stepping down as
a netfilter maintainer due to constant stream of bug reports. Not sure
what we can do but IIUC this is not the first such case.
Current release - regressions:
- rxrpc: fix use of page_frag_alloc_align(), it changed semantics and
we added a new caller in a different subtree
- xfrm: allow UDP encapsulation only in offload modes
Current release - new code bugs:
- tcp: fix refcnt handling in __inet_hash_connect()
- Revert "net: Re-use and set mono_delivery_time bit for userspace
tstamp packets", conflicted with some expectations in BPF uAPI
Previous releases - regressions:
- ipv4: raw: fix sending packets from raw sockets via IPsec tunnels
- devlink: fix devlink's parallel command processing
- veth: do not manipulate GRO when using XDP
- esp: fix bad handling of pages from page_pool
Previous releases - always broken:
- report RCU QS for busy network kthreads (with Paul McK's blessing)
- tcp/rds: fix use-after-free on netns with kernel TCP reqsk
- virt: vmxnet3: fix missing reserved tailroom with XDP
Misc:
- couple of build fixes for Documentation"
* tag 'net-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (59 commits)
selftests: forwarding: Fix ping failure due to short timeout
MAINTAINERS: step down as netfilter maintainer
netfilter: nf_tables: Fix a memory leak in nf_tables_updchain
net: dsa: mt7530: fix handling of all link-local frames
net: dsa: mt7530: fix link-local frames that ingress vlan filtering ports
bpf: report RCU QS in cpumap kthread
net: report RCU QS on threaded NAPI repolling
rcu: add a helper to report consolidated flavor QS
ionic: update documentation for XDP support
lib/bitmap: Fix bitmap_scatter() and bitmap_gather() kernel doc
netfilter: nf_tables: do not compare internal table flags on updates
netfilter: nft_set_pipapo: release elements in clone only from destroy path
octeontx2-af: Use separate handlers for interrupts
octeontx2-pf: Send UP messages to VF only when VF is up.
octeontx2-pf: Use default max_active works instead of one
octeontx2-pf: Wait till detach_resources msg is complete
octeontx2: Detect the mbox up or down message via register
devlink: fix port new reply cmd type
tcp: Clear req->syncookie in reqsk_alloc().
net/bnx2x: Prevent access to a freed page in page_pool
...
|
|
The only user of these was io_uring, and it's not using them anymore.
Make them static and remove them from the socket header file.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/1b6089d3-c1cf-464a-abd3-b0f0b6bb2523@kernel.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add the Andes AX45 JSON files that allows specifying symbolic event
names for the raw PMU events.
Signed-off-by: Locus Wei-Han Chen <locus84@andestech.com>
Reviewed-by: Yu Chien Peter Lin <peterlin@andestech.com>
Reviewed-by: Charles Ci-Jyun Wu <dminus@andestech.com>
Reviewed-by: Leo Yu-Chi Liang <ycliang@andestech.com>
Tested-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Acked-by: Atish Patra <atishp@rivosinc.com>
Acked-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20240222083946.3977135-11-peterlin@andestech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
|
|
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240304230815.1440583-5-namhyung@kernel.org
|
|
It's not used anymore and the code is coverted to use a hash map. Now
sym_hist has a static size, so no need to have sizeof_sym_hist in the
struct annotated_source.
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240304230815.1440583-4-namhyung@kernel.org
|
|
Use annotated_source.samples hashmap instead of addr array in the
struct sym_hist.
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240304230815.1440583-3-namhyung@kernel.org
|