Age | Commit message (Collapse) | Author |
|
Sending a PTP packet can imply to use the normal TX driver datapath but
invoked from the driver's ptp worker. The kernel generic TX code
disables softirqs and preemption before calling specific driver TX code,
but the ptp worker does not. Although current ptp driver functionality
does not require it, there are several reasons for doing so:
1) The invoked code is always executed with softirqs disabled for non
PTP packets.
2) Better if a ptp packet transmission is not interrupted by softirq
handling which could lead to high latencies.
3) netdev_xmit_more used by the TX code requires preemption to be
disabled.
Indeed a solution for dealing with kernel preemption state based on static
kernel configuration is not possible since the introduction of dynamic
preemption level configuration at boot time using the static calls
functionality.
Fixes: f79c957a0b537 ("drivers: net: sfc: use netdev_xmit_more helper")
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Link: https://lore.kernel.org/r/20220726064504.49613-1-alejandro.lucero-palau@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The crc16() function is used to check the firmware validity, but
the library was not explicitly selected.
Fixes: 3c3673bde50c ("ptp: ocp: Add firmware header checks")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Vadim Fedorenko <vadfed@fb.com>
Link: https://lore.kernel.org/r/20220726220604.1339972-1-jonathan.lemon@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
While RTC clock was added in H616 ccu_common list, it was not in H6
list. That caused invalid pointer dereference like this:
Unable to handle kernel NULL pointer dereference at virtual address 000000000000020c
Mem abort info:
ESR = 0x96000004
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x04: level 0 translation fault
Data abort info:
ISV = 0, ISS = 0x00000004
CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=000000004d574000
[000000000000020c] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] PREEMPT SMP
CPU: 3 PID: 339 Comm: cat Tainted: G B 5.18.0-rc1+ #1352
Hardware name: Tanix TX6 (DT)
pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : ccu_gate_is_enabled+0x48/0x74
lr : ccu_gate_is_enabled+0x40/0x74
sp : ffff80000c0b76d0
x29: ffff80000c0b76d0 x28: 00000000016e3600 x27: 0000000000000000
x26: 0000000000000000 x25: 0000000000000002 x24: ffff00000952fe08
x23: ffff800009611400 x22: ffff00000952fe79 x21: 0000000000000000
x20: 0000000000000001 x19: ffff80000aad6f08 x18: 0000000000000000
x17: 2d2d2d2d2d2d2d2d x16: 2d2d2d2d2d2d2d2d x15: 2d2d2d2d2d2d2d2d
x14: 0000000000000000 x13: 00000000f2f2f2f2 x12: ffff700001816e89
x11: 1ffff00001816e88 x10: ffff700001816e88 x9 : dfff800000000000
x8 : ffff80000c0b7447 x7 : 0000000000000001 x6 : ffff700001816e88
x5 : ffff80000c0b7440 x4 : 0000000000000001 x3 : ffff800008935c50
x2 : dfff800000000000 x1 : 0000000000000000 x0 : 000000000000020c
Call trace:
ccu_gate_is_enabled+0x48/0x74
clk_core_is_enabled+0x7c/0x1c0
clk_summary_show_subtree+0x1dc/0x334
clk_summary_show_subtree+0x250/0x334
clk_summary_show_subtree+0x250/0x334
clk_summary_show_subtree+0x250/0x334
clk_summary_show_subtree+0x250/0x334
clk_summary_show+0x90/0xdc
seq_read_iter+0x248/0x6d4
seq_read+0x17c/0x1fc
full_proxy_read+0x90/0xf0
vfs_read+0xdc/0x28c
ksys_read+0xc8/0x174
__arm64_sys_read+0x44/0x5c
invoke_syscall+0x60/0x190
el0_svc_common.constprop.0+0x7c/0x160
do_el0_svc+0x38/0xa0
el0_svc+0x68/0x160
el0t_64_sync_handler+0x10c/0x140
el0t_64_sync+0x18c/0x190
Code: d1006260 97e5c981 785e8260 8b0002a0 (b9400000)
---[ end trace 0000000000000000 ]---
Fix that by adding rtc clock to H6 ccu_common list too.
Fixes: 38d321b61bda ("clk: sunxi-ng: h6-r: Add RTC gate clock")
Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com>
Link: https://lore.kernel.org/r/20220719183725.2605141-1-jernej.skrabec@gmail.com
Reviewed-by: Samuel Holland <samuel@sholland.org>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
|
|
After the blamed commit, IPv4 SYN packets handled
by a dual stack IPv6 socket are dropped, even if
perfectly valid.
$ nstat | grep MD5
TcpExtTCPMD5Failure 5 0.0
For a dual stack listener, an incoming IPv4 SYN packet
would call tcp_inbound_md5_hash() with @family == AF_INET,
while tp->af_specific is pointing to tcp_sock_ipv6_specific.
Only later when an IPv4-mapped child is created, tp->af_specific
is changed to tcp_sock_ipv6_mapped_specific.
Fixes: 7bbb765b7349 ("net/tcp: Merge TCP-MD5 inbound callbacks")
Reported-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Tested-by: Leonard Crestez <cdleonard@gmail.com>
Link: https://lore.kernel.org/r/20220726115743.2759832-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 26f09e9b3a06 ("mm/memblock: add memblock memory allocation apis")
added a check to determine whether arm_dma_zone_size is exceeding the
amount of kernel virtual address space available between the upper 4GB
virtual address limit and PAGE_OFFSET in order to provide a suitable
definition of MAX_DMA_ADDRESS that should fit within the 32-bit virtual
address space. The quantity used for comparison was off by a missing
trailing 0, leading to MAX_DMA_ADDRESS to be overflowing a 32-bit
quantity.
This was caught thanks to CONFIG_DEBUG_VIRTUAL on the bcm2711 platform
where we define a dma_zone_size of 1GB and we have a PAGE_OFFSET value
of 0xc000_0000 (CONFIG_VMSPLIT_3G) leading to MAX_DMA_ADDRESS being
0x1_0000_0000 which overflows the unsigned long type used throughout
__pa() and then __virt_addr_valid(). Because the virtual address passed
to __virt_addr_valid() would now be 0, the function would loudly warn
and flood the kernel log, thus making the platform unable to boot
properly.
Fixes: 26f09e9b3a06 ("mm/memblock: add memblock memory allocation apis")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic fixes from Arnd Bergmann:
"Two more bug fixes for asm-generic, one addressing an incorrect
Kconfig symbol reference and another one fixing a build failure for
the perf tool on mips and possibly others"
* tag 'asm-generic-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: remove a broken and needless ifdef conditional
tools: Fixed MIPS builds due to struct flock re-definition
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
"One last set of changes for the soc tree:
- fix clock frequency on lan966x
- fix incorrect GPIO numbers on some pxa machines
- update Baolin's email address"
* tag 'soc-fixes-5.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
ARM: pxa2xx: Fix GPIO descriptor tables
mailmap: update Baolin Wang's email
ARM: dts: lan966x: fix sys_clk frequency
|
|
This reverts commit 007faec014cb5d26983c1f86fd08c6539b41392e.
Now that hyperv does its own protocol negotiation:
49d6a3c062a1 ("x86/Hyper-V: Add SEV negotiate protocol support in Isolation VM")
revert this exposure of the sev_es_ghcb_hv_call() helper.
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by:Tianyu Lan <tiala@microsoft.com>
Link: https://lore.kernel.org/r/20220614014553.1915929-1-ltykernel@gmail.com
|
|
Commit
4675ff05de2d ("kmemcheck: rip it out")
removed kmemcheck and its corresponding build config KMEMCHECK.
Commit
0f620cefd775 ("objtool: Rename "VMLINUX_VALIDATION" -> "NOINSTR_VALIDATION"")
renamed the debug config option.
Adjust x86_debug.config to those changes in debug configs.
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220722121815.27535-1-lukas.bulwahn@gmail.com
|
|
Pull NVMe fix from Christoph:
"nvme fix for Linux 5.19
- yet another duplicate ID quirk (Tobias Gruetzmacher)"
* tag 'nvme-5.19-2022-07-27' of git://git.infradead.org/nvme:
nvme-pci: Crucial P2 has bogus namespace ids
|
|
bpf_perf_object__next() folded the last element in the list test with the
empty list test. However, this meant that offsets were computed against
null and that a struct list_head was compared against a 'struct
bpf_perf_object'.
Working around this with clang's undefined behavior sanitizer required
-fno-sanitize=null and -fno-sanitize=object-size.
Remove the undefined behavior by using the regular Linux list APIs and
handling the starting case separately from the end testing case.
Looking at uses like bpf_perf_object__for_each(), as the constant NULL
or non-NULL argument can be constant propagated, the code is no less
efficient.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Christy Lee <christylee@fb.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20220726220921.2567761-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Some symbols are observed with the 'st_value' field zeroed. E.g.
libc.so.6 in Ubuntu contains a symbol '__evoke_link_warning_getwd' which
resides in the '.gnu.warning.getwd' section.
Unlike normal sections, such kind of sections are used for linker
warning when a file calls deprecated functions, but they are not part of
memory images, the symbols in these sections should be dropped.
This patch checks the section attribute SHF_ALLOC bit, if the bit is not
set, it skips symbols to avoid spurious ones.
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chang Rui <changruinj@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220724060013.171050-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When using 'perf mem' and 'perf c2c', an issue is observed that tool
reports the wrong offset for global data symbols. This is a common
issue on both x86 and Arm64 platforms.
Let's see an example, for a test program, below is the disassembly for
its .bss section which is dumped with objdump:
...
Disassembly of section .bss:
0000000000004040 <completed.0>:
...
0000000000004080 <buf1>:
...
00000000000040c0 <buf2>:
...
0000000000004100 <thread>:
...
First we used 'perf mem record' to run the test program and then used
'perf --debug verbose=4 mem report' to observe what's the symbol info
for 'buf1' and 'buf2' structures.
# ./perf mem record -e ldlat-loads,ldlat-stores -- false_sharing.exe 8
# ./perf --debug verbose=4 mem report
...
dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 sh_addr: 0x4040 sh_offset: 0x3028
symbol__new: buf2 0x30a8-0x30e8
...
dso__load_sym_internal: adjusting symbol: st_value: 0x4080 sh_addr: 0x4040 sh_offset: 0x3028
symbol__new: buf1 0x3068-0x30a8
...
The perf tool relies on libelf to parse symbols, in executable and
shared object files, 'st_value' holds a virtual address; 'sh_addr' is
the address at which section's first byte should reside in memory, and
'sh_offset' is the byte offset from the beginning of the file to the
first byte in the section. The perf tool uses below formula to convert
a symbol's memory address to a file address:
file_address = st_value - sh_addr + sh_offset
^
` Memory address
We can see the final adjusted address ranges for buf1 and buf2 are
[0x30a8-0x30e8) and [0x3068-0x30a8) respectively, apparently this is
incorrect, in the code, the structure for 'buf1' and 'buf2' specifies
compiler attribute with 64-byte alignment.
The problem happens for 'sh_offset', libelf returns it as 0x3028 which
is not 64-byte aligned, combining with disassembly, it's likely libelf
doesn't respect the alignment for .bss section, therefore, it doesn't
return the aligned value for 'sh_offset'.
Suggested by Fangrui Song, ELF file contains program header which
contains PT_LOAD segments, the fields p_vaddr and p_offset in PT_LOAD
segments contain the execution info. A better choice for converting
memory address to file address is using the formula:
file_address = st_value - p_vaddr + p_offset
This patch introduces elf_read_program_header() which returns the
program header based on the passed 'st_value', then it uses the formula
above to calculate the symbol file address; and the debugging log is
updated respectively.
After applying the change:
# ./perf --debug verbose=4 mem report
...
dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 p_vaddr: 0x3d28 p_offset: 0x2d28
symbol__new: buf2 0x30c0-0x3100
...
dso__load_sym_internal: adjusting symbol: st_value: 0x4080 p_vaddr: 0x3d28 p_offset: 0x2d28
symbol__new: buf1 0x3080-0x30c0
...
Fixes: f17e04afaff84b5c ("perf report: Fix ELF symbol parsing")
Reported-by: Chang Rui <changruinj@gmail.com>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220724060013.171050-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The mainline kernel can be used for relative old distros, e.g. RHEL 7.
The distro doesn't upgrade from python2 to python3, this causes the
building error that the python script is not python2 compliant.
To fix the building failure, this patch changes from the python f-string
format to traditional string format.
Fixes: 12fdd6c009da0d02 ("perf scripts python: Support Arm CoreSight trace data disassembly")
Reported-by: Akemi Yagi <toracat@elrepo.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: ElRepo <contact@elrepo.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220725104220.1106663-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick the changes from:
28a99e95f55c6185 ("x86/amd: Use IBPB for firmware calls")
This only causes these perf files to be rebuilt:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
And addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org
Link: https://lore.kernel.org/lkml/Yt6oWce9UDAmBAtX@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To get upstream fixes.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We try using cancel_delayed_work_sync() to prevent the work from
enabling NAPI. This is insufficient since we don't disable the source
of the refill work scheduling. This means an NAPI poll callback after
cancel_delayed_work_sync() can schedule the refill work then can
re-enable the NAPI that leads to use-after-free [1].
Since the work can enable NAPI, we can't simply disable NAPI before
calling cancel_delayed_work_sync(). So fix this by introducing a
dedicated boolean to control whether or not the work could be
scheduled from NAPI.
[1]
==================================================================
BUG: KASAN: use-after-free in refill_work+0x43/0xd4
Read of size 2 at addr ffff88810562c92e by task kworker/2:1/42
CPU: 2 PID: 42 Comm: kworker/2:1 Not tainted 5.19.0-rc1+ #480
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Workqueue: events refill_work
Call Trace:
<TASK>
dump_stack_lvl+0x34/0x44
print_report.cold+0xbb/0x6ac
? _printk+0xad/0xde
? refill_work+0x43/0xd4
kasan_report+0xa8/0x130
? refill_work+0x43/0xd4
refill_work+0x43/0xd4
process_one_work+0x43d/0x780
worker_thread+0x2a0/0x6f0
? process_one_work+0x780/0x780
kthread+0x167/0x1a0
? kthread_exit+0x50/0x50
ret_from_fork+0x22/0x30
</TASK>
...
Fixes: b2baed69e605c ("virtio_net: set/cancel work on ndo_open/ndo_stop")
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The commit
cb51a371d08e ("EDAC/ghes: Setup DIMM label from DMI and use it in error reports")
enforced that both the bank and device strings passed to
dimm_setup_label() are not NULL.
However, there are BIOSes, for example on a
HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019
which don't populate both strings:
Handle 0x0020, DMI type 17, 84 bytes
Memory Device
Array Handle: 0x0013
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 32 GB
Form Factor: DIMM
Set: None
Locator: PROC 1 DIMM 1 <===== device
Bank Locator: Not Specified <===== bank
This results in a buffer overflow because ghes_edac_register() calls
strlen() on an uninitialized label, which had non-zero values left over
from krealloc_array():
detected buffer overflow in __fortify_strlen
------------[ cut here ]------------
kernel BUG at lib/string_helpers.c:983!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
CPU: 1 PID: 1 Comm: swapper/0 Tainted: G I 5.18.6-200.fc36.x86_64 #1
Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 03/15/2019
RIP: 0010:fortify_panic
...
Call Trace:
<TASK>
ghes_edac_register.cold
ghes_probe
platform_probe
really_probe
__driver_probe_device
driver_probe_device
__driver_attach
? __device_attach_driver
bus_for_each_dev
bus_add_driver
driver_register
acpi_ghes_init
acpi_init
? acpi_sleep_proc_init
do_one_initcall
The label contains garbage because the commit in Fixes reallocs the
DIMMs array while scanning the system but doesn't clear the newly
allocated memory.
Change dimm_setup_label() to always initialize the label to fix the
issue. Set it to the empty string in case BIOS does not provide both
bank and device so that ghes_edac_register() can keep the default label
given by edac_mc_alloc_dimms().
[ bp: Rewrite commit message. ]
Fixes: b9cae27728d1f ("EDAC/ghes: Scan the system once on driver init")
Co-developed-by: Robert Richter <rric@kernel.org>
Signed-off-by: Robert Richter <rric@kernel.org>
Signed-off-by: Toshi Kani <toshi.kani@hpe.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Robert Elliott <elliott@hpe.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20220719220124.760359-1-toshi.kani@hpe.com
|
|
New subflows are created within the kernel using O_NONBLOCK, so
EINPROGRESS is the expected return value from kernel_connect().
__mptcp_subflow_connect() has the correct logic to consider EINPROGRESS
to be a successful case, but it has also used that error code as its
return value.
Before v5.19 this was benign: all the callers ignored the return
value. Starting in v5.19 there is a MPTCP_PM_CMD_SUBFLOW_CREATE generic
netlink command that does use the return value, so the EINPROGRESS gets
propagated to userspace.
Make __mptcp_subflow_connect() always return 0 on success instead.
Fixes: ec3edaa7ca6c ("mptcp: Add handling of outgoing MP_JOIN requests")
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Link: https://lore.kernel.org/r/20220725205231.87529-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Florian Westphal says:
====================
netfilter updates for net
Three late fixes for netfilter:
1) If nf_queue user requests packet truncation below size of l3 header,
we corrupt the skb, then crash. Reject such requests.
2) add cond_resched() calls when doing cycle detection in the
nf_tables graph. This avoids softlockup warning with certain
rulesets.
3) Reject rulesets that use nftables 'queue' expression in family/chain
combinations other than those that are supported. Currently the ruleset
will load, but when userspace attempts to reinject you get WARN splat +
packet drops.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nft_queue: only allow supported familes and hooks
netfilter: nf_tables: add rescheduling points during loop detection walks
netfilter: nf_queue: do not allow packet truncation below transport header offset
====================
Link: https://lore.kernel.org/r/20220726192056.13497-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth
Luiz Augusto von Dentz says:
====================
bluetooth pull request for net:
- Fix early wakeup after suspend
- Fix double free on error
- Fix use-after-free on l2cap_chan_put
* tag 'for-net-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
Bluetooth: L2CAP: Fix use-after-free caused by l2cap_chan_put
Bluetooth: Always set event mask on suspend
Bluetooth: mgmt: Fix double free on error path
====================
Link: https://lore.kernel.org/r/20220726221328.423714-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"Thirteen hotfixes.
Eight are cc:stable and the remainder are for post-5.18 issues or are
too minor to warrant backporting"
* tag 'mm-hotfixes-stable-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mailmap: update Gao Xiang's email addresses
userfaultfd: provide properly masked address for huge-pages
Revert "ocfs2: mount shared volume without ha stack"
hugetlb: fix memoryleak in hugetlb_mcopy_atomic_pte
fs: sendfile handles O_NONBLOCK of out_fd
ntfs: fix use-after-free in ntfs_ucsncmp()
secretmem: fix unhandled fault in truncate
mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range()
mm: fix missing wake-up event for FSDAX pages
mm: fix page leak with multiple threads mapping the same page
mailmap: update Seth Forshee's email address
tmpfs: fix the issue that the mount and remount results are inconsistent.
mm: kfence: apply kmemleak_ignore_phys on early allocated pool
|
|
If a device management command completion happens after
wait_for_completion_timeout() times out and before ufshcd_clear_cmds() is
called, then the completion code may crash on the complete() call in
__ufshcd_transfer_req_compl().
Fix the following crash:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
Call trace:
complete+0x64/0x178
__ufshcd_transfer_req_compl+0x30c/0x9c0
ufshcd_poll+0xf0/0x208
ufshcd_sl_intr+0xb8/0xf0
ufshcd_intr+0x168/0x2f4
__handle_irq_event_percpu+0xa0/0x30c
handle_irq_event+0x84/0x178
handle_fasteoi_irq+0x150/0x2e8
__handle_domain_irq+0x114/0x1e4
gic_handle_irq.31846+0x58/0x300
el1_irq+0xe4/0x1c0
efi_header_end+0x110/0x680
__irq_exit_rcu+0x108/0x124
__handle_domain_irq+0x118/0x1e4
gic_handle_irq.31846+0x58/0x300
el1_irq+0xe4/0x1c0
cpuidle_enter_state+0x3ac/0x8c4
do_idle+0x2fc/0x55c
cpu_startup_entry+0x84/0x90
kernel_init+0x0/0x310
start_kernel+0x0/0x608
start_kernel+0x4ec/0x608
Link: https://lore.kernel.org/r/20220720170228.1598842-1-bvanassche@acm.org
Fixes: 5a0b0cb9bee7 ("[SCSI] ufs: Add support for sending NOP OUT UPIU")
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Avri Altman <avri.altman@wdc.com>
Cc: Bean Huo <beanhuo@micron.com>
Cc: Stanley Chu <stanley.chu@mediatek.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
As explained in SG_IO howto[1]:
"If iovec_count is non-zero then 'dxfer_len' should be equal to the sum of
iov_len lengths. If not, the minimum of the two is the transfer length."
When iovec_count is non-zero and dxfer_len is zero, the sg_io() just
genarated a null bio, and finally caused a warning below. To fix it, skip
generating a bio for this request if dxfer_len is zero.
[1] https://tldp.org/HOWTO/SCSI-Generic-HOWTO/x198.html
WARNING: CPU: 2 PID: 3643 at drivers/scsi/scsi_lib.c:1032 scsi_alloc_sgtables+0xc7d/0xf70 drivers/scsi/scsi_lib.c:1032
Modules linked in:
CPU: 2 PID: 3643 Comm: syz-executor397 Not tainted
5.17.0-rc3-syzkaller-00316-gb81b1829e7e3 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-204/01/2014
RIP: 0010:scsi_alloc_sgtables+0xc7d/0xf70 drivers/scsi/scsi_lib.c:1032
Code: e7 fc 31 ff 44 89 f6 e8 c1 4e e7 fc 45 85 f6 0f 84 1a f5 ff ff e8
93 4c e7 fc 83 c5 01 0f b7 ed e9 0f f5 ff ff e8 83 4c e7 fc <0f> 0b 41
bc 0a 00 00 00 e9 2b fb ff ff 41 bc 09 00 00 00 e9 20 fb
RSP: 0018:ffffc90000d07558 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff88801bfc96a0 RCX: 0000000000000000
RDX: ffff88801c876000 RSI: ffffffff849060bd RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff849055b9 R11: 0000000000000000 R12: ffff888012b8c000
R13: ffff88801bfc9580 R14: 0000000000000000 R15: ffff88801432c000
FS: 00007effdec8e700(0000) GS:ffff88802cc00000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007effdec6d718 CR3: 00000000206d6000 CR4: 0000000000150ee0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
scsi_setup_scsi_cmnd drivers/scsi/scsi_lib.c:1219 [inline]
scsi_prepare_cmd drivers/scsi/scsi_lib.c:1614 [inline]
scsi_queue_rq+0x283e/0x3630 drivers/scsi/scsi_lib.c:1730
blk_mq_dispatch_rq_list+0x6ea/0x22e0 block/blk-mq.c:1851
__blk_mq_sched_dispatch_requests+0x20b/0x410 block/blk-mq-sched.c:299
blk_mq_sched_dispatch_requests+0xfb/0x180 block/blk-mq-sched.c:332
__blk_mq_run_hw_queue+0xf9/0x350 block/blk-mq.c:1968
__blk_mq_delay_run_hw_queue+0x5b6/0x6c0 block/blk-mq.c:2045
blk_mq_run_hw_queue+0x30f/0x480 block/blk-mq.c:2096
blk_mq_sched_insert_request+0x340/0x440 block/blk-mq-sched.c:451
blk_execute_rq+0xcc/0x340 block/blk-mq.c:1231
sg_io+0x67c/0x1210 drivers/scsi/scsi_ioctl.c:485
scsi_ioctl_sg_io drivers/scsi/scsi_ioctl.c:866 [inline]
scsi_ioctl+0xa66/0x1560 drivers/scsi/scsi_ioctl.c:921
sd_ioctl+0x199/0x2a0 drivers/scsi/sd.c:1576
blkdev_ioctl+0x37a/0x800 block/ioctl.c:588
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:874 [inline]
__se_sys_ioctl fs/ioctl.c:860 [inline]
__x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7effdecdc5d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 81 14 00 00 90 48 89 f8 48 89
f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007effdec8e2f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007effded664c0 RCX: 00007effdecdc5d9
RDX: 0000000020002300 RSI: 0000000000002285 RDI: 0000000000000004
RBP: 00007effded34034 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000003
R13: 00007effded34054 R14: 2f30656c69662f2e R15: 00007effded664c8
Link: https://lore.kernel.org/r/20220720025120.3226770-1-yanaijie@huawei.com
Fixes: 25636e282fe9 ("block: fix SG_IO vector request data length handling")
Reported-by: syzbot+d44b35ecfb807e5af0b5@syzkaller.appspotmail.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
In ufshcd_populate_vreg(), we should hold the reference returned by
of_parse_phandle() and then use it to call of_node_put() for refcount
balance.
Link: https://lore.kernel.org/r/20220719071529.1081166-1-windhl@126.com
Fixes: aa4976130934 ("ufs: Add regulator enable support")
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Liang He <windhl@126.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
During system shutdown or reboot, mpt3sas will reset the firmware back to
ready state. However, the driver leaves running a watchdog work item
intended to keep the firmware in operational state. This causes a second,
unneeded reset on shutdown and moves the firmware back to operational
instead of in ready state as intended. And if the mpt3sas_fwfault_debug
module parameter is set, this extra reset also panics the system.
mpt3sas's scsih_shutdown needs to stop the watchdog before resetting the
firmware back to ready state.
Link: https://lore.kernel.org/r/20220722142448.6289-1-djeffery@redhat.com
Fixes: fae21608c31c ("scsi: mpt3sas: Transition IOC to Ready state during shutdown")
Tested-by: Laurence Oberman <loberman@redhat.com>
Acked-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: David Jeffery <djeffery@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
I've been in Alibaba Cloud for more than one year, mainly to address
cloud-native challenges (such as high-performance container images) for
open source communities.
Update my email addresses on behalf of my current employer (Alibaba Cloud)
to support all my (team) work in this area. Also add an outdated
@redhat.com address of me.
Link: https://lkml.kernel.org/r/20220719154246.62970-1-xiang@kernel.org
Signed-off-by: Gao Xiang <xiang@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Commit 824ddc601adc ("userfaultfd: provide unmasked address on
page-fault") was introduced to fix an old bug, in which the offset in the
address of a page-fault was masked. Concerns were raised - although were
never backed by actual code - that some userspace code might break because
the bug has been around for quite a while. To address these concerns a
new flag was introduced, and only when this flag is set by the user,
userfaultfd provides the exact address of the page-fault.
The commit however had a bug, and if the flag is unset, the offset was
always masked based on a base-page granularity. Yet, for huge-pages, the
behavior prior to the commit was that the address is masked to the
huge-page granulrity.
While there are no reports on real breakage, fix this issue. If the flag
is unset, use the address with the masking that was done before.
Link: https://lkml.kernel.org/r/20220711165906.2682-1-namit@vmware.com
Fixes: 824ddc601adc ("userfaultfd: provide unmasked address on page-fault")
Signed-off-by: Nadav Amit <namit@vmware.com>
Reported-by: James Houghton <jthoughton@google.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: James Houghton <jthoughton@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
This fixes the following trace which is caused by hci_rx_work starting up
*after* the final channel reference has been put() during sock_close() but
*before* the references to the channel have been destroyed, so instead
the code now rely on kref_get_unless_zero/l2cap_chan_hold_unless_zero to
prevent referencing a channel that is about to be destroyed.
refcount_t: increment on 0; use-after-free.
BUG: KASAN: use-after-free in refcount_dec_and_test+0x20/0xd0
Read of size 4 at addr ffffffc114f5bf18 by task kworker/u17:14/705
CPU: 4 PID: 705 Comm: kworker/u17:14 Tainted: G S W
4.14.234-00003-g1fb6d0bd49a4-dirty #28
Hardware name: Qualcomm Technologies, Inc. SM8150 V2 PM8150
Google Inc. MSM sm8150 Flame DVT (DT)
Workqueue: hci0 hci_rx_work
Call trace:
dump_backtrace+0x0/0x378
show_stack+0x20/0x2c
dump_stack+0x124/0x148
print_address_description+0x80/0x2e8
__kasan_report+0x168/0x188
kasan_report+0x10/0x18
__asan_load4+0x84/0x8c
refcount_dec_and_test+0x20/0xd0
l2cap_chan_put+0x48/0x12c
l2cap_recv_frame+0x4770/0x6550
l2cap_recv_acldata+0x44c/0x7a4
hci_acldata_packet+0x100/0x188
hci_rx_work+0x178/0x23c
process_one_work+0x35c/0x95c
worker_thread+0x4cc/0x960
kthread+0x1a8/0x1c4
ret_from_fork+0x10/0x18
Cc: stable@kernel.org
Reported-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Tested-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
When suspending, always set the event mask once disconnects are
successful. Otherwise, if wakeup is disallowed, the event mask is not
set before suspend continues and can result in an early wakeup.
Fixes: 182ee45da083 ("Bluetooth: hci_sync: Rework hci_suspend_notifier")
Cc: stable@vger.kernel.org
Signed-off-by: Abhishek Pandit-Subedi <abhishekpandit@chromium.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
Don't call mgmt_pending_remove() twice (double free).
Fixes: 6b88eff43704 ("Bluetooth: hci_sync: Refactor remove Adv Monitor")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
|
|
lockdep complains use of uninitialized spinlock at ieee80211_do_stop() [1],
for commit f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif
that is being stopped") guards clear_bit() using fq.lock even before
fq_init() from ieee80211_txq_setup_flows() initializes this spinlock.
According to discussion [2], Toke was not happy with expanding usage of
fq.lock. Since __ieee80211_wake_txqs() is called under RCU read lock, we
can instead use synchronize_rcu() for flushing ieee80211_wake_txqs().
Link: https://syzkaller.appspot.com/bug?extid=eceab52db7c4b961e9d6 [1]
Link: https://lkml.kernel.org/r/874k0zowh2.fsf@toke.dk [2]
Reported-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Fixes: f856373e2f31ffd3 ("wifi: mac80211: do not wake queues on a vif that is being stopped")
Tested-by: syzbot <syzbot+eceab52db7c4b961e9d6@syzkaller.appspotmail.com>
Acked-by: Toke Høiland-Jørgensen <toke@kernel.org>
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://lore.kernel.org/r/9cc9b81d-75a3-3925-b612-9d0ad3cab82b@I-love.SAKURA.ne.jp
[ pick up commit 3598cb6e1862 ("wifi: mac80211: do not abuse fq.lock in ieee80211_do_stop()") from -next]
Link: https://lore.kernel.org/all/87o7xcq6qt.fsf@kernel.org/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Currently loopback test is failiing due to the error returned from
ice_vsi_vlan_setup(). Skip calling it when preparing loopback VSI.
Fixes: 0e674aeb0b77 ("ice: Add handler for ethtool selftest")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Tx side sets EOP and RS bits on descriptors to indicate that a
particular descriptor is the last one and needs to generate an irq when
it was sent. These bits should not be checked on completion path
regardless whether it's the Tx or the Rx. DD bit serves this purpose and
it indicates that a particular descriptor is either for Rx or was
successfully Txed. EOF is also set as loopback test does not xmit
fragmented frames.
Look at (DD | EOF) bits setting in ice_lbtest_receive_frames() instead
of EOP and RS pair.
Fixes: 0e674aeb0b77 ("ice: Add handler for ethtool selftest")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Tested-by: George Kuruvinakunnel <george.kuruvinakunnel@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
The driver currently does not allow two VSIs in the same PF domain
to have the same unicast MAC address. This is incorrect in the sense
that a policy decision is being made in the driver when it must be
left to the user. This approach was causing issues when rebooting
the system with VFs spawned not being able to change their MAC addresses.
Such errors were present in dmesg:
[ 7921.068237] ice 0000:b6:00.2 ens2f2: Unicast MAC 6a:0d:e4:70:ca:d1 already
exists on this PF. Preventing setting VF 7 unicast MAC address to 6a:0d:e4:70:ca:d1
Fix that by removing this restriction. Doing this also allows
us to remove some additional code that's checking if a unicast MAC
filter already exists.
Fixes: 47ebc7b02485 ("ice: Check if unicast MAC exists before setting VF MAC")
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Signed-off-by: Sylwester Dziedziuch <sylwesterx.dziedziuch@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Fix checksum offload on VXLAN tunnels.
In case, when mpls protocol is not used, set l4 header to transport
header of skb. This fixes case, when user tries to offload checksums
of VXLAN tunneled traffic.
Steps for reproduction (requires link partner with tunnels):
ip l s enp130s0f0 up
ip a f enp130s0f0
ip a a 10.10.110.2/24 dev enp130s0f0
ip l s enp130s0f0 mtu 1600
ip link add vxlan12_sut type vxlan id 12 group 238.168.100.100 dev enp130s0f0 dstport 4789
ip l s vxlan12_sut up
ip a a 20.10.110.2/24 dev vxlan12_sut
iperf3 -c 20.10.110.1 #should connect
Offload params: td_offset, cd_tunnel_params were
corrupted, due to l4 header pointing wrong address. NIC would then drop
those packets internally, due to incorrect TX descriptor data,
which increased GLV_TEPC register.
Fixes: 69e66c04c672 ("ice: Add mpls+tso support")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Legacy VLAN implementation allows for untrusted VF to have 8 VLAN
filters, not counting VLAN 0 filters. Current VLAN_V2 implementation
lowers available filters for VF, by counting in VLAN 0 filter for both
TPIDs.
Fix this by counting only non zero VLAN filters.
Without this patch, untrusted VF would not be able to access 8 VLAN
filters.
Fixes: cc71de8fa133 ("ice: Add support for VIRTCHNL_VF_OFFLOAD_VLAN_V2")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Marek Szlosek <marek.szlosek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
|
|
Icelake has a slots event, on my Skylakex I have CPU events in sysfs of
topdown-slots-issued and topdown-total-slots.
Legacy event parsing would try to use '-' to separate parts of an event
and so perf_pmu__parse_init sets 'slots' to be a
PMU_EVENT_SYMBOL_SUFFIX2.
As such parsing the slots event for a fake PMU fails as a
PMU_EVENT_SYMBOL_SUFFIX2 isn't made into the PE_PMU_EVENT_FAKE token.
Resolve this issue by test initializing the PMU parsing state before
every parse. This must be done every parse as the state is removes after
each parse_events.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220725223633.2301737-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update JSON core/uncore events for haswellx to perf.
Based on HSX JSON list v24:
https://download.01.org/perfmon/HSX
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220614145019.2177071-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update JSON core/uncore events for broadwellx to perf.
Based on BDX JSON list v19:
https://download.01.org/perfmon/BDX
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220614145019.2177071-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
More uncore events are added in the converter tool:
https://github.com/intel/event-converter-for-linux-perf
Keep both alias and the original name for the events, in case someone
already used the alias in their script.
Generate the perf events based on Snowridgex(SNR) event list v1.20:
https://download.01.org/perfmon/SNR/
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220609094222.2030167-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Tremontx was an old name for Snowridgex, so rename Tremontx to Snowridgex.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220609094222.2030167-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update JSON event list for Sapphirerapids to perf.
Based on JSON list v1.02:
https://download.01.org/perfmon/SPR/
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220607092749.1976878-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Update JSON event list for Alderlake to perf.
It is a hybrid event list for both Atom and Core.
Based on JSON list v1.11:
https://download.01.org/perfmon/ADL/
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220607092749.1976878-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
There is a spelling mistake in a pr_err message. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20220721124528.20997-1-colin.i.king@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Implements workqueue trace bpf function.
Test cases:
# perf kwork -k workqueue lat -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(w)addrconf_verify_work | 0002 | 5.856 ms | 1 | 5.856 ms | 111994.634313 s | 111994.640169 s |
(w)vmstat_update | 0001 | 1.247 ms | 1 | 1.247 ms | 111996.462651 s | 111996.463899 s |
(w)neigh_periodic_work | 0001 | 1.183 ms | 1 | 1.183 ms | 111996.462789 s | 111996.463973 s |
(w)neigh_managed_work | 0001 | 0.989 ms | 2 | 1.635 ms | 111996.462820 s | 111996.464455 s |
(w)wb_workfn | 0000 | 0.667 ms | 1 | 0.667 ms | 111996.384273 s | 111996.384940 s |
(w)bpf_prog_free_deferred | 0001 | 0.495 ms | 1 | 0.495 ms | 111986.314201 s | 111986.314696 s |
(w)mix_interrupt_randomness | 0002 | 0.421 ms | 6 | 0.749 ms | 111995.927750 s | 111995.928499 s |
(w)vmstat_shepherd | 0000 | 0.374 ms | 2 | 0.385 ms | 111991.265242 s | 111991.265627 s |
(w)e1000_watchdog | 0002 | 0.356 ms | 5 | 0.390 ms | 111994.528380 s | 111994.528770 s |
(w)vmstat_update | 0000 | 0.231 ms | 2 | 0.365 ms | 111996.384407 s | 111996.384772 s |
(w)flush_to_ldisc | 0006 | 0.165 ms | 1 | 0.165 ms | 111995.930606 s | 111995.930771 s |
(w)flush_to_ldisc | 0000 | 0.094 ms | 2 | 0.095 ms | 111996.460453 s | 111996.460548 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork -k workqueue rep -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
(w)e1000_watchdog | 0002 | 0.627 ms | 2 | 0.324 ms | 112002.720665 s | 112002.720989 s |
(w)flush_to_ldisc | 0007 | 0.598 ms | 2 | 0.534 ms | 112000.875226 s | 112000.875761 s |
(w)wq_barrier_func | 0007 | 0.492 ms | 1 | 0.492 ms | 112000.876981 s | 112000.877473 s |
(w)flush_to_ldisc | 0007 | 0.281 ms | 1 | 0.281 ms | 112005.826882 s | 112005.827163 s |
(w)mix_interrupt_randomness | 0002 | 0.229 ms | 3 | 0.102 ms | 112005.825671 s | 112005.825774 s |
(w)vmstat_shepherd | 0000 | 0.202 ms | 1 | 0.202 ms | 112001.504511 s | 112001.504713 s |
(w)bpf_prog_free_deferred | 0001 | 0.181 ms | 1 | 0.181 ms | 112000.883251 s | 112000.883432 s |
(w)wb_workfn | 0007 | 0.130 ms | 1 | 0.130 ms | 112001.505195 s | 112001.505325 s |
(w)vmstat_update | 0000 | 0.053 ms | 1 | 0.053 ms | 112001.504763 s | 112001.504815 s |
--------------------------------------------------------------------------------------------------------------------------------
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-18-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Implements softirq trace bpf function.
Test cases:
Trace softirq latency without filter:
# perf kwork -k softirq lat -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)RCU:9 | 0005 | 0.281 ms | 3 | 0.338 ms | 111295.752222 s | 111295.752560 s |
(s)RCU:9 | 0002 | 0.262 ms | 24 | 1.400 ms | 111301.335986 s | 111301.337386 s |
(s)SCHED:7 | 0005 | 0.177 ms | 14 | 0.212 ms | 111295.752270 s | 111295.752481 s |
(s)RCU:9 | 0007 | 0.161 ms | 47 | 2.022 ms | 111295.402159 s | 111295.404181 s |
(s)NET_RX:3 | 0003 | 0.149 ms | 12 | 1.261 ms | 111301.192964 s | 111301.194225 s |
(s)TIMER:1 | 0001 | 0.105 ms | 9 | 0.198 ms | 111301.180191 s | 111301.180389 s |
... <SNIP> ...
(s)NET_RX:3 | 0002 | 0.098 ms | 6 | 0.124 ms | 111295.403760 s | 111295.403884 s |
(s)SCHED:7 | 0001 | 0.093 ms | 19 | 0.242 ms | 111301.180256 s | 111301.180498 s |
(s)SCHED:7 | 0007 | 0.078 ms | 15 | 0.188 ms | 111300.064226 s | 111300.064415 s |
(s)SCHED:7 | 0004 | 0.077 ms | 11 | 0.213 ms | 111301.361759 s | 111301.361973 s |
(s)SCHED:7 | 0000 | 0.063 ms | 33 | 0.805 ms | 111295.401811 s | 111295.402616 s |
(s)SCHED:7 | 0003 | 0.063 ms | 14 | 0.085 ms | 111301.192255 s | 111301.192340 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace softirq latency with cpu filter:
# perf kwork -k softirq lat -b -C 1
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)RCU:9 | 0001 | 0.178 ms | 5 | 0.572 ms | 111435.534135 s | 111435.534707 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace softirq latency with name filter:
# perf kwork -k softirq lat -b -n SCHED
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)SCHED:7 | 0001 | 0.295 ms | 15 | 2.183 ms | 111452.534950 s | 111452.537133 s |
(s)SCHED:7 | 0002 | 0.215 ms | 10 | 0.315 ms | 111460.000238 s | 111460.000553 s |
(s)SCHED:7 | 0005 | 0.190 ms | 29 | 0.338 ms | 111457.032538 s | 111457.032876 s |
(s)SCHED:7 | 0003 | 0.097 ms | 10 | 0.319 ms | 111452.434351 s | 111452.434670 s |
(s)SCHED:7 | 0006 | 0.089 ms | 1 | 0.089 ms | 111450.737450 s | 111450.737539 s |
(s)SCHED:7 | 0007 | 0.085 ms | 17 | 0.169 ms | 111452.471333 s | 111452.471502 s |
(s)SCHED:7 | 0004 | 0.071 ms | 15 | 0.221 ms | 111452.535252 s | 111452.535473 s |
(s)SCHED:7 | 0000 | 0.044 ms | 32 | 0.130 ms | 111460.001982 s | 111460.002112 s |
--------------------------------------------------------------------------------------------------------------------------------
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-17-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Implements irq trace bpf function.
Test cases:
Trace irq without filter:
# perf kwork -k irq rep -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 31.026 ms | 285 | 1.493 ms | 110326.049963 s | 110326.051456 s |
eth0:10 | 0002 | 7.875 ms | 96 | 1.429 ms | 110313.916835 s | 110313.918264 s |
ata_piix:14 | 0002 | 2.510 ms | 28 | 0.396 ms | 110331.367987 s | 110331.368383 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace irq with cpu filter:
# perf kwork -k irq rep -b -C 0
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 34.288 ms | 282 | 2.061 ms | 110358.078968 s | 110358.081029 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace irq with name filter:
# perf kwork -k irq rep -b -n eth0
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
eth0:10 | 0002 | 2.184 ms | 21 | 0.572 ms | 110386.541699 s | 110386.542271 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace irq with summary:
# perf kwork -k irq rep -b -S
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 42.923 ms | 285 | 1.181 ms | 110418.128867 s | 110418.130049 s |
eth0:10 | 0002 | 2.085 ms | 20 | 0.668 ms | 110416.002935 s | 110416.003603 s |
ata_piix:14 | 0002 | 0.970 ms | 4 | 0.656 ms | 110424.034482 s | 110424.035138 s |
--------------------------------------------------------------------------------------------------------------------------------
Total count : 309
Total runtime (msec) : 45.977 (0.003% load average)
Total time span (msec) : 17017.655
--------------------------------------------------------------------------------------------------------------------------------
Committer testing:
# perf kwork -k irq rep -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
nvme0q20:145 | 0019 | 0.570 ms | 28 | 0.064 ms | 26966.635102 s | 26966.635167 s |
amdgpu:162 | 0002 | 0.568 ms | 29 | 0.068 ms | 26966.644346 s | 26966.644414 s |
nvme0q4:129 | 0003 | 0.565 ms | 31 | 0.037 ms | 26966.614830 s | 26966.614866 s |
nvme0q16:141 | 0015 | 0.205 ms | 66 | 0.012 ms | 26967.145161 s | 26967.145174 s |
nvme0q29:154 | 0028 | 0.154 ms | 44 | 0.014 ms | 26967.078970 s | 26967.078984 s |
nvme0q10:135 | 0009 | 0.134 ms | 43 | 0.011 ms | 26967.132093 s | 26967.132104 s |
nvme0q2:127 | 0001 | 0.132 ms | 26 | 0.011 ms | 26966.883584 s | 26966.883595 s |
nvme0q25:150 | 0024 | 0.127 ms | 32 | 0.014 ms | 26966.631419 s | 26966.631433 s |
nvme0q14:139 | 0013 | 0.110 ms | 21 | 0.017 ms | 26966.760843 s | 26966.760861 s |
nvme0q30:155 | 0029 | 0.102 ms | 30 | 0.022 ms | 26966.677171 s | 26966.677193 s |
nvme0q13:138 | 0012 | 0.088 ms | 20 | 0.015 ms | 26966.738733 s | 26966.738748 s |
nvme0q6:131 | 0005 | 0.087 ms | 13 | 0.020 ms | 26966.648445 s | 26966.648465 s |
nvme0q28:153 | 0027 | 0.066 ms | 12 | 0.015 ms | 26966.771431 s | 26966.771447 s |
nvme0q26:151 | 0025 | 0.060 ms | 13 | 0.012 ms | 26966.704266 s | 26966.704278 s |
nvme0q21:146 | 0020 | 0.054 ms | 20 | 0.011 ms | 26967.322082 s | 26967.322094 s |
nvme0q1:126 | 0000 | 0.046 ms | 11 | 0.013 ms | 26966.859754 s | 26966.859767 s |
nvme0q17:142 | 0016 | 0.046 ms | 10 | 0.011 ms | 26967.114513 s | 26967.114524 s |
xhci_hcd:74 | 0015 | 0.041 ms | 3 | 0.016 ms | 26967.086004 s | 26967.086020 s |
nvme0q8:133 | 0007 | 0.039 ms | 12 | 0.008 ms | 26966.712056 s | 26966.712063 s |
nvme0q32:157 | 0031 | 0.036 ms | 10 | 0.014 ms | 26966.627054 s | 26966.627068 s |
nvme0q9:134 | 0008 | 0.036 ms | 11 | 0.011 ms | 26967.258452 s | 26967.258462 s |
nvme0q7:132 | 0006 | 0.024 ms | 3 | 0.014 ms | 26966.767404 s | 26966.767418 s |
nvme0q11:136 | 0010 | 0.023 ms | 5 | 0.006 ms | 26966.935455 s | 26966.935461 s |
nvme0q31:156 | 0030 | 0.018 ms | 5 | 0.006 ms | 26966.627517 s | 26966.627524 s |
nvme0q12:137 | 0011 | 0.015 ms | 2 | 0.014 ms | 26966.799588 s | 26966.799602 s |
enp5s0-rx-0:164 | 0006 | 0.009 ms | 2 | 0.005 ms | 26966.742024 s | 26966.742028 s |
enp5s0-rx-1:165 | 0007 | 0.006 ms | 2 | 0.004 ms | 26966.939486 s | 26966.939490 s |
enp5s0-tx-0:166 | 0008 | 0.005 ms | 1 | 0.005 ms | 26966.939484 s | 26966.939489 s |
enp5s0-tx-1:167 | 0009 | 0.005 ms | 1 | 0.005 ms | 26966.939484 s | 26966.939489 s |
--------------------------------------------------------------------------------------------------------------------------------
#t
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-16-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'perf record' generates perf.data, which generates extra interrupts
for hard disk, amount of data to be collected increases with time.
Using eBPF trace can process the data in kernel, which solves the
preceding two problems.
Add -b/--use-bpf option for latency and report to support
tracing kwork events using eBPF:
1. Create bpf prog and attach to tracepoints,
2. Start tracing after command is entered,
3. After user hit "ctrl+c", stop tracing and report,
4. Support CPU and name filtering.
This commit implements the framework code and
does not add specific event support.
Test cases:
# perf kwork rep -h
Usage: perf kwork report [<options>]
-b, --use-bpf Use BPF to measure kwork runtime
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): runtime, max, count
-S, --with-summary Show summary with statistics
--time <str> Time span for analysis (start,stop)
# perf kwork lat -h
Usage: perf kwork latency [<options>]
-b, --use-bpf Use BPF to measure kwork latency
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): avg, max, count
--time <str> Time span for analysis (start,stop)
# perf kwork lat -b
Unsupported bpf trace class irq
# perf kwork rep -b
Unsupported bpf trace class irq
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-15-yangjihong1@huawei.com
[ Simplify work_findnew() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Implements framework of perf kwork timehist,
to provide an analysis of kernel work events.
Test cases:
# perf kwork tim
Runtime start Runtime end Cpu Kwork name Runtime Delaytime
(TYPE)NAME:NUM (msec) (msec)
----------------- ----------------- ------ ------------------------------ ---------- ----------
91576.060290 91576.060344 [0000] (s)RCU:9 0.055 0.111
91576.061470 91576.061547 [0000] (s)SCHED:7 0.077 0.073
91576.062604 91576.062697 [0001] (s)RCU:9 0.094 0.409
91576.064443 91576.064517 [0002] (s)RCU:9 0.074 0.114
91576.065144 91576.065211 [0000] (s)SCHED:7 0.067 0.058
91576.066564 91576.066609 [0003] (s)RCU:9 0.045 0.110
91576.068495 91576.068559 [0000] (s)SCHED:7 0.064 0.059
91576.068900 91576.068996 [0004] (s)RCU:9 0.096 0.726
91576.069364 91576.069420 [0002] (s)RCU:9 0.056 0.082
91576.069649 91576.069701 [0004] (s)RCU:9 0.052 0.111
91576.070147 91576.070206 [0000] (s)SCHED:7 0.060 0.057
91576.073147 91576.073202 [0000] (s)SCHED:7 0.054 0.060
<SNIP>
# perf kwork tim --max-stack 2 -g
Runtime start Runtime end Cpu Kwork name Runtime Delaytime
(TYPE)NAME:NUM (msec) (msec)
----------------- ----------------- ------ ------------------------------ ---------- ----------
91576.060290 91576.060344 [0000] (s)RCU:9 0.055 0.111 irq_exit_rcu <- sysvec_apic_timer_interrupt
91576.061470 91576.061547 [0000] (s)SCHED:7 0.077 0.073 irq_exit_rcu <- sysvec_call_function_single
91576.062604 91576.062697 [0001] (s)RCU:9 0.094 0.409 irq_exit_rcu <- sysvec_apic_timer_interrupt
91576.064443 91576.064517 [0002] (s)RCU:9 0.074 0.114 irq_exit_rcu <- sysvec_apic_timer_interrupt
91576.065144 91576.065211 [0000] (s)SCHED:7 0.067 0.058 irq_exit_rcu <- sysvec_call_function_single
91576.066564 91576.066609 [0003] (s)RCU:9 0.045 0.110 irq_exit_rcu <- sysvec_apic_timer_interrupt
91576.068495 91576.068559 [0000] (s)SCHED:7 0.064 0.059 irq_exit_rcu <- sysvec_call_function_single
91576.068900 91576.068996 [0004] (s)RCU:9 0.096 0.726 irq_exit_rcu <- sysvec_apic_timer_interrupt
91576.069364 91576.069420 [0002] (s)RCU:9 0.056 0.082 irq_exit_rcu <- sysvec_apic_timer_interrupt
91576.069649 91576.069701 [0004] (s)RCU:9 0.052 0.111 irq_exit_rcu <- sysvec_apic_timer_interrupt
<SNIP>
Committer testing:
# perf kwork -k workqueue timehist | head -40
Runtime start Runtime end Cpu Kwork name Runtime Delaytime
(TYPE)NAME:NUM (msec) (msec)
----------------- ----------------- ------ ------------------------------ ---------- ----------
26520.211825 26520.211832 [0019] (w)free_work 0.007 0.004
26520.212929 26520.212934 [0020] (w)free_work 0.005 0.004
26520.213226 26520.213228 [0014] (w)kfree_rcu_work 0.002 0.004
26520.214057 26520.214061 [0021] (w)free_work 0.004 0.004
26520.221239 26520.221241 [0007] (w)kfree_rcu_work 0.002 0.009
26520.223232 26520.223238 [0013] (w)psi_avgs_work 0.005 0.006
26520.230057 26520.230060 [0020] (w)free_work 0.003 0.003
26520.270428 26520.270434 [0015] (w)free_work 0.006 0.004
26520.270546 26520.270550 [0014] (w)free_work 0.004 0.003
26520.281626 26520.281629 [0015] (w)free_work 0.003 0.002
26520.287225 26520.287230 [0012] (w)psi_avgs_work 0.005 0.008
26520.287231 26520.287235 [0001] (w)psi_avgs_work 0.004 0.011
26520.287236 26520.287239 [0001] (w)psi_avgs_work 0.003 0.012
26520.329488 26520.329492 [0024] (w)free_work 0.004 0.004
26520.330600 26520.330605 [0007] (w)free_work 0.005 0.004
26520.334218 26520.334218 [0007] (w)kfree_rcu_monitor 0.001 0.002
26520.335220 26520.335221 [0005] (w)kfree_rcu_monitor 0.001 0.004
26520.343980 26520.343985 [0007] (w)free_work 0.005 0.002
26520.345093 26520.345097 [0006] (w)free_work 0.004 0.003
26520.351233 26520.351238 [0027] (w)psi_avgs_work 0.005 0.008
26520.353228 26520.353229 [0007] (w)kfree_rcu_work 0.001 0.002
26520.353229 26520.353231 [0005] (w)kfree_rcu_work 0.001 0.006
26520.382381 26520.382383 [0006] (w)free_work 0.003 0.002
26520.386547 26520.386548 [0006] (w)free_work 0.002 0.001
26520.391243 26520.391245 [0015] (w)console_callback 0.002 0.016
26520.415369 26520.415621 [0027] (w)btrfs_work_helper 0.252
26520.415351 26520.416174 [0002] (w)btrfs_work_helper 0.823 0.037
26520.415343 26520.416304 [0031] (w)btrfs_work_helper 0.961
26520.415335 26520.417078 [0001] (w)btrfs_work_helper 1.743
26520.415250 26520.417564 [0002] (w)wb_workfn 2.314
26520.424777 26520.424787 [0002] (w)btrfs_work_helper 0.010
26520.424788 26520.424798 [0002] (w)btrfs_work_helper 0.010
26520.424790 26520.424805 [0001] (w)btrfs_work_helper 0.016 0.016
26520.424801 26520.424807 [0002] (w)btrfs_work_helper 0.006
26520.424809 26520.424831 [0002] (w)btrfs_work_helper 0.022 0.030
26520.424824 26520.424835 [0027] (w)btrfs_work_helper 0.011
26520.424809 26520.424867 [0001] (w)btrfs_work_helper 0.059 0.032
#
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-14-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|