Age | Commit message (Collapse) | Author |
|
The UMR WQE in the MR re-registration flow requires that
modify_atomic and modify_entity_size capabilities are enabled.
Therefore, check that the these capabilities are present before going to
umr flow and go through slow path if not.
Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-8-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
ODP depends on the several device capabilities, among them is the ability
to send UMR WQEs with that modify atomic and entity size of the MR.
Therefore, only if all conditions to send such a UMR WQE are met then
driver can report that ODP is supported. Use this check of conditions
in all places where driver needs to know about ODP support.
Also, implicit ODP support depends on ability of driver to send UMR WQEs
for an indirect mkey. Therefore, verify that all conditions to do so are
met when reporting support.
Fixes: c8d75a980fab ("IB/mlx5: Respect new UMR capabilities")
Signed-off-by: Moni Shoua <monis@mellanox.com>
Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-7-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Introduce helper function to unify various use_umr checks.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-6-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
task_active_pid_ns() is wrong API to check PID namespace because it
posses some restrictions and return PID namespace where the process
was allocated. It created mismatches with current namespace, which
can be different.
Rewrite whole rdma_is_visible_in_pid_ns() logic to provide reliable
results without any relation to allocated PID namespace.
Fixes: 8be565e65fa9 ("RDMA/nldev: Factor out the PID namespace check")
Fixes: 6a6c306a09b5 ("RDMA/restrack: Make is_visible_in_pid_ns() as an API")
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-4-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
"Auto" configuration mode is called for visible in that PID
namespace and it ensures that all counters and QPs are coexist
in the same namespace and belong to same PID.
Fixes: 99fa331dc862 ("RDMA/counter: Add "auto" configuration mode support")
Reviewed-by: Mark Zhang <markz@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-3-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
If QP is not visible to the pid, then we try to decrease its reference
count and return from the function before the QP pointer is
initialized. This lead to NULL pointer dereference.
Fix it by pass directly the res to the rdma_restract_put as arg instead of
&qp->res.
This fixes below call trace:
[ 5845.110329] BUG: kernel NULL pointer dereference, address:
00000000000000dc
[ 5845.120482] Oops: 0002 [#1] SMP PTI
[ 5845.129119] RIP: 0010:rdma_restrack_put+0x5/0x30 [ib_core]
[ 5845.169450] Call Trace:
[ 5845.170544] rdma_counter_get_qp+0x5c/0x70 [ib_core]
[ 5845.172074] rdma_counter_bind_qpn_alloc+0x6f/0x1a0 [ib_core]
[ 5845.173731] nldev_stat_set_doit+0x314/0x330 [ib_core]
[ 5845.175279] rdma_nl_rcv_msg+0xeb/0x1d0 [ib_core]
[ 5845.176772] ? __kmalloc_node_track_caller+0x20b/0x2b0
[ 5845.178321] rdma_nl_rcv+0xcb/0x120 [ib_core]
[ 5845.179753] netlink_unicast+0x179/0x220
[ 5845.181066] netlink_sendmsg+0x2d8/0x3d0
[ 5845.182338] sock_sendmsg+0x30/0x40
[ 5845.183544] __sys_sendto+0xdc/0x160
[ 5845.184832] ? syscall_trace_enter+0x1f8/0x2e0
[ 5845.186209] ? __audit_syscall_exit+0x1d9/0x280
[ 5845.187584] __x64_sys_sendto+0x24/0x30
[ 5845.188867] do_syscall_64+0x48/0x120
[ 5845.190097] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Fixes: 1bd8e0a9d0fd1 ("RDMA/counter: Allow manual mode configuration support")
Signed-off-by: Ido Kalir <idok@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-2-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In a congested fabric with adaptive routing enabled, traces show that
packets could be delivered out of order. A stale TID RDMA data packet
could lead to TidErr if the TID entries have been released by duplicate
data packets generated from retries, and subsequently erroneously force
the qp into error state in the current implementation.
Since the payload has already been dropped by hardware, the packet can
be simply dropped and it is no longer necessary to put the qp into
error state.
Fixes: 9905bf06e890 ("IB/hfi1: Add functions to receive TID RDMA READ response")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192058.105923.72324.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In a congested fabric with adaptive routing enabled, traces show that
packets could be delivered out of order, which could cause incorrect
processing of stale packets. For stale TID RDMA WRITE DATA packets that
cause KDETH EFLAGS errors, this patch adds additional checks before
processing the packets.
Fixes: d72fe7d5008b ("IB/hfi1: Add a function to receive TID RDMA WRITE DATA packet")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192051.105923.69979.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In a congested fabric with adaptive routing enabled, traces show that
packets could be delivered out of order, which could cause incorrect
processing of stale packets. For stale TID RDMA READ RESP packets that
cause KDETH EFLAGS errors, this patch adds additional checks before
processing the packets.
Fixes: 9905bf06e890 ("IB/hfi1: Add functions to receive TID RDMA READ response")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192045.105923.59813.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
When processing a TID RDMA READ RESP packet that causes KDETH EFLAGS
errors, the packet's IB PSN is checked against qp->s_last_psn and
qp->s_psn without the protection of qp->s_lock, which is not safe.
This patch fixes the issue by acquiring qp->s_lock first.
Fixes: 9905bf06e890 ("IB/hfi1: Add functions to receive TID RDMA READ response")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192039.105923.7852.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In a congested fabric with adaptive routing enabled, traces show that
the sender could receive stale TID RDMA NAK packets that contain newer
KDETH PSNs and older Verbs PSNs. If not dropped, these packets could
cause the incorrect rewinding of the software flows and the incorrect
completion of TID RDMA WRITE requests, and eventually leading to memory
corruption and kernel crash.
The current code drops stale TID RDMA ACK/NAK packets solely based
on KDETH PSNs, which may lead to erroneous processing. This patch
fixes the issue by also checking the Verbs PSN. Addition checks are
added before rewinding the TID RDMA WRITE DATA packets.
Fixes: 9e93e967f7b4 ("IB/hfi1: Add a function to receive TID RDMA ACK packet")
Cc: <stable@vger.kernel.org>
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Link: https://lore.kernel.org/r/20190815192033.105923.44192.stgit@awfm-01.aw.intel.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
In siw_connect() we have an error flow where there is no valid qp
pointer. Make sure we don't try to de-ref in that situation.
Fixes: 6c52fdc244b5 ("rdma/siw: connection management")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190819140257.19319-1-bmt@zurich.ibm.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
When ODP is enabled with IB_ACCESS_HUGETLB then the required pages
should be calculated based on the extent of the MR, which is rounded
to the nearest huge page alignment.
Fixes: d2183c6f1958 ("RDMA/umem: Move page_shift from ib_umem to ib_odp_umem")
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Link: https://lore.kernel.org/r/20190815083834.9245-5-leon@kernel.org
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
We need to check if we have CQEs pending before starting a poll loop,
as those could be the events we will be spinning for (and hence we'll
find none). This can happen if a CQE triggers an error, or if it is
found by eg an IRQ before we get a chance to find it through polling.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
One of the components in LiteON CL1 device has limitations that
can be encountered based upon boundary race conditions using the
nvme bus specific suspend to idle flow.
When this situation occurs the drive doesn't resume properly from
suspend-to-idle.
LiteON has confirmed this problem and fixed in the next firmware
version. As this firmware is already in the field, avoid running
nvme specific suspend to idle flow.
Fixes: d916b1be94b6 ("nvme-pci: use host managed power state for suspend")
Link: http://lists.infradead.org/pipermail/linux-nvme/2019-July/thread.html
Signed-off-by: Mario Limonciello <mario.limonciello@dell.com>
Signed-off-by: Charles Hyde <charles.hyde@dellteam.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Commit 1b1031ca63b2 ("nvme: validate cntlid during controller initialisation")
introduced a validation for controllers with duplicate cntlid that runs
on nvme_init_subsystem(). The problem is that the validation relies on
ctrl->cntlid, and this value is assigned (from id_ctrl value) after the
call for nvme_init_subsystem() in nvme_init_identify() for non-fabrics
scenario. That leads to ctrl->cntlid always being 0 in case we have a
physical set of controllers in the same subsystem.
This patch fixes that by loading the discovered cntlid id_ctrl value into
ctrl->cntlid before the subsystem initialization, only for the non-fabrics
case. The patch was tested with emulated nvme devices (qemu) having two
controllers in a single subsystem. Without the patch, we couldn't make
it work failing in the duplicate check; when running with the patch, we
could see the subsystem holding both controllers.
For the fabrics case we see ctrl->cntlid has a more intricate relation
with the admin connect, so we didn't change that.
Fixes: 1b1031ca63b2 ("nvme: validate cntlid during controller initialisation")
Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
nvme_state_set_live() making a path available triggers requeue_work
in order to resubmit requests that ended up on requeue_list when no
paths were available.
This requeue_work may race with concurrent nvme_ns_head_make_request()
that do not observe the live path yet.
Such concurrent requests may by made by either:
- New IO submission.
- Requeue_work triggered by nvme_failover_req() or another ana_work.
A race may cause requeue_work capture the state of requeue_list before
more requests get onto the list. These requests will stay on the list
forever unless requeue_work is triggered again.
In order to prevent such race, nvme_state_set_live() should
synchronize_srcu(&head->srcu) before triggering the requeue_work and
prevent nvme_ns_head_make_request referencing an old snapshot of the
path list.
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
If a request issue ends up being punted to async context to avoid
blocking, we can get into a situation where the original application
enters the poll loop for that very request before it has been issued.
This should not be an issue, except that the polling will hold the
io_uring uring_ctx mutex for the duration of the poll. When the async
worker has actually issued the request, it needs to acquire this mutex
to add the request to the poll issued list. Since the application
polling is already holding this mutex, the workqueue sleeps on the
mutex forever, and the application thus never gets a chance to poll for
the very request it was interested in.
Fix this by ensuring that the polling drops the uring_ctx occasionally
if it's not making any progress.
Reported-by: Jeffrey M. Birnbaum <jmbnyc@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
In the case of X86_PAE, unsigned long is u32, but the physical address type
should be u64. Due to the bug here, the netvsc driver can not load
successfully, and sometimes the VM can panic due to memory corruption (the
hypervisor writes data to the wrong location).
Fixes: 6ba34171bcbd ("Drivers: hv: vmbus: Remove use of slow_virt_to_phys()")
Cc: stable@vger.kernel.org
Cc: Michael Kelley <mikelley@microsoft.com>
Reported-and-tested-by: Juliana Rodrigueiro <juliana.rodrigueiro@intra2net.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
When building hv_kvp_daemon GCC-8.3 complains:
hv_kvp_daemon.c: In function ‘kvp_get_ip_info.constprop’:
hv_kvp_daemon.c:812:30: warning: ‘ip_buffer’ may be used uninitialized in this function [-Wmaybe-uninitialized]
struct hv_kvp_ipaddr_value *ip_buffer;
this seems to be a false positive: we only use ip_buffer when
op == KVP_OP_GET_IP_INFO and it is only unset when op == KVP_OP_ENUMERATE.
Silence the warning by initializing ip_buffer to NULL.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
Guenter Roeck reported problem with compilation when the ARCH is
specified:
$ make ARCH=x86_64
In file included from tools/include/asm/atomic.h:6:0,
from include/linux/atomic.h:5,
from tools/include/linux/refcount.h:41,
from cpumap.c:4: tools/include/asm/../../arch/x86/include/asm/atomic.h:11:10:
fatal error: asm/cmpxchg.h: No such file or directory
The problem is that we don't use SRCARCH (the sanitized ARCH version)
and we don't get the proper include path.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Fixes: 314350491810 ("libperf: Make libperf.a part of the perf build")
Link: http://lkml.kernel.org/r/20190820124624.GG24105@krava
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Simplify the ring buffer handling with the in-place API.
Also avoid the dynamic allocation and the memory leak in the channel
callback function.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
This field is no longer used after the commit
63ed4e0c67df ("Drivers: hv: vmbus: Consolidate all Hyper-V specific clocksource code")
, because it's replaced by the global variable
"struct ms_hyperv_tsc_page *tsc_pg;" (now, the variable is in
drivers/clocksource/hyperv_timer.c).
Fixes: 63ed4e0c67df ("Drivers: hv: vmbus: Consolidate all Hyper-V specific clocksource code")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
We were getting the file by luck, from one of the paths in -I, fix it to
get it from the proper place:
$ cd tools/include/uapi/asm/
[acme@quaco asm]$ grep include bitsperlong.h
#include "../../arch/x86/include/uapi/asm/bitsperlong.h"
#include "../../arch/arm64/include/uapi/asm/bitsperlong.h"
#include "../../arch/powerpc/include/uapi/asm/bitsperlong.h"
#include "../../arch/s390/include/uapi/asm/bitsperlong.h"
#include "../../arch/sparc/include/uapi/asm/bitsperlong.h"
#include "../../arch/mips/include/uapi/asm/bitsperlong.h"
#include "../../arch/ia64/include/uapi/asm/bitsperlong.h"
#include "../../arch/riscv/include/uapi/asm/bitsperlong.h"
#include "../../arch/alpha/include/uapi/asm/bitsperlong.h"
#include <asm-generic/bitsperlong.h>
$ ls -la ../../arch/x86/include/uapi/asm/bitsperlong.h
ls: cannot access '../../arch/x86/include/uapi/asm/bitsperlong.h': No such file or directory
$ ls -la ../../../arch/*/include/uapi/asm/bitsperlong.h
-rw-rw-r--. 1 237 ../../../arch/alpha/include/uapi/asm/bitsperlong.h
-rw-rw-r--. 1 841 ../../../arch/arm64/include/uapi/asm/bitsperlong.h
-rw-rw-r--. 1 966 ../../../arch/hexagon/include/uapi/asm/bitsperlong.h
-rw-rw-r--. 1 234 ../../../arch/ia64/include/uapi/asm/bitsperlong.h
-rw-rw-r--. 1 100 ../../../arch/microblaze/include/uapi/asm/bitsperlong.h
-rw-rw-r--. 1 244 ../../../arch/mips/include/uapi/asm/bitsperlong.h
-rw-rw-r--. 1 352 ../../../arch/parisc/include/uapi/asm/bitsperlong.h
-rw-rw-r--. 1 312 ../../../arch/powerpc/include/uapi/asm/bitsperlong.h
-rw-rw-r--. 1 353 ../../../arch/riscv/include/uapi/asm/bitsperlong.h
-rw-rw-r--. 1 292 ../../../arch/s390/include/uapi/asm/bitsperlong.h
-rw-rw-r--. 1 323 ../../../arch/sparc/include/uapi/asm/bitsperlong.h
-rw-rw-r--. 1 320 ../../../arch/x86/include/uapi/asm/bitsperlong.h
$
Found while fixing some other problem, before it was escaping the
tools/ chroot and using stuff in the kernel sources:
CC /tmp/build/perf/util/find_bit.o
In file included from /git/linux/tools/include/../../arch/x86/include/uapi/asm/bitsperlong.h:11,
from /git/linux/tools/include/uapi/asm/bitsperlong.h:3,
from /git/linux/tools/include/linux/bits.h:6,
from /git/linux/tools/include/linux/bitops.h:13,
from ../lib/find_bit.c:17:
# cd /git/linux/tools/include/../../arch/x86/include/uapi/asm/
# pwd
/git/linux/arch/x86/include/uapi/asm
#
Now it is getting the one we want it to, i.e. the one inside tools/:
CC /tmp/build/perf/util/find_bit.o
In file included from /git/linux/tools/arch/x86/include/uapi/asm/bitsperlong.h:11,
from /git/linux/tools/include/linux/bits.h:6,
from /git/linux/tools/include/linux/bitops.h:13,
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-8f8cfqywmf6jk8a3ucr0ixhu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Give visual cue about what is happening while initially collecting the
minimal set of samples to collect/sort/display.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-xcui60p1v6ozijfam2o89ya8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
available to display
The 'perf top' tool will use that to avoid having a initial blank screen
while collecting the minimum number of samples to sort and display.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-89ciceg8cy4442he3t0jzo3f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Sometimes we want just to print a message on the center of the screen,
like in 'perf top' while we wait for the minimum amount of samples to be
collected before sorting and showing them.
Also expose __ui__info_window() as an optimization for cases where such
message is to be printed while holding the ui lock.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-uat0f89vfwl2w52kv9wzwd8a@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We will not need it when refactoring this function to be
non-interactive, so make it optional.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-pnx1dn17bsz7lqt9ty95nnjx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The synthetic branch and instruction samples are missed to set
instruction related info, thus the perf tool fails to display samples
with flags '-F,+insn,+insnlen'.
The CoreSight trace decoder provides sufficient information to decide
the instruction size based on the ISA type: A64/A32 instructions are
32-bit size, but one exception is the T32 instruction size, which might
be 32-bit or 16-bit.
This patch handles these cases and it reads the instruction values from
DSO file; thus can support the flags '-F,+insn,+insnlen'.
Before:
# perf script -F,insn,insnlen,ip,sym
0 [unknown] ilen: 0
ffff97174044 _start ilen: 0
ffff97174938 _dl_start ilen: 0
ffff97174938 _dl_start ilen: 0
ffff97174938 _dl_start ilen: 0
ffff97174938 _dl_start ilen: 0
ffff97174938 _dl_start ilen: 0
ffff97174938 _dl_start ilen: 0
ffff97174938 _dl_start ilen: 0
ffff97174938 _dl_start ilen: 0
[...]
After:
# perf script -F,insn,insnlen,ip,sym
0 [unknown] ilen: 0
ffff97174044 _start ilen: 4 insn: 2f 02 00 94
ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
ffff97174938 _dl_start ilen: 4 insn: c1 ff ff 54
[...]
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190815082854.18191-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Display DWARF based callchains when the perf.data file contains raw thread
stack data as LBR callstack data.
Commiter testing:
This changes the output from the branch stack based one, i.e. without
this patch, for the same file as in the previous csets:
# perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
# Total Lost Samples: 0
#
# Samples: 13 of event 'cycles'
# Event count (approx.): 13
#
# Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycles
# ........ ....... .................... ........................... ......................................... ..................
#
7.69% ls libpthread-2.29.so [.] _init [.] __pthread_initialize_minimal_internal 6827
7.69% ls ld-2.29.so [k] _start [k] _dl_start -
7.69% ls ld-2.29.so [.] _dl_start_user [.] _dl_init -24790
7.69% ls ld-2.29.so [k] _dl_start [k] _dl_sysdep_start 278
7.69% ls ld-2.29.so [k] dl_main [k] _dl_map_object_deps 15581
7.69% ls ld-2.29.so [k] open_verify.constprop.0 [k] lseek64 4228
7.69% ls ld-2.29.so [k] _dl_map_object [k] open_verify.constprop.0 55
7.69% ls ld-2.29.so [k] openaux [k] _dl_map_object 67
7.69% ls ld-2.29.so [k] _dl_map_object_deps [k] 0x00007f441b57c090 112
7.69% ls ld-2.29.so [.] call_init.part.0 [.] _init 334
7.69% ls ld-2.29.so [.] _dl_init [.] call_init.part.0 383
7.69% ls ld-2.29.so [k] _dl_sysdep_start [k] dl_main 45
7.69% ls ld-2.29.so [k] _dl_catch_exception [k] openaux 116
#
# (Tip: For memory address profiling, try: perf mem record / perf mem report)
#
To the one that shows call chains:
# perf report --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 10 of event 'cycles'
# Event count (approx.): 3204047
#
# Children Self Command Shared Object Symbol
# ........ ........ ....... .................. .........................................
#
55.01% 0.00% ls [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe
|
---entry_SYSCALL_64_after_hwframe
do_syscall_64
|
--16.01%--__x64_sys_execve
__do_execve_file.isra.0
search_binary_handler
load_elf_binary
elf_map
vm_mmap_pgoff
do_mmap
mmap_region
perf_event_mmap
perf_iterate_sb
perf_iterate_ctx
perf_event_mmap_output
perf_output_copy
memcpy_erms
55.01% 39.00% ls [kernel.vmlinux] [k] do_syscall_64
|
|--39.00%--0xffffffffffffffff
| _dl_map_object
| open_verify.constprop.0
| __lseek64 (inlined)
| entry_SYSCALL_64_after_hwframe
| do_syscall_64
|
--16.01%--do_syscall_64
__x64_sys_execve
__do_execve_file.isra.0
search_binary_handler
load_elf_binary
elf_map
vm_mmap_pgoff
do_mmap
mmap_region
perf_event_mmap
perf_iterate_sb
perf_iterate_ctx
perf_event_mmap_output
perf_output_copy
memcpy_erms
42.95% 42.95% ls libpthread-2.29.so [.] __pthread_initialize_minimal_internal
|
---_init
__pthread_initialize_minimal_internal
42.95% 0.00% ls libpthread-2.29.so [.] _init
|
---_init
__pthread_initialize_minimal_internal
<SNIP>
#
# (Tip: Profiling branch (mis)predictions with: perf record -b / perf report)
#
#
The branch stack view be explicitely selected using:
# perf report -h branch-stack
Usage: perf report [<options>]
-b, --branch-stack use branch records for per branch histogram filling
#
I.e. after this patch:
# perf report -b --stdio
# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 13 of event 'cycles'
# Event count (approx.): 13
#
# Overhead Command Source Shared Object Source Symbol Target Symbol Basic Block Cycles
# ........ ....... .................... ........................... ......................................... ..................
#
7.69% ls libpthread-2.29.so [.] _init [.] __pthread_initialize_minimal_internal 6827
7.69% ls ld-2.29.so [k] _start [k] _dl_start -
7.69% ls ld-2.29.so [.] _dl_start_user [.] _dl_init -24790
7.69% ls ld-2.29.so [k] _dl_start [k] _dl_sysdep_start 278
7.69% ls ld-2.29.so [k] dl_main [k] _dl_map_object_deps 15581
7.69% ls ld-2.29.so [k] open_verify.constprop.0 [k] lseek64 4228
7.69% ls ld-2.29.so [k] _dl_map_object [k] open_verify.constprop.0 55
7.69% ls ld-2.29.so [k] openaux [k] _dl_map_object 67
7.69% ls ld-2.29.so [k] _dl_map_object_deps [k] 0x00007f441b57c090 112
7.69% ls ld-2.29.so [.] call_init.part.0 [.] _init 334
7.69% ls ld-2.29.so [.] _dl_init [.] call_init.part.0 383
7.69% ls ld-2.29.so [k] _dl_sysdep_start [k] dl_main 45
7.69% ls ld-2.29.so [k] _dl_catch_exception [k] openaux 116
#
# (Tip: Show current config key-value pairs: perf config --list)
#
#
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/ccbd9583-82f4-dec5-7e84-64bf56e351fb@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Make perf report -D command print captured LBR callstack chain when it is
collected together with raw thread stack data:
2752673087247083 0x5d10 [0x548]: PERF_RECORD_SAMPLE(IP, 0x4002): 5841/5841: 0x40121f period: 1543862 addr: 0
... FP chain: nr:0
... branch callstack: nr:3
..... 0: 00000000004011d0
..... 1: 00007f393c388411
..... 2: 0000000000401098
... user regs: mask 0xff0fff ABI 64-bit
.... AX 0x34e7
.... BX 0x7fff5f6dd3c0
.... CX 0xffffffff
.... DX 0x34e6
.... SI 0x7f393c5268d0
.... DI 0x0
.... BP 0x401260
.... SP 0x7fff5f6dd3c0
.... IP 0x40121f
.... FLAGS 0x29f
.... CS 0x33
.... SS 0x2b
.... R8 0x7f393c526800
.... R9 0x7f393c525da0
.... R10 0xfffffffffffff70a
.... R11 0x246
.... R12 0x401070
.... R13 0x7fff5f6ddcb0
.... R14 0x0
.... R15 0x0
... ustack: size 1024, offset 0x130
. data_src: 0x5080021
... thread: stack_test:5841
...... dso: /root/abudanko/stacks/stack_test
Committer testing:
# perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.042 MB perf.data (10 samples) ]
#
Before:
# perf report -D |& grep PERF_RECORD_SAMPLE -A28 | tail -29
67538909824483 0xa7a0 [0x560]: PERF_RECORD_SAMPLE(IP, 0x4002): 9721/9721: 0x7f441b2b1e20 period: 1376095 addr: 0
... FP chain: nr:0
... user regs: mask 0xff0fff ABI 64-bit
.... AX 0x7f441b2b1000
.... BX 0x7f441b55b970
.... CX 0x7fff6e2db218
.... DX 0x7fff6e2db218
.... SI 0x7fff6e2db208
.... DI 0x1
.... BP 0x1
.... SP 0x7fff6e2db178
.... IP 0x7f441b2b1e20
.... FLAGS 0x20a
.... CS 0x33
.... SS 0x2b
.... R8 0x1
.... R9 0x7f441b371c18
.... R10 0x7f441b5a5f10
.... R11 0x202
.... R12 0x7fff6e2db208
.... R13 0x7fff6e2db218
.... R14 0x7f441b5a7150
.... R15 0x0
... ustack: size 1024, offset 0x148
. data_src: 0x5080021
... thread: ls:9721
...... dso: /usr/lib64/libpthread-2.29.so
0xad00 [0x60]: event: 10
#
After:
# perf report -D |& grep PERF_RECORD_SAMPLE -A31 | tail -32
67538909824483 0xa7a0 [0x560]: PERF_RECORD_SAMPLE(IP, 0x4002): 9721/9721: 0x7f441b2b1e20 period: 1376095 addr: 0
... FP chain: nr:0
... branch callstack: nr:4
..... 0: 00007f441b2b1e20
..... 1: 00007f441b58af1a
..... 2: 00007f441b58b0e1
..... 3: 00007f441b57c145
... user regs: mask 0xff0fff ABI 64-bit
.... AX 0x7f441b2b1000
.... BX 0x7f441b55b970
.... CX 0x7fff6e2db218
.... DX 0x7fff6e2db218
.... SI 0x7fff6e2db208
.... DI 0x1
.... BP 0x1
.... SP 0x7fff6e2db178
.... IP 0x7f441b2b1e20
.... FLAGS 0x20a
.... CS 0x33
.... SS 0x2b
.... R8 0x1
.... R9 0x7f441b371c18
.... R10 0x7f441b5a5f10
.... R11 0x202
.... R12 0x7fff6e2db208
.... R13 0x7fff6e2db218
.... R14 0x7f441b5a7150
.... R15 0x0
... ustack: size 1024, offset 0x148
. data_src: 0x5080021
... thread: ls:9721
...... dso: /usr/lib64/libpthread-2.29.so
#
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/aa82e5dd-def2-0ca8-a064-db9e2e8ad076@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Enable '-j stack' applicability together with '--call-graph dwarf'
option so thread stack data and LBR call stack could be captured
jointly:
$ perf record -g --call-graph dwarf,1024 -j stack,u -- stack_test
Collected LBR call stack can be used to augment DWARF call stack
calculated from the raw thread stack data and to provide more
comprehensive call stack information for cases when collected SIZE is
not enough to cover complete thread stack.
Such cases are typical for workloads that allocate large arrays of data
on its threads stacks or the possible SIZE to collect can't be large
enough due to workload nature or system configuration and this is where
hardware captured LBR call stacks can provide missing stack frames.
Possible DWARF plus LBR call stacks consolidation algorithm description
follows.
With this patch set perf report command UI currently ignores collected
LBR call stack data and still provides DWARF based call stacks
information.
===========================================================================
Overview:
Legend:
THS - thread stack
CTX - thread register context
SWS - software stack
SSF - skipped stack frames
PSS - Perf sample stack
ip,sp,bp - HW registers values
d - allocated stack regions
kip - ip address in the kernel space
K - captured thread stack size
THS
-----
| |<-stack bottom
...
|---|
|ip4|
|---| PSS = SWS(THS(K))
| |
--> | |
| |d3 | user/
| |---| user PSS kernel PSS
| |ip3| ------ ------
| |---| |SSF | |SSF |
| | | .... ....
| | | ------ ------
| |d2 | | -1 | | -1 |
|---| user ------ ------
K |ip2| CTX |ip3 | |ip3 |
|---| |----| |----|
| |d1 | ... |ip2 | , |ip2 |
| |---| |---| |----| |----|
| |ip1| |bp0| |ip1 | |ip1 |
| |---| |---| |----| |----|
| | | |ip0|->|ip0 | |ip0 |<-user stack top
| | | |---| ------ ------
| | |<-|sp0|<-stack |kip0|<-kernel stack bottom
--> ----- ----- top |----|
|kip1|
|----|
|kip2|
|----|
....
| |<-kernel stack top
------
Algorithm details:
Legend:
HWS - hardware stack
K-SWS - kernel software stack
BRANCH
TABLE
HWS ip ip
from to
------ -----------
|ip7`| |ip7`| |
|----| |----|----|
|ip6`| |ip6`| |
user PSS |----| |----|----|
|ip5`| |ip5`| |
------ |----| |----|----|
| -1 | |ip4`| |ip4`| |
------ |----| |----|----|
|ip3 |~~~|ip3`| |ip3`| |
|----| |----| |----|----|
|ip2 |~~~|ip2`| |ip2`| |
|----| |----| |----|----|
|ip1 |~~~|ip1`| |ip1`|ip0`|
|----| |----| -----------
|ip0 |~~~|ip0`|<---------'
------ ------
1. if (sym(ipj) == sym(ipj`)), j=0-3 ===> user PSS
2. ipj` , j=4-7 ===> user PSS
Augmented PSS = A_SWS(SWS(THS(K)), HWS):
user/
user PSS kernel PSS
------ ------
|ip7`| |ip7`|<-user PSS bottom
|----| |----|
|ip6`| |ip6`|
|----| |----|
HWS |ip5`| |ip5`|
|----| |----|
|ip4`| |ip4`|
------ ------
|ip3 | |ip3 |
|----| |----|
SWS |ip2 | |ip2 |
|----| |----|
|ip1 | |ip1 |
|----| |----|
|ip0 | |ip0 |<-user PSS top
------ ------
|kip0|<-kernel PSS bottom
|----|
|kip1|
K-SWS |----|
|kip2|
|----|
|kip3|<-kernel PSS top
------
APSS
Committer testing:
Before:
# perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null
unknown branch filter stack, check man page
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-j, --branch-filter <branch filter mask>
branch stack filter modes
# perf record -g --call-graph dwarf,1024 -j u ls > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.054 MB perf.data (12 samples) ]
# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CALLCHAIN|PERIOD|BRANCH_STACK|REGS_USER|STACK_USER|DATA_SRC, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: ANY, sample_regs_user: 0xff0fff, sample_stack_user: 1024
#
After:
# perf record -g --call-graph dwarf,1024 -j stack,u ls > /dev/null
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.044 MB perf.data (11 samples) ]
[root@quaco ~]# perf evlist -v
cycles: size: 112, { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|ADDR|CALLCHAIN|PERIOD|BRANCH_STACK|REGS_USER|STACK_USER|DATA_SRC, read_format: ID, disabled: 1, inherit: 1, mmap: 1, comm: 1, freq: 1, enable_on_exec: 1, task: 1, precise_ip: 3, mmap_data: 1, sample_id_all: 1, exclude_guest: 1, exclude_callchain_user: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1, branch_sample_type: USER|CALL_STACK, sample_regs_user: 0xff0fff, sample_stack_user: 1024
#
Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/e9e00090-66fb-d2a4-c90f-1d12344f7788@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The tools/lib/traceevent/Makefile had a test added to it to detect a failure
of the "nm" when making the dynamic list file (whatever that is). The
problem is that the test sorts the values "U W w" and some versions of sort
will place "w" ahead of "W" (even though it has a higher ASCII value, and
break the test.
Add 'tr "w" "W"' to merge the two and not worry about the ordering.
Reported-by: Tzvetomir Stoyanov <tstoyanov@vmware.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Michal rarek <mmarek@suse.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: stable@vger.kernel.org
Fixes: 6467753d61399 ("tools lib traceevent: Robustify do_generate_dynamic_list_file")
Link: http://lkml.kernel.org/r/20190805130150.25acfeb1@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The 'idx' member was added as preparation for AUX area sampling. Add a
comment to describe why.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/83ff264f-84c3-5372-8976-dd9293d20c6f@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up the changes in:
f36cf386e3fe ("x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS")
18ec54fdd6d1 ("x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations")
That don't affect anything in tools/.
This silences this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/n/tip-860dq1qie2cpnfghlpcnxrzr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up the changes in this cset:
95b980d62d52 ("linux/bits.h: make BIT(), GENMASK(), and friends available in assembly")
To address this tools/perf build warning:
Warning: Kernel ABI header at 'tools/include/linux/bits.h' differs from latest version at 'include/linux/bits.h'
diff -u tools/include/linux/bits.h include/linux/bits.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-1if3iga5r3di6oyddgxsr225@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
So that can update the copy of linux/bits.h that now uses macros defined
in const.h and that are not available in older systems.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-c2qfcbl58hxyfb5u5xivp7is@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The next cset will grap const.h copies from the kernel to keep bits.h
in sync as it started to use linux/const.h, that in turn includes
uapi/linux/const.h.
So now we have a file with the same name in tools/include and
tools/uapi/include, and one includes the other, we need to have
tools/include/uapi/ after tools/include/ for this to work, fix it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-qzjqxa1wdrt51kwadyqawnuj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We need to make sure limits.h is included before checking if we can use
__WORDSIZE, do it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-5yfoed4rnsck2n3cwhm9mvth@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The static structures ht16k33_fb_fix and ht16k33_fb_var, of types
fb_fix_screeninfo and fb_var_screeninfo respectively, are not used
except to be copied into other variables. Hence make both of them
constant to prevent unintended modification.
Issue found with
Coccinelle.
Acked-by: Robin van der Gracht <robin@protonic.nl>
Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
|
|
The EKR ring claims a range of 0 to 71 but actually reports
values 1 to 72. The ring is used in relative mode so this
change should not affect users.
Signed-off-by: Aaron Armstrong Skomra <aaron.skomra@wacom.com>
Fixes: 72b236d60218f ("HID: wacom: Add support for Express Key Remote.")
Cc: <stable@vger.kernel.org> # v4.3+
Reviewed-by: Ping Cheng <ping.cheng@wacom.com>
Reviewed-by: Jason Gerecke <jason.gerecke@wacom.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
We need to provide the arch hooks for non-coherent dma-direct
and swiotlb for all swiotlb builds, not just when LPAS is enabled.
Without that the Xen build that selects SWIOTLB indirectly through
SWIOTLB_XEN fails to build.
Fixes: ad3c7b18c5b3 ("arm: use swiotlb for bounce buffering on LPAE configs")
Reported-by: Stefan Wahren <wahrenst@gmx.net>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Stefan Wahren <wahrenst@gmx.net>
|
|
When SCSI-MQ is enabled, the SCSI-MQ layers will do pre-allocation of MQ
resources based on shost values set by the driver. In newer cases of the
driver, which attempts to set nr_hw_queues to the cpu count, the
multipliers become excessive, with a single shost having SCSI-MQ
pre-allocation reaching into the multiple GBytes range. NPIV, which
creates additional shosts, only multiply this overhead. On lower-memory
systems, this can exhaust system memory very quickly, resulting in a system
crash or failures in the driver or elsewhere due to low memory conditions.
After testing several scenarios, the situation can be mitigated by limiting
the value set in shost->nr_hw_queues to 4. Although the shost values were
changed, the driver still had per-cpu hardware queues of its own that
allowed parallelization per-cpu. Testing revealed that even with the
smallish number for nr_hw_queues for SCSI-MQ, performance levels remained
near maximum with the within-driver affiinitization.
A module parameter was created to allow the value set for the nr_hw_queues
to be tunable.
Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Ewan D. Milne <emilne@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
|
|
The parens used in the while loop would result in error being assigned
the value 1 rather than the intended errno value.
This is required to return -ETXTBSY from follow on break_layout()
changes.
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A couple fixes to the core framework logic that finds clk parents, a
handful of samsung clk driver fixes for audio and display clks, and a
small fix for the Stratix10 SoC driver that was checking the wrong
register for validity"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: Fix potential NULL dereference in clk_fetch_parent_index()
clk: Fix falling back to legacy parent string matching
clk: socfpga: stratix10: fix rate caclulationg for cnt_clks
clk: samsung: exynos542x: Move MSCL subsystem clocks to its sub-CMU
clk: samsung: exynos5800: Move MAU subsystem clocks to MAU sub-CMU
clk: samsung: Change signature of exynos5_subcmus_init() function
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull kernel thread signal handling fix from Eric Biederman:
"I overlooked the fact that kernel threads are created with all signals
set to SIG_IGN, and accidentally caused a regression in cifs and drbd
when replacing force_sig with send_sig.
This is my fix for that regression. I add a new function
allow_kernel_signal which allows kernel threads to receive signals
sent from the kernel, but continues to ignore all signals sent from
userspace. This ensures the user space interface for cifs and drbd
remain the same.
These kernel threads depend on blocking networking calls which block
until something is received or a signal is pending. Making receiving
of signals somewhat necessary for these kernel threads.
Perhaps someday we can cleanup those interfaces and remove
allow_kernel_signal. If not allow_kernel_signal is pretty trivial and
clearly documents what is going on so I don't think we will mind
carrying it"
* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
signal: Allow cifs and drbd to receive their terminating signals
|
|
If alloc_descs() fails before irq_sysfs_init() has run, free_desc() in the
cleanup path will call kobject_del() even though the kobject has not been
added with kobject_add().
Fix this by making the call to kobject_del() conditional on whether
irq_sysfs_init() has run.
This problem surfaced because commit aa30f47cf666 ("kobject: Add support
for default attribute groups to kobj_type") makes kobject_del() stricter
about pairing with kobject_add(). If the pairing is incorrrect, a WARNING
and backtrace occur in sysfs_remove_group() because there is no parent.
[ tglx: Add a comment to the code and make it work with CONFIG_SYSFS=n ]
Fixes: ecb3f394c5db ("genirq: Expose interrupt information through sysfs")
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/1564703564-4116-1-git-send-email-mikelley@microsoft.com
|
|
There have been reports of RDRAND issues after resuming from suspend on
some AMD family 15h and family 16h systems. This issue stems from a BIOS
not performing the proper steps during resume to ensure RDRAND continues
to function properly.
RDRAND support is indicated by CPUID Fn00000001_ECX[30]. This bit can be
reset by clearing MSR C001_1004[62]. Any software that checks for RDRAND
support using CPUID, including the kernel, will believe that RDRAND is
not supported.
Update the CPU initialization to clear the RDRAND CPUID bit for any family
15h and 16h processor that supports RDRAND. If it is known that the family
15h or family 16h system does not have an RDRAND resume issue or that the
system will not be placed in suspend, the "rdrand=force" kernel parameter
can be used to stop the clearing of the RDRAND CPUID bit.
Additionally, update the suspend and resume path to save and restore the
MSR C001_1004 value to ensure that the RDRAND CPUID setting remains in
place after resuming from suspend.
Note, that clearing the RDRAND CPUID bit does not prevent a processor
that normally supports the RDRAND instruction from executing it. So any
code that determined the support based on family and model won't #UD.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>
Cc: "linux-pm@vger.kernel.org" <linux-pm@vger.kernel.org>
Cc: Nathan Chancellor <natechancellor@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: <stable@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "x86@kernel.org" <x86@kernel.org>
Link: https://lkml.kernel.org/r/7543af91666f491547bd86cebb1e17c66824ab9f.1566229943.git.thomas.lendacky@amd.com
|
|
Pull networking fixes from David Miller:
1) Fix jmp to 1st instruction in x64 JIT, from Alexei Starovoitov.
2) Severl kTLS fixes in mlx5 driver, from Tariq Toukan.
3) Fix severe performance regression due to lack of SKB coalescing of
fragments during local delivery, from Guillaume Nault.
4) Error path memory leak in sch_taprio, from Ivan Khoronzhuk.
5) Fix batched events in skbedit packet action, from Roman Mashak.
6) Propagate VLAN TX offload to hw_enc_features in bond and team
drivers, from Yue Haibing.
7) RXRPC local endpoint refcounting fix and read after free in
rxrpc_queue_local(), from David Howells.
8) Fix endian bug in ibmveth multicast list handling, from Thomas
Falcon.
9) Oops, make nlmsg_parse() wrap around the correct function,
__nlmsg_parse not __nla_parse(). Fix from David Ahern.
10) Memleak in sctp_scend_reset_streams(), fro Zheng Bin.
11) Fix memory leak in cxgb4, from Wenwen Wang.
12) Yet another race in AF_PACKET, from Eric Dumazet.
13) Fix false detection of retransmit failures in tipc, from Tuong
Lien.
14) Use after free in ravb_tstamp_skb, from Tho Vu.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (101 commits)
ravb: Fix use-after-free ravb_tstamp_skb
netfilter: nf_tables: map basechain priority to hardware priority
net: sched: use major priority number as hardware priority
wimax/i2400m: fix a memory leak bug
net: cavium: fix driver name
ibmvnic: Unmap DMA address of TX descriptor buffers after use
bnxt_en: Fix to include flow direction in L2 key
bnxt_en: Use correct src_fid to determine direction of the flow
bnxt_en: Suppress HWRM errors for HWRM_NVM_GET_VARIABLE command
bnxt_en: Fix handling FRAG_ERR when NVM_INSTALL_UPDATE cmd fails
bnxt_en: Improve RX doorbell sequence.
bnxt_en: Fix VNIC clearing logic for 57500 chips.
net: kalmia: fix memory leaks
cx82310_eth: fix a memory leak bug
bnx2x: Fix VF's VLAN reconfiguration in reload.
Bluetooth: Add debug setting for changing minimum encryption key size
tipc: fix false detection of retransmit failures
lan78xx: Fix memory leaks
MAINTAINERS: r8169: Update path to the driver
MAINTAINERS: PHY LIBRARY: Update files in the record
...
|
|
The maximum key description size is 4095. Commit f771fde82051 ("keys:
Simplify key description management") inadvertantly reduced that to 255
and made sizes between 256 and 4095 work weirdly, and any size whereby
size & 255 == 0 would cause an assertion in __key_link_begin() at the
following line:
BUG_ON(index_key->desc_len == 0);
This can be fixed by simply increasing the size of desc_len in struct
keyring_index_key to a u16.
Note the argument length test in keyutils only checked empty
descriptions and descriptions with a size around the limit (ie. 4095)
and not for all the values in between, so it missed this. This has been
addressed and
https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/keyutils.git/commit/?id=066bf56807c26cd3045a25f355b34c1d8a20a5aa
now exhaustively tests all possible lengths of type, description and
payload and then some.
The assertion failure looks something like:
kernel BUG at security/keys/keyring.c:1245!
...
RIP: 0010:__key_link_begin+0x88/0xa0
...
Call Trace:
key_create_or_update+0x211/0x4b0
__x64_sys_add_key+0x101/0x200
do_syscall_64+0x5b/0x1e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
It can be triggered by:
keyctl add user "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa" a @s
Fixes: f771fde82051 ("keys: Simplify key description management")
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|