Age | Commit message (Collapse) | Author |
|
When trying to align the source pointer and there's a byte carry
in an SGE copy, bytes are borrowed from the next quad-word X to
complete the required quad-word copy. Then, the SGE length is
reduced by the number of borrowed bytes. After this, if the
remaining number of bytes from quad-word X (extra bytes) is
greater than the new SGE length, the number of extra bytes needs
to be updated to the new SGE length. Otherwise, when the
SGE length gets updated again after the extra bytes are read to
create the new byte carry, it goes negative, which then becomes
a very large number as the SGE length is an unsigned integer.
This causes SGE buffer to be over-read.
Reviewed-by: Dean Luick <dean.luick@intel.com>
Signed-off-by: Sebastian Sanchez <sebastian.sanchez@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Remove returning errors from mlx5 poll_cq function. Polling CQ
operation in kernel never fails by Mellanox HCA architecture and
respective driver design.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Use TIR number based on selector, it should be done to differentiate
between RSS QP to RAW one.
Reported-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Tested-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Return variable was set in a line before the
actual return was called in begin_wqe function.
This patch removes such variable and simplifies the code.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The returned value should be EINVAL, because it is caused by wrong
caller and not by internal overflow event.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Remove returning errors from mlx4 poll_cq function. Polling CQ
operation in kernel never fails by Mellanox HCA architecture and
respective driver design.
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
By Mellanox HW design and SW implementation, poll_cq never
fails and returns errors, so all these printks are to catch ULP bugs.
In case of such bug, the reverted patch will cause reentry of the
function, resulting in a printk storm.
This reverts commit 5412352fcd8f ("IB/mlx4: Return EAGAIN for any error in mlx4_ib_poll_one")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
When a new CM connection is being requested, ipoib driver copies data
from the path pointer in the CM/tx object, the path object might be
invalid at the point and memory corruption will happened later when now
the CM driver will try using that data.
The next scenario demonstrates it:
neigh_add_path --> ipoib_cm_create_tx -->
queue_work (pointer to path is in the cm/tx struct)
#while the work is still in the queue,
#the port goes down and causes the ipoib_flush_paths:
ipoib_flush_paths --> path_free --> kfree(path)
#at this point the work scheduled starts.
ipoib_cm_tx_start --> copy from the (invalid)path pointer:
(memcpy(&pathrec, &p->path->pathrec, sizeof pathrec);)
-> memory corruption.
To fix that the driver now starts the CM/tx connection only if that
specific path exists in the general paths database.
This check is protected with the relevant locks, and uses the gid from
the neigh member in the CM/tx object which is valid according to the ref
count that was taken by the CM/tx.
Fixes: 839fcaba35 ('IPoIB: Connected mode experimental support')
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The function send_leave sets the member: group->query_id
(group->query_id = ret) after calling the sa_query, but leave_handler
can be executed before the setting and it might delete the group object,
and will get a memory corruption.
Additionally, this patch gets rid of group->query_id variable which is
not used.
Fixes: faec2f7b96b5 ('IB/sa: Track multicast join/leave requests')
Signed-off-by: Erez Shitrit <erezsh@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
This patch fixes the retun value of switchdev_port_fdb_dump() when
CONFIG_NET_SWITCHDEV is not set. This avoids getting "warning: return makes
integer from pointer without a cast [-Wint-conversion]" when building
when CONFIG_NET_SWITCHDEV is not set under several compiler versions.
This warning is due to commit d297653dd6f07afbe7e6c702a4bcd7615680002e
("rtnetlink: fdb dump: optimize by saving last interface markers").
Signed-off-by: Rami Rosen <rami.rosen@intel.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Alexei Starovoitov says:
====================
perf, bpf: add support for bpf in sw/hw perf_events
this patch set is a follow up to the discussion:
https://lkml.kernel.org/r/20160804142853.GO6862%20()%20twins%20!%20programming%20!%20kicks-ass%20!%20net
It turned out to be simpler than what we discussed.
Patches 1-3 is bpf-side prep for the main patch 4
that adds bpf program as an overflow_handler to sw and hw perf_events.
Patches 5 and 6 are examples from myself and Brendan.
Peter,
to implement your suggestion to add ifdef CONFIG_BPF_SYSCALL
inside struct perf_event, I had to shuffle ifdefs in events/core.c
Please double check whether that is what you wanted to see.
v2->v3: fixed few more minor issues
v1->v2: fixed issues spotted by Peter and Daniel.
====================
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
sample instruction pointer and frequency count in a BPF map
Signed-off-by: Brendan Gregg <bgregg@netflix.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The bpf program is called 50 times a second and does hashmap[kern&user_stackid]++
It's primary purpose to check that key bpf helpers like map lookup, update,
get_stackid, trace_printk and ctx access are all working.
It checks:
- PERF_COUNT_HW_CPU_CYCLES on all cpus
- PERF_COUNT_HW_CPU_CYCLES for current process and inherited perf_events to children
- PERF_COUNT_SW_CPU_CLOCK on all cpus
- PERF_COUNT_SW_CPU_CLOCK for current process
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Allow attaching BPF_PROG_TYPE_PERF_EVENT programs to sw and hw perf events
via overflow_handler mechanism.
When program is attached the overflow_handlers become stacked.
The program acts as a filter.
Returning zero from the program means that the normal perf_event_output handler
will not be called and sampling event won't be stored in the ring buffer.
The overflow_handler_context==NULL is an additional safety check
to make sure programs are not attached to hw breakpoints and watchdog
in case other checks (that prevent that now anyway) get accidentally
relaxed in the future.
The program refcnt is incremented in case perf_events are inhereted
when target task is forked.
Similar to kprobe and tracepoint programs there is no ioctl to
detach the program or swap already attached program. The user space
expected to close(perf_event_fd) like it does right now for kprobe+bpf.
That restriction simplifies the code quite a bit.
The invocation of overflow_handler in __perf_event_overflow() is now
done via READ_ONCE, since that pointer can be replaced when the program
is attached while perf_event itself could have been active already.
There is no need to do similar treatment for event->prog, since it's
assigned only once before it's accessed.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make sure that BPF_PROG_TYPE_PERF_EVENT programs only use
preallocated hash maps, since doing memory allocation
in overflow_handler can crash depending on where nmi got triggered.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Introduce BPF_PROG_TYPE_PERF_EVENT programs that can be attached to
HW and SW perf events (PERF_TYPE_HARDWARE and PERF_TYPE_SOFTWARE
correspondingly in uapi/linux/perf_event.h)
The program visible context meta structure is
struct bpf_perf_event_data {
struct pt_regs regs;
__u64 sample_period;
};
which is accessible directly from the program:
int bpf_prog(struct bpf_perf_event_data *ctx)
{
... ctx->sample_period ...
... ctx->regs.ip ...
}
The bpf verifier rewrites the accesses into kernel internal
struct bpf_perf_event_data_kern which allows changing
struct perf_sample_data without affecting bpf programs.
New fields can be added to the end of struct bpf_perf_event_data
in the future.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The verifier supported only 4-byte metafields in
struct __sk_buff and struct xdp_md. The metafields in upcoming
struct bpf_perf_event are 8-byte to match register width in struct pt_regs.
Teach verifier to recognize 8-byte metafield access.
The patch doesn't affect safety of sockets and xdp programs.
They check for 4-byte only ctx access before these conditions are hit.
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We get 1 warning when build kernel with W=1:
drivers/infiniband/hw/cxgb4/qp.c:686:6: warning: no previous prototype for '_free_qp' [-Wmissing-prototypes]
In fact, this function is only used in the file in which it is declared
and don't need a declaration, but can be made static.
so this patch marks it 'static'.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
When the low level driver exercises the hot unplug they would call
rdma_cm cma_remove_one which would fire DEVICE_REMOVAL event to all cma
consumers. Now, if consumer doesn't make sure they destroy all IB
objects created on that IB device instance prior to finalizing all
processing of DEVICE_REMOVAL callback, rdma_cm will let the lld to
de-register with IB core and destroy the IB device instance. And if the
consumer calls (say) ib_dereg_mr(), it will crash since that dev object
is NULL.
In the current implementation, iser-target just initiates the cleanup
and returns from DEVICE_REMOVAL callback. This deferred work creates a
race between iser-target cleaning IB objects(say MR) and lld destroying
IB device instance.
This patch includes the following fixes
-> make sure that consumer frees all IB objects associated with device
instance
-> return non-zero from the callback to destroy the rdma_cm id
Signed-off-by: Raju Rangoju <rajur@chelsio.com>
Acked-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The 2nd parameter of 'find_first_bit' is the number of bits to search.
In this case, we are passing 'sizeof(u64)' which is 8.
It is likely that the number of bits of 'port_mask' was expected here.
Use sizeof() * 8 to get the correct number.
It has been spotted by the following coccinelle script:
@@
expression ret, x;
@@
* ret = \(find_first_bit \| find_first_zero_bit\) (x, sizeof(...));
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
The 2nd parameter of 'find_first_bit' is the number of bits to search.
In this case, we are passing 'sizeof(tmp)' which is likely to be 4 or 8
because 'tmp' is an 'unsigned long'.
It is likely that the number of bits of 'tmp' was expected here. So use
BITS_PER_LONG instead.
It has been spotted by the following coccinelle script:
@@
expression ret, x;
@@
* ret = \(find_first_bit \| find_first_zero_bit\) (x, sizeof(...));
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Majd Dibbiny <majd@mellanox.com>
Acked-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
|
|
Łukasz Daniluk reported that on a RHEL kernel that his machine would lock up
after enabling function tracer. I asked him to bisect the functions within
available_filter_functions, which he did and it came down to three:
_paravirt_nop(), _paravirt_ident_32() and _paravirt_ident_64()
It was found that this is only an issue when noreplace-paravirt is added
to the kernel command line.
This means that those functions are most likely called within critical
sections of the funtion tracer, and must not be traced.
In newer kenels _paravirt_nop() is defined within gcc asm(), and is no
longer an issue. But both _paravirt_ident_{32,64}() causes the
following splat when they are traced:
mm/pgtable-generic.c:33: bad pmd ffff8800d2435150(0000000001d00054)
mm/pgtable-generic.c:33: bad pmd ffff8800d3624190(0000000001d00070)
mm/pgtable-generic.c:33: bad pmd ffff8800d36a5110(0000000001d00054)
mm/pgtable-generic.c:33: bad pmd ffff880118eb1450(0000000001d00054)
NMI watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [systemd-journal:469]
Modules linked in: e1000e
CPU: 2 PID: 469 Comm: systemd-journal Not tainted 4.6.0-rc4-test+ #513
Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012
task: ffff880118f740c0 ti: ffff8800d4aec000 task.ti: ffff8800d4aec000
RIP: 0010:[<ffffffff81134148>] [<ffffffff81134148>] queued_spin_lock_slowpath+0x118/0x1a0
RSP: 0018:ffff8800d4aefb90 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88011eb16d40
RDX: ffffffff82485760 RSI: 000000001f288820 RDI: ffffea0000008030
RBP: ffff8800d4aefb90 R08: 00000000000c0000 R09: 0000000000000000
R10: ffffffff821c8e0e R11: 0000000000000000 R12: ffff880000200fb8
R13: 00007f7a4e3f7000 R14: ffffea000303f600 R15: ffff8800d4b562e0
FS: 00007f7a4e3d7840(0000) GS:ffff88011eb00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7a4e3f7000 CR3: 00000000d3e71000 CR4: 00000000001406e0
Call Trace:
_raw_spin_lock+0x27/0x30
handle_pte_fault+0x13db/0x16b0
handle_mm_fault+0x312/0x670
__do_page_fault+0x1b1/0x4e0
do_page_fault+0x22/0x30
page_fault+0x28/0x30
__vfs_read+0x28/0xe0
vfs_read+0x86/0x130
SyS_read+0x46/0xa0
entry_SYSCALL_64_fastpath+0x1e/0xa8
Code: 12 48 c1 ea 0c 83 e8 01 83 e2 30 48 98 48 81 c2 40 6d 01 00 48 03 14 c5 80 6a 5d 82 48 89 0a 8b 41 08 85 c0 75 09 f3 90 8b 41 08 <85> c0 74 f7 4c 8b 09 4d 85 c9 74 08 41 0f 18 09 eb 02 f3 90 8b
Reported-by: Łukasz Daniluk <lukasz.daniluk@intel.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs
Pull overlayfs fixes from Miklos Szeredi:
"Most of this is regression fixes for posix acl behavior introduced in
4.8-rc1 (these were caught by the pjd-fstest suite). The are also
miscellaneous fixes marked as stable material and cleanups.
Other than overlayfs code, it touches <linux/fs.h> to add a constant
with which to disable posix acl caching. No changes needed to the
actual caching code, it automatically does the right thing, although
later we may want to optimize this case.
I'm now testing overlayfs with the following test suites to catch
regressions:
- unionmount-testsuite
- xfstests
- pjd-fstest"
* 'overlayfs-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs:
ovl: update doc
ovl: listxattr: use strnlen()
ovl: Switch to generic_getxattr
ovl: copyattr after setting POSIX ACL
ovl: Switch to generic_removexattr
ovl: Get rid of ovl_xattr_noacl_handlers array
ovl: Fix OVL_XATTR_PREFIX
ovl: fix spelling mistake: "directries" -> "directories"
ovl: don't cache acl on overlay layer
ovl: use cached acl on underlying layer
ovl: proper cleanup of workdir
ovl: remove posix_acl_default from workdir
ovl: handle umask and posix_acl_default correctly on creation
ovl: don't copy up opaqueness
|
|
The warning was seen on AR5416 chip, which invoke ath9k_hw_gio_get()
before the GPIO initialized correctly.
WARNING: CPU: 1 PID: 1159 at ~/drivers/net/wireless/ath/ath9k/hw.c:2776 ath9k_hw_gpio_get+0x148/0x1a0 [ath9k_hw]
...
CPU: 1 PID: 1159 Comm: systemd-udevd Not tainted 4.7.0-rc7-aptosid-amd64 #1 aptosid 4.7~rc7-1~git92.slh.3
Hardware name: /DH67CL, BIOS BLH6710H.86A.0160.2012.1204.1156 12/04/2012
0000000000000286 00000000f912d633 ffffffff81290fd3 0000000000000000
0000000000000000 ffffffff81063fd4 ffff88040c6dc018 0000000000000000
0000000000000002 0000000000000000 0000000000000100 ffff88040c6dc018
Call Trace:
[<ffffffff81290fd3>] ? dump_stack+0x5c/0x79
[<ffffffff81063fd4>] ? __warn+0xb4/0xd0
[<ffffffffa0668fb8>] ? ath9k_hw_gpio_get+0x148/0x1a0 [ath9k_hw]
Signed-off-by: Miaoqing Pan <miaoqing@codeaurora.org>
Reported-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Tested-by: Stefan Lippers-Hollmann <s.l-h@gmx.de>
Signed-off-by: Kalle Valo <kvalo@qca.qualcomm.com>
|
|
Changes to make the resume from cpu_suspend() code behave more like
secondary boot caused debug exceptions to be unmasked early by
__cpu_setup(). We then go on to restore mdscr_el1 in cpu_do_resume(),
potentially taking break or watch points based on uninitialised registers.
Mask debug exceptions in cpu_do_resume(), which is specific to resume
from cpu_suspend(). Debug exceptions will be restored to their original
state by local_dbg_restore() in cpu_suspend(), which runs after
hw_breakpoint_restore() has re-initialised the other registers.
Reported-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Fixes: cabe1c81ea5b ("arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va")
Cc: <stable@vger.kernel.org> # 4.7+
Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
Patch 7f1d642fbb5c ("drivers/perf: arm-pmu: Fix handling of SPI lacking
interrupt-affinity property") unintended also fixes perf_event support
for bcm2835 which doesn't have PMU interrupts. Unfortunately this change
introduce a NULL pointer dereference on bcm2835, because irq_is_percpu
always expected to be called with a valid IRQ. So fix this regression
by validating the IRQ before.
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: 7f1d642fbb5c ("drivers/perf: arm-pmu: Fix handling of SPI lacking "interrupt-affinity" property")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
In case of a IRQ type mismatch in of_pmu_irq_cfg() the
device node for interrupt affinity isn't freed. So fix this
issue by calling of_node_put().
Signed-off-by: Stefan Wahren <stefan.wahren@i2se.com>
Fixes: fa8ad7889d83 ("arm: perf: factor arm_pmu core out to drivers")
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
git://git.infradead.org/users/vkoul/slave-dma
Pull dmaengine fixes from Vinod Koul:
"The fixes this time are all in drivers:
- possible NULL dereference in img-mdc
- correct device identity for free_irq in at_xdmac
- missing of_node_put() in fsl probe
- fix debug log and hotchain corner case for pxa-dma
- fix checking hardware bits in isr in usb dmac"
* tag 'dmaengine-fix-4.8-rc5' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: img-mdc: fix a possible NULL dereference
dmaengine: at_xdmac: fix to pass correct device identity to free_irq()
dmaengine: fsl_raid: add missing of_node_put() in fsl_re_probe()
dmaengine: pxa_dma: fix debug message
dmaengine: pxa_dma: fix hotchain corner case
dmaengine: usb-dmac: check CHCR.DE bit in usb_dmac_isr_channel()
|
|
Pull drm fixes from Dave Airlie:
"Contains fixes for imx, amdgpu, vc4, msm and one nouveau ACPI fix"
* tag 'drm-fixes-for-4.8-rc5' of git://people.freedesktop.org/~airlied/linux:
drm/amdgpu: record error code when ring test failed
drm/amd/amdgpu: compute ring test fail during S4 on CI
drm/amd/amdgpu: sdma resume fail during S4 on CI
drm/nouveau/acpi: use DSM if bridge does not support D3cold
drm/imx: fix crtc vblank state regression
drm/imx: Add active plane reconfiguration support
drm/msm: protect against faults from copy_from_user() in submit ioctl
drm/msm: fix use of copy_from_user() while holding spinlock
drm/vc4: Fix oops when userspace hands in a bad BO.
drm/vc4: Fix overflow mem unreferencing when the binner runs dry.
drm/vc4: Free hang state before destroying BO cache.
drm/vc4: Fix handling of a pm_runtime_get_sync() success case.
drm/vc4: Use drm_malloc_ab to fix large rendering jobs.
drm/vc4: Use drm_free_large() on handles to match its allocation.
|
|
git://git.linaro.org/people/pawel.moll/linux into fixes
Merge "bus: ARM CCN PMU driver updates" from Paweł Moll:
- Fixes and improvements for XP watchpoint and events handling
- Added missing condition checks for KVM-related exclusions
- Improved interrupt affinity handling
- Fix for hrtimer use in polling mode
- Event grouping implementation improvement
* tag 'ccn/fixes-for-4.8-v2' of git://git.linaro.org/people/pawel.moll/linux:
bus: arm-ccn: make event groups reliable
bus: arm-ccn: fix hrtimer registration
bus: arm-ccn: fix PMU interrupt flags
bus: arm-ccn: Add missing event attribute exclusions for host/guest
bus: arm-ccn: Correct required arguments for XP PMU events
bus: arm-ccn: Fix XP watchpoint settings bitmask
bus: arm-ccn: Do not attempt to configure XPs for cycle counter
bus: arm-ccn: Fix PMU handling of MN
|
|
Merge "mvebu fixes for 4.8 (part 1)" from Gregory CLEMENT:
Few device tree fix on kirkwood:
- enable PCIe on OpenRD
- use correct u-boot environment partition size on ib62x0
* tag 'mvebu-fixes-4.8-2' of git://git.infradead.org/linux-mvebu:
ARM: dts: kirkwood: Fix PCIe label on OpenRD
ARM: kirkwood: ib62x0: fix size of u-boot environment partition
|
|
I got this with syzkaller:
==================================================================
BUG: KASAN: null-ptr-deref on address 0000000000000020
Read of size 32 by task syz-executor/22519
CPU: 1 PID: 22519 Comm: syz-executor Not tainted 4.8.0-rc2+ #169
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2
014
0000000000000001 ffff880111a17a00 ffffffff81f9f141 ffff880111a17a90
ffff880111a17c50 ffff880114584a58 ffff880114584a10 ffff880111a17a80
ffffffff8161fe3f ffff880100000000 ffff880118d74a48 ffff880118d74a68
Call Trace:
[<ffffffff81f9f141>] dump_stack+0x83/0xb2
[<ffffffff8161fe3f>] kasan_report_error+0x41f/0x4c0
[<ffffffff8161ff74>] kasan_report+0x34/0x40
[<ffffffff82c84b54>] ? snd_timer_user_read+0x554/0x790
[<ffffffff8161e79e>] check_memory_region+0x13e/0x1a0
[<ffffffff8161e9c1>] kasan_check_read+0x11/0x20
[<ffffffff82c84b54>] snd_timer_user_read+0x554/0x790
[<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0
[<ffffffff817d0831>] ? proc_fault_inject_write+0x1c1/0x250
[<ffffffff817d0670>] ? next_tgid+0x2a0/0x2a0
[<ffffffff8127c278>] ? do_group_exit+0x108/0x330
[<ffffffff8174653a>] ? fsnotify+0x72a/0xca0
[<ffffffff81674dfe>] __vfs_read+0x10e/0x550
[<ffffffff82c84600>] ? snd_timer_user_info_compat.isra.5+0x2b0/0x2b0
[<ffffffff81674cf0>] ? do_sendfile+0xc50/0xc50
[<ffffffff81745e10>] ? __fsnotify_update_child_dentry_flags+0x60/0x60
[<ffffffff8143fec6>] ? kcov_ioctl+0x56/0x190
[<ffffffff81e5ada2>] ? common_file_perm+0x2e2/0x380
[<ffffffff81746b0e>] ? __fsnotify_parent+0x5e/0x2b0
[<ffffffff81d93536>] ? security_file_permission+0x86/0x1e0
[<ffffffff816728f5>] ? rw_verify_area+0xe5/0x2b0
[<ffffffff81675355>] vfs_read+0x115/0x330
[<ffffffff81676371>] SyS_read+0xd1/0x1a0
[<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0
[<ffffffff82001c2c>] ? __this_cpu_preempt_check+0x1c/0x20
[<ffffffff8150455a>] ? __context_tracking_exit.part.4+0x3a/0x1e0
[<ffffffff816762a0>] ? vfs_write+0x4b0/0x4b0
[<ffffffff81005524>] do_syscall_64+0x1c4/0x4e0
[<ffffffff810052fc>] ? syscall_return_slowpath+0x16c/0x1d0
[<ffffffff83c3276a>] entry_SYSCALL64_slow_path+0x25/0x25
==================================================================
There are a couple of problems that I can see:
- ioctl(SNDRV_TIMER_IOCTL_SELECT), which potentially sets
tu->queue/tu->tqueue to NULL on memory allocation failure, so read()
would get a NULL pointer dereference like the above splat
- the same ioctl() can free tu->queue/to->tqueue which means read()
could potentially see (and dereference) the freed pointer
We can fix both by taking the ioctl_lock mutex when dereferencing
->queue/->tqueue, since that's always held over all the ioctl() code.
Just looking at the code I find it likely that there are more problems
here such as tu->qhead pointing outside the buffer if the size is
changed concurrently using SNDRV_TIMER_IOCTL_PARAMS.
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
|
|
tick_nohz_start_idle() is prevented to be called if the idle tick can't
be stopped since commit 1f3b0f8243cb934 ("tick/nohz: Optimize nohz idle
enter"). As a result, after suspend/resume the host machine, full dynticks
kvm guest will softlockup:
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 26s! [swapper/0:0]
Call Trace:
default_idle+0x31/0x1a0
arch_cpu_idle+0xf/0x20
default_idle_call+0x2a/0x50
cpu_startup_entry+0x39b/0x4d0
rest_init+0x138/0x140
? rest_init+0x5/0x140
start_kernel+0x4c1/0x4ce
? set_init_arg+0x55/0x55
? early_idt_handler_array+0x120/0x120
x86_64_start_reservations+0x24/0x26
x86_64_start_kernel+0x142/0x14f
In addition, cat /proc/stat | grep cpu in guest or host:
cpu 398 16 5049 15754 5490 0 1 46 0 0
cpu0 206 5 450 0 0 0 1 14 0 0
cpu1 81 0 3937 3149 1514 0 0 9 0 0
cpu2 45 6 332 6052 2243 0 0 11 0 0
cpu3 65 2 328 6552 1732 0 0 11 0 0
The idle and iowait states are weird 0 for cpu0(housekeeping).
The bug is present in both guest and host kernels, and they both have
cpu0's idle and iowait states issue, however, host kernel's suspend/resume
path etc will touch watchdog to avoid the softlockup.
- The watchdog will not be touched in tick_nohz_stop_idle path (need be
touched since the scheduler stall is expected) if idle_active flags are
not detected.
- The idle and iowait states will not be accounted when exit idle loop
(resched or interrupt) if idle start time and idle_active flags are
not set.
This patch fixes it by reverting commit 1f3b0f8243cb934 since can't stop
idle tick doesn't mean can't be idle.
Fixes: 1f3b0f8243cb934 ("tick/nohz: Optimize nohz idle enter")
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Cc: Sanjeev Yadav<sanjeev.yadav@spreadtrum.com>
Cc: Gaurav Jindal<gaurav.jindal@spreadtrum.com>
Cc: stable@vger.kernel.org
Cc: kvm@vger.kernel.org
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lkml.kernel.org/r/1472798303-4154-1-git-send-email-wanpeng.li@hotmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Commit 8eb30be0352d0916 ("ipv6: Create ip6_tnl_xmit") unsets
flowi6_proto in ip4ip6_tnl_xmit() and ip6ip6_tnl_xmit().
Since xfrm_selector_match() relies on this info, IPv6 packets
sent by an ip6tunnel cannot be properly selected by their
protocols after removing it. This patch puts flowi6_proto back.
Cc: stable@vger.kernel.org
Fixes: 8eb30be0352d ("ipv6: Create ip6_tnl_xmit")
Signed-off-by: Eli Cooper <elicooper@gmx.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We get a few warnings when building kernel with W=1:
drivers/isdn/hardware/mISDN/hfcmulti.c:568:1: warning: no previous declaration for 'enablepcibridge' [-Wmissing-declarations]
drivers/isdn/hardware/mISDN/hfcmulti.c:574:1: warning: no previous declaration for 'disablepcibridge' [-Wmissing-declarations]
drivers/isdn/hardware/mISDN/hfcmulti.c:580:1: warning: no previous declaration for 'readpcibridge' [-Wmissing-declarations]
drivers/isdn/hardware/mISDN/hfcmulti.c:608:1: warning: no previous declaration for 'writepcibridge' [-Wmissing-declarations]
drivers/isdn/hardware/mISDN/hfcmulti.c:638:1: warning: no previous declaration for 'cpld_set_reg' [-Wmissing-declarations]
drivers/isdn/hardware/mISDN/hfcmulti.c:645:1: warning: no previous declaration for 'cpld_write_reg' [-Wmissing-declarations]
drivers/isdn/hardware/mISDN/hfcmulti.c:657:1: warning: no previous declaration for 'cpld_read_reg' [-Wmissing-declarations]
drivers/isdn/hardware/mISDN/hfcmulti.c:674:1: warning: no previous declaration for 'vpm_write_address' [-Wmissing-declarations]
drivers/isdn/hardware/mISDN/hfcmulti.c:681:1: warning: no previous declaration for 'vpm_read_address' [-Wmissing-declarations]
drivers/isdn/hardware/mISDN/hfcmulti.c:695:1: warning: no previous declaration for 'vpm_in' [-Wmissing-declarations]
drivers/isdn/hardware/mISDN/hfcmulti.c:716:1: warning: no previous declaration for 'vpm_out' [-Wmissing-declarations]
drivers/isdn/hardware/mISDN/hfcmulti.c:1028:1: warning: no previous declaration for 'plxsd_checksync' [-Wmissing-declarations]
....
In fact, these functions are only used in the file in which they are
declared and don't need a declaration, but can be made static.
so this patch marks these functions with 'static'.
Signed-off-by: Baoyou Xie <baoyou.xie@linaro.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add support for the Qualcomm Technologies, Inc. EMAC gigabit Ethernet
controller.
This driver supports the following features:
1) Checksum offload.
2) Interrupt coalescing support.
3) SGMII phy.
4) phylib interface for external phy
Based on original work by
Niranjana Vishwanathapura <nvishwan@codeaurora.org>
Gilad Avidov <gavidov@codeaurora.org>
Signed-off-by: Timur Tabi <timur@codeaurora.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
drm-fixes
This pull request brings in fixes for VC4 3D in 4.8, most of which are
covered by testcases.
* tag 'drm-vc4-fixes-2016-08-29' of https://github.com/anholt/linux:
drm/vc4: Fix oops when userspace hands in a bad BO.
drm/vc4: Fix overflow mem unreferencing when the binner runs dry.
drm/vc4: Free hang state before destroying BO cache.
drm/vc4: Fix handling of a pm_runtime_get_sync() success case.
drm/vc4: Use drm_malloc_ab to fix large rendering jobs.
drm/vc4: Use drm_free_large() on handles to match its allocation.
|
|
Access the priv member of the dsa_switch structure directly, instead of
having an unnecessary helper.
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When PCI error is detected, in some architectures (like PowerPC) a slot
reset is performed - the driver's error handlers are in charge of "disable"
device before the reset, and re-enable it after a successful slot reset.
There are two cases though that another path is taken on the code: if the
slot reset is not successful or if too many errors already happened in the
specific adapter (meaning that possibly the device is experiencing a HW
failure that slot reset is not able to solve), the core PCI error mechanism
(called EEH in PowerPC) will remove the adapter from the system, since it
will consider this as a permanent failure on device. In this case, a path
is taken that leads to bnx2x_chip_cleanup() calling bnx2x_reset_hw(), which
then tries to perform a HW reset on chip. This reset won't succeed since
the HW is in a fault state, which can be seen by multiple messages on
kernel log like below:
bnx2x: [bnx2x_issue_dmae_with_comp:552(eth1)]DMAE timeout!
bnx2x: [bnx2x_write_dmae:600(eth1)]DMAE returned failure -1
After some time, the PCI error mechanism gives up on waiting the driver's
correct removal procedure and forcibly remove the adapter from the system.
We can see soft lockup while core PCI error mechanism is waiting for driver
to accomplish the right removal process.
This patch adds a verification to avoid a chip reset whenever the function
is in PCI error state - since this case is only reached when we have a
device being removed because of a permanent failure, the HW chip reset is
not expected to work fine neither is necessary.
Also, as a minor improvement in error path, we avoid the MCP information dump
in case of non-recoverable PCI error (when adapter is about to be removed),
since it will certainly fail.
Reported-by: Harsha Thyagaraja <hathyaga@in.ibm.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@linux.vnet.ibm.com>
Acked-By: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Nikolay Aleksandrov says:
====================
net: bridge: add per-port unknown multicast flood control
The first patch prepares the forwarding path by having the exact packet
type passed down so we can later filter based on it and the per-port
unknown mcast flood flag introduced in the second patch. It is similar to
how the per-port unknown unicast flood flag works.
Nice side-effects of patch 01 are the slight reduction of tests in the
fast-path and a few minor checkpatch fixes.
v3: don't change br_auto_mask as that will change user-visible behaviour
v2: make pkt_type an enum as per Stephen's comment
====================
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.pengutronix.de/git/pza/linux into drm-fixes
imx-drm atomic modeset regression fixes
- add active plane reconfiguration support
- add back crtc vblank state reporting
* tag 'imx-drm-fixes-2016-08-30' of git://git.pengutronix.de/git/pza/linux:
drm/imx: fix crtc vblank state regression
drm/imx: Add active plane reconfiguration support
|
|
Add a per-port flag to control the unknown multicast flood, similar to the
unknown unicast flood flag and break a few long lines in the netlink flag
exports.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Remove the unicast flag and introduce an exact pkt_type. That would help us
for the upcoming per-port multicast flood flag and also slightly reduce the
tests in the input fast path.
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The original codes depend on that the function parameters are evaluated from
left to right. But the parameter's evaluation order is not defined in C
standard actually.
When flow_keys_have_l4(&keys) is invoked before ___skb_get_hash(skb, &keys,
hashrnd) with some compilers or environment, the keys passed to
flow_keys_have_l4 is not initialized.
Fixes: 6db61d79c1e1 ("flow_dissector: Ignore flow dissector return value from ___skb_get_hash")
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A collection of small fixes for various SoC vendor clk drivers"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: rockchip: mark aclk_emmc_noc as a critical clock on rk3399
clk: tegra: remove TEGRA_PLL_USE_LOCK for PLLD/PLLD2
clk: rockchip: fix incorrect GATE bits for {c, g}pll_aclk_perihp_src on rk3399
clk: rockchip: fix incorrect aclk_emmc source gate bits on rk3399
clk: renesas: r8a7795: Fix SD clocks
clk: rockchip: fix rk3399 aclk_vio gate bit
clk: sunxi-ng: Fix inverted test condition in ccu_helper_wait_for_lock
|
|
Merge fixes from Andrew Morton:
"14 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
rapidio/tsi721: fix incorrect detection of address translation condition
rapidio/documentation/mport_cdev: add missing parameter description
kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd
MAINTAINERS: Vladimir has moved
mm, mempolicy: task->mempolicy must be NULL before dropping final reference
printk/nmi: avoid direct printk()-s from __printk_nmi_flush()
treewide: remove references to the now unnecessary DEFINE_PCI_DEVICE_TABLE
drivers/scsi/wd719x.c: remove last declaration using DEFINE_PCI_DEVICE_TABLE
mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator
lib/test_hash.c: fix warning in preprocessor symbol evaluation
lib/test_hash.c: fix warning in two-dimensional array init
kconfig: tinyconfig: provide whole choice blocks to avoid warnings
kexec: fix double-free when failing to relocate the purgatory
mm, oom: prevent premature OOM killer invocation for high order request
|
|
Fix incorrect condition to identify involvment of a address translation
mechanism.
This bug results in NULL pointer kernel crash dump in cases when mapping
of inbound RapidIO address range is requested within existing aprture.
Link: http://lkml.kernel.org/r/20160901173144.2983-1-alexandre.bounine@idt.com
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Barry Wood <barry.wood@idt.com>
Cc: <stable@vger.kernel.org> [4.6+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Add missing description for rio_mport_cdev driver parameter
'dma_timeout'.
This patch is applicable to kernel versions starting from v4.6.
Link: http://lkml.kernel.org/r/20160901173104.2928-1-alexandre.bounine@idt.com
Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Andre van Herk <andre.van.herk@prodrive-technologies.com>
Cc: Barry Wood <barry.wood@idt.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Commit fec1d0115240 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal
exit") has caused a subtle regression in nscd which uses
CLONE_CHILD_CLEARTID to clear the nscd_certainly_running flag in the
shared databases, so that the clients are notified when nscd is
restarted. Now, when nscd uses a non-persistent database, clients that
have it mapped keep thinking the database is being updated by nscd, when
in fact nscd has created a new (anonymous) one (for non-persistent
databases it uses an unlinked file as backend).
The original proposal for the CLONE_CHILD_CLEARTID change claimed
(https://lkml.org/lkml/2006/10/25/233):
: The NPTL library uses the CLONE_CHILD_CLEARTID flag on clone() syscalls
: on behalf of pthread_create() library calls. This feature is used to
: request that the kernel clear the thread-id in user space (at an address
: provided in the syscall) when the thread disassociates itself from the
: address space, which is done in mm_release().
:
: Unfortunately, when a multi-threaded process incurs a core dump (such as
: from a SIGSEGV), the core-dumping thread sends SIGKILL signals to all of
: the other threads, which then proceed to clear their user-space tids
: before synchronizing in exit_mm() with the start of core dumping. This
: misrepresents the state of process's address space at the time of the
: SIGSEGV and makes it more difficult for someone to debug NPTL and glibc
: problems (misleading him/her to conclude that the threads had gone away
: before the fault).
:
: The fix below is to simply avoid the CLONE_CHILD_CLEARTID action if a
: core dump has been initiated.
The resulting patch from Roland (https://lkml.org/lkml/2006/10/26/269)
seems to have a larger scope than the original patch asked for. It
seems that limitting the scope of the check to core dumping should work
for SIGSEGV issue describe above.
[Changelog partly based on Andreas' description]
Fixes: fec1d0115240 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal exit")
Link: http://lkml.kernel.org/r/1471968749-26173-1-git-send-email-mhocko@kernel.org
Signed-off-by: Michal Hocko <mhocko@suse.com>
Tested-by: William Preston <wpreston@suse.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Roland McGrath <roland@hack.frob.com>
Cc: Andreas Schwab <schwab@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
vdavydov@{parallels,virtuozzo}.com will bounce from now on.
Link: http://lkml.kernel.org/r/20160831180752.GB10353@esperanza
Signed-off-by: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|