summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2014-10-28sched/fair: Fix division by zero sysctl_numa_balancing_scan_sizeKirill Tkhai
File /proc/sys/kernel/numa_balancing_scan_size_mb allows writing of zero. This bash command reproduces problem: $ while :; do echo 0 > /proc/sys/kernel/numa_balancing_scan_size_mb; \ echo 256 > /proc/sys/kernel/numa_balancing_scan_size_mb; done divide error: 0000 [#1] SMP Modules linked in: CPU: 0 PID: 24112 Comm: bash Not tainted 3.17.0+ #8 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011 task: ffff88013c852600 ti: ffff880037a68000 task.ti: ffff880037a68000 RIP: 0010:[<ffffffff81074191>] [<ffffffff81074191>] task_scan_min+0x21/0x50 RSP: 0000:ffff880037a6bce0 EFLAGS: 00010246 RAX: 0000000000000a00 RBX: 00000000000003e8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88013c852600 RBP: ffff880037a6bcf0 R08: 0000000000000001 R09: 0000000000015c90 R10: ffff880239bf6c00 R11: 0000000000000016 R12: 0000000000003fff R13: ffff88013c852600 R14: ffffea0008d1b000 R15: 0000000000000003 FS: 00007f12bb048700(0000) GS:ffff88007da00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 0000000001505678 CR3: 0000000234770000 CR4: 00000000000006f0 Stack: ffff88013c852600 0000000000003fff ffff880037a6bd18 ffffffff810741d1 ffff88013c852600 0000000000003fff 000000000002bfff ffff880037a6bda8 ffffffff81077ef7 ffffea0008a56d40 0000000000000001 0000000000000001 Call Trace: [<ffffffff810741d1>] task_scan_max+0x11/0x40 [<ffffffff81077ef7>] task_numa_fault+0x1f7/0xae0 [<ffffffff8115a896>] ? migrate_misplaced_page+0x276/0x300 [<ffffffff81134a4d>] handle_mm_fault+0x62d/0xba0 [<ffffffff8103e2f1>] __do_page_fault+0x191/0x510 [<ffffffff81030122>] ? native_smp_send_reschedule+0x42/0x60 [<ffffffff8106dc00>] ? check_preempt_curr+0x80/0xa0 [<ffffffff8107092c>] ? wake_up_new_task+0x11c/0x1a0 [<ffffffff8104887d>] ? do_fork+0x14d/0x340 [<ffffffff811799bb>] ? get_unused_fd_flags+0x2b/0x30 [<ffffffff811799df>] ? __fd_install+0x1f/0x60 [<ffffffff8103e67c>] do_page_fault+0xc/0x10 [<ffffffff8150d322>] page_fault+0x22/0x30 RIP [<ffffffff81074191>] task_scan_min+0x21/0x50 RSP <ffff880037a6bce0> ---[ end trace 9a826d16936c04de ]--- Also fix race in task_scan_min (it depends on compiler behaviour). Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Aaron Tomlin <atomlin@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Dario Faggioli <raistlin@linux.it> Cc: David Rientjes <rientjes@google.com> Cc: Jens Axboe <axboe@fb.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Rik van Riel <riel@redhat.com> Link: http://lkml.kernel.org/r/1413455977.24793.78.camel@tkhai Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28sched/fair: Care divide error in update_task_scan_period()Yasuaki Ishimatsu
While offling node by hot removing memory, the following divide error occurs: divide error: 0000 [#1] SMP [...] Call Trace: [...] handle_mm_fault [...] ? try_to_wake_up [...] ? wake_up_state [...] __do_page_fault [...] ? do_futex [...] ? put_prev_entity [...] ? __switch_to [...] do_page_fault [...] page_fault [...] RIP [<ffffffff810a7081>] task_numa_fault RSP <ffff88084eb2bcb0> The issue occurs as follows: 1. When page fault occurs and page is allocated from node 1, task_struct->numa_faults_buffer_memory[] of node 1 is incremented and p->numa_faults_locality[] is also incremented as follows: o numa_faults_buffer_memory[] o numa_faults_locality[] NR_NUMA_HINT_FAULT_TYPES | 0 | 1 | ---------------------------------- ---------------------- node 0 | 0 | 0 | remote | 0 | node 1 | 0 | 1 | locale | 1 | ---------------------------------- ---------------------- 2. node 1 is offlined by hot removing memory. 3. When page fault occurs, fault_types[] is calculated by using p->numa_faults_buffer_memory[] of all online nodes in task_numa_placement(). But node 1 was offline by step 2. So the fault_types[] is calculated by using only p->numa_faults_buffer_memory[] of node 0. So both of fault_types[] are set to 0. 4. The values(0) of fault_types[] pass to update_task_scan_period(). 5. numa_faults_locality[1] is set to 1. So the following division is calculated. static void update_task_scan_period(struct task_struct *p, unsigned long shared, unsigned long private){ ... ratio = DIV_ROUND_UP(private * NUMA_PERIOD_SLOTS, (private + shared)); } 6. But both of private and shared are set to 0. So divide error occurs here. The divide error is rare case because the trigger is node offline. This patch always increments denominator for avoiding divide error. Signed-off-by: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/54475703.8000505@jp.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28sched/numa: Fix unsafe get_task_struct() in task_numa_assign()Kirill Tkhai
Unlocked access to dst_rq->curr in task_numa_compare() is racy. If curr task is exiting this may be a reason of use-after-free: task_numa_compare() do_exit() ... current->flags |= PF_EXITING; ... release_task() ... ~~delayed_put_task_struct()~~ ... schedule() rcu_read_lock() ... cur = ACCESS_ONCE(dst_rq->curr) ... ... rq->curr = next; ... context_switch() ... finish_task_switch() ... put_task_struct() ... __put_task_struct() ... free_task_struct() task_numa_assign() ... get_task_struct() ... As noted by Oleg: <<The lockless get_task_struct(tsk) is only safe if tsk == current and didn't pass exit_notify(), or if this tsk was found on a rcu protected list (say, for_each_process() or find_task_by_vpid()). IOW, it is only safe if release_task() was not called before we take rcu_read_lock(), in this case we can rely on the fact that delayed_put_pid() can not drop the (potentially) last reference until rcu_read_unlock(). And as Kirill pointed out task_numa_compare()->task_numa_assign() path does get_task_struct(dst_rq->curr) and this is not safe. The task_struct itself can't go away, but rcu_read_lock() can't save us from the final put_task_struct() in finish_task_switch(); this reference goes away without rcu gp>> The patch provides simple check of PF_EXITING flag. If it's not set, this guarantees that call_rcu() of delayed_put_task_struct() callback hasn't happened yet, so we can safely do get_task_struct() in task_numa_assign(). Locked dst_rq->lock protects from concurrency with the last schedule(). Reusing or unmapping of cur's memory may happen without it. Suggested-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1413962231.19914.130.camel@tkhai Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28sched/deadline: Fix races between rt_mutex_setprio() and dl_task_timer()Juri Lelli
dl_task_timer() is racy against several paths. Daniel noticed that the replenishment timer may experience a race condition against an enqueue_dl_entity() called from rt_mutex_setprio(). With his own words: rt_mutex_setprio() resets p->dl.dl_throttled. So the pattern is: start_dl_timer() throttled = 1, rt_mutex_setprio() throlled = 0, sched_switch() -> enqueue_task(), dl_task_timer-> enqueue_task() throttled is 0 => BUG_ON(on_dl_rq(dl_se)) fires as the scheduling entity is already enqueued on the -deadline runqueue. As we do for the other races, we just bail out in the replenishment timer code. Reported-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Juri Lelli <juri.lelli@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: vincent@legout.info Cc: Dario Faggioli <raistlin@linux.it> Cc: Michael Trimarchi <michael@amarulasolutions.com> Cc: Fabio Checconi <fchecconi@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1414142198-18552-5-git-send-email-juri.lelli@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28sched/deadline: Don't replenish from a !SCHED_DEADLINE entityJuri Lelli
In the deboost path, right after the dl_boosted flag has been reset, we can currently end up replenishing using -deadline parameters of a !SCHED_DEADLINE entity. This of course causes a bug, as those parameters are empty. In the case depicted above it is safe to simply bail out, as the deboosted task is going to be back to its original scheduling class anyway. Reported-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Tested-by: Daniel Wagner <daniel.wagner@bmw-carit.de> Signed-off-by: Juri Lelli <juri.lelli@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: vincent@legout.info Cc: Dario Faggioli <raistlin@linux.it> Cc: Michael Trimarchi <michael@amarulasolutions.com> Cc: Fabio Checconi <fchecconi@gmail.com> Link: http://lkml.kernel.org/r/1414142198-18552-4-git-send-email-juri.lelli@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28sched: Fix race between task_group and sched_task_groupKirill Tkhai
The race may happen when somebody is changing task_group of a forking task. Child's cgroup is the same as parent's after dup_task_struct() (there just memory copying). Also, cfs_rq and rt_rq are the same as parent's. But if parent changes its task_group before it's called cgroup_post_fork(), we do not reflect this situation on child. Child's cfs_rq and rt_rq remain the same, while child's task_group changes in cgroup_post_fork(). To fix this we introduce fork() method, which calls sched_move_task() directly. This function changes sched_task_group on appropriate (also its logic has no problem with freshly created tasks, so we shouldn't introduce something special; we are able just to use it). Possibly, this decides the Burke Libbey's problem: https://lkml.org/lkml/2014/10/24/456 Signed-off-by: Kirill Tkhai <ktkhai@parallels.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/r/1414405105.19914.169.camel@tkhai Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28MAINTAINERS: ufs - remove selfSantosh Y
I have moved, I do not have the hardware access anymore. Signed-off-by: Santosh Y <santoshsy@gmail.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-10-28MAINTAINERS: change hpsa and cciss maintainerDon Brace
Change ownership of the hpsa driver from Stephen M. Cameron (Hewlett-Packard) to Don Brace (PMC-Sierra). Change ownership of the cciss driver from Mike Miller (Hewlett-Packard) to Don Brace (PMC-Sierra). Signed-off-by: Don Brace <don.brace@pmcs.com> Reviewed-by: Robert Elliott <elliott@hp.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-10-28libcxgbi : support ipv6 address host_paramAnish Bhatt
libcxgbi was always returning an ipv4 address for ISCSI_HOST_PARAM_IPADDRESS, return appropriate address based on address family Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: Karen Xie <kxie@chelsio.com> Reviewed-by: Mike Christie <michaelc@cs.wisc.edu> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-10-28scsi: set REQ_QUEUE for the blk-mq caseChristoph Hellwig
To generate the right SPI tag messages we need to properly set QUEUE_FLAG_QUEUED in the request_queue and mirror it to the request. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Jens Axboe <axboe@kernel.dk> Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Cc: stable@vger.kernel.org
2014-10-28Revert "block: all blk-mq requests are tagged"Christoph Hellwig
This reverts commit fb3ccb5da71273e7f0d50b50bc879e50cedd60e7. SCSI-2/SPI actually needs the tagged/untagged flag in the request to work properly. Revert this patch and add a follow on to set it in the right place. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Acked-by: Jens Axboe <axboe@kernel.dk> Reported-by: Meelis Roos <mroos@linux.ee> Tested-by: Meelis Roos <mroos@linux.ee> Cc: stable@vger.kernel.org
2014-10-28lib/scatterlist: fix memory leak with scsi-mqTony Battersby
Fix a memory leak with scsi-mq triggered by commands with large data transfer length. Fixes: c53c6d6a68b1 ("scatterlist: allow chaining to preallocated chunks") Cc: <stable@vger.kernel.org> # 3.17.x Signed-off-by: Tony Battersby <tonyb@cybernetics.com> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Christoph Hellwig <hch@lst.de>
2014-10-28cxl: Fix PSL error due to duplicate segment table entriesIan Munsie
In certain circumstances the PSL (Power Service Layer, which provides translation services for CXL hardware) can send an interrupt for a segment miss that the kernel has already handled. This can happen if multiple translations for the same segment are queued in the PSL before the kernel has restarted the first translation. The CXL driver does not expect this situation and does not check if a segment had already been handled. This could cause a duplicate segment table entry which in turn caused a PSL error taking down the card. This patch fixes the issue by checking for existing entries in the segment table that match the segment we are trying to insert, so as to avoid inserting duplicate entries. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-28powerpc/mm: Use appropriate ESID mask in copro_calculate_slb()Ian Munsie
This patch makes copro_calculate_slb() mask the ESID by the correct mask for 1T vs 256M segments. This has no effect by itself as the extra bits were ignored, but it makes debugging the segment table entries easier and means that we can directly compare the ESID values for duplicates without needing to worry about masking in the comparison. This will be used to simplify a comparison in the following patch. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-28cxl: Refactor cxl_load_segment() and find_free_sste()Ian Munsie
This moves the segment table hash calculation from cxl_load_segment() into find_free_sste() since that is the only place it is actually used. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-28cxl: Disable secondary hash in segment tableIan Munsie
This patch simplifies the process of finding a free segment table entry by disabling the secondary hash. This reduces the number of possible entries in the segment table for a given address from 16 to 8. Due to the large segment sizes we use it is extremely unlikely that the secondary hash would ever have been used in practice, so this should not have any negative impacts and may even improve performance due to the reduced number of comparisons that software & hardware need to perform. This patch clears the SC bit in the hardware's state register (CXL_PSL_SR_An) to disable the secondary hash in the hardware since we can no longer fill out entries using it. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-28x86, cma: Reserve DMA contiguous area after initmem_init()Weijie Yang
Fengguang Wu reported a boot crash on the x86 platform via the 0-day Linux Kernel Performance Test: cma: dma_contiguous_reserve: reserving 31 MiB for global area BUG: Int 6: CR2 (null) [<41850786>] dump_stack+0x16/0x18 [<41d2b1db>] early_idt_handler+0x6b/0x6b [<41072227>] ? __phys_addr+0x2e/0xca [<41d4ee4d>] cma_declare_contiguous+0x3c/0x2d7 [<41d6d359>] dma_contiguous_reserve_area+0x27/0x47 [<41d6d4d1>] dma_contiguous_reserve+0x158/0x163 [<41d33e0f>] setup_arch+0x79b/0xc68 [<41d2b7cf>] start_kernel+0x9c/0x456 [<41d2b2ca>] i386_start_kernel+0x79/0x7d (See details at: https://lkml.org/lkml/2014/10/8/708) It is because dma_contiguous_reserve() is called before initmem_init() in x86, the variable high_memory is not initialized but accessed by __pa(high_memory) in dma_contiguous_reserve(). This patch moves dma_contiguous_reserve() after initmem_init() so that high_memory is initialized before accessed. Reported-by: Fengguang Wu <fengguang.wu@intel.com> Signed-off-by: Weijie Yang <weijie.yang@samsung.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Acked-by: Michal Nazarewicz <mina86@mina86.com> Cc: iamjoonsoo.kim@lge.com Cc: 'Linux-MM' <linux-mm@kvack.org> Cc: 'Weijie Yang' <weijie.yang.kh@gmail.com> Link: http://lkml.kernel.org/r/000101cfef69%2431e528a0%2495af79e0%24%25yang@samsung.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-10-28Revert "powerpc/powernv: Fix endian bug in LPC bus debugfs accessors"Michael Ellerman
This reverts commit bf7588a0859580a45c63cb082825d77c13eca357. Ben says although the code is not correct "[this] fix was completely wrong and does more damages than it fixes things." Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-28ipvs: Avoid null-pointer deref in debug codeAlex Gartrell
Use daddr instead of reaching into dest. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Gartrell <agartrell@fb.com> Signed-off-by: Simon Horman <horms@verge.net.au>
2014-10-28Merge branch 'drm-armada-fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm ↵Dave Airlie
into drm-fixes Three changes for the Armada DRM driver: 1. Add back the flags which tell the DRM core that we can do vblank. This was removed in error during the recent restructuring, and came to light while trying textured Xv rendering. 2. Fixing a refcount leak with Xv overlay. 3. As per recent discussion, the drm_vblank_pre_modeset() calls can cause deadlock with other changes to generic code. This change prevents those deadlocks by switching to the drm_crtc_vblank_*() calls instead. * 'drm-armada-fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm: drm/armada: convert to use vblank_on/off calls drm/armada: fix page_flip refcounting leak drm/armada: add IRQ support back
2014-10-27bpf: split eBPF out of NETAlexei Starovoitov
introduce two configs: - hidden CONFIG_BPF to select eBPF interpreter that classic socket filters depend on - visible CONFIG_BPF_SYSCALL (default off) that tracing and sockets can use that solves several problems: - tracing and others that wish to use eBPF don't need to depend on NET. They can use BPF_SYSCALL to allow loading from userspace or select BPF to use it directly from kernel in NET-less configs. - in 3.18 programs cannot be attached to events yet, so don't force it on - when the rest of eBPF infra is there in 3.19+, it's still useful to switch it off to minimize kernel size bloat-o-meter on x64 shows: add/remove: 0/60 grow/shrink: 0/2 up/down: 0/-15601 (-15601) tested with many different config combinations. Hopefully didn't miss anything. Signed-off-by: Alexei Starovoitov <ast@plumgrid.com> Acked-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-27Merge branch 'cxgb4-net'David S. Miller
Anish Bhatt says: ==================== cxgb4 : DCBx fixes for apps/host lldp agents This patchset contains some minor fixes for cxgb4 DCBx code. Chiefly, cxgb4 was not cleaning up any apps added to kernel app table when link was lost. Disabling DCBx in firmware would automatically set DCBx state to host-managed and enabled, we now wait for an explicit enable call from an lldp agent instead First patch was originally sent to net-next, but considering it applies to correcting behaviour of code already in net, I think it qualifies as a bug fix. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-27cxgb4 : Handle dcb enable correctlyAnish Bhatt
Disabling DCBx in firmware automatically enables DCBx for control via host lldp agents. Wait for an explicit setstate call from an lldp agents to enable DCBx instead. Fixes: 76bcb31efc06 ("cxgb4 : Add DCBx support codebase and dcbnl_ops") Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-27cxgb4 : Improve handling of DCB negotiation or loss thereofAnish Bhatt
Clear out any DCB apps we might have added to kernel table when we lose DCB sync (or IEEE equivalent event). These were previously left behind and not cleaned up correctly. IEEE allows individual components to work independently, so improve check for IEEE completion by specifying individual components. Fixes: 10b0046685ab ("cxgb4: IEEE fixes for DCBx state machine") Signed-off-by: Anish Bhatt <anish@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nfDavid S. Miller
Pablo Neira Ayuso says: ==================== Netfilter fixes for net The following patchset contains Netfilter fixes for your net tree, they are: 1) Allow to recycle a TCP port in conntrack when the change role from server to client, from Marcelo Leitner. 2) Fix possible off by one access in ip_set_nfnl_get_byindex(), patch from Dan Carpenter. 3) alloc_percpu returns NULL on error, no need for IS_ERR() in nf_tables chain statistic updates. From Sabrina Dubroca. 4) Don't compile ip options in bridge netfilter, this mangles the packet and bridge should not alter layer >= 3 headers when forwarding packets. Patch from Herbert Xu and tested by Florian Westphal. 5) Account the final NLMSG_DONE message when calculating the size of the nflog netlink batches. Patch from Florian Westphal. 6) Fix a possible netlink attribute length overflow with large packets. Again from Florian Westphal. 7) Release the skbuff if nfnetlink_log fails to put the final NLMSG_DONE message. This fixes a leak on error. This shouldn't ever happen though, otherwise this means we miscalculate the netlink batch size, so spot a warning if this ever happens so we can track down the problem. This patch from Houcheng Lin. 8) Look at the right list when recycling targets in the nft_compat, patch from Arturo Borrero. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-10-27cpufreq: cpufreq-dt: Restore default cpumask_setall(policy->cpus)Geert Uytterhoeven
Commit 34e5a5273d6aa0ee ("cpufreq: cpufreq-dt: extend with platform_data") changed cpufreq_init() to only call cpumask_setall(policy->cpus) if the platform data indicates that all CPUs share the same clock. Before, cpufreq_generic_init() did this unconditionally. This causes a crash on r8a7791/koelsch when resuming from s2ram: Enabling non-boot CPUs ... CPU1: Booted secondary processor Unable to handle kernel NULL pointer dereference at virtual address 0000003c pgd = ee71f980 [0000003c] *pgd=6eeb6003, *pmd=6e0e9003, *pte=00000000 Internal error: Oops: a07 [#1] SMP ARM Modules linked in: CPU: 0 PID: 1397 Comm: s2ram Tainted: G W 3.18.0-rc2-koelsch-00762-g7eed2a4e61d2d978 #581 task: ee6b76c0 ti: ee7f0000 task.ti: ee7f0000 PC is at __cpufreq_add_dev.isra.24+0x24c/0x77c LR is at __cpufreq_add_dev.isra.24+0x244/0x77c pc : [<c029e084>] lr : [<c029e07c>] psr: 60000153 sp : ee7f1d48 ip : ee7f1d48 fp : ee7f1d84 r10: c04e8448 r9 : 00000000 r8 : 00000001 r7 : c054a8c4 r6 : 00000001 r5 : 00000001 r4 : 00000000 r3 : 00000000 r2 : 00000000 r1 : 20000153 r0 : c054a950 Flags: nZCv IRQs on FIQs off Mode SVC_32 ISA ARM Segment user Control: 30c5307d Table: 6e71f980 DAC: fffffffd Process s2ram (pid: 1397, stack limit = 0xee7f0240) ... Backtrace: [<c029de38>] (__cpufreq_add_dev.isra.24) from [<c029e620>] (cpufreq_cpu_callback+0x6c/0x74) r10:eec75240 r9:c04e8448 r8:c04ef3a0 r7:00000001 r6:00000012 r5:00000000 r4:00000012 [<c029e5b4>] (cpufreq_cpu_callback) from [<c003f20c>] (notifier_call_chain+0x48/0x70) r4:ffffffdd r3:c029e5b4 [<c003f1c4>] (notifier_call_chain) from [<c003f2cc>] (__raw_notifier_call_chain+0x1c/0x24) r8:00000001 r7:00000010 r6:00000000 r5:00000000 r4:00000012 r3:ffffffff [<c003f2b0>] (__raw_notifier_call_chain) from [<c0026a00>] (__cpu_notify+0x34/0x50) [<c00269cc>] (__cpu_notify) from [<c0026a34>] (cpu_notify+0x18/0x1c) r4:00000001 [<c0026a1c>] (cpu_notify) from [<c0026c44>] (_cpu_up+0x108/0x144) [<c0026b3c>] (_cpu_up) from [<c0381c68>] (enable_nonboot_cpus+0x68/0xb8) r10:00000000 r9:c04e8ee6 r8:00000000 r7:00000003 r6:c04e8528 r5:c0506248 r4:00000001 [<c0381c00>] (enable_nonboot_cpus) from [<c0059038>] (suspend_devices_and_enter+0x29c/0x3e8) r6:c0506e70 r5:00000000 r4:00000000 r3:60000153 Restore the old default of calling cpumask_setall(policy->cpus) if no platform data is available to fix this. Fixes: 34e5a5273d6aa0ee (cpufreq: cpufreq-dt: extend with platform_data) Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Reviewed-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-27Merge tag 'media/v3.18-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media Pull media fixes from Mauro Carvalho Chehab: "A series of driver fixes: - a few compilation fixes with randconfigs - one potential compilation breakage on userspace due to the usage of a gcc extension - several warnings fixed - some other random driver fixes" * tag 'media/v3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (22 commits) [media] s5p-jpeg: Avoid -Wuninitialized warning in s5p_jpeg_parse_hdr [media] s5p-fimc: Only build suspend/resume for PM [media] s5p-jpeg: Only build suspend/resume for PM [media] Remove references to non-existent PLAT_S5P symbol [media] videobuf-dma-contig: set vm_pgoff to be zero to pass the sanity check in vm_iomap_memory() [media] tw68: remove bogus I2C_ALGOBIT dependency [media] usbvision-video: two use after frees [media] tw68: remove deprecated IRQF_DISABLED [media] xc5000: use after free in release() [media] em28xx-input: NULL dereference on error [media] wl128x: fix fmdbg compiler warning Revert "[media] v4l2-dv-timings: fix a sparse warning" [media] hackrf: harmless off by one in debug code [media] cx23885: initialize config structs for T9580 [media] v4l: uvcvideo: Fix buffer completion size check [media] vivid: fix buffer overrun [media] saa7146: Create a device name before it's used [media] em28xx: fix uninitialized variable warning [media] vivid: fix Kconfig FB dependency [media] anysee: make sure loading modules is const ...
2014-10-27Merge tag 'edac_fixes_for_3.18' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp Pull EDAC fixes from Borislav Petkov: "Correct severity of reported errors in several EDAC drivers. From Jason Baron" * tag 'edac_fixes_for_3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/bp/bp: e7xxx_edac: Report CE events properly cpc925_edac: Report UE events properly i82860_edac: Report CE events properly i3200_edac: Report CE events properly
2014-10-27Merge tag 'spi-v3.18-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi Pull spi fixes from Mark Brown: "Quite a few driver fixes in here, including some fairly substantial ones for the recently added Rockchip driver, plus a fix for spidev to more reliably support bidirectional transfers which is fairly large but basically mechanical. It's a bit more code than I'd like but all fixes" * tag 'spi-v3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi: spi: orion: fix potential NULL pointer de-reference spi/rockchip: spi controller must be disabled in tx callback too spi/rockchip: fix bug that cause spi transfer timed out in DMA duplex mode spi/rockchip: fix bug that case spi can't go as fast as slave request spi: pl022: Fix incorrect dma_unmap_sg spi: spidev: Use separate TX and RX bounce buffers spi: dw: Initialize of_node to discover DT node children
2014-10-27Merge tag 'regulator-v3.18-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator Pull regulator fixes from Mark Brown: "A couple of fixes for v3.18, one fix for an incorrect voltage to register mapping in the rk808 driver and a fix for a build failure in some SH defconfigs" * tag 'regulator-v3.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator: regulator: Include err.h from consumer.h to fix build failure regulator: rk808: Fix min_uV for DCDC1 & DCDC2
2014-10-27netfilter: nft_compat: fix wrong target lookup in nft_target_select_ops()Arturo Borrero
The code looks for an already loaded target, and the correct list to search is nft_target_list, not nft_match_list. Signed-off-by: Arturo Borrero Gonzalez <arturo.borrero.glez@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2014-10-27Btrfs: properly clean up btrfs_end_io_wq_cacheJosef Bacik
In one of Dave's cleanup commits he forgot to call btrfs_end_io_wq_exit on unload, which makes us unable to unload and then re-load the btrfs module. This fixes the problem. Thanks, Signed-off-by: Josef Bacik <jbacik@fb.com> Reviewed-by: David Sterba <dsterba@suse.cz> Reviewed-by: Miao Xie <miaox@cn.fujitsu.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-10-27Btrfs: fix invalid leaf slot access in btrfs_lookup_extent()Filipe Manana
If we couldn't find our extent item, we accessed the current slot (path->slots[0]) to check if it corresponds to an equivalent skinny metadata item. However this slot could be beyond our last item in the leaf (i.e. path->slots[0] >= btrfs_header_nritems(leaf)), in which case we shouldn't process it. Since btrfs_lookup_extent() is only used to find extent items for data extents, fix this by removing completely the logic that looks up for an equivalent skinny metadata item, since it can not exist. Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Chris Mason <clm@fb.com>
2014-10-27btrfs: use macro accessors in superblock validation checksDavid Sterba
The initial patch c926093ec516f5d316 (btrfs: add more superblock checks) did not properly use the macro accessors that wrap endianness and the code would not work correctly on big endian machines. Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.cz> Signed-off-by: Chris Mason <clm@fb.com>
2014-10-27Revert duplicate "PCI: pciehp: Prevent NULL dereference during probe"Kamal Mostafa
This reverts bceee4a97eb5 ("PCI: pciehp: Prevent NULL dereference during probe") because it was accidentally applied twice: 62e4492c3063 ("PCI: Prevent NULL dereference during pciehp probe") bceee4a97eb5 ("PCI: pciehp: Prevent NULL dereference during probe") Revert the latter to dispose of the duplicated code block. [bhelgaas: tidy changelog, drop stable tag] Signed-off-by: Kamal Mostafa <kamal@canonical.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Andreas Noever <andreas.noever@gmail.com>
2014-10-27PM / Sleep: fix recovery during resuming from hibernationImre Deak
If a device's dev_pm_ops::freeze callback fails during the QUIESCE phase, we don't rollback things correctly calling the thaw and complete callbacks. This could leave some devices in a suspended state in case of an error during resuming from hibernation. Signed-off-by: Imre Deak <imre.deak@intel.com> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-27PM / Sleep: fix async suspend_late/freeze_late error handlingImre Deak
If an asynchronous suspend_late or freeze_late callback fails during the SUSPEND, FREEZE or QUIESCE phases, we don't propagate the corresponding error correctly, in effect ignoring the error and continuing the suspend-to-ram/hibernation. During suspend-to-ram this could leave some devices without a valid saved context, leading to a failure to reinitialize them during resume. During hibernation this could leave some devices active interfeering with the creation / restoration of the hibernation image. Also this could leave the corresponding devices without a valid saved context and failure to reinitialize them during resume. Fixes: de377b397272 (PM / sleep: Asynchronous threads for suspend_late) Signed-off-by: Imre Deak <imre.deak@intel.com> Cc: 3.15+ <stable@vger.kernel.org> # 3.15+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-27ACPI: Use ACPI companion to match only the first physical deviceMika Westerberg
Commit 6ab3430129e2 ("mfd: Add ACPI support") made the MFD subdevices share the parent MFD ACPI companion if no _HID/_CID is specified for the subdevice in mfd_cell description. However, since all the subdevices share the ACPI companion, the match and modalias generation logic started to use the ACPI companion as well resulting this: # cat /sys/bus/platform/devices/HID-SENSOR-200041.6.auto/modalias acpi:INT33D1:PNP0C50: instead of the expected one # cat /sys/bus/platform/devices/HID-SENSOR-200041.6.auto/modalias platform:HID-SENSOR-200041 In other words the subdevice modalias is overwritten by the one taken from ACPI companion. This causes udev not to load the driver anymore. It is useful to be able to share the ACPI companion so that MFD subdevices (and possibly other devices as well) can access the ACPI resources even if they do not have ACPI representation in the namespace themselves. An example where this is used is Minnowboard LPC driver that creates GPIO as a subdevice among other things. Without the ACPI companion gpiolib is not able to lookup the corresponding GPIO controller from ACPI GpioIo resource. To fix this, restrict the match and modalias logic to be limited to the first (primary) physical device associated with the given ACPI comapnion. The secondary devices will still be able to access the ACPI companion, but they will be matched in a different way. Fixes: 6ab3430129e2 (mfd: Add ACPI support) Reported-by: Jarkko Nikula <jarkko.nikula@linux.intel.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-27cpufreq: cpufreq-dt: disable unsupported OPPsLucas Stach
If the regulator connected to the CPU voltage plane doesn't support an OPP specified voltage with the acceptable tolerance it's better to just disable the OPP instead of constantly failing the voltage scaling later on. Includes a fix to move initialization of opp_freq outside the loop to avoid an endless loop from Geert Uytterhoeven. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-10-27Merge tag 'mac80211-for-john-2014-10-23' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg <johannes@sipsolutions.net> says: "Here are a few fixes for the wireless stack: one fixes the RTS rate, one for a debugfs file, one to return the correct channel to userspace, a sanity check for a userspace value and the remaining two are just documentation fixes." Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-27Merge tag 'iwlwifi-for-john-2014-10-23' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/iwlwifi-fixes Emmanuel Grumbach <egrumbach@gmail.com> says: "I revert here a patch that caused interoperability issues. dvm gets a fix for a bug that was reported by many users. Two minor fixes for BT Coex and platform power fix that helps reducing latency when the PCIe link goes to low power states." Signed-off-by: John W. Linville <linville@tuxdriver.com>
2014-10-27doc: kernel-parameters.txt: Add ide-generic.probe-maskMaciej W. Rozycki
Signed-off-by: Maciej W. Rozycki <macro@linux-mips.org> [ jc: wording tweaked slightly ] Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2014-10-27ALSA: bebob: Uninitialized id returned by saffirepro_both_clk_src_getChristian Vogel
snd_bebob_stream_check_internal_clock() may get an id from saffirepro_both_clk_src_get (via clk_src->get()) that was uninitialized. a) make logic in saffirepro_both_clk_src_get explicit b) test if id used in snd_bebob_stream_check_internal_clock matches array size [fixed missing signed prefix to *_maps[] by tiwai] Signed-off-by: Christian Vogel <vogelchr@vogel.cx> Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2014-10-27Merge tag 'asoc-v3.18-rc2' of ↵Takashi Iwai
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Fixes for v3.18 A few small driver fixes for v3.18 plus the removal of the s6000 support since the relevant chip is no longer supported in mainline.
2014-10-27drm/i915: Fix GMBUSFREQ on vlv/chvVille Syrjälä
vlv_cdclk_freq is in kHz but we need MHz for the GMBUSFREQ divider. This is a regression from: commit f8bf63fdcb1f82459dae7a3f22ee5ce92f3ea727 Author: Ville Syrjälä <ville.syrjala@linux.intel.com> Date: Fri Jun 13 13:37:54 2014 +0300 drm/i915: Kill duplicated cdclk readout code from i2c Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-10-27drm/i915: Ignore long hpds on eDP portsVille Syrjälä
Turning vdd on/off can generate a long hpd pulse on eDP ports. In order to handle hpd we would need to turn on vdd to perform aux transfers. This would lead to an endless cycle of "vdd off -> long hpd -> vdd on -> detect -> vdd off -> ..." So ignore long hpd pulses on eDP ports. eDP panels should be physically tied to the machine anyway so they should not actually disappear and thus don't need long hpd handling. Short hpds are still needed for link re-train and whatnot so we can't just turn off the hpd interrupt entirely for eDP ports. Perhaps we could turn it off whenever the panel is disabled, but just ignoring the long hpd seems sufficient. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: stable@vger.kernel.org Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Todd Previte <tprevite@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-10-27drm/i915: Do a dummy DPCD read before the actual readVille Syrjälä
Sometimes we seem to get utter garbage from DPCD reads. The resulting buffer is filled with the same byte, and the operation completed without errors. My HP ZR24w monitor seems particularly susceptible to this problem once it's gone into a sleep mode. The issue seems to happen only for the first AUX message that wakes the sink up. But as the first AUX read we often do is the DPCD receiver cap it does wreak a bit of havoc with subsequent link training etc. when the receiver cap bw/lane/etc. information is garbage. A sufficient workaround seems to be to perform a single byte dummy read before reading the actual data. I suppose that just wakes up the sink sufficiently and we can just throw away the returned data in case it's crap. DP_DPCD_REV seems like a sufficiently safe location to read here. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Todd Previte <tprevite@gmail.com> Cc: stable@vger.kernel.org Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2014-10-27Merge remote-tracking branches 'spi/fix/dw', 'spi/fix/orion', ↵Mark Brown
'spi/fix/pl022', 'spi/fix/rockchip' and 'spi/fix/spidev' into spi-linus
2014-10-27Merge remote-tracking branch 'regulator/fix/rk808' into regulator-linusMark Brown
2014-10-27Merge remote-tracking branch 'regulator/fix/core' into regulator-linusMark Brown