summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2013-12-22Linux 3.13-rc5v3.13-rc5Linus Torvalds
2013-12-22Merge tag 'fixes-for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "Much smaller batch of fixes this week. Biggest one is a revert of an OMAP display change that removed some non-DT pinmux code that was still needed for 3.13 to get DSI displays to work. There's also a fix that resolves some misdescribed GPIO controller resources on shmobile. The rest are mostly smaller fixes, a couple of MAINTAINERS updates, etc" * tag 'fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: Revert "ARM: OMAP2+: Remove legacy mux code for display.c" MAINTAINERS: Add keystone clock drivers MAINTAINERS: Add keystone git tree information ARM: s3c64xx: dt: Fix boot failure due to double clock initialization ARM: shmobile: r8a7790: Fix GPIO resources in DTS irqchip: renesas-intc-irqpin: Fix register bitfield shift calculation ARM: shmobile: lager: phy fixup needs CONFIG_PHYLIB
2013-12-22Merge tag 'firewire-fix' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fixlet from Stefan Richter: "A one-liner to reenable WRITE SAME over SBP-2 like in v3.8...v3.12. Buggy targets which could malfunction when being subjected to this command are already sufficiently protected by a scsi_level check in sd + SCSI core" * tag 'firewire-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: sbp2: bring back WRITE SAME support
2013-12-22Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds
Pull SCSI target fixes from Nicholas Bellinger: "Mostly minor items this time around, the most notable being a FILEIO backend change to enforce hw_max_sectors based upon the current block_size to address a bug where large sized I/Os (> 1M) where being rejected" * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: qla2xxx: Fix scsi_host leak on qlt_lport_register callback failure target: Remove extra percpu_ref_init target/file: Update hw_max_sectors based on current block_size iser-target: Move INIT_WORK setup into isert_create_device_ib_res iscsi-target: Fix incorrect np->np_thread NULL assignment qla2xxx: Fix schedule_delayed_work() for target timeout calculations iser-target: fix error return code in isert_create_device_ib_res() iscsi-target: Fix-up all zero data-length CDBs with R/W_BIT set target: Remove write-only stats fields and lock from struct se_node_acl iscsi-target: return -EINVAL on oversized configfs parameter
2013-12-22Merge git://git.kvack.org/~bcrl/aio-nextLinus Torvalds
Pull AIO leak fixes from Ben LaHaise: "I've put these two patches plus Linus's change through a round of tests, and it passes millions of iterations of the aio numa migratepage test, as well as a number of repetitions of a few simple read and write tests. The first patch fixes the memory leak Kent introduced, while the second patch makes aio_migratepage() much more paranoid and robust" * git://git.kvack.org/~bcrl/aio-next: aio/migratepages: make aio migrate pages sane aio: fix kioctx leak introduced by "aio: Fix a trinity splat"
2013-12-22aio: clean up and fix aio_setup_ring page mappingLinus Torvalds
Since commit 36bc08cc01709 ("fs/aio: Add support to aio ring pages migration") the aio ring setup code has used a special per-ring backing inode for the page allocations, rather than just using random anonymous pages. However, rather than remembering the pages as it allocated them, it would allocate the pages, insert them into the file mapping (dirty, so that they couldn't be free'd), and then forget about them. And then to look them up again, it would mmap the mapping, and then use "get_user_pages()" to get back an array of the pages we just created. Now, not only is that incredibly inefficient, it also leaked all the pages if the mmap failed (which could happen due to excessive number of mappings, for example). So clean it all up, making it much more straightforward. Also remove some left-overs of the previous (broken) mm_populate() usage that was removed in commit d6c355c7dabc ("aio: fix race in ring buffer page lookup introduced by page migration support") but left the pointless and now misleading MAP_POPULATE flag around. Tested-and-acked-by: Benjamin LaHaise <bcrl@kvack.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-21Merge branch 'for-davem' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless John W. Linville says: ==================== Please consider pulling this batch of fixes for the 3.13 stream... For the mac80211 bits, Johannes says: "Here's a fix for another potential radiotap parser buffer overrun thanks to Evan Huus, and a fix for a cfg80211 warning in a certain corner case (reconnecting to the same BSS)." For the bluetooth bits, Gustavo says: "Two patches in this pull request. An important fix from Marcel in the permission check for HCI User Channels, there was a extra check for CAP_NET_RAW, and it was now removed. These channels should only require CAP_NET_ADMIN. The other patch is a device id addition." On top of that... Sujith Manoharan provides a workaround for a hardware problem that can result in lost interrupts. Larry Finger fixes an oops when unloading the rtlwifi driver (Red Hat bug 852761). Mathy Vanhoef fixes a somewhat minor MAC address privacy issue (CVE-2013-4579). ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-21hyperv: Fix race between probe and open callsHaiyang Zhang
Moving the register_netdev to the end of probe to prevent possible open call happens before NetVSP is connected. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Reviewed-by: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-12-22powercap / RAPL: add support for ValleyView SocJacob Pan
This patch adds support for RAPL on Intel ValleyView based SoC platforms, such as Baytrail. Besides adding CPU ID, special energy unit encoding is handled for ValleyView. Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-12-22PM / sleep: Fix memory leak in pm_vt_switch_unregister().Masami Ichikawa
kmemleak reported a memory leak as below. unreferenced object 0xffff880118f14700 (size 32): comm "swapper/0", pid 1, jiffies 4294877401 (age 123.283s) hex dump (first 32 bytes): 00 01 10 00 00 00 ad de 00 02 20 00 00 00 ad de .......... ..... 00 d4 d2 18 01 88 ff ff 01 00 00 00 00 04 00 00 ................ backtrace: [<ffffffff814edb1e>] kmemleak_alloc+0x4e/0xb0 [<ffffffff811889dc>] kmem_cache_alloc_trace+0x1ec/0x260 [<ffffffff810aba66>] pm_vt_switch_required+0x76/0xb0 [<ffffffff812f39f5>] register_framebuffer+0x195/0x320 [<ffffffff8130af18>] efifb_probe+0x718/0x780 [<ffffffff81391495>] platform_drv_probe+0x45/0xb0 [<ffffffff8138f407>] driver_probe_device+0x87/0x3a0 [<ffffffff8138f7f3>] __driver_attach+0x93/0xa0 [<ffffffff8138d413>] bus_for_each_dev+0x63/0xa0 [<ffffffff8138ee5e>] driver_attach+0x1e/0x20 [<ffffffff8138ea40>] bus_add_driver+0x180/0x250 [<ffffffff8138fe74>] driver_register+0x64/0xf0 [<ffffffff813913ba>] __platform_driver_register+0x4a/0x50 [<ffffffff8191e028>] efifb_driver_init+0x12/0x14 [<ffffffff8100214a>] do_one_initcall+0xfa/0x1b0 [<ffffffff818e40e0>] kernel_init_freeable+0x17b/0x201 In pm_vt_switch_required(), "entry" variable is allocated via kmalloc(). So, in pm_vt_switch_unregister(), it needs to call kfree() when object is deleted from list. Signed-off-by: Masami Ichikawa <masami256@gmail.com> Reviewed-by: Pavel Machek <pavel@ucw.cz> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-12-22cpufreq: Use CONFIG_CPU_FREQ_DEFAULT_* to set initial policy for setpolicy ↵Jason Baron
drivers When configuring a default governor (via CONFIG_CPU_FREQ_DEFAULT_*) with the intel_pstate driver, the desired default policy is not properly set. For example, setting 'CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE' ends up with the 'powersave' policy being set. Fix by configuring the correct default policy, if either 'powersave' or 'performance' are requested. Otherwise, fallback to what the driver originally set via its 'init' routine. Signed-off-by: Jason Baron <jbaron@akamai.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-12-22cpufreq: remove sysfs files for CPUs which failed to come back after resumeViresh Kumar
There are cases where cpufreq_add_dev() may fail for some CPUs during system resume. With the current code we will still have sysfs cpufreq files for those CPUs and struct cpufreq_policy would be already freed for them. Hence any operation on those sysfs files would result in kernel warnings. Example of problems resulting from resume errors (from Bjørn Mork): WARNING: CPU: 0 PID: 6055 at fs/sysfs/file.c:343 sysfs_open_file+0x77/0x212() missing sysfs attribute operations for kobject: (null) Modules linked in: [stripped as irrelevant] CPU: 0 PID: 6055 Comm: grep Tainted: G D 3.13.0-rc2 #153 Hardware name: LENOVO 2776LEG/2776LEG, BIOS 6EET55WW (3.15 ) 12/19/2011 0000000000000009 ffff8802327ebb78 ffffffff81380b0e 0000000000000006 ffff8802327ebbc8 ffff8802327ebbb8 ffffffff81038635 0000000000000000 ffffffff811823c7 ffff88021a19e688 ffff88021a19e688 ffff8802302f9310 Call Trace: [<ffffffff81380b0e>] dump_stack+0x55/0x76 [<ffffffff81038635>] warn_slowpath_common+0x7c/0x96 [<ffffffff811823c7>] ? sysfs_open_file+0x77/0x212 [<ffffffff810386e3>] warn_slowpath_fmt+0x41/0x43 [<ffffffff81182dec>] ? sysfs_get_active+0x6b/0x82 [<ffffffff81182382>] ? sysfs_open_file+0x32/0x212 [<ffffffff811823c7>] sysfs_open_file+0x77/0x212 [<ffffffff81182350>] ? sysfs_schedule_callback+0x1ac/0x1ac [<ffffffff81122562>] do_dentry_open+0x17c/0x257 [<ffffffff8112267e>] finish_open+0x41/0x4f [<ffffffff81130225>] do_last+0x80c/0x9ba [<ffffffff8112dbbd>] ? inode_permission+0x40/0x42 [<ffffffff81130606>] path_openat+0x233/0x4a1 [<ffffffff81130b7e>] do_filp_open+0x35/0x85 [<ffffffff8113b787>] ? __alloc_fd+0x172/0x184 [<ffffffff811232ea>] do_sys_open+0x6b/0xfa [<ffffffff811233a7>] SyS_openat+0xf/0x11 [<ffffffff8138c812>] system_call_fastpath+0x16/0x1b To fix this, remove those sysfs files or put the associated kobject in case of such errors. Also, to make it simple, remove the cpufreq sysfs links from all the CPUs (except for the policy->cpu) during suspend, as that operation won't result in a loss of sysfs file permissions and we can create those links during resume just fine. Fixes: 5302c3fb2e62 ("cpufreq: Perform light-weight init/teardown during suspend/resume") Reported-and-tested-by: Bjørn Mork <bjorn@mork.no> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 3.12+ <stable@vger.kernel.org> # 3.12+ [rjw: Changelog] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-12-21aio/migratepages: make aio migrate pages saneBenjamin LaHaise
The arbitrary restriction on page counts offered by the core migrate_page_move_mapping() code results in rather suspicious looking fiddling with page reference counts in the aio_migratepage() operation. To fix this, make migrate_page_move_mapping() take an extra_count parameter that allows aio to tell the code about its own reference count on the page being migrated. While cleaning up aio_migratepage(), make it validate that the old page being passed in is actually what aio_migratepage() expects to prevent misbehaviour in the case of races. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
2013-12-21aio: fix kioctx leak introduced by "aio: Fix a trinity splat"Benjamin LaHaise
e34ecee2ae791df674dfb466ce40692ca6218e43 reworked the percpu reference counting to correct a bug trinity found. Unfortunately, the change lead to kioctxes being leaked because there was no final reference count to put. Add that reference count back in to fix things. Signed-off-by: Benjamin LaHaise <bcrl@kvack.org> Cc: stable@vger.kernel.org
2013-12-21null_blk: support submit_queues on use_per_node_hctxMatias Bjørling
In the case of both the submit_queues param and use_per_node_hctx param are used. We limit the number af submit_queues to the number of online nodes. If the submit_queues is a multiple of nr_online_nodes, its trivial. Simply map them to the nodes. For example: 8 submit queues are mapped as node0[0,1], node1[2,3], ... If uneven, we are left with an uneven number of submit_queues that must be mapped. These are mapped toward the first node and onward. E.g. 5 submit queues mapped onto 4 nodes are mapped as node0[0,1], node1[2], ... Signed-off-by: Matias Bjorling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-21null_blk: set use_per_node_hctx param to falseMatias Bjørling
The defaults for the module is to instantiate itself with blk-mq and a submit queue for each CPU node in the system. To save resources, initialize instead with a single submit queue. Signed-off-by: Matias Bjorling <m@bjorling.me> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-21null_blk: corrections to documentationMatias Bjørling
Randy Dunlap reported a couple of grammar errors and unfortunate usages of socket/node/core. Signed-off-by: Matias Bjorling <m@bjorling.me> Acked-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2013-12-20Don't set the INITRD_COMPRESS environment variable automaticallyLinus Torvalds
Commit 1bf49dd4be0b ("./Makefile: export initial ramdisk compression config option") started setting the INITRD_COMPRESS environment variable depending on which decompression models the kernel had available. That is completely broken. For example, we by default have CONFIG_RD_LZ4 enabled, and are able to decompress such an initrd, but the user tools to *create* such an initrd may not be availble. So trying to tell dracut to generate an lz4-compressed image just because we can decode such an image is completely inappropriate. Cc: J P <ppandit@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Jan Beulich <JBeulich@suse.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-20Merge tag 'xfs-for-linus-v3.13-rc5' of git://oss.sgi.com/xfs/xfsLinus Torvalds
Pull xfs bugfixes from Ben Myers: "This contains fixes for some asserts related to project quotas, a memory leak, a hang when disabling group or project quotas before disabling user quotas, Dave's email address, several fixes for the alignment of file allocation to stripe unit/width geometry, a fix for an assertion with xfs_zero_remaining_bytes, and the behavior of metadata writeback in the face of IO errors. Details: - fix memory leak in xfs_dir2_node_removename - fix quota assertion in xfs_setattr_size - fix quota assertions in xfs_qm_vop_create_dqattach - fix for hang when disabling group and project quotas before disabling user quotas - fix Dave Chinner's email address in MAINTAINERS - fix for file allocation alignment - fix for assertion in xfs_buf_stale by removing xfsbdstrat - fix for alignment with swalloc mount option - fix for "retry forever" semantics on IO errors" * tag 'xfs-for-linus-v3.13-rc5' of git://oss.sgi.com/xfs/xfs: xfs: abort metadata writeback on permanent errors xfs: swalloc doesn't align allocations properly xfs: remove xfsbdstrat error xfs: align initial file allocations correctly MAINTAINERS: fix incorrect mail address of XFS maintainer xfs: fix infinite loop by detaching the group/project hints from user dquot xfs: fix assertion failure at xfs_setattr_nonsize xfs: fix false assertion at xfs_qm_vop_create_dqattach xfs: fix memory leak in xfs_dir2_node_removename
2013-12-20mm: fix build of split ptlock codeOlof Johansson
Commit 597d795a2a78 ('mm: do not allocate page->ptl dynamically, if spinlock_t fits to long') restructures some allocators that are compiled even if USE_SPLIT_PTLOCKS arn't used. It results in compilation failure: mm/memory.c:4282:6: error: 'struct page' has no member named 'ptl' mm/memory.c:4288:12: error: 'struct page' has no member named 'ptl' Add in the missing ifdef. Fixes: 597d795a2a78 ('mm: do not allocate page->ptl dynamically, if spinlock_t fits to long') Signed-off-by: Olof Johansson <olof@lixom.net> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-20Merge tag 'arc-fixes-for-3.13-rc5' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fix from Vineet Gupta: "Fix busted syscall table due to unistd header inclusion issue" * tag 'arc-fixes-for-3.13-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: Allow conditional multiple inclusion of uapi/asm/unistd.h
2013-12-20Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 ptrace fix from Catalin Marinas. * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: ptrace: avoid using HW_BREAKPOINT_EMPTY for disabled events
2013-12-20powerpc/512x: dts: disable MPC5125 usb moduleMatteo Facchinetti
At the moment the USB controller's pin muxing is not setup correctly and causes a kernel panic upon system startup, so disable the USB1 device tree node in the MPC5125 tower board dts file. The USB controller is connected to an USB3320 ULPI transceiver and the device tree should receive an update to reflect correct dependencies and required initialization data before the USB1 node can get re-enabled. Signed-off-by: Matteo Facchinetti <matteo.facchinetti@sirius-es.it> Signed-off-by: Anatolij Gustschin <agust@denx.de>
2013-12-20pstore: Don't allow high traffic options on fragile devicesLuck, Tony
Some pstore backing devices use on board flash as persistent storage. These have limited numbers of write cycles so it is a poor idea to use them from high frequency operations. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-20Merge branch 'master' of ↵John W. Linville
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless into for-davem
2013-12-20Merge tag 'dmaengine-fixes-3.13-rc4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine Pull dmaengine fixes from Dan Williams: - deprecation of net_dma to be removed in 3.14 - crash regression fix in pl330 from the dmaengine_unmap rework - crash regression fix for any channel running raid ops without CONFIG_ASYNC_TX_DMA from dmaengine_unmap - memory leak regression in mv_xor from dmaengine_unmap - build warning regressions in mv_xor, fsldma, ppc4xx, txx9, and at_hdmac from dmaengine_unmap - sleep in atomic regression in dma_async_memcpy_pg_to_pg - new fix in mv_xor for handling channel initialization failures * tag 'dmaengine-fixes-3.13-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/dmaengine: net_dma: mark broken dma: pl330: ensure DMA descriptors are zero-initialised dmaengine: fix sleep in atomic dmaengine: mv_xor: fix oops when channels fail to initialise dma: mv_xor: Use dmaengine_unmap_data for the self-tests dmaengine: fix enable for high order unmap pools dma: fix build warnings in txx9 dmatest: fix build warning on mips dma: fix fsldma build warnings dma: fix build warnings in ppc4xx dmaengine: at_hdmac: remove unused function dma: mv_xor: remove mv_desc_get_dest_addr()
2013-12-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Paolo Bonzini: "The PPC folks had a large amount of changes queued for 3.13, and now they are fixing the bugs" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: Book3S HV: Don't drop low-order page address bits powerpc: book3s: kvm: Don't abuse host r2 in exit path powerpc/kvm/booke: Fix build break due to stack frame size warning KVM: PPC: Book3S: PR: Enable interrupts earlier KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu KVM: PPC: Book3S: PR: Don't clobber our exit handler id powerpc: kvm: fix rare but potential deadlock scene KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call KVM: PPC: Book3S HV: Make tbacct_lock irq-safe KVM: PPC: Book3S HV: Refine barriers in guest entry/exit KVM: PPC: Book3S HV: Fix physical address calculations
2013-12-20mm: do not allocate page->ptl dynamically, if spinlock_t fits to longKirill A. Shutemov
In struct page we have enough space to fit long-size page->ptl there, but we use dynamically-allocated page->ptl if size(spinlock_t) is larger than sizeof(int). It hurts 64-bit architectures with CONFIG_GENERIC_LOCKBREAK, where sizeof(spinlock_t) == 8, but it easily fits into struct page. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Acked-by: Hugh Dickins <hughd@google.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-20mm: page_alloc: revert NUMA aspect of fair allocation policyJohannes Weiner
Commit 81c0a2bb515f ("mm: page_alloc: fair zone allocator policy") meant to bring aging fairness among zones in system, but it was overzealous and badly regressed basic workloads on NUMA systems. Due to the way kswapd and page allocator interacts, we still want to make sure that all zones in any given node are used equally for all allocations to maximize memory utilization and prevent thrashing on the highest zone in the node. While the same principle applies to NUMA nodes - memory utilization is obviously improved by spreading allocations throughout all nodes - remote references can be costly and so many workloads prefer locality over memory utilization. The original change assumed that zone_reclaim_mode would be a good enough predictor for that, but it turned out to be as indicative as a coin flip. Revert the NUMA aspect of the fairness until we can find a proper way to make it configurable and agree on a sane default. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reviewed-by: Michal Hocko <mhocko@suse.cz> Signed-off-by: Mel Gorman <mgorman@suse.de> Cc: <stable@kernel.org> # 3.12 Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-20Revert "mm: page_alloc: exclude unreclaimable allocations from zone fairness ↵Mel Gorman
policy" This reverts commit 73f038b863df. The NUMA behaviour of this patch is less than ideal. An alternative approch is to interleave allocations only within local zones which is implemented in the next patch. Cc: stable@vger.kernel.org Signed-off-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-20mm: Fix NULL pointer dereference in madvise(MADV_WILLNEED) supportKirill A. Shutemov
Sasha Levin found a NULL pointer dereference that is due to a missing page table lock, which in turn is due to the pmd entry in question being a transparent huge-table entry. The code - introduced in commit 1998cc048901 ("mm: make madvise(MADV_WILLNEED) support swap file prefetch") - correctly checks for this situation using pmd_none_or_trans_huge_or_clear_bad(), but it turns out that that function doesn't work correctly. pmd_none_or_trans_huge_or_clear_bad() expected that pmd_bad() would trigger if the transparent hugepage bit was set, but it doesn't do that if pmd_numa() is also set. Note that the NUMA bit only gets set on real NUMA machines, so people trying to reproduce this on most normal development systems would never actually trigger this. Fix it by removing the very subtle (and subtly incorrect) expectation, and instead just checking pmd_trans_huge() explicitly. Reported-by: Sasha Levin <sasha.levin@oracle.com> Acked-by: Andrea Arcangeli <aarcange@redhat.com> [ Additionally remove the now stale test for pmd_trans_huge() inside the pmd_bad() case - Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-12-20Merge tag 'renesas-fixes-for-v3.13' of ↵Kevin Hilman
git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas into fixes From Simon Horman: Renesas ARM based SoC fixes for v3.13 * r8a7790 (R-Car H1) SoC - Correct GPIO resources in DT. This problem has been present since GPIOs were added to the r8a7790 SoC by f98e10c88aa95bf7 ("ARM: shmobile: r8a7790: Add GPIO controller devices to device tree") in v3.12-rc1. * irqchip renesas-intc-irqpin - Correct register bitfield shift calculation This bug has been present since the renesas-intc-irqpin driver was introduced by 443580486e3b9657 ("irqchip: Renesas INTC External IRQ pin driver") in v3.10-rc1 * Lager board - Do not build the phy fixup unless CONFIG_PHYLIB is enabled This problem was introduced by 48c8b96f21817aad * tag 'renesas-fixes-for-v3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas: ARM: shmobile: r8a7790: Fix GPIO resources in DTS irqchip: renesas-intc-irqpin: Fix register bitfield shift calculation ARM: shmobile: lager: phy fixup needs CONFIG_PHYLIB Signed-off-by: Kevin Hilman <khilman@linaro.org>
2013-12-20IB/uverbs: Check access to userspace response buffer in extended commandYann Droneaud
This patch adds a check on the output buffer with access_ok(VERIFY_WRITE, ...) to ensure the whole buffer is in userspace memory before using the pointer in uverbs functions. If the buffer or a subset of it is not valid, returns -EFAULT to the caller. This will also catch invalid buffer before the final call to copy_to_user() which happen late in most uverb functions. Just like the check in read(2) syscall, it's a sanity check to detect invalid parameters provided by userspace. This particular check was added in vfs_read() by Linus Torvalds for v2.6.12 with following commit message: https://git.kernel.org/cgit/linux/kernel/git/tglx/history.git/commit/?id=fd770e66c9a65b14ce114e171266cf6f393df502 Make read/write always do the full "access_ok()" tests. The actual user copy will do them too, but only for the range that ends up being actually copied. That hides bugs when the range has been clamped by file size or other issues. Note: there's no need to check input buffer since vfs_write() already does access_ok(VERIFY_READ, ...) as part of write() syscall. Link: http://marc.info/?i=cover.1387273677.git.ydroneaud@opteya.com Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-12-20IB/uverbs: Check input length in flow steering uverbsYann Droneaud
Since ib_copy_from_udata() doesn't check yet the available input data length before accessing userspace memory, an explicit check of this length is required to prevent: - reading past the user provided buffer, - underflow when subtracting the expected command size from the input length. This will ensure the newly added flow steering uverbs don't try to process truncated commands. Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-12-20IB/uverbs: Set error code when fail to consume all flow_spec itemsYann Droneaud
If the flow_spec items parsed count does not match the number of items declared in the flow_attr command, or if not all bytes are used for flow_spec items (eg. trailing garbage), a log message is reported and the function leave through the error path. Unfortunately the error code is currently not set. This patch set error code to -EINVAL in such cases, so that the error is reported to userspace instead of silently fail. Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-12-20IB/uverbs: Check reserved fields in create_flowYann Droneaud
As noted by Daniel Vetter in its article "Botching up ioctls"[1] "Check *all* unused fields and flags and all the padding for whether it's 0, and reject the ioctl if that's not the case. Otherwise your nice plan for future extensions is going right down the gutters since someone *will* submit an ioctl struct with random stack garbage in the yet unused parts. Which then bakes in the ABI that those fields can never be used for anything else but garbage." It's important to ensure that reserved fields are set to known value, so that it will be possible to use them latter to extend the ABI. The same reasonning apply to comp_mask field present in newer uverbs command: per commit 22878dbc9173 ("IB/core: Better checking of userspace values for receive flow steering"), unsupported values in comp_mask are rejected. [1] http://blog.ffwll.ch/2013/11/botching-up-ioctls.html Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-12-20IB/uverbs: Check comp_mask in destroy_flowYann Droneaud
Just like the check added to create_flow in 22878dbc9173 ("IB/core: Better checking of userspace values for receive flow steering"), comp_mask must be checked in destroy_flow too. Since only empty comp_mask is currently supported, any other value must be rejected. This check was silently added in a previous patch[1] to move comp_mask in extended command header, part of previous patchset[2] against create/destroy_flow uverbs. The idea of moving comp_mask to the header was discarded for the final patchset[3]. Unfortunately the check added in destroy_flow uverb was not integrated in the final patchset. [1] http://marc.info/?i=40175eda10d670d098204da6aa4c327a0171ae5f.1381510045.git.ydroneaud@opteya.com [2] http://marc.info/?i=cover.1381510045.git.ydroneaud@opteya.com [3] http://marc.info/?i=cover.1383773832.git.ydroneaud@opteya.com Cc: Matan Barak <matanb@mellanox.com> Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-12-20IB/uverbs: Check reserved field in extended command headerYann Droneaud
As noted by Daniel Vetter in its article "Botching up ioctls"[1] "Check *all* unused fields and flags and all the padding for whether it's 0, and reject the ioctl if that's not the case. Otherwise your nice plan for future extensions is going right down the gutters since someone *will* submit an ioctl struct with random stack garbage in the yet unused parts. Which then bakes in the ABI that those fields can never be used for anything else but garbage." It's important to ensure that reserved fields are set to known value, so that it will be possible to use them latter to extend the ABI. The same reasonning apply to comp_mask field present in newer uverbs command: per commit 22878dbc9173 ("IB/core: Better checking of userspace values for receive flow steering"), unsupported values in comp_mask are rejected. [1] http://blog.ffwll.ch/2013/11/botching-up-ioctls.html Link: http://marc.info/?i=cover.1386798254.git.ydroneaud@opteya.com> Signed-off-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-12-20IB/uverbs: New macro to set pointers to NULL if length is 0 in INIT_UDATA()Roland Dreier
Trying to have a ternary operator to choose between NULL (or 0) and the real pointer value in invocations leads to an impossible choice between a sparse error about a literal 0 used as a NULL pointer, and a gcc warning about "pointer/integer type mismatch in conditional expression." Rather than clutter the source with more casts, move the ternary operator into a new INIT_UDATA_BUF_OR_NULL() macro, which makes it easier to use and simplifies its callers. Reported-by: Yann Droneaud <ydroneaud@opteya.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-12-20Merge tag 'signed-for-3.13' of git://github.com/agraf/linux-2.6 into kvm-masterPaolo Bonzini
Patch queue for 3.13 - 2013-12-18 This fixes some grave issues we've only found after 3.13-rc1: - Make the modularized HV/PR book3s kvm work well as modules - Fix some race conditions - Fix compilation with certain compilers (booke) - Fix THP for book3s_hv - Fix preemption for book3s_pr Alexander Graf (4): KVM: PPC: Book3S: PR: Don't clobber our exit handler id KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy KVM: PPC: Book3S: PR: Enable interrupts earlier Aneesh Kumar K.V (1): powerpc: book3s: kvm: Don't abuse host r2 in exit path Paul Mackerras (5): KVM: PPC: Book3S HV: Fix physical address calculations KVM: PPC: Book3S HV: Refine barriers in guest entry/exit KVM: PPC: Book3S HV: Make tbacct_lock irq-safe KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call KVM: PPC: Book3S HV: Don't drop low-order page address bits Scott Wood (1): powerpc/kvm/booke: Fix build break due to stack frame size warning pingfan liu (1): powerpc: kvm: fix rare but potential deadlock scene
2013-12-20Merge tag 'stable/for-linus-3.13-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen bugfixes from Konrad Rzeszutek Wilk: - Fix balloon driver for auto-translate guests (PVHVM, ARM) to not use scratch pages. - Fix block API header for ARM32 and ARM64 to have proper layout - On ARM when mapping guests, stick on PTE_SPECIAL - When using SWIOTLB under ARM, don't call swiotlb functions twice - When unmapping guests memory and if we fail, don't return pages which failed to be unmapped. - Grant driver was using the wrong address on ARM. * tag 'stable/for-linus-3.13-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/balloon: Seperate the auto-translate logic properly (v2) xen/block: Correctly define structures in public headers on ARM32 and ARM64 arm: xen: foreign mapping PTEs are special. xen/arm64: do not call the swiotlb functions twice xen: privcmd: do not return pages which we have failed to unmap XEN: Grant table address, xen_hvm_resume_frames, is a phys_addr not a pfn
2013-12-20Merge tag 'trace-fixes-v3.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull ftrace fix from Steven Rostedt: "This fixes a long standing bug in the ftrace profiler. The problem is that the profiler only initializes the online CPUs, and not possible CPUs. This causes issues if the user takes CPUs online or offline while the profiler is running. If we online a CPU after starting the profiler, we lose all the trace information on the CPU going online. If we offline a CPU after running a test and start a new test, it will not clear the old data from that CPU. This bug causes incorrect data to be reported to the user if they online or offline CPUs during the profiling" * tag 'trace-fixes-v3.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Initialize the ftrace profiler for each possible cpu
2013-12-20Merge tag 'omap-for-v3.13/display-fix' of ↵Kevin Hilman
git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes I accidentally removed some mux code for omap4 that I thought was dead code as omap4 has been booting with device tree only since v3.10. Turns out I also removed some display related mux code, so let's revert that except for the dead code parts. * tag 'omap-for-v3.13/display-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: (439 commits) Revert "ARM: OMAP2+: Remove legacy mux code for display.c" +Linux 3.13-rc4
2013-12-20ext4: add explicit casts when masking cluster sizesTheodore Ts'o
The missing casts can cause the high 64-bits of the physical blocks to be lost. Set up new macros which allows us to make sure the right thing happen, even if at some point we end up supporting larger logical block numbers. Thanks to the Emese Revfy and the PaX security team for reporting this issue. Reported-by: PaX Team <pageexec@freemail.hu> Reported-by: Emese Revfy <re.emese@gmail.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@vger.kernel.org
2013-12-20netfilter: nf_ct_timestamp: Fix BUG_ON after netns deletionHelmut Schaa
When having nf_conntrack_timestamp enabled deleting a netns can lead to the following BUG being triggered: [63836.660000] Kernel bug detected[#1]: [63836.660000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.10.18 #14 [63836.660000] task: 802d9420 ti: 802d2000 task.ti: 802d2000 [63836.660000] $ 0 : 00000000 00000000 00000000 00000000 [63836.660000] $ 4 : 00000001 00000004 00000020 00000020 [63836.660000] $ 8 : 00000000 80064910 00000000 00000000 [63836.660000] $12 : 0bff0002 00000001 00000000 0a0a0abe [63836.660000] $16 : 802e70a0 85f29d80 00000000 00000004 [63836.660000] $20 : 85fb62a0 00000002 802d3bc0 85fb62a0 [63836.660000] $24 : 00000000 87138110 [63836.660000] $28 : 802d2000 802d3b40 00000014 871327cc [63836.660000] Hi : 000005ff [63836.660000] Lo : f2edd000 [63836.660000] epc : 87138794 __nf_ct_ext_add_length+0xe8/0x1ec [nf_conntrack] [63836.660000] Not tainted [63836.660000] ra : 871327cc nf_conntrack_in+0x31c/0x7b8 [nf_conntrack] [63836.660000] Status: 1100d403 KERNEL EXL IE [63836.660000] Cause : 00800034 [63836.660000] PrId : 0001974c (MIPS 74Kc) [63836.660000] Modules linked in: ath9k ath9k_common pppoe ppp_async iptable_nat ath9k_hw ath pppox ppp_generic nf_nat_ipv4 nf_conntrack_ipv4 mac80211 ipt_MASQUERADE cfg80211 xt_time xt_tcpudp xt_state xt_quota xt_policy xt_pkttype xt_owner xt_nat xt_multiport xt_mark xh [63836.660000] Process swapper (pid: 0, threadinfo=802d2000, task=802d9420, tls=00000000) [63836.660000] Stack : 802e70a0 871323d4 00000005 87080234 802e70a0 86d2a840 00000000 00000000 [63836.660000] Call Trace: [63836.660000] [<87138794>] __nf_ct_ext_add_length+0xe8/0x1ec [nf_conntrack] [63836.660000] [<871327cc>] nf_conntrack_in+0x31c/0x7b8 [nf_conntrack] [63836.660000] [<801ff63c>] nf_iterate+0x90/0xec [63836.660000] [<801ff730>] nf_hook_slow+0x98/0x164 [63836.660000] [<80205968>] ip_rcv+0x3e8/0x40c [63836.660000] [<801d9754>] __netif_receive_skb_core+0x624/0x6a4 [63836.660000] [<801da124>] process_backlog+0xa4/0x16c [63836.660000] [<801d9bb4>] net_rx_action+0x10c/0x1e0 [63836.660000] [<8007c5a4>] __do_softirq+0xd0/0x1bc [63836.660000] [<8007c730>] do_softirq+0x48/0x68 [63836.660000] [<8007c964>] irq_exit+0x54/0x70 [63836.660000] [<80060830>] ret_from_irq+0x0/0x4 [63836.660000] [<8006a9f8>] r4k_wait_irqoff+0x18/0x1c [63836.660000] [<8009cfb8>] cpu_startup_entry+0xa4/0x104 [63836.660000] [<802eb918>] start_kernel+0x394/0x3ac [63836.660000] [63836.660000] Code: 00821021 8c420000 2c440001 <00040336> 90440011 92350010 90560010 2485ffff 02a5a821 [63837.040000] ---[ end trace ebf660c3ce3b55e7 ]--- [63837.050000] Kernel panic - not syncing: Fatal exception in interrupt [63837.050000] Rebooting in 3 seconds.. Fix this by not unregistering the conntrack extension in the per-netns cleanup code. This bug was introduced in (73f4001 netfilter: nf_ct_tstamp: move initialization out of pernet_operations). Signed-off-by: Helmut Schaa <helmut.schaa@googlemail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-20GFS2: Wait for async DIO in glock state changesSteven Whitehouse
We need to wait for any outstanding DIO to complete in a couple of situations. Firstly, in case we are changing out of deferred mode (in inode_go_sync) where GLF_DIRTY will not be set. That call could be prefixed with a test for gl_state == LM_ST_DEFERRED but it doesn't seem worth it bearing in mind that the test for outstanding DIO is very quick anyway, in the usual case that there is none. The second case is in inode_go_lock which will catch the cases where we have a cached EX lock, but where we grant deferred locks against it so that there is no glock state transistion. We only need to wait if the state is not deferred, since DIO is valid anyway in that state. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-12-20GFS2: Fix incorrect invalidation for DIO/buffered I/OSteven Whitehouse
In patch 209806aba9d540dde3db0a5ce72307f85f33468f we allowed local deferred locks to be granted against a cached exclusive lock. That opened up a corner case which this patch now fixes. The solution to the problem is to check whether we have cached pages each time we do direct I/O and if so to unmap, flush and invalidate those pages. Since the glock state machine normally does that for us, mostly the code will be a no-op. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-12-20netfilter: nft_exthdr: call ipv6_find_hdr() with explicitly initialized offsetDaniel Borkmann
In nft's nft_exthdr_eval() routine we process IPv6 extension header through invoking ipv6_find_hdr(), but we call it with an uninitialized offset variable that contains some stack value. In ipv6_find_hdr() we then test if the value of offset != 0 and call skb_header_pointer() on that offset in order to map struct ipv6hdr into it. Fix it up by initializing offset to 0 as it was probably intended to be. Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2013-12-19drm/radeon: fix asic gfx values for scrapper asicsAlex Deucher
Fixes gfx corruption on certain TN/RL parts. bug: https://bugs.freedesktop.org/show_bug.cgi?id=60389 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2013-12-19dccp: catch failed request_module call in dccp_probe initWang Weidong
Check the return value of request_module during dccp_probe initialisation, bail out if that call fails. Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk> Signed-off-by: Wang Weidong <wangweidong1@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net>