linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2017-04-08	staging: most: destroy cdev when channel gets disconnected	Andrey Shvetsov
	When a channel is being removed while an application holds the corresponding character device, this device is going to be destroyed only after the application closes the file descriptor and releases character device. In case the channel appears again before the application closes the file descriptor it holds, the channel cannot be linked. This patch changes the described behavior and destroys the character device at the time the channel get disconnected from the AIM. Signed-off-by: Andrey Shvetsov <andrey.shvetsov@k2l.de> Signed-off-by: Christian Gromm <christian.gromm@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: most: consolidate attributes for list of links	Andrey Shvetsov
	This patch replaces three temporary variables representing the attributes to control the links between the AIMs and HDMs with an array of three elements to keep the corresponding code compact. Signed-off-by: Andrey Shvetsov <andrey.shvetsov@k2l.de> Signed-off-by: Christian Gromm <christian.gromm@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: most: core: separate property showing links	Andrey Shvetsov
	Currently an AIM has the following properties available to manage links: - write-only "remove_link" used to remove a link from a list - read/write "add_link" used to add a link to a list and display them This patch transfers the read functionality of "add_link" to the new read-only property "links" to build consistent set of properties to control the list of links. Signed-off-by: Andrey Shvetsov <andrey.shvetsov@k2l.de> Signed-off-by: Christian Gromm <christian.gromm@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: most: core: consolidate channel attributes	Andrey Shvetsov
	This patch replaces 13 temporary variables representing the attributes to control the channel with an array of 13 elements to keep the corresponding code compact. Signed-off-by: Andrey Shvetsov <andrey.shvetsov@k2l.de> Signed-off-by: Christian Gromm <christian.gromm@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: most: core: make use of __ATTR_* macros	Christian Gromm
	This patch replaces the proprietary macros with those provided by the kernel. Signed-off-by: Christian Gromm <christian.gromm@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: most: core: fix function names	Christian Gromm
	This patch fixes the names of the show/store functions to match the naming convention. Signed-off-by: Christian Gromm <christian.gromm@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: most: fix comment of the function remove_link_store	Andrey Shvetsov
	This patch replaces the name store_remove_link by the remove_link_store in the comment for the corresponding function. Signed-off-by: Andrey Shvetsov <andrey.shvetsov@k2l.de> Signed-off-by: Christian Gromm <christian.gromm@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: most: fix comment of the function add_link_store	Andrey Shvetsov
	This patch replaces the name store_add_link by the add_link_store in the comment for the corresponding function. Signed-off-by: Andrey Shvetsov <andrey.shvetsov@k2l.de> Signed-off-by: Christian Gromm <christian.gromm@microchip.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: android: ion: Remove old platform support	Laura Abbott
	Device specific platform support has been haphazard for Ion. There have been several independent attempts and there are still objections to what bindings exist right now. Just remove everything for a fresh start. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: android: ion: Remove duplicate ION_IOC_MAP	Laura Abbott
	ION_IOC_MAP is the same as ION_IOC_SHARE. We really don't need two identical interfaces. Remove it. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: android: ion: Remove import interface	Laura Abbott
	With the expansion of dma-buf and the move for Ion to be come just an allocator, the import mechanism is mostly useless. There isn't a kernel component to Ion anymore and handles are private to Ion. Remove this interface. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: android: ion: Remove custom ioctl interface	Laura Abbott
	Ion is now moving towards a unified interfact. This makes the custom ioctl interface unneeded. Remove it. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: android: ion: Remove crufty cache support	Laura Abbott
	Now that we call dma_map in the dma_buf API callbacks there is no need to use the existing cache APIs. Remove the sync ioctl and the existing bad dma_sync calls. Explicit caching can be handled with the dma_buf sync API. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: android: ion: Remove page faulting support	Laura Abbott
	The new method of syncing with dma_map means that the page faulting sync implementation is no longer applicable. Remove it. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: android: ion: Call dma_map_sg for syncing and mapping	Laura Abbott
	Technically, calling dma_buf_map_attachment should return a buffer properly dma_mapped. Add calls to dma_map_sg to begin_cpu_access to ensure this happens. As a side effect, this lets Ion buffers take advantage of the dma_buf sync ioctls. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: android: ion: Duplicate sg_table	Laura Abbott
	Ion currently returns a single sg_table on each dma_map call. This is incorrect for later usage. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: android: ion: Remove alignment from allocation field	Laura Abbott
	The align field was supposed to be used to specify the alignment of the allocation. Nobody actually does anything with it except to check if the alignment specified is out of bounds. Since this has no effect on the actual allocation, just remove it. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: android: ion: Remove dmap_cnt	Laura Abbott
	The reference counting of dma_map calls was removed. Remove the associated counter field as well. Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	staging: android: ashmem: lseek failed due to no FMODE_LSEEK.	Shuxiao Zhang
	vfs_llseek will check whether the file mode has FMODE_LSEEK, no return failure. But ashmem can be lseek, so add FMODE_LSEEK to ashmem file. Comment From Greg Hackmann: ashmem_llseek() passes the llseek() call through to the backing shmem file. 91360b02ab48 ("ashmem: use vfs_llseek()") changed this from directly calling the file's llseek() op into a VFS layer call. This also adds a check for the FMODE_LSEEK bit, so without that bit ashmem_llseek() now always fails with -ESPIPE. Fixes: 91360b02ab48 ("ashmem: use vfs_llseek()") Signed-off-by: Shuxiao Zhang <zhangshuxiao@xiaomi.com> Tested-by: Greg Hackmann <ghackmann@google.com> Cc: stable <stable@vger.kernel.org> # 3.18+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-04-08	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc	Linus Torvalds
	Pull sparc fixes from David Miller: "Several fixes here, mostly having to due with either build errors or memory corruptions depending upon whether you have THP enabled or not" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: remove unused wp_works_ok macro sparc32: Export vac_cache_size to fix build error sparc64: Fix memory corruption when THP is enabled sparc64: Fix kernel panic due to erroneous #ifdef surrounding pmd_write() arch/sparc: Avoid DCTI Couples sparc64: kern_addr_valid regression sparc64: Add support for 2G hugepages sparc64: Fix size check in huge_pte_alloc
2017-04-08	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull KVM fixes from Radim Krčmář: "ARM: - Fix a problem with GICv3 userspace save/restore - Clarify GICv2 userspace save/restore ABI - Be more careful in clearing GIC LRs - Add missing synchronization primitive to our MMU handling code PPC: - Check for a NULL return from kzalloc s390: - Prevent translation exception errors on valid page tables for the instruction-exection-protection support x86: - Fix Page-Modification Logging when running a nested guest" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: PPC: Book3S HV: Check for kmalloc errors in ioctl KVM: nVMX: initialize PML fields in vmcs02 KVM: nVMX: do not leak PML full vmexit to L1 KVM: arm/arm64: vgic: Fix GICC_PMR uaccess on GICv3 and clarify ABI KVM: arm64: Ensure LRs are clear when they should be kvm: arm/arm64: Fix locking for kvm_free_stage2_pgd KVM: s390: remove change-recording override support arm/arm64: KVM: Take mmap_sem in kvm_arch_prepare_memory_region arm/arm64: KVM: Take mmap_sem in stage2_unmap_vm
2017-04-08	Merge branch 'stable-4.11' of git://git.infradead.org/users/pcmoore/audit	Linus Torvalds
	Pull audit cleanup from Paul Moore: "A week later than I had hoped, but as promised, here is the audit uninline-fix we talked about during the last audit pull request. The patch is slightly different than what we originally discussed as it made more sense to keep the audit_signal_info() function in auditsc.c rather than move it and bunch of other related variables/definitions into audit.c/audit.h. At some point in the future I need to look at how the audit code is organized across kernel/audit, I suspect we could do things a bit better, but it doesn't seem like a -rc release is a good place for that ;) Regardless, this patch passes our tests without problem and looks good for v4.11" 'stable-4.11' of git://git.infradead.org/users/pcmoore/audit: audit: move audit_signal_info() into kernel/auditsc.c
2017-04-08	Merge branch 'akpm' (patches from Andrew)	Linus Torvalds
	Merge misc fixes from Andrew Morton: "10 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: move pcp and lru-pcp draining into single wq mailmap: update Yakir Yang email address mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff() dax: fix radix tree insertion race mm, thp: fix setting of defer+madvise thp defrag mode ptrace: fix PTRACE_LISTEN race corrupting task->state vmlinux.lds: add missing VMLINUX_SYMBOL macros mm/page_alloc.c: fix print order in show_free_areas() userfaultfd: report actual registered features in fdinfo mm: fix page_vma_mapped_walk() for ksm pages
2017-04-08	mm: move pcp and lru-pcp draining into single wq	Michal Hocko
	We currently have 2 specific WQ_RECLAIM workqueues in the mm code. vmstat_wq for updating pcp stats and lru_add_drain_wq dedicated to drain per cpu lru caches. This seems more than necessary because both can run on a single WQ. Both do not block on locks requiring a memory allocation nor perform any allocations themselves. We will save one rescuer thread this way. On the other hand drain_all_pages() queues work on the system wq which doesn't have rescuer and so this depend on memory allocation (when all workers are stuck allocating and new ones cannot be created). Initially we thought this would be more of a theoretical problem but Hugh Dickins has reported: : 4.11-rc has been giving me hangs after hours of swapping load. At : first they looked like memory leaks ("fork: Cannot allocate memory"); : but for no good reason I happened to do "cat /proc/sys/vm/stat_refresh" : before looking at /proc/meminfo one time, and the stat_refresh stuck : in D state, waiting for completion of flush_work like many kworkers. : kthreadd waiting for completion of flush_work in drain_all_pages(). This worker should be using WQ_RECLAIM as well in order to guarantee a forward progress. We can reuse the same one as for lru draining and vmstat. Link: http://lkml.kernel.org/r/20170307131751.24936-1-mhocko@kernel.org Signed-off-by: Michal Hocko <mhocko@suse.com> Suggested-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Mel Gorman <mgorman@suse.de> Tested-by: Yang Li <pku.leo@gmail.com> Tested-by: Hugh Dickins <hughd@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08	mailmap: update Yakir Yang email address	Jeffy Chen
	Set current email address to replace previous employers email addresses. Link: http://lkml.kernel.org/r/1491450722-6633-1-git-send-email-jeffy.chen@rock-chips.com Signed-off-by: Jeffy Chen <jeffy.chen@rock-chips.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08	mm, swap_cgroup: reschedule when neeed in swap_cgroup_swapoff()	David Rientjes
	We got need_resched() warnings in swap_cgroup_swapoff() because swap_cgroup_ctrl[type].length is particularly large. Reschedule when needed. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1704061315270.80559@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Vladimir Davydov <vdavydov.dev@gmail.com> Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08	dax: fix radix tree insertion race	Ross Zwisler
	While running generic/340 in my test setup I hit the following race. It can happen with kernels that support FS DAX PMDs, so v4.10 thru v4.11-rc5. Thread 1 Thread 2 -------- -------- dax_iomap_pmd_fault() grab_mapping_entry() spin_lock_irq() get_unlocked_mapping_entry() 'entry' is NULL, can't call lock_slot() spin_unlock_irq() radix_tree_preload() dax_iomap_pmd_fault() grab_mapping_entry() spin_lock_irq() get_unlocked_mapping_entry() ... lock_slot() spin_unlock_irq() dax_pmd_insert_mapping() <inserts a PMD mapping> spin_lock_irq() __radix_tree_insert() fails with -EEXIST <fall back to 4k fault, and die horribly when inserting a 4k entry where a PMD exists> The issue is that we have to drop mapping->tree_lock while calling radix_tree_preload(), but since we didn't have a radix tree entry to lock (unlike in the pmd_downgrade case) we have no protection against Thread 2 coming along and inserting a PMD at the same index. For 4k entries we handled this with a special-case response to -EEXIST coming from the __radix_tree_insert(), but this doesn't save us for PMDs because the -EEXIST case can also mean that we collided with a 4k entry in the radix tree at a different index, but one that is covered by our PMD range. So, correctly handle both the 4k and 2M collision cases by explicitly re-checking the radix tree for an entry at our index once we reacquire mapping->tree_lock. This patch has made it through a clean xfstests run with the current v4.11-rc5 based linux/master, and it also ran generic/340 500 times in a loop. It used to fail within the first 10 iterations. Link: http://lkml.kernel.org/r/20170406212944.2866-1-ross.zwisler@linux.intel.com Signed-off-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christoph Hellwig <hch@lst.de> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Matthew Wilcox <mawilcox@microsoft.com> Cc: <stable@vger.kernel.org> [4.10+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08	mm, thp: fix setting of defer+madvise thp defrag mode	David Rientjes
	Setting thp defrag mode of "defer+madvise" actually sets "defer" in the kernel due to the name similarity and the out-of-order way the string is checked in defrag_store(). Check the string in the correct order so that TRANSPARENT_HUGEPAGE_DEFRAG_KSWAPD_OR_MADV_FLAG is set appropriately for "defer+madvise". Fixes: 21440d7eb904 ("mm, thp: add new defer+madvise defrag option") Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1704051814420.137626@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08	ptrace: fix PTRACE_LISTEN race corrupting task->state	bsegall@google.com
	In PT_SEIZED + LISTEN mode STOP/CONT signals cause a wakeup against __TASK_TRACED. If this races with the ptrace_unfreeze_traced at the end of a PTRACE_LISTEN, this can wake the task /after/ the check against __TASK_TRACED, but before the reset of state to TASK_TRACED. This causes it to instead clobber TASK_WAKING, allowing a subsequent wakeup against TRACED while the task is still on the rq wake_list, corrupting it. Oleg said: "The kernel can crash or this can lead to other hard-to-debug problems. In short, "task->state = TASK_TRACED" in ptrace_unfreeze_traced() assumes that nobody else can wake it up, but PTRACE_LISTEN breaks the contract. Obviusly it is very wrong to manipulate task->state if this task is already running, or WAKING, or it sleeps again" [akpm@linux-foundation.org: coding-style fixes] Fixes: 9899d11f ("ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL") Link: http://lkml.kernel.org/r/xm26y3vfhmkp.fsf_-_@bsegall-linux.mtv.corp.google.com Signed-off-by: Ben Segall <bsegall@google.com> Acked-by: Oleg Nesterov <oleg@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08	vmlinux.lds: add missing VMLINUX_SYMBOL macros	Jessica Yu
	When __{start,end}_ro_after_init is referenced from C code, we run into the following build errors on blackfin: kernel/extable.c:169: undefined reference to `__start_ro_after_init' kernel/extable.c:169: undefined reference to `__end_ro_after_init' The build error is due to the fact that blackfin is one of the few arches that prepends an underscore '_' to all symbols defined in C. Fix this by wrapping __{start,end}_ro_after_init in vmlinux.lds.h with VMLINUX_SYMBOL(), which adds the necessary prefix for arches that have HAVE_UNDERSCORE_SYMBOL_PREFIX. Link: http://lkml.kernel.org/r/1491259387-15869-1-git-send-email-jeyu@redhat.com Signed-off-by: Jessica Yu <jeyu@redhat.com> Acked-by: Kees Cook <keescook@chromium.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Eddie Kovsky <ewk@edkovsky.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08	mm/page_alloc.c: fix print order in show_free_areas()	Alexander Polakov
	Fixes: 11fb998986a72a ("mm: move most file-based accounting to the node") Link: http://lkml.kernel.org/r/1490377730.30219.2.camel@beget.ru Signed-off-by: Alexander Polyakov <apolyakov@beget.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> [4.8+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08	userfaultfd: report actual registered features in fdinfo	Mike Rapoport
	fdinfo for userfault file descriptor reports UFFD_API_FEATURES. Up until recently, the UFFD_API_FEATURES was defined as 0, therefore corresponding field in fdinfo always contained zero. Now, with introduction of several additional features, UFFD_API_FEATURES is not longer 0 and it seems better to report actual features requested for the userfaultfd object described by the fdinfo. First, the applications that were using userfault will still see zero at the features field in fdinfo. Next, reporting actual features rather than available features, gives clear indication of what userfault features are used by an application. Link: http://lkml.kernel.org/r/1491140181-22121-1-git-send-email-rppt@linux.vnet.ibm.com Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Pavel Emelyanov <xemul@virtuozzo.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-08	mm: fix page_vma_mapped_walk() for ksm pages	Hugh Dickins
	Doug Smythies reports oops with KSM in this backtrace, I've been seeing the same: page_vma_mapped_walk+0xe6/0x5b0 page_referenced_one+0x91/0x1a0 rmap_walk_ksm+0x100/0x190 rmap_walk+0x4f/0x60 page_referenced+0x149/0x170 shrink_active_list+0x1c2/0x430 shrink_node_memcg+0x67a/0x7a0 shrink_node+0xe1/0x320 kswapd+0x34b/0x720 Just as observed in commit 4b0ece6fa016 ("mm: migrate: fix remove_migration_pte() for ksm pages"), you cannot use page->index calculations on ksm pages. page_vma_mapped_walk() is relying on __vma_address(), where a ksm page can lead it off the end of the page table, and into whatever nonsense is in the next page, ending as an oops inside check_pte()'s pte_page(). KSM tells page_vma_mapped_walk() exactly where to look for the page, it does not need any page->index calculation: and that's so also for all the normal and file and anon pages - just not for THPs and their subpages. Get out early in most cases: instead of a PageKsm test, move down the earlier not-THP-page test, as suggested by Kirill. I'm also slightly worried that this loop can stray into other vmas, so added a vm_end test to prevent surprises; though I have not imagined anything worse than a very contrived case, in which a page mlocked in the next vma might be reclaimed because it is not mlocked in this vma. Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()") Link: http://lkml.kernel.org/r/alpine.LSU.2.11.1704031104400.1118@eggly.anvils Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Doug Smythies <dsmythies@telus.net> Tested-by: Doug Smythies <dsmythies@telus.net> Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-07	orangefs: move features validation to fix filesystem hang	Martin Brandenburg
	Without this fix (and another to the userspace component itself described later), the kernel will be unable to process any OrangeFS requests after the userspace component is restarted (due to a crash or at the administrator's behest). The bug here is that inside orangefs_remount, the orangefs_request_mutex is locked. When the userspace component restarts while the filesystem is mounted, it sends a ORANGEFS_DEV_REMOUNT_ALL ioctl to the device, which causes the kernel to send it a few requests aimed at synchronizing the state between the two. While this is happening the orangefs_request_mutex is locked to prevent any other requests going through. This is only half of the bugfix. The other half is in the userspace component which outright ignores(!) requests made before it considers the filesystem remounted, which is after the ioctl returns. Of course the ioctl doesn't return until after the userspace component responds to the request it ignores. The userspace component has been changed to allow ORANGEFS_VFS_OP_FEATURES regardless of the mount status. Mike Marshall says: "I've tested this patch against the fixed userspace part. This patch is real important, I hope it can make it into 4.11... Here's what happens when the userspace daemon is restarted, without the patch: ============================================= [ INFO: possible recursive locking detected ] [ 4.10.0-00007-ge98bdb3 #1 Not tainted ] --------------------------------------------- pvfs2-client-co/29032 is trying to acquire lock: (orangefs_request_mutex){+.+.+.}, at: service_operation+0x3c7/0x7b0 [orangefs] but task is already holding lock: (orangefs_request_mutex){+.+.+.}, at: dispatch_ioctl_command+0x1bf/0x330 [orangefs] CPU: 0 PID: 29032 Comm: pvfs2-client-co Not tainted 4.10.0-00007-ge98bdb3 #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.9.3-1.fc25 04/01/2014 Call Trace: __lock_acquire+0x7eb/0x1290 lock_acquire+0xe8/0x1d0 mutex_lock_killable_nested+0x6f/0x6e0 service_operation+0x3c7/0x7b0 [orangefs] orangefs_remount+0xea/0x150 [orangefs] dispatch_ioctl_command+0x227/0x330 [orangefs] orangefs_devreq_ioctl+0x29/0x70 [orangefs] do_vfs_ioctl+0xa3/0x6e0 SyS_ioctl+0x79/0x90" Signed-off-by: Martin Brandenburg <martin@omnibond.com> Acked-by: Mike Marshall <hubcap@omnibond.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-07	Merge tag 'pci-v4.11-fixes-4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI fixes from Bjorn Helgaas: - fix ThunderX legacy firmware resources - fix ARTPEC-6 and DesignWare platform driver NULL pointer dereferences - fix HiSilicon link error * tag 'pci-v4.11-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: dwc: Fix dw_pcie_ops NULL pointer dereference PCI: dwc: Select PCI_HOST_COMMON for hisi PCI: thunder-pem: Fix legacy firmware PEM-specific resources
2017-04-07	blk-mq: Restart a single queue if tag sets are shared	Bart Van Assche
	To improve scalability, if hardware queues are shared, restart a single hardware queue in round-robin fashion. Rename blk_mq_sched_restart_queues() to reflect the new semantics. Remove blk_mq_sched_mark_restart_queue() because this function has no callers. Remove flag QUEUE_FLAG_RESTART because this patch removes the code that uses this flag. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07	dm rq: Avoid that request processing stalls sporadically	Bart Van Assche
	While running the srp-test software I noticed that request processing stalls sporadically at the beginning of a test, namely when mkfs is run against a dm-mpath device. Every time when that happened the following command was sufficient to resume request processing: echo run >/sys/kernel/debug/block/dm-0/state This patch avoids that such request processing stalls occur. The test I ran is as follows: while srp-test/run_tests -d -r 30 -t 02-mq; do :; done Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07	scsi: Avoid that SCSI queues get stuck	Bart Van Assche
	If a .queue_rq() function returns BLK_MQ_RQ_QUEUE_BUSY then the block driver that implements that function is responsible for rerunning the hardware queue once requests can be queued again successfully. commit 52d7f1b5c2f3 ("blk-mq: Avoid that requeueing starts stopped queues") removed the blk_mq_stop_hw_queue() call from scsi_queue_rq() for the BLK_MQ_RQ_QUEUE_BUSY case. Hence change all calls to functions that are intended to rerun a busy queue such that these examine all hardware queues instead of only stopped queues. Since no other functions than scsi_internal_device_block() and scsi_internal_device_unblock() should ever stop or restart a SCSI queue, change the blk_mq_delay_queue() call into a blk_mq_delay_run_hw_queue() call. Fixes: commit 52d7f1b5c2f3 ("blk-mq: Avoid that requeueing starts stopped queues") Fixes: commit 7e79dadce222 ("blk-mq: stop hardware queue in blk_mq_delay_queue()") Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Long Li <longli@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07	blk-mq: Introduce blk_mq_delay_run_hw_queue()	Bart Van Assche
	Introduce a function that runs a hardware queue unconditionally after a delay. Note: there is already a function that stops and restarts a hardware queue after a delay, namely blk_mq_delay_queue(). This function will be used in the next patch in this series. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Long Li <longli@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07	Merge tag 'dm-4.11-fixes-2' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - two stable fixes for the verity target's FEC support - a stable fix for raid target's raid1 support (when no bitmap is used) - a 4.11 cache metadata v2 format fix to properly test blocks are clean * tag 'dm-4.11-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm verity fec: fix bufio leaks dm raid: fix NULL pointer dereference for raid1 without bitmap dm cache metadata: fix metadata2 format's blocks_are_clean_separate_dirty dm verity fec: limit error correction recursion
2017-04-07	Merge tag 'arm64-fixes' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "We've got a regression fix for the signal raised when userspace makes an unsupported unaligned access and a revert of the contiguous (hugepte) support for hugetlb, which has once again been found to be broken. One day, maybe, we'll get it right. Summary: - restore previous SIGBUS behaviour for unhandled unaligned user accesses - revert broken support for the contiguous bit in hugetlb (again...)" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: Revert "Revert "arm64: hugetlb: partial revert of 66b3923a1a0f"" arm64: mm: unaligned access by user-land should be received as SIGBUS
2017-04-07	Merge tag 'metag-for-v4.11-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag Pull metag usercopy fixes from James Hogan: "Metag usercopy fault handling fixes These patches fix a bunch of longstanding (some over a decade old) metag user copy fault handling bugs. Thanks go to Al Viro for spotting some of the questionable code in the first place" * tag 'metag-for-v4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/metag: metag/usercopy: Add missing fixups metag/usercopy: Fix src fixup in from user rapf loops metag/usercopy: Set flags before ADDZ metag/usercopy: Zero rest of buffer from copy_from_user metag/usercopy: Add early abort to copy_to_user metag/usercopy: Fix alignment error checking metag/usercopy: Drop unused macros
2017-04-07	Merge tag 'acpi-4.11-rc6' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "This fixes a core device enumeration code change made in 4.10, in order to address a reported issue, that went too far. Specifics: - Refine the check for the existence of _HID in find_child_checks() so that it doesn't trigger for device objects with device IDs made up by the kernel (Rafael Wysocki)" * tag 'acpi-4.11-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / scan: Prefer devices without _HID for _ADR matching
2017-04-07	Merge tag 'for-linus-4.11b-rc6-tag' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull Xen fix from Juergen Gross: "A fix for error path cleanup in the xenbus handler" * tag 'for-linus-4.11b-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xenbus: remove transaction holder from list before freeing
2017-04-07	sysctl: don't print negative flag for proc_douintvec	Liping Zhang
	I saw some very confusing sysctl output on my system: # cat /proc/sys/net/core/xfrm_aevent_rseqth -2 # cat /proc/sys/net/core/xfrm_aevent_etime -10 # cat /proc/sys/net/ipv4/tcp_notsent_lowat -4294967295 Because we forget to set the negp flag in proc_douintvec, so it will become a garbage value. Since the value related to proc_douintvec is always an unsigned integer, so we can set negp to false explictily to fix this issue. Fixes: e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-07	sysctl: add sanity check for proc_douintvec	Liping Zhang
	Commit e7d316a02f68 ("sysctl: handle error writing UINT_MAX to u32 fields") introduced the proc_douintvec helper function, but it forgot to add the related sanity check when doing register_sysctl_table. So add it now. Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Cc: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org> Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-04-07	blk-mq: remap queues when adding/removing hardware queues	Omar Sandoval
	blk_mq_update_nr_hw_queues() used to remap hardware queues, which is the behavior that drivers expect. However, commit 4e68a011428a changed blk_mq_queue_reinit() to not remap queues for the case of CPU hotplugging, inadvertently making blk_mq_update_nr_hw_queues() not remap queues as well. This breaks, for example, NBD's multi-connection mode, leaving the added hardware queues unused. Fix it by making blk_mq_update_nr_hw_queues() explicitly remap the queues. Fixes: 4e68a011428a ("blk-mq: don't redistribute hardware queues on a CPU hotplug event") Reviewed-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07	blk-mq-sched: fix crash in switch error path	Omar Sandoval
	In elevator_switch(), if blk_mq_init_sched() fails, we attempt to fall back to the original scheduler. However, at this point, we've already torn down the original scheduler's tags, so this causes a crash. Doing the fallback like the legacy elevator path is much harder for mq, so fix it by just falling back to none, instead. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07	blk-mq-sched: set up scheduler tags when bringing up new queues	Omar Sandoval
	If a new hardware queue is added at runtime, we don't allocate scheduler tags for it, leading to a crash. This hooks up the scheduler framework to blk_mq_{init,exit}_hctx() to make sure everything gets properly initialized/freed. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>
2017-04-07	blk-mq-sched: refactor scheduler initialization	Omar Sandoval
	Preparation cleanup for the next couple of fixes, push blk_mq_sched_setup() and e->ops.mq.init_sched() into a helper. Signed-off-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>