summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2019-07-17Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge more updates from Andrew Morton: "VM: - z3fold fixes and enhancements by Henry Burns and Vitaly Wool - more accurate reclaimed slab caches calculations by Yafang Shao - fix MAP_UNINITIALIZED UAPI symbol to not depend on config, by Christoph Hellwig - !CONFIG_MMU fixes by Christoph Hellwig - new novmcoredd parameter to omit device dumps from vmcore, by Kairui Song - new test_meminit module for testing heap and pagealloc initialization, by Alexander Potapenko - ioremap improvements for huge mappings, by Anshuman Khandual - generalize kprobe page fault handling, by Anshuman Khandual - device-dax hotplug fixes and improvements, by Pavel Tatashin - enable synchronous DAX fault on powerpc, by Aneesh Kumar K.V - add pte_devmap() support for arm64, by Robin Murphy - unify locked_vm accounting with a helper, by Daniel Jordan - several misc fixes core/lib: - new typeof_member() macro including some users, by Alexey Dobriyan - make BIT() and GENMASK() available in asm, by Masahiro Yamada - changed LIST_POISON2 on x86_64 to 0xdead000000000122 for better code generation, by Alexey Dobriyan - rbtree code size optimizations, by Michel Lespinasse - convert struct pid count to refcount_t, by Joel Fernandes get_maintainer.pl: - add --no-moderated switch to skip moderated ML's, by Joe Perches misc: - ptrace PTRACE_GET_SYSCALL_INFO interface - coda updates - gdb scripts, various" [ Using merge message suggestion from Vlastimil Babka, with some editing - Linus ] * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (100 commits) fs/select.c: use struct_size() in kmalloc() mm: add account_locked_vm utility function arm64: mm: implement pte_devmap support mm: introduce ARCH_HAS_PTE_DEVMAP mm: clean up is_device_*_page() definitions mm/mmap: move common defines to mman-common.h mm: move MAP_SYNC to asm-generic/mman-common.h device-dax: "Hotremove" persistent memory that is used like normal RAM mm/hotplug: make remove_memory() interface usable device-dax: fix memory and resource leak if hotplug fails include/linux/lz4.h: fix spelling and copy-paste errors in documentation ipc/mqueue.c: only perform resource calculation if user valid include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures scripts/gdb: add helpers to find and list devices scripts/gdb: add lx-genpd-summary command drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl kernel/pid.c: convert struct pid count to refcount_t drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings select: shift restore_saved_sigmask_unless() into poll_select_copy_remaining() select: change do_poll() to return -ERESTARTNOHAND rather than -EINTR ...
2019-07-17dm kcopyd: Increase default sub-job size to 512KBNikos Tsironis
Currently, kcopyd has a sub-job size of 64KB and a maximum number of 8 sub-jobs. As a result, for any kcopyd job, we have a maximum of 512KB of I/O in flight. This upper limit to the amount of in-flight I/O under-utilizes fast devices and results in decreased throughput, e.g., when writing to a snapshotted thin LV with I/O size less than the pool's block size (so COW is performed using kcopyd). Increase kcopyd's default sub-job size to 512KB, so we have a maximum of 4MB of I/O in flight for each kcopyd job. This results in an up to 96% improvement of bandwidth when writing to a snapshotted thin LV, with I/O sizes less than the pool's block size. Also, add dm_mod.kcopyd_subjob_size_kb module parameter to allow users to fine tune the sub-job size of kcopyd. The default value of this parameter is 512KB and the maximum allowed value is 1024KB. We evaluate the performance impact of the change by running the snap_breaking_throughput benchmark, from the device mapper test suite [1]. The benchmark: 1. Creates a 1G thin LV 2. Provisions the thin LV 3. Takes a snapshot of the thin LV 4. Writes to the thin LV with: dd if=/dev/zero of=/dev/vg/thin_lv oflag=direct bs=<I/O size> Running this benchmark with various thin pool block sizes and dd I/O sizes (all combinations triggering the use of kcopyd) we get the following results: +-----------------+-------------+------------------+-----------------+ | Pool block size | dd I/O size | BW before (MB/s) | BW after (MB/s) | +-----------------+-------------+------------------+-----------------+ | 1 MB | 256 KB | 242 | 280 | | 1 MB | 512 KB | 238 | 295 | | | | | | | 2 MB | 256 KB | 238 | 354 | | 2 MB | 512 KB | 241 | 380 | | 2 MB | 1 MB | 245 | 394 | | | | | | | 4 MB | 256 KB | 248 | 412 | | 4 MB | 512 KB | 234 | 432 | | 4 MB | 1 MB | 251 | 474 | | 4 MB | 2 MB | 257 | 504 | | | | | | | 8 MB | 256 KB | 239 | 420 | | 8 MB | 512 KB | 256 | 431 | | 8 MB | 1 MB | 264 | 467 | | 8 MB | 2 MB | 264 | 502 | | 8 MB | 4 MB | 281 | 537 | +-----------------+-------------+------------------+-----------------+ [1] https://github.com/jthornber/device-mapper-test-suite Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-07-17dm snapshot: fix oversights in optional discard supportMike Snitzer
__find_snapshots_sharing_cow() should always be used with _origins_lock held so fix snapshot_io_hints() accordingly. Also, once a snapshot is being merged discards must not be allowed -- otherwise incorrect or duplicate work will be performed. Fixes: 2e6023850e177d ("dm snapshot: add optional discard support features") Reported-by: Nikos Tsironis <ntsironis@arrikto.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-07-17dm zoned: fix zone state management raceDamien Le Moal
dm-zoned uses the zone flag DMZ_ACTIVE to indicate that a zone of the backend device is being actively read or written and so cannot be reclaimed. This flag is set as long as the zone atomic reference counter is not 0. When this atomic is decremented and reaches 0 (e.g. on BIO completion), the active flag is cleared and set again whenever the zone is reused and BIO issued with the atomic counter incremented. These 2 operations (atomic inc/dec and flag set/clear) are however not always executed atomically under the target metadata mutex lock and this causes the warning: WARN_ON(!test_bit(DMZ_ACTIVE, &zone->flags)); in dmz_deactivate_zone() to be displayed. This problem is regularly triggered with xfstests generic/209, generic/300, generic/451 and xfs/077 with XFS being used as the file system on the dm-zoned target device. Similarly, xfstests ext4/303, ext4/304, generic/209 and generic/300 trigger the warning with ext4 use. This problem can be easily fixed by simply removing the DMZ_ACTIVE flag and managing the "ACTIVE" state by directly looking at the reference counter value. To do so, the functions dmz_activate_zone() and dmz_deactivate_zone() are changed to inline functions respectively calling atomic_inc() and atomic_dec(), while the dmz_is_active() macro is changed to an inline function calling atomic_read(). Fixes: 3b1a94c88b79 ("dm zoned: drive-managed zoned block device target") Cc: stable@vger.kernel.org Reported-by: Masato Suzuki <masato.suzuki@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-07-17iomap: move internal declarations into fs/iomap/Darrick J. Wong
Move internal function declarations out of fs/internal.h into include/linux/iomap.h so that our transition is complete. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-07-17iomap: move the main iteration code into a separate fileDarrick J. Wong
Move the main iteration code into a separate file so that we can group related functions in a single file instead of having a single enormous source file. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-07-17iomap: move the buffered IO code into a separate fileDarrick J. Wong
Move the buffered IO code into a separate file so that we can group related functions in a single file instead of having a single enormous source file. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-07-17iomap: move the direct IO code into a separate fileDarrick J. Wong
Move the direct IO code into a separate file so that we can group related functions in a single file instead of having a single enormous source file. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-07-17iomap: move the SEEK_HOLE code into a separate fileDarrick J. Wong
Move the SEEK_HOLE/SEEK_DATA code into a separate file so that we can group related functions in a single file instead of having a single enormous source file. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-07-17iomap: move the file mapping reporting code into a separate fileDarrick J. Wong
Move the file mapping reporting code (FIEMAP/FIBMAP) into a separate file so that we can group related functions in a single file instead of having a single enormous source file. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-07-17iomap: move the swapfile code into a separate fileDarrick J. Wong
Move the swapfile activation code into a separate file so that we can group related functions in a single file instead of having a single enormous source file. Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> Reviewed-by: Christoph Hellwig <hch@lst.de>
2019-07-17kbuild: modsign: read modules.order instead of $(MODVERDIR)/*.modMasahiro Yamada
Towards the goal of removing MODVERDIR, read out modules.order to get the list of modules to be signed. This is simpler than parsing *.mod files in $(MODVERDIR). The modules_sign target is only supported for in-kernel modules. So, this commit does not take care of external modules. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-07-17kbuild: modinst: read modules.order instead of $(MODVERDIR)/*.modMasahiro Yamada
Towards the goal of removing MODVERDIR, read out modules.order to get the list of modules to be installed. This is simpler than parsing *.mod files in $(MODVERDIR). For external modules, $(KBUILD_EXTMOD)/modules.order should be read. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-07-17scsi: remove pointless $(MODVERDIR)/$(obj)/53c700.verMasahiro Yamada
Nothing depends on this, so it is dead code. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-07-17kbuild: remove duplication from modules.order in sub-directoriesMasahiro Yamada
Currently, only the top-level modules.order drops duplicated entries. The modules.order files in sub-directories potentially contain duplication. To list out the paths of all modules, I want to use modules.order instead of parsing *.mod files in $(MODVERDIR). To achieve this, I want to rip off duplication from modules.order of external modules too. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-07-17kbuild: get rid of kernel/ prefix from in-tree modules.{order,builtin}Masahiro Yamada
Removing the 'kernel/' prefix will make our life easier because we can simply do 'cat modules.order' to get all built modules with full paths. Currently, we parse the first line of '*.mod' files in $(MODVERDIR). Since we have duplicated functionality here, I plan to remove MODVERDIR entirely. In fact, modules.order is generated also for external modules in a broken format. It adds the 'kernel/' prefix to the absolute path of the module, like this: kernel//path/to/your/external/module/foo.ko This is fine for now since modules.order is not used for external modules. However, I want to sanitize the format everywhere towards the goal of removing MODVERDIR. We cannot change the format of installed module.{order,builtin}. So, 'make modules_install' will add the 'kernel/' prefix while copying them to $(MODLIB)/. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-07-17kbuild: do not create empty modules.order in the prepare stageMasahiro Yamada
Currently, $(objtree)/modules.order is touched in two places. In the 'prepare0' rule, scripts/Makefile.build creates an empty modules.order while processing 'obj=.' In the 'modules' rule, the top-level Makefile overwrites it with the correct list of modules. While this might be a good side-effect that modules.order is made empty every time (probably this is not intended functionality), I personally do not like this behavior. Create modules.order only when it is sensible to do so. This avoids creating the following pointless files: scripts/basic/modules.order scripts/dtc/modules.order scripts/gcc-plugins/modules.order scripts/genksyms/modules.order scripts/mod/modules.order scripts/modules.order scripts/selinux/genheaders/modules.order scripts/selinux/mdp/modules.order scripts/selinux/modules.order Going forward, $(objtree)/modules.order lists the modules that was built in the last successful build. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-07-17coccinelle: api: add devm_platform_ioremap_resource scriptHimanshu Jha
Use recently introduced devm_platform_ioremap_resource helper which wraps platform_get_resource() and devm_ioremap_resource() together. This helps produce much cleaner code and remove local `struct resource` declaration. Signed-off-by: Himanshu Jha <himanshujha199640@gmail.com> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-07-17kbuild: compile-test headers listed in header-test-m as wellMasahiro Yamada
It will be useful to control the header-test by a tristate option. If CONFIG_FOO is a tristate option, you can write like this: header-test-$(CONFIG_FOO) += foo.h Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-07-17kbuild: remove unused hostcc-optionMasahiro Yamada
We can re-add this whenever it is needed. At this moment, it is unused. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-07-17kbuild: remove tag files by distclean instead of mrproperMasahiro Yamada
It takes somewhat long time to generate these tag files. Keep such precious files until we run 'make distclean'. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-07-17kbuild: add --hash-style= and --build-id unconditionallyMasahiro Yamada
As commit 1e0221374e30 ("mips: vdso: drop unnecessary cc-ldoption") explained, these flags are supported by the minimal required version of binutils. They are supported by ld.lld too. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com> Tested-by: Nathan Chancellor <natechancellor@gmail.com>
2019-07-17kbuild: get rid of misleading $(AS) from documentsMasahiro Yamada
The assembler files in the kernel are *.S instead of *.s, so they must be preprocessed. Since 'as' of GNU binutils is not able to preprocess, we always use $(CC) as an assembler driver. $(AS) is almost unused in Kbuild. As of v5.2, there is just one place that directly invokes $(AS). $ git grep -e '$(AS)' -e '${AS}' -e '$AS' -e '$(AS:' -e '${AS:' -- :^Documentation drivers/net/wan/Makefile: AS68K = $(AS) The documentation about *_AFLAGS* sounds like the flags were passed to $(AS). This is somewhat misleading. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
2019-07-17kconfig: fix missing choice values in auto.confMasahiro Yamada
Since commit 00c864f8903d ("kconfig: allow all config targets to write auto.conf if missing"), Kconfig creates include/config/auto.conf in the defconfig stage when it is missing. Joonas Kylmälä reported incorrect auto.conf generation under some circumstances. To reproduce it, apply the following diff: | --- a/arch/arm/configs/imx_v6_v7_defconfig | +++ b/arch/arm/configs/imx_v6_v7_defconfig | @@ -345,14 +345,7 @@ CONFIG_USB_CONFIGFS_F_MIDI=y | CONFIG_USB_CONFIGFS_F_HID=y | CONFIG_USB_CONFIGFS_F_UVC=y | CONFIG_USB_CONFIGFS_F_PRINTER=y | -CONFIG_USB_ZERO=m | -CONFIG_USB_AUDIO=m | -CONFIG_USB_ETH=m | -CONFIG_USB_G_NCM=m | -CONFIG_USB_GADGETFS=m | -CONFIG_USB_FUNCTIONFS=m | -CONFIG_USB_MASS_STORAGE=m | -CONFIG_USB_G_SERIAL=m | +CONFIG_USB_FUNCTIONFS=y | CONFIG_MMC=y | CONFIG_MMC_SDHCI=y | CONFIG_MMC_SDHCI_PLTFM=y And then, run: $ make ARCH=arm mrproper imx_v6_v7_defconfig You will see CONFIG_USB_FUNCTIONFS=y is correctly contained in the .config, but not in the auto.conf. Please note drivers/usb/gadget/legacy/Kconfig is included from a choice block in drivers/usb/gadget/Kconfig. So USB_FUNCTIONFS is a choice value. This is probably a similar situation described in commit beaaddb62540 ("kconfig: tests: test defconfig when two choices interact"). When sym_calc_choice() is called, the choice symbol forgets the SYMBOL_DEF_USER unless all of its choice values are explicitly set by the user. The choice symbol is given just one chance to recall it because set_all_choice_values() is called if SYMBOL_NEED_SET_CHOICE_VALUES is set. When sym_calc_choice() is called again, the choice symbol forgets it forever, since SYMBOL_NEED_SET_CHOICE_VALUES is a one-time aid. Hence, we cannot call sym_clear_all_valid() again and again. It is crazy to repeat set and unset of internal flags. However, we cannot simply get rid of "sym->flags &= flags | ~SYMBOL_DEF_USER;" Doing so would re-introduce the problem solved by commit 5d09598d488f ("kconfig: fix new choices being skipped upon config update"). To work around the issue, conf_write_autoconf() stopped calling sym_clear_all_valid(). conf_write() must be changed accordingly. Currently, it clears SYMBOL_WRITE after the symbol is written into the .config file. This is needed to prevent it from writing the same symbol multiple times in case the symbol is declared in two or more locations. I added the new flag SYMBOL_WRITTEN, to track the symbols that have been written. Anyway, this is a cheesy workaround in order to suppress the issue as far as defconfig is concerned. Handling of choices is totally broken. sym_clear_all_valid() is called every time a user touches a symbol from the GUI interface. To reproduce it, just add a new symbol drivers/usb/gadget/legacy/Kconfig, then touch around unrelated symbols from menuconfig. USB_FUNCTIONFS will disappear from the .config file. I added the Fixes tag since it is more fatal than before. But, this has been broken since long long time before, and still it is. We should take a closer look to fix this correctly somehow. Fixes: 00c864f8903d ("kconfig: allow all config targets to write auto.conf if missing") Cc: linux-stable <stable@vger.kernel.org> # 4.19+ Reported-by: Joonas Kylmälä <joonas.kylmala@iki.fi> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Tested-by: Joonas Kylmälä <joonas.kylmala@iki.fi>
2019-07-17KVM: x86/vPMU: reset pmc->counter to 0 for pmu fixed_countersLike Xu
To avoid semantic inconsistency, the fixed_counters in Intel vPMU need to be reset to 0 in intel_pmu_reset() as gp_counters does. Signed-off-by: Like Xu <like.xu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-17dma-direct: only limit the mapping size if swiotlb could be usedChristoph Hellwig
Don't just check for a swiotlb buffer, but also if buffering might be required for this particular device. Fixes: 133d624b1cee ("dma: Introduce dma_max_mapping_size()") Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2019-07-17dma-mapping: add a dma_addressing_limited helperChristoph Hellwig
This helper returns if the device has issues addressing all present memory in the system. Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-07-17xen/pv: Fix a boot up hang revealed by int3 self testZhenzhong Duan
Commit 7457c0da024b ("x86/alternatives: Add int3_emulate_call() selftest") is used to ensure there is a gap setup in int3 exception stack which could be used for inserting call return address. This gap is missed in XEN PV int3 exception entry path, then below panic triggered: [ 0.772876] general protection fault: 0000 [#1] SMP NOPTI [ 0.772886] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.2.0+ #11 [ 0.772893] RIP: e030:int3_magic+0x0/0x7 [ 0.772905] RSP: 3507:ffffffff82203e98 EFLAGS: 00000246 [ 0.773334] Call Trace: [ 0.773334] alternative_instructions+0x3d/0x12e [ 0.773334] check_bugs+0x7c9/0x887 [ 0.773334] ? __get_locked_pte+0x178/0x1f0 [ 0.773334] start_kernel+0x4ff/0x535 [ 0.773334] ? set_init_arg+0x55/0x55 [ 0.773334] xen_start_kernel+0x571/0x57a For 64bit PV guests, Xen's ABI enters the kernel with using SYSRET, with %rcx/%r11 on the stack. To convert back to "normal" looking exceptions, the xen thunks do 'xen_*: pop %rcx; pop %r11; jmp *'. E.g. Extracting 'xen_pv_trap xenint3' we have: xen_xenint3: pop %rcx; pop %r11; jmp xenint3 As xenint3 and int3 entry code are same except xenint3 doesn't generate a gap, we can fix it by using int3 and drop useless xenint3. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17x86/xen: Add "nopv" support for HVM guestZhenzhong Duan
PVH guest needs PV extentions to work, so "nopv" parameter should be ignored for PVH but not for HVM guest. If PVH guest boots up via the Xen-PVH boot entry, xen_pvh is set early, we know it's PVH guest and ignore "nopv" parameter directly. If PVH guest boots up via the normal boot entry same as HVM guest, it's hard to distinguish PVH and HVM guest at that time. In this case, we have to panic early if PVH is detected and nopv is enabled to avoid a worse situation later. Remove static from bool_x86_init_noop/x86_op_int_noop so they could be used globally. Move xen_platform_hvm() after xen_hvm_guest_late_init() to avoid compile error. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17x86/paravirt: Remove const mark from x86_hyper_xen_hvm variableZhenzhong Duan
.. as "nopv" support needs it to be changeable at boot up stage. Checkpatch reports warning, so move variable declarations from hypervisor.c to hypervisor.h Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17xen: Map "xen_nopv" parameter to "nopv" and mark it obsoleteZhenzhong Duan
Clean up unnecessory code after that operation. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Juergen Gross <jgross@suse.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17x86: Add "nopv" parameter to disable PV extensionsZhenzhong Duan
In virtualization environment, PV extensions (drivers, interrupts, timers, etc) are enabled in the majority of use cases which is the best option. However, in some cases (kexec not fully working, benchmarking) we want to disable PV extensions. We have "xen_nopv" for that purpose but only for XEN. For a consistent admin experience a common command line parameter "nopv" set across all PV guest implementations is a better choice. There are guest types which just won't work without PV extensions, like Xen PV, Xen PVH and jailhouse. add a "ignore_nopv" member to struct hypervisor_x86 set to true for those guest types and call the detect functions only if nopv is false or ignore_nopv is true. Suggested-by: Juergen Gross <jgross@suse.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Jan Kiszka <jan.kiszka@siemens.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17x86/xen: Mark xen_hvm_need_lapic() and xen_x2apic_para_available() as __initZhenzhong Duan
.. as they are only called at early bootup stage. In fact, other functions in x86_hyper_xen_hvm.init.* are all marked as __init. Unexport xen_hvm_need_lapic as it's never used outside. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Reviewed-by: Juergen Gross <jgross@suse.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17xen: remove tmem driverJuergen Gross
The Xen tmem (transcendent memory) driver can be removed, as the related Xen hypervisor feature never made it past the "experimental" state and will be removed in future Xen versions (>= 4.13). The xen-selfballoon driver depends on tmem, so it can be removed, too. Signed-off-by: Juergen Gross <jgross@suse.com> Acked-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17Revert "x86/paravirt: Set up the virt_spin_lock_key after static keys get ↵Zhenzhong Duan
initialized" This reverts commit ca5d376e17072c1b60c3fee66f3be58ef018952d. Commit 8990cac6e5ea ("x86/jump_label: Initialize static branching early") adds jump_label_init() call in setup_arch() to make static keys initialized early, so we could use the original simpler code again. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@oracle.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-17xen/events: fix binding user event channels to cpusJuergen Gross
When binding an interdomain event channel to a vcpu via IOCTL_EVTCHN_BIND_INTERDOMAIN not only the event channel needs to be bound, but the affinity of the associated IRQi must be changed, too. Otherwise the IRQ and the event channel won't be moved to another vcpu in case the original vcpu they were bound to is going offline. Cc: <stable@vger.kernel.org> # 4.13 Fixes: c48f64ab472389df ("xen-evtchn: Bind dyn evtchn:qemu-dm interrupt to next online VCPU") Signed-off-by: Juergen Gross <jgross@suse.com> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: Juergen Gross <jgross@suse.com>
2019-07-16scsi: megaraid_sas: set an unlimited max_segment_sizeChristoph Hellwig
When using a virt_boundary_mask, as done for NVMe devices attached to megaraid_sas controllers, we require an unlimited max_segment_size as the virt boundary merging code assumes that. But we also need to propagate that to the DMA mapping layer to make dma-debug happy. The SCSI layer takes care of that when using the per-host virt_boundary setting, but given that megaraid_sas only wants to set the virt_boundary for actual NVMe devices, we can't rely on that. The DMA layer maximum segment is global to the HBA however, so we have to set it explicitly. This patch assumes that megaraid_sas does not have a segment size limitation, which seems true based on the SGL format, but will need to be verified. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16scsi: mpt3sas: set an unlimited max_segment_size for SAS 3.0 HBAsChristoph Hellwig
When using a virt_boundary_mask, as done for NVMe devices attached to mpt3sas controllers, we require an unlimited max_segment_size as the virt boundary merging code assumes that. But we also need to propagate that to the DMA mapping layer to make dma-debug happy. The SCSI layer takes care of that when using the per-host virt_boundary setting, but given that mpt3sas only wants to set the virt_boundary for actual NVMe devices, we can't rely on that. The DMA layer maximum segment is global to the HBA however, so we have to set it explicitly. This patch assumes that mpt3sas does not have a segment size limitation, which seems true based on the SGL format, but will need to be verified. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Suganath Prabu <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16scsi: IB/srp: set virt_boundary_mask in the scsi hostChristoph Hellwig
This ensures all proper DMA layer handling is taken care of by the SCSI midlayer. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Sagi Grimberg <sagi@grimberg.me> Acked-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16scsi: IB/iser: set virt_boundary_mask in the scsi hostChristoph Hellwig
This ensures all proper DMA layer handling is taken care of by the SCSI midlayer. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Max Gurtovoy <maxg@mellanox.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16scsi: storvsc: set virt_boundary_mask in the scsi host templateChristoph Hellwig
This ensures all proper DMA layer handling is taken care of by the SCSI midlayer. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16scsi: ufshcd: set max_segment_size in the scsi host templateChristoph Hellwig
We need to also mirror the value to the device to ensure IOMMU merging doesn't undo it, and the SCSI host level parameter will ensure that. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16scsi: core: take the DMA max mapping size into accountChristoph Hellwig
We need to limit the device's max_sectors to what the DMA mapping implementation can support. If not, we risk running out of swiotlb buffers easily. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16scsi: core: add a host / host template field for the virt boundaryChristoph Hellwig
This allows drivers setting it up easily instead of branching out to block layer calls in slave_alloc, and ensures the upgraded max_segment_size setting gets picked up by the DMA layer. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Kashyap Desai < kashyap.desai@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16switch the remnants of releasing the mountpoint away from fs_pinAl Viro
We used to need rather convoluted ordering trickery to guarantee that dput() of ex-mountpoints happens before the final mntput() of the same. Since we don't need that anymore, there's no point playing with fs_pin for that. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-16get rid of detach_mnt()Al Viro
Lift getting the original mount (dentry is actually not needed at all) of the mountpoint into the callers - to do_move_mount() and pivot_root() level. That simplifies the cleanup in those and allows to get saner arguments for attach_mnt_recursive(). Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-16virtio_pmem: fix sparse warningPankaj Gupta
This patch fixes below sparse warning related to __virtio type in virtio pmem driver. This is reported by Intel test bot on linux-next tree. nd_virtio.c:56:28: warning: incorrect type in assignment (different base types) nd_virtio.c:56:28: expected unsigned int [unsigned] [usertype] type nd_virtio.c:56:28: got restricted __virtio32 nd_virtio.c:93:59: warning: incorrect type in argument 2 (different base types) nd_virtio.c:93:59: expected restricted __virtio32 [usertype] val nd_virtio.c:93:59: got unsigned int [unsigned] [usertype] ret Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Pankaj Gupta <pagupta@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-07-16make struct mountpoint bear the dentry reference to mountpoint, not struct mountAl Viro
Using dput_to_list() to shift the contributing reference from ->mnt_mountpoint to ->mnt_mp->m_dentry. Dentries are dropped (with dput_to_list()) as soon as struct mountpoint is destroyed; in cases where we are under namespace_sem we use the global list, shrinking it in namespace_unlock(). In case of detaching stuck MNT_LOCKed children at final mntput_no_expire() we use a local list and shrink it ourselves. ->mnt_ex_mountpoint crap is gone. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-07-16scsi: core: Fix race on creating sense cacheMing Lei
When scsi_init_sense_cache(host) is called concurrently from different hosts, each code path may find that no cache has been created and allocate a new one. The lack of locking can lead to potentially overriding a cache allocated by a different host. Fix the issue by moving 'mutex_lock(&scsi_sense_cache_mutex)' before scsi_select_sense_cache(). Fixes: 0a6ac4ee7c21 ("scsi: respect unchecked_isa_dma for blk-mq") Cc: Stable <stable@vger.kernel.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.com> Cc: Ewan D. Milne <emilne@redhat.com> Signed-off-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2019-07-16scsi: sd_zbc: Fix compilation warningDamien Le Moal
kbuild test robot gets the following compilation warning using gcc 7.4 cross compilation for c6x (GCC_VERSION=7.4.0 make.cross ARCH=c6x). In file included from include/asm-generic/bug.h:18:0, from arch/c6x/include/asm/bug.h:12, from include/linux/bug.h:5, from include/linux/thread_info.h:12, from include/asm-generic/current.h:5, from ./arch/c6x/include/generated/asm/current.h:1, from include/linux/sched.h:12, from include/linux/blkdev.h:5, from drivers//scsi/sd_zbc.c:11: drivers//scsi/sd_zbc.c: In function 'sd_zbc_read_zones': >> include/linux/kernel.h:62:48: warning: 'zone_blocks' may be used uninitialized in this function [-Wmaybe-uninitialized] #define __round_mask(x, y) ((__typeof__(x))((y)-1)) ^ drivers//scsi/sd_zbc.c:464:6: note: 'zone_blocks' was declared here u32 zone_blocks; ^~~~~~~~~~~ This is a false-positive report. The variable zone_blocks is always initialized in sd_zbc_check_zones() before use. It is not initialized only and only if sd_zbc_check_zones() fails. Avoid this warning by initializing the zone_blocks variable to 0. Fixes: 5f832a395859 ("scsi: sd_zbc: Fix sd_zbc_check_zones() error checks") Cc: Stable <stable@vger.kernel.org> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>