Age | Commit message | Author
2017-09-08 | mm: soft-dirty: keep soft-dirty bits over thp migration | Naoya Horiguchi
The soft-dirty bit is designed to stay tracked across page migration. This patch makes it work the same way for thp migration. Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 | mm: thp: check pmd migration entry in common path | Zi Yan
When THP migration is being used, memory management code needs to handle pmd migration entries properly. This patch uses !pmd_present() or is_swap_pmd() (depending on whether pmd_none() needs separate code or not) to check pmd migration entries at the places where a pmd entry is present.

Since pmd-related code uses split_huge_page(), split_huge_pmd(), pmd_trans_huge(), pmd_trans_unstable(), or pmd_none_or_trans_huge_or_clear_bad(), this patch:

 1. adds pmd migration entry split code in split_huge_pmd(),
 2. takes care of pmd migration entries whenever pmd_trans_huge() is present,
 3. makes pmd_none_or_trans_huge_or_clear_bad() pmd migration entry aware.

Since split_huge_page() uses split_huge_pmd() and pmd_trans_unstable() is equivalent to pmd_none_or_trans_huge_or_clear_bad(), we do not change them.

Until this commit, a pmd entry should be:

 1. pointing to a pte page,
 2. is_swap_pmd(),
 3. pmd_trans_huge(),
 4. pmd_devmap(), or
 5. pmd_none().

Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
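The five pmd states named in this commit message can be pictured with a small user-space classifier. This is only a sketch under stated assumptions: `struct toy_pmd`, its boolean fields, and `toy_classify_pmd()` are hypothetical stand-ins (a real `pmd_t` is an architecture-specific word), and the only relationship relied on is that a swap pmd is an entry that is neither none nor present.

```c
#include <stdbool.h>

/* Hypothetical toy model of a pmd entry; not the kernel's pmd_t. */
struct toy_pmd {
    bool none;     /* entry is empty */
    bool present;  /* entry maps something */
    bool huge;     /* present entry maps a transparent huge page */
    bool devmap;   /* present entry maps device memory */
};

enum toy_pmd_kind {
    TOY_PMD_NONE,
    TOY_PMD_SWAP,        /* e.g. a pmd migration entry */
    TOY_PMD_TRANS_HUGE,
    TOY_PMD_DEVMAP,
    TOY_PMD_PTE_PAGE,    /* points to a page of ptes */
};

/* A swap pmd is not empty, yet not present. */
static bool toy_is_swap_pmd(struct toy_pmd p)
{
    return !p.none && !p.present;
}

/* Check none and swap first, so later tests only ever see present entries. */
static enum toy_pmd_kind toy_classify_pmd(struct toy_pmd p)
{
    if (p.none)
        return TOY_PMD_NONE;
    if (toy_is_swap_pmd(p))
        return TOY_PMD_SWAP;
    if (p.devmap)
        return TOY_PMD_DEVMAP;
    if (p.huge)
        return TOY_PMD_TRANS_HUGE;
    return TOY_PMD_PTE_PAGE;
}
```

The ordering is the point of the sketch: code that tests the huge-page case before ruling out a non-present entry would misread a migration entry.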
2017-09-08 | mm: thp: enable thp migration in generic path | Zi Yan
Add thp migration's core code, including conversions between a PMD entry and a swap entry, setting PMD migration entry, removing PMD migration entry, and waiting on PMD migration entries. This patch makes it possible to support thp migration. If allocating a destination page as a thp fails, the source thp is split as before and normal page migration follows. If the destination thp is allocated successfully, thp migration is used. Subsequent patches actually enable thp migration for each caller of page migration by allowing its get_new_page() callback to allocate thps. [zi.yan@cs.rutgers.edu: fix gcc-4.9.0 -Wmissing-braces warning] Link: http://lkml.kernel.org/r/A0ABA698-7486-46C3-B209-E95A9048B22C@cs.rutgers.edu [akpm@linux-foundation.org: fix x86_64 allnoconfig warning] Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 | mm: thp: introduce CONFIG_ARCH_ENABLE_THP_MIGRATION | Naoya Horiguchi
Introduce CONFIG_ARCH_ENABLE_THP_MIGRATION to limit thp migration functionality to x86_64, which should be safer at the first step. Link: http://lkml.kernel.org/r/20170717193955.20207-5-zi.yan@sent.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08 | mm: thp: introduce separate TTU flag for thp freezing | Naoya Horiguchi
TTU_MIGRATION is used to convert pte into migration entry until thp split completes. This behavior conflicts with thp migration added by later patches, so let's introduce a new TTU flag specifically for freezing.

try_to_unmap() is used both for thp split (via freeze_page()) and page migration (via __unmap_and_move()). In freeze_page(), ttu_flag given for head page is like below (assuming anonymous thp):

    (TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS | TTU_RMAP_LOCKED | \
     TTU_MIGRATION | TTU_SPLIT_HUGE_PMD)

and ttu_flag given for tail pages is:

    (TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS | TTU_RMAP_LOCKED | \
     TTU_MIGRATION)

__unmap_and_move() calls try_to_unmap() with ttu_flag:

    (TTU_MIGRATION | TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS)

Now I'm trying to insert a branch for thp migration at the top of try_to_unmap_one() like below

    static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
                                unsigned long address, void *arg)
    {
        ...
        /* PMD-mapped THP migration entry */
        if (!pvmw.pte && (flags & TTU_MIGRATION)) {
            if (!PageAnon(page))
                continue;

            set_pmd_migration_entry(&pvmw, page);
            continue;
        }
        ...
    }

so try_to_unmap() for tail pages called by thp split can go into thp migration code path (which converts *pmd* into migration entry), while the expectation is to freeze thp (which converts *pte* into migration entry.)

I detected this failure as a "bad page state" error in a testcase where split_huge_page() is called from queue_pages_pte_range().

Link: http://lkml.kernel.org/r/20170717193955.20207-4-zi.yan@sent.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
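The flag conflict described in this commit can be modelled in a few lines of user-space C. This is a sketch, not kernel code: the `TOY_TTU_*` names and their bit values are hypothetical, and `TOY_TTU_FREEZE` merely stands in for whatever separate freezing flag the patch introduces. The predicate mirrors the quoted branch condition.

```c
#include <stdbool.h>

/* Hypothetical flag bits, chosen only for this sketch. */
#define TOY_TTU_MIGRATION     (1u << 0)  /* caller is doing page migration */
#define TOY_TTU_FREEZE        (1u << 1)  /* caller is freezing a thp for split */
#define TOY_TTU_IGNORE_MLOCK  (1u << 2)

/*
 * Mirrors the quoted branch: a PMD-mapped page (no pte found) combined
 * with the migration flag is routed to the pmd migration entry path.
 */
static bool toy_takes_pmd_migration_path(bool pte_mapped, unsigned int flags)
{
    return !pte_mapped && (flags & TOY_TTU_MIGRATION) != 0;
}
```

When the split path passes the migration flag, a PMD-mapped page is indistinguishable from a real migration request; giving the split path its own flag keeps it out of that branch.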
2017-09-08 | mm: x86: move _PAGE_SWP_SOFT_DIRTY from bit 7 to bit 1 | Naoya Horiguchi
_PAGE_PSE is used to distinguish between a truly non-present (_PAGE_PRESENT=0) PMD, and a PMD which is undergoing a THP split and should be treated as present. But _PAGE_SWP_SOFT_DIRTY currently uses the _PAGE_PSE bit, which would cause confusion between one of those PMDs undergoing a THP split, and a soft-dirty PMD. Dropping _PAGE_PSE check in pmd_present() does not work well, because it can hurt optimization of tlb handling in thp split. Thus, we need to move the bit. In the current kernel, bits 1-4 are not used in non-present format since commit 00839ee3b299 ("x86/mm: Move swap offset/type up in PTE to work around erratum"). So let's move _PAGE_SWP_SOFT_DIRTY to bit 1. Bit 7 is used as reserved (always clear), so please don't use it for other purpose. Link: http://lkml.kernel.org/r/20170717193955.20207-3-zi.yan@sent.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: David Nellans <dnellans@nvidia.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Minchan Kim <minchan@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
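The bit clash can be illustrated with a simplified user-space model. The sketch assumes only what the message states (soft-dirty for non-present entries shared bit 7 with _PAGE_PSE and moves to bit 1) plus the x86 fact that the present bit is bit 0. The `TOY_*` names are hypothetical, and `toy_pmd_present()` is a reduced form of the real check, which considers more bits.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_PRESENT        (UINT64_C(1) << 0)
#define TOY_PAGE_PSE            (UINT64_C(1) << 7)
#define TOY_SWP_SOFT_DIRTY_OLD  TOY_PAGE_PSE        /* bit 7, before the move */
#define TOY_SWP_SOFT_DIRTY_NEW  (UINT64_C(1) << 1)  /* bit 1, after the move */

/*
 * Simplified presence test: a PMD with PSE set is treated as present
 * even when PRESENT is clear, which is the THP-split case in the text.
 */
static bool toy_pmd_present(uint64_t pmd)
{
    return (pmd & (TOY_PAGE_PRESENT | TOY_PAGE_PSE)) != 0;
}
```

With the old bit, a non-present, soft-dirty swap PMD would look present to this check; with the new bit it stays non-present.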
2017-09-08 | mm: mempolicy: add queue_pages_required() | Naoya Horiguchi
Patch series "mm: page migration enhancement for thp", v9.

Motivations:

1. THP migration becomes important in the upcoming heterogeneous memory systems. As David Nellans from NVIDIA pointed out from other threads (http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1349227.html), future GPUs or other accelerators will have their memory managed by operating systems. Moving data into and out of these memory nodes efficiently is critical to applications that use GPUs or other accelerators. Existing page migration only supports base pages, which has a very low memory bandwidth utilization. My experiments (see below) show THP migration can migrate pages more efficiently.

2. Base page migration vs THP migration throughput. Here are cross-socket page migration results from calling move_pages() syscall:

   In x86_64, an Intel two-socket E5-2640v3 box,
    - single 4KB base page migration takes 62.47 us, using 0.06 GB/s BW,
    - single 2MB THP migration takes 658.54 us, using 2.97 GB/s BW,
    - 512 4KB base page migration takes 1987.38 us, using 0.98 GB/s BW.

   In ppc64, a two-socket Power8 box,
    - single 64KB base page migration takes 49.3 us, using 1.24 GB/s BW,
    - single 16MB THP migration takes 2202.17 us, using 7.10 GB/s BW,
    - 256 64KB base page migration takes 2543.65 us, using 6.14 GB/s BW.

   THP migration can give us 3x and 1.15x throughput over base page migration in x86_64 and ppc64 respectively.

   You can test it out by using the code here: https://github.com/x-y-z/thp-migration-bench

3. Existing page migration splits THP before migration and cannot guarantee the migrated pages are still contiguous. Contiguity is always what GPUs and accelerators look for. Without THP migration, khugepaged needs to do extra work to reassemble the migrated pages back to THPs.

This patch (of 10):

Introduce a separate check routine related to MPOL_MF_INVERT flag. This patch just does cleanup, no behavioral change.

Link: http://lkml.kernel.org/r/20170717193955.20207-2-zi.yan@sent.com Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Mel Gorman <mgorman@techsingularity.net> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: David Nellans <dnellans@nvidia.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Michal Hocko <mhocko@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
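The bandwidth figures quoted in the cover letter follow from size divided by time. The helpers below are a sketch for checking that arithmetic. The function names are hypothetical, and reading "GB" as 2^30 bytes is an assumption (the quoted numbers reproduce under that reading, not with 10^9).

```c
/* Bandwidth in GiB/s for `bytes` moved in `usec` microseconds. */
static double toy_gib_per_sec(double bytes, double usec)
{
    const double gib = 1024.0 * 1024.0 * 1024.0;

    return (bytes / (usec * 1e-6)) / gib;
}

/* Speedup of THP migration over base-page migration of the same bytes. */
static double toy_speedup(double base_usec, double thp_usec)
{
    return base_usec / thp_usec;
}
```

For example, `toy_gib_per_sec(2.0 * 1024 * 1024, 658.54)` comes out close to the quoted 2.97, and `toy_speedup(1987.38, 658.54)` is roughly 3, in line with the "3x" claim for x86_64.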
2017-09-08 | ipv6: fix typo in fib6_net_exit() | Eric Dumazet
IPv6 FIB should use FIB6_TABLE_HASHSZ, not FIB_TABLE_HASHSZ. Fixes: ba1cc08d9488 ("ipv6: fix memory leak with multiple tables during netns destruction") Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-08 | tcp: fix a request socket leak | Eric Dumazet
While the cited commit fixed a possible deadlock, it added a leak of the request socket, since reqsk_put() must be called if the BPF filter decided the ACK packet must be dropped. Fixes: d624d276d1dd ("tcp: fix possible deadlock in TCP stack vs BPF filter") Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
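The kind of leak described here is a reference taken on one path and not released on an early exit. The following is a toy user-space model, not the TCP code: `struct toy_req`, `toy_req_hold()`, `toy_req_put()`, and `toy_handle_ack()` are hypothetical names, and the real receive path is considerably more involved.

```c
#include <stdbool.h>

/* Hypothetical refcounted request object. */
struct toy_req {
    int refcnt;
};

static void toy_req_hold(struct toy_req *r) { r->refcnt++; }
static void toy_req_put(struct toy_req *r)  { r->refcnt--; }

/*
 * Returns true if the ACK was accepted. The reference taken at lookup
 * is dropped on every exit, including the filter-drop path.
 */
static bool toy_handle_ack(struct toy_req *r, bool filter_drops)
{
    toy_req_hold(r);          /* reference taken by the lookup */

    if (filter_drops) {
        toy_req_put(r);       /* omitting this put is the leak */
        return false;
    }

    /* ... normal processing would happen here ... */
    toy_req_put(r);
    return true;
}
```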
2017-09-08 | Merge tag 'platform-drivers-x86-v4.14-1' of ↵ | Linus Torvalds
git://git.infradead.org/linux-platform-drivers-x86 Pull x86 platform driver updates from Darren Hart: "Several fixes from static analysis and message noise reduction. Correct WMI core and related drivers to evaluate instance number 0x0 in accordance with the documentation. Add intel-telemetry support for Gemini Lake. Various individual driver fixes noted below. dell-wmi: - Update dell_wmi_check_descriptor_buffer() to new model intel-vbtn: - reduce unnecessary messages for normal users - match power button on press rather than release intel-hid: - reduce unnecessary messages for normal users thinkpad_acpi: - Fix warning about deprecated hwmon_device_register wmi: - Fix check for method instance number ideapad-laptop: - Expose conservation mode switch intel_pmc_core: - Make the driver PCH family agnostic peaq-wmi: - Evaluate wmi method with instance number 0x0 - silence a static checker warning mxm-wmi: - Evaluate wmi method with instance number 0x0 asus-wmi: - Evaluate wmi method with instance number 0x0 intel_scu_ipc: - make intel_scu_ipc_pdata_t const intel_mid_powerbtn: - make mid_pb_ddata const - fix error return code in mid_pb_probe() hp-wmi: - Remove unused macro helper - Correctly determine method id in WMI calls dell-wmi: - Fix driver interface version query intel_telemetry: - remove redundant macro definition - Add GLK PSS Event Table alienware-wmi: - fix format string overflow warning ibm_rtl: - remove unnecessary static in ibm_rtl_write() msi-wmi: - remove unnecessary static in msi_wmi_notify()" * tag 'platform-drivers-x86-v4.14-1' of git://git.infradead.org/linux-platform-drivers-x86: (23 commits) platform/x86: dell-wmi: Update dell_wmi_check_descriptor_buffer() to new model platform/x86: intel-vbtn: reduce unnecessary messages for normal users platform/x86: intel-hid: reduce unnecessary messages for normal users platform/x86: thinkpad_acpi: Fix warning about deprecated hwmon_device_register platform/x86: wmi: Fix check for method instance number 
platform/x86: ideapad-laptop: Expose conservation mode switch platform/x86: intel_pmc_core: Make the driver PCH family agnostic platform/x86: peaq-wmi: Evaluate wmi method with instance number 0x0 platform/x86: mxm-wmi: Evaluate wmi method with instance number 0x0 platform/x86: asus-wmi: Evaluate wmi method with instance number 0x0 platform/x86: intel_scu_ipc: make intel_scu_ipc_pdata_t const platform/x86: intel_mid_powerbtn: make mid_pb_ddata const platform/x86: intel_mid_powerbtn: fix error return code in mid_pb_probe() platform/x86: hp-wmi: Remove unused macro helper platform/x86: hp-wmi: Correctly determine method id in WMI calls platform/x86: intel-vbtn: match power button on press rather than release platform/x86: dell-wmi: Fix driver interface version query platform/x86: intel_telemetry: remove redundant macro definition platform/x86: intel_telemetry: Add GLK PSS Event Table platform/x86: alienware-wmi: fix format string overflow warning ...
2017-09-08 | Merge tag 'arc-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc | Linus Torvalds
Pull ARC updates from Vineet Gupta:

 - Support for HSDK board hosting a Quad core HS38x4 based SoC running @1GHz (and some prerequisite changes such as ability to scoot the kernel code/data from start of memory map etc)

 - Quite a few updates for EZChip (Mellanox) platform

 - Fixes to fault/exception printing

* tag 'arc-4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: (26 commits)
  ARC: Re-enable MMU upon Machine Check exception
  ARC: Show fault information passed to show_kernel_fault_diag()
  ARC: [plat-hsdk] initial port for HSDK board
  ARC: mm: Decouple RAM base address from kernel link address
  ARCv2: IOC: Tighten up the contraints (specifically base / size alignment)
  ARC: [plat-axs103] refactor the DT fudging code
  ARC: [plat-axs103] use clk driver #2: Add core pll node to DT to manage cpu clk
  ARC: [plat-axs103] use clk driver #1: Get rid of platform specific cpu clk setting
  ARCv2: SLC: provide a line based flush routine for debugging
  ARC: Hardcode ARCH_DMA_MINALIGN to max line length we may have
  ARC: [plat-eznps] handle extra aux regs #2: kernel/entry exit
  ARC: [plat-eznps] handle extra aux regs #1: save/restore on context switch
  ARC: [plat-eznps] avoid toggling of DPC register
  ARC: [plat-eznps] Update the init sequence of aux regs per cpu.
  ARC: [plat-eznps] new command line argument for HW scheduler at MTM
  ARC: set boot print log level to PR_INFO
  ARC: [plat-eznps] Handle user memory error same in simulation and silicon
  ARC: [plat-eznps] use schd.wft instruction instead of sleep at idle task
  ARC: create cpu specific version of arch_cpu_idle()
  ARC: [plat-eznps] spinlock aware for MTM
  ...
2017-09-08 | Merge tag 'pci-v4.14-changes' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: - add enhanced Downstream Port Containment support, which prints more details about Root Port Programmed I/O errors (Dongdong Liu) - add Layerscape ls1088a and ls2088a support (Hou Zhiqiang) - add MediaTek MT2712 and MT7622 support (Ryder Lee) - add MediaTek MT2712 and MT7622 MSI support (Honghui Zhang) - add Qualcom IPQ8074 support (Varadarajan Narayanan) - add R-Car r8a7743/5 device tree support (Biju Das) - add Rockchip per-lane PHY support for better power management (Shawn Lin) - fix IRQ mapping for hot-added devices by replacing the pci_fixup_irqs() boot-time design with a host bridge hook called at probe-time (Lorenzo Pieralisi, Matthew Minter) - fix race when enabling two devices that results in upstream bridge not being enabled correctly (Srinath Mannam) - fix pciehp power fault infinite loop (Keith Busch) - fix SHPC bridge MSI hotplug events by enabling bus mastering (Aleksandr Bezzubikov) - fix a VFIO issue by correcting PCIe capability sizes (Alex Williamson) - fix an INTD issue on Xilinx and possibly other drivers by unifying INTx IRQ domain support (Paul Burton) - avoid IOMMU stalls by marking AMD Stoney GPU ATS as broken (Joerg Roedel) - allow APM X-Gene device assignment to guests by adding an ACS quirk (Feng Kan) - fix driver crashes by disabling Extended Tags on Broadcom HT2100 (Extended Tags support is required for PCIe Receivers but not Requesters, and we now enable them by default when Requesters support them) (Sinan Kaya) - fix MSIs for devices that use phantom RIDs for DMA by assuming MSIs use the real Requester ID (not a phantom RID) (Robin Murphy) - prevent assignment of Intel VMD children to guests (which may be supported eventually, but isn't yet) by not associating an IOMMU with them (Jon Derrick) - fix Intel VMD suspend/resume by releasing IRQs on suspend (Scott Bauer) - fix a Function-Level Reset issue with Intel 750 NVMe by waiting longer 
(up to 60sec instead of 1sec) for device to become ready (Sinan Kaya) - fix a Function-Level Reset issue on iProc Stingray by working around hardware defects in the CRS implementation (Oza Pawandeep) - fix an issue with Intel NVMe P3700 after an iProc reset by adding a delay during shutdown (Oza Pawandeep) - fix a Microsoft Hyper-V lockdep issue by polling instead of blocking in compose_msi_msg() (Stephen Hemminger) - fix a wireless LAN driver timeout by clearing DesignWare MSI interrupt status after it is handled, not before (Faiz Abbas) - fix DesignWare ATU enable checking (Jisheng Zhang) - reduce Layerscape dependencies on the bootloader by doing more initialization in the driver (Hou Zhiqiang) - improve Intel VMD performance allowing allocation of more IRQ vectors than present CPUs (Keith Busch) - improve endpoint framework support for initial DMA mask, different BAR sizes, configurable page sizes, MSI, test driver, etc (Kishon Vijay Abraham I, Stan Drozd) - rework CRS support to add periodic messages while we poll during enumeration and after Function-Level Reset and prepare for possible other uses of CRS (Sinan Kaya) - clean up Root Port AER handling by removing unnecessary code and moving error handler methods to struct pcie_port_service_driver (Christoph Hellwig) - clean up error handling paths in various drivers (Bjorn Andersson, Fabio Estevam, Gustavo A. R. 
Silva, Harunobu Kurokawa, Jeffy Chen, Lorenzo Pieralisi, Sergei Shtylyov) - clean up SR-IOV resource handling by disabling VF decoding before updating the corresponding resource structs (Gavin Shan) - clean up DesignWare-based drivers by unifying quirks to update Class Code and Interrupt Pin and related handling of write-protected registers (Hou Zhiqiang) - clean up by adding empty generic pcibios_align_resource() and pcibios_fixup_bus() and removing empty arch-specific implementations (Palmer Dabbelt) - request exclusive reset control for several drivers to allow cleanup elsewhere (Philipp Zabel) - constify various structures (Arvind Yadav, Bhumika Goyal) - convert from full_name() to %pOF (Rob Herring) - remove unused variables from iProc, HiSi, Altera, Keystone (Shawn Lin) * tag 'pci-v4.14-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (170 commits) PCI: xgene: Clean up whitespace PCI: xgene: Define XGENE_PCI_EXP_CAP and use generic PCI_EXP_RTCTL offset PCI: xgene: Fix platform_get_irq() error handling PCI: xilinx-nwl: Fix platform_get_irq() error handling PCI: rockchip: Fix platform_get_irq() error handling PCI: altera: Fix platform_get_irq() error handling PCI: spear13xx: Fix platform_get_irq() error handling PCI: artpec6: Fix platform_get_irq() error handling PCI: armada8k: Fix platform_get_irq() error handling PCI: dra7xx: Fix platform_get_irq() error handling PCI: exynos: Fix platform_get_irq() error handling PCI: iproc: Clean up whitespace PCI: iproc: Rename PCI_EXP_CAP to IPROC_PCI_EXP_CAP PCI: iproc: Add 500ms delay during device shutdown PCI: Fix typos and whitespace errors PCI: Remove unused "res" variable from pci_resource_io() PCI: Correct kernel-doc of pci_vpd_srdt_size(), pci_vpd_srdt_tag() PCI/AER: Reformat AER register definitions iommu/vt-d: Prevent VMD child devices from being remapping targets x86/PCI: Use is_vmd() rather than relying on the domain number ...
2017-09-08 | Merge tag 'kvm-4.14-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm | Linus Torvalds
Pull KVM updates from Radim Krčmář: "First batch of KVM changes for 4.14 Common: - improve heuristic for boosting preempted spinlocks by ignoring VCPUs in user mode ARM: - fix for decoding external abort types from guests - added support for migrating the active priority of interrupts when running a GICv2 guest on a GICv3 host - minor cleanup PPC: - expose storage keys to userspace - merge kvm-ppc-fixes with a fix that missed 4.13 because of vacations - fixes s390: - merge of kvm/master to avoid conflicts with additional sthyi fixes - wire up the no-dat enhancements in KVM - multiple epoch facility (z14 feature) - Configuration z/Architecture Mode - more sthyi fixes - gdb server range checking fix - small code cleanups x86: - emulate Hyper-V TSC frequency MSRs - add nested INVPCID - emulate EPTP switching VMFUNC - support Virtual GIF - support 5 level page tables - speedup nested VM exits by packing byte operations - speedup MMIO by using hardware provided physical address - a lot of fixes and cleanups, especially nested" * tag 'kvm-4.14-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (67 commits) KVM: arm/arm64: Support uaccess of GICC_APRn KVM: arm/arm64: Extract GICv3 max APRn index calculation KVM: arm/arm64: vITS: Drop its_ite->lpi field KVM: arm/arm64: vgic: constify seq_operations and file_operations KVM: arm/arm64: Fix guest external abort matching KVM: PPC: Book3S HV: Fix memory leak in kvm_vm_ioctl_get_htab_fd KVM: s390: vsie: cleanup mcck reinjection KVM: s390: use WARN_ON_ONCE only for checking KVM: s390: guestdbg: fix range check KVM: PPC: Book3S HV: Report storage key support to userspace KVM: PPC: Book3S HV: Fix case where HDEC is treated as 32-bit on POWER9 KVM: PPC: Book3S HV: Fix invalid use of register expression KVM: PPC: Book3S HV: Fix H_REGISTER_VPA VPA size validation KVM: PPC: Book3S HV: Fix setting of storage key in H_ENTER KVM: PPC: e500mc: Fix a NULL dereference KVM: PPC: e500: Fix some NULL dereferences on error KVM: PPC: Book3S HV: 
Protect updates to spapr_tce_tables list KVM: s390: we are always in czam mode KVM: s390: expose no-DAT to guest and migration support KVM: s390: sthyi: remove invalid guest write access ...
2017-09-08 | Merge tag 'linux-kselftest-4.14-rc1-update' of ↵ | Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest updates from Shuah Khan: - TAP13 framework API and converting tests to TAP13 continues. A few more tests are converted and kselftest common RUN_TESTS in lib.mk is enhanced to print TAP13 to cover test shell scripts that won't be able to use kselftest API. - Several fixes to existing tests to not fail in unsupported cases. This has been an ongoing work based on the feedback from stable release kselftest users. - A new watchdog test and much needed cleanups to the existing tests from Eugeniu Rosca. - Changes to kselftest common lib.mk framework to make RUN_TESTS a function to be called from individual test make files to run stress and destructive sub-tests. * tag 'linux-kselftest-4.14-rc1-update' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (41 commits) selftests: Enhance kselftest_harness.h to print which assert failed selftests: lib.mk: change RUN_TESTS to print messages in TAP13 format selftests: change lib.mk RUN_TESTS to take test list as an argument selftests: lib.mk: suppress "cd" output from run_tests target selftests: kselftest framework: change skip exit code to 0 selftests/timers: make loop consistent with array size selftests: timers: remove rtctest_setdate from run_destructive_tests selftests: timers: Fix run_destructive_tests target to handle skipped tests kselftests: timers: leap-a-day: Change default arguments to help test runs selftests: timers: drop support for !KTEST case rtc: rtctest: Improve support detection selftests/cpu-hotplug: Skip test when there is only one online cpu selftests/cpu-hotplug: exit with failure when test occured unexpected behaviors selftests: futex: convert test to use ksft TAP13 framework selftests: capabilities: convert error output to TAP13 ksft framework selftests: memfd: Align STACK_SIZE for ARM AArch64 system selftests: warn if failure is due to lack of executable bit selftests: kselftest framework: add error 
counter selftests: capabilities: convert the test to use TAP13 ksft framework selftests: capabilities: fix to run Non-root +ia, sgidroot => i test ...
2017-09-08 | Merge tag 'trace-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace | Linus Torvalds
Pull tracing updates from Steven Rostedt:
 "Nothing new in development for this release. These are mostly fixes that were found during development of changes for the next merge window and fixes that were sent to me late in the last cycle"

* tag 'trace-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Apply trace_clock changes to instance max buffer
  tracing: Fix clear of RECORDED_TGID flag when disabling trace event
  tracing: Add barrier to trace_printk() buffer nesting modification
  ftrace: Fix memleak when unregistering dynamic ops when tracing disabled
  ftrace: Fix selftest goto location on error
  ftrace: Zero out ftrace hashes when a module is removed
  tracing: Only have rmmod clear buffers that its events were active in
  ftrace: Fix debug preempt config name in stack_tracer_{en,dis}able
2017-09-08 | genksyms: fix gperf removal conversion | Linus Torvalds
I had stupidly missed one special use of 'is_reserved_word()' when I converted the code to avoid gperf. I had changed that function to return the token ID directly rather than a pointer to the token descriptor structure, but that meant that the test for "is this a reserved word" changed from checking the return value against NULL, to checking that it wasn't negative. And while I had converted the main token parser over, I missed the special case of the typeof phrase handling. And since our dependency chain for genksyms does not include the genksyms program itself changing, my kernel rebuild didn't show the problem. Fixes: bb3290d91695 ("Remove gperf usage from toolchain") Reported-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
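The pitfall described here (a lookup that used to return a pointer, tested against NULL, now returning a token ID, tested for non-negative) can be shown with a toy lookup. Everything in this sketch is hypothetical: the word table, the function names, and the use of array indexes as token IDs are illustrative only and are not the genksyms sources.

```c
#include <string.h>

/* Hypothetical reserved-word table; token IDs are just array indexes here. */
static const char *const toy_words[] = { "auto", "typeof", "void" };

/* Returns the token ID (which may be 0), or -1 if `s` is not reserved. */
static int toy_is_reserved_word(const char *s)
{
    int n = (int)(sizeof(toy_words) / sizeof(toy_words[0]));

    for (int i = 0; i < n; i++)
        if (strcmp(s, toy_words[i]) == 0)
            return i;
    return -1;
}

/* Converted caller: reserved if and only if the result is non-negative. */
static int toy_reserved_ok(const char *s)
{
    return toy_is_reserved_word(s) >= 0;
}

/* Unconverted caller, still written for the pointer API: nonzero means found. */
static int toy_reserved_stale(const char *s)
{
    return toy_is_reserved_word(s) != 0;
}
```

The stale test goes wrong in both directions in this model: `-1` (not found) counts as found, and an ID of `0` counts as missing.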
2017-09-08 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf | David S. Miller
Pablo Neira Ayuso says:

====================
Netfilter/IPVS fixes for net

The following patchset contains Netfilter/IPVS fixes for your net tree, they are:

1) Fix SCTP connection setup when IPVS module is loaded and any scheduler is registered, from Xin Long.

2) Don't create a SCTP connection from SCTP ABORT packets, also from Xin Long.

3) WARN_ON() and drop packet, instead of BUG_ON() races when calling nf_nat_setup_info(). This is specifically a longstanding problem when br_netfilter with conntrack support is in place, patch from Florian Westphal.

4) Avoid softlock splats via iptables-restore, also from Florian.

5) Revert NAT hashtable conversion to rhashtable, semantics of rhlist are different from our simple NAT hashtable, this has been causing problems in the recent Linux kernel releases. From Florian.

6) Add per-bucket spinlock for NAT hashtable, so at least we restore one of the benefits we got from the previous rhashtable conversion.

7) Fix incorrect hashtable size in memory allocation in xt_hashlimit, from Zhizhou Tian.

8) Fix build/link problems with hashlimit and 32-bit arches, to address recent fallout from a new hashlimit mode, from Vishwanath Pai.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-08 | RDMA/netlink: clean up message validity array initializer | Linus Torvalds
The fix in the parent made me look at that function, and react to how illogical and illegible the array initializer was. Use named array indexes to make it clearer what is going on, and make the initializer not depend silently on the exact index numbers. [ The initializer now also shows an odd inconsistency in the naming: note the IWCM vs IWPM.. - Linus ] Cc: Leon Romanovsky <leonro@mellanox.com> Cc: Doug Ledford <dledford@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
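"Named array indexes" here means C designated initializers. The sketch below shows the idea with made-up names and values: `enum toy_client`, `toy_max_num_ops`, and the op counts are all hypothetical and are not the RDMA netlink definitions.

```c
/* Hypothetical client IDs; slot 0 deliberately has no client behind it. */
enum toy_client {
    TOY_CLIENT_NONE = 0,
    TOY_CLIENT_CM,
    TOY_CLIENT_LS,
    TOY_NUM_CLIENTS
};

/*
 * Each entry names its own index, so reordering the enum cannot silently
 * shift the values. Unnamed slots (here slot 0) are zero-initialized.
 */
static const unsigned int toy_max_num_ops[TOY_NUM_CLIENTS] = {
    [TOY_CLIENT_CM] = 4,
    [TOY_CLIENT_LS] = 3,
};

/* A message is valid only for a known client and an op below its limit. */
static int toy_msg_is_valid(unsigned int client, unsigned int op)
{
    if (client >= TOY_NUM_CLIENTS)
        return 0;
    return op < toy_max_num_ops[client];
}
```

Because slot 0 defaults to zero ops, a message with type 0 is rejected without any index arithmetic, which is also the situation the parent fix is concerned with.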
2017-09-08 | Merge tag 'wireless-drivers-for-davem-2017-09-08' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers | David S. Miller
Kalle Valo says:

====================
wireless-drivers fixes for 4.14

Few fixes to regressions introduced in the last one or two releases. The iwlwifi fix is for a regression reported by Linus.

rtlwifi

* fix two antenna selection related bugs

iwlwifi

* fix regression with older firmwares

brcmfmac

* workaround firmware crash for bcm4345
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-08 | sctp: fix missing wake ups in some situations | Marcelo Ricardo Leitner
Commit fb586f25300f ("sctp: delay calls to sk_data_ready() as much as possible") minimized the number of wake ups that are triggered in case the association receives a packet with multiple data chunks on it and/or when io_events are enabled and then commit 0970f5b36659 ("sctp: signal sk_data_ready earlier on data chunks reception") moved the wake up to as soon as possible. It thus relies on the state machine running later to clean the flag that the event was already generated. The issue is that there are two call paths that call sctp_ulpq_tail_event() outside of the state machine, causing the flag to linger and possibly omitting a needed wake up in the sequence. One of the call paths is when enabling SCTP_SENDER_DRY_EVENTS via setsockopt(SCTP_EVENTS), as noticed by Harald Welte. The other is when partial reliability triggers removal of chunks from the send queue when the application calls sendmsg(). This commit fixes it by not setting the flag in case the socket is not owned by the user, as it won't be cleaned later. This works for user-initiated calls and also for rx path processing. Fixes: fb586f25300f ("sctp: delay calls to sk_data_ready() as much as possible") Reported-by: Harald Welte <laforge@gnumonks.org> Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-09-08RDAM/netlink: Fix out-of-bound access while checking message validityLeon Romanovsky
A netlink message sent with type == 0, which doesn't have any client behind it, caused an out-of-bounds access in the max_num_ops array. Fix it by declaring zero number of ops for the first client. Fixes: c9901724a2f1 ("RDMA/netlink: Remove netlink clients infrastructure") Signed-off-by: Leon Romanovsky <leon@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-09-08netfilter: xt_hashlimit: fix build error caused by 64bit divisionVishwanath Pai
64bit division causes build/link errors on 32bit architectures. It prints out error messages like: ERROR: "__aeabi_uldivmod" [net/netfilter/xt_hashlimit.ko] undefined! The value of avg passed through by userspace in BYTE mode cannot exceed U32_MAX. Which means 64bit division in user2rate_bytes is unnecessary. To fix this I have changed the type of param 'user' to u32. Since anything greater than U32_MAX is an invalid input we error out in hashlimit_mt_check_common() when this is the case. Changes in v2: Making return type as u32 would cause an overflow for small values of 'user' (for example 2, 3 etc). To avoid this I bumped up 'r' to u64 again as well as the return type. This is OK since the variable that stores the result is u64. We still avoid 64bit division here since 'user' is u32. Fixes: bea74641e378 ("netfilter: xt_hashlimit: add rate match mode") Signed-off-by: Vishwanath Pai <vpai@akamai.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
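The arithmetic described in the v2 note can be sketched as below. This is a standalone illustration modeled on the description, not the kernel function itself: the name sketch_user2rate_bytes and the shift width SKETCH_BYTE_SHIFT are assumptions made for the example.

```c
#include <stdint.h>

/* Assumed shift width for this sketch; the real constant lives in the kernel source. */
#define SKETCH_BYTE_SHIFT 4

static uint64_t sketch_user2rate_bytes(uint32_t user)
{
	/* Both operands are 32 bits wide, so this is a plain 32-bit division. */
	uint64_t r = user ? UINT32_MAX / user : UINT32_MAX;

	/* The shift is done in 64 bits, so small 'user' values do not overflow. */
	return (r - 1) << SKETCH_BYTE_SHIFT;
}
```

With a 32-bit return type, a small divisor such as 2 would produce a shifted value that no longer fits, which is the overflow the v2 change avoids by widening only the result.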
2017-09-08netfilter: xt_hashlimit: alloc hashtable with right sizeZhizhou Tian
struct xt_byteslimit_htable used hlist_head, but memory allocation is done through sizeof(struct list_head). Signed-off-by: Zhizhou Tian <zhizhou.tian@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
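The size mismatch can be seen with stand-in types of the same shape as the two kinds of list head. The struct and function names below are hypothetical and exist only for this sketch; the sizes assume a typical platform where all object pointers have the same width.

```c
#include <stddef.h>

/* Stand-in for a doubly linked list head: two pointers per head. */
struct sketch_list_head {
	struct sketch_list_head *next, *prev;
};

struct sketch_hlist_node;

/* Stand-in for a hash list head: one pointer per head. */
struct sketch_hlist_head {
	struct sketch_hlist_node *first;
};

/* Bytes needed for a table of 'buckets' hash list heads. */
static size_t sketch_table_bytes(size_t buckets)
{
	return buckets * sizeof(struct sketch_hlist_head);
}
```

Sizing the table with the two-pointer type would request twice the memory that an array of one-pointer heads needs.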
2017-09-08netfilter: core: remove erroneous warn_onFlorian Westphal
kernel test robot reported: WARNING: CPU: 0 PID: 1244 at net/netfilter/core.c:218 __nf_hook_entries_try_shrink+0x49/0xcd [..] After allowing batching in nf_unregister_net_hooks it's possible that an earlier call to __nf_hook_entries_try_shrink already compacted the list. If this happens we don't need to do anything. Fixes: d3ad2c17b4047 ("netfilter: core: batch nf_unregister_net_hooks synchronize_net calls") Reported-by: kernel test robot <xiaolong.ye@intel.com> Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Aaron Conole <aconole@bytheb.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-09-08netfilter: nat: use keyed locksFlorian Westphal
no need to serialize on a single lock, we can partition the table and add/delete in parallel to different slots. This restores one of the advantages that got lost with the rhlist revert. Cc: Ivan Babrou <ibobrik@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-09-08netfilter: nat: Revert "netfilter: nat: convert nat bysrc hash to rhashtable"Florian Westphal
This reverts commit 870190a9ec9075205c0fa795a09fa931694a3ff1. It was not a good idea. The custom hash table was a much better fit for this purpose. A fast lookup is not essential, in fact for most cases there is no lookup at all because original tuple is not taken and can be used as-is. What needs to be fast is insertion and deletion. rhlist removal however requires a rhlist walk. We can have thousands of entries in such a list if source port/addresses are reused for multiple flows, if this happens removal requests are so expensive that deletions of a few thousand flows can take several seconds(!). The advantages that we got from rhashtable are: 1) table auto-sizing 2) multiple locks 1) would be nice to have, but it is not essential as we have at most one lookup per new flow, so even a million flows in the bysource table are not a problem compared to current deletion cost. 2) is easy to add to custom hash table. I tried to add hlist_node to rhlist to speed up rhltable_remove but this isn't doable without changing semantics. rhltable_remove_fast will check that the to-be-deleted object is part of the table and that requires a list walk that we want to avoid. Furthermore, using hlist_node increases size of struct rhlist_head, which in turn increases nf_conn size. Link: https://bugzilla.kernel.org/show_bug.cgi?id=196821 Reported-by: Ivan Babrou <ibobrik@gmail.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-09-08netfilter: xtables: add scheduling opportunity in get_countersFlorian Westphal
There are reports about spurious softlockups during iptables-restore, a backtrace I saw points at get_counters -- it uses a sequence lock and also has an unbounded restart loop. Signed-off-by: Florian Westphal <fw@strlen.de> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-09-08netfilter: nf_nat: don't bug when mapping already existsFlorian Westphal
It seems preferable to limp along if we have a conflicting mapping, it's certainly better than a BUG(). Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-09-08ipv6: fix memory leak with multiple tables during netns destructionSabrina Dubroca
fib6_net_exit only frees the main and local tables. If another table was created with fib6_alloc_table, we leak it when the netns is destroyed. Fix this in the same way ip_fib_net_exit cleans up tables, by walking through the whole hashtable of fib6_table's. We can get rid of the special cases for local and main, since they're also part of the hashtable. Reproducer: ip netns add x ip -net x -6 rule add from 6003:1::/64 table 100 ip netns del x Reported-by: Jianlin Shi <jishi@redhat.com> Fixes: 58f09b78b730 ("[NETNS][IPV6] ip6_fib - make it per network namespace") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: David S. Miller <davem@davemloft.net>
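The cleanup strategy described above, walking every bucket instead of freeing two known tables, can be sketched in userspace C. All names here (sketch_net, sketch_table, sketch_free_all) are hypothetical and the structures are heavily simplified; this is not the fib6 code.

```c
#include <stdlib.h>

#define SKETCH_BUCKETS 4

struct sketch_table {
	unsigned int id;
	struct sketch_table *next;	/* chain within one bucket */
};

struct sketch_net {
	struct sketch_table *buckets[SKETCH_BUCKETS];
};

/* Insert a table; returns 0 on success, -1 on allocation failure. */
static int sketch_add_table(struct sketch_net *net, unsigned int id)
{
	struct sketch_table *t = malloc(sizeof(*t));

	if (!t)
		return -1;
	t->id = id;
	t->next = net->buckets[id % SKETCH_BUCKETS];
	net->buckets[id % SKETCH_BUCKETS] = t;
	return 0;
}

/* Free every table in every bucket; returns how many were freed. */
static unsigned int sketch_free_all(struct sketch_net *net)
{
	unsigned int freed = 0;

	for (int i = 0; i < SKETCH_BUCKETS; i++) {
		struct sketch_table *t = net->buckets[i];

		while (t) {
			struct sketch_table *next = t->next;	/* save before free */

			free(t);
			freed++;
			t = next;
		}
		net->buckets[i] = NULL;
	}
	return freed;
}
```

Since every table lives in some bucket, a single walk covers tables created later by a rule as well as the default ones, with no special cases.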
2017-09-08kokr/memory-barriers.txt: Apply atomic_t.txt changeSeongJae Park
This commit applies memory-barriers.txt part of upstream change, commit 706eeb3e9c6f ("Documentation/locking/atomic: Add documents for new atomic_t APIs") to Korean translation. Signed-off-by: SeongJae Park <sj38.park@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-09-08kokr/doc: Update memory-barriers.txt for read-to-write dependenciesSeongJae Park
This commit applies upstream change, commit 66ce3a4dcb9f ("doc: Update memory-barriers.txt for read-to-write dependencies") to Korean translation. Signed-off-by: SeongJae Park <sj38.park@gmail.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-09-08docs-rst: don't require adjustbox anymoreMauro Carvalho Chehab
Only the media PDF book was requiring adjustbox, in order to scale big tables. That worked pretty well with Sphinx versions 1.4 and 1.5, but Sphinx 1.6 changed the way tables are produced, by introducing some weird macros before tabulary. That causes adjustbox to fail. So, it can't be used anymore, and its usage was removed from the media book. So, let's remove it from conf.py and sphinx-pre-install. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-09-08docs-rst: conf.py: only setup notice box colors if Sphinx < 1.6Mauro Carvalho Chehab
Sphinx 1.5 added a new way to change background colors for note boxes, but kept backward compatibility with 1.4. On Sphinx 1.6, the old way stopped working, in favor of a new less hackish way. Unfortunately, this is currently too buggy to be used, and the old way doesn't work anymore. So, we have no option but to stick with boring notice boxes. One example of such a bug is the notice inside struct v4l2_plane, at the "bytesused" field. At least, add a notice about how to use it, as maybe some day the bug will vanish. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-09-08docs-rst: conf.py: remove lscape from LaTeX preambleMauro Carvalho Chehab
Only the media book used this extension in the past, but it is not required anymore. Cleanup patch only. Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com> Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2017-09-08s390/dasd: blk-mq conversionStefan Haberland
Use new blk-mq interfaces. Use multiple queues and also use the block layer complete helper that finishes the I/O on the CPU that initiated it. Reviewed-by: Jan Hoeppner <hoeppner@linux.vnet.ibm.com> Signed-off-by: Stefan Haberland <sth@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-08Merge branch 'kvm-ppc-fixes' of ↵Radim Krčmář
git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc This fix was intended for 4.13, but didn't get in because both maintainers were on vacation. Paul Mackerras: "It adds mutual exclusion between list_add_rcu and list_del_rcu calls on the kvm->arch.spapr_tce_tables list. Without this, userspace could potentially trigger corruption of the list and cause a host crash or worse."
2017-09-08netfilter: ipvs: do not create conn for ABORT packet in sctp_conn_scheduleXin Long
There's no reason for ipvs to create a conn for an ABORT packet even if sysctl_sloppy_sctp is set. This patch is to accept it without creating a conn, just as ipvs does for tcp's RST packet. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-09-08netfilter: ipvs: fix the issue that sctp_conn_schedule drops non-INIT packetXin Long
Commit 5e26b1b3abce ("ipvs: support scheduling inverse and icmp SCTP packets") changed to check packet type early. It introduced a side effect: if it's not an INIT packet, ports will be set as NULL, and the packet will be dropped later. As a result, sctp couldn't create a connection when the ipvs module is loaded and any scheduler is registered on the server. Li Shuang reproduced it by running the cmds on sctp server: # ipvsadm -A -t 1.1.1.1:80 -s rr # ipvsadm -D -t 1.1.1.1:80 then the server couldn't work any more. This patch is to return 1 when it's not an INIT packet. It means ipvs will accept it without creating a conn for it, just like what it does for tcp. Fixes: 5e26b1b3abce ("ipvs: support scheduling inverse and icmp SCTP packets") Reported-by: Li Shuang <shuali@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-09-08brcmfmac: feature check for multi-scheduled scan fails on bcm4345 devicesIan W MORRISON
The firmware feature check introduced for multi-scheduled scan is also failing for bcm4345 devices resulting in a firmware crash. The reason for this crash has not yet been root-caused so this patch avoids the feature check for those devices as a short-term fix. Fixes: 9fe929aaace6 ("brcmfmac: add firmware feature detection for gscan feature") Cc: <stable@vger.kernel.org> # v4.13 Signed-off-by: Ian W MORRISON <ianwmorrison@gmail.com> Acked-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
2017-09-07Merge branch 'gperf-removal'Linus Torvalds
Remove our use of 'gperf' for generating perfect hashes from some of our build tools. This removal was prompted by Masahiro Yamada sending out a patch that removes all our pre-generated files, and when I tested it, I noticed that the gperf version I have (3.1) apparently generates code that no longer works with our code-base because the function interfaces generated by gperf have changed. We really don't care that much, and the gperf people changed their interfaces in ways that makes it annoying to work with them. Tools that make it hard to use them should not be used, and the kernel is not at all interested in some autoconf mess. So remove the gperf dependency entirely. It turns out that if you ignore the pre-generated files, the use of gperf apparently saved us a whopping fifteen lines of code. It obviously wasn't worth it, considering that the pre-generated files are about 500 lines. I sent this out as a patch about three weeks ago, and got absolutely zero responses. So let's see if anybody notices now that I merge it. Because there might be serious bugs here, but it WorksForMe(tm). * gperf-removal: Remove gperf usage from toolchain
2017-09-07Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsiLinus Torvalds
Pull SCSI updates from James Bottomley: "This is mostly updates of the usual suspects: lpfc, qla2xxx, hisi_sas, megaraid_sas, zfcp and a host of minor updates. The major driver change here is the elimination of the block based cciss driver in favour of the SCSI based hpsa driver (which now drives all the legacy cases cciss used to be required for). Plus a reset handler clean up and the redo of the SAS SMP handler to use bsg lib" * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (279 commits) scsi: scsi-mq: Always unprepare before requeuing a request scsi: Show .retries and .jiffies_at_alloc in debugfs scsi: Improve requeuing behavior scsi: Call scsi_initialize_rq() for filesystem requests scsi: qla2xxx: Reset the logo flag, after target re-login. scsi: qla2xxx: Fix slow mem alloc behind lock scsi: qla2xxx: Clear fc4f_nvme flag scsi: qla2xxx: add missing includes for qla_isr scsi: qla2xxx: Fix an integer overflow in sysfs code scsi: aacraid: report -ENOMEM to upper layer from aac_convert_sgraw2() scsi: aacraid: get rid of one level of indentation scsi: aacraid: fix indentation errors scsi: storvsc: fix memory leak on ring buffer busy scsi: scsi_transport_sas: switch to bsg-lib for SMP passthrough scsi: smartpqi: remove the smp_handler stub scsi: hpsa: remove the smp_handler stub scsi: bsg-lib: pass the release callback through bsg_setup_queue scsi: Rework handling of scsi_device.vpd_pg8[03] scsi: Rework the code for caching Vital Product Data (VPD) scsi: rcu: Introduce rcu_swap_protected() ...
2017-09-07Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk Pull printk updates from Petr Mladek: - Do not allow use of freed init data and code even when boot consoles are forced to stay. Also check for the init memory more precisely. - Some code clean up by starting contributors. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk: printk: Clean up do_syslog() error handling printk/console: Enhance the check for consoles using init memory printk/console: Always disable boot consoles that use init memory before it is freed printk: Modify operators of printed_len and text_len
2017-09-07f2fs: avoid race in between read xattr & write xattrYunlei He
Thread A:                               Thread B:
-f2fs_getxattr
  -lookup_all_xattrs
    -xnid = F2FS_I(inode)->i_xattr_nid;
                                        -f2fs_setxattr
                                          -__f2fs_setxattr
                                            -write_all_xattrs
                                              -truncate_xattr_node
                    ...  ...
                                        -write_checkpoint
                    ...  ...
                                        -alloc_nid   <- nid reuse
    -get_node_page
      -f2fs_bug_on   <- nid != node_footer->nid
A rw_sem is needed to avoid the race.
Signed-off-by: Yunlei He <heyunlei@huawei.com> Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-09-07f2fs: make get_lock_data_page to handle encrypted inodeJaegeuk Kim
This patch refactors get_lock_data_page() to handle encryption case directly. In order to do that, it introduces common f2fs_submit_page_read(). Reviewed-by: Chao Yu <yuchao0@huawei.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2017-09-07Merge tag 'audit-pr-20170907' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "A small pull request for audit this time, only four patches and only two with any real code changes. Those two changes are the removal of a pointless SELinux AVC initialization audit event and a fix to improve the audit timestamp overhead. The other two patches are comment cleanup and administrative updates, nothing very exciting. Everything passes our tests" * tag 'audit-pr-20170907' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: update the function comments selinux: remove AVC init audit log message audit: update the audit info in MAINTAINERS audit: Reduce overhead using a coarse clock
2017-09-07Merge tag 'secureexec-v4.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull secureexec update from Kees Cook: "This series has the ultimate goal of providing a sane stack rlimit when running set*id processes. To do this, the bprm_secureexec LSM hook is collapsed into the bprm_set_creds hook so the secureexec-ness of an exec can be determined early enough to make decisions about rlimits and the resulting memory layouts. Other logic acting on the secureexec-ness of an exec is similarly consolidated. Capabilities needed some special handling, but the refactoring removed other special handling, so that was a wash" * tag 'secureexec-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: exec: Consolidate pdeath_signal clearing exec: Use sane stack rlimit under secureexec exec: Consolidate dumpability logic smack: Remove redundant pdeath_signal clearing exec: Use secureexec for clearing pdeath_signal exec: Use secureexec for setting dumpability LSM: drop bprm_secureexec hook commoncap: Move cap_elevated calculation into bprm_set_creds commoncap: Refactor to remove bprm_secureexec hook smack: Refactor to remove bprm_secureexec hook selinux: Refactor to remove bprm_secureexec hook apparmor: Refactor to remove bprm_secureexec hook binfmt: Introduce secureexec flag exec: Correct comments about "point of no return" exec: Rename bprm->cred_prepared to called_set_creds
2017-09-07Merge tag 'gcc-plugins-v4.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull gcc plugins update from Kees Cook: "This finishes the porting work on randstruct, and introduces a new option to structleak, both noted below: - For the randstruct plugin, enable automatic randomization of structures that are entirely function pointers (along with a couple designated initializer fixes). - For the structleak plugin, provide an option to perform zeroing initialization of all otherwise uninitialized stack variables that are passed by reference (Ard Biesheuvel)" * tag 'gcc-plugins-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: gcc-plugins: structleak: add option to init all vars used as byref args randstruct: Enable function pointer struct detection drivers/net/wan/z85230.c: Use designated initializers drm/amd/powerplay: rv: Use designated initializers
2017-09-08Merge branches 'thermal-core', 'thermal-soc', 'thermal-intel' and ↵Zhang Rui
'const-thermal-zone-structure' into next
2017-09-08Merge branches 'mediatek-mt2712', 'rockchip-rk3328' and 'uniphier-thermal' ↵Zhang Rui
into thermal-soc
2017-09-07rds: Fix incorrect statistics countingHåkon Bugge
In rds_send_xmit() there is logic to batch the sends. However, if another thread has acquired the lock and has incremented the send_gen, it is considered a race and we yield. The code incrementing the s_send_lock_queue_raced statistics counter did not count this event correctly. This commit counts the race condition correctly. Changes from v1: - Removed check for *someone_on_xmit()* - Fixed incorrect indentation Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Knut Omang <knut.omang@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>