summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-08-17Merge tag 'drm-next-2018-08-17-1' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm msm support for adreno a6xx from Dave Airlie: "This is the support for new Qualcomm Snapdragon SoCs with the A6xx core. Userspace support is in mesa now" * tag 'drm-next-2018-08-17-1' of git://anongit.freedesktop.org/drm/drm: drm/msm: a6xx: fix spelling mistake: "initalization" -> "initialization" drm/msm: Add A6XX device support drm/msm: update generated headers drm/msm/adreno: Load the firmware before bringing up the hardware drm/msm: Add a helper function to parse clock names
2018-08-17Merge tag 'drm-next-2018-08-17' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull drm fixes from Dave Airlie: "First round of fixes for -rc1. I'll follow this up with the msm new hw support pull request. This just has three sets of fixes, some for msm before the new hw, a bunch of AMD fixes (includiing some required firmware changes for new hw), and a set of i915 (+gvt) fixes" * tag 'drm-next-2018-08-17' of git://anongit.freedesktop.org/drm/drm: (30 commits) drm/amdgpu: Use kvmalloc for allocating UVD/VCE/VCN BO backup memory drm/i915: set DP Main Stream Attribute for color range on DDI platforms drm/i915/selftests: Hold rpm for unparking drm/i915: Restore user forcewake domains across suspend drm/i915: Unmask user interrupts writes into HWSP on snb/ivb/vlv/hsw drm/i915/gvt: fix memory leak in intel_vgpu_ioctl() drm/i915/gvt: Off by one in intel_vgpu_write_fence() drm/i915/kvmgt: Fix potential Spectre v1 drm/i915/gvt: return error on cmd access drm/i915/gvt: initialize dmabuf mutex in vgpu_create drm/i915/gvt: fix cleanup sequence in intel_gvt_clean_device drm/amd/display: Guard against null crtc in CRC IRQ drm/amd/display: Pass connector id when executing VBIOS CT drm/amd/display: Check if clock source in use before disabling drm/amd/display: Allow clock sharing b/w HDMI and DVI drm/amd/display: Fix warning observed in mode change on Vega drm/amd/display: fix single link DVI has no display drm/amdgpu/vce: VCE entity initialization relies on ring initializtion drm/amdgpu/uvd: UVD entity initialization relys on ring initialization drm/amdgpu:add VCN booting with firmware loaded by PSP ...
2018-08-17Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "A couple of arm64 fixes - Fix boot on Hikey-960 by avoiding an IPI with interrupts disabled - Fix address truncation in pfn_valid() implementation" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: mm: check for upper PAGE_SHIFT bits in pfn_valid() arm64: Avoid calling stop_machine() when patching jump labels
2018-08-17Merge tag 'powerpc-4.19-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Notable changes: - A fix for a bug in our page table fragment allocator, where a page table page could be freed and reallocated for something else while still in use, leading to memory corruption etc. The fix reuses pt_mm in struct page (x86 only) for a powerpc only refcount. - Fixes to our pkey support. Several are user-visible changes, but bring us in to line with x86 behaviour and/or fix outright bugs. Thanks to Florian Weimer for reporting many of these. - A series to improve the hvc driver & related OPAL console code, which have been seen to cause hardlockups at times. The hvc driver changes in particular have been in linux-next for ~month. - Increase our MAX_PHYSMEM_BITS to 128TB when SPARSEMEM_VMEMMAP=y. - Remove Power8 DD1 and Power9 DD1 support, neither chip should be in use anywhere other than as a paper weight. - An optimised memcmp implementation using Power7-or-later VMX instructions - Support for barrier_nospec on some NXP CPUs. - Support for flushing the count cache on context switch on some IBM CPUs (controlled by firmware), as a Spectre v2 mitigation. - A series to enhance the information we print on unhandled signals to bring it into line with other arches, including showing the offending VMA and dumping the instructions around the fault. Thanks to: Aaro Koskinen, Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Alexey Spirkov, Alistair Popple, Andrew Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Arnd Bergmann, Bartosz Golaszewski, Benjamin Herrenschmidt, Bharat Bhushan, Bjoern Noetel, Boqun Feng, Breno Leitao, Bryant G. Ly, Camelia Groza, Christophe Leroy, Christoph Hellwig, Cyril Bur, Dan Carpenter, Daniel Klamt, Darren Stevens, Dave Young, David Gibson, Diana Craciun, Finn Thain, Florian Weimer, Frederic Barrat, Gautham R. Shenoy, Geert Uytterhoeven, Geoff Levand, Guenter Roeck, Gustavo Romero, Haren Myneni, Hari Bathini, Joel Stanley, Jonathan Neuschäfer, Kees Cook, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Mathieu Malaterre, Mauro S. M. Rodrigues, Michael Hanselmann, Michael Neuling, Michael Schmitz, Mukesh Ojha, Murilo Opsfelder Araujo, Nicholas Piggin, Parth Y Shah, Paul Mackerras, Paul Menzel, Ram Pai, Randy Dunlap, Rashmica Gupta, Reza Arbab, Rodrigo R. Galvao, Russell Currey, Sam Bobroff, Scott Wood, Shilpasri G Bhat, Simon Guo, Souptick Joarder, Stan Johnson, Thiago Jung Bauermann, Tyrel Datwyler, Vaibhav Jain, Vasant Hegde, Venkat Rao, zhong jiang" * tag 'powerpc-4.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (234 commits) powerpc/mm/book3s/radix: Add mapping statistics powerpc/uaccess: Enable get_user(u64, *p) on 32-bit powerpc/mm/hash: Remove unnecessary do { } while(0) loop powerpc/64s: move machine check SLB flushing to mm/slb.c powerpc/powernv/idle: Fix build error powerpc/mm/tlbflush: update the mmu_gather page size while iterating address range powerpc/mm: remove warning about ‘type’ being set powerpc/32: Include setup.h header file to fix warnings powerpc: Move `path` variable inside DEBUG_PROM powerpc/powermac: Make some functions static powerpc/powermac: Remove variable x that's never read cxl: remove a dead branch powerpc/powermac: Add missing include of header pmac.h powerpc/kexec: Use common error handling code in setup_new_fdt() powerpc/xmon: Add address lookup for percpu symbols powerpc/mm: remove huge_pte_offset_and_shift() prototype powerpc/lib: Use patch_site to patch copy_32 functions once cache is enabled powerpc/pseries: Fix endianness while restoring of r3 in MCE handler. powerpc/fadump: merge adjacent memory ranges to reduce PT_LOAD segements powerpc/fadump: handle crash memory ranges array index overflow ...
2018-08-17Merge tag 'modules-for-v4.19' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux Pull modules updates from Jessica Yu: "Summary of modules changes for the 4.19 merge window: - Fix modules kallsyms for livepatch. Livepatch modules can have SHN_UNDEF symbols in their module symbol tables for later symbol resolution, but kallsyms shouldn't be returning these symbols - Some code cleanups and minor reshuffling in load_module() were done to log the module name when module signature verification fails" * tag 'modules-for-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/jeyu/linux: kernel/module: Use kmemdup to replace kmalloc+memcpy ARM: module: fix modsign build error modsign: log module name in the event of an error module: replace VMLINUX_SYMBOL_STR() with __stringify() or string literal module: print sensible error code module: setup load info before module_sig_check() module: make it clear when we're handling the module copy in info->hdr module: exclude SHN_UNDEF symbols from kallsyms api
2018-08-17Merge tag 'vla-leftovers-v4.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull VLA removal leftovers from Kees Cook: - bus/imx-weim: Use maximum register count to avoid VLA - drm/i2c/tda9950: Use maximum CEC message size to avoid VLA * tag 'vla-leftovers-v4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: bus: imx-weim: Remove VLA usage drm/i2c: tda9950: Remove VLA usage
2018-08-17x86/speculation/l1tf: Exempt zeroed PTEs from inversionSean Christopherson
It turns out that we should *not* invert all not-present mappings, because the all zeroes case is obviously special. clear_page() does not undergo the XOR logic to invert the address bits, i.e. PTE, PMD and PUD entries that have not been individually written will have val=0 and so will trigger __pte_needs_invert(). As a result, {pte,pmd,pud}_pfn() will return the wrong PFN value, i.e. all ones (adjusted by the max PFN mask) instead of zero. A zeroed entry is ok because the page at physical address 0 is reserved early in boot specifically to mitigate L1TF, so explicitly exempt them from the inversion when reading the PFN. Manifested as an unexpected mprotect(..., PROT_NONE) failure when called on a VMA that has VM_PFNMAP and was mmap'd to as something other than PROT_NONE but never used. mprotect() sends the PROT_NONE request down prot_none_walk(), which walks the PTEs to check the PFNs. prot_none_pte_entry() gets the bogus PFN from pte_pfn() and returns -EACCES because it thinks mprotect() is trying to adjust a high MMIO address. [ This is a very modified version of Sean's original patch, but all credit goes to Sean for doing this and also pointing out that sometimes the __pte_needs_invert() function only gets the protection bits, not the full eventual pte. But zero remains special even in just protection bits, so that's ok. - Linus ] Fixes: f22cc87f6c1f ("x86/speculation/l1tf: Invert all not present mappings") Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-08-17Merge tag 'for-4.19/dm-changes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper updates from Mike Snitzer: - A couple stable fixes for the DM writecache target. - A stable fix for the DM cache target that fixes the potential for data corruption after an unclean shutdown of a cache device using writeback mode. - Update DM integrity target to allow the metadata to be stored on a separate device from data. - Fix DM kcopyd and the snapshot target to cond_resched() where appropriate and be more efficient with processing completed work. - A few fixes and improvements for DM crypt. - Add DM delay target feature to configure delay of flushes independent of writes. - Update DM thin-provisioning target to include metadata_low_watermark threshold in pool status. - Fix stale DM thin-provisioning Documentation. * tag 'for-4.19/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (26 commits) dm writecache: fix a crash due to reading past end of dirty_bitmap dm crypt: don't decrease device limits dm cache metadata: set dirty on all cache blocks after a crash dm snapshot: remove stale FIXME in snapshot_map() dm snapshot: improve performance by switching out_of_order_list to rbtree dm kcopyd: avoid softlockup in run_complete_job dm cache metadata: save in-core policy_hint_size to on-disk superblock dm thin: stop no_space_timeout worker when switching to write-mode dm kcopyd: return void from dm_kcopyd_copy() dm thin: include metadata_low_watermark threshold in pool status dm writecache: report start_sector in status line dm crypt: convert essiv from ahash to shash dm crypt: use wake_up_process() instead of a wait queue dm integrity: recalculate checksums on creation dm integrity: flush journal on suspend when using separate metadata device dm integrity: use version 2 for separate metadata dm integrity: allow separate metadata device dm integrity: add ic->start in get_data_sector() dm integrity: report provided data sectors in the status dm integrity: implement fair range locks ...
2018-08-17Merge tag 'fsnotify_for_v4.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify updates from Jan Kara: "fsnotify cleanups from Amir and a small inotify improvement" * tag 'fsnotify_for_v4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: inotify: Add flag IN_MASK_CREATE for inotify_add_watch() fanotify: factor out helpers to add/remove mark fsnotify: add helper to get mask from connector fsnotify: let connector point to an abstract object fsnotify: pass connp and object type to fsnotify_add_mark() fsnotify: use typedef fsnotify_connp_t for brevity
2018-08-17Merge tag 'for_v4.19-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull UDF and ext2 update from Jan Kara. * tag 'for_v4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: ext2: use ktime_get_real_seconds for timestamps udf: convert inode stamps to timespec64
2018-08-17EDAC: Add missing MEM_LRDDR4 entry in edac_mem_types[]Takashi Iwai
The edac_mem_types[] array misses a MEM_LRDDR4 entry, which leads to NULL pointer dereference when accessed via sysfs or such. Signed-off-by: Takashi Iwai <tiwai@suse.de> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Yazen Ghannam <Yazen.Ghannam@amd.com> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: <stable@vger.kernel.org> Link: http://lkml.kernel.org/r/20180810141426.8918-1-tiwai@suse.de Fixes: 1e8096bb2031 ("EDAC: Add LRDDR4 DRAM type") Signed-off-by: Borislav Petkov <bp@suse.de>
2018-08-17Merge branch 'topic/pl330' into for-linusVinod Koul
2018-08-17Merge branch 'topic/imx' into for-linusVinod Koul
2018-08-17Merge branch 'topic/xilinx' into for-linusVinod Koul
2018-08-17Merge branch 'topic/ste' into for-linusVinod Koul
2018-08-17Merge branch 'topic/renesas' into for-linusVinod Koul
2018-08-17Merge branch 'topic/owl' into for-linusVinod Koul
2018-08-17Merge branch 'topic/nbpfaxi' into for-linusVinod Koul
2018-08-17Merge branch 'topic/mv_xor' into for-linusVinod Koul
2018-08-17Merge branch 'topic/ioat' into for-linusVinod Koul
2018-08-17Merge branch 'topic/intel' into for-linusVinod Koul
2018-08-17Merge branch 'topic/async_tx' into for-linusVinod Koul
2018-08-17arm64: mm: check for upper PAGE_SHIFT bits in pfn_valid()Greg Hackmann
ARM64's pfn_valid() shifts away the upper PAGE_SHIFT bits of the input before seeing if the PFN is valid. This leads to false positives when some of the upper bits are set, but the lower bits match a valid PFN. For example, the following userspace code looks up a bogus entry in /proc/kpageflags: int pagemap = open("/proc/self/pagemap", O_RDONLY); int pageflags = open("/proc/kpageflags", O_RDONLY); uint64_t pfn, val; lseek64(pagemap, [...], SEEK_SET); read(pagemap, &pfn, sizeof(pfn)); if (pfn & (1UL << 63)) { /* valid PFN */ pfn &= ((1UL << 55) - 1); /* clear flag bits */ pfn |= (1UL << 55); lseek64(pageflags, pfn * sizeof(uint64_t), SEEK_SET); read(pageflags, &val, sizeof(val)); } On ARM64 this causes the userspace process to crash with SIGSEGV rather than reading (1 << KPF_NOPAGE). kpageflags_read() treats the offset as valid, and stable_page_flags() will try to access an address between the user and kernel address ranges. Fixes: c1cc1552616d ("arm64: MMU initialisation") Cc: stable@vger.kernel.org Signed-off-by: Greg Hackmann <ghackmann@google.com> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-08-17arm64: Avoid calling stop_machine() when patching jump labelsWill Deacon
Patching a jump label involves patching a single instruction at a time, swizzling between a branch and a NOP. The architecture treats these instructions specially, so a concurrently executing CPU is guaranteed to see either the NOP or the branch, rather than an amalgamation of the two instruction encodings. However, in order to guarantee that the new instruction is visible, it is necessary to send an IPI to the concurrently executing CPU so that it discards any previously fetched instructions from its pipeline. This operation therefore cannot be completed from a context with IRQs disabled, but this is exactly what happens on the jump label path where the hotplug lock is held and irqs are subsequently disabled by stop_machine_cpuslocked(). This results in a deadlock during boot on Hikey-960. Due to the architectural guarantees around patching NOPs and branches, we don't actually need to stop_machine() at all on the jump label path, so we can avoid the deadlock by using the "nosync" variant of our instruction patching routine. Fixes: 693350a79980 ("arm64: insn: Don't fallback on nosync path for general insn patching") Reported-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi> Reported-by: John Stultz <john.stultz@linaro.org> Tested-by: Valentin Schneider <valentin.schneider@arm.com> Tested-by: Tuomas Tynkkynen <tuomas@tuxera.com> Tested-by: John Stultz <john.stultz@linaro.org> Signed-off-by: Will Deacon <will.deacon@arm.com>
2018-08-17Merge tag 'drm-msm-next-2018-08-10' of ↵Dave Airlie
git://people.freedesktop.org/~robclark/linux into drm-next An optional follow-on PR for 4.19, on top of previous -fixes PR, which brings in a6xx support. These patches have been on list since earlier in the year (mostly waiting for userspace). They have been in linux-next since earlier in the week, now that we have freedreno userspace working on a6xx[1][2]. So far glmark2, Chromium/ChromiumOS, gnome-shell, glamor, xonotic, etc, are working. And a healthy chuck of deqp works, and I've been busy fixing things. The needed libdrm changes (no new uapi changes needed) are already on master, and the 2nd branch is rebased on that. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGuCKekZ2Dho80qxODT1BEUGg4hbq33ACUy5VXs3dHbDLA@mail.gmail.com
2018-08-16remoteproc/davinci: use the reset frameworkBartosz Golaszewski
Switch to using the reset framework instead of handcoded reset routines we used so far. Reviewed-by: Sekhar Nori <nsekhar@ti.com> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Signed-off-by: Bartosz Golaszewski <bgolaszewski@baylibre.com> Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
2018-08-17Merge tag 'drm-intel-next-fixes-2018-08-16-1' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-intel into drm-next Fixes for: - DP full color range. - selftest for gem_object - forcewake on suspend - GPU reset This also include accumulated fixes from GVT: - Fix an error code in gvt_dma_map_page() (Dan) - Fix off by one error in intel_vgpu_write_fence() (Dan) - Fix potential Spectre v1 (Gustavo) - Fix workload free in vgpu release (Henry) - Fix cleanup sequence in intel_gvt_clean_device (Henry) - dmabuf mutex init place fix (Henry) - possible memory leak in intel_vgpu_ioctl() err path (Yi) - return error on cmd access check failure (Yan) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180816190335.GA7765@intel.com
2018-08-17Merge branch 'drm-next-4.19' of git://people.freedesktop.org/~agd5f/linux ↵Dave Airlie
into drm-next Fixes for 4.19: - Add VCN PSP FW loading for RV (this is required on upcoming parts) - Fix scheduler setup ordering for VCE and UVD - Few misc display fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180816181840.2786-1-alexander.deucher@amd.com
2018-08-17Merge tag 'drm-msm-fixes-2018-08-10' of ↵Dave Airlie
git://people.freedesktop.org/~robclark/linux into drm-next Some small msm fixes. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGuZE0VEpatrtxGZtUB6FaQYr6Gf07UVpMsD15ook+5_WQ@mail.gmail.com
2018-08-16tracing: Fix SPDX format headers to use C++ style commentsSteven Rostedt (VMware)
The Linux kernel adopted the SPDX License format headers to ease license compliance management, and uses the C++ '//' style comments for the SPDX header tags. Some files in the tracing directory used the C style /* */ comments for them. To be consistent across all files, replace the /* */ C style SPDX tags with the C++ // SPDX tags. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-16tracing: Add SPDX License format tags to tracing filesSteven Rostedt (VMware)
Add the SPDX License header to ease license compliance management. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-16tracing: Add SPDX License format to bpf_trace.cSteven Rostedt (VMware)
Add the SPDX License header to ease license compliance management. Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-16Merge branch 'sockmap-ulp-fixes'Alexei Starovoitov
Daniel Borkmann says: ==================== Batch of various fixes related to BPF sockmap and ULP, including adding module alias to restrict module requests, races and memory leaks in sockmap code. For details please refer to the individual patches. Thanks! ==================== Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-16bpf, sockmap: fix sock_map_ctx_update_elem race with exist/noexistDaniel Borkmann
The current code in sock_map_ctx_update_elem() allows for BPF_EXIST and BPF_NOEXIST map update flags. While on array-like maps this approach is rather uncommon, e.g. bpf_fd_array_map_update_elem() and others enforce map update flags to be BPF_ANY such that xchg() can be used directly, the current implementation in sock map does not guarantee that such operation with BPF_EXIST / BPF_NOEXIST is atomic. The initial test does a READ_ONCE(stab->sock_map[i]) to fetch the socket from the slot which is then tested for NULL / non-NULL. However later after __sock_map_ctx_update_elem(), the actual update is done through osock = xchg(&stab->sock_map[i], sock). Problem is that in the meantime a different CPU could have updated / deleted a socket on that specific slot and thus flag contraints won't hold anymore. I've been thinking whether best would be to just break UAPI and do an enforcement of BPF_ANY to check if someone actually complains, however trouble is that already in BPF kselftest we use BPF_NOEXIST for the map update, and therefore it might have been copied into applications already. The fix to keep the current behavior intact would be to add a map lock similar to the sock hash bucket lock only for covering the whole map. Fixes: 174a79ff9515 ("bpf: sockmap with sk redirect support") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-16bpf, sockmap: fix map elem deletion race with smap_stop_sockDaniel Borkmann
The smap_start_sock() and smap_stop_sock() are each protected under the sock->sk_callback_lock from their call-sites except in the case of sock_map_delete_elem() where we drop the old socket from the map slot. This is racy because the same sock could be part of multiple sock maps, so we run smap_stop_sock() in parallel, and given at that point psock->strp_enabled might be true on both CPUs, we might for example wrongly restore the sk->sk_data_ready / sk->sk_write_space. Therefore, hold the sock->sk_callback_lock as well on delete. Looks like 2f857d04601a ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support") had this right, but later on e9db4ef6bf4c ("bpf: sockhash fix omitted bucket lock in sock_close") removed it again from delete leaving this smap_stop_sock() instance unprotected. Fixes: e9db4ef6bf4c ("bpf: sockhash fix omitted bucket lock in sock_close") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-16bpf, sockmap: fix leakage of smap_psock_map_entryDaniel Borkmann
While working on sockmap I noticed that we do not always kfree the struct smap_psock_map_entry list elements which track psocks attached to maps. In the case of sock_hash_ctx_update_elem(), these map entries are allocated outside of __sock_map_ctx_update_elem() with their linkage to the socket hash table filled. In the case of sock array, the map entries are allocated inside of __sock_map_ctx_update_elem() and added with their linkage to the psock->maps. Both additions are under psock->maps_lock each. Now, we drop these elements from their psock->maps list in a few occasions: i) in sock array via smap_list_map_remove() when an entry is either deleted from the map from user space, or updated via user space or BPF program where we drop the old socket at that map slot, or the sock array is freed via sock_map_free() and drops all its elements; ii) for sock hash via smap_list_hash_remove() in exactly the same occasions as just described for sock array; iii) in the bpf_tcp_close() where we remove the elements from the list via psock_map_pop() and iterate over them dropping themselves from either sock array or sock hash; and last but not least iv) once again in smap_gc_work() which is a callback for deferring the work once the psock refcount hit zero and thus the socket is being destroyed. Problem is that the only case where we kfree() the list entry is in case iv), which at that point should have an empty list in normal cases. So in cases from i) to iii) we unlink the elements without freeing where they go out of reach from us. Hence fix is to properly kfree() them as well to stop the leakage. Given these are all handled under psock->maps_lock there is no need for deferred RCU freeing. I later also ran with kmemleak detector and it confirmed the finding as well where in the state before the fix the object goes unreferenced while after the patch no kmemleak report related to BPF showed up. [...] unreferenced object 0xffff880378eadae0 (size 64): comm "test_sockmap", pid 2225, jiffies 4294720701 (age 43.504s) hex dump (first 32 bytes): 00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de ................ 50 4d 75 5d 03 88 ff ff 00 00 00 00 00 00 00 00 PMu]............ backtrace: [<000000005225ac3c>] sock_map_ctx_update_elem.isra.21+0xd8/0x210 [<0000000045dd6d3c>] bpf_sock_map_update+0x29/0x60 [<00000000877723aa>] ___bpf_prog_run+0x1e1f/0x4960 [<000000002ef89e83>] 0xffffffffffffffff unreferenced object 0xffff880378ead240 (size 64): comm "test_sockmap", pid 2225, jiffies 4294720701 (age 43.504s) hex dump (first 32 bytes): 00 01 00 00 00 00 ad de 00 02 00 00 00 00 ad de ................ 00 44 75 5d 03 88 ff ff 00 00 00 00 00 00 00 00 .Du]............ backtrace: [<000000005225ac3c>] sock_map_ctx_update_elem.isra.21+0xd8/0x210 [<0000000030e37a3a>] sock_map_update_elem+0x125/0x240 [<000000002e5ce36e>] map_update_elem+0x4eb/0x7b0 [<00000000db453cc9>] __x64_sys_bpf+0x1f9/0x360 [<0000000000763660>] do_syscall_64+0x9a/0x300 [<00000000422a2bb2>] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [<000000002ef89e83>] 0xffffffffffffffff [...] Fixes: e9db4ef6bf4c ("bpf: sockhash fix omitted bucket lock in sock_close") Fixes: 54fedb42c653 ("bpf: sockmap, fix smap_list_map_remove when psock is in many maps") Fixes: 2f857d04601a ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-16tcp, ulp: fix leftover icsk_ulp_ops preventing sock from reattachDaniel Borkmann
I found that in BPF sockmap programs once we either delete a socket from the map or we updated a map slot and the old socket was purged from the map that these socket can never get reattached into a map even though their related psock has been dropped entirely at that point. Reason is that tcp_cleanup_ulp() leaves the old icsk->icsk_ulp_ops intact, so that on the next tcp_set_ulp_id() the kernel returns an -EEXIST thinking there is still some active ULP attached. BPF sockmap is the only one that has this issue as the other user, kTLS, only calls tcp_cleanup_ulp() from tcp_v4_destroy_sock() whereas sockmap semantics allow dropping the socket from the map with all related psock state being cleaned up. Fixes: 1aa12bdf1bfb ("bpf: sockmap, add sock close() hook to remove socks") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-16tcp, ulp: add alias for all ulp modulesDaniel Borkmann
Lets not turn the TCP ULP lookup into an arbitrary module loader as we only intend to load ULP modules through this mechanism, not other unrelated kernel modules: [root@bar]# cat foo.c #include <sys/types.h> #include <sys/socket.h> #include <linux/tcp.h> #include <linux/in.h> int main(void) { int sock = socket(PF_INET, SOCK_STREAM, 0); setsockopt(sock, IPPROTO_TCP, TCP_ULP, "sctp", sizeof("sctp")); return 0; } [root@bar]# gcc foo.c -O2 -Wall [root@bar]# lsmod | grep sctp [root@bar]# ./a.out [root@bar]# lsmod | grep sctp sctp 1077248 4 libcrc32c 16384 3 nf_conntrack,nf_nat,sctp [root@bar]# Fix it by adding module alias to TCP ULP modules, so probing module via request_module() will be limited to tcp-ulp-[name]. The existing modules like kTLS will load fine given tcp-ulp-tls alias, but others will fail to load: [root@bar]# lsmod | grep sctp [root@bar]# ./a.out [root@bar]# lsmod | grep sctp [root@bar]# Sockmap is not affected from this since it's either built-in or not. Fixes: 734942cc4ea6 ("tcp: ULP infrastructure") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: John Fastabend <john.fastabend@gmail.com> Acked-by: Song Liu <songliubraving@fb.com> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2018-08-16Merge branch 'linus/master' into rdma.git for-nextJason Gunthorpe
rdma.git merge resolution for the 4.19 merge window Conflicts: drivers/infiniband/core/rdma_core.c - Use the rdma code and revise with the new spelling for atomic_fetch_add_unless drivers/nvme/host/rdma.c - Replace max_sge with max_send_sge in new blk code drivers/nvme/target/rdma.c - Use the blk code and revise to use NULL for ib_post_recv when appropriate - Replace max_sge with max_recv_sge in new blk code net/rds/ib_send.c - Use the net code and revise to use NULL for ib_post_recv when appropriate Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-16Revert "net/smc: Replace ib_query_gid with rdma_get_gid_attr"Jason Gunthorpe
This reverts commit ddb457c6993babbcdd41fca638b870d2a2fc3941. The include rdma/ib_cache.h is kept, and we have to add a memset to the compat wrapper to avoid compiler warnings in gcc-7 This revert is done to avoid extensive merge conflicts with SMC changes in netdev during the 4.19 merge window. Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-08-16bpf: fix a rcu usage warning in bpf_prog_array_copy_core()Yonghong Song
Commit 394e40a29788 ("bpf: extend bpf_prog_array to store pointers to the cgroup storage") refactored the bpf_prog_array_copy_core() to accommodate new structure bpf_prog_array_item which contains bpf_prog array itself. In the old code, we had perf_event_query_prog_array(): mutex_lock(...) bpf_prog_array_copy_call(): prog = rcu_dereference_check(array, 1)->progs bpf_prog_array_copy_core(prog, ...) mutex_unlock(...) With the above commit, we had perf_event_query_prog_array(): mutex_lock(...) bpf_prog_array_copy_call(): bpf_prog_array_copy_core(array, ...): item = rcu_dereference(array)->items; ... mutex_unlock(...) The new code will trigger a lockdep rcu checking warning. The fix is to change rcu_dereference() to rcu_dereference_check() to prevent such a warning. Reported-by: syzbot+6e72317008eef84a216b@syzkaller.appspotmail.com Fixes: 394e40a29788 ("bpf: extend bpf_prog_array to store pointers to the cgroup storage") Cc: Roman Gushchin <guro@fb.com> Signed-off-by: Yonghong Song <yhs@fb.com> Acked-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Roman Gushchin <guro@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-16samples/bpf: all XDP samples should unload xdp/bpf prog on SIGTERMJesper Dangaard Brouer
It is common XDP practice to unload/deattach the XDP bpf program, when the XDP sample program is Ctrl-C interrupted (SIGINT) or killed (SIGTERM). The samples/bpf programs xdp_redirect_cpu and xdp_rxq_info, forgot to trap signal SIGTERM (which is the default signal used by the kill command). This was discovered by Red Hat QA, which automated scripts depend on killing the XDP sample program after a timeout period. Fixes: fad3917e361b ("samples/bpf: add cpumap sample program xdp_redirect_cpu") Fixes: 0fca931a6f21 ("samples/bpf: program demonstrating access to xdp_rxq_info") Reported-by: Jean-Tsung Hsiao <jhsiao@redhat.com> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Yonghong Song <yhs@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-16net/xdp: Fix suspicious RCU usage warningTariq Toukan
Fix the warning below by calling rhashtable_lookup_fast. Also, make some code movements for better quality and human readability. [ 342.450870] WARNING: suspicious RCU usage [ 342.455856] 4.18.0-rc2+ #17 Tainted: G O [ 342.462210] ----------------------------- [ 342.467202] ./include/linux/rhashtable.h:481 suspicious rcu_dereference_check() usage! [ 342.476568] [ 342.476568] other info that might help us debug this: [ 342.476568] [ 342.486978] [ 342.486978] rcu_scheduler_active = 2, debug_locks = 1 [ 342.495211] 4 locks held by modprobe/3934: [ 342.500265] #0: 00000000e23116b2 (mlx5_intf_mutex){+.+.}, at: mlx5_unregister_interface+0x18/0x90 [mlx5_core] [ 342.511953] #1: 00000000ca16db96 (rtnl_mutex){+.+.}, at: unregister_netdev+0xe/0x20 [ 342.521109] #2: 00000000a46e2c4b (&priv->state_lock){+.+.}, at: mlx5e_close+0x29/0x60 [mlx5_core] [ 342.531642] #3: 0000000060c5bde3 (mem_id_lock){+.+.}, at: xdp_rxq_info_unreg+0x93/0x6b0 [ 342.541206] [ 342.541206] stack backtrace: [ 342.547075] CPU: 12 PID: 3934 Comm: modprobe Tainted: G O 4.18.0-rc2+ #17 [ 342.556621] Hardware name: Dell Inc. PowerEdge R730/0H21J3, BIOS 1.5.4 10/002/2015 [ 342.565606] Call Trace: [ 342.568861] dump_stack+0x78/0xb3 [ 342.573086] xdp_rxq_info_unreg+0x3f5/0x6b0 [ 342.578285] ? __call_rcu+0x220/0x300 [ 342.582911] mlx5e_free_rq+0x38/0xc0 [mlx5_core] [ 342.588602] mlx5e_close_channel+0x20/0x120 [mlx5_core] [ 342.594976] mlx5e_close_channels+0x26/0x40 [mlx5_core] [ 342.601345] mlx5e_close_locked+0x44/0x50 [mlx5_core] [ 342.607519] mlx5e_close+0x42/0x60 [mlx5_core] [ 342.613005] __dev_close_many+0xb1/0x120 [ 342.617911] dev_close_many+0xa2/0x170 [ 342.622622] rollback_registered_many+0x148/0x460 [ 342.628401] ? __lock_acquire+0x48d/0x11b0 [ 342.633498] ? unregister_netdev+0xe/0x20 [ 342.638495] rollback_registered+0x56/0x90 [ 342.643588] unregister_netdevice_queue+0x7e/0x100 [ 342.649461] unregister_netdev+0x18/0x20 [ 342.654362] mlx5e_remove+0x2a/0x50 [mlx5_core] [ 342.659944] mlx5_remove_device+0xe5/0x110 [mlx5_core] [ 342.666208] mlx5_unregister_interface+0x39/0x90 [mlx5_core] [ 342.673038] cleanup+0x5/0xbfc [mlx5_core] [ 342.678094] __x64_sys_delete_module+0x16b/0x240 [ 342.683725] ? do_syscall_64+0x1c/0x210 [ 342.688476] do_syscall_64+0x5a/0x210 [ 342.693025] entry_SYSCALL_64_after_hwframe+0x49/0xbe Fixes: 8d5d88527587 ("xdp: rhashtable with allocator ID to pointer mapping") Signed-off-by: Tariq Toukan <tariqt@mellanox.com> Suggested-by: Daniel Borkmann <daniel@iogearbox.net> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2018-08-16blktrace: Add SPDX License format headerSteven Rostedt (VMware)
Add the SPDX License header to ease license compliance management. Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-08-16net/mlx5e: Delete unneeded function argumentYuval Shaia
priv argument is not used by the function, delete it. Fixes: a89842811ea98 ("net/mlx5e: Merge per priority stats groups") Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16Documentation: networking: ti-cpsw: correct cbs parameters for Eth1 100MbIvan Khoronzhuk
If set cbs parameters calculated for 1000Mb, but use on 100Mb port w/o h/w offload (for cpsw offload it doesn't matter), it works incorrectly. According to the example and testing board, second port is 100Mb interface. Correct them on recalculated for 100Mb interface. It allows to use the same command for CBS software implementation for board in example. Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16isdn: Disable IIOCDBGVARKees Cook
It was possible to directly leak the kernel address where the isdn_dev structure pointer was stored. This is a kernel ASLR bypass for anyone with access to the ioctl. The code had been present since the beginning of git history, though this shouldn't ever be needed for normal operation, therefore remove it. Reported-by: Al Viro <viro@zeniv.linux.org.uk> Cc: Karsten Keil <isdn@linux-pingi.de> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16net: dsa: add support for ksz9897 ethernet switchLad, Prabhakar
ksz9477 is superset of ksz9xx series, driver just works out of the box for ksz9897 chip with this patch. Signed-off-by: Lad, Prabhakar <prabhakar.csengg@gmail.com> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16veth: Free queues on link deleteToshiaki Makita
David Ahern reported memory leak in veth. ======================================================================= $ cat /sys/kernel/debug/kmemleak unreferenced object 0xffff8800354d5c00 (size 1024): comm "ip", pid 836, jiffies 4294722952 (age 25.904s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<(____ptrval____)>] kmemleak_alloc+0x70/0x94 [<(____ptrval____)>] slab_post_alloc_hook+0x42/0x52 [<(____ptrval____)>] __kmalloc+0x101/0x142 [<(____ptrval____)>] kmalloc_array.constprop.20+0x1e/0x26 [veth] [<(____ptrval____)>] veth_newlink+0x147/0x3ac [veth] ... unreferenced object 0xffff88002e009c00 (size 1024): comm "ip", pid 836, jiffies 4294722958 (age 25.898s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<(____ptrval____)>] kmemleak_alloc+0x70/0x94 [<(____ptrval____)>] slab_post_alloc_hook+0x42/0x52 [<(____ptrval____)>] __kmalloc+0x101/0x142 [<(____ptrval____)>] kmalloc_array.constprop.20+0x1e/0x26 [veth] [<(____ptrval____)>] veth_newlink+0x219/0x3ac [veth] ======================================================================= veth_rq allocated in veth_newlink() was not freed on dellink. We need to free up them after veth_close() so that any packets will not reference the queues afterwards. Thus free them in veth_dev_free() in the same way as freeing stats structure (vstats). Also move queues allocation to veth_dev_init() to be in line with stats allocation. Fixes: 638264dc90227 ("veth: Support per queue XDP ring") Reported-by: David Ahern <dsahern@gmail.com> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Reviewed-by: David Ahern <dsahern@gmail.com> Tested-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-08-16ila: make lockdep happy againCong Wang
Previously, alloc_ila_locks() and bucket_table_alloc() call spin_lock_init() separately, therefore they have two different lock names and lock class keys. However, after commit b893281715ab ("ila: Call library function alloc_bucket_locks") they both call helper alloc_bucket_spinlocks() which now only has one lock name and lock class key. This causes a few bogus lockdep warnings as reported by syzbot. Fix this by making alloc_bucket_locks() a macro and pass declaration name as lock name and a static lock class key inside the macro. Fixes: b893281715ab ("ila: Call library function alloc_bucket_locks") Reported-by: <syzbot+b66a5a554991a8ed027c@syzkaller.appspotmail.com> Cc: Tom Herbert <tom@quantonium.net> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>