Age | Commit message | Author
2017-12-18 | sfc: make mem_bar a function rather than a constant | Edward Cree
Support using BAR 0 on SFC9250, even though the driver doesn't bind to such devices yet. Signed-off-by: Edward Cree <ecree@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 | Merge branch 'WIP.x86-pti.entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds
Pull x86 syscall entry code changes for PTI from Ingo Molnar:

"The main changes here are Andy Lutomirski's changes to switch the x86-64 entry code to use the 'per CPU entry trampoline stack'. This, besides helping fix KASLR leaks (the pending Page Table Isolation (PTI) work), also robustifies the x86 entry code"

* 'WIP.x86-pti.entry-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
  x86/cpufeatures: Make CPU bugs sticky
  x86/paravirt: Provide a way to check for hypervisors
  x86/paravirt: Dont patch flush_tlb_single
  x86/entry/64: Make cpu_entry_area.tss read-only
  x86/entry: Clean up the SYSENTER_stack code
  x86/entry/64: Remove the SYSENTER stack canary
  x86/entry/64: Move the IST stacks into struct cpu_entry_area
  x86/entry/64: Create a per-CPU SYSCALL entry trampoline
  x86/entry/64: Return to userspace from the trampoline stack
  x86/entry/64: Use a per-CPU trampoline stack for IDT entries
  x86/espfix/64: Stop assuming that pt_regs is on the entry stack
  x86/entry/64: Separate cpu_current_top_of_stack from TSS.sp0
  x86/entry: Remap the TSS into the CPU entry area
  x86/entry: Move SYSENTER_stack to the beginning of struct tss_struct
  x86/dumpstack: Handle stack overflow on all stacks
  x86/entry: Fix assumptions that the HW TSS is at the beginning of cpu_tss
  x86/kasan/64: Teach KASAN about the cpu_entry_area
  x86/mm/fixmap: Generalize the GDT fixmap mechanism, introduce struct cpu_entry_area
  x86/entry/gdt: Put per-CPU GDT remaps in ascending order
  x86/dumpstack: Add get_stack_info() support for the SYSENTER stack
  ...
2017-12-18 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next | David S. Miller
Daniel Borkmann says: ==================== pull-request: bpf-next 2017-12-18 The following pull-request contains BPF updates for your *net-next* tree. The main changes are: 1) Allow arbitrary function calls from one BPF function to another BPF function. As of today when writing BPF programs, __always_inline had to be used in the BPF C programs for all functions, unnecessarily causing LLVM to inflate code size. Handle this more naturally with support for BPF to BPF calls such that this __always_inline restriction can be overcome. As a result, it allows for better optimized code and finally enables to introduce core BPF libraries in the future that can be reused out of different projects. x86 and arm64 JIT support was added as well, from Alexei. 2) Add infrastructure for tagging functions as error injectable and allow for BPF to return arbitrary error values when BPF is attached via kprobes on those. This way of injecting errors generically eases testing and debugging without having to recompile or restart the kernel. Tags for opting-in for this facility are added with BPF_ALLOW_ERROR_INJECTION(), from Josef. 3) For BPF offload via nfp JIT, add support for bpf_xdp_adjust_head() helper call for XDP programs. First part of this work adds handling of BPF capabilities included in the firmware, and the later patches add support to the nfp verifier part and JIT as well as some small optimizations, from Jakub. 4) The bpftool now also gets support for basic cgroup BPF operations such as attaching, detaching and listing current BPF programs. As a requirement for the attach part, bpftool can now also load object files through 'bpftool prog load'. This reuses libbpf which we have in the kernel tree as well. bpftool-cgroup man page is added along with it, from Roman. 5) Back then commit e87c6bc3852b ("bpf: permit multiple bpf attachments for a single perf event") added support for attaching multiple BPF programs to a single perf event. 
Given they are configured through perf's ioctl() interface, the interface has been extended with a PERF_EVENT_IOC_QUERY_BPF command in this work in order to return an array of one or multiple BPF prog ids that are currently attached, from Yonghong. 6) Various minor fixes and cleanups to the bpftool's Makefile as well as a new 'uninstall' and 'doc-uninstall' target for removing bpftool itself or prior installed documentation related to it, from Quentin. 7) Add CONFIG_CGROUP_BPF=y to the BPF kernel selftest config file which is required for the test_dev_cgroup test case to run, from Naresh. 8) Fix reporting of XDP prog_flags for nfp driver, from Jakub. 9) Fix libbpf's exit code from the Makefile when libelf was not found in the system, also from Jakub. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf | David S. Miller
Daniel Borkmann says:

====================
pull-request: bpf 2017-12-17

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix a corner case in generic XDP where a non-linear skb with enough tailroom was not being linearized, from Song.

2) Fix BPF JIT bugs in s390x and ppc64 to not recache skb data when BPF context is not skb, from Daniel.

3) Fix a BPF JIT bug in sparc64 where recaching skb data after helper call would use the wrong register for the skb, from Daniel.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2017-12-18 | Merge tag 'kvm-arm-fixes-for-v4.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD | Paolo Bonzini
KVM/ARM Fixes for v4.15, Round 2

Fixes:
- A bug in our handling of SPE state for non-vhe systems
- A bug that causes hyp unmapping to go off limits and crash the system on shutdown
- Three timer fixes that were introduced as part of the timer optimizations for v4.15
2017-12-18 | KVM: Fix stack-out-of-bounds read in write_mmio | Wanpeng Li
Reported by syzkaller: BUG: KASAN: stack-out-of-bounds in write_mmio+0x11e/0x270 [kvm] Read of size 8 at addr ffff8803259df7f8 by task syz-executor/32298 CPU: 6 PID: 32298 Comm: syz-executor Tainted: G OE 4.15.0-rc2+ #18 Hardware name: LENOVO ThinkCentre M8500t-N000/SHARKBAY, BIOS FBKTC1AUS 02/16/2016 Call Trace: dump_stack+0xab/0xe1 print_address_description+0x6b/0x290 kasan_report+0x28a/0x370 write_mmio+0x11e/0x270 [kvm] emulator_read_write_onepage+0x311/0x600 [kvm] emulator_read_write+0xef/0x240 [kvm] emulator_fix_hypercall+0x105/0x150 [kvm] em_hypercall+0x2b/0x80 [kvm] x86_emulate_insn+0x2b1/0x1640 [kvm] x86_emulate_instruction+0x39a/0xb90 [kvm] handle_exception+0x1b4/0x4d0 [kvm_intel] vcpu_enter_guest+0x15a0/0x2640 [kvm] kvm_arch_vcpu_ioctl_run+0x549/0x7d0 [kvm] kvm_vcpu_ioctl+0x479/0x880 [kvm] do_vfs_ioctl+0x142/0x9a0 SyS_ioctl+0x74/0x80 entry_SYSCALL_64_fastpath+0x23/0x9a The path of patched vmmcall will patch 3 bytes opcode 0F 01 C1(vmcall) to the guest memory, however, write_mmio tracepoint always prints 8 bytes through *(u64 *)val since kvm splits the mmio access into 8 bytes. This leaks 5 bytes from the kernel stack (CVE-2017-17741). This patch fixes it by just accessing the bytes which we operate on. Before patch: syz-executor-5567 [007] .... 51370.561696: kvm_mmio: mmio write len 3 gpa 0x10 val 0x1ffff10077c1010f After patch: syz-executor-13416 [002] .... 51302.299573: kvm_mmio: mmio write len 3 gpa 0x10 val 0xc1010f Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Marc Zyngier <marc.zyngier@arm.com> Cc: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
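The idea of "just accessing the bytes which we operate on" can be sketched in plain C. This is a minimal illustration, not the kernel patch: `mmio_trace_val` is a hypothetical helper name, and it assembles the traced value from exactly `len` bytes instead of dereferencing the buffer as a `u64`.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): build the value to print
 * from only the `len` bytes that were written, least significant byte
 * first. Bytes past buf[len - 1] are never read.
 */
static uint64_t mmio_trace_val(const unsigned char *buf, size_t len)
{
    uint64_t v = 0;
    size_t i;

    if (len > sizeof(v))
        len = sizeof(v);            /* a u64 holds at most 8 bytes */
    for (i = 0; i < len; i++)
        v |= (uint64_t)buf[i] << (8 * i);
    return v;
}
```

With the three opcode bytes 0F 01 C1 from the message above, this yields 0xc1010f, matching the "After patch" trace line; the upper five bytes stay zero rather than exposing stack contents.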
2017-12-18 | ACPI: APEI / ERST: Fix missing error handling in erst_reader() | Takashi Iwai
The commit f6f828513290 ("pstore: pass allocated memory region back to caller") changed the check of the return value from erst_read() in erst_reader() in the following way: if (len == -ENOENT) goto skip; - else if (len < 0) { - rc = -1; + else if (len < sizeof(*rcd)) { + rc = -EIO; goto out; This introduced another bug: since the comparison with sizeof() is cast to unsigned, a negative len value doesn't hit any longer. As a result, when an error is returned from erst_read(), the code falls through, and it may eventually lead to some weird thing like memory corruption. This patch adds the negative error value check more explicitly for addressing the issue. Fixes: f6f828513290 (pstore: pass allocated memory region back to caller) Cc: All applicable <stable@vger.kernel.org> Tested-by: Jerry Tang <jtang@suse.com> Signed-off-by: Takashi Iwai <tiwai@suse.de> Acked-by: Kees Cook <keescook@chromium.org> Reviewed-by: Borislav Petkov <bp@suse.de> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
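The signed/unsigned pitfall described above is easy to reproduce outside the kernel. The following is a sketch under stated assumptions: `struct record`, `check_buggy` and `check_fixed` are illustrative names, and the return values (1 for "skip", 0 for "continue") are invented for the demonstration rather than taken from erst_reader().

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>   /* ssize_t */

struct record { char payload[64]; };   /* stand-in for the real record type */

/* Buggy shape: `len` is converted to size_t for the comparison, so a
 * negative error code becomes a huge value and the check never fires. */
static int check_buggy(ssize_t len)
{
    if (len == -ENOENT)
        return 1;                       /* skip this record */
    else if (len < sizeof(struct record))
        return -EIO;
    return 0;                           /* falls through as if the read worked */
}

/* One possible explicit form: test for a negative value first. */
static int check_fixed(ssize_t len)
{
    if (len == -ENOENT)
        return 1;
    else if (len < 0 || (size_t)len < sizeof(struct record))
        return -EIO;
    return 0;
}
```

Calling `check_buggy(-EIO)` returns 0, which is the fall-through the message warns about, while `check_fixed(-EIO)` reports the error.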
2017-12-18 | ACPI: CPPC: remove initial assignment of pcc_ss_data | Colin Ian King
The initialization of pcc_ss_data from pcc_data[pcc_ss_id] before pcc_ss_id is being range checked could lead to an out-of-bounds array read. This very same initialization is also being performed after the range check on pcc_ss_id, so we can just remove this problematic and also redundant assignment to fix the issue. Detected by cppcheck: warning: Value stored to 'pcc_ss_data' during its initialization is never read Fixes: 85b1407bf6d2 (ACPI / CPPC: Make CPPC ACPI driver aware of PCC subspace IDs) Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
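The pattern being fixed, indexing an array in an initializer before the index has been validated, can be shown with a small sketch. All names here (`MAX_SUBSPACES`, `struct subspace_data`, `table`, `lookup_subspace`) are hypothetical and only mirror the shape of the problem.

```c
#include <stddef.h>

#define MAX_SUBSPACES 4                 /* hypothetical table size */

struct subspace_data { int refcount; };

static struct subspace_data *table[MAX_SUBSPACES];

/*
 * The pointer is declared without an initializer, so the table is only
 * indexed after `id` has been range checked.
 */
static struct subspace_data *lookup_subspace(int id)
{
    struct subspace_data *d;

    if (id < 0 || id >= MAX_SUBSPACES)
        return NULL;                    /* reject before touching table[] */

    d = table[id];
    return d;
}
```

Writing `struct subspace_data *d = table[id];` at the declaration would perform the read before the check, which is the out-of-bounds access the commit removes.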
2017-12-18 | cpufreq: governor: Ensure sufficiently large sampling intervals | Rafael J. Wysocki
After commit aa7519af450d (cpufreq: Use transition_delay_us for legacy governors as well) the sampling_rate field of struct dbs_data may be less than the tick period which causes dbs_update() to produce incorrect results, so make the code ensure that the value of that field will always be sufficiently large. Fixes: aa7519af450d (cpufreq: Use transition_delay_us for legacy governors as well) Reported-by: Andy Tang <andy.tang@nxp.com> Reported-by: Doug Smythies <dsmythies@telus.net> Tested-by: Andy Tang <andy.tang@nxp.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
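As a rough illustration of "ensure that the value will always be sufficiently large", a lower bound can be applied when the field is set. This is only a sketch: the helper name is invented, and using the tick period itself as the floor is an assumption made here; the message does not say which minimum the patch actually enforces.

```c
/*
 * Hypothetical helper: never return a sampling interval shorter than
 * the tick period. Both arguments are in microseconds. The choice of
 * floor is an assumption for illustration only.
 */
static unsigned int clamp_sampling_rate_us(unsigned int requested_us,
                                           unsigned int tick_period_us)
{
    return requested_us < tick_period_us ? tick_period_us : requested_us;
}
```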
2017-12-18 | cpufreq: imx6q: fix speed grading regression on i.MX6 QuadPlus | Lucas Stach
The commit moving the speed grading check to the cpufreq driver introduced some additional checks, so the OPP disable is only attempted on SoCs where those OPPs are present. The compatible checks are missing the QuadPlus compatible, so invalid OPPs are not correctly disabled there. Move both checks to a single condition, so we don't need to sprinkle even more calls to of_machine_is_compatible(). Fixes: 2b3d58a3adca (cpufreq: imx6q: Move speed grading check to cpufreq driver) Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2017-12-18 | PCI / PM: Force devices to D0 in pci_pm_thaw_noirq() | Rafael J. Wysocki
It is incorrect to call pci_restore_state() for devices in low-power states (D1-D3), as that involves the restoration of MSI setup which requires MMIO to be operational and that is only the case in D0. However, pci_pm_thaw_noirq() may do that if the driver's "freeze" callbacks put the device into a low-power state, so fix it by making it force devices into D0 via pci_set_power_state() instead of trying to "update" their power state which is pointless. Fixes: e60514bd4485 (PCI/PM: Restore the status of PCI devices across hibernation) Cc: 4.13+ <stable@vger.kernel.org> # 4.13+ Reported-by: Thomas Gleixner <tglx@linutronix.de> Reported-by: Maarten Lankhorst <dev@mblankhorst.nl> Tested-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Maarten Lankhorst <dev@mblankhorst.nl> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>
2017-12-18 | ALSA: hda/realtek - Fix Dell AIO LineOut issue | Kailang Yang
Dell AIO had LineOut jack. Add LineOut verb into this patch. [ Additional notes: the ALC274 codec seems requiring the fixed pin / DAC connections for HP / line-out pins for enabling EQ for speakers; i.e. the HP / LO pins expect to be connected with NID 0x03 while keeping the speaker with NID 0x02. However, by adding a new line-out pin, the auto-parser assigns the NID 0x02 for HP/LO pins as primary outputs. As an easy workaround, we provide the preferred_pairs[] to map forcibly for these pins. -- tiwai ] Fixes: 75ee94b20b46 ("ALSA: hda - fix headset mic problem for Dell machines with alc274") Signed-off-by: Kailang Yang <kailang@realtek.com> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2017-12-18 | KVM: arm/arm64: Fix timer enable flow | Christoffer Dall
When enabling the timer on the first run, we fail to ever restore the state and mark it as loaded. That means, that in the initial entry to the VCPU ioctl, unless we exit to userspace for some reason such as a pending signal, if the guest programs a timer and blocks, we will wait forever, because we never read back the hardware state (the loaded flag is not set), and so we think the timer is disabled, and we never schedule a background soft timer. The end result? The VCPU blocks forever, and the only solution is to kill the thread. Fixes: 4a2c4da1250d ("arm/arm64: KVM: Load the timer state when enabling the timer") Reported-by: Marc Zyngier <marc.zyngier@arm.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-12-18 | KVM: arm/arm64: Properly handle arch-timer IRQs after vtimer_save_state | Christoffer Dall
The recent timer rework was assuming that once the timer was disabled, we should no longer see any interrupts from the timer. This assumption turns out to not be true, and instead we have to handle the case when the timer ISR runs even after the timer has been disabled. This requires a couple of changes: First, we should never overwrite the cached guest state of the timer control register when the ISR runs, because KVM may have disabled its timers when doing vcpu_put(), even though the guest still had the timer enabled. Second, we shouldn't assume that the timer is actually firing just because we see an interrupt, but we should check the actual state of the timer in the timer control register to understand if the hardware timer is really firing or not. We also add an ISB to vtimer_save_state() to ensure the timer is actually disabled once we enable interrupts, which should clarify the intention of the implementation, and reduce the risk of unwanted interrupts. Fixes: b103cc3f10c0 ("KVM: arm/arm64: Avoid timer save/restore in vcpu entry/exit") Reported-by: Marc Zyngier <marc.zyngier@arm.com> Reported-by: Jia He <hejianet@gmail.com> Reviewed-by: Marc Zyngier <marc.zyngier@arm.com> Tested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-12-18 | KVM: arm/arm64: timer: Don't set irq as forwarded if no usable GIC | Marc Zyngier
If we don't have a usable GIC, do not try to set the vcpu affinity as this is guaranteed to fail. Reported-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Andre Przywara <andre.przywara@arm.com> Tested-by: Andre Przywara <andre.przywara@arm.com> Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-12-18 | KVM: arm/arm64: Fix HYP unmapping going off limits | Marc Zyngier
When we unmap the HYP memory, we try to be clever and unmap one PGD at a time. If we start with a non-PGD aligned address and try to unmap a whole PGD, things go horribly wrong in unmap_hyp_range (addr and end can never match, and it all goes really badly as we keep incrementing pgd and parse random memory as page tables...).

The obvious fix is to let unmap_hyp_range do what it does best, which is to iterate over a range.

The size of the linear mapping, which begins at PAGE_OFFSET, can be easily calculated by subtracting PAGE_OFFSET from high_memory, because high_memory is defined as the linear map address of the last byte of DRAM, plus one.

The size of the vmalloc region is given trivially by VMALLOC_END - VMALLOC_START.

Cc: stable@vger.kernel.org
Reported-by: Andre Przywara <andre.przywara@arm.com>
Tested-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
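Why iterating over a range is robust against unaligned start addresses can be seen in a toy model. This sketch is not the kernel's code: `TOY_PGDIR_SIZE`, `next_boundary` and `walk_range` are invented for illustration, and the 4 KiB "PGD size" is chosen only to keep the numbers small.

```c
#include <stdint.h>

#define TOY_PGDIR_SIZE 0x1000u          /* hypothetical size covered by one PGD entry */

/* Next PGD boundary above addr, clamped so it never passes `end`. */
static uint64_t next_boundary(uint64_t addr, uint64_t end)
{
    uint64_t next = (addr + TOY_PGDIR_SIZE) & ~(uint64_t)(TOY_PGDIR_SIZE - 1);

    return next < end ? next : end;
}

/* Walk [start, start + size) and count the chunks visited. */
static unsigned int walk_range(uint64_t start, uint64_t size)
{
    uint64_t addr = start;
    uint64_t end = start + size;
    unsigned int chunks = 0;

    while (addr != end) {               /* terminates because of the clamp */
        addr = next_boundary(addr, end);
        chunks++;
    }
    return chunks;
}
```

A caller would pass a size such as `high_memory - PAGE_OFFSET` or `VMALLOC_END - VMALLOC_START`, as the message describes. Without the clamp, a step from an unaligned `addr` can jump past `end`, so a loop that waits for `addr == end` never stops.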
2017-12-18 | arm64: kvm: Prevent restoring stale PMSCR_EL1 for vcpu | Julien Thierry
When VHE is not present, KVM needs to save and restore PMSCR_EL1 when possible. If SPE is used by the host, the value of PMSCR_EL1 cannot be saved for the guest. If the host starts using SPE between two save+restore on the same vcpu, restore will write the value of PMSCR_EL1 read during the first save.

Make sure __debug_save_spe_nvhe clears the value of the saved PMSCR_EL1 when the guest cannot use SPE.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Christoffer Dall <christoffer.dall@linaro.org>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: <stable@vger.kernel.org>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2017-12-18 | mtd: Fix mtd_check_oob_ops() | Miquel Raynal
The mtd_check_oob_ops() helper verifies if the operation defined by the user is correct. Fix the check that verifies if the entire requested area exists. This check is too restrictive and will fail anytime the last data byte of the very last page is included in an operation. Fixes: 5cdd929da53d ("mtd: Add sanity checks in mtd_write/read_oob()") Signed-off-by: Miquel Raynal <miquel.raynal@free-electrons.com> Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com> Signed-off-by: Richard Weinberger <richard@nod.at>
2017-12-17 | Linux 4.15-rc4 (tag: v4.15-rc4) | Linus Torvalds
2017-12-17 | Revert "exec: avoid RLIMIT_STACK races with prlimit()" | Kees Cook
This reverts commit 04e35f4495dd560db30c25efca4eecae8ec8c375. SELinux runs with secureexec for all non-"noatsecure" domain transitions, which means lots of processes end up hitting the stack hard-limit change that was introduced in order to fix a race with prlimit(). That race fix will need to be redesigned. Reported-by: Laura Abbott <labbott@redhat.com> Reported-by: Tomáš Trnka <trnka@scm.com> Cc: stable@vger.kernel.org Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-17 | ARM: 8731/1: Fix csum_partial_copy_from_user() stack mismatch | Chunyan Zhang
An additional 'ip' will be pushed to the stack, for restoring the DACR later, if CONFIG_CPU_SW_DOMAIN_PAN is defined.

However, the fixup still gets the err_ptr by adding #8*4 to sp, with the result that the code area pointed to by the LR will be overwritten, or the kernel will crash if CONFIG_DEBUG_RODATA is enabled.

This patch fixes the stack mismatch.

Fixes: a5e090acbf54 ("ARM: software-based priviledged-no-access support")
Signed-off-by: Lvqiang Huang <Lvqiang.Huang@spreadtrum.com>
Signed-off-by: Chunyan Zhang <zhang.lyra@gmail.com>
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
2017-12-17 | Merge branch 'WIP.x86-pti.base-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds
Pull Page Table Isolation (PTI) v4.14 backporting base tree from Ingo Molnar:

"This tree contains the v4.14 PTI backport preparatory tree, which consists of four merges of upstream trees and 7 cherry-picked commits, which the upcoming PTI work depends on"

NOTE! The resulting tree is exactly the same as the original base tree (ie the diff between this commit and its immediate first parent is empty).

The only reason for this merge is literally to have a common point for the actual PTI changes so that the commits can be shared in both the 4.15 and 4.14 trees.

* 'WIP.x86-pti.base-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mm/kasan: Don't use vmemmap_populate() to initialize shadow
  locking/barriers: Convert users of lockless_dereference() to READ_ONCE()
  locking/barriers: Add implicit smp_read_barrier_depends() to READ_ONCE()
  bpf: fix build issues on um due to mising bpf_perf_event.h
  perf/x86: Enable free running PEBS for REGS_USER/INTR
  x86: Make X86_BUG_FXSAVE_LEAK detectable in CPUID on AMD
  x86/cpufeature: Add User-Mode Instruction Prevention definitions
2017-12-17 | Merge branch 'WIP.x86-pti.base.prep-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds
Pull Page Table Isolation (PTI) preparatory tree from Ingo Molnar:

"This does a rename to free up linux/pti.h to be used by the upcoming page table isolation feature"

* 'WIP.x86-pti.base.prep-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  drivers/misc/intel/pti: Rename the header file to free up the namespace
2017-12-17 | Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip | Linus Torvalds
Pull timer fix from Thomas Gleixner:

"A single bugfix which prevents arbitrary sigev_notify values in posix-timers"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  posix-timer: Properly check sigevent->sigev_notify
2017-12-17 | Merge tag 'dmaengine-fix-4.15-rc4' of git://git.infradead.org/users/vkoul/slave-dma | Linus Torvalds
Pull dmaengine fixes from Vinod Koul:

"This time consisting of fixes in a bunch of drivers and the dmatest module:

- Fix for disable clk on error path in fsl-edma driver
- Disable clk fail fix in jz4740 driver
- Fix long pending bug in dmatest driver for dangling pointer
- Fix potential NULL pointer dereference in at_hdmac driver
- Error handling path in ioat driver"

* tag 'dmaengine-fix-4.15-rc4' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: fsl-edma: disable clks on all error paths
  dmaengine: jz4740: disable/unprepare clk if probe fails
  dmaengine: dmatest: move callback wait queue to thread context
  dmaengine: at_hdmac: fix potential NULL pointer dereference in atc_prep_dma_interleaved
  dmaengine: ioat: Fix error handling path
2017-12-17 | cramfs: fix MTD dependency | Arnd Bergmann
With CONFIG_MTD=m and CONFIG_CRAMFS=y, we now get a link failure: fs/cramfs/inode.o: In function `cramfs_mount': inode.c:(.text+0x220): undefined reference to `mount_mtd' fs/cramfs/inode.o: In function `cramfs_mtd_fill_super': inode.c:(.text+0x6d8): undefined reference to `mtd_point' inode.c:(.text+0xae4): undefined reference to `mtd_unpoint' This adds a more specific Kconfig dependency to avoid the broken configuration. Alternatively we could make CRAMFS itself depend on "MTD || !MTD" with a similar result. Fixes: 99c18ce580c6 ("cramfs: direct memory access support") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Nicolas Pitre <nico@linaro.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-12-17 | Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs | Linus Torvalds
Pull vfs fixes from Al Viro:

"The alloc_super() one is a regression in this merge window, lazytime thing is older..."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  VFS: Handle lazytime in do_mount()
  alloc_super(): do ->s_umount initialization earlier
2017-12-17 | Merge tag 'ext4_for_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 | Linus Torvalds
Pull ext4 fixes from Ted Ts'o:

"Fix a regression which caused us to fail to interpret symlinks in very ancient ext3 file system images.

Also fix two xfstests failures, one of which could cause an OOPS, plus an additional bug fix caught by fuzz testing"

* tag 'ext4_for_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4:
  ext4: fix crash when a directory's i_size is too small
  ext4: add missing error check in __ext4_new_inode()
  ext4: fix fdatasync(2) after fallocate(2) operation
  ext4: support fast symlinks from ext3 file systems
2017-12-17 | parisc: Reduce thread stack to 16 kb | John David Anglin
In testing, I found that the thread stack can be 16 kB when using an irq stack. Without it, the thread stack needs to be 32 kB. Currently, the irq stack is 32 kB. While it probably could be 16 kB, I would prefer to leave it as is for safety. Signed-off-by: John David Anglin <dave.anglin@bell.net> Signed-off-by: Helge Deller <deller@gmx.de>
2017-12-17 | Revert "parisc: Re-enable interrupts early" | John David Anglin
This reverts commit 5c38602d83e584047906b41b162ababd4db4106d.

Interrupts can't be enabled early because the register saves are done on the thread stack prior to switching to the IRQ stack. This caused stack overflows and the thread stack needed increasing to 32k. Even then, stack overflows still occasionally occurred.

Background: Even with a 32 kB thread stack, I have seen instances where the thread stack overflowed on the mx3210 buildd. Detection of stack overflow only occurs when we have an external interrupt. When an external interrupt occurs, we switch to the thread stack if we are not already on a kernel stack. Then, registers and specials are saved to the kernel stack.

The bug occurs in intr_return where interrupts are reenabled prior to returning from the interrupt. This was done in case we need to schedule or deliver signals. However, it introduces the possibility that multiple external interrupts may occur on the thread stack and cause a stack overflow. These might not be detected and cause the kernel to misbehave in random ways.

This patch changes the code back to only reenable interrupts when we are going to schedule or deliver signals. As a result, we generally return from an interrupt before reenabling interrupts. This minimizes the growth of the thread stack.

Fixes: 5c38602d83e5 ("parisc: Re-enable interrupts early")
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Cc: <stable@vger.kernel.org> # v4.10+
Signed-off-by: Helge Deller <deller@gmx.de>
2017-12-17 | parisc: remove duplicate includes | Pravin Shedge
These duplicate includes have been found with scripts/checkincludes.pl but they have been removed manually to avoid removing false positives. Signed-off-by: Pravin Shedge <pravin.shedge4linux@gmail.com> Signed-off-by: Helge Deller <deller@gmx.de>
2017-12-17 | parisc: Hide Diva-built-in serial aux and graphics card | Helge Deller
The Diva GSP card has a built-in serial AUX port and an ATI graphics card, neither of which works or has an external connector. The User Guides even mention that those devices shouldn't be used. So, prevent Linux drivers from trying to enable those devices.

Signed-off-by: Helge Deller <deller@gmx.de>
Cc: <stable@vger.kernel.org> # v3.0+
2017-12-17 | parisc: Align os_hpmc_size on word boundary | Helge Deller
The os_hpmc_size variable sometimes wasn't aligned at word boundary and thus triggered the unaligned fault handler at startup. Fix it by aligning it properly. Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v4.14+
2017-12-17 | parisc: Fix indenting in puts() | Helge Deller
Static analysis tools complain that we intended to have curly braces around this indent block. In this case this assumption is wrong, so fix the indenting. Fixes: 2f3c7b8137ef ("parisc: Add core code for self-extracting kernel") Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Helge Deller <deller@gmx.de> Cc: <stable@vger.kernel.org> # v4.14+
2017-12-17 | trace: reenable preemption if we modify the ip | Josef Bacik
Things got moved around between the original bpf_override_return patches and the final version, and now the ftrace kprobe dispatcher assumes if you modified the ip that you also enabled preemption. Make a comment of this and enable preemption, this fixes the lockdep splat that happened when using this feature. Fixes: 9802d86585db ("bpf: add a bpf_override_function helper") Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 | nfp: set flags in the correct member of netdev_bpf | Jakub Kicinski
netdev_bpf.flags is the input member for installing the program. netdev_bpf.prog_flags is the output member for querying. Set the correct one on query. Fixes: 92f0292b35a0 ("net: xdp: report flags program was installed with on query") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 | libbpf: fix Makefile exit code if libelf not found | Jakub Kicinski
/bin/sh's exit does not recognize -1 as a number, leading to the following error message: /bin/sh: 1: exit: Illegal number: -1 Use 1 as the exit code. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17 | Merge branch 'bpf-to-bpf-function-calls' | Daniel Borkmann
Alexei Starovoitov says: ==================== First of all huge thank you to Daniel, John, Jakub, Edward and others who reviewed multiple iterations of this patch set over the last many months and to Dave and others who gave critical feedback during netconf/netdev. The patch is solid enough and we thought through numerous corner cases, but it's not the end. More followups with code reorg and features to follow. TLDR: Allow arbitrary function calls from bpf function to another bpf function. Since the beginning of bpf all bpf programs were represented as a single function and program authors were forced to use always_inline for all functions in their C code. That was causing llvm to unnecessary inflate the code size and forcing developers to move code to header files with little code reuse. With a bit of additional complexity teach verifier to recognize arbitrary function calls from one bpf function to another as long as all of functions are presented to the verifier as a single bpf program. Extended program layout: .. r1 = .. // arg1 r2 = .. // arg2 call pc+1 // function call pc-relative exit .. = r1 // access arg1 .. = r2 // access arg2 .. call pc+20 // second level of function call ... It allows for better optimized code and finally allows to introduce the core bpf libraries that can be reused in different projects, since programs are no longer limited by single elf file. With function calls bpf can be compiled into multiple .o files. This patch is the first step. It detects programs that contain multiple functions and checks that calls between them are valid. It splits the sequence of bpf instructions (one program) into a set of bpf functions that call each other. Calls to only known functions are allowed. Since all functions are presented to the verifier at once conceptually it is 'static linking'. 
Future plans: - introduce BPF_PROG_TYPE_LIBRARY and allow a set of bpf functions to be loaded into the kernel that can be later linked to other programs with concrete program types. Aka 'dynamic linking'. - introduce function pointer type and indirect calls to allow bpf functions call other dynamically loaded bpf functions while the caller bpf function is already executing. Aka 'runtime linking'. This will be more generic and more flexible alternative to bpf_tail_calls. FAQ: Q: Interpreter and JIT changes mean that new instruction is introduced ? A: No. The call instruction technically stays the same. Now it can call both kernel helpers and other bpf functions. Calling convention stays the same as well. From uapi point of view the call insn got new 'relocation' BPF_PSEUDO_CALL similar to BPF_PSEUDO_MAP_FD 'relocation' of bpf_ldimm64 insn. Q: What had to change on LLVM side? A: Trivial LLVM patch to allow calls was applied to upcoming 6.0 release: https://reviews.llvm.org/rL318614 with few bugfixes as well. Make sure to build the latest llvm to have bpf_call support. More details in the patches. ==================== Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17selftests/bpf: additional bpf_call testsDaniel Borkmann
Add some additional checks for a few more corner cases.

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17bpf: arm64: add JIT support for multi-function programsAlexei Starovoitov
Similar to x64, add support for bpf-to-bpf calls.

When a program has calls to in-kernel helpers, the target call offset is known at JIT time and the arm64 architecture needs 2 passes. With bpf-to-bpf calls the dynamically allocated function start is unknown until all functions of the program are JITed. Therefore (just like x64) the arm64 JIT needs one extra pass over the program to emit correct call offsets.

Implementation detail: avoid being too clever in 64-bit immediate moves and always use 4 instructions (instead of 3-4 depending on the address) to make sure only one extra pass is needed. If some future optimization would make it worthwhile to optimize 'call 64-bit imm' further, the JIT would need to do 4 passes over the program instead of 3 as in this patch. For a typical bpf program address the mov needs 3 or 4 insns, so an unconditional 4 insns to save an extra pass is a worthy trade-off at this stage of the JIT.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
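The size argument behind the fixed 4-instruction move can be sketched as follows. This is a simplified model, not the arm64 JIT code: it assumes a 64-bit immediate is built from one movz for bits 0..15 plus one movk per remaining 16-bit chunk, and that a variable-length strategy would skip chunks that are zero. The names split_imm64(), insns_variable() and insns_fixed() are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: split addr into four 16-bit immediates.
 * out[i] is the value that would be placed at bit position 16 * i. */
static void split_imm64(uint64_t addr, uint16_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint16_t)(addr >> (16 * i));
}

/* Simplified variable-length model: one movz, then a movk only for
 * non-zero upper chunks. The count depends on the address value. */
static int insns_variable(uint64_t addr)
{
    int n = 1; /* movz for bits 0..15 */

    for (int i = 1; i < 4; i++)
        if ((uint16_t)(addr >> (16 * i)))
            n++;
    return n;
}

/* Fixed strategy described in the commit message: always 4 instructions,
 * so the image size cannot change once the real address is known. */
static int insns_fixed(uint64_t addr)
{
    (void)addr;
    return 4;
}
```

Because the callee address is only known after the first passes, a count that depends on the address could change the image size in the final pass; the fixed form cannot.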
2017-12-17bpf: x64: add JIT support for multi-function programsAlexei Starovoitov
A typical JIT does several passes over bpf instructions to compute the total size and relative offsets of jumps and calls. With multiple bpf functions calling each other, all relative calls will have invalid offsets initially, therefore we need an additional last pass over the program to emit calls with correct offsets.

For example, in case of three bpf functions:

  main:
    call foo
    call bpf_map_lookup
    exit
  foo:
    call bar
    exit
  bar:
    exit

We will call bpf_int_jit_compile() independently for main(), foo() and bar(). The x64 JIT typically does 4-5 passes to converge. After these initial passes the image for these 3 functions will be good except for call targets, since the start addresses of foo() and bar() are unknown when we were JITing main() (note that call bpf_map_lookup will be resolved properly during the initial passes). Once the start addresses of the 3 functions are known, we patch call_insn->imm to point to the right functions and call bpf_int_jit_compile() again, which needs only one pass. Additional safety checks are done to make sure this last pass doesn't produce an image that is larger or smaller than the previous pass.

When constant blinding is on, it's applied to all functions at the first pass, since doing it once again at the last pass can change the size of the JITed code.

Tested on x64 and arm64 hw with JIT on/off, blinding on/off. x64 jits bpf-to-bpf calls correctly while arm64 falls back to the interpreter. All other JITs that support normal BPF_CALL will behave the same way, since a bpf-to-bpf call is equivalent to a bpf-to-kernel call from the JIT's point of view.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
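The last-pass fixup described above can be sketched roughly as below. The types func_image and call_site and both helpers are hypothetical, and the sketch assumes each call immediate is stored as a 32-bit offset from a common base address (call_base), the way a helper call could be encoded; the actual encoding used by the JIT may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical: one JITed bpf function after the initial passes. */
struct func_image {
    uint64_t start; /* address of the JITed code */
    uint32_t size;  /* image size in bytes */
};

/* Hypothetical: one bpf-to-bpf call whose target must be patched. */
struct call_site {
    int     callee; /* index into the func_image array */
    int32_t imm;    /* call immediate, filled in by patch_calls() */
};

/* Once all start addresses are known, point every call at its callee.
 * Assumption: imm is an offset relative to call_base. */
static void patch_calls(struct call_site *calls, int ncalls,
                        const struct func_image *funcs, uint64_t call_base)
{
    for (int i = 0; i < ncalls; i++) {
        int64_t delta = (int64_t)(funcs[calls[i].callee].start - call_base);

        assert(delta >= INT32_MIN && delta <= INT32_MAX);
        calls[i].imm = (int32_t)delta;
    }
}

/* Safety check: the final pass must leave every image size unchanged,
 * otherwise the addresses patched in above would no longer be valid. */
static int final_pass_size_ok(const struct func_image *before,
                              const struct func_image *after, int nfuncs)
{
    for (int i = 0; i < nfuncs; i++)
        if (before[i].size != after[i].size)
            return 0;
    return 1;
}
```

In the three-function example, main() would carry one call_site for foo, and foo() one for bar; the helper call to bpf_map_lookup needs no patching because its address is known from the start.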
2017-12-17bpf: fix net.core.bpf_jit_enable raceAlexei Starovoitov
The global bpf_jit_enable variable is tested multiple times in the JITs, blinding and verifier core. A malicious root can try to toggle it while loading programs. This race condition was accounted for and there should be no issues, but it's safer to avoid the race altogether.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17bpf: add support for bpf_call to interpreterAlexei Starovoitov
Though bpf_call is still the same call instruction, and the calling convention for 'bpf to bpf' and 'bpf to helper' is the same, the interpreter has to operate on 'struct bpf_insn *'. To distinguish these two cases, add a kernel internal opcode and mark call insns with it. This opcode is seen by the interpreter only. JITs will never see it. Also add a tiny bit of debug code to aid interpreter debugging.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17selftests/bpf: add xdp noinline testAlexei Starovoitov
Add a large semi-artificial XDP test with 18 functions to stress test the bpf call verification logic.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17selftests/bpf: add bpf_call testAlexei Starovoitov
Strip always_inline from test_l4lb.c and compile it with -fno-inline to let the verifier go through 11 functions with various function arguments and return values.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17libbpf: add support for bpf_callAlexei Starovoitov
- recognize relocation emitted by llvm
- since all regular functions will be kept in the .text section and llvm takes care of pc-relative offsets in the bpf_call instruction, simply copy all of .text to the relevant program section while adjusting bpf_call instructions in the program section to point to the newly copied body of instructions from .text
- do so for all programs in the elf file
- set all program types to the one passed to bpf_prog_load()

Note: for elf files with multiple programs that use different functions in the .text section we need to do 'linker' style logic. This work is still TBD.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
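The copy-and-adjust step in the list above can be sketched under a few stated assumptions. The function append_text() and its parameters are hypothetical and do not reflect the libbpf API: the sketch assumes the relocation tells us which call instruction (call_idx) refers to which instruction of .text (text_idx), and that a call's target is its own index plus imm plus one. Calls inside .text are left untouched because their pc-relative offsets stay valid when the whole section is copied as a block.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Same field layout as struct bpf_insn; the name is local to this sketch. */
struct insn {
    uint8_t code;
    uint8_t dst_reg:4;
    uint8_t src_reg:4;
    int16_t off;
    int32_t imm;
};

/* Hypothetical helper: return a new array holding prog followed by a copy
 * of text, with the call at call_idx retargeted to text[text_idx].
 * The caller frees the result; NULL is returned on allocation failure. */
static struct insn *append_text(const struct insn *prog, int prog_cnt,
                                const struct insn *text, int text_cnt,
                                int call_idx, int text_idx)
{
    size_t total = (size_t)prog_cnt + (size_t)text_cnt;
    struct insn *out = malloc(total * sizeof(*out));

    if (!out)
        return NULL;
    assert(call_idx >= 0 && call_idx < prog_cnt);
    assert(text_idx >= 0 && text_idx < text_cnt);

    memcpy(out, prog, (size_t)prog_cnt * sizeof(*out));
    memcpy(out + prog_cnt, text, (size_t)text_cnt * sizeof(*out));

    /* The callee now lives at absolute index prog_cnt + text_idx. */
    out[call_idx].imm = (prog_cnt + text_idx) - call_idx - 1;
    return out;
}
```

For a 3-instruction program with a call at index 1 and a 2-instruction .text, the call ends up with imm = 1 and therefore targets index 3, the first copied instruction.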
2017-12-17selftests/bpf: add tests for stack_zero trackingAlexei Starovoitov
Adjust two tests, since the verifier got smarter, and add a new one to test the stack_zero logic.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17bpf: teach verifier to recognize zero initialized stackAlexei Starovoitov
Programs with function calls often pass various pointers via the stack. When all calls are inlined, llvm flattens stack accesses and optimizes away extra branches. When functions are not inlined, it becomes the job of the verifier to recognize a zero initialized stack to avoid exploring paths that the program will not take.

The following program would fail otherwise:

  ptr = &buffer_on_stack;
  *ptr = 0;
  ...
  func_call(.., ptr, ...) {
    if (..)
      *ptr = bpf_map_lookup();
  }
  ...
  if (*ptr != 0) {
    // Access (*ptr)->field is valid.
    // Without stack_zero tracking such (*ptr)->field access
    // will be rejected
  }

Since stack slots are no longer uniform (invalid | spill | misc), add liveness marking to all slots, but do it in 8 byte chunks. So if nothing was read or written in the [fp-16, fp-9] range, it will be marked as LIVE_NONE. If any byte in that range was read, it will be marked LIVE_READ and the stacksafe() check will perform byte-by-byte verification. If all bytes in the range were written, the slot will be marked as LIVE_WRITTEN.

This significantly speeds up state equality comparison and reduces the total number of states processed.

                       before   after
  bpf_lb-DLB_L3.o        2051    2003
  bpf_lb-DLB_L4.o        3287    3164
  bpf_lb-DUNKNOWN.o      1080    1080
  bpf_lxc-DDROP_ALL.o   24980   12361
  bpf_lxc-DUNKNOWN.o    34308   16605
  bpf_netdev.o          15404   10962
  bpf_overlay.o          7191    6679

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
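The idea of tracking zeroed stack bytes can be shown with a small model. This is a sketch, not the verifier's data structures: it assumes each byte of an 8-byte stack slot carries its own type, that writing a constant zero marks the bytes as zero, and that any other scalar write marks them as misc. The struct stack_slot and the helpers mark_write() and read_is_known_zero() are hypothetical.

```c
#include <stdint.h>

/* Per-byte stack types: invalid | spill | misc, plus the new zero type. */
enum stack_byte { STACK_INVALID, STACK_SPILL, STACK_MISC, STACK_ZERO };

#define SLOT_SIZE 8

/* Hypothetical model of one 8-byte stack slot. */
struct stack_slot {
    uint8_t type[SLOT_SIZE];
};

/* Record a write of size bytes at offset off within the slot.
 * is_zero != 0 means the stored value is known to be constant zero. */
static void mark_write(struct stack_slot *s, int off, int size, int is_zero)
{
    for (int i = off; i < off + size && i < SLOT_SIZE; i++)
        if (i >= 0)
            s->type[i] = is_zero ? STACK_ZERO : STACK_MISC;
}

/* A read yields a known zero only if every byte it covers is STACK_ZERO.
 * With that knowledge a branch such as "if (*ptr != 0)" can be resolved
 * without exploring the path the program will never take. */
static int read_is_known_zero(const struct stack_slot *s, int off, int size)
{
    if (off < 0 || size <= 0 || off + size > SLOT_SIZE)
        return 0;
    for (int i = off; i < off + size; i++)
        if (s->type[i] != STACK_ZERO)
            return 0;
    return 1;
}
```

In the example program above, "*ptr = 0" corresponds to mark_write() with is_zero set, and the later "*ptr != 0" test corresponds to a read_is_known_zero() query on the same slot.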
2017-12-17selftests/bpf: add verifier tests for bpf_callAlexei Starovoitov
Add extensive set of tests for bpf_call verification logic:

calls: basic sanity
calls: using r0 returned by callee
calls: callee is using r1
calls: callee using args1
calls: callee using wrong args2
calls: callee using two args
calls: callee changing pkt pointers
calls: two calls with args
calls: two calls with bad jump
calls: recursive call. test1
calls: recursive call. test2
calls: unreachable code
calls: invalid call
calls: jumping across function bodies. test1
calls: jumping across function bodies. test2
calls: call without exit
calls: call into middle of ld_imm64
calls: call into middle of other call
calls: two calls with bad fallthrough
calls: two calls with stack read
calls: two calls with stack write
calls: spill into caller stack frame
calls: two calls with stack write and void return
calls: ambiguous return value
calls: two calls that return map_value
calls: two calls that return map_value with bool condition
calls: two calls that return map_value with incorrect bool check
calls: two calls that receive map_value via arg=ptr_stack_of_caller. test1
calls: two calls that receive map_value via arg=ptr_stack_of_caller. test2
calls: two jumps that receive map_value via arg=ptr_stack_of_jumper. test3
calls: two calls that receive map_value_ptr_or_null via arg. test1
calls: two calls that receive map_value_ptr_or_null via arg. test2
calls: pkt_ptr spill into caller stack

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2017-12-17bpf: introduce function calls (verification)Alexei Starovoitov
Allow arbitrary function calls from bpf function to another bpf function.

To recognize such a set of bpf functions the verifier does the following:

1. runs control flow analysis to detect function boundaries
2. proceeds with verification of all functions starting from the main (root) function. It recognizes that the stack of the caller can be accessed by the callee (if the caller passed a pointer to its stack to the callee) and the callee can store map_value and other pointers into the stack of the caller.
3. keeps track of the stack_depth of each function to make sure that the total stack depth is still less than 512 bytes
4. disallows pointers to the callee stack to be stored into the caller stack, since they will be invalid as soon as the callee returns
5. to reuse all of the existing state_pruning logic, each function call is considered to be an independent call from the verifier's point of view. The verifier pretends to inline every function call it sees. It stores the callsite instruction index as part of the state to make sure that two calls to the same callee from two different places in the caller will be different from the state pruning point of view.
6. more safety checks are added to liveness analysis

Implementation details:
. struct bpf_verifier_state now consists of all stack frames that led to this function
. struct bpf_func_state represents one stack frame. It consists of the registers in the given frame and its stack
. propagate_liveness() logic had a premature optimization where mark_reg_read() and mark_stack_slot_read() were manually inlined with a loop iterating over parents for each register or stack slot. Undo this optimization to reuse the more complex mark_*_read() logic
. skip_callee() logic is not necessary from a safety point of view, but without it mark_*_read() markings become too conservative, since after returning from the function call a read of r6-r9 will incorrectly propagate the read marks into the callee, causing inefficient pruning later
. mark_*_read() logic is now aware of control flow, which makes it more complex. In the future the plan is to rewrite liveness to be hierarchical, so that liveness can be done within a basic block only and control flow will be responsible for propagation of liveness information along the cfg and between calls.
. tail_calls and ld_abs insns are not allowed in programs with bpf-to-bpf calls
. returning stack pointers to the caller or storing them into the stack frame of the caller is not allowed

Testing:
. no difference in cilium processed_insn numbers
. large number of tests follows in next patches

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
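Two of the checks in the numbered list (the stack depth limit and the rule about callee stack pointers) can be sketched in a few lines. Both helpers are hypothetical and simplified. The sketch assumes frames are numbered from 0 for main() upward along the call chain, and it treats a total of exactly 512 bytes as acceptable, which is an assumption about where the boundary lies.

```c
#define MAX_BPF_STACK 512

/* Hypothetical check for rule 3: depth[i] is the stack depth in bytes of
 * the i-th function on one call chain, starting with main(). Returns 1
 * if the combined depth stays within the limit, 0 otherwise. */
static int chain_stack_ok(const int *depth, int nframes)
{
    int total = 0;

    for (int i = 0; i < nframes; i++) {
        total += depth[i];
        if (total > MAX_BPF_STACK)
            return 0;
    }
    return 1;
}

/* Hypothetical check for rule 4: a pointer into the stack of frame
 * ptr_frame may be stored into the stack of frame dst_frame only if the
 * pointed-to frame does not return first. A callee has a higher frame
 * number than its caller, so callee pointers must not reach the caller. */
static int stack_ptr_store_ok(int ptr_frame, int dst_frame)
{
    return ptr_frame <= dst_frame;
}
```

For example, a chain using 256, 128 and 64 bytes passes, while 256, 256 and 64 bytes does not; a pointer to the stack of main() (frame 0) may be kept in a callee's frame, but not the other way around.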