summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-01-26x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) supportDavid Woodhouse
Expose indirect_branch_prediction_barrier() for use in subsequent patches. [ tglx: Add IBPB status to spectre_v2 sysfs file ] Co-developed-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-8-git-send-email-dwmw@amazon.co.uk
2018-01-26x86/cpufeature: Blacklist SPEC_CTRL/PRED_CMD on early Spectre v2 microcodesDavid Woodhouse
This doesn't refuse to load the affected microcodes; it just refuses to use the Spectre v2 mitigation features if they're detected, by clearing the appropriate feature bits. The AMD CPUID bits are handled here too, because hypervisors *may* have been exposing those bits even on Intel chips, for fine-grained control of what's available. It is non-trivial to use x86_match_cpu() for this table because that doesn't handle steppings. And the approach taken in commit bd9240a18 almost made me lose my lunch. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-7-git-send-email-dwmw@amazon.co.uk
2018-01-26x86/pti: Do not enable PTI on CPUs which are not vulnerable to MeltdownDavid Woodhouse
Also, for CPUs which don't speculate at all, don't report that they're vulnerable to the Spectre variants either. Leave the cpu_no_meltdown[] match table with just X86_VENDOR_AMD in it for now, even though that could be done with a simple comparison, on the assumption that we'll have more to add. Based on suggestions from Dave Hansen and Alan Cox. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Acked-by: Dave Hansen <dave.hansen@intel.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-6-git-send-email-dwmw@amazon.co.uk
2018-01-26x86/msr: Add definitions for new speculation control MSRsDavid Woodhouse
Add MSR and bit definitions for SPEC_CTRL, PRED_CMD and ARCH_CAPABILITIES. See Intel's 336996-Speculative-Execution-Side-Channel-Mitigations.pdf Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-5-git-send-email-dwmw@amazon.co.uk
2018-01-26x86/cpufeatures: Add AMD feature bits for Speculation ControlDavid Woodhouse
AMD exposes the PRED_CMD/SPEC_CTRL MSRs slightly differently to Intel. See http://lkml.kernel.org/r/2b3e25cc-286d-8bd0-aeaf-9ac4aae39de8@amd.com Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-4-git-send-email-dwmw@amazon.co.uk
2018-01-26x86/cpufeatures: Add Intel feature bits for Speculation ControlDavid Woodhouse
Add three feature bits exposed by new microcode on Intel CPUs for speculation control. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-3-git-send-email-dwmw@amazon.co.uk
2018-01-26x86/cpufeatures: Add CPUID_7_EDX CPUID leafDavid Woodhouse
This is a pure feature bits leaf. There are two AVX512 feature bits in it already which were handled as scattered bits, and three more from this leaf are going to be added for speculation control features. Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: gnomes@lxorguk.ukuu.org.uk Cc: ak@linux.intel.com Cc: ashok.raj@intel.com Cc: dave.hansen@intel.com Cc: karahmed@amazon.de Cc: arjan@linux.intel.com Cc: torvalds@linux-foundation.org Cc: peterz@infradead.org Cc: bp@alien8.de Cc: pbonzini@redhat.com Cc: tim.c.chen@linux.intel.com Cc: gregkh@linux-foundation.org Link: https://lkml.kernel.org/r/1516896855-7642-2-git-send-email-dwmw@amazon.co.uk
2018-01-26module/retpoline: Warn about missing retpoline in moduleAndi Kleen
There's a risk that a kernel which has full retpoline mitigations becomes vulnerable when a module gets loaded that hasn't been compiled with the right compiler or the right option. To enable detection of that mismatch at module load time, add a module info string "retpoline" at build time when the module was compiled with retpoline support. This only covers compiled C source, but assembler source or prebuilt object files are not checked. If a retpoline enabled kernel detects a non retpoline protected module at load time, print a warning and report it in the sysfs vulnerability file. [ tglx: Massaged changelog ] Signed-off-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw2@infradead.org> Cc: gregkh@linuxfoundation.org Cc: torvalds@linux-foundation.org Cc: jeyu@kernel.org Cc: arjan@linux.intel.com Link: https://lkml.kernel.org/r/20180125235028.31211-1-andi@firstfloor.org
2018-01-26Merge branch 'linux-4.15' of git://github.com/skeggsb/linux into drm-fixesDave Airlie
Single irq regression fix * 'linux-4.15' of git://github.com/skeggsb/linux: drm/nouveau: Move irq setup/teardown to pci ctor/dtor
2018-01-25net: vrf: Add support for sends to local broadcast addressDavid Ahern
Sukumar reported that sends to the local broadcast address (255.255.255.255) are broken. Check for the address in vrf driver and do not redirect to the VRF device - similar to multicast packets. With this change sockets can use SO_BINDTODEVICE to specify an egress interface and receive responses. Note: the egress interface can not be a VRF device but needs to be the enslaved device. https://bugzilla.kernel.org/show_bug.cgi?id=198521 Reported-by: Sukumar Gopalakrishnan <sukumarg1973@gmail.com> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25r8169: fix memory corruption on retrieval of hardware statistics.Francois Romieu
Hardware statistics retrieval hurts in tight invocation loops. Avoid extraneous write and enforce strict ordering of writes targeted to the tally counters dump area address registers. Signed-off-by: Francois Romieu <romieu@fr.zoreil.com> Tested-by: Oliver Freyermuth <o.freyermuth@googlemail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input fixes from Dmitry Torokhov: "The main item is that we try to better handle the newer trackpoints on Lenovo devices that are now being produced by Elan/ALPS/NXP and only implement a small subset of the original IBM trackpoint controls" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Revert "Input: synaptics_rmi4 - use devm_device_add_group() for attributes in F01" Input: trackpoint - only expose supported controls for Elan, ALPS and NXP Input: trackpoint - force 3 buttons if 0 button is reported Input: xpad - add support for PDP Xbox One controllers Input: stmfts,s6sy671 - add SPDX identifier
2018-01-25orangefs: fix deadlock; do not write i_size in read_iterMartin Brandenburg
After do_readv_writev, the inode cache is invalidated anyway, so i_size will never be read. It will be fetched from the server which will also know about updates from other machines. Fixes deadlock on 32-bit SMP. See https://marc.info/?l=linux-fsdevel&m=151268557427760&w=2 Signed-off-by: Martin Brandenburg <martin@omnibond.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Mike Marshall <hubcap@omnibond.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-26drm/nouveau: Move irq setup/teardown to pci ctor/dtorLyude Paul
For a while we've been having issues with seemingly random interrupts coming from nvidia cards when resuming them. Originally the fix for this was thought to be just re-arming the MSI interrupt registers right after re-allocating our IRQs, however it seems a lot of what we do is both wrong and not even nessecary. This was made apparent by what appeared to be a regression in the mainline kernel that started introducing suspend/resume issues for nouveau: a0c9259dc4e1 (irq/matrix: Spread interrupts on allocation) After this commit was introduced, we started getting interrupts from the GPU before we actually re-allocated our own IRQ (see references below) and assigned the IRQ handler. Investigating this turned out that the problem was not with the commit, but the fact that nouveau even free/allocates it's irqs before and after suspend/resume. For starters: drivers in the linux kernel haven't had to handle freeing/re-allocating their IRQs during suspend/resume cycles for quite a while now. Nouveau seems to be one of the few drivers left that still does this, despite the fact there's no reason we actually need to since disabling interrupts from the device side should be enough, as the kernel is already smart enough to know to disable host-side interrupts for us before going into suspend. Since we were tearing down our IRQs by hand however, that means there was a short period during resume where interrupts could be received before we re-allocated our IRQ which would lead to us getting an unhandled IRQ. Since we never handle said IRQ and re-arm the interrupt registers, this would cause us to miss all of the interrupts from the GPU and cause our init process to start timing out on anything requiring interrupts. So, since this whole setup/teardown every suspend/resume cycle is useless anyway, move irq setup/teardown into the pci subdev's ctor/dtor functions instead so they're only called at driver load and driver unload. This should fix most of the issues with pending interrupts on resume, along with getting suspend/resume for nouveau to work again. As well, this probably means we can also just remove the msi rearm call inside nvkm_pci_init(). But since our main focus here is to fix suspend/resume before 4.15, we'll save that for a later patch. Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: Karol Herbst <kherbst@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2018-01-25net: don't call update_pmtu unconditionallyNicolas Dichtel
Some dst_ops (e.g. md_dst_ops)) doesn't set this handler. It may result to: "BUG: unable to handle kernel NULL pointer dereference at (null)" Let's add a helper to check if update_pmtu is available before calling it. Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path") Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path") CC: Roman Kapl <code@rkapl.cz> CC: Xin Long <lucien.xin@gmail.com> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM fixes from Radim Krčmář: "Fix races and a potential use after free in the s390 cmma migration code" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: s390: add proper locking for CMMA migration bitmap
2018-01-25Merge tag 'for-4.15-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fix from David Sterba: "It's been reported recently that readdir can list stale entries under some conditions. Fix it." * tag 'for-4.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: Btrfs: fix stale entries in readdir
2018-01-25net: tcp: close sock if net namespace is exitingDan Streetman
When a tcp socket is closed, if it detects that its net namespace is exiting, close immediately and do not wait for FIN sequence. For normal sockets, a reference is taken to their net namespace, so it will never exit while the socket is open. However, kernel sockets do not take a reference to their net namespace, so it may begin exiting while the kernel socket is still open. In this case if the kernel socket is a tcp socket, it will stay open trying to complete its close sequence. The sock's dst(s) hold a reference to their interface, which are all transferred to the namespace's loopback interface when the real interfaces are taken down. When the namespace tries to take down its loopback interface, it hangs waiting for all references to the loopback interface to release, which results in messages like: unregister_netdevice: waiting for lo to become free. Usage count = 1 These messages continue until the socket finally times out and closes. Since the net namespace cleanup holds the net_mutex while calling its registered pernet callbacks, any new net namespace initialization is blocked until the current net namespace finishes exiting. After this change, the tcp socket notices the exiting net namespace, and closes immediately, releasing its dst(s) and their reference to the loopback interface, which lets the net namespace continue exiting. Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811 Signed-off-by: Dan Streetman <ddstreet@canonical.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25perf/x86: Fix perf,x86,cpuhp deadlockPeter Zijlstra
More lockdep gifts, a 5-way lockup race: perf_event_create_kernel_counter() perf_event_alloc() perf_try_init_event() x86_pmu_event_init() __x86_pmu_event_init() x86_reserve_hardware() #0 mutex_lock(&pmc_reserve_mutex); reserve_ds_buffer() #1 get_online_cpus() perf_event_release_kernel() _free_event() hw_perf_event_destroy() x86_release_hardware() #0 mutex_lock(&pmc_reserve_mutex) release_ds_buffer() #1 get_online_cpus() #1 do_cpu_up() perf_event_init_cpu() #2 mutex_lock(&pmus_lock) #3 mutex_lock(&ctx->mutex) sys_perf_event_open() mutex_lock_double() #3 mutex_lock(ctx->mutex) #4 mutex_lock_nested(ctx->mutex, 1); perf_try_init_event() #4 mutex_lock_nested(ctx->mutex, 1) x86_pmu_event_init() intel_pmu_hw_config() x86_add_exclusive() #0 mutex_lock(&pmc_reserve_mutex) Fix it by using ordering constructs instead of locking. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-25perf/core: Fix ctx::mutex deadlockPeter Zijlstra
Lockdep noticed the following 3-way lockup scenario: sys_perf_event_open() perf_event_alloc() perf_try_init_event() #0 ctx = perf_event_ctx_lock_nested(1) perf_swevent_init() swevent_hlist_get() #1 mutex_lock(&pmus_lock) perf_event_init_cpu() #1 mutex_lock(&pmus_lock) #2 mutex_lock(&ctx->mutex) sys_perf_event_open() mutex_lock_double() #2 mutex_lock() #0 mutex_lock_nested() And while we need that perf_event_ctx_lock_nested() for HW PMUs such that they can iterate the sibling list, trying to match it to the available counters, the software PMUs need do no such thing. Exclude them. In particular the swevent triggers the above invertion, while the tpevent PMU triggers a more elaborate one through their event_mutex. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-25perf/core: Fix another perf,trace,cpuhp lock inversionPeter Zijlstra
Lockdep noticed the following 3-way lockup race: perf_trace_init() #0 mutex_lock(&event_mutex) perf_trace_event_init() perf_trace_event_reg() tp_event->class->reg() := tracepoint_probe_register #1 mutex_lock(&tracepoints_mutex) trace_point_add_func() #2 static_key_enable() #2 do_cpu_up() perf_event_init_cpu() #3 mutex_lock(&pmus_lock) #4 mutex_lock(&ctx->mutex) perf_ioctl() #4 ctx = perf_event_ctx_lock() _perf_iotcl() ftrace_profile_set_filter() #0 mutex_lock(&event_mutex) Fudge it for now by noting that the tracepoint state does not depend on the event <-> context relation. Ugly though :/ Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-25perf/core: Fix lock inversion between perf,trace,cpuhpPeter Zijlstra
Lockdep gifted us with noticing the following 4-way lockup scenario: perf_trace_init() #0 mutex_lock(&event_mutex) perf_trace_event_init() perf_trace_event_reg() tp_event->class->reg() := tracepoint_probe_register #1 mutex_lock(&tracepoints_mutex) trace_point_add_func() #2 static_key_enable() #2 do_cpu_up() perf_event_init_cpu() #3 mutex_lock(&pmus_lock) #4 mutex_lock(&ctx->mutex) perf_event_task_disable() mutex_lock(&current->perf_event_mutex) #4 ctx = perf_event_ctx_lock() #5 perf_event_for_each_child() do_exit() task_work_run() __fput() perf_release() perf_event_release_kernel() #4 mutex_lock(&ctx->mutex) #5 mutex_lock(&event->child_mutex) free_event() _free_event() event->destroy() := perf_trace_destroy #0 mutex_lock(&event_mutex); Fix that by moving the free_event() out from under the locks. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2018-01-25KVM: VMX: Make indirect call speculation safePeter Zijlstra
Replace indirect call with CALL_NOSPEC. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: rga@amazon.de Cc: Dave Hansen <dave.hansen@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Link: https://lkml.kernel.org/r/20180125095843.645776917@infradead.org
2018-01-25KVM: x86: Make indirect calls in emulator speculation safePeter Zijlstra
Replace the indirect calls with CALL_NOSPEC. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: David Woodhouse <dwmw@amazon.co.uk> Cc: Andrea Arcangeli <aarcange@redhat.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Greg KH <gregkh@linuxfoundation.org> Cc: Jun Nakajima <jun.nakajima@intel.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: rga@amazon.de Cc: Dave Hansen <dave.hansen@intel.com> Cc: Asit Mallick <asit.k.mallick@intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Jason Baron <jbaron@akamai.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Arjan Van De Ven <arjan.van.de.ven@intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Link: https://lkml.kernel.org/r/20180125095843.595615683@infradead.org
2018-01-25Merge tag 'drm-misc-fixes-2018-01-24' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes Two vc4 fixes that were applied in the last day. One fixes a NULL dereference, and the other fixes a flickering bug. Cc: Eric Anholt <eric@anholt.net> Cc: Boris Brezillon <boris.brezillon@free-electrons.com> * tag 'drm-misc-fixes-2018-01-24' of git://anongit.freedesktop.org/drm/drm-misc: drm/vc4: Fix NULL pointer dereference in vc4_save_hang_state() drm/vc4: Flush the caches before the bin jobs, as well.
2018-01-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netLinus Torvalds
Pull networking fixes from David Miller: 1) Avoid negative netdev refcount in error flow of xfrm state add, from Aviad Yehezkel. 2) Fix tcpdump decoding of IPSEC decap'd frames by filling in the ethernet header protocol field in xfrm{4,6}_mode_tunnel_input(). From Yossi Kuperman. 3) Fix a syzbot triggered skb_under_panic in pppoe having to do with failing to allocate an appropriate amount of headroom. From Guillaume Nault. 4) Fix memory leak in vmxnet3 driver, from Neil Horman. 5) Cure out-of-bounds packet memory access in em_nbyte EMATCH module, from Wolfgang Bumiller. 6) Restrict what kinds of sockets can be bound to the KCM multiplexer and also disallow when another layer has attached to the socket and made use of sk_user_data. From Tom Herbert. 7) Fix use before init of IOTLB in vhost code, from Jason Wang. 8) Correct STACR register write bit definition in IBM emac driver, from Ivan Mikhaylov. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: net/ibm/emac: wrong bit is used for STA control register write net/ibm/emac: add 8192 rx/tx fifo size vhost: do not try to access device IOTLB when not initialized vhost: use mutex_lock_nested() in vhost_dev_lock_vqs() i40e: flower: check if TC offload is enabled on a netdev qed: Free reserved MR tid qed: Remove reserveration of dpi for kernel kcm: Check if sk_user_data already set in kcm_attach kcm: Only allow TCP sockets to be attached to a KCM mux net: sched: fix TCF_LAYER_LINK case in tcf_get_base_ptr net: sched: em_nbyte: don't add the data offset twice mlxsw: spectrum_router: Don't log an error on missing neighbor vmxnet3: repair memory leak ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL pppoe: take ->needed_headroom of lower device into account on xmit xfrm: fix boolean assignment in xfrm_get_type_offload xfrm: Fix eth_hdr(skb)->h_proto to reflect inner IP version xfrm: fix error flow in case of add state fails xfrm: Add SA to hardware at the end of xfrm_state_construct()
2018-01-24Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparcLinus Torvalds
Pull sparc bugfix from David Miller: "Sparc Makefile typo fix" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc64: fix typo in CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64
2018-01-24net/ibm/emac: wrong bit is used for STA control register writeIvan Mikhaylov
STA control register has areas of mode and opcodes for opeations. 18 bit is using for mode selection, where 0 is old MIO/MDIO access method and 1 is indirect access mode. 19-20 bits are using for setting up read/write operation(STA opcodes). In current state 'read' is set into old MIO/MDIO mode with 19 bit and write operation is set into 18 bit which is mode selection, not a write operation. To correlate write with read we set it into 20 bit. All those bit operations are MSB 0 based. Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24net/ibm/emac: add 8192 rx/tx fifo sizeIvan Mikhaylov
emac4syn chips has availability to use 8192 rx/tx fifo buffer sizes, in current state if we set it up in dts 8192 as example, we will get only 2048 which may impact on network speed. Signed-off-by: Ivan Mikhaylov <ivan@de.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24Revert "Input: synaptics_rmi4 - use devm_device_add_group() for attributes ↵Nick Dyer
in F01" Since the sysfs attribute hangs off the RMI bus, which doesn't go away during firmware flash, it needs to be explicitly removed, otherwise we would try and register the same attribute twice. This reverts commit 36a44af5c176d619552d99697433261141dd1296. Signed-off-by: Nick Dyer <nick@shmanahar.org> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2018-01-24vhost: do not try to access device IOTLB when not initializedJason Wang
The code will try to access dev->iotlb when processing VHOST_IOTLB_INVALIDATE even if it was not initialized which may lead to NULL pointer dereference. Fixes this by check dev->iotlb before. Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API") Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24vhost: use mutex_lock_nested() in vhost_dev_lock_vqs()Jason Wang
We used to call mutex_lock() in vhost_dev_lock_vqs() which tries to hold mutexes of all virtqueues. This may confuse lockdep to report a possible deadlock because of trying to hold locks belong to same class. Switch to use mutex_lock_nested() to avoid false positive. Fixes: 6b1e6cc7855b0 ("vhost: new device IOTLB API") Reported-by: syzbot+dbb7c1161485e61b0241@syzkaller.appspotmail.com Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24i40e: flower: check if TC offload is enabled on a netdevJakub Kicinski
Since TC block changes drivers are required to check if the TC hw offload flag is set on the interface themselves. Fixes: 2f4b411a3d67 ("i40e: Enable cloud filters via tc-flower") Fixes: 44ae12a768b7 ("net: sched: move the can_offload check from binding phase to rule insertion phase") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Jiri Pirko <jiri@mellanox.com> Acked-by: Amritha Nambiar <amritha.nambiar@intel.com> Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24sparc64: fix typo in CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64Corentin Labbe
This patch fixes the typo CONFIG_CRYPTO_DES_SPARC64 => CONFIG_CRYPTO_CAMELLIA_SPARC64 Fixes: 81658ad0d923 ("sparc64: Add CAMELLIA driver making use of the new camellia opcodes.") Signed-off-by: Corentin Labbe <clabbe@baylibre.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24Merge branch 'qed-rdma-bug-fixes'David S. Miller
Michal Kalderon says: ==================== qed: rdma bug fixes This patch contains two small bug fixes related to RDMA. Both related to resource reservations. ==================== Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24qed: Free reserved MR tidMichal Kalderon
A tid was allocated for reserved MR during initialization but not freed. This lead to an annoying output message during rdma unload flow. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24qed: Remove reserveration of dpi for kernelMichal Kalderon
Double reservation for kernel dedicated dpi was performed. Once in the core module and once in qedr. Remove the reservation from core. Signed-off-by: Michal Kalderon <Michal.Kalderon@cavium.com> Signed-off-by: Ariel Elior <Ariel.Elior@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24Merge branch 'kcm-fix-two-syzcaller-issues'David S. Miller
Tom Herbert says: ==================== kcm: fix two syzcaller issues In this patch set: - Don't allow attaching non-TCP or listener sockets to a KCM mux. - In kcm_attach Check if sk_user_data is already set. This is under lock to avoid race conditions. More work is need to make all of the users of sk_user_data to use the same locking. - v2 Remove unncessary check for not PF_KCM in kcm_attach (suggested by Guillaume Nault) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24kcm: Check if sk_user_data already set in kcm_attachTom Herbert
This is needed to prevent sk_user_data being overwritten. The check is done under the callback lock. This should prevent a socket from being attached twice to a KCM mux. It also prevents a socket from being attached for other use cases of sk_user_data as long as the other cases set sk_user_data under the lock. Followup work is needed to unify all the use cases of sk_user_data to use the same locking. Reported-by: syzbot+114b15f2be420a8886c3@syzkaller.appspotmail.com Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Signed-off-by: Tom Herbert <tom@quantonium.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24kcm: Only allow TCP sockets to be attached to a KCM muxTom Herbert
TCP sockets for IPv4 and IPv6 that are not listeners or in closed stated are allowed to be attached to a KCM mux. Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot+8865eaff7f9acd593945@syzkaller.appspotmail.com Signed-off-by: Tom Herbert <tom@quantonium.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-25MAINTAINERS: update email address for James MorrisJames Morris
Update my email address. Signed-off-by: James Morris <jmorris@namei.org>
2018-01-24net: sched: fix TCF_LAYER_LINK case in tcf_get_base_ptrWolfgang Bumiller
TCF_LAYER_LINK and TCF_LAYER_NETWORK returned the same pointer as skb->data points to the network header. Use skb_mac_header instead. Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24net: sched: em_nbyte: don't add the data offset twiceWolfgang Bumiller
'ptr' is shifted by the offset and then validated, the memcmp should not add it a second time. Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24Btrfs: fix stale entries in readdirJosef Bacik
In fixing the readdir+pagefault deadlock I accidentally introduced a stale entry regression in readdir. If we get close to full for the temporary buffer, and then skip a few delayed deletions, and then try to add another entry that won't fit, we will emit the entries we found and retry. Unfortunately we delete entries from our del_list as we find them, assuming we won't need them. However our pos will be with whatever our last entry was, which could be before the delayed deletions we skipped, so the next search will add the deleted entries back into our readdir buffer. So instead don't delete entries we find in our del_list so we can make sure we always find our delayed deletions. This is a slight perf hit for readdir with lots of pending deletions, but hopefully this isn't a common occurrence. If it is we can revist this and optimize it. cc: stable@vger.kernel.org Fixes: 23b5ec74943f ("btrfs: fix readdir deadlock with pagefault") Signed-off-by: Josef Bacik <jbacik@fb.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-01-24Merge tag 'trace-v4.15-rc9' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull tracing fixes from Steven Rostedt: "With the new ORC unwinder, ftrace stack tracing became disfunctional. One was that ORC didn't know how to handle the ftrace callbacks in general (which Josh fixed). The other was that ORC would just bail if it hit a dynamically allocated trampoline. Which means all ftrace stack tracing that happens from the function tracer would produce no results (that includes killing the max stack size tracer). I added a check to the ORC unwinder to see if the trampoline belonged to ftrace, and if it did, use the orc entry of the static trampoline that was used to create the dynamic one (it would be identical). Finally, I noticed that the skip values of the stack tracing were out of whack. I went through and fixed them up" * tag 'trace-v4.15-rc9' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: tracing: Update stack trace skipping for ORC unwinder ftrace, orc, x86: Handle ftrace dynamically allocated trampolines x86/ftrace: Fix ORC unwinding from ftrace handlers
2018-01-24MAINTAINERS: clarify that only verified bugs should be submitted to security@Willy Tarreau
We're seeing a raise of automated reports from testing tools and reports about address leaks that are not really exploitable as-is, many of which do not represent an immediate risk justifying to work in closed places. Signed-off-by: Willy Tarreau <w@1wt.eu> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-24Revert "module: Add retpoline tag to VERMAGIC"Greg Kroah-Hartman
This reverts commit 6cfb521ac0d5b97470883ff9b7facae264b7ab12. Turns out distros do not want to make retpoline as part of their "ABI", so this patch should not have been merged. Sorry Andi, this was my fault, I suggested it when your original patch was the "correct" way of doing this instead. Reported-by: Jiri Kosina <jikos@kernel.org> Fixes: 6cfb521ac0d5 ("module: Add retpoline tag to VERMAGIC") Acked-by: Andi Kleen <ak@linux.intel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: rusty@rustcorp.com.au Cc: arjan.van.de.ven@intel.com Cc: jeyu@kernel.org Cc: stable <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-01-24mlxsw: spectrum_router: Don't log an error on missing neighborYuval Mintz
Driver periodically samples all neighbors configured in device in order to update the kernel regarding their state. When finding an entry configured in HW that doesn't show in neigh_lookup() driver logs an error message. This introduces a race when removing multiple neighbors - it's possible that a given entry would still be configured in HW as its removal is still being processed but is already removed from the kernel's neighbor tables. Simply remove the error message and gracefully accept such events. Fixes: c723c735fa6b ("mlxsw: spectrum_router: Periodically update the kernel's neigh table") Fixes: 60f040ca11b9 ("mlxsw: spectrum_router: Periodically dump active IPv6 neighbours") Signed-off-by: Yuval Mintz <yuvalm@mellanox.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Signed-off-by: Jiri Pirko <jiri@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2018-01-24 1) Only offloads SAs after they are fully initialized. Otherwise a NIC may receive packets on a SA we can not yet handle in the stack. From Yossi Kuperman. 2) Fix negative refcount in case of a failing offload. From Aviad Yehezkel. 3) Fix inner IP ptoro version when decapsulating from interaddress family tunnels. From Yossi Kuperman. 4) Use true or false for boolean variables instead of an integer value in xfrm_get_type_offload. From Gustavo A. R. Silva. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2018-01-24Merge tag 'kvm-s390-master-4.15-3' of ↵Radim Krčmář
git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux KVM: s390: another fix for cmma migration This fixes races and potential use after free in the cmma migration code.