summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2018-04-18MIPS: uaccess: Add micromips clobbers to bzero invocationMatt Redfearn
The micromips implementation of bzero additionally clobbers registers t7 & t8. Specify this in the clobbers list when invoking bzero. Fixes: 26c5e07d1478 ("MIPS: microMIPS: Optimise 'memset' core library function.") Reported-by: James Hogan <jhogan@kernel.org> Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: <stable@vger.kernel.org> # 3.10+ Patchwork: https://patchwork.linux-mips.org/patch/19110/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-04-18MIPS: memset.S: Fix clobber of v1 in last_fixupMatt Redfearn
The label .Llast_fixup\@ is jumped to on page fault within the final byte set loop of memset (on < MIPSR6 architectures). For some reason, in this fault handler, the v1 register is randomly set to a2 & STORMASK. This clobbers v1 for the calling function. This can be observed with the following test code: static int __init __attribute__((optimize("O0"))) test_clear_user(void) { register int t asm("v1"); char *test; int j, k; pr_info("\n\n\nTesting clear_user\n"); test = vmalloc(PAGE_SIZE); for (j = 256; j < 512; j++) { t = 0xa5a5a5a5; if ((k = clear_user(test + PAGE_SIZE - 256, j)) != j - 256) { pr_err("clear_user (%px %d) returned %d\n", test + PAGE_SIZE - 256, j, k); } if (t != 0xa5a5a5a5) { pr_err("v1 was clobbered to 0x%x!\n", t); } } return 0; } late_initcall(test_clear_user); Which demonstrates that v1 is indeed clobbered (MIPS64): Testing clear_user v1 was clobbered to 0x1! v1 was clobbered to 0x2! v1 was clobbered to 0x3! v1 was clobbered to 0x4! v1 was clobbered to 0x5! v1 was clobbered to 0x6! v1 was clobbered to 0x7! Since the number of bytes that could not be set is already contained in a2, the andi placing a value in v1 is not necessary and actively harmful in clobbering v1. Reported-by: James Hogan <jhogan@kernel.org> Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/19109/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-04-18Merge tag 'ceph-for-4.17-rc2' of git://github.com/ceph/ceph-clientLinus Torvalds
Pull ceph fixes from Ilya Dryomov: "A couple of follow-up patches for -rc1 changes in rbd, support for a timeout on waiting for the acquisition of exclusive lock and a fix for uninitialized memory access in CephFS, marked for stable" * tag 'ceph-for-4.17-rc2' of git://github.com/ceph/ceph-client: rbd: notrim map option rbd: adjust queue limits for "fancy" striping rbd: avoid Wreturn-type warnings ceph: always update atime/mtime/ctime for new inode rbd: support timeout in rbd_wait_state_locked() rbd: refactor rbd_wait_state_locked()
2018-04-18perf test BPF: Fixup BPF test using epoll_pwait syscall function probeArnaldo Carvalho de Melo
Since e145242ea0df ("syscalls/core, syscalls/x86: Clean up syscall stub naming convention") changed the main syscall function for 'epoll_pwait' to something other than the expected 'SyS_epoll_pwait the' 'perf test BPF' entries started failing, fix it by using something called from the main syscall function instead, 'epoll_wait', which should keep this test working in older kernels too. Before: # perf test BPF 40: BPF filter : 40.1: Basic BPF filtering : FAILED! 40.2: BPF pinning : Skip 40.3: BPF prologue generation : Skip 40.4: BPF relocation checker : Skip If we use -v for that test we see the problem: Probe point 'SyS_epoll_pwait' not found. After: # perf test BPF 40: BPF filter : 40.1: Basic BPF filtering : Ok 40.2: BPF pinning : Ok 40.3: BPF prologue generation : Ok 40.4: BPF relocation checker : Ok # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/r/tip-y24nmn70cs2am8jh4i344dng@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-18perf tests mmap: Show which tracepoint is failingArnaldo Carvalho de Melo
In the 'perf test "mmap interface"' we try creating events for several tracepoints, but when perf_evsel__new() fails we're not showing which one is failing, fix that to help diagnosing problems, such as the syscall tracepoints ones being found and fixes in this merge window. Now the failing tests shows: # perf test -v "mmap interface" 4: Read samples using the mmap interface : --- start --- test child forked, pid 14311 <SNIP> perf_evsel__new(sys_enter_getppid) test child finished with -1 ---- end ---- Read samples using the mmap interface: FAILED! # Now to check why the syscalls:sys_enter_getppid is failing... # ls -la /sys/kernel/debug/tracing/events/syscalls/sys_enter_getppid ls: cannot access '/sys/kernel/debug/tracing/events/syscalls/sys_enter_getppid': No such file or directory # Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-44xk0ycdzrfzx1o9rklf5itl@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-18perf tools: Add '\n' at the end of parse-options error messagesRavi Bangoria
Few error messages does not have '\n' at the end and thus next prompt gets printed in the same line. Ex, linux~$ perf buildid-cache -verbose --add ./a.out Error: did you mean `--verbose` (with two dashes ?)linux~$ Fix it. Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com> Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Kate Stewart <kstewart@linuxfoundation.org> Cc: Krister Johansen <kjlx@templeofstupid.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Philippe Ombredanne <pombredanne@nexb.com> Cc: Sihyeon Jang <uneedsihyeon@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20180417041346.5617-2-ravi.bangoria@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-18perf record: Remove suggestion to enable APICAndi Kleen
'perf record' suggests to enable the APIC on errors. APIC is practically always used today and the problem is usually somewhere else. Just remove the outdated suggestion. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20180406203812.3087-5-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-18perf record: Remove misleading error suggestionAndi Kleen
When perf record encounters an error setting up an event it suggests to enable CONFIG_PERF_EVENTS. This is misleading because: - Usually it is enabled (it is really hard to disable on x86) - The problem is usually somewhere else, e.g. the CPU is not supported or an invalid configuration has been used. Remove the misleading suggestion. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20180406203812.3087-4-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-18perf hists browser: Clarify top/report browser helpAndi Kleen
Clarify in the browser help that ESC in tui mode may go back to the previous screen instead of just exiting (was not clear to me) Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20180406203812.3087-3-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-18perf mem: Allow all record/report optionsAndi Kleen
For perf mem report / perf mem record, pass all unknown options through to the underlying report/record commands. This makes things like perf mem record -a sleep 1 work. Matches how c2c and other tools work. Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20180406203812.3087-2-andi@firstfloor.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-18perf trace: Support MAP_FIXED_NOREPLACEArnaldo Carvalho de Melo
Introduced in a4ff8e8620d3 ("mm: introduce MAP_FIXED_NOREPLACE"), and now that we have that define in the just syncronized tools/arch/*/include/uapi/asm/mman.h files, add support for it. This should really transition to autogeneration of string tables as done for various other things: $ ls /tmp/build/perf/trace/beauty/generated/*.c arch_errno_name_array.c kcmp_type_array.c madvise_behavior_array.c pkey_alloc_access_rights_array.c prctl_option_array.c $ head /tmp/build/perf/trace/beauty/generated/madvise_behavior_array.c static const char *madvise_advices[] = { [0] = "NORMAL", [1] = "RANDOM", [2] = "SEQUENTIAL", [3] = "WILLNEED", [4] = "DONTNEED", [8] = "FREE", [9] = "REMOVE", [10] = "DONTFORK", [11] = "DOFORK", $ Till then, add support for this the old way. Also it has to be ifdef'ed, because arches like mips still don't define it. The proper solution will be to have per-arch tables for these values to support cross-analysis. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: David Ahern <dsahern@gmail.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Wang Nan <wangnan0@huawei.com> Link: https://lkml.kernel.org/n/tip-td9t5vhjltqnlzaurkkgq8cn@git.kernel.org Signef-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-18tun: fix vlan packet truncationBjørn Mork
Bogus trimming in tun_net_xmit() causes truncated vlan packets. skb->len is correct whether or not skb_vlan_tag_present() is true. There is no more reason to adjust the skb length on xmit in this driver than any other driver. tun_put_user() adds 4 bytes to the total for tagged packets because it transmits the tag inline to userspace. This is similar to a nic transmitting the tag inline on the wire. Reproducing the bug by sending any tagged packet through back-to-back connected tap interfaces: socat TUN,tun-type=tap,iff-up,tun-name=in TUN,tun-type=tap,iff-up,tun-name=out & ip link add link in name in.20 type vlan id 20 ip addr add 10.9.9.9/24 dev in.20 ip link set in.20 up tshark -nxxi in -f arp -c1 2>/dev/null & tshark -nxxi out -f arp -c1 2>/dev/null & ping -c 1 10.9.9.5 >/dev/null 2>&1 The output from the 'in' and 'out' interfaces are different when the bug is present: Capturing on 'in' 0000 ff ff ff ff ff ff 76 cf 76 37 d5 0a 81 00 00 14 ......v.v7...... 0010 08 06 00 01 08 00 06 04 00 01 76 cf 76 37 d5 0a ..........v.v7.. 0020 0a 09 09 09 00 00 00 00 00 00 0a 09 09 05 .............. Capturing on 'out' 0000 ff ff ff ff ff ff 76 cf 76 37 d5 0a 81 00 00 14 ......v.v7...... 0010 08 06 00 01 08 00 06 04 00 01 76 cf 76 37 d5 0a ..........v.v7.. 0020 0a 09 09 09 00 00 00 00 00 00 .......... Fixes: aff3d70a07ff ("tun: allow to attach ebpf socket filter") Cc: Jason Wang <jasowang@redhat.com> Signed-off-by: Bjørn Mork <bjorn@mork.no> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18tipc: fix infinite loop when dumping link monitor summaryTung Nguyen
When configuring the number of used bearers to MAX_BEARER and issuing command "tipc link monitor summary", the command enters infinite loop in user space. This issue happens because function tipc_nl_node_dump_monitor() returns the wrong 'prev_bearer' value when all potential monitors have been scanned. The correct behavior is to always try to scan all monitors until either the netlink message is full, in which case we return the bearer identity of the affected monitor, or we continue through the whole bearer array until we can return MAX_BEARERS. This solution also caters for the case where there may be gaps in the bearer array. Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-18tipc: fix use-after-free in tipc_nametbl_stopJon Maloy
When we delete a service item in tipc_nametbl_stop() we loop over all service ranges in the service's RB tree, and for each service range we loop over its pertaining publications while calling tipc_service_remove_publ() for each of them. However, tipc_service_remove_publ() has the side effect that it also removes the comprising service range item when there are no publications left. This leads to a "use-after-free" access when the inner loop continues to the next iteration, since the range item holding the list we are looping no longer exists. We fix this by moving the delete of the service range item outside the said function. Instead, we now let the two functions calling it test if the list is empty and perform the removal when that is the case. Reported-by: syzbot+d64b64afc55660106556@syzkaller.appspotmail.com Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-19powerpc/xive: Fix trying to "push" an already active pool VPBenjamin Herrenschmidt
When setting up a CPU, we "push" (activate) a pool VP for it. However it's an error to do so if it already has an active pool VP. This happens when doing soft CPU hotplug on powernv since we don't tear down the CPU on unplug. The HW flags the error which gets captured by the diagnostics. Fix this by making sure to "pull" out any already active pool first. Fixes: 243e25112d06 ("powerpc/xive: Native exploitation of the XIVE interrupt controller") Cc: stable@vger.kernel.org # v4.12+ Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-04-18btrfs: Fix wrong btrfs_delalloc_release_extents parameterQu Wenruo
Commit 43b18595d660 ("btrfs: qgroup: Use separate meta reservation type for delalloc") merged into mainline is not the latest version submitted to mail list in Dec 2017. It has a fatal wrong @qgroup_free parameter, which results increasing qgroup metadata pertrans reserved space, and causing a lot of early EDQUOT. Fix it by applying the correct diff on top of current branch. Fixes: 43b18595d660 ("btrfs: qgroup: Use separate meta reservation type for delalloc") Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-04-18btrfs: delayed-inode: Remove wrong qgroup meta reservation callsQu Wenruo
Commit 4f5427ccce5d ("btrfs: delayed-inode: Use new qgroup meta rsv for delayed inode and item") merged into mainline was not latest version submitted to the mail list in Dec 2017. Which lacks the following fixes: 1) Remove btrfs_qgroup_convert_reserved_meta() call in btrfs_delayed_item_release_metadata() 2) Remove btrfs_qgroup_reserve_meta_prealloc() call in btrfs_delayed_inode_reserve_metadata() Those fixes will resolve unexpected EDQUOT problems. Fixes: 4f5427ccce5d ("btrfs: delayed-inode: Use new qgroup meta rsv for delayed inode and item") Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-04-18btrfs: qgroup: Use independent and accurate per inode qgroup rsvQu Wenruo
Unlike reservation calculation used in inode rsv for metadata, qgroup doesn't really need to care about things like csum size or extent usage for the whole tree COW. Qgroups care more about net change of the extent usage. That's to say, if we're going to insert one file extent, it will mostly find its place in COWed tree block, leaving no change in extent usage. Or causing a leaf split, resulting in one new net extent and increasing qgroup number by nodesize. Or in an even more rare case, increase the tree level, increasing qgroup number by 2 * nodesize. So here instead of using the complicated calculation for extent allocator, which cares more about accuracy and no error, qgroup doesn't need that over-estimated reservation. This patch will maintain 2 new members in btrfs_block_rsv structure for qgroup, using much smaller calculation for qgroup rsv, reducing false EDQUOT. Signed-off-by: David Sterba <dsterba@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com>
2018-04-18btrfs: qgroup: Commit transaction in advance to reduce early EDQUOTQu Wenruo
Unlike previous method that tries to commit transaction inside qgroup_reserve(), this time we will try to commit transaction using fs_info->transaction_kthread to avoid nested transaction and no need to worry about locking context. Since it's an asynchronous function call and we won't wait for transaction commit, unlike previous method, we must call it before we hit the qgroup limit. So this patch will use the ratio and size of qgroup meta_pertrans reservation as indicator to check if we should trigger a transaction commit. (meta_prealloc won't be cleaned in transaction committ, it's useless anyway) Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2018-04-18udf: Fix leak of UTF-16 surrogates into encoded stringsJan Kara
OSTA UDF specification does not mention whether the CS0 charset in case of two bytes per character encoding should be treated in UTF-16 or UCS-2. The sample code in the standard does not treat UTF-16 surrogates in any special way but on systems such as Windows which work in UTF-16 internally, filenames would be treated as being in UTF-16 effectively. In Linux it is more difficult to handle characters outside of Base Multilingual plane (beyond 0xffff) as NLS framework works with 2-byte characters only. Just make sure we don't leak UTF-16 surrogates into the resulting string when loading names from the filesystem for now. CC: stable@vger.kernel.org # >= v4.6 Reported-by: Mingye Wang <arthur200126@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz>
2018-04-18arm64: signal: don't force known signals to SIGKILLMark Rutland
Since commit: a7e6f1ca90354a31 ("arm64: signal: Force SIGKILL for unknown signals in force_signal_inject") ... any signal which is not SIGKILL will be upgraded to a SIGKILL be force_signal_inject(). This includes signals we do expect, such as SIGILL triggered by do_undefinstr(). Fix the check to use a logical AND rather than a logical OR, permitting signals whose layout is SIL_FAULT. Fixes: a7e6f1ca90354a31 ("arm64: signal: Force SIGKILL for unknown signals in force_signal_inject") Cc: Will Deacon <will.deacon@arm.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-04-18dt-bindings: thermal: Remove "cooling-{min|max}-level" propertiesViresh Kumar
The "cooling-min-level" and "cooling-max-level" properties are not parsed by any part of kernel currently and the max cooling state of a CPU cooling device is found by referring to the cpufreq table instead. Remove the unused bindings. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2018-04-18dt-bindings: thermal: remove no longer needed samsung thermal propertiesBartlomiej Zolnierkiewicz
Remove documentation for longer needed samsung thermal properties. Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com> Reviewed-by: Rob Herring <robh@kernel.org> Signed-off-by: Eduardo Valentin <edubezval@gmail.com>
2018-04-18drm/i915: Fix LSPCON TMDS output buffer enabling from low-power stateImre Deak
LSPCON adapters in low-power state may ignore the first I2C write during TMDS output buffer enabling, resulting in a blank screen even with an otherwise enabled pipe. Fix this by reading back and validating the written value a few times. The problem was noticed on GLK machines with an onboard LSPCON adapter after entering/exiting DC5 power state. Doing an I2C read of the adapter ID as the first transaction - instead of the I2C write to enable the TMDS buffers - returns the correct value. Based on this we assume that the transaction itself is sent properly, it's only the adapter that is not ready for some reason to accept this first write after waking from low-power state. In my case the second I2C write attempt always succeeded. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105854 Cc: Clinton Taylor <clinton.a.taylor@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180416155309.11100-1-imre.deak@intel.com
2018-04-18drm/i915/audio: Fix audio detection issue on GLKGaurav K Singh
On Geminilake, sometimes audio card is not getting detected after reboot. This is a spurious issue happening on Geminilake. HW codec and HD audio controller link was going out of sync for which there was a fix in i915 driver but was not getting invoked for GLK. Extending this fix to GLK as well. Tested by Du,Wenkai on GLK board. Bspec: 21829 v2: Instead of checking GEN9_BC, BXT and GLK macros, use IS_GEN9 macro (Jani N) Cc: <stable@vger.kernel.org> # b651bd2a3ae3 ("drm/i915/audio: Fix audio enumeration issue on BXT") Cc: <stable@vger.kernel.org> Signed-off-by: Gaurav K Singh <gaurav.k.singh@intel.com> Reviewed-by: Abhay Kumar <abhay.Kumar@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/1523989338-29677-1-git-send-email-gaurav.k.singh@intel.com (cherry picked from commit 8221229046e862977ae93ec9d34aa583fbd10397) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-04-18drm/i915: Call i915_perf_fini() on init_hw error unwindChris Wilson
We have to cleanup after i915_perf_init(), even on the error path, as it passes a pointer into the module to the sysfs core. If we fail to unregister the sysctl table, we leave a dangling pointer which then may explode anytime later. Fixes: 9f9b2792b6d3 ("drm/i915/perf: reuse timestamp frequency from device info") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180414091233.32224-1-chris@chris-wilson.co.uk (cherry picked from commit 9f172f6fbd243759c808d97bd83c95e49325b2c9) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-04-18drm/i915/bios: filter out invalid DDC pins from VBT child devicesJani Nikula
The VBT contains the DDC pin to use for specific ports. Alas, sometimes the field appears to contain bogus data, and while we check for it later on in intel_gmbus_get_adapter() we fail to check the returned NULL on errors. Oops results. The simplest approach seems to be to catch and ignore the bogus DDC pins already at the VBT parsing phase, reverting to fixed per port default pins. This doesn't guarantee display working, but at least it prevents the oops. And we continue to be fuzzed by VBT. One affected machine is Dell Latitude 5590 where a BIOS upgrade added invalid DDC pins. Typical backtrace: [ 35.461411] WARN_ON(!intel_gmbus_is_valid_pin(dev_priv, pin)) [ 35.461432] WARNING: CPU: 6 PID: 411 at drivers/gpu/drm/i915/intel_i2c.c:844 intel_gmbus_get_adapter+0x32/0x37 [i915] [ 35.461437] Modules linked in: i915 ahci libahci dm_snapshot dm_bufio dm_raid raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx [ 35.461445] CPU: 6 PID: 411 Comm: kworker/u16:2 Not tainted 4.16.0-rc7.x64-g1cda370ffded #1 [ 35.461447] Hardware name: Dell Inc. Latitude 5590/0MM81M, BIOS 1.1.9 03/13/2018 [ 35.461450] Workqueue: events_unbound async_run_entry_fn [ 35.461465] RIP: 0010:intel_gmbus_get_adapter+0x32/0x37 [i915] [ 35.461467] RSP: 0018:ffff9b4e43d47c40 EFLAGS: 00010286 [ 35.461469] RAX: 0000000000000000 RBX: ffff98f90639f800 RCX: ffffffffae051960 [ 35.461471] RDX: 0000000000000001 RSI: 0000000000000092 RDI: 0000000000000246 [ 35.461472] RBP: ffff98f905410000 R08: 0000004d062a83f6 R09: 00000000000003bd [ 35.461474] R10: 0000000000000031 R11: ffffffffad4eda58 R12: ffff98f905410000 [ 35.461475] R13: ffff98f9064c1000 R14: ffff9b4e43d47cf0 R15: ffff98f905410000 [ 35.461477] FS: 0000000000000000(0000) GS:ffff98f92e580000(0000) knlGS:0000000000000000 [ 35.461479] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 35.461481] CR2: 00007f5682359008 CR3: 00000001b700c005 CR4: 00000000003606e0 [ 35.461483] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 35.461484] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 35.461486] Call Trace: [ 35.461501] intel_hdmi_set_edid+0x37/0x27f [i915] [ 35.461515] intel_hdmi_detect+0x7c/0x97 [i915] [ 35.461518] drm_helper_probe_single_connector_modes+0xe1/0x6c0 [ 35.461521] drm_setup_crtcs+0x129/0xa6a [ 35.461523] ? __switch_to_asm+0x34/0x70 [ 35.461525] ? __switch_to_asm+0x34/0x70 [ 35.461527] ? __switch_to_asm+0x40/0x70 [ 35.461528] ? __switch_to_asm+0x34/0x70 [ 35.461529] ? __switch_to_asm+0x40/0x70 [ 35.461531] ? __switch_to_asm+0x34/0x70 [ 35.461532] ? __switch_to_asm+0x40/0x70 [ 35.461534] ? __switch_to_asm+0x34/0x70 [ 35.461536] __drm_fb_helper_initial_config_and_unlock+0x34/0x46f [ 35.461538] ? __switch_to_asm+0x40/0x70 [ 35.461541] ? _cond_resched+0x10/0x33 [ 35.461557] intel_fbdev_initial_config+0xf/0x1c [i915] [ 35.461560] async_run_entry_fn+0x2e/0xf5 [ 35.461563] process_one_work+0x15b/0x364 [ 35.461565] worker_thread+0x2c/0x3a0 [ 35.461567] ? process_one_work+0x364/0x364 [ 35.461568] kthread+0x10c/0x122 [ 35.461570] ? _kthread_create_on_node+0x5d/0x5d [ 35.461572] ret_from_fork+0x35/0x40 [ 35.461574] Code: 74 16 89 f6 48 8d 04 b6 48 c1 e0 05 48 29 f0 48 8d 84 c7 e8 11 00 00 c3 48 c7 c6 b0 19 1e c0 48 c7 c7 64 8a 1c c0 e8 47 88 ed ec <0f> 0b 31 c0 c3 8b 87 a4 04 00 00 80 e4 fc 09 c6 89 b7 a4 04 00 [ 35.461604] WARNING: CPU: 6 PID: 411 at drivers/gpu/drm/i915/intel_i2c.c:844 intel_gmbus_get_adapter+0x32/0x37 [i915] [ 35.461606] ---[ end trace 4fe1e63e2dd93373 ]--- [ 35.461609] BUG: unable to handle kernel NULL pointer dereference at 0000000000000010 [ 35.461613] IP: i2c_transfer+0x4/0x86 [ 35.461614] PGD 0 P4D 0 [ 35.461616] Oops: 0000 [#1] SMP PTI [ 35.461618] Modules linked in: i915 ahci libahci dm_snapshot dm_bufio dm_raid raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx [ 35.461624] CPU: 6 PID: 411 Comm: kworker/u16:2 Tainted: G W 4.16.0-rc7.x64-g1cda370ffded #1 [ 35.461625] Hardware name: Dell Inc. Latitude 5590/0MM81M, BIOS 1.1.9 03/13/2018 [ 35.461628] Workqueue: events_unbound async_run_entry_fn [ 35.461630] RIP: 0010:i2c_transfer+0x4/0x86 [ 35.461631] RSP: 0018:ffff9b4e43d47b30 EFLAGS: 00010246 [ 35.461633] RAX: ffff9b4e43d47b6e RBX: 0000000000000005 RCX: 0000000000000001 [ 35.461635] RDX: 0000000000000002 RSI: ffff9b4e43d47b80 RDI: 0000000000000000 [ 35.461636] RBP: ffff9b4e43d47bd8 R08: 0000004d062a83f6 R09: 00000000000003bd [ 35.461638] R10: 0000000000000031 R11: ffffffffad4eda58 R12: 0000000000000002 [ 35.461639] R13: 0000000000000001 R14: ffff9b4e43d47b6f R15: ffff9b4e43d47c07 [ 35.461641] FS: 0000000000000000(0000) GS:ffff98f92e580000(0000) knlGS:0000000000000000 [ 35.461643] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 35.461645] CR2: 0000000000000010 CR3: 00000001b700c005 CR4: 00000000003606e0 [ 35.461646] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 35.461647] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 35.461649] Call Trace: [ 35.461652] drm_do_probe_ddc_edid+0xb3/0x128 [ 35.461654] drm_get_edid+0xe5/0x38d [ 35.461669] intel_hdmi_set_edid+0x45/0x27f [i915] [ 35.461684] intel_hdmi_detect+0x7c/0x97 [i915] [ 35.461687] drm_helper_probe_single_connector_modes+0xe1/0x6c0 [ 35.461689] drm_setup_crtcs+0x129/0xa6a [ 35.461691] ? __switch_to_asm+0x34/0x70 [ 35.461693] ? __switch_to_asm+0x34/0x70 [ 35.461694] ? __switch_to_asm+0x40/0x70 [ 35.461696] ? __switch_to_asm+0x34/0x70 [ 35.461697] ? __switch_to_asm+0x40/0x70 [ 35.461698] ? __switch_to_asm+0x34/0x70 [ 35.461700] ? __switch_to_asm+0x40/0x70 [ 35.461701] ? __switch_to_asm+0x34/0x70 [ 35.461703] __drm_fb_helper_initial_config_and_unlock+0x34/0x46f [ 35.461705] ? __switch_to_asm+0x40/0x70 [ 35.461707] ? _cond_resched+0x10/0x33 [ 35.461724] intel_fbdev_initial_config+0xf/0x1c [i915] [ 35.461727] async_run_entry_fn+0x2e/0xf5 [ 35.461729] process_one_work+0x15b/0x364 [ 35.461731] worker_thread+0x2c/0x3a0 [ 35.461733] ? process_one_work+0x364/0x364 [ 35.461734] kthread+0x10c/0x122 [ 35.461736] ? _kthread_create_on_node+0x5d/0x5d [ 35.461738] ret_from_fork+0x35/0x40 [ 35.461739] Code: 5c fa e1 ad 48 89 df e8 ea fb ff ff e9 2a ff ff ff 0f 1f 44 00 00 31 c0 e9 43 fd ff ff 31 c0 45 31 e4 e9 c5 fd ff ff 41 54 55 53 <48> 8b 47 10 48 83 78 10 00 74 70 41 89 d4 48 89 f5 48 89 fb 65 [ 35.461756] RIP: i2c_transfer+0x4/0x86 RSP: ffff9b4e43d47b30 [ 35.461757] CR2: 0000000000000010 [ 35.461759] ---[ end trace 4fe1e63e2dd93374 ]--- Based on a patch by Fei Li. v2: s/reverting/sticking/ (Chris) Cc: stable@vger.kernel.org Cc: Fei Li <fei.li@intel.com> Co-developed-by: Fei Li <fei.li@intel.com> Reported-by: Pavel Nakonechnyi <zorg1331@gmail.com> Reported-and-tested-by: Seweryn Kokot <sewkokot@gmail.com> Reported-and-tested-by: Laszlo Valko <valko@linux.karinthy.hu> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105549 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105961 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20180411131519.9091-1-jani.nikula@intel.com (cherry picked from commit f212bf9abe5de9f938fecea7df07046e74052dde) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-04-18drm/i915/pmu: Inspect runtime PM state more carefully while estimating RC6Tvrtko Ursulin
While thinking about sporadic failures of perf_pmu/rc6-runtime-pm* tests on some CI machines I have concluded that: a) the PMU readout of RC6 can race against runtime PM transitions, and b) there are other reasons than being runtime suspended which can cause intel_runtime_pm_get_if_in_use to fail. Therefore when estimating RC6 the code needs to assert we are indeed in suspended state, and if not, the best we can do is return the last known RC6 value. Without this check we can calculate the estimated value based on un- initialized or inappropriate internal state, which can result in over- estimation, or in any case incorrect value being returned. v2: * Re-arrange the code a bit to avoid second unlock and return branch. (Chris Wilson) v3: * Insert some strategic blank lines and improve commit msg. (Chris Wilson) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 1fe699e30113 ("drm/i915/pmu: Fix sleep under atomic in RC6 readout") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105010 Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180410112704.24462-1-tvrtko.ursulin@linux.intel.com (cherry picked from commit 2924bdee21edd6785a4df1b4d17fd3cb265fddd9) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-04-18drm/i915: Do no use kfree() to free a kmem_cache_alloc() return valueXidong Wang
Along the eb_lookup_vmas() error path, the return value from kmem_cache_alloc() was freed using kfree(). Fix it to use the proper kmem_cache_free() instead. Fixes: d1b48c1e7184 ("drm/i915: Replace execbuf vma ht with an idr") Signed-off-by: Xidong Wang <wangxidong_97@163.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: <stable@vger.kernel.org> # v4.14+ Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20180404093824.9313-1-chris@chris-wilson.co.uk (cherry picked from commit 6be1187dbffa0027ea379c53f7ca0c782515c610) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2018-04-17selftests/filesystems: Don't run dnotify_test by defaultMichael Ellerman
In commit ce290a19609d ("selftests: add devpts selftests"), the filesystems directory was added to the top-level selftests Makefile. That had the effect of causing the existing dnotify_test in the filesystems directory to now be run as part of the default selftests test-run. Unfortunately dnotify_test is actually an infinite loop. Fix it by moving dnotify_test to TEST_GEN_PROGS_EXTENDED, which says that it's a generated file (ie. built) but should not be run as part of the default test suite run (it's an "extended" test). While we're here cleanup a few other things, devpts_pts should be in TEST_GEN_PROGS to indicate that it's built, and with the above two changes we no longer need a custom all or clean rule. Fixes: ce290a19609d ("selftests: add devpts selftests") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Acked-by: Christian brauner <christian.brauner@ubuntu.com> Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
2018-04-17fs: cifs: Adding new return type vm_fault_tSouptick Joarder
Use new return type vm_fault_t for page_mkwrite handler. Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com> Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2018-04-17cifs: smb2ops: Fix NULL check in smb2_query_symlinkGustavo A. R. Silva
The current code null checks variable err_buf, which is always null when it is checked, hence utf16_path is free'd and the function returns -ENOENT everytime it is called, making it impossible for the execution path to reach the following code: err_buf = err_iov.iov_base; Fix this by null checking err_iov.iov_base instead of err_buf. Also, notice that err_buf no longer needs to be initialized to NULL. Addresses-Coverity-ID: 1467876 ("Logically dead code") Fixes: 2d636199e400 ("cifs: Change SMB2_open to return an iov for the error parameter") Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com> Signed-off-by: Steve French <smfrench@gmail.com> Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-04-17KEYS: DNS: limit the length of option stringsEric Biggers
Adding a dns_resolver key whose payload contains a very long option name resulted in that string being printed in full. This hit the WARN_ONCE() in set_precision() during the printk(), because printk() only supports a precision of up to 32767 bytes: precision 1000000 too large WARNING: CPU: 0 PID: 752 at lib/vsprintf.c:2189 vsnprintf+0x4bc/0x5b0 Fix it by limiting option strings (combined name + value) to a much more reasonable 128 bytes. The exact limit is arbitrary, but currently the only recognized option is formatted as "dnserror=%lu" which fits well within this limit. Also ratelimit the printks. Reproducer: perl -e 'print "#", "A" x 1000000, "\x00"' | keyctl padd dns_resolver desc @s This bug was found using syzkaller. Reported-by: Mark Rutland <mark.rutland@arm.com> Fixes: 4a2d789267e0 ("DNS: If the DNS server returns an error, allow that to be cached [ver #2]") Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17sfc: check RSS is active for filter insertBert Kenward
For some firmware variants - specifically 'capture packed stream' - RSS filters are not valid. We must check if RSS is actually active rather than merely enabled. Fixes: 42356d9a137b ("sfc: support RSS spreading of ethtool ntuple filters") Signed-off-by: Bert Kenward <bkenward@solarflare.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17vlan: Fix reading memory beyond skb->tail in skb_vlan_tagged_multiToshiaki Makita
Syzkaller spotted an old bug which leads to reading skb beyond tail by 4 bytes on vlan tagged packets. This is caused because skb_vlan_tagged_multi() did not check skb_headlen. BUG: KMSAN: uninit-value in eth_type_vlan include/linux/if_vlan.h:283 [inline] BUG: KMSAN: uninit-value in skb_vlan_tagged_multi include/linux/if_vlan.h:656 [inline] BUG: KMSAN: uninit-value in vlan_features_check include/linux/if_vlan.h:672 [inline] BUG: KMSAN: uninit-value in dflt_features_check net/core/dev.c:2949 [inline] BUG: KMSAN: uninit-value in netif_skb_features+0xd1b/0xdc0 net/core/dev.c:3009 CPU: 1 PID: 3582 Comm: syzkaller435149 Not tainted 4.16.0+ #82 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:17 [inline] dump_stack+0x185/0x1d0 lib/dump_stack.c:53 kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067 __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:676 eth_type_vlan include/linux/if_vlan.h:283 [inline] skb_vlan_tagged_multi include/linux/if_vlan.h:656 [inline] vlan_features_check include/linux/if_vlan.h:672 [inline] dflt_features_check net/core/dev.c:2949 [inline] netif_skb_features+0xd1b/0xdc0 net/core/dev.c:3009 validate_xmit_skb+0x89/0x1320 net/core/dev.c:3084 __dev_queue_xmit+0x1cb2/0x2b60 net/core/dev.c:3549 dev_queue_xmit+0x4b/0x60 net/core/dev.c:3590 packet_snd net/packet/af_packet.c:2944 [inline] packet_sendmsg+0x7c57/0x8a10 net/packet/af_packet.c:2969 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] sock_write_iter+0x3b9/0x470 net/socket.c:909 do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776 do_iter_write+0x30d/0xd40 fs/read_write.c:932 vfs_writev fs/read_write.c:977 [inline] do_writev+0x3c9/0x830 fs/read_write.c:1012 SYSC_writev+0x9b/0xb0 fs/read_write.c:1085 SyS_writev+0x56/0x80 fs/read_write.c:1082 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 RIP: 0033:0x43ffa9 RSP: 002b:00007fff2cff3948 EFLAGS: 00000217 ORIG_RAX: 0000000000000014 RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043ffa9 RDX: 0000000000000001 RSI: 0000000020000080 RDI: 0000000000000003 RBP: 00000000006cb018 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000217 R12: 00000000004018d0 R13: 0000000000401960 R14: 0000000000000000 R15: 0000000000000000 Uninit was created at: kmsan_save_stack_with_flags mm/kmsan/kmsan.c:278 [inline] kmsan_internal_poison_shadow+0xb8/0x1b0 mm/kmsan/kmsan.c:188 kmsan_kmalloc+0x94/0x100 mm/kmsan/kmsan.c:314 kmsan_slab_alloc+0x11/0x20 mm/kmsan/kmsan.c:321 slab_post_alloc_hook mm/slab.h:445 [inline] slab_alloc_node mm/slub.c:2737 [inline] __kmalloc_node_track_caller+0xaed/0x11c0 mm/slub.c:4369 __kmalloc_reserve net/core/skbuff.c:138 [inline] __alloc_skb+0x2cf/0x9f0 net/core/skbuff.c:206 alloc_skb include/linux/skbuff.h:984 [inline] alloc_skb_with_frags+0x1d4/0xb20 net/core/skbuff.c:5234 sock_alloc_send_pskb+0xb56/0x1190 net/core/sock.c:2085 packet_alloc_skb net/packet/af_packet.c:2803 [inline] packet_snd net/packet/af_packet.c:2894 [inline] packet_sendmsg+0x6444/0x8a10 net/packet/af_packet.c:2969 sock_sendmsg_nosec net/socket.c:630 [inline] sock_sendmsg net/socket.c:640 [inline] sock_write_iter+0x3b9/0x470 net/socket.c:909 do_iter_readv_writev+0x7bb/0x970 include/linux/fs.h:1776 do_iter_write+0x30d/0xd40 fs/read_write.c:932 vfs_writev fs/read_write.c:977 [inline] do_writev+0x3c9/0x830 fs/read_write.c:1012 SYSC_writev+0x9b/0xb0 fs/read_write.c:1085 SyS_writev+0x56/0x80 fs/read_write.c:1082 do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287 entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Fixes: 58e998c6d239 ("offloading: Force software GSO for multiple vlan tags.") Reported-and-tested-by: syzbot+0bbe42c764feafa82c5a@syzkaller.appspotmail.com Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17timekeeping: Remove __current_kernel_time()Baolin Wang
The __current_kernel_time() function based on 'struct timespec' is no longer recommended for new code, and the only user of this function has been replaced by commit 6909e29fdefb ("kdb: use __ktime_get_real_seconds instead of __current_kernel_time"). Remove the obsolete interface. Signed-off-by: Baolin Wang <baolin.wang@linaro.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: arnd@arndb.de Cc: sboyd@kernel.org Cc: broonie@kernel.org Cc: john.stultz@linaro.org Link: https://lkml.kernel.org/r/1a9dbea7ee2cda7efe9ed330874075cf17fdbff6.1523596316.git.baolin.wang@linaro.org
2018-04-17timers: Remove stale struct tvec_base forward declarationLiu, Changcheng
struct tvec_base is a leftover of the original timer wheel implementation and not longer used. Remove the forward declaration. Signed-off-by: Liu Changcheng <changcheng.liu@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: akpm@linux-foundation.org Link: https://lkml.kernel.org/r/20180412075701.GA38952@sofia
2018-04-17clockevents: Fix kernel messages split across multiple linesGeert Uytterhoeven
Convert the clockevents driver from old-style printk() to pr_info() and pr_cont(), to fix split kernel messages like below: Clockevents: could not switch to one-shot mode: dummy_timer is not functional. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: https://lkml.kernel.org/r/1522942018-14471-1-git-send-email-geert%2Brenesas@glider.be
2018-04-17MIPS: memset.S: Fix return of __clear_user from Lpartial_fixupMatt Redfearn
The __clear_user function is defined to return the number of bytes that could not be cleared. From the underlying memset / bzero implementation this means setting register a2 to that number on return. Currently if a page fault is triggered within the memset_partial block, the value loaded into a2 on return is meaningless. The label .Lpartial_fixup\@ is jumped to on page fault. In order to work out how many bytes failed to copy, the exception handler should find how many bytes left in the partial block (andi a2, STORMASK), add that to the partial block end address (a2), and subtract the faulting address to get the remainder. Currently it incorrectly subtracts the partial block start address (t1), which has additionally been clobbered to generate a jump target in memset_partial. Fix this by adding the block end address instead. This issue was found with the following test code: int j, k; for (j = 0; j < 512; j++) { if ((k = clear_user(NULL, j)) != j) { pr_err("clear_user (NULL %d) returned %d\n", j, k); } } Which now passes on Creator Ci40 (MIPS32) and Cavium Octeon II (MIPS64). Suggested-by: James Hogan <jhogan@kernel.org> Signed-off-by: Matt Redfearn <matt.redfearn@mips.com> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: stable@vger.kernel.org Patchwork: https://patchwork.linux-mips.org/patch/19108/ Signed-off-by: James Hogan <jhogan@kernel.org>
2018-04-17arm64: kasan: avoid pfn_to_nid() before page array is initializedMark Rutland
In arm64's kasan_init(), we use pfn_to_nid() to find the NUMA node a span of memory is in, hoping to allocate shadow from the same NUMA node. However, at this point, the page array has not been initialized, and thus this is bogus. Since commit: f165b378bbdf6c8a ("mm: uninitialized struct page poisoning sanity") ... accessing fields of the page array results in a boot time Oops(), highlighting this problem: [ 0.000000] Unable to handle kernel paging request at virtual address dfff200000000000 [ 0.000000] Mem abort info: [ 0.000000] ESR = 0x96000004 [ 0.000000] Exception class = DABT (current EL), IL = 32 bits [ 0.000000] SET = 0, FnV = 0 [ 0.000000] EA = 0, S1PTW = 0 [ 0.000000] Data abort info: [ 0.000000] ISV = 0, ISS = 0x00000004 [ 0.000000] CM = 0, WnR = 0 [ 0.000000] [dfff200000000000] address between user and kernel address ranges [ 0.000000] Internal error: Oops: 96000004 [#1] PREEMPT SMP [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 4.16.0-07317-gf165b378bbdf #42 [ 0.000000] Hardware name: ARM Juno development board (r1) (DT) [ 0.000000] pstate: 80000085 (Nzcv daIf -PAN -UAO) [ 0.000000] pc : __asan_load8+0x8c/0xa8 [ 0.000000] lr : __dump_page+0x3c/0x3b8 [ 0.000000] sp : ffff2000099b7ca0 [ 0.000000] x29: ffff2000099b7ca0 x28: ffff20000a1762c0 [ 0.000000] x27: ffff7e0000000000 x26: ffff2000099dd000 [ 0.000000] x25: ffff200009a3f960 x24: ffff200008f9c38c [ 0.000000] x23: ffff20000a9d3000 x22: ffff200009735430 [ 0.000000] x21: fffffffffffffffe x20: ffff7e0001e50420 [ 0.000000] x19: ffff7e0001e50400 x18: 0000000000001840 [ 0.000000] x17: ffffffffffff8270 x16: 0000000000001840 [ 0.000000] x15: 0000000000001920 x14: 0000000000000004 [ 0.000000] x13: 0000000000000000 x12: 0000000000000800 [ 0.000000] x11: 1ffff0012d0f89ff x10: ffff10012d0f89ff [ 0.000000] x9 : 0000000000000000 x8 : ffff8009687c5000 [ 0.000000] x7 : 0000000000000000 x6 : ffff10000f282000 [ 0.000000] x5 : 0000000000000040 x4 : fffffffffffffffe [ 0.000000] x3 : 0000000000000000 x2 : dfff200000000000 [ 0.000000] x1 : 0000000000000005 x0 : 0000000000000000 [ 0.000000] Process swapper (pid: 0, stack limit = 0x (ptrval)) [ 0.000000] Call trace: [ 0.000000] __asan_load8+0x8c/0xa8 [ 0.000000] __dump_page+0x3c/0x3b8 [ 0.000000] dump_page+0xc/0x18 [ 0.000000] kasan_init+0x2e8/0x5a8 [ 0.000000] setup_arch+0x294/0x71c [ 0.000000] start_kernel+0xdc/0x500 [ 0.000000] Code: aa0403e0 9400063c 17ffffee d343fc00 (38e26800) [ 0.000000] ---[ end trace 67064f0e9c0cc338 ]--- [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task! [ 0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]--- Let's fix this by using early_pfn_to_nid(), as other architectures do in their kasan init code. Note that early_pfn_to_nid acquires the nid from the memblock array, which we iterate over in kasan_init(), so this should be fine. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Fixes: 39d114ddc6822302 ("arm64: add KASAN support") Cc: Will Deacon <will.deacon@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2018-04-17net: qrtr: add MODULE_ALIAS_NETPROTO macroNicolas Dechesne
To ensure that qrtr can be loaded automatically, when needed, if it is compiled as module. Signed-off-by: Nicolas Dechesne <nicolas.dechesne@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17VSOCK: make af_vsock.ko removable againStefan Hajnoczi
Commit c1eef220c1760762753b602c382127bfccee226d ("vsock: always call vsock_init_tables()") introduced a module_init() function without a corresponding module_exit() function. Modules with an init function can only be removed if they also have an exit function. Therefore the vsock module was considered "permanent" and could not be removed. This patch adds an empty module_exit() function so that "rmmod vsock" works. No explicit cleanup is required because: 1. Transports call vsock_core_exit() upon exit and cannot be removed while sockets are still alive. 2. vsock_diag.ko does not perform any action that requires cleanup by vsock.ko. Fixes: c1eef220c176 ("vsock: always call vsock_init_tables()") Reported-by: Xiumei Mu <xmu@redhat.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Cc: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jorgen Hansen <jhansen@vmware.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2018-04-17x86/mm: Prevent kernel Oops in PTDUMP code with HIGHPTE=yJoerg Roedel
The walk_pte_level() function just uses __va to get the virtual address of the PTE page, but that breaks when the PTE page is not in the direct mapping with HIGHPTE=y. The result is an unhandled kernel paging request at some random address when accessing the current_kernel or current_user file. Use the correct API to access PTE pages. Fixes: fe770bf0310d ('x86: clean up the page table dumper and add 32-bit support') Signed-off-by: Joerg Roedel <jroedel@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Cc: jgross@suse.com Cc: JBeulich@suse.com Cc: hpa@zytor.com Cc: aryabinin@virtuozzo.com Cc: kirill.shutemov@linux.intel.com Link: https://lkml.kernel.org/r/1523971636-4137-1-git-send-email-joro@8bytes.org
2018-04-17x86,sched: Allow topologies where NUMA nodes share an LLCAlison Schofield
Intel's Skylake Server CPUs have a different LLC topology than previous generations. When in Sub-NUMA-Clustering (SNC) mode, the package is divided into two "slices", each containing half the cores, half the LLC, and one memory controller and each slice is enumerated to Linux as a NUMA node. This is similar to how the cores and LLC were arranged for the Cluster-On-Die (CoD) feature. CoD allowed the same cache line to be present in each half of the LLC. But, with SNC, each line is only ever present in *one* slice. This means that the portion of the LLC *available* to a CPU depends on the data being accessed: Remote socket: entire package LLC is shared Local socket->local slice: data goes into local slice LLC Local socket->remote slice: data goes into remote-slice LLC. Slightly higher latency than local slice LLC. The biggest implication from this is that a process accessing all NUMA-local memory only sees half the LLC capacity. The CPU describes its cache hierarchy with the CPUID instruction. One of the CPUID leaves enumerates the "logical processors sharing this cache". This information is used for scheduling decisions so that tasks move more freely between CPUs sharing the cache. But, the CPUID for the SNC configuration discussed above enumerates the LLC as being shared by the entire package. This is not 100% precise because the entire cache is not usable by all accesses. But, it *is* the way the hardware enumerates itself, and this is not likely to change. The userspace visible impact of all the above is that the sysfs info reports the entire LLC as being available to the entire package. As noted above, this is not true for local socket accesses. This patch does not correct the sysfs info. It is the same, pre and post patch. The current code emits the following warning: sched: CPU #3's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency. The warning is coming from the topology_sane() check in smpboot.c because the topology is not matching the expectations of the model for obvious reasons. To fix this, add a vendor and model specific check to never call topology_sane() for these systems. Also, just like "Cluster-on-Die" disable the "coregroup" sched_domain_topology_level and use NUMA information from the SRAT alone. This is OK at least on the hardware we are immediately concerned about because the LLC sharing happens at both the slice and at the package level, which are also NUMA boundaries. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov <bp@suse.de> Cc: Prarit Bhargava <prarit@redhat.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: brice.goglin@gmail.com Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: David Rientjes <rientjes@google.com> Cc: Igor Mammedov <imammedo@redhat.com> Cc: "H. Peter Anvin" <hpa@linux.intel.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Link: https://lkml.kernel.org/r/20180407002130.GA18984@alison-desk.jf.intel.com
2018-04-17perf: Remove superfluous allocation error checkJiri Olsa
If the get_callchain_buffers fails to allocate the buffer it will decrease the nr_callchain_events right away. There's no point of checking the allocation error for nr_callchain_events > 1. Removing that check. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: syzkaller-bugs@googlegroups.com Cc: x86@kernel.org Link: http://lkml.kernel.org/r/20180415092352.12403-3-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-17perf: Fix sample_max_stack maximum checkJiri Olsa
The syzbot hit KASAN bug in perf_callchain_store having the entry stored behind the allocated bounds [1]. We miss the sample_max_stack check for the initial event that allocates callchain buffers. This missing check allows to create an event with sample_max_stack value bigger than the global sysctl maximum: # sysctl -a | grep perf_event_max_stack kernel.perf_event_max_stack = 127 # perf record -vv -C 1 -e cycles/max-stack=256/ kill ... perf_event_attr: size 112 ... sample_max_stack 256 ------------------------------------------------------------ sys_perf_event_open: pid -1 cpu 1 group_fd -1 flags 0x8 = 4 Note the '-C 1', which forces perf record to create just single event. Otherwise it opens event for every cpu, then the sample_max_stack check fails on the second event and all's fine. The fix is to run the sample_max_stack check also for the first event with callchains. [1] https://marc.info/?l=linux-kernel&m=152352732920874&w=2 Reported-by: syzbot+7c449856228b63ac951e@syzkaller.appspotmail.com Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: syzkaller-bugs@googlegroups.com Cc: x86@kernel.org Fixes: 97c79a38cd45 ("perf core: Per event callchain limit") Link: http://lkml.kernel.org/r/20180415092352.12403-2-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-17perf: Return proper values for user stack errorsJiri Olsa
Return immediately when we find issue in the user stack checks. The error value could get overwritten by following check for PERF_SAMPLE_REGS_INTR. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: syzkaller-bugs@googlegroups.com Cc: x86@kernel.org Fixes: 60e2364e60e8 ("perf: Add ability to sample machine state on interrupt") Link: http://lkml.kernel.org/r/20180415092352.12403-1-jolsa@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-17perf list: Add s390 support for detailed/verbose PMU event descriptionThomas Richter
'perf list' with flags -d and -v print a description (-d) or a very verbose explanation (-v) of CPU specific counter events. These descriptions are provided with the json files in directory pmu-events/arch/s390/*.json. Display of these descriptions on s390 requires the corresponding json files. On s390 this does not work because function is_pmu_core() does not detect the s390 directory name where the CPU specific events are listed. On x86 it is: /sys/bus/event_source/devices/cpu whereas on s390 it is: /sys/bus/event_source/devices/cpum_cf /sys/bus/event_source/devices/cpum_sf Fix this by adding s390 directory name testing to function is_pmu_core(). This is the same approach as taken for the ARM platform. Output before: [root@s35lp76 perf]# ./perf list -d pmu List of pre-defined events (to be used in -e): cpum_cf/AES_BLOCKED_CYCLES/ [Kernel PMU event] cpum_cf/AES_BLOCKED_FUNCTIONS/ [Kernel PMU event] cpum_cf/AES_CYCLES/ [Kernel PMU event] cpum_cf/AES_FUNCTIONS/ [Kernel PMU event] .... cpum_cf/TX_NC_TEND/ [Kernel PMU event] cpum_cf/VX_BCD_EXECUTION_SLOTS/ [Kernel PMU event] cpum_sf/SF_CYCLES_BASIC/ [Kernel PMU event] Output after: [root@s35lp76 perf]# ./perf list -d pmu List of pre-defined events (to be used in -e): cpum_cf/AES_BLOCKED_CYCLES/ [Kernel PMU event] cpum_cf/AES_BLOCKED_FUNCTIONS/ [Kernel PMU event] cpum_cf/AES_CYCLES/ [Kernel PMU event] cpum_cf/AES_FUNCTIONS/ [Kernel PMU event] .... cpum_cf/TX_NC_TEND/ [Kernel PMU event] cpum_cf/VX_BCD_EXECUTION_SLOTS/ [Kernel PMU event] cpum_sf/SF_CYCLES_BASIC/ [Kernel PMU event] 3906: bcd_dfp_execution_slots [BCD DFP Execution Slots] decimal_instructions [Decimal Instructions] dtlb2_gpage_writes [DTLB2 GPAGE Writes] dtlb2_hpage_writes [DTLB2 HPAGE Writes] dtlb2_misses [DTLB2 Misses] dtlb2_writes [DTLB2 Writes] itlb2_misses [ITLB2 Misses] itlb2_writes [ITLB2 Writes] l1c_tlb2_misses [L1C TLB2 Misses] ..... cfvn 3: cpu_cycles [CPU Cycles] instructions [Instructions] l1d_dir_writes [L1D Directory Writes] l1d_penalty_cycles [L1D Penalty Cycles] l1i_dir_writes [L1I Directory Writes] l1i_penalty_cycles [L1I Penalty Cycles] problem_state_cpu_cycles [Problem State CPU Cycles] problem_state_instructions [Problem State Instructions] .... csvn generic: aes_blocked_cycles [AES Blocked Cycles] aes_blocked_functions [AES Blocked Functions] aes_cycles [AES Cycles] aes_functions [AES Functions] dea_blocked_cycles [DEA Blocked Cycles] dea_blocked_functions [DEA Blocked Functions] .... Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com> Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Link: http://lkml.kernel.org/r/20180416132314.33249-1-tmricht@linux.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-17perf script: Extend misc field decoding with switch out event typeAlexey Budankov
Append 'p' sign to 'S' tag designating the type of context switch out event so 'Sp' means preemption context switch. Documentation is extended to cover new presentation changes. $ perf script --show-switch-events -F +misc -I -i perf.data: hdparm 4073 [004] U 762.198265: 380194 cycles:ppp: 7faf727f5a23 strchr (/usr/lib64/ld-2.26.so) hdparm 4073 [004] K 762.198366: 441572 cycles:ppp: ffffffffb9218435 alloc_set_pte (/lib/modules/4.16.0-rc6+/build/vmlinux) hdparm 4073 [004] S 762.198391: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 swapper 0 [004] 762.198392: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 4073/4073 swapper 0 [004] Sp 762.198477: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 4073/4073 hdparm 4073 [004] 762.198478: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 swapper 0 [007] K 762.198514: 2303073 cycles:ppp: ffffffffb98b0c66 intel_idle (/lib/modules/4.16.0-rc6+/build/vmlinux) swapper 0 [007] Sp 762.198561: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 1134/1134 kworker/u16:18 1134 [007] 762.198562: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 kworker/u16:18 1134 [007] S 762.198567: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/5fc65ce7-8ca5-53ae-8858-8ddd27290575@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-04-17perf report: Extend raw dump (-D) out with switch out event typeAlexey Budankov
Print additional 'preempt' tag for PERF_RECORD_SWITCH[_CPU_WIDE] OUT records when event header misc field contains PERF_RECORD_MISC_SWITCH_OUT_PREEMPT bit set designating preemption context switch out event: tools/perf/perf report -D -i perf.data | grep _SWITCH 0 768361415226 0x27f076 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 8/8 4 768362216813 0x28f45e [0x28]: PERF_RECORD_SWITCH_CPU_WIDE OUT next pid/tid: 0/0 4 768362217824 0x28f486 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 4073/4073 0 768362414027 0x27f0ce [0x28]: PERF_RECORD_SWITCH_CPU_WIDE OUT preempt next pid/tid: 8/8 0 768362414367 0x27f0f6 [0x28]: PERF_RECORD_SWITCH_CPU_WIDE IN prev pid/tid: 0/0 Signed-off-by: Alexey Budankov <alexey.budankov@linux.intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/6f5aebb9-b96c-f304-f08f-8f046d38de4f@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>