summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-06-25nfp: add support for control messages for flower appSimon Horman
In preparation for adding a new flower app - targeted at offloading the flower classifier - provide support for control message that it will use to communicate with the NFP. Based in part on work by Bert van Leeuwen. Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25nfp: add support for tx/rx with metadata portidSimon Horman
Allow tx/rx with metadata port id. This will be used for tx/rx of representor netdevs acting as upper-devices while a pf netdev acts as a lower-device. Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25nfp: provide nfp_port to of nfp_net_get_mac_addr()Simon Horman
Provide port rather than vNIC as parameter of nfp_net_get_mac_addr. This is to allow this function to be used by representor netdevs where a vNIC may have more than one physical port none of which are associated with the vNIC. Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25nfp: app callbacks for SRIOVSimon Horman
Add app-callbacks for app-specific initialisation of SRIOV. Disabling SRIOV is brought forward in nfp_pci_remove() so that nfp_app_sriov_disable is called while the app still exists. This is intended to be used to implement representor netdevs for virtual ports. Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25nfp: add stats and xmit helpers for representorsSimon Horman
Provide helpers for stats and xmit on representor netdevs. Parts based on work by Bert van Leeuwen, Benjamin LaHaise and Jakub Kicinski. Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25nfp: general representor implementationSimon Horman
Provide infrastructure to create and destroy representors of a given type. Parts based on work by Bert van Leeuwen, Benjamin LaHaise, and Jakub Kicinski. Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25nfp: map mac_stats and vf_cfg BARsSimon Horman
If present map mac_stats and vf_cfg BARs. These will be used by representor netdevs to read statistics for phys port and vf representors. Also provide defines describing the layout of the mac_stats area. Similar defines are already present for the cf_cfg area. Based in part on work by Jakub Kicinski. Signed-off-by: Simon Horman <simon.horman@netronome.com> Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25nfp: move physical port init into a helperJakub Kicinski
Move MAC/PHY port init into a helper to make it easier to reuse it in the representor code. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25nfp: devlink add support for getting eswitch modeJakub Kicinski
Add app callback for reporting eswitch mode. Non-SRIOV apps should not implement this callback, nfp_app code will then respond with -EOPNOTSUPP. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25net: store port/representator id in metadata_dstJakub Kicinski
Switches and modern SR-IOV enabled NICs may multiplex traffic from Port representators and control messages over single set of hardware queues. Control messages and muxed traffic may need ordered delivery. Those requirements make it hard to comfortably use TC infrastructure today unless we have a way of attaching metadata to skbs at the upper device. Because single set of queues is used for many netdevs stopping TC/sched queues of all of them reliably is impossible and lower device has to retreat to returning NETDEV_TX_BUSY and usually has to take extra locks on the fastpath. This patch attempts to enable port/representative devs to attach metadata to skbs which carry port id. This way representatives can be queueless and all queuing can be performed at the lower netdev in the usual way. Traffic arriving on the port/representative interfaces will be have metadata attached and will subsequently be queued to the lower device for transmission. The lower device should recognize the metadata and translate it to HW specific format which is most likely either a special header inserted before the network headers or descriptor/metadata fields. Metadata is associated with the lower device by storing the netdev pointer along with port id so that if TC decides to redirect or mirror the new netdev will not try to interpret it. This is mostly for SR-IOV devices since switches don't have lower netdevs today. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Sridhar Samudrala <sridhar.samudrala@intel.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-25bnx2x: Don't log mc removal needlesslyMintz, Yuval
When mc configuration changes bnx2x_config_mcast() can return 0 for success, negative for failure and positive for benign reason preventing its immediate work, e.g., when the command awaits the completion of a previously sent command. When removing all configured macs on a 578xx adapter, if a positive value would be returned driver would errneously log it as an error. Fixes: c7b7b483ccc9 ("bnx2x: Don't flush multicast MACs") Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-24Merge tag 'kbuild-fixes-v4.12-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild Pull Kbuild fixes from Masahiro Yamada: "Nothing scary, just some random fixes: - fix warnings of host programs - fix "make tags" when COMPILED_SOURCE=1 is specified along with O= - clarify help message of C=1 option - fix dependency for ncurses compatibility check - fix "make headers_install" for fakechroot environment" * tag 'kbuild-fixes-v4.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: kconfig: fix sparse warnings in nconfig kbuild: fix header installation under fakechroot environment kconfig: Check for libncurses before menuconfig Kbuild: tiny correction on `make help` tags: honor COMPILED_SOURCE with apart output directory genksyms: add printf format attribute to error_with_pos()
2017-06-24Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull timer fix from Eric Biederman: "This fixes an issue of confusing injected signals with the signals from posix timers that has existed since posix timers have been in the kernel. This patch is slightly simpler than my earlier version of this patch as I discovered in testing that I had misspelled "#ifdef CONFIG_POSIX_TIMERS". So I deleted that unnecessary test and made setting of resched_timer uncondtional. I have tested this and verified that without this patch there is a nasty hang that is easy to trigger, and with this patch everything works properly" Thomas Gleixner dixit: "It fixes the problem at hand and covers the ptrace case as well, which I missed. Reviewed-and-tested-by: Thomas Gleixner <tglx@linutronix.de>" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: signal: Only reschedule timers on signals timers have sent
2017-06-24x86/mshyperv: Remove excess #includes from mshyperv.hThomas Gleixner
A recent commit included linux/slab.h in linux/irq.h. This breaks the build of vdso32 on a 64-bit kernel. The reason is that linux/irq.h gets included into the vdso code via linux/interrupt.h which is included from asm/mshyperv.h. That makes the 32-bit vdso compile fail, because slab.h includes the pgtable headers for 64-bit on a 64-bit build. Neither linux/clocksource.h nor linux/interrupt.h are needed in the mshyperv.h header file itself - it has a dependency on <linux/atomic.h>. Remove the includes and unbreak the build. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: devel@linuxdriverproject.org Fixes: dee863b571b0 ("hv: export current Hyper-V clocksource") Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1706231038460.2647@nanos Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-06-23Merge tag 'powerpc-4.12-7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: "Some more powerpc fixes for 4.12. Most of these actually came in last week but got held up for some more testing. - three fixes for kprobes/ftrace/livepatch interactions. - properly handle data breakpoints when using the Radix MMU. - fix for perf sampling of registers during call_usermodehelper(). - properly initialise the thread_info on our emergency stacks - add an explicit flush when doing TLB invalidations for a process using NPU2. Thanks to: Alistair Popple, Naveen N. Rao, Nicholas Piggin, Ravi Bangoria, Masami Hiramatsu" * tag 'powerpc-4.12-7' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64: Initialise thread_info for emergency stacks powerpc/powernv/npu-dma: Add explicit flush when sending an ATSD powerpc/perf: Fix oops when kthread execs user process powerpc/64s: Handle data breakpoints in Radix mode powerpc/kprobes: Skip livepatch_handler() for jprobes powerpc/ftrace: Pass the correct stack pointer for DYNAMIC_FTRACE_WITH_REGS powerpc/kprobes: Pause function_graph tracing during jprobes handling
2017-06-23Merge tag 'acpi-4.12-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "This fixes the ACPI-based enumeration of some I2C and SPI devices broken in 4.11. Specifics: - I2C and SPI devices are expected to be enumerated by the I2C and SPI subsystems, respectively, but due to a change made during the 4.11 cycle, in some cases the ACPI core marks them as already enumerated which causes the I2C and SPI subsystems to overlook them, so fix that (Jarkko Nikula)" * tag 'acpi-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: ACPI / scan: Fix enumeration for special SPI and I2C devices
2017-06-23Merge branch 'i2c/for-current-fixed' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fix from Wolfram Sang. * 'i2c/for-current-fixed' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: imx: Use correct function to write to register
2017-06-23Merge tag 'gpio-v4.12-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO fix from Linus Walleij: "A single GPIO patch fixing the compatible string for the MVEBU PWM controller embedded in the GPIO controller before we release v4.12. Hopefully" * tag 'gpio-v4.12-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: gpio: mvebu: change compatible string for PWM support
2017-06-23Merge tag 'sound-4.12-rc7' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "Nothing exciting here, just a few stable fixes: - suppress spurious kernel WARNING in PCM core - fix potential spin deadlock at error handling in firewire - HD-audio PCI ID addition / fixup" * tag 'sound-4.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Apply quirks to Broxton-T, too ALSA: firewire-lib: Fix stall of process context at packet error ALSA: pcm: Don't treat NULL chmap as a fatal error ALSA: hda - Add Coffelake PCI ID
2017-06-23Merge tag 'drm-fixes-for-v4.12-rc7' of ↵Linus Torvalds
git://people.freedesktop.org/~airlied/linux Pull drm fixes from Dave Airlie: "A varied bunch of fixes, one for an API regression with connectors. Otherwise amdgpu and i915 have a bunch of varied fixes, the shrinker ones being the most important" * tag 'drm-fixes-for-v4.12-rc7' of git://people.freedesktop.org/~airlied/linux: drm: Fix GETCONNECTOR regression drm/radeon: add a quirk for Toshiba Satellite L20-183 drm/radeon: add a PX quirk for another K53TK variant drm/amdgpu: adjust default display clock drm/amdgpu/atom: fix ps allocation size for EnableDispPowerGating drm/amdgpu: add Polaris12 DID drm/i915: Don't enable backlight at setup time. drm/i915: Plumb the correct acquire ctx into intel_crtc_disable_noatomic() drm/i915: Fix deadlock witha the pipe A quirk during resume drm/i915: Remove __GFP_NORETRY from our buffer allocator drm/i915: Encourage our shrinker more when our shmemfs allocations fails drm/i915: Differentiate between sw write location into ring and last hw read
2017-06-23Merge tag 'random_for_linus_stable' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random Pull random fixes from Ted Ts'o: "Fix some locking and gcc optimization issues from the most recent random_for_linus_stable pull request" * tag 'random_for_linus_stable' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random: random: silence compiler warnings and fix race
2017-06-23Merge tag 'for-4.12/dm-fixes-4' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mike Snitzer: - a revert of a DM mirror commit that has proven to make the code prone to crash - a DM io reference count fix that resolves a NULL pointer seen when issuing discards to a DM mirror target's device whose mirror legs do not all support discards - a couple DM integrity fixes * tag 'for-4.12/dm-fixes-4' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm io: fix duplicate bio completion due to missing ref count dm integrity: fix to not disable/enable interrupts from interrupt context Revert "dm mirror: use all available legs on multiple failures" dm integrity: reject mappings too large for device
2017-06-23Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "8 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: fs/exec.c: account for argv/envp pointers ocfs2: fix deadlock caused by recursive locking in xattr slub: make sysfs file removal asynchronous lib/cmdline.c: fix get_options() overflow while parsing ranges fs/dax.c: fix inefficiency in dax_writeback_mapping_range() autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAIL mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappings mm, thp: remove cond_resched from __collapse_huge_page_copy
2017-06-23fs/exec.c: account for argv/envp pointersKees Cook
When limiting the argv/envp strings during exec to 1/4 of the stack limit, the storage of the pointers to the strings was not included. This means that an exec with huge numbers of tiny strings could eat 1/4 of the stack limit in strings and then additional space would be later used by the pointers to the strings. For example, on 32-bit with a 8MB stack rlimit, an exec with 1677721 single-byte strings would consume less than 2MB of stack, the max (8MB / 4) amount allowed, but the pointers to the strings would consume the remaining additional stack space (1677721 * 4 == 6710884). The result (1677721 + 6710884 == 8388605) would exhaust stack space entirely. Controlling this stack exhaustion could result in pathological behavior in setuid binaries (CVE-2017-1000365). [akpm@linux-foundation.org: additional commenting from Kees] Fixes: b6a2fea39318 ("mm: variable length argument support") Link: http://lkml.kernel.org/r/20170622001720.GA32173@beast Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Rik van Riel <riel@redhat.com> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Qualys Security Advisory <qsa@qualys.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23ocfs2: fix deadlock caused by recursive locking in xattrEric Ren
Another deadlock path caused by recursive locking is reported. This kind of issue was introduced since commit 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()"). Two deadlock paths have been fixed by commit b891fa5024a9 ("ocfs2: fix deadlock issue when taking inode lock at vfs entry points"). Yes, we intend to fix this kind of case in incremental way, because it's hard to find out all possible paths at once. This one can be reproduced like this. On node1, cp a large file from home directory to ocfs2 mountpoint. While on node2, run setfacl/getfacl. Both nodes will hang up there. The backtraces: On node1: __ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2] ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2] ocfs2_write_begin+0x43/0x1a0 [ocfs2] generic_perform_write+0xa9/0x180 __generic_file_write_iter+0x1aa/0x1d0 ocfs2_file_write_iter+0x4f4/0xb40 [ocfs2] __vfs_write+0xc3/0x130 vfs_write+0xb1/0x1a0 SyS_write+0x46/0xa0 On node2: __ocfs2_cluster_lock.isra.39+0x357/0x740 [ocfs2] ocfs2_inode_lock_full_nested+0x17d/0x840 [ocfs2] ocfs2_xattr_set+0x12e/0xe80 [ocfs2] ocfs2_set_acl+0x22d/0x260 [ocfs2] ocfs2_iop_set_acl+0x65/0xb0 [ocfs2] set_posix_acl+0x75/0xb0 posix_acl_xattr_set+0x49/0xa0 __vfs_setxattr+0x69/0x80 __vfs_setxattr_noperm+0x72/0x1a0 vfs_setxattr+0xa7/0xb0 setxattr+0x12d/0x190 path_setxattr+0x9f/0xb0 SyS_setxattr+0x14/0x20 Fix this one by using ocfs2_inode_{lock|unlock}_tracker, which is exported by commit 439a36b8ef38 ("ocfs2/dlmglue: prepare tracking logic to avoid recursive cluster lock"). Link: http://lkml.kernel.org/r/20170622014746.5815-1-zren@suse.com Fixes: 743b5f1434f5 ("ocfs2: take inode lock in ocfs2_iop_set/get_acl()") Signed-off-by: Eric Ren <zren@suse.com> Reported-by: Thomas Voegtle <tv@lio96.de> Tested-by: Thomas Voegtle <tv@lio96.de> Reviewed-by: Joseph Qi <jiangqi903@gmail.com> Cc: Mark Fasheh <mfasheh@versity.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Junxiao Bi <junxiao.bi@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23slub: make sysfs file removal asynchronousTejun Heo
Commit bf5eb3de3847 ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()") made slub sysfs file removals synchronous to kmem_cache shutdown. Unfortunately, this created a possible ABBA deadlock between slab_mutex and sysfs draining mechanism triggering the following lockdep warning. ====================================================== [ INFO: possible circular locking dependency detected ] 4.10.0-test+ #48 Not tainted ------------------------------------------------------- rmmod/1211 is trying to acquire lock: (s_active#120){++++.+}, at: [<ffffffff81308073>] kernfs_remove+0x23/0x40 but task is already holding lock: (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (slab_mutex){+.+.+.}: lock_acquire+0xf6/0x1f0 __mutex_lock+0x75/0x950 mutex_lock_nested+0x1b/0x20 slab_attr_store+0x75/0xd0 sysfs_kf_write+0x45/0x60 kernfs_fop_write+0x13c/0x1c0 __vfs_write+0x28/0x120 vfs_write+0xc8/0x1e0 SyS_write+0x49/0xa0 entry_SYSCALL_64_fastpath+0x1f/0xc2 -> #0 (s_active#120){++++.+}: __lock_acquire+0x10ed/0x1260 lock_acquire+0xf6/0x1f0 __kernfs_remove+0x254/0x320 kernfs_remove+0x23/0x40 sysfs_remove_dir+0x51/0x80 kobject_del+0x18/0x50 __kmem_cache_shutdown+0x3e6/0x460 kmem_cache_destroy+0x1fb/0x2d0 kvm_exit+0x2d/0x80 [kvm] vmx_exit+0x19/0xa1b [kvm_intel] SyS_delete_module+0x198/0x1f0 entry_SYSCALL_64_fastpath+0x1f/0xc2 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(slab_mutex); lock(s_active#120); lock(slab_mutex); lock(s_active#120); *** DEADLOCK *** 2 locks held by rmmod/1211: #0: (cpu_hotplug.dep_map){++++++}, at: [<ffffffff810a7877>] get_online_cpus+0x37/0x80 #1: (slab_mutex){+.+.+.}, at: [<ffffffff8120f691>] kmem_cache_destroy+0x41/0x2d0 stack backtrace: CPU: 3 PID: 1211 Comm: rmmod Not tainted 4.10.0-test+ #48 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v02.05 05/07/2012 Call Trace: print_circular_bug+0x1be/0x210 __lock_acquire+0x10ed/0x1260 lock_acquire+0xf6/0x1f0 __kernfs_remove+0x254/0x320 kernfs_remove+0x23/0x40 sysfs_remove_dir+0x51/0x80 kobject_del+0x18/0x50 __kmem_cache_shutdown+0x3e6/0x460 kmem_cache_destroy+0x1fb/0x2d0 kvm_exit+0x2d/0x80 [kvm] vmx_exit+0x19/0xa1b [kvm_intel] SyS_delete_module+0x198/0x1f0 ? SyS_delete_module+0x5/0x1f0 entry_SYSCALL_64_fastpath+0x1f/0xc2 It'd be the cleanest to deal with the issue by removing sysfs files without holding slab_mutex before the rest of shutdown; however, given the current code structure, it is pretty difficult to do so. This patch punts sysfs file removal to a work item. Before commit bf5eb3de3847, the removal was punted to a RCU delayed work item which is executed after release. Now, we're punting to a different work item on shutdown which still maintains the goal removing the sysfs files earlier when destroying kmem_caches. Link: http://lkml.kernel.org/r/20170620204512.GI21326@htj.duckdns.org Fixes: bf5eb3de3847 ("slub: separate out sysfs_slab_release() from sysfs_slab_remove()") Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Christoph Lameter <cl@linux.com> Cc: Pekka Enberg <penberg@kernel.org> Cc: David Rientjes <rientjes@google.com> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23lib/cmdline.c: fix get_options() overflow while parsing rangesIlya Matveychikov
When using get_options() it's possible to specify a range of numbers, like 1-100500. The problem is that it doesn't track array size while calling internally to get_range() which iterates over the range and fills the memory with numbers. Link: http://lkml.kernel.org/r/2613C75C-B04D-4BFF-82A6-12F97BA0F620@gmail.com Signed-off-by: Ilya V. Matveychikov <matvejchikov@gmail.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23fs/dax.c: fix inefficiency in dax_writeback_mapping_range()Jan Kara
dax_writeback_mapping_range() fails to update iteration index when searching radix tree for entries needing cache flushing. Thus each pagevec worth of entries is searched starting from the start which is inefficient and prone to livelocks. Update index properly. Link: http://lkml.kernel.org/r/20170619124531.21491-1-jack@suse.cz Fixes: 9973c98ecfda3 ("dax: add support for fsync/sync") Signed-off-by: Jan Kara <jack@suse.cz> Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com> Cc: Dan Williams <dan.j.williams@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23autofs: sanity check status reported with AUTOFS_DEV_IOCTL_FAILNeilBrown
If a positive status is passed with the AUTOFS_DEV_IOCTL_FAIL ioctl, autofs4_d_automount() will return ERR_PTR(status) with that status to follow_automount(), which will then dereference an invalid pointer. So treat a positive status the same as zero, and map to ENOENT. See comment in systemd src/core/automount.c::automount_send_ready(). Link: http://lkml.kernel.org/r/871sqwczx5.fsf@notabene.neil.brown.name Signed-off-by: NeilBrown <neilb@suse.com> Cc: Ian Kent <raven@themaw.net> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23mm/vmalloc.c: huge-vmap: fail gracefully on unexpected huge vmap mappingsArd Biesheuvel
Existing code that uses vmalloc_to_page() may assume that any address for which is_vmalloc_addr() returns true may be passed into vmalloc_to_page() to retrieve the associated struct page. This is not un unreasonable assumption to make, but on architectures that have CONFIG_HAVE_ARCH_HUGE_VMAP=y, it no longer holds, and we need to ensure that vmalloc_to_page() does not go off into the weeds trying to dereference huge PUDs or PMDs as table entries. Given that vmalloc() and vmap() themselves never create huge mappings or deal with compound pages at all, there is no correct answer in this case, so return NULL instead, and issue a warning. When reading /proc/kcore on arm64, you will hit an oops as soon as you hit the huge mappings used for the various segments that make up the mapping of vmlinux. With this patch applied, you will no longer hit the oops, but the kcore contents willl be incorrect (these regions will be zeroed out) We are fixing this for kcore specifically, so it avoids vread() for those regions. At least one other problematic user exists, i.e., /dev/kmem, but that is currently broken on arm64 for other reasons. Link: http://lkml.kernel.org/r/20170609082226.26152-1-ard.biesheuvel@linaro.org Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org> Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Laura Abbott <labbott@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: zhong jiang <zhongjiang@huawei.com> Cc: Dave Hansen <dave.hansen@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23mm, thp: remove cond_resched from __collapse_huge_page_copyDavid Rientjes
This is a partial revert of commit 338a16ba1549 ("mm, thp: copying user pages must schedule on collapse") which added a cond_resched() to __collapse_huge_page_copy(). On x86 with CONFIG_HIGHPTE, __collapse_huge_page_copy is called in atomic context and thus scheduling is not possible. This is only a possible config on arm and i386. Although need_resched has been shown to be set for over 100 jiffies while doing the iteration in __collapse_huge_page_copy, this is better than doing if (in_atomic()) cond_resched() to cover only non-CONFIG_HIGHPTE configs. Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1706191341550.97821@chino.kir.corp.google.com Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Larry Finger <Larry.Finger@lwfinger.net> Tested-by: Larry Finger <Larry.Finger@lwfinger.net> Acked-by: Michal Hocko <mhocko@suse.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-23Merge tag 'scsi-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "Two fixes to remove spurious WARN_ONs from the new(ish) qedi driver. The driver already prints a warning message, there's no need to panic users by printing something that looks like an oops as well" * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: scsi: qedi: Remove WARN_ON from clear task context. scsi: qedi: Remove WARN_ON for untracked cleanup.
2017-06-23Merge tag 'xfs-4.12-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Darrick Wong: "I have one more bugfix for you for 4.12-rc7 to fix a disk corruption problem: - don't allow swapon on files on the realtime device, because the swap code will swap pages out to blocks on the data device, thereby corrupting the filesystem" * tag 'xfs-4.12-fixes-5' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: don't allow bmap on rt files
2017-06-23Merge branch 'phy-internal'David S. Miller
Florian Fainelli says: ==================== net: phy: Support "internal" PHY interface This makes the "internal" phy-mode property generally available and documented and this allows us to remove some custom parsing code we had for bcmgenet and bcm_sf2 which both used that specific value. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23net: dsa: bcm_sf2: Remove special handling of "internal" phy-modeFlorian Fainelli
The PHY library now supports an "internal" phy-mode, thus making our custom parsing code now unnecessary. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23net: bcmgenet: Remove special handling of "internal" phy-modeFlorian Fainelli
The PHY library now supports an "internal" phy-mode, thus making our custom parsing code now unnecessary. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23net: phy: Support "internal" PHY interfaceFlorian Fainelli
Now that the Device Tree binding has been updated, update the PHY library phy_interface_t and phy_modes to support the "internal" PHY interface type. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23dt-bindings: Add "internal" as a valid 'phy-mode' propertyFlorian Fainelli
A number of Ethernet MACs have internal Ethernet PHYs and the internal wiring makes it so that this knowledge needs to be available using the standard 'phy-mode' property. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23Merge branch 'bnxt_en-fixes'David S. Miller
Michael Chan says: ==================== bnxt_en: Error handling and netpoll fixes. Add missing error handling and fix netpoll handling. The current code handles RX and TX events in netpoll mode and is causing lots of warnings and errors in the RX code path in netpoll mode. The fix is to only handle TX events in netpoll mode. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23bnxt_en: Fix netpoll handling.Michael Chan
To handle netpoll properly, the driver must only handle TX packets during NAPI. Handling RX events cause warnings and errors in netpoll mode. The ndo_poll_controller() method should call napi_schedule() directly so that a NAPI weight of zero will be used during netpoll mode. The bnxt_en driver supports 2 ring modes: combined, and separate rx/tx. In separate rx/tx mode, the ndo_poll_controller() method will only process the tx rings. In combined mode, the rx and tx completion entries are mixed in the completion ring and we need to drop the rx entries and recycle the rx buffers. Add a function bnxt_force_rx_discard() to handle this in netpoll mode when we see rx entries in combined ring mode. Reported-by: Calvin Owens <calvinowens@fb.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23bnxt_en: Add missing logic to handle TPA end error conditions.Michael Chan
When we get a TPA_END completion to handle a completed LRO packet, it is possible that hardware would indicate errors. The current code is not checking for the error condition. Define the proper error bits and the macro to check for this error and abort properly. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23net: dp83640: Avoid NULL pointer dereference.Richard Cochran
The function, skb_complete_tx_timestamp(), used to allow passing in a NULL pointer for the time stamps, but that was changed in commit 62bccb8cdb69051b95a55ab0c489e3cab261c8ef ("net-timestamp: Make the clone operation stand-alone from phy timestamping"), and the existing call sites, all of which are in the dp83640 driver, were fixed up. Even though the kernel-doc was subsequently updated in commit 7a76a021cd5a292be875fbc616daf03eab1e6996 ("net-timestamp: Update skb_complete_tx_timestamp comment"), still a bug fix from Manfred Rudigier came into the driver using the old semantics. Probably Manfred derived that patch from an older kernel version. This fix should be applied to the stable trees as well. Fixes: 81e8f2e930fe ("net: dp83640: Fix tx timestamp overflow handling.") Signed-off-by: Richard Cochran <richardcochran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23Merge tag 'mlx5-updates-2017-06-23' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux Saeed Mahameed says: ==================== mlx5-updates-2017-06-23 This series provides some updates to the mlx5 core and netdevice drivers. Three patches from Tariq, Introduces page reuse mechanism in non-Striding RQ RX datapath, we allow the the RX descriptor to reuse its allocated page as much as it could, until the page is fully consumed. RX page reuse reduces the stress on page allocator and improves RX performance especially with high speeds (100Gb/s). Next four patches of the series from Or allows to offload tc flower matching on ttl/hoplimit and header re-write of hoplimit. The rest of the series from Yotam and Or enhances mlx5 to support FW flashing through the mlxfw module, in a similar manner done by the mlxsw driver. Currently, only ethtool based flashing is implemented, where both Eth and IB ports are supported. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23cxgb4: Use Firmware params to get buffer-group mapArjun Vynipadath
Buffer group mappings can be obtained using FW_PARAMs cmd for newer FW. Since some of the bg_maps are obtained in atomic context, created another t4_query_params_ns(), that wont sleep when awaiting mbox cmd completion. Signed-off-by: Casey Leedom <leedom@chelsio.com> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23cxgb4: Update T6 Buffer Group and Channel MappingsArjun Vynipadath
We were using t4_get_mps_bg_map() for both t4_get_port_stats() to determine which MPS Buffer Groups to report statistics on for a given Port, and also for t4_sge_alloc_rxq() to provide a TP Ingress Channel Congestion Map. For T4/T5 these are actually the same values (because they are ~somewhat~ related), but for T6 they should return different values (T6 has Port 0 associated with MPS Buffer Group 0 (with MPS Buffer Group 1 silently cascading off) and Port 1 is associated with MPS Buffer Group 2 (with 3 cascading off)). Based on the original work by Casey Leedom <leedom@chelsio.com> Signed-off-by: Arjun Vynipadath <arjun@chelsio.com> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23tls: return -EFAULT if copy_to_user() failsDan Carpenter
The copy_to_user() function returns the number of bytes remaining but we want to return -EFAULT here. Fixes: 3c4d7559159b ("tls: kernel TLS support") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Dave Watson <davejwatson@fb.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23Merge branch 'master' of ↵David S. Miller
git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2017-06-23 1) Use memdup_user to spmlify xfrm_user_policy. From Geliang Tang. 2) Make xfrm_dev_register static to silence a sparse warning. From Wei Yongjun. 3) Use crypto_memneq to check the ICV in the AH protocol. From Sabrina Dubroca. 4) Remove some unused variables in esp6. From Stephen Hemminger. 5) Extend XFRM MIGRATE to allow to change the UDP encapsulation port. From Antony Antony. 6) Include the UDP encapsulation port to km_migrate announcements. From Antony Antony. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23Merge branch 'ena-new-features-and-improvements'David S. Miller
Netanel Belgazal says: ==================== net: update ena ethernet driver to version 1.2.0 This patchset contains some new features/improvements that were added to the ENA driver to increase its robustness and are based on experience of wide ENA deployment. Change log: V2: * Remove patch that add inline to C-file static function (contradict coding style). * Remove patch that moves MTU parameter validation in ena_change_mtu() instead of using the network stack. * Use upper_32_bits()/lower_32_bits() instead of casting. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23net: ena: update ena driver to version 1.2.0Netanel Belgazal
Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-06-23net: ena: update driver's rx drop statisticsNetanel Belgazal
rx drop counter is reported by the device in the keep-alive event. update the driver's counter with the device counter. Signed-off-by: Netanel Belgazal <netanel@amazon.com> Signed-off-by: David S. Miller <davem@davemloft.net>