path: root/kernel
2019-09-10 module: add config option MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS (Matthias Maennich)
If MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS is enabled (default=n), the requirement for modules to import all namespaces that are used by the module is relaxed. Enabling this option effectively allows (invalid) modules to be loaded while only a warning is emitted. Disabling this option keeps the enforcement at module loading time, and loading is denied if the module's imports are not satisfied. Reviewed-by: Martijn Coenen <maco@android.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Matthias Maennich <maennich@google.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
2019-09-10 module: add support for symbol namespaces. (Matthias Maennich)
The EXPORT_SYMBOL_NS() and EXPORT_SYMBOL_NS_GPL() macros can be used to export a symbol to a specific namespace. There are no _GPL_FUTURE and _UNUSED variants because these are currently unused, and I'm not sure they are necessary. I didn't add EXPORT_SYMBOL_NS() for ASM exports; this patch sets the namespace of ASM exports to NULL by default. In case of relative references, it will be relocatable to NULL. If there's a need, this should be pretty easy to add. A module that wants to use a symbol exported to a namespace must add a MODULE_IMPORT_NS() statement to its module code; otherwise, modpost will complain when building the module, and the kernel module loader will emit an error and fail when loading the module. MODULE_IMPORT_NS() adds a modinfo tag 'import_ns' to the module. That tag can be observed by the modinfo command, modpost and kernel/module.c at the time of loading the module. The ELF symbols are renamed to include the namespace with an asm label; for example, symbol 'usb_stor_suspend' in namespace USB_STORAGE becomes 'usb_stor_suspend.USB_STORAGE'. This allows modpost to do namespace checking, without having to go through all the effort of parsing ELF and relocation records just to get to the struct kernel_symbols. On x86_64 I saw no difference in binary size (compression), but at runtime this will require a word of memory per export to hold the namespace. An alternative could be to store namespaced symbols in their own section and use a separate 'struct namespaced_kernel_symbol' for that section, at the cost of making the module loader more complex. Co-developed-by: Martijn Coenen <maco@android.com> Signed-off-by: Martijn Coenen <maco@android.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Matthias Maennich <maennich@google.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
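A minimal usage sketch, reusing the example symbol from the message above (the call sites are illustrative, not part of this patch):

	/* exporter: placed after the definition of usb_stor_suspend() */
	EXPORT_SYMBOL_NS(usb_stor_suspend, USB_STORAGE);

	/* importer: any module calling usb_stor_suspend() must declare the
	 * namespace, or modpost warns and the module loader rejects it */
	MODULE_IMPORT_NS(USB_STORAGE);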
2019-09-10 module: support reading multiple values per modinfo tag (Matthias Maennich)
Similar to modpost's get_next_modinfo(), introduce get_next_modinfo() in kernel/module.c to acquire any further values associated with the same modinfo tag name. That is useful for any tags that have multiple occurrences (such as 'alias'), but is in particular introduced here as part of the symbol namespaces patch series to read the (potentially) multiple namespaces a module is importing. Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org> Reviewed-by: Martijn Coenen <maco@android.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Matthias Maennich <maennich@google.com> Signed-off-by: Jessica Yu <jeyu@kernel.org>
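A hedged sketch of the intended use; the loop shape mirrors the modpost counterpart and is illustrative rather than the literal kernel/module.c hunk:

	const char *ns;

	/* walk every 'import_ns' value recorded in .modinfo */
	for (ns = get_modinfo(info, "import_ns");
	     ns;
	     ns = get_next_modinfo(info, "import_ns", ns)) {
		/* check that this module may access namespace 'ns' */
	}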
2019-09-07 kernel.h: Add non_block_start/end() (Daniel Vetter)
In some special cases we must not block, but there's not a spinlock, preempt-off, irqs-off or similar critical section already that arms the might_sleep() debug checks. Add a non_block_start/end() pair to annotate these. This will be used in the oom paths of mmu-notifiers, where blocking is not allowed to make sure there's forward progress. Quoting Michal: "The notifier is called from quite a restricted context - oom_reaper - which shouldn't depend on any locks or sleepable conditionals. The code should be swift as well but we mostly do care about it to make a forward progress. Checking for sleepable context is the best thing we could come up with that would describe these demands at least partially." Peter also asked whether we want to catch spinlocks on top, but Michal said those are less of a problem because spinlocks can't have an indirect dependency upon the page allocator and hence close the loop with the oom reaper. Suggested by Michal Hocko. Link: https://lore.kernel.org/r/20190826201425.17547-4-daniel.vetter@ffwll.ch Acked-by: Christian König <christian.koenig@amd.com> (v1) Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Acked-by: Michal Hocko <mhocko@suse.com> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
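A minimal usage sketch (the body is a placeholder; only the annotation pair comes from this patch):

	non_block_start();
	/*
	 * Code that must make forward progress without blocking.
	 * Any attempt to sleep in here now triggers a might_sleep()
	 * splat, much like inside a preempt-disabled section.
	 */
	non_block_end();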
2019-09-06 kexec_elf: support 32 bit ELF files (Sven Schnelle)
The powerpc version only supported 64-bit ELF files. Add some code to switch the decoding of fields at runtime so we can kexec a 32-bit kernel from a 64-bit kernel and vice versa. Signed-off-by: Sven Schnelle <svens@stackframe.org> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Helge Deller <deller@gmx.de>
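The shape of such a runtime switch, as a hedged sketch (helper names are illustrative; the real code also byte-swaps via the elf_*_to_cpu() helpers):

	/* e_ident has the same layout in both ELF classes, so it can be
	 * inspected before deciding how to decode the rest of the file */
	static bool elf_is_elf32(const char *ehdr_buf)
	{
		return ehdr_buf[EI_CLASS] == ELFCLASS32;
	}

	/* decode one program-header field according to the file's class */
	static u64 elf_phdr_p_paddr(const void *phdr, bool is32)
	{
		return is32 ? ((const Elf32_Phdr *)phdr)->p_paddr
			    : ((const Elf64_Phdr *)phdr)->p_paddr;
	}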
2019-09-06 kexec_elf: remove unused variable in kexec_elf_load() (Sven Schnelle)
base was never assigned, so we can remove it. Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-06 kexec_elf: remove Elf_Rel macro (Sven Schnelle)
It wasn't used anywhere, so let's drop it. Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Sven Schnelle <svens@stackframe.org> Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-06 kexec_elf: remove PURGATORY_STACK_SIZE (Sven Schnelle)
It's not used anywhere so just drop it. Signed-off-by: Sven Schnelle <svens@stackframe.org> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-06 kexec_elf: remove parsing of section headers (Sven Schnelle)
We're not using them, so we can drop the parsing. Signed-off-by: Sven Schnelle <svens@stackframe.org> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-06 kexec_elf: change order of elf_*_to_cpu() functions (Sven Schnelle)
Reorder the functions into a 64/32/16 sequence; no functional change. Signed-off-by: Sven Schnelle <svens@stackframe.org> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-06 kexec: add KEXEC_ELF (Sven Schnelle)
Right now powerpc provides an implementation to read ELF files with the kexec_file_load() syscall. Make that available as a public kexec interface so it can be re-used on other architectures. Signed-off-by: Sven Schnelle <svens@stackframe.org> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Signed-off-by: Helge Deller <deller@gmx.de>
2019-09-06 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next (David S. Miller)
Daniel Borkmann says:
====================
The following pull-request contains BPF updates for your *net-next* tree. The main changes are:
1) Add the ability to use unaligned chunks in the AF_XDP umem. By relaxing where the chunks can be placed, it allows using an arbitrary buffer size and placement wherever there is a free address in the umem. Helps more seamless DPDK AF_XDP driver integration. Support for i40e, ixgbe and mlx5e, from Kevin and Maxim.
2) Addition of a wakeup flag for AF_XDP tx and fill rings so the application can wake up the kernel for rx/tx processing, which avoids busy-spinning of the latter, useful when app and driver are located on the same core. Support for i40e, ixgbe and mlx5e, from Magnus and Maxim.
3) bpftool fixes for printf()-like functions so the compiler can actually enforce checks, bpftool build system improvements for custom output directories, and addition of 'bpftool map freeze' command, from Quentin.
4) Support attaching/detaching XDP programs from 'bpftool net' command, from Daniel.
5) Automatic xskmap cleanup when AF_XDP socket is released, and several barrier/{read,write}_once fixes in AF_XDP code, from Björn.
6) Relicense of bpf_helpers.h/bpf_endian.h for future libbpf inclusion as well as libbpf versioning improvements, from Andrii.
7) Several new BPF kselftests for verifier precision tracking, from Alexei.
8) Several BPF kselftest fixes wrt endianness to run on s390x, from Ilya.
9) And more BPF kselftest improvements all over the place, from Stanislav.
10) Add simple BPF map op cache for nfp driver to batch dumps, from Jakub.
11) AF_XDP socket umem mapping improvements for 32bit archs, from Ivan.
12) Add BPF-to-BPF call and BTF line info support for s390x JIT, from Yauheni.
13) Small optimization in arm64 JIT to spare 1 insn for BPF_MOD, from Jerin.
14) Fix an error check in bpf_tcp_gen_syncookie() helper, from Petar.
15) Various minor fixes and cleanups, from Nathan, Masahiro, Masanari, Peter, Wei, Yue.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-06 Merge tag 'irqchip-5.4' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core (Thomas Gleixner)
Pull irqchip updates for Linux 5.4 from Marc Zyngier:
- Large GICv3 updates to support new PPI and SPI ranges
- Convert all alloc_fwnode() users to use PAs instead of VAs
- Add support for Marvell's MMP3 irqchip
- Add support for Amlogic Meson SM1
- Various cleanups and fixes
2019-09-06 perf/hw_breakpoint: Fix arch_hw_breakpoint use-before-initialization (Mark-PK Tsai)
If the compiler's auto-initialization feature is disabled (i.e. neither -fplugin-arg-structleak_plugin-byref nor -ftrivial-auto-var-init=pattern is in use), arch_hw_breakpoint may be used before initialization after: 9a4903dde2c86 ("perf/hw_breakpoint: Split attribute parse and commit") On our ARM platform, the struct step_ctrl in arch_hw_breakpoint, which used to be zero-initialized by kzalloc(), may be used in arch_install_hw_breakpoint() without initialization. Signed-off-by: Mark-PK Tsai <mark-pk.tsai@mediatek.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alix Wu <alix.wu@mediatek.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: YJ Chiang <yj.chiang@mediatek.com> Link: https://lkml.kernel.org/r/20190906060115.9460-1-mark-pk.tsai@mediatek.com [ Minor edits. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-06 Merge branch 'x86/cleanups' into x86/cpu, to pick up dependent changes (Ingo Molnar)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-05 genirq: Prevent NULL pointer dereference in resend_irqs() (Yunfeng Ye)
The following crash was observed:

	Unable to handle kernel NULL pointer dereference at 0000000000000158
	Internal error: Oops: 96000004 [#1] SMP
	pc : resend_irqs+0x68/0xb0
	lr : resend_irqs+0x64/0xb0
	...
	Call trace:
	 resend_irqs+0x68/0xb0
	 tasklet_action_common.isra.6+0x84/0x138
	 tasklet_action+0x2c/0x38
	 __do_softirq+0x120/0x324
	 run_ksoftirqd+0x44/0x60
	 smpboot_thread_fn+0x1ac/0x1e8
	 kthread+0x134/0x138
	 ret_from_fork+0x10/0x18

The reason for this is that the interrupt resend mechanism happens in soft interrupt context, which is an asynchronous mechanism versus other operations on interrupts. free_irq() does not take resend handling into account. Thus, the irq descriptor might be already freed before the resend tasklet is executed. resend_irqs() does not check the return value of the interrupt descriptor lookup and dereferences the return value unconditionally.

	1): __setup_irq
	      irq_startup
	        check_irq_resend  // activate softirq to handle resend irq
	2): irq_domain_free_irqs
	      irq_free_descs
	        free_desc
	          call_rcu(&desc->rcu, delayed_free_desc)
	3): __do_softirq
	      tasklet_action
	        resend_irqs
	          desc = irq_to_desc(irq)
	          desc->handle_irq(desc)  // desc is NULL --> Ooops

Fix this by adding a NULL pointer check in resend_irqs() before dereferencing the irq descriptor. Fixes: a4633adcdbc1 ("[PATCH] genirq: add genirq sw IRQ-retrigger") Signed-off-by: Yunfeng Ye <yeyunfeng@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/1630ae13-5c8e-901e-de09-e740b6a426a7@huawei.com
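The essence of the fix, sketched (the loop is paraphrased from kernel/irq/resend.c rather than copied verbatim):

	while (!bitmap_empty(irqs_resend, nr_irqs)) {
		irq = find_first_bit(irqs_resend, nr_irqs);
		clear_bit(irq, irqs_resend);
		desc = irq_to_desc(irq);
		if (!desc)	/* descriptor was freed under us: skip */
			continue;
		local_irq_disable();
		desc->handle_irq(desc);
		local_irq_enable();
	}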
2019-09-05 alarmtimer: Use EOPNOTSUPP instead of ENOTSUPP (Thadeu Lima de Souza Cascardo)
ENOTSUPP is not supposed to be returned to userspace. This was found on an OpenPower machine, where the RTC does not support set_alarm. On that system, a clock_nanosleep(CLOCK_REALTIME_ALARM, ...) results in "524 Unknown error 524" Replace it with EOPNOTSUPP which results in the expected "95 Operation not supported" error. Fixes: 1c6b39ad3f01 (alarmtimers: Return -ENOTSUPP if no RTC device is present) Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20190903171802.28314-1-cascardo@canonical.com
2019-09-05 tracing: Add "gfp_t" support in synthetic_events (Zhengjun Xing)
Add "gfp_t" support in synthetic_events, then the "gfp_t" type parameter in some functions can be traced. Prints the gfp flags as hex in addition to the human-readable flag string. Example output: whoopsie-630 [000] ...1 78.969452: testevent: bar=b20 (GFP_ATOMIC|__GFP_ZERO) rcuc/0-11 [000] ...1 81.097555: testevent: bar=a20 (GFP_ATOMIC) rcuc/0-11 [000] ...1 81.583123: testevent: bar=a20 (GFP_ATOMIC) Link: http://lkml.kernel.org/r/20190712015308.9908-1-zhengjun.xing@linux.intel.com Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com> [ Added printing of flag names ] Signed-off-by: Tom Zanussi <zanussi@kernel.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-09-05 bpf: fix precision tracking of stack slots (Alexei Starovoitov)
The problem can be seen in the following two tests:

	0: (bf) r3 = r10
	1: (55) if r3 != 0x7b goto pc+0
	2: (7a) *(u64 *)(r3 -8) = 0
	3: (79) r4 = *(u64 *)(r10 -8)
	..
	0: (85) call bpf_get_prandom_u32#7
	1: (bf) r3 = r10
	2: (55) if r3 != 0x7b goto pc+0
	3: (7b) *(u64 *)(r3 -8) = r0
	4: (79) r4 = *(u64 *)(r10 -8)

When backtracking needs to mark R4, it will mark slot fp-8. But the ST or STX into fp-8 could belong to the same block of instructions. When backtracking is done, the parent state may have the fp-8 slot as "unallocated stack", which will cause the verifier to warn and incorrectly reject such programs. Writes into the stack via a non-R10 register are rare; llvm always generates canonical stack spill/fill. For such a pathological case, fall back to conservative precision tracking instead of rejecting. Reported-by: syzbot+c8d66267fd2b5955287e@syzkaller.appspotmail.com Fixes: b5dc0163d8fd ("bpf: precise scalar_value tracking") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2019-09-05 hrtimer: Add a missing bracket and hide `migration_base' on !SMP (Sebastian Andrzej Siewior)
The recent change to avoid taking the expiry lock when a timer is currently migrated failed to add a bracket at the end of the if statement, leading to compile errors. Since that commit the variable `migration_base' is always used, but it is only available in SMP configurations, thus leading to another compile error. The changelog says "The timer base and base->cpu_base cannot be NULL in the code path", so it is safe to limit this check to SMP configurations only. Add the missing bracket to the if statement and hide `migration_base' behind CONFIG_SMP bars. [ tglx: Mark the functions inline ... ] Fixes: 68b2c8c1e4210 ("hrtimer: Don't take expiry_lock when timer is currently migrated") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/20190904145527.eah7z56ntwobqm6j@linutronix.de
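A hedged sketch of the CONFIG_SMP split (paraphrased, not the verbatim hunk):

	#ifdef CONFIG_SMP
	static inline bool is_migration_base(struct hrtimer_clock_base *base)
	{
		return base == &migration_base;
	}
	#else
	static inline bool is_migration_base(struct hrtimer_clock_base *base)
	{
		return false;	/* timers never migrate on UP */
	}
	#endif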
2019-09-05 kprobes: Prohibit probing on BUG() and WARN() address (Masami Hiramatsu)
Since BUG() and WARN() may use a trap (e.g. UD2 on x86) to get the address where the BUG() has occurred, kprobes can not single-step that instruction out-of-line. So prohibit probing on such addresses. Without this fix, if someone puts a kprobe on WARN(), the kernel will crash with an invalid opcode error instead of outputting the warning message, because the kernel can not find the correct bug address. Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org> Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Acked-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S . Miller <davem@davemloft.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Naveen N . Rao <naveen.n.rao@linux.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lkml.kernel.org/r/156750890133.19112.3393666300746167111.stgit@devnote2 Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-04 sched/core: Fix uclamp ABI bug, clean up and robustify sched_read_attr() ABI logic and code (Ingo Molnar)
Thadeu Lima de Souza Cascardo reported that 'chrt' broke on recent kernels:

	$ chrt -p $$
	chrt: failed to get pid 26306's policy: Argument list too long

and he has root-caused the bug to the following commit increasing sched_attr size and breaking sched_read_attr() into returning -EFBIG: a509a7cd7974 ("sched/uclamp: Extend sched_setattr() to support utilization clamping") The other, bigger bug is that the whole sched_getattr() and sched_read_attr() logic of checking non-zero bits in new ABI components is arguably broken, and pretty much any extension of the ABI will spuriously break the ABI. That's way too fragile. Instead, implement the perf syscall's extensible ABI, which we already implement on the sched_setattr() side:
- if user-attributes have the same size as kernel attributes then the logic is unchanged.
- if user-attributes are larger than the kernel knows about then simply skip the extra bits, but set attr->size to the (smaller) kernel size so that tooling can (in principle) handle older kernels as well.
- if user-attributes are smaller than the kernel knows about then just copy whatever user-space can accept.
Also clean up the whole logic:
- Simplify the code flow - there's no need for 'ret' for example.
- Standardize on 'kattr/uattr' and 'ksize/usize' naming to make sure we always know which side we are dealing with.
- Why is it called 'read' when what it does is to copy to user? This code is so far away from VFS read() semantics that the naming is actively confusing. Name it sched_attr_copy_to_user() instead, which mirrors other copy_to_user() functionality.
- Move the attr->size assignment from the head of sched_getattr() to the sched_attr_copy_to_user() function. Nothing else within the kernel should care about the size of the structure.
With these fixes the sched_getattr() syscall now nicely supports an extensible ABI in both a forward and backward compatible fashion, and will also fix the chrt bug. As an added bonus the bogus -EFBIG return is removed as well, which as Thadeu noted should have been -E2BIG to begin with. Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Tested-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Tested-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> Cc: Arnaldo Carvalho de Melo <acme@infradead.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Patrick Bellasi <patrick.bellasi@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: a509a7cd7974 ("sched/uclamp: Extend sched_setattr() to support utilization clamping") Link: https://lkml.kernel.org/r/20190904075532.GA26751@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
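The extensible-ABI copy reduces to clamping on the smaller of the two sizes; a simplified sketch of the resulting sched_attr_copy_to_user() (error handling trimmed):

	static int
	sched_attr_copy_to_user(struct sched_attr __user *uattr,
				struct sched_attr *kattr,
				unsigned int usize)
	{
		unsigned int ksize = sizeof(*kattr);

		/* report the size actually copied, so older tooling copes */
		kattr->size = min(usize, ksize);

		if (copy_to_user(uattr, kattr, kattr->size))
			return -EFAULT;
		return 0;
	}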
2019-09-04 kbuild: add $(BASH) to run scripts with bash-extension (Masahiro Yamada)
CONFIG_SHELL falls back to sh when bash is not installed on the system, but nobody is testing such a case since bash is usually installed. So, shell scripts invoked by CONFIG_SHELL are only tested with bash. It makes it difficult to test whether the hashbang #!/bin/sh is real. For example, #!/bin/sh in arch/powerpc/kernel/prom_init_check.sh is false. (I fixed it up) Besides, some shell scripts invoked by CONFIG_SHELL use bash extensions and #!/bin/bash is specified as the hashbang, while CONFIG_SHELL may not always be set to bash. Probably, the right thing to do is to introduce BASH, which is bash by default, and always set CONFIG_SHELL to sh. Replace $(CONFIG_SHELL) with $(BASH) for bash scripts. If somebody tries to add bash extensions to a #!/bin/sh script, it will be caught in testing because /bin/sh is a symlink to dash on some major distributions. Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
2019-09-04 dma-mapping: introduce a dma_common_find_pages helper (Christoph Hellwig)
A helper to find the backing page array based on a virtual address. This also ensures we do the same vm_flags check everywhere instead of slightly different or missing ones in a few places. Signed-off-by: Christoph Hellwig <hch@lst.de>
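The helper is essentially a guarded find_vm_area() lookup; a sketch close to the obvious implementation (treat it as illustrative rather than the exact hunk):

	struct page **dma_common_find_pages(void *cpu_addr)
	{
		struct vm_struct *area = find_vm_area(cpu_addr);

		/* only remapped DMA allocations carry a backing page array */
		if (!area || !(area->flags & VM_DMA_COHERENT))
			return NULL;
		return area->pages;
	}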
2019-09-04 dma-mapping: always use VM_DMA_COHERENT for generic DMA remap (Christoph Hellwig)
Currently the generic dma remap allocator gets passed a vm_flags argument by the caller, which is a little confusing. We just introduced a generic vmalloc-level flag to identify the dma coherent allocations, so use that everywhere and remove the now pointless argument. Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-09-04 dma-mapping: provide a better default ->get_required_mask (Christoph Hellwig)
Most dma_map_ops instances are IOMMUs that work perfectly fine in 32 bits of IOVA space, and the generic direct mapping code already provides its own routines that are intelligent based on the amount of memory actually present. Wire up the dma-direct routine for the ARM direct mapping code as well, and otherwise default to the constant 32-bit mask. This way we only need to override it for the occasional odd IOMMU that requires 64-bit IOVA support, or IOMMU drivers that are more efficient if they can fall back to the direct mapping. Signed-off-by: Christoph Hellwig <hch@lst.de>
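A hedged sketch of what the resulting default boils down to (helper names approximate the code of that era, not the verbatim patch):

	u64 dma_get_required_mask(struct device *dev)
	{
		const struct dma_map_ops *ops = get_dma_ops(dev);

		if (dma_is_direct(ops))
			return dma_direct_get_required_mask(dev);
		if (ops->get_required_mask)
			return ops->get_required_mask(dev);
		/* typical IOMMU: 32 bits of IOVA space is plenty */
		return DMA_BIT_MASK(32);
	}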
2019-09-04 dma-mapping: remove the dma_declare_coherent_memory export (Christoph Hellwig)
dma_declare_coherent_memory is something that the platform setup code (which pretty much means the device tree these days) needs to do so that drivers can use the memory as declared by the platform. Drivers themselves have no business calling this function. Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-09-04 dma-mapping: remove the dma_mmap_from_dev_coherent export (Christoph Hellwig)
dma_mmap_from_dev_coherent is only used by dma_map_ops instances, none of which is modular. Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-09-04 dma-mapping: remove dma_release_declared_memory (Christoph Hellwig)
This function is entirely unused given that declared memory is generally provided by platform setup code. Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-09-04 dma-mapping: remove CONFIG_ARCH_NO_COHERENT_DMA_MMAP (Christoph Hellwig)
CONFIG_ARCH_NO_COHERENT_DMA_MMAP is now functionally identical to !CONFIG_MMU, so remove the separate symbol. The only difference is that arm did not set it for !CONFIG_MMU, but arm uses a separate dma mapping implementation including its own mmap method, which is handled by moving the CONFIG_MMU check in dma_can_mmap so that it only applies to the dma-direct case, just as the other ifdefs for it. Signed-off-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k
2019-09-04 dma-mapping: add a dma_can_mmap helper (Christoph Hellwig)
Add a helper to check if DMA allocations for a specific device can be mapped to userspace using dma_mmap_*. Signed-off-by: Christoph Hellwig <hch@lst.de>
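A hedged caller-side sketch (the driver and its fields are hypothetical; only dma_can_mmap() and dma_mmap_coherent() are real APIs):

	static int foo_mmap(struct file *file, struct vm_area_struct *vma)
	{
		struct foo_dev *fd = file->private_data;

		if (!dma_can_mmap(fd->dev))
			return -ENXIO;	/* no userspace mapping on this setup */
		return dma_mmap_coherent(fd->dev, vma, fd->cpu_addr,
					 fd->dma_addr, fd->size);
	}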
2019-09-04 dma-mapping: explicitly wire up ->mmap and ->get_sgtable (Christoph Hellwig)
While the default ->mmap and ->get_sgtable implementations work for the majority of our dma_map_ops implementations, they are inherently unsafe for others that don't use the page allocator or CMA and/or use their own way of remapping not covered by the common code. So remove the defaults if these methods are not wired up, but instead wire up the default implementations for all safe instances. Fixes: e1c7e324539a ("dma-mapping: always provide the dma_map_ops based implementation") Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-09-04 dma-mapping: move the dma_get_sgtable API comments from arm to common code (Christoph Hellwig)
The comments are spot on and should be near the central API, not just near a single implementation. Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-09-03 kgdb: fix comment regarding static function (Nadav Amit)
The comment that says that module_event() is not static is clearly wrong. Signed-off-by: Nadav Amit <namit@vmware.com> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
2019-09-03 kdb: Replace strncmp with str_has_prefix (Chuhong Yuan)
strncmp(str, const, len) is error-prone. We had better use the newly introduced str_has_prefix() instead. Signed-off-by: Chuhong Yuan <hslester96@gmail.com> Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
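Side by side, the reason str_has_prefix() is less error-prone (the prefix and handler are illustrative, not necessarily from kdb):

	/* error-prone: the hand-counted length must match the literal */
	if (strncmp(cmd, "defcmd", 6) == 0)
		handle_defcmd();

	/* self-describing: no separate length to get wrong */
	if (str_has_prefix(cmd, "defcmd"))
		handle_defcmd();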
2019-09-03 cpuidle: play_idle: Increase the resolution to usec (Daniel Lezcano)
The play_idle resolution is 1ms. The intel_powerclamp bases the idle duration on jiffies. The idle injection API is also using msec based durations but has no user yet. Unfortunately, msec based time does not fit well when we want to inject idle cycles precisely with shallow idle states. In order to set the scene for the incoming idle injection user, move the precision up to usec when calling play_idle. Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-09-03 irqdomain: Add the missing assignment of domain->fwnode for named fwnode (Dexuan Cui)
Recently device pass-through stopped working for a Linux VM running on Hyper-V. git-bisect shows the regression is caused by the recent commit 467a3bb97432 ("PCI: hv: Allocate a named fwnode ..."), but the root cause is that commit d59f6617eef0 forgot to set the domain->fwnode for IRQCHIP_FWNODE_NAMED*, and as a result:
1. The domain->fwnode remains NULL.
2. irq_find_matching_fwspec() returns NULL since "h->fwnode == fwnode" is false, and pci_set_bus_msi_domain() sets the Hyper-V PCI root bus's msi_domain to NULL.
3. When the device is added onto the root bus, the device's dev->msi_domain is set to NULL in pci_set_msi_domain().
4. When a device driver tries to enable MSI-X, pci_msi_setup_msi_irqs() calls arch_setup_msi_irqs(), which uses the native MSI chip (i.e. arch/x86/kernel/apic/msi.c: pci_msi_controller) to set up the irqs, but actually pci_msi_setup_msi_irqs() is supposed to call msi_domain_alloc_irqs() with the hbus->irq_domain, which is created in hv_pcie_init_irq_domain() and is associated with the Hyper-V chip hv_msi_irq_chip. Consequently, the irq line is not properly set up, and the device driver can not receive any interrupt.
Fixes: d59f6617eef0 ("genirq: Allow fwnode to carry name information only") Fixes: 467a3bb97432 ("PCI: hv: Allocate a named fwnode instead of an address-based one") Reported-by: Lili Deng <v-lide@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/PU1P153MB01694D9AF625AC335C600C5FBFBE0@PU1P153MB0169.APCP153.PROD.OUTLOOK.COM
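The essence of the fix in __irq_domain_add(), sketched (paraphrased, not the verbatim hunk):

	switch (fwid->type) {
	case IRQCHIP_FWNODE_NAMED:
	case IRQCHIP_FWNODE_NAMED_ID:
		domain->fwnode = fwnode;	/* previously left NULL: the bug */
		domain->name = kstrdup(fwid->name, GFP_KERNEL);
		/* ... */
		break;
	/* ... */
	}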
2019-09-03 sched/uclamp: Always use 'enum uclamp_id' for clamp_id values (Patrick Bellasi)
The supported clamp indexes are defined in 'enum clamp_id'; however, because of the code logic in some of the first versions of the utilization clamping series, sometimes we needed to use 'unsigned int' to represent indices. This is no longer required since the final version of the uclamp_* APIs can always use the proper enum uclamp_id type. Fix it with a bulk rename now that we have all the bits merged. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-7-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-03 sched/uclamp: Update CPU's refcount on TG's clamp changes (Patrick Bellasi)
On updates of task group (TG) clamp values, ensure that these new values are enforced on all RUNNABLE tasks of the task group, i.e. all RUNNABLE tasks are immediately boosted and/or capped as requested. Do that each time we update effective clamps from cpu_util_update_eff(). Use the *cgroup_subsys_state (css) to walk the list of tasks in each affected TG and update their RUNNABLE tasks. Update each task by using the same mechanism used for cpu affinity masks updates, i.e. by taking the rq lock. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-6-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-03 sched/uclamp: Use TG's clamps to restrict TASK's clamps (Patrick Bellasi)
When a task-specific clamp value is configured via sched_setattr(2), this value is accounted in the corresponding clamp bucket every time the task is {en,de}queued. However, when cgroups are also in use, the task-specific clamp values could be restricted by the task_group (TG) clamp values. Update uclamp_cpu_inc() to aggregate task and TG clamp values. Every time a task is enqueued, it's accounted in the clamp bucket tracking the smaller clamp between the task-specific value and its TG effective value. This allows to:
1. ensure cgroup clamps are always used to restrict task specific requests, i.e. boosted not more than its TG effective protection and capped at least as its TG effective limit.
2. implement a "nice-like" policy, where tasks are still allowed to request less than what is enforced by their TG effective limits and protections
Do this by exploiting the concept of "effective" clamp, which is already used by a TG to track parent-enforced restrictions. Apply task group clamp restrictions only to tasks belonging to a child group; for tasks in the root group or in an autogroup, system defaults are still enforced. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-5-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
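The aggregation itself is just a min(); a hedged sketch of the restriction step (field names simplified from the actual data structures):

	static unsigned int
	uclamp_tg_restrict(struct task_struct *p, enum uclamp_id clamp_id)
	{
		unsigned int value = p->uclamp_req[clamp_id].value;
	#ifdef CONFIG_UCLAMP_TASK_GROUP
		unsigned int tg_eff = task_group(p)->uclamp[clamp_id].value;

		/* the TG effective value bounds the task's own request */
		value = min(value, tg_eff);
	#endif
		return value;
	}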
2019-09-03 sched/uclamp: Propagate system defaults to the root group (Patrick Bellasi)
The clamp values are not tunable at the level of the root task group. That's for two main reasons:
- the root group represents "system resources" which are always entirely available from the cgroup standpoint.
- when tuning/restricting "system resources" makes sense, tuning must be done using a system wide API which should also be available when control groups are not.
When a system wide restriction is available, cgroups should be aware of its value in order to know exactly how much "system resources" are available for the subgroups. Utilization clamping already supports the concepts of:
- system defaults: which define the maximum possible clamp values usable by tasks.
- effective clamps: which allow a parent cgroup to constrain (maybe temporarily) its descendants without losing the information related to the values "requested" from them.
Exploit these two concepts and bind them together in such a way that, whenever system defaults are tuned, the new values are propagated to (possibly) restrict or relax the "effective" value of nested cgroups. When cgroups are in use, force an update of all the RUNNABLE tasks. Otherwise, keep things simple and do just a lazy update the next time each task is enqueued. Do that since we assume a stricter resource control is required when cgroups are in use. This also allows keeping "effective" clamp values updated in case we need to expose them to user-space. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-4-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-03 sched/uclamp: Propagate parent clamps (Patrick Bellasi)
In order to properly support hierarchical resources control, the cgroup delegation model requires that attribute writes from a child group never fail but still are locally consistent and constrained based on parent's assigned resources. This requires to properly propagate and aggregate parent attributes down to its descendants. Implement this mechanism by adding a new "effective" clamp value for each task group. The effective clamp value is defined as the smaller value between the clamp value of a group and the effective clamp value of its parent. This is the actual clamp value enforced on tasks in a task group. Since it's possible for a cpu.uclamp.min value to be bigger than the cpu.uclamp.max value, ensure local consistency by restricting each "protection" (i.e. min utilization) with the corresponding "limit" (i.e. max utilization). Do that at effective clamps propagation to ensure user-space writes never fail while still always tracking the most restrictive values. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-3-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-03 sched/uclamp: Extend CPU's cgroup controller (Patrick Bellasi)
The cgroup CPU bandwidth controller allows to assign a specified (maximum) bandwidth to the tasks of a group. However this bandwidth is defined and enforced only on a temporal base, without considering the actual frequency a CPU is running on. Thus, the amount of computation completed by a task within an allocated bandwidth can be very different depending on the actual frequency the CPU is running that task at. The amount of computation can also be affected by the specific CPU a task is running on, especially when running on asymmetric capacity systems like Arm's big.LITTLE. With the availability of schedutil, the scheduler is now able to drive frequency selections based on actual task utilization. Moreover, the utilization clamping support provides a mechanism to bias the frequency selection operated by schedutil depending on constraints assigned to the tasks currently RUNNABLE on a CPU. Given the mechanisms described above, it is now possible to extend the cpu controller to specify the minimum (or maximum) utilization which should be considered for tasks RUNNABLE on a cpu. This makes it possible to better define the actual computational power assigned to task groups, thus improving the cgroup CPU bandwidth controller which is currently based just on time constraints. Extend the CPU controller with a couple of new attributes uclamp.{min,max} which allow to enforce utilization boosting and capping for all the tasks in a group. Specifically:
- uclamp.min: defines the minimum utilization which should be considered, i.e. the RUNNABLE tasks of this group will run at least at a minimum frequency which corresponds to the uclamp.min utilization
- uclamp.max: defines the maximum utilization which should be considered, i.e. the RUNNABLE tasks of this group will run up to a maximum frequency which corresponds to the uclamp.max utilization
These attributes:
a) are available only for non-root nodes, both on default and legacy hierarchies, while system wide clamps are defined by a generic interface which does not depend on cgroups. This system wide interface enforces constraints on tasks in the root node.
b) enforce effective constraints at each level of the hierarchy, which are a restriction of the group requests considering its parent's effective constraints. Root group effective constraints are defined by the system wide interface. This mechanism allows each (non-root) level of the hierarchy to request whatever clamp values it would like to get, but effectively get only up to the maximum amount allowed by its parent.
c) have higher priority than task-specific clamps, defined via sched_setattr(), thus allowing to control and restrict task requests.
Add two new attributes to the cpu controller to collect "requested" clamp values. Allow that at each non-root level of the hierarchy. Keep it simple by not caring now about "effective" values computation and propagation along the hierarchy. Update sysctl_sched_uclamp_handler() to use the newly introduced uclamp_mutex so that we serialize system default updates with cgroup related updates. Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michal Koutny <mkoutny@suse.com> Acked-by: Tejun Heo <tj@kernel.org> Cc: Alessio Balsini <balsini@android.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Joel Fernandes <joelaf@google.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Morten Rasmussen <morten.rasmussen@arm.com> Cc: Paul Turner <pjt@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Quentin Perret <quentin.perret@arm.com> Cc: Rafael J . Wysocki <rafael.j.wysocki@intel.com> Cc: Steve Muckle <smuckle@google.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Todd Kjos <tkjos@google.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lkml.kernel.org/r/20190822132811.31294-2-patrick.bellasi@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-03 sched/topology: Improve load balancing on AMD EPYC systems (Matt Fleming)
SD_BALANCE_{FORK,EXEC} and SD_WAKE_AFFINE are stripped in sd_init() for any sched domains with a NUMA distance greater than 2 hops (RECLAIM_DISTANCE). The idea being that it's expensive to balance across domains that far apart. However, as is rather unfortunately explained in: commit 32e45ff43eaf ("mm: increase RECLAIM_DISTANCE to 30") the value for RECLAIM_DISTANCE is based on node distance tables from 2011-era hardware. Current AMD EPYC machines have the following NUMA node distances:

	node distances:
	node   0   1   2   3   4   5   6   7
	  0:  10  16  16  16  32  32  32  32
	  1:  16  10  16  16  32  32  32  32
	  2:  16  16  10  16  32  32  32  32
	  3:  16  16  16  10  32  32  32  32
	  4:  32  32  32  32  10  16  16  16
	  5:  32  32  32  32  16  10  16  16
	  6:  32  32  32  32  16  16  10  16
	  7:  32  32  32  32  16  16  16  10

where 2 hops is 32. The result is that the scheduler fails to load balance properly across NUMA nodes on different sockets -- 2 hops apart. For example, pinning 16 busy threads to NUMA nodes 0 (CPUs 0-7) and 4 (CPUs 32-39) like so,

	$ numactl -C 0-7,32-39 ./spinner 16

causes all threads to fork and remain on node 0 until the active balancer kicks in after a few seconds and forcibly moves some threads to node 4. Override node_reclaim_distance for AMD Zen. Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Mel Gorman <mgorman@techsingularity.net> Cc: Borislav Petkov <bp@alien8.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Suravee.Suthikulpanit@amd.com Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas.Lendacky@amd.com Cc: Tony Luck <tony.luck@intel.com> Link: https://lkml.kernel.org/r/20190808195301.13222-3-matt@codeblueprint.co.uk Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-03 sched/fair: Don't assign runtime for throttled cfs_rq (Liangyan)
do_sched_cfs_period_timer() will refill cfs_b runtime and call distribute_cfs_runtime() to unthrottle cfs_rq; sometimes cfs_b->runtime will allocate all quota to one cfs_rq incorrectly, and then other cfs_rqs attached to this cfs_b can't get runtime and will be throttled. We find that one throttled cfs_rq has non-negative cfs_rq->runtime_remaining and causes an unexpected cast from s64 to u64 in this snippet:

	distribute_cfs_runtime() {
		runtime = -cfs_rq->runtime_remaining + 1;
	}

The runtime here will change to a very large number and consume all of cfs_b->runtime in this cfs_b period. According to Ben Segall, the throttled cfs_rq can have account_cfs_rq_runtime called on it because it is throttled before idle_balance, and the idle_balance calls update_rq_clock to add time that is accounted to the task. This commit prevents a cfs_rq from being assigned new runtime while it is throttled, until distribute_cfs_runtime is called. Signed-off-by: Liangyan <liangyan.peng@linux.alibaba.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <valentin.schneider@arm.com> Reviewed-by: Ben Segall <bsegall@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: shanpeic@linux.alibaba.com Cc: stable@vger.kernel.org Cc: xlpang@linux.alibaba.com Fixes: d3d9dc330236 ("sched: Throttle entities exceeding their allowed bandwidth") Link: https://lkml.kernel.org/r/20190826121633.6538-1-liangyan.peng@linux.alibaba.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
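The bad cast is easy to reproduce in isolation (a standalone userspace illustration, not kernel code):

	#include <stdio.h>
	#include <stdint.h>

	int main(void)
	{
		int64_t runtime_remaining = 5;	/* throttled, yet positive */
		uint64_t runtime = -runtime_remaining + 1;

		/* the s64 value -4 wraps to 18446744073709551612 */
		printf("%llu\n", (unsigned long long)runtime);
		return 0;
	}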
2019-09-03 dma-mapping: introduce dma_get_merge_boundary() (Yoshihiro Shimoda)
This patch adds a new DMA API "dma_get_merge_boundary". This function returns the DMA merge boundary if the DMA layer can merge the segments. This patch also adds the implementation for a new dma_map_ops pointer. Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Reviewed-by: Simon Horman <horms+renesas@verge.net.au> Signed-off-by: Christoph Hellwig <hch@lst.de>
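A hedged caller-side sketch (the block-layer consumer is illustrative; wiring it up is not part of this patch):

	unsigned long mask = dma_get_merge_boundary(dev);

	/* 0 means the DMA layer cannot merge segments at all */
	if (mask)
		blk_queue_virt_boundary(q, mask);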
2019-09-02 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (David S. Miller)
r8152 conflicts are the NAPI fixes in 'net' overlapping with some tasklet stuff in net-next Signed-off-by: David S. Miller <davem@davemloft.net>
2019-09-02 Merge branch 'linus' into perf/core, to pick up fixes (Ingo Molnar)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-09-01 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net (Linus Torvalds)
Pull networking fixes from David Miller:
1) Fix some length checks during OGM processing in batman-adv, from Sven Eckelmann.
2) Fix regression that caused netfilter conntrack sysctls to not be per-netns any more. From Florian Westphal.
3) Use after free in netpoll, from Feng Sun.
4) Guard destruction of pfifo_fast per-cpu qdisc stats with qdisc_is_percpu_stats(), from Davide Caratti. Similar bug is fixed in pfifo_fast_enqueue().
5) Fix memory leak in mld_del_delrec(), from Eric Dumazet.
6) Handle neigh events on internal ports correctly in nfp, from John Hurley.
7) Clear SKB timestamp in NF flow table code so that it does not confuse the fq scheduler. From Florian Westphal.
8) taprio destroy can crash if it is invoked in a failure path of taprio_init(), because the list head isn't set up properly yet and the list del is unconditional. Perform the list add earlier to address this. From Vladimir Oltean.
9) Make sure to reapply vlan filters on device up, in aquantia driver. From Dmitry Bogdanov.
10) sgiseeq driver releases DMA memory using free_page() instead of dma_free_attrs(). From Christophe JAILLET.
* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (58 commits)
  net: seeq: Fix the function used to release some memory in an error handling path
  enetc: Add missing call to 'pci_free_irq_vectors()' in probe and remove functions
  net: bcmgenet: use ethtool_op_get_ts_info()
  tc-testing: don't hardcode 'ip' in nsPlugin.py
  net: dsa: microchip: add KSZ8563 compatibility string
  dt-bindings: net: dsa: document additional Microchip KSZ8563 switch
  net: aquantia: fix out of memory condition on rx side
  net: aquantia: linkstate irq should be oneshot
  net: aquantia: reapply vlan filters on up
  net: aquantia: fix limit of vlan filters
  net: aquantia: fix removal of vlan 0
  net/sched: cbs: Set default link speed to 10 Mbps in cbs_set_port_rate
  taprio: Set default link speed to 10 Mbps in taprio_set_picos_per_byte
  taprio: Fix kernel panic in taprio_destroy
  net: dsa: microchip: fill regmap_config name
  rxrpc: Fix lack of conn cleanup when local endpoint is cleaned up [ver #2]
  net: stmmac: dwmac-rk: Don't fail if phy regulator is absent
  amd-xgbe: Fix error path in xgbe_mod_init()
  netfilter: nft_meta_bridge: Fix get NFT_META_BRI_IIFVPROTO in network byteorder
  mac80211: Correctly set noencrypt for PAE frames
  ...
2019-08-31 tracing: Rename tracing_reset() to tracing_reset_cpu() (Steven Rostedt (VMware))
The name tracing_reset() was a misnomer, as it really only reset a single CPU buffer. Rename it to tracing_reset_cpu() and also make it static and remove the prototype from trace.h, as it is only used in a single function. Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>