summaryrefslogtreecommitdiff
path: root/kernel
AgeCommit message (Collapse)Author
2013-11-07Merge branch 'pm-sleep'Rafael J. Wysocki
* pm-sleep: PM / hibernate: Avoid overflow in hibernate_preallocate_memory()
2013-11-07Merge tag 'tty-3.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver updates from Greg KH: "Here's the big tty/serial driver update for 3.13-rc1. There's some more minor n_tty work here, but nothing like previous kernel releases. Also some new driver ids, driver updates for new hardware, and other small things. All of this has been in linux-next for a while with no issues" * tag 'tty-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (84 commits) serial: omap: fix missing comma serial: sh-sci: Enable the driver on all ARM platforms serial: mfd: Staticize local symbols serial: omap: fix a few checkpatch warnings serial: omap: improve RS-485 performance mrst_max3110: fix unbalanced IRQ issue during resume serial: omap: Add support for optional wake-up serial: sirf: remove duplicate defines tty: xuartps: Fix build error when COMMON_CLK is not set tty: xuartps: Fix build error due to missing forward declaration tty: xuartps: Fix "may be used uninitialized" build warning serial: 8250_pci: add Pericom PCIe Serial board Support (12d8:7952/4/8) - Chip PI7C9X7952/4/8 tty: xuartps: Update copyright information tty: xuartps: Implement suspend/resume callbacks tty: xuartps: Dynamically adjust to input frequency changes tty: xuartps: Updating set_baud_rate() tty: xuartps: Force enable the UART in xuartps_console_write tty: xuartps: support 64 byte FIFO size tty: xuartps: Add polled mode support for xuartps tty: xuartps: Implement BREAK detection, add SYSRQ support ...
2013-11-07Merge tag 'driver-core-3.13-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core / sysfs patches from Greg KH: "Here's the big driver core / sysfs update for 3.13-rc1. There's lots of dev_groups updates for different subsystems, as they all get slowly migrated over to the safe versions of the attribute groups (removing userspace races with the creation of the sysfs files.) Also in here are some kobject updates, devres expansions, and the first round of Tejun's sysfs reworking to enable it to be used by other subsystems as a backend for an in-kernel filesystem. All of these have been in linux-next for a while with no reported issues" * tag 'driver-core-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (83 commits) sysfs: rename sysfs_assoc_lock and explain what it's about sysfs: use generic_file_llseek() for sysfs_file_operations sysfs: return correct error code on unimplemented mmap() mdio_bus: convert bus code to use dev_groups device: Make dev_WARN/dev_WARN_ONCE print device as well as driver name sysfs: separate out dup filename warning into a separate function sysfs: move sysfs_hash_and_remove() to fs/sysfs/dir.c sysfs: remove unused sysfs_get_dentry() prototype sysfs: honor bin_attr.attr.ignore_lockdep sysfs: merge sysfs_elem_bin_attr into sysfs_elem_attr devres: restore zeroing behavior of devres_alloc() sysfs: fix sysfs_write_file for bin file input: gameport: convert bus code to use dev_groups input: serio: remove bus usage of dev_attrs input: serio: use DEVICE_ATTR_RO() i2o: convert bus code to use dev_groups memstick: convert bus code to use dev_groups tifm: convert bus code to use dev_groups virtio: convert bus code to use dev_groups ipack: convert bus code to use dev_groups ...
2013-11-07PM / hibernate: Avoid overflow in hibernate_preallocate_memory()Aaron Lu
When system has a lot of highmem (e.g. 16GiB using a 32 bits kernel), the code to calculate how much memory we need to preallocate in normal zone may cause overflow. As Leon has analysed: It looks that during computing 'alloc' variable there is overflow: alloc = (3943404 - 1970542) - 1978280 = -5418 (signed) And this function goes to err_out. Fix this by avoiding that overflow. References: https://bugzilla.kernel.org/show_bug.cgi?id=60817 Reported-and-tested-by: Leon Drugi <eyak@wp.pl> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Aaron Lu <aaron.lu@intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2013-11-06tracing: Do not use signed enums with unsigned long long in fgragh outputSteven Rostedt (Red Hat)
The duration field of print_graph_duration() can also be used to do the space filling by passing an enum in it: DURATION_FILL_FULL DURATION_FILL_START DURATION_FILL_END The problem is that these are enums and defined as negative, but the duration field is unsigned long long. Most archs are fine with this but blackfin fails to compile because of it: kernel/built-in.o: In function `print_graph_duration': kernel/trace/trace_functions_graph.c:782: undefined reference to `__ucmpdi2' Overloading a unsigned long long with an signed enum is just bad in principle. We can accomplish the same thing by using part of the flags field instead. Cc: Mike Frysinger <vapier@gentoo.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-06tracing: Remove unused function ftrace_off_permanent()Steven Rostedt (Red Hat)
In the past, ftrace_off_permanent() was called if something strange was detected. But the ftrace_bug() now handles all the anomolies that can happen with ftrace (function tracing), and there are no uses of ftrace_off_permanent(). Get rid of it. Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-06tracing: Do not assign filp->private_data to freed memoryGeyslan G. Bem
In system_tr_open(), the filp->private_data can be assigned the 'dir' variable even if it was freed. This is on the error path, and is harmless because the error return code will prevent filp->private_data from being used. But for correctness, we should not assign it to a recently freed variable, as that can cause static tools to give false warnings. Also have both subsystem_open() and system_tr_open() return -ENODEV if tracing has been disabled. Link: http://lkml.kernel.org/r/1383764571-7318-1-git-send-email-geyslan@gmail.com Signed-off-by: Geyslan G. Bem <geyslan@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-06perf/ftrace: Fix paranoid level for enabling function tracerSteven Rostedt
The current default perf paranoid level is "1" which has "perf_paranoid_kernel()" return false, and giving any operations that use it, access to normal users. Unfortunately, this includes function tracing and normal users should not be allowed to enable function tracing by default. The proper level is defined at "-1" (full perf access), which "perf_paranoid_tracepoint_raw()" will only give access to. Use that check instead for enabling function tracing. Reported-by: Dave Jones <davej@redhat.com> Reported-by: Vince Weaver <vincent.weaver@maine.edu> Tested-by: Vince Weaver <vincent.weaver@maine.edu> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: stable@vger.kernel.org # 3.4+ CVE: CVE-2013-2930 Fixes: ced39002f5ea ("ftrace, perf: Add support to use function tracepoint in perf") Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-06uprobes: Export write_opcode() as uprobe_write_opcode()Oleg Nesterov
set_swbp() and set_orig_insn() are __weak, but this is pointless because write_opcode() is static. Export write_opcode() as uprobe_write_opcode() for the upcoming arm port, this way it can actually override set_swbp() and use __opcode_to_mem_arm(bpinsn) instead if UPROBE_SWBP_INSN. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06uprobes: Introduce arch_uprobe->ixolOleg Nesterov
Currently xol_get_insn_slot() assumes that we should simply copy arch_uprobe->insn[] which is (ignoring arch_uprobe_analyze_insn) just the copy of the original insn. This is not true for arm which needs to create another insn to execute it out-of-line. So this patch simply adds the new member, ->ixol into the union. This doesn't make any difference for x86 and powerpc, but arm can divorce insn/ixol and initialize the correct xol insn in arch_uprobe_analyze_insn(). Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06uprobes: Kill module_init() and module_exit()Oleg Nesterov
Turn module_init() into __initcall() and kill module_exit(). This code can't be compiled as a module so these module_*() calls only add the confusion, especially if arch-dependant code needs its own initialization hooks. Signed-off-by: Oleg Nesterov <oleg@redhat.com>
2013-11-06audit: fix type of sessionid in audit_set_loginuid()Eric Paris
sfr pointed out that with CONFIG_UIDGID_STRICT_TYPE_CHECKS set the audit tree would not build. This is because the oldsessionid in audit_set_loginuid() was accidentally being declared as a kuid_t. This patch fixes that declaration mistake. Example of problem: kernel/auditsc.c: In function 'audit_set_loginuid': kernel/auditsc.c:2003:15: error: incompatible types when assigning to type 'kuid_t' from type 'int' oldsessionid = audit_get_sessionid(current); Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-11-06tracing: Add helper function tracing_is_disabled()Geyslan G. Bem
This patch creates the function 'tracing_is_disabled', which can be used outside of trace.c. Link: http://lkml.kernel.org/r/1382141754-12155-1-git-send-email-geyslan@gmail.com Signed-off-by: Geyslan G. Bem <geyslan@gmail.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-06tracing: Open tracer when ftrace_dump_on_oops is usedCody P Schafer
With ftrace_dump_on_oops, we previously did not open the tracer in question, sometimes causing the trace output to be useless. For example, the function_graph tracer with tracing_thresh set dumped via ftrace_dump_on_oops would show a series of '}' indented at different levels, but no function names. call trace->open() (and do a few other fixups copied from the normal dump path) to make the output more intelligible. Link: http://lkml.kernel.org/r/1382554197-16961-1-git-send-email-cody@linux.vnet.ibm.com Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-06sched: Remove unnecessary iteration over sched domains to update nr_busy_cpusPreeti U Murthy
nr_busy_cpus parameter is used by nohz_kick_needed() to find out the number of busy cpus in a sched domain which has SD_SHARE_PKG_RESOURCES flag set. Therefore instead of updating nr_busy_cpus at every level of sched domain, since it is irrelevant, we can update this parameter only at the parent domain of the sd which has this flag set. Introduce a per-cpu parameter sd_busy which represents this parent domain. In nohz_kick_needed() we directly query the nr_busy_cpus parameter associated with the groups of sd_busy. By associating sd_busy with the highest domain which has SD_SHARE_PKG_RESOURCES flag set, we cover all lower level domains which could have this flag set and trigger nohz_idle_balancing if any of the levels have more than one busy cpu. sd_busy is irrelevant for asymmetric load balancing. However sd_asym has been introduced to represent the highest sched domain which has SD_ASYM_PACKING flag set so that it can be queried directly when required. While we are at it, we might as well change the nohz_idle parameter to be updated at the sd_busy domain level alone and not the base domain level of a CPU. This will unify the concept of busy cpus at just one level of sched domain where it is currently used. Signed-off-by: Preeti U Murthy<preeti@linux.vnet.ibm.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: svaidy@linux.vnet.ibm.com Cc: vincent.guittot@linaro.org Cc: bitbucket@online.de Cc: benh@kernel.crashing.org Cc: anton@samba.org Cc: Morten.Rasmussen@arm.com Cc: pjt@google.com Cc: peterz@infradead.org Cc: mikey@neuling.org Link: http://lkml.kernel.org/r/20131030031252.23426.4417.stgit@preeti.in.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06sched: Fix asymmetric scheduling for POWER7Vaidyanathan Srinivasan
Asymmetric scheduling within a core is a scheduler loadbalancing feature that is triggered when SD_ASYM_PACKING flag is set. The goal for the load balancer is to move tasks to lower order idle SMT threads within a core on a POWER7 system. In nohz_kick_needed(), we intend to check if our sched domain (core) is completely busy or we have idle cpu. The following check for SD_ASYM_PACKING: (cpumask_first_and(nohz.idle_cpus_mask, sched_domain_span(sd)) < cpu) already covers the case of checking if the domain has an idle cpu, because cpumask_first_and() will not yield any set bits if this domain has no idle cpu. Hence, nr_busy check against group weight can be removed. Reported-by: Michael Neuling <michael.neuling@au1.ibm.com> Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com> Signed-off-by: Preeti U Murthy <preeti@linux.vnet.ibm.com> Tested-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: vincent.guittot@linaro.org Cc: bitbucket@online.de Cc: benh@kernel.crashing.org Cc: anton@samba.org Cc: Morten.Rasmussen@arm.com Cc: pjt@google.com Link: http://lkml.kernel.org/r/20131030031242.23426.13019.stgit@preeti.in.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Factor out strncpy() in perf_event_mmap_event()Oleg Nesterov
While this is really minor, but strncpy() does the unnecessary zero-padding till the end of tmp[16] and it is called every time we are going to use the string literal. Turn these strncpy()'s into the single strlcpy() under the new label, saves 72 bytes. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131017182417.GA17753@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Fix arch_perf_out_copy_user defaultPeter Zijlstra
The arch_perf_output_copy_user() default of __copy_from_user_inatomic() returns bytes not copied, while all other argument functions given DEFINE_OUTPUT_COPY() return bytes copied. Since copy_from_user_nmi() is the odd duck out by returning bytes copied where all other *copy_{to,from}* functions return bytes not copied, change it over and ammend DEFINE_OUTPUT_COPY() to expect bytes not copied. Oddly enough DEFINE_OUTPUT_COPY() already returned bytes not copied while expecting its worker functions to return bytes copied. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: will.deacon@arm.com Cc: Frederic Weisbecker <fweisbec@gmail.com> Link: http://lkml.kernel.org/r/20131030201622.GR16117@laptop.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Update a stale commentPeter Zijlstra
Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-9s5mze78gmlz19agt39i8rii@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Optimize perf_output_begin() -- address calculationPeter Zijlstra
Rewrite the handle address calculation code to be clearer. Saves 8 bytes on x86_64-defconfig. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-3trb2n2henb9m27tncef3ag7@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Optimize perf_output_begin() -- lost_event casePeter Zijlstra
Avoid touching the lost_event and sample_data cachelines twince. Its not like we end up doing less work, but it might help to keep all accesses to these cachelines in one place. Due to code shuffle, this looses 4 bytes on x86_64-defconfig. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-zfxnc58qxj0eawdoj31hhupv@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Optimize perf_output_begin()Peter Zijlstra
There's no point in re-doing the memory-barrier when we fail the cmpxchg(). Also placing it after the space reservation loop makes it clearer it only separates the userpage->tail read from the data stores. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-c19u6egfldyx86tpyc3zgkw9@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Add unlikely() to the ring-buffer codePeter Zijlstra
Add unlikely() annotations to 'slow' paths: When having a sampling event but no output buffer; you have bigger issues -- also the bail is still faster than actually doing the work. When having a sampling event but a control page only buffer, you have bigger issues -- again the bail is still faster than actually doing work. Optimize for the case where you're not loosing events -- again, not doing the work is still faster but make sure that when you have to actually do work its as fast as possible. The typical watermark is 1/2 the buffer size, so most events will not take this path. Shrinks perf_output_begin() by 16 bytes on x86_64-defconfig. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-wlg3jew3qnutm8opd0hyeuwn@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06perf: Simplify the ring-buffer codePeter Zijlstra
By using CIRC_SPACE() we can obviate the need for perf_output_space(). Shrinks the size of perf_output_begin() by 17 bytes on x86_64-defconfig. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> Cc: Michael Ellerman <michael@ellerman.id.au> Cc: Michael Neuling <mikey@neuling.org> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: james.hogan@imgtec.com Cc: Vince Weaver <vince@deater.net> Cc: Victor Kaplansky <VICTORK@il.ibm.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Anton Blanchard <anton@samba.org> Link: http://lkml.kernel.org/n/tip-vtb0xb0llebmsdlfn1v5vtfj@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06locking: Move the percpu-rwsem code to kernel/locking/Peter Zijlstra
Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-52bjmtty46we26hbfd9sc9iy@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06locking: Move the lglocks code to kernel/locking/Peter Zijlstra
Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-amd6pg1mif6tikbyktfvby3y@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06locking: Move the rwsem code to kernel/locking/Peter Zijlstra
Notably: changed lib/rwsem* targets from lib- to obj-, no idea about the ramifications of that. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-g0kynfh5feriwc6p3h6kpbw6@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06locking: Move the rtmutex code to kernel/locking/Peter Zijlstra
Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-p9ijt8div0hwldexwfm4nlhj@git.kernel.org [ Fixed build failure in kernel/rcu/tree_plugin.h. ] Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06hung_task: add method to reset detectorMarcelo Tosatti
In certain occasions it is possible for a hung task detector positive to be false: continuation from a paused VM, for example. Add a method to reset detection, similar as is done with other kernel watchdogs. Acked-by: Don Zickus <dzickus@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-11-06locking: Move the semaphore core to kernel/locking/Peter Zijlstra
Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-vmw5sf6vzmua1z6nx1cg69h2@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06locking: Move the spinlock code to kernel/locking/Peter Zijlstra
Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-b81ol0z3mon45m51o131yc9j@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06locking: Move the lockdep code to kernel/locking/Peter Zijlstra
Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-wl7s3tta5isufzfguc23et06@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06locking: Move the mutex code to kernel/locking/Peter Zijlstra
Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-1ditvncg30dgbpvrz2bxfmke@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06Merge branch 'sched/core' into core/locking, to prepare the kernel/locking/ ↵Ingo Molnar
file move Conflicts: kernel/Makefile There are conflicts in kernel/Makefile due to file moving in the scheduler tree - resolve them. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06sched: Move completion code from core.c to completion.cPeter Zijlstra
Completions already have their own header file: linux/completion.h Move the implementation out of kernel/sched/core.c and into its own file: kernel/sched/completion.c. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-x2y49rmxu5dljt66ai2lcfuw@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06sched: Move wait code from core.c to wait.cPeter Zijlstra
For some reason only the wait part of the wait api lives in kernel/sched/wait.c and the wake part still lives in kernel/sched/core.c; ammend this. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-ftycee88naznulqk7ei5mbci@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06sched: Move wait.c into kernel/sched/Peter Zijlstra
Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-5q5yqvdaen0rmapwloeaotx3@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06Merge branch 'core/rcu' into core/locking, to prepare the kernel/locking/ ↵Ingo Molnar
file move There are conflicts in lockdep.c due to RCU changes, and also the RCU tree changes kernel/Makefile - so pre-merge it to ease the moving of locking related .c files to kernel/locking/. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-06Merge tag 'v3.12' into core/locking to pick up mutex upatesIngo Molnar
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-11-05tracing: Add support for SOFT_DISABLE to syscall eventsTom Zanussi
The original SOFT_DISABLE patches didn't add support for soft disable of syscall events; this adds it. Add an array of ftrace_event_file pointers indexed by syscall number to the trace array and remove the existing enabled bitmaps, which as a result are now redundant. The ftrace_event_file structs in turn contain the soft disable flags we need for per-syscall soft disable accounting. Adding ftrace_event_files also means we can remove the USE_CALL_FILTER bit, thus enabling multibuffer filter support for syscall events. Link: http://lkml.kernel.org/r/6e72b566e85d8df8042f133efbc6c30e21fb017e.1382620672.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05tracing: Make register/unregister_ftrace_command __initTom Zanussi
register/unregister_ftrace_command() are only ever called from __init functions, so can themselves be made __init. Also make register_snapshot_cmd() __init for the same reason. Link: http://lkml.kernel.org/r/d4042c8cadb7ae6f843ac9a89a24e1c6a3099727.1382620672.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05tracing: Update event filters for multibufferTom Zanussi
The trace event filters are still tied to event calls rather than event files, which means you don't get what you'd expect when using filters in the multibuffer case: Before: # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # mkdir /sys/kernel/debug/tracing/instances/test1 # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 2048 # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter bytes_alloc > 2048 Setting the filter in tracing/instances/test1/events shouldn't affect the same event in tracing/events as it does above. After: # echo 'bytes_alloc > 8192' > /sys/kernel/debug/tracing/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # mkdir /sys/kernel/debug/tracing/instances/test1 # echo 'bytes_alloc > 2048' > /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter # cat /sys/kernel/debug/tracing/events/kmem/kmalloc/filter bytes_alloc > 8192 # cat /sys/kernel/debug/tracing/instances/test1/events/kmem/kmalloc/filter bytes_alloc > 2048 We'd like to just move the filter directly from ftrace_event_call to ftrace_event_file, but there are a couple cases that don't yet have multibuffer support and therefore have to continue using the current event_call-based filters. For those cases, a new USE_CALL_FILTER bit is added to the event_call flags, whose main purpose is to keep the old behavior for those cases until they can be updated with multibuffer support; at that point, the USE_CALL_FILTER flag (and the new associated call_filter_check_discard() function) can go away. The multibuffer support also made filter_current_check_discard() redundant, so this change removes that function as well and replaces it with filter_check_discard() (or call_filter_check_discard() as appropriate). Link: http://lkml.kernel.org/r/f16e9ce4270c62f46b2e966119225e1c3cca7e60.1382620672.git.tom.zanussi@linux.intel.com Signed-off-by: Tom Zanussi <tom.zanussi@linux.intel.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05ftrace: Have control op function callback only trace when RCU is watchingSteven Rostedt (Red Hat)
Dave Jones reported that trinity would be able to trigger the following back trace: =============================== [ INFO: suspicious RCU usage. ] 3.10.0-rc2+ #38 Not tainted ------------------------------- include/linux/rcupdate.h:771 rcu_read_lock() used illegally while idle! other info that might help us debug this: RCU used illegally from idle CPU! rcu_scheduler_active = 1, debug_locks = 0 RCU used illegally from extended quiescent state! 1 lock held by trinity-child1/18786: #0: (rcu_read_lock){.+.+..}, at: [<ffffffff8113dd48>] __perf_event_overflow+0x108/0x310 stack backtrace: CPU: 3 PID: 18786 Comm: trinity-child1 Not tainted 3.10.0-rc2+ #38 0000000000000000 ffff88020767bac8 ffffffff816e2f6b ffff88020767baf8 ffffffff810b5897 ffff88021de92520 0000000000000000 ffff88020767bbf8 0000000000000000 ffff88020767bb78 ffffffff8113ded4 ffffffff8113dd48 Call Trace: [<ffffffff816e2f6b>] dump_stack+0x19/0x1b [<ffffffff810b5897>] lockdep_rcu_suspicious+0xe7/0x120 [<ffffffff8113ded4>] __perf_event_overflow+0x294/0x310 [<ffffffff8113dd48>] ? __perf_event_overflow+0x108/0x310 [<ffffffff81309289>] ? __const_udelay+0x29/0x30 [<ffffffff81076054>] ? __rcu_read_unlock+0x54/0xa0 [<ffffffff816f4000>] ? ftrace_call+0x5/0x2f [<ffffffff8113dfa1>] perf_swevent_overflow+0x51/0xe0 [<ffffffff8113e08f>] perf_swevent_event+0x5f/0x90 [<ffffffff8113e1c9>] perf_tp_event+0x109/0x4f0 [<ffffffff8113e36f>] ? perf_tp_event+0x2af/0x4f0 [<ffffffff81074630>] ? __rcu_read_lock+0x20/0x20 [<ffffffff8112d79f>] perf_ftrace_function_call+0xbf/0xd0 [<ffffffff8110e1e1>] ? ftrace_ops_control_func+0x181/0x210 [<ffffffff81074630>] ? __rcu_read_lock+0x20/0x20 [<ffffffff81100cae>] ? rcu_eqs_enter_common+0x5e/0x470 [<ffffffff8110e1e1>] ftrace_ops_control_func+0x181/0x210 [<ffffffff816f4000>] ftrace_call+0x5/0x2f [<ffffffff8110e229>] ? ftrace_ops_control_func+0x1c9/0x210 [<ffffffff816f4000>] ? ftrace_call+0x5/0x2f [<ffffffff81074635>] ? debug_lockdep_rcu_enabled+0x5/0x40 [<ffffffff81074635>] ? debug_lockdep_rcu_enabled+0x5/0x40 [<ffffffff81100cae>] ? rcu_eqs_enter_common+0x5e/0x470 [<ffffffff8110112a>] rcu_eqs_enter+0x6a/0xb0 [<ffffffff81103673>] rcu_user_enter+0x13/0x20 [<ffffffff8114541a>] user_enter+0x6a/0xd0 [<ffffffff8100f6d8>] syscall_trace_leave+0x78/0x140 [<ffffffff816f46af>] int_check_syscall_exit_work+0x34/0x3d ------------[ cut here ]------------ Perf uses rcu_read_lock() but as the function tracer can trace functions even when RCU is not currently active, this makes the rcu_read_lock() used by perf ineffective. As perf is currently the only user of the ftrace_ops_control_func() and perf is also the only function callback that actively uses rcu_read_lock(), the quick fix is to prevent the ftrace_ops_control_func() from calling its callbacks if RCU is not active. With Paul's new "rcu_is_watching()" we can tell if RCU is active or not. Reported-by: Dave Jones <davej@redhat.com> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05rcu: Do not trace rcu_is_watching() functionsSteven Rostedt
As perf uses the rcu_read_lock() primitives for recording into its ring buffer, perf tracing can not be called when RCU in inactive. With the perf function tracing, there are functions that can be traced when RCU is not active, and perf must not have its function callback called when this is the case. Luckily, Paul McKenney has created a way to detect when RCU is active or not with the rcu_is_watching() function. Unfortunately, this function can also be traced, and if that happens it can cause a bit of overhead for the perf function calls that do the check. Recursion protection prevents anything bad from happening, but there is a bit of added overhead for every function being traced that must detect that the rcu_is_watching() is also being traced. As rcu_is_watching() is a helper routine and not part of the critical logic in RCU, it does not need to be traced in order to debug RCU itself. Add the "notrace" annotation to all the rcu_is_watching() calls such that we never trace it. Link: http://lkml.kernel.org/r/20131104202736.72dd8e45@gandalf.local.home Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05Merge branch 'idle.2013.09.25a' of ↵Steven Rostedt (Red Hat)
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into HEAD Need to use Paul McKenney's "rcu_is_watching()" changes to fix a perf/ftrace bug.
2013-11-05trace/trace_stat: use rbtree postorder iteration helper instead of opencodingCody P Schafer
Use rbtree_postorder_for_each_entry_safe() to destroy the rbtree instead of opencoding an alternate postorder iteration that modifies the tree Link: http://lkml.kernel.org/r/1383345566-25087-2-git-send-email-cody@linux.vnet.ibm.com Signed-off-by: Cody P Schafer <cody@linux.vnet.ibm.com> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-11-05audit: call audit_bprm() only once to add AUDIT_EXECVE informationRichard Guy Briggs
Move the audit_bprm() call from search_binary_handler() to exec_binprm(). This allows us to get rid of the mm member of struct audit_aux_data_execve since bprm->mm will equal current->mm. This also mitigates the issue that ->argc could be modified by the load_binary() call in search_binary_handler(). audit_bprm() was being called to add an AUDIT_EXECVE record to the audit context every time search_binary_handler() was recursively called. Only one reference is necessary. Reported-by: Oleg Nesterov <onestero@redhat.com> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com> --- This patch is against 3.11, but was developed on Oleg's post-3.11 patches that introduce exec_binprm().
2013-11-05audit: move audit_aux_data_execve contents into audit_context unionRichard Guy Briggs
audit_bprm() was being called to add an AUDIT_EXECVE record to the audit context every time search_binary_handler() was recursively called. Only one reference is necessary, so just update it. Move the the contents of audit_aux_data_execve into the union in audit_context, removing dependence on a kmalloc along the way. Reported-by: Oleg Nesterov <onestero@redhat.com> Cc: Eric Paris <eparis@redhat.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-11-05audit: remove unused envc member of audit_aux_data_execveRichard Guy Briggs
Get rid of write-only audit_aux_data_exeve structure member envc. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2013-11-05audit: Kill the unused struct audit_aux_data_capsetEric W. Biederman
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> (cherry picked from ebiederman commit 6904431d6b41190e42d6b94430b67cb4e7e6a4b7) Signed-off-by: Eric Paris <eparis@redhat.com>