summaryrefslogtreecommitdiff
path: root/arch/powerpc/kernel
AgeCommit message (Collapse)Author
2010-10-13powerpc: Use static const char arraysJoe Perches
Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13powerpc/pseries: Export rtas_ibm_suspend_me()Nathan Fontenot
Export the rtas_ibm_suspend_me() routine. This is needed to perform partition migration in the kernel. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-10-13Merge remote branch 'kumar/merge' into nextBenjamin Herrenschmidt
2010-10-08Merge commit 'v2.6.36-rc7' into perf/coreIngo Molnar
Conflicts: arch/x86/kernel/module.c Merge reason: Resolve the conflict, pick up fixes. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-08Merge commit 'v2.6.36-rc7' into core/memblockIngo Molnar
Merge reason: Update from -rc3 to -rc7. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-07powerpc, of_serial: Endianness issues setting up the serial portsIan Munsie
The speed and clock of the serial ports is retrieved from the device tree in both the PowerPC legacy serial code and the Open Firmware serial driver, therefore they need to handle the fact that the device tree is always big endian, while the CPU may not be. Also fix other device tree references in the legacy serial code. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2010-10-07Fix IRQ flag handling namingDavid Howells
Fix the IRQ flag handling naming. In linux/irqflags.h under one configuration, it maps: local_irq_enable() -> raw_local_irq_enable() local_irq_disable() -> raw_local_irq_disable() local_irq_save() -> raw_local_irq_save() ... and under the other configuration, it maps: raw_local_irq_enable() -> local_irq_enable() raw_local_irq_disable() -> local_irq_disable() raw_local_irq_save() -> local_irq_save() ... This is quite confusing. There should be one set of names expected of the arch, and this should be wrapped to give another set of names that are expected by users of this facility. Change this to have the arch provide: flags = arch_local_save_flags() flags = arch_local_irq_save() arch_local_irq_restore(flags) arch_local_irq_disable() arch_local_irq_enable() arch_irqs_disabled_flags(flags) arch_irqs_disabled() arch_safe_halt() Then linux/irqflags.h wraps these to provide: raw_local_save_flags(flags) raw_local_irq_save(flags) raw_local_irq_restore(flags) raw_local_irq_disable() raw_local_irq_enable() raw_irqs_disabled_flags(flags) raw_irqs_disabled() raw_safe_halt() with type checking on the flags 'arguments', and then wraps those to provide: local_save_flags(flags) local_irq_save(flags) local_irq_restore(flags) local_irq_disable() local_irq_enable() irqs_disabled_flags(flags) irqs_disabled() safe_halt() with tracing included if enabled. The arch functions can now all be inline functions rather than some of them having to be macros. Signed-off-by: David Howells <dhowells@redhat.com> [X86, FRV, MN10300] Signed-off-by: Chris Metcalf <cmetcalf@tilera.com> [Tile] Signed-off-by: Michal Simek <monstr@monstr.eu> [Microblaze] Tested-by: Catalin Marinas <catalin.marinas@arm.com> [ARM] Acked-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Haavard Skinnemoen <haavard.skinnemoen@atmel.com> [AVR] Acked-by: Tony Luck <tony.luck@intel.com> [IA-64] Acked-by: Hirokazu Takata <takata@linux-m32r.org> [M32R] Acked-by: Greg Ungerer <gerg@uclinux.org> [M68K/M68KNOMMU] Acked-by: Ralf Baechle <ralf@linux-mips.org> [MIPS] Acked-by: Kyle McMartin <kyle@mcmartin.ca> [PA-RISC] Acked-by: Paul Mackerras <paulus@samba.org> [PowerPC] Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com> [S390] Acked-by: Chen Liqin <liqin.chen@sunplusct.com> [Score] Acked-by: Matt Fleming <matt@console-pimps.org> [SH] Acked-by: David S. Miller <davem@davemloft.net> [Sparc] Acked-by: Chris Zankel <chris@zankel.net> [Xtensa] Reviewed-by: Richard Henderson <rth@twiddle.net> [Alpha] Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp> [H8300] Cc: starvik@axis.com [CRIS] Cc: jesper.nilsson@axis.com [CRIS] Cc: linux-cris-kernel@axis.com
2010-10-05powerpc: remove unused variableStephen Rothwell
Since powerpc uses -Werror on arch powerpc, the build was broken like this: cc1: warnings being treated as errors arch/powerpc/kernel/module.c: In function 'module_finalize': arch/powerpc/kernel/module.c:66: error: unused variable 'err' Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-10-05modules: Fix module_bug_list list corruption raceLinus Torvalds
With all the recent module loading cleanups, we've minimized the code that sits under module_mutex, fixing various deadlocks and making it possible to do most of the module loading in parallel. However, that whole conversion totally missed the rather obscure code that adds a new module to the list for BUG() handling. That code was doubly obscure because (a) the code itself lives in lib/bugs.c (for dubious reasons) and (b) it gets called from the architecture-specific "module_finalize()" rather than from generic code. Calling it from arch-specific code makes no sense what-so-ever to begin with, and is now actively wrong since that code isn't protected by the module loading lock any more. So this commit moves the "module_bug_{finalize,cleanup}()" calls away from the arch-specific code, and into the generic code - and in the process protects it with the module_mutex so that the list operations are now safe. Future fixups: - move the module list handling code into kernel/module.c where it belongs. - get rid of 'module_bug_list' and just use the regular list of modules (called 'modules' - imagine that) that we already create and maintain for other reasons. Reported-and-tested-by: Thomas Gleixner <tglx@linutronix.de> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Adrian Bunk <bunk@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-23powerpc/perf: Fix sampling enable for PPC970Paul Mackerras
The logic to distinguish marked instruction events from ordinary events on PPC970 and derivatives was flawed. The result is that instruction sampling didn't get enabled in the PMU for some marked instruction events, so they would never trigger. This fixes it by adding the appropriate break statements in the switch statement. Reported-by: David Binderman <dcb314@hotmail.com> Cc: stable@kernel.org Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-23Merge branch 'linus' into perf/coreIngo Molnar
Conflicts: arch/sparc/kernel/perf_event.c Merge reason: Resolve the conflict. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-22powerpc: fix double syscall restartsAl Viro
Make sigreturn zero regs->trap, make do_signal() do the same on all paths. As it is, signal interrupting e.g. read() from fd 512 (== ERESTARTSYS) with another signal getting unblocked when the first handler finishes will lead to restart one insn earlier than it ought to. Same for multiple signals with in-kernel handlers interrupting that sucker at the same time. Same for multiple signals of any kind interrupting that sucker on 64bit... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-09-15Merge branch 'tip/perf/core' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
2010-09-09perf: Rework the PMU methodsPeter Zijlstra
Replace pmu::{enable,disable,start,stop,unthrottle} with pmu::{add,del,start,stop}, all of which take a flags argument. The new interface extends the capability to stop a counter while keeping it scheduled on the PMU. We replace the throttled state with the generic stopped state. This also allows us to efficiently stop/start counters over certain code paths (like IRQ handlers). It also allows scheduling a counter without it starting, allowing for a generic frozen state (useful for rotating stopped counters). The stopped state is implemented in two different ways, depending on how the architecture implemented the throttled state: 1) We disable the counter: a) the pmu has per-counter enable bits, we flip that b) we program a NOP event, preserving the counter state 2) We store the counter state and ignore all read/overflow events Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09perf: Per PMU disablePeter Zijlstra
Changes perf_disable() into perf_pmu_disable(). Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09perf: Reduce perf_disable() usagePeter Zijlstra
Since the current perf_disable() usage is only an optimization, remove it for now. This eases the removal of the __weak hw_perf_enable() interface. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09perf: Register PMU implementationsPeter Zijlstra
Simple registration interface for struct pmu, this provides the infrastructure for removing all the weak functions. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-09perf: Deconstify struct pmuPeter Zijlstra
sed -ie 's/const struct pmu\>/struct pmu/g' `git grep -l "const struct pmu\>"` Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: paulus <paulus@samba.org> Cc: stephane eranian <eranian@googlemail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Will Deacon <will.deacon@arm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Cc: Lin Ming <ming.m.lin@intel.com> Cc: Yanmin <yanmin_zhang@linux.intel.com> Cc: Deng-Cheng Zhu <dengcheng.zhu@gmail.com> Cc: David Miller <davem@davemloft.net> Cc: Michael Cree <mcree@orcon.net.nz> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-09-02powerpc/dma: Add optional platform override of dma_set_mask()Benjamin Herrenschmidt
Some platforms may want to override dma_set_mask() to take into account some specific "features" such as the availability of a direct-map window in addition to an iommu. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02powerpc: Use is_32bit_task() helper to test 32-bit binaryDenis Kirjanov
This patch removes all explicit tests for the TIF_32BIT flag Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02powerpc: Remove fpscr use from [kvm_]cvt_{fd,df}Andreas Schwab
Neither lfs nor stfs touch the fpscr, so remove the restore/save of it around them. Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02powerpc/pseries: Re-enable dispatch trace log userspace interfacePaul Mackerras
Since the cpu accounting code uses the hypervisor dispatch trace log now when CONFIG_VIRT_CPU_ACCOUNTING = y, the previous commit disabled access to it via files in the /sys/kernel/debug/powerpc/dtl/ directory in that case. This restores those files. To do this, we now have a hook that the cpu accounting code will call as it processes each entry from the hypervisor dispatch trace log. The code in dtl.c now uses that to fill up its ring buffer, rather than having the hypervisor fill the ring buffer directly. This also fixes dtl_file_read() to handle overflow conditions a bit better and adds a spinlock to ensure that race conditions (multiple processes opening or reading the file concurrently) are handled correctly. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02powerpc: Account time using timebase rather than PURRPaul Mackerras
Currently, when CONFIG_VIRT_CPU_ACCOUNTING is enabled, we use the PURR register for measuring the user and system time used by processes, as well as other related times such as hardirq and softirq times. This turns out to be quite confusing for users because it means that a program will often be measured as taking less time when run on a multi-threaded processor (SMT2 or SMT4 mode) than it does when run on a single-threaded processor (ST mode), even though the program takes longer to finish. The discrepancy is accounted for as stolen time, which is also confusing, particularly when there are no other partitions running. This changes the accounting to use the timebase instead, meaning that the reported user and system times are the actual number of real-time seconds that the program was executing on the processor thread, regardless of which SMT mode the processor is in. Thus a program will generally show greater user and system times when run on a multi-threaded processor than on a single-threaded processor. On pSeries systems on POWER5 or later processors, we measure the stolen time (time when this partition wasn't running) using the hypervisor dispatch trace log. We check for new entries in the log on every entry from user mode and on every transition from kernel process context to soft or hard IRQ context (i.e. when account_system_vtime() gets called). So that we can correctly distinguish time stolen from user time and time stolen from system time, without having to check the log on every exit to user mode, we store separate timestamps for exit to user mode and entry from user mode. On systems that have a SPURR (POWER6 and POWER7), we read the SPURR in account_system_vtime() (as before), and then apportion the SPURR ticks since the last time we read it between scaled user time and scaled system time according to the relative proportions of user time and system time over the same interval. This avoids having to read the SPURR on every kernel entry and exit. On systems that have PURR but not SPURR (i.e., POWER5), we do the same using the PURR rather than the SPURR. This disables the DTL user interface in /sys/debug/kernel/powerpc/dtl for now since it conflicts with the use of the dispatch trace log by the time accounting code. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02powerpc: Dynamically allocate most lppaca structsPaul Mackerras
This arranges for the lppaca structs for most cpus to be dynamically allocated in the same manner as the paca structs. If we don't include support for legacy iSeries, only the first lppaca is statically allocated; the rest are dynamically allocated. If we include legacy iSeries support, then we statically allocate the first 64 lppaca structs, since the iSeries hypervisor requires that the lppaca structs be present in the data section of the kernel image, but legacy iSeries supports at most 64 cpus. With CONFIG_NR_CPUS, the kernel image size for a typical pSeries config went from: text data bss dec hex filename 9524478 4734564 8469944 22728986 15ad11a ../test-1024/vmlinux to: text data bss dec hex filename 9524482 3751508 8469944 21745934 14bd10e ../test-1024/vmlinux a reduction of 983052 bytes overall. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02powerpc: Abstract indexing of lppaca structsPaul Mackerras
Currently we have the lppaca structs as a simple array of NR_CPUS entries, taking up space in the data section of the kernel image. In future we would like to allocate them dynamically, so this abstracts out the accesses to the array, making it easier to change how we locate the lppaca for a given cpu in future. Specifically, lppaca[cpu] changes to lppaca_of(cpu). Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02powerpc: Move arch_sd_sibling_asym_packing() to smp.cMichael Neuling
Simple cleanup by moving arch_sd_sibling_asym_packing from process.c to smp.c to save an #ifdef CONFIG_SMP No functionality change. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-09-02powerpc: Feature nop out reservation clear when stcx checks addressAnton Blanchard
The POWER architecture does not require stcx to check that it is operating on the same address as the larx. This means it is possible for an an exception handler to execute a larx, get a reservation, decide not to do the stcx and then return back with an active reservation. If the interrupted code was in the middle of a larx/stcx sequence the stcx could incorrectly succeed. All recent POWER CPUs check the address before letting the stcx succeed so we can create a CPU feature and nop it out. As Ben suggested, we can only do this in our syscall path because there is a remote possibility some kernel code gets interrupted by an exception that ends up operating on the same cacheline. Thanks to Paul Mackerras and Derek Williams for the idea. To test this I used a very simple null syscall (actually getppid) testcase at http://ozlabs.org/~anton/junkcode/null_syscall.c I tested against 2.6.35-git10 with the following changes against the pseries_defconfig: CONFIG_VIRT_CPU_ACCOUNTING=n CONFIG_AUDIT=n CONFIG_PPC_4K_PAGES=n CONFIG_PPC_64K_PAGES=y CONFIG_FORCE_MAX_ZONEORDER=9 CONFIG_PPC_SUBPAGE_PROT=n CONFIG_FUNCTION_TRACER=n CONFIG_FUNCTION_GRAPH_TRACER=n CONFIG_IRQSOFF_TRACER=n CONFIG_STACK_TRACER=n to remove the overhead of virtual CPU accounting, syscall auditing and the ftrace mcount tracers. 64kB pages were enabled to minimise TLB misses. POWER6: +8.2% POWER7: +7.0% Another suggestion was to use a larx to something in the L1 instead of a stcx. This was almost as fast as removing the larx on POWER6, but only 3.5% faster on POWER7. We can use this to speed up the reservation clear in our exception exit code. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-31Merge commit 'v2.6.36-rc3' into x86/memblockIngo Molnar
Conflicts: arch/x86/kernel/trampoline.c mm/memblock.c Merge reason: Resolve the conflicts, update to latest upstream. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-31powerpc: Don't use kernel stack with translation offMichael Neuling
In f761622e59433130bc33ad086ce219feee9eb961 we changed early_setup_secondary so it's called using the proper kernel stack rather than the emergency one. Unfortunately, this stack pointer can't be used when translation is off on PHYP as this stack pointer might be outside the RMO. This results in the following on all non zero cpus: cpu 0x1: Vector: 300 (Data Access) at [c00000001639fd10] pc: 000000000001c50c lr: 000000000000821c sp: c00000001639ff90 msr: 8000000000001000 dar: c00000001639ffa0 dsisr: 42000000 current = 0xc000000016393540 paca = 0xc000000006e00200 pid = 0, comm = swapper The original patch was only tested on bare metal system, so it never caught this problem. This changes __secondary_start so that we calculate the new stack pointer but only start using it after we've called early_setup_secondary. With this patch, the above problem goes away. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-31powerpc/perf_event: Reduce latency of calling perf_event_do_pendingPaul Mackerras
Commit 0fe1ac48 ("powerpc/perf_event: Fix oops due to perf_event_do_pending call") moved the call to perf_event_do_pending in timer_interrupt() down so that it was after the irq_enter() call. Unfortunately this moved it after the code that checks whether it is time for the next decrementer clock event. The result is that the call to perf_event_do_pending() won't happen until the next decrementer clock event is due. This was pointed out by Milton Miller. This fixes it by moving the check for whether it's time for the next decrementer clock event down to the point where we're about to call the event handler, after we've called perf_event_do_pending. This has the side effect that on old pre-Core99 Powermacs where we use the ppc_n_lost_interrupts mechanism to replay interrupts, a replayed interrupt will incur a little more latency since it will now do the code from the irq_enter down to the irq_exit, that it used to skip. However, these machines are now old and rare enough that this doesn't matter. To make it clear that ppc_n_lost_interrupts is only used on Powermacs, and to speed up the code slightly on non-Powermac ppc32 machines, the code that tests ppc_n_lost_interrupts is now conditional on CONFIG_PMAC as well as CONFIG_PPC32. Signed-off-by: Paul Mackerras <paulus@samba.org> Cc: stable@kernel.org Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-31powerpc/kexec: Adds correct calling convention for kexec purgatoryMatthew McClintock
Call kexec purgatory code correctly. We were getting lucky before. If you examine the powerpc 32bit kexec "purgatory" code you will see it expects the following: >From kexec-tools: purgatory/arch/ppc/v2wrap_32.S -> calling convention: -> r3 = physical number of this cpu (all cpus) -> r4 = address of this chunk (master only) As such, we need to set r3 to the current core, r4 happens to be unused by purgatory at the moment but we go ahead and set it here as well Signed-off-by: Matthew McClintock <msm@freescale.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-25Merge branch 'linus' into perf/coreIngo Molnar
Merge reason: pick up perf fixes Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-08-24powerpc: Wire up fanotify_init, fanotify_mark, prlimit64 syscallsAndreas Schwab
Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24powerpc/pci: Fix checking for child bridges in PCI code.Grant Likely
pci_device_to_OF_node() can return null, and list_for_each_entry will never enter the loop when dev is NULL, so it looks like this test is a typo. Reported-by: Julia Lawall <julia@diku.dk> Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24powerpc: Initialise paca->kstack before early_setup_secondaryMatt Evans
As early setup calls down to slb_initialize(), we must have kstack initialised before checking "should we add a bolted SLB entry for our kstack?" Failing to do so means stack access requires an SLB miss exception to refill an entry dynamically, if the stack isn't accessible via SLB(0) (kernel text & static data). It's not always allowable to take such a miss, and intermittent crashes will result. Primary CPUs don't have this issue; an SLB entry is not bolted for their stack anyway (as that lives within SLB(0)). This patch therefore only affects the init of secondaries. Signed-off-by: Matt Evans <matt@ozlabs.org> Cc: stable <stable@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24powerpc: Fix bogus it_blocksize in VIO iommu codeAnton Blanchard
When looking at some issues with the virtual ethernet driver I noticed that TCE allocation was following a very strange pattern: address 00e9000 length 2048 address 0409000 length 2048 <----- address 0429000 length 2048 address 0449000 length 2048 address 0469000 length 2048 address 0489000 length 2048 address 04a9000 length 2048 address 04c9000 length 2048 address 04e9000 length 2048 address 4009000 length 2048 <----- address 4029000 length 2048 Huge unexplained gaps in what should be an empty TCE table. It turns out it_blocksize, the amount we want to align the next allocation to, was c0000000fe903b20. Completely bogus. Initialise it to something reasonable in the VIO IOMMU code, and use kzalloc everywhere to protect against this when we next add a non compulsary field to iommu code and forget to initialise it. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24powerpc: Inline ppc64_runlatch_offAnton Blanchard
I'm sick of seeing ppc64_runlatch_off in our profiles, so inline it into the callers. To avoid a mess of circular includes I didn't add it as an inline function. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Olof Johansson <olof@lixom.net> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24powerpc: Correct smt_enabled=X boot option for > 2 threads per coreNathan Fontenot
The 'smt_enabled=X' boot option does not handle values of X > 2. For Power 7 processors with smt modes of 0,1,2,3, and 4 this does not work. This patch allows the smt_enabled option to be set to any value limited to a max equal to the number of threads per core. Signed-off-by: Nathan Fontenot <nfont@austin.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24powerpc: Silence __cpu_up() under normal operationSigned-off-by: Darren Hart
During CPU offline/online tests __cpu_up would flood the logs with the following message: Processor 0 found. This provides no useful information to the user as there is no context provided, and since the operation was a success (to this point) it is expected that the CPU will come back online, providing all the feedback necessary. Change the "Processor found" message to DBG() similar to other such messages in the same function. Also, add an appropriate log level for the "Processor is stuck" message. Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Nathan Fontenot <nfont@austin.ibm.com> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Cc: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24powerpc: Re-enable preemption before cpu_die()Signed-off-by: Darren Hart
start_secondary() is called shortly after _start and also via cpu_idle()->cpu_die()->pseries_mach_cpu_die() start_secondary() expects a preempt_count() of 0. pseries_mach_cpu_die() is called via the cpu_idle() routine with preemption disabled, resulting in the following repeating message during rapid cpu offline/online tests with CONFIG_PREEMPT=y: BUG: scheduling while atomic: swapper/0/0x00000002 Modules linked in: autofs4 binfmt_misc dm_mirror dm_region_hash dm_log [last unloaded: scsi_wait_scan] Call Trace: [c00000010e7079c0] [c0000000000133ec] .show_stack+0xd8/0x218 (unreliable) [c00000010e707aa0] [c0000000006a47f0] .dump_stack+0x28/0x3c [c00000010e707b20] [c00000000006e7a4] .__schedule_bug+0x7c/0x9c [c00000010e707bb0] [c000000000699d9c] .schedule+0x104/0x800 [c00000010e707cd0] [c000000000015b24] .cpu_idle+0x1c4/0x1d8 [c00000010e707d70] [c0000000006aa1b4] .start_secondary+0x398/0x3d4 [c00000010e707e30] [c000000000008278] .start_secondary_resume+0x10/0x14 Move the cpu_die() call inside the existing preemption enabled block of cpu_idle(). This is safe as the idle task is affined to a single CPU so the debug_smp_processor_id() tests (from cpu_should_die()) won't trigger as we are in a "migration disabled" region. Signed-off-by: Darren Hart <dvhltc@us.ibm.com> Acked-by: Will Schmidt <will_schmidt@vnet.ibm.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Nathan Fontenot <nfont@austin.ibm.com> Cc: Robert Jennings <rcj@linux.vnet.ibm.com> Cc: Brian King <brking@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24powerpc/pci: Drop unnecessary null testJulia Lawall
list_for_each_entry binds its first argument to a non-null value, and thus any null test on the value of that argument is superfluous. The semantic patch that makes this change is as follows: (http://coccinelle.lip6.fr/) // <smpl> @@ iterator I; expression x,E,E1,E2; statement S,S1,S2; @@ I(x,...) { <... - if (x != NULL || ...) S ...> } // </smpl> Signed-off-by: Julia Lawall <julia@diku.dk> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24powerpc/kdump: Stop all other CPUs before running crash handlersAnton Blanchard
During kdump we run the crash handlers first then stop all other CPUs. We really want to stop all CPUs as close to the fail as possible and also have a very controlled environment for running the crash handlers, so it makes sense to reverse the order. Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24powerpc: Use is_32bit_task() helper to test 32 bit binaryDenis Kirjanov
Use is_32bit_task() helper to test 32 bit binary. Signed-off-by: Denis Kirjanov <dkirjanov@kernel.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2010-08-24Merge remote branch 'jwb/merge' into mergeBenjamin Herrenschmidt
2010-08-23powerpc/4xx: Index interrupt stacks by physical cpuDave Kleikamp
The interrupt stacks need to be indexed by the physical cpu since the critical, debug and machine check handlers use the contents of SPRN_PIR to index the critirq_ctx, dbgirq_ctx, and mcheckirq_ctx arrays. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-08-23powerpc/47x: Remove redundant line from cputable.cDave Kleikamp
There are two entries for .cpu_user_features in arch/powerpc/kernel/cputable.c. Remove the one that doesn't belong Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-08-23powerpc/47x: Make sure mcsr is cleared before enabling machine check interruptsDave Kleikamp
Clear the machine check syndrom register before enabling machine check interrupts. The initial state of the tlb can lead to parity errors being flagged early after a cold boot. Signed-off-by: Dave Kleikamp <shaggy@linux.vnet.ibm.com> Signed-off-by: Josh Boyer <jwboyer@linux.vnet.ibm.com>
2010-08-19Merge branch 'tip/perf/urgent' of ↵Ingo Molnar
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core
2010-08-19perf: Factorize callchain context handlingFrederic Weisbecker
Store the kernel and user contexts from the generic layer instead of archs, this gathers some repetitive code. Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul Mackerras <paulus@samba.org> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: David Miller <davem@davemloft.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Borislav Petkov <bp@amd64.org>
2010-08-19perf: Generalize some arch callchain codeFrederic Weisbecker
- Most archs use one callchain buffer per cpu, except x86 that needs to deal with NMIs. Provide a default perf_callchain_buffer() implementation that x86 overrides. - Centralize all the kernel/user regs handling and invoke new arch handlers from there: perf_callchain_user() / perf_callchain_kernel() That avoid all the user_mode(), current->mm checks and so... - Invert some parameters in perf_callchain_*() helpers: entry to the left, regs to the right, following the traditional (dst, src). Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Acked-by: Paul Mackerras <paulus@samba.org> Tested-by: Will Deacon <will.deacon@arm.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Stephane Eranian <eranian@google.com> Cc: David Miller <davem@davemloft.net> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Borislav Petkov <bp@amd64.org>