path: root/arch/sparc/include/asm/trap_block.h
Age  Commit message  Author
2017-09-09  sparc64: speed up etrap/rtrap on NG2 and later processors  Anthony Yznaga

For many sun4v processor types, reading or writing a privileged register has a latency of 40 to 70 cycles. Use a combination of the low-latency allclean, otherw, normalw, and nop instructions in etrap and rtrap to replace 2 rdpr and 5 wrpr instructions and improve etrap/rtrap performance. allclean, otherw, and normalw are available on NG2 and later processors.

The average ticks to execute the flush windows trap ("ta 0x3") with and without this patch on select platforms:

CPU          Not patched   Patched   % Latency Reduction
NG2          1762          1558      -11.58
NG4          3619          3204      -11.47
M7           3015          2624      -12.97
SPARC64-X     829           770       -7.12

Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
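The percentage figures above follow from the two tick counts in each row. A minimal sketch of that arithmetic, assuming the reduction is computed as (patched - not patched) / not patched; the function name `latency_change_pct` is hypothetical and does not appear in the patch:

```c
#include <assert.h>

/*
 * Percent change in average ticks between an unpatched and a patched run.
 * A negative result means the patched kernel took fewer ticks.
 * Hypothetical helper for illustration; not part of the kernel.
 */
static double latency_change_pct(unsigned long not_patched, unsigned long patched)
{
	assert(not_patched != 0);	/* the baseline must be non-zero */
	return ((double)patched - (double)not_patched) * 100.0 / (double)not_patched;
}
```

For the NG2 row, `latency_change_pct(1762, 1558)` evaluates to roughly -11.58, matching the reported figure.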
2017-07-14  sparc64: Measure receiver forward progress to avoid send mondo timeout  Jane Chu

A large sun4v SPARC system may have moments of intensive xcall activity, usually caused by unmapping many pages on many CPUs concurrently. This can flood receivers with CPU mondo interrupts for an extended period, causing some unlucky senders to hit send-mondo timeout. The problem gets worse as the CPU count increases, because sometimes mappings must be invalidated on all CPUs, and sometimes all CPUs may gang up on a single CPU.

But a busy system is not a broken system. In the above scenario, as long as the receiver is making forward progress processing mondo interrupts, the sender should continue to retry.

This patch implements the receiver's forward progress meter by introducing a per-cpu counter 'cpu_mondo_counter[cpu]' where 'cpu' is in the range of 0..NR_CPUS. The receiver increments its counter as soon as it receives a mondo, and the sender tracks the receiver's counter. If the receiver has stopped making forward progress when the retry limit is reached, the sender declares send-mondo-timeout and panics; otherwise, the receiver is allowed to keep making forward progress.

In addition, it has been observed that PCIe hotplug events generate Correctable Errors that are handled by the hypervisor and then the OS. The hypervisor 'borrows' a guest cpu strand briefly to provide the service. If that cpu strand is simultaneously the only cpu targeted by a mondo, it may not be available for the mondo within 20 msec, causing a SUN4V mondo timeout. It appears that 1 second is the agreed wait time between hypervisor and guest OS; this patch makes the adjustment.

Orabug: 25476541
Orabug: 26417466

Signed-off-by: Jane Chu <jane.chu@oracle.com>
Reviewed-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Reviewed-by: Rob Gardner <rob.gardner@oracle.com>
Reviewed-by: Thomas Tai <thomas.tai@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
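The forward progress meter described above can be sketched as a counter the receiver bumps and a check the sender runs whenever it reaches its retry limit. This is a simplified user-space model, not the kernel code: `NR_CPUS_SKETCH`, `receiver_got_mondo`, `sender_retry_limit_hit`, and the `mondo_verdict` enum are hypothetical names, and only `cpu_mondo_counter` is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 4	/* hypothetical, kept small for illustration */

/* Stand-in for the per-cpu cpu_mondo_counter[] named in the commit message. */
static unsigned long cpu_mondo_counter[NR_CPUS_SKETCH];

enum mondo_verdict { MONDO_KEEP_RETRYING, MONDO_TIMEOUT };

/* Receiver side: bump the counter as soon as a mondo arrives. */
static void receiver_got_mondo(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	cpu_mondo_counter[cpu]++;
}

/*
 * Sender side, called each time the retry limit is reached.
 * *last_seen holds the receiver's counter value from the previous check.
 * An unchanged counter means no forward progress, so the send times out;
 * otherwise the sender records the new value and keeps retrying.
 */
static enum mondo_verdict sender_retry_limit_hit(int cpu, unsigned long *last_seen)
{
	unsigned long now;

	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	assert(last_seen != NULL);

	now = cpu_mondo_counter[cpu];
	if (now == *last_seen)
		return MONDO_TIMEOUT;

	*last_seen = now;
	return MONDO_KEEP_RETRYING;
}
```

In this model a busy receiver that keeps incrementing its counter is never declared dead, while one whose counter stays flat across a full retry window is.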
2015-05-31  sparc: Resolve conflict between sparc v9 and M7 on usage of bit 9 of TTE  Khalid Aziz

Bit 9 of the TTE is CV (Cacheable in V-cache) on sparc v9 processors, while the same bit 9 is MCDE (Memory Corruption Detection Enable) on the M7 processor. This creates a conflicting usage of the same bit. The kernel sets the TTE.cv bit on all pages for the sun4v architecture, which works well for sparc v9 but enables memory corruption detection on the M7 processor, which is not the intent. This patch adds code to determine whether the kernel is running on an M7 processor and takes steps to not enable memory corruption detection in the TTE erroneously.

Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
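The bit 9 conflict can be illustrated with a small mask operation. This is a sketch under stated assumptions, not the patch itself: `TTE_BIT9` and `tte_cache_bits` are hypothetical names, and the `cpu_is_m7` flag stands in for whatever chip detection the kernel actually performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 9 of the TTE: CV on sparc v9, MCDE on M7 (hypothetical macro name). */
#define TTE_BIT9 (UINT64_C(1) << 9)

/*
 * Given the cacheability bits the kernel would set when bit 9 means CV,
 * return the bits that are safe to set on the running CPU. On M7 the bit
 * is dropped so that memory corruption detection is not enabled by accident.
 */
static uint64_t tte_cache_bits(uint64_t cache_bits, bool cpu_is_m7)
{
	if (cpu_is_m7) {
		cache_bits &= ~TTE_BIT9;
		assert((cache_bits & TTE_BIT9) == 0);	/* MCDE must stay clear */
	}
	return cache_bits;
}
```

Other bits pass through unchanged; only bit 9 is affected, and only when the CPU is an M7.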
2014-05-18  sparc: drop use of extern for prototypes in arch/sparc/include/asm  Sam Ravnborg

Drop extern for all prototypes and adjust alignment of parameters as required after the removal. In a few rare cases adjust line length to conform to the maximum of 80 chars, and likewise in a few rare cases adjust alignment of parameters to static functions.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2009-06-16  sparc64: Store per-cpu offset in trap_block[]  David S. Miller

Surprisingly this actually makes LOAD_PER_CPU_BASE() a little more efficient.

Signed-off-by: David S. Miller <davem@davemloft.net>
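As a rough illustration of keeping the per-cpu offset in the per-CPU trap block, the sketch below models a lookup that indexes the array by CPU and adds the stored offset to a variable's base address. The struct layout, the field name `per_cpu_base`, and the helper `per_cpu_addr` are assumptions made for this example; the real `trap_block[]` has many more fields, and LOAD_PER_CPU_BASE() is an assembly macro whose exact form is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS_SKETCH 2	/* hypothetical, kept small for illustration */

/* Hypothetical, heavily trimmed stand-in for the real per-CPU trap block. */
struct trap_block_sketch {
	void *thread;			/* placeholder for the other fields */
	unsigned long per_cpu_base;	/* this CPU's per-cpu offset */
};

static struct trap_block_sketch trap_block_sketch[NR_CPUS_SKETCH];

/* Resolve a per-cpu variable's address for `cpu` by adding that CPU's offset. */
static void *per_cpu_addr(void *var, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	return (char *)var + trap_block_sketch[cpu].per_cpu_base;
}
```

With the offset stored next to the other per-CPU trap state, code that already holds a pointer to its trap block entry can fetch the offset with a single load from that entry.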
2009-06-16  sparc64: Move trap_block[] definitions into a new header file.  David S. Miller

Later we're going to want to get at these definitions from asm/percpu.h and that's not possible via cpudata.h because of the set of dependencies the non-trap_block[] stuff has.

Signed-off-by: David S. Miller <davem@davemloft.net>