summaryrefslogtreecommitdiff
path: root/arch/loongarch/kernel/traps.c
AgeCommit message (Collapse)Author
2023-06-29LoongArch: Add uprobes supportTiezhu Yang
Uprobes is the user-space counterpart to kprobes, this patch adds uprobes support for LoongArch. Here is a simple example with CONFIG_UPROBE_EVENTS=y: # cat test.c #include <stdio.h> int add(int a, int b) { return a + b; } int main() { return add(2, 7); } # gcc test.c -o /tmp/test # nm /tmp/test | grep add 0000000120004194 T add # cd /sys/kernel/debug/tracing # echo > uprobe_events # echo "p:myuprobe /tmp/test:0x4194 %r4 %r5" > uprobe_events # echo "r:myuretprobe /tmp/test:0x4194 %r4" >> uprobe_events # echo 1 > events/uprobes/enable # echo 1 > tracing_on # /tmp/test # cat trace ... # TASK-PID CPU# ||||| TIMESTAMP FUNCTION # | | | ||||| | | test-1060 [001] DNZff 1015.770620: myuprobe: (0x120004194) arg1=0x2 arg2=0x7 test-1060 [001] DNZff 1015.770930: myuretprobe: (0x1200041f0 <- 0x120004194) arg1=0x9 Tested-by: Jeff Xie <xiehuan09@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29LoongArch: Add vector extensions supportHuacai Chen
Add LoongArch's vector extensions support, which including 128bit LSX (i.e., Loongson SIMD eXtension) and 256bit LASX (i.e., Loongson Advanced SIMD eXtension). Linux kernel doesn't use vector itself, it only handle exceptions and context save/restore. So it only needs a subset of these instructions: * Vector load/store: vld vst vldx vstx xvld xvst xvldx xvstx * 8bit-elements move: vpickve2gr.b xvpickve2gr.b vinsgr2vr.b xvinsgr2vr.b * 16bit-elements move: vpickve2gr.h xvpickve2gr.h vinsgr2vr.h xvinsgr2vr.h * 32bit-elements move: vpickve2gr.w xvpickve2gr.w vinsgr2vr.w xvinsgr2vr.w * 64bit-elements move: vpickve2gr.d xvpickve2gr.d vinsgr2vr.d xvinsgr2vr.d * Elements permute: vpermi.w vpermi.d xvpermi.w xvpermi.d xvpermi.q Introduce AS_HAS_LSX_EXTENSION and AS_HAS_LASX_EXTENSION to avoid non- vector toolchains complains unsupported instructions. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-06-29LoongArch: Make the CPUCFG&CSR ops simple aliases of compiler built-insWANG Xuerui
In addition to less visual clutter, this also makes Clang happy regarding the const-ness of arguments. In the original approach, all Clang gets to see is the incoming arguments whose const-ness cannot be proven without first being inlined; so Clang errors out here while GCC is fine. While at it, tweak several printk format strings because the return type of csr_read64 becomes effectively unsigned long, instead of unsigned long long. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Relay BCE exceptions to userland as SIGSEGV with si_code=SEGV_BNDERRWANG Xuerui
SEGV_BNDERR was introduced initially for supporting the Intel MPX, but fell into disuse after the MPX support was removed. The LoongArch bounds-checking instructions behave very differently than MPX, but overall the interface is still kind of suitable for conveying the information to userland when bounds-checking assertions trigger, so we wouldn't have to invent more UAPI. Specifically, when the BCE triggers, a SEGV_BNDERR is sent to userland, with si_addr set to the out-of-bounds address or value (in asrt{gt,le}'s case), and one of si_lower or si_upper set to the configured bound depending on the faulting instruction. The other bound is set to either 0 or ULONG_MAX to resemble a range with both lower and upper bounds. Note that it is possible to have si_addr == si_lower in case of a failing asrtgt or {ld,st}gt, because those instructions test for strict greater-than relationship. This should not pose a problem for userland, though, because the faulting PC is available for the application to associate back to the exact instruction for figuring out the expectation. Example exception context generated by a faulting `asrtgt.d t0, t1` (assert t0 > t1 or BCE) with t0=100 and t1=200: > pc 00005555558206a4 ra 00007ffff2d854fc tp 00007ffff2f2f180 sp 00007ffffbf9fb80 > a0 0000000000000002 a1 00007ffffbf9fce8 a2 00007ffffbf9fd00 a3 00007ffff2ed4558 > a4 0000000000000000 a5 00007ffff2f044c8 a6 00007ffffbf9fce0 a7 fffffffffffff000 > t0 0000000000000064 t1 00000000000000c8 t2 00007ffffbfa2d5e t3 00007ffff2f12aa0 > t4 00007ffff2ed6158 t5 00007ffff2ed6158 t6 000000000000002e t7 0000000003d8f538 > t8 0000000000000005 u0 0000000000000000 s9 0000000000000000 s0 00007ffffbf9fce8 > s1 0000000000000002 s2 0000000000000000 s3 00007ffff2f2c038 s4 0000555555820610 > s5 00007ffff2ed5000 s6 0000555555827e38 s7 00007ffffbf9fd00 s8 0000555555827e38 > ra: 00007ffff2d854fc > ERA: 00005555558206a4 > CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) > PRMD: 00000007 (PPLV3 +PIE -PWE) > EUEN: 00000000 (-FPE -SXE -ASXE -BTE) > ECFG: 0007181c (LIE=2-4,11-12 VS=7) > ESTAT: 000a0000 [BCE] (IS= ECode=10 EsubCode=0) > PRID: 0014c010 (Loongson-64bit, Loongson-3A5000) Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Tweak the BADV and CPUCFG.PRID lines in show_regs()WANG Xuerui
Use ISA manual names for BADV and CPUCFG.PRID lines in show_regs(), for stylistic consistency with the other lines already touched. While at it, also include current CPU's full name in show_regs() output. It may be more helpful for developers looking at the resulting dumps, because multiple distinct CPU models may share the same PRID. Not having this info available may hide problems only found on some but not all of the models sharing one specific PRID. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Humanize the ESTAT line when showing registersWANG Xuerui
Example output looks like: [ xx.xxxxxx] ESTAT: 00001000 [INT] (IS=12 ECode=0 EsubCode=0) Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Humanize the ECFG line when showing registersWANG Xuerui
Example output looks like: [ xx.xxxxxx] ECFG: 00071c1c (LIE=2-4,10-12 VS=7) Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Humanize the EUEN line when showing registersWANG Xuerui
Example output looks like: [ xx.xxxxxx] EUEN: 00000000 (-FPE -SXE -ASXE -BTE) Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Humanize the PRMD line when showing registersWANG Xuerui
Example output looks like: [ xx.xxxxxx] PRMD: 00000004 (PPLV0 +PIE -PWE) Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Humanize the CRMD line when showing registersWANG Xuerui
Example output looks like: [ xx.xxxxxx] CRMD: 000000b0 (PLV0 -IE -DA +PG DACF=CC DACM=CC -WE) Some initial machinery for this pretty-printing format has been included in this patch as well. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Fix format of CSR lines during show_regs()WANG Xuerui
Use uppercase CSR names throughout for consistency with the manual wording, and right-align the keys. The "CSR" part is inferrable from context, hence dropped for more horizontal space. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Print symbol info for $ra and CSR.ERA only for kernel-mode contextsWANG Xuerui
Otherwise the addresses wouldn't make sense at all. While at it, align the "map keys" to maintain right-alignment with the "estat:" line too; also swap the ERA and ra lines so all CSRs are shown together. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Print GPRs with ABI names when showing registersWANG Xuerui
Show PC (CSR.ERA) in place of $zero, and also show the syscall restart flag (conveniently stuffed in regs[0]) if non-zero. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-05-01LoongArch: Clean up the architectural interrupt definitionsWANG Xuerui
While interrupts are assigned ECodes `64 + interrupt number`, all existing use sites of interrupt numbers want the 64 subtracted. Re-arrange the definitions so that the actual interrupt number is used everywhere, and make EXCCODE_INT_END inclusive as it is more intuitive that way. While at it, according to the asm/loongarch.h definitions, the total number of architectural interrupts should be 14, but various other places indicate otherwise (13 or 15). Those places have been adjusted to 14 as well for consistency. Signed-off-by: WANG Xuerui <git@xen0n.name> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Add kprobes supportTiezhu Yang
Kprobes allows you to trap at almost any kernel address and execute a callback function, this commit adds kprobes support for LoongArch. Tested-by: Jeff Xie <xiehuan09@gmail.com> Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: ptrace: Add hardware single step supportQing Zhang
Use the generic ptrace_resume code for PTRACE_SYSCALL, PTRACE_CONT, PTRACE_KILL and PTRACE_SINGLESTEP handling. This implies defining arch_has_single_step() and implementing the user_enable_single_step() and user_disable_single_step() functions. LoongArch cannot do hardware single-stepping per se, the hardware single-stepping it is achieved by configuring the instruction fetch watchpoints (FWPS) and specifies that the next instruction must trigger the watch exception by setting the mask bit. In some scenarios CSR.FWPS.Skip is used to ignore the next hit result, avoid endless repeated triggering of the same watchpoint without canceling it. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Add hardware breakpoints/watchpoints supportQing Zhang
Use perf framework to manage hardware instruction and data breakpoints. LoongArch defines hardware watchpoint functions for instruction fetch and memory load/store operations. After the software configures hardware watchpoints, the processor hardware will monitor the access address of the instruction fetch and load/store operation, and trigger an exception of the watchpoint when it meets the conditions set by the watchpoint. The hardware monitoring points for instruction fetching and load/store operations each have a register for the overall configuration of all monitoring points, a register for recording the status of all monitoring points, and four registers required for configuration of each watchpoint individually. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-02-25LoongArch: Make -mstrict-align configurableHuacai Chen
Introduce Kconfig option ARCH_STRICT_ALIGN to make -mstrict-align be configurable. Not all LoongArch cores support h/w unaligned access, we can use the -mstrict-align build parameter to prevent unaligned accesses. CPUs with h/w unaligned access support: Loongson-2K2000/2K3000/3A5000/3C5000/3D5000. CPUs without h/w unaligned access support: Loongson-2K500/2K1000. This option is enabled by default to make the kernel be able to run on all LoongArch systems. But you can disable it manually if you want to run kernel only on systems with h/w unaligned access support in order to optimise for performance. Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2023-01-17LoongArch: Strip guess unwinder out from prologue unwinderJinyang He
The prolugue unwinder rely on symbol info. When PC is not in kernel text address, it cannot find relative symbol info and it will be broken. The guess unwinder will be used in this case. And the guess unwinder code in prolugue unwinder is redundant. Strip it out and set the unwinder type in unwind_state. Make guess_unwinder::unwind_next_frame() as default way when other unwinders cannot unwind in some extreme case. Signed-off-by: Jinyang He <hejinyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-12-14LoongArch: Add unaligned access supportHuacai Chen
Loongson-2 series (Loongson-2K500, Loongson-2K1000) don't support unaligned access in hardware, while Loongson-3 series (Loongson-3A5000, Loongson-3C5000) are configurable whether support unaligned access in hardware. This patch add unaligned access emulation for those LoongArch processors without hardware support. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Add kdump supportYouling Tang
This patch adds support for kdump. In kdump case the normal kernel will reserve a region for the crash kernel and jump there on panic. Arch-specific functions are added to allow for implementing a crash dump file interface, /proc/vmcore, which can be viewed as a ELF file. A user-space tool, such as kexec-tools, is responsible for allocating a separate region for the core's ELF header within the crash kdump kernel memory and filling it in when executing kexec_load(). Then, its location will be advertised to the crash dump kernel via a command line argument "elfcorehdr=", and the crash dump kernel will preserve this region for later use with arch_reserve_vmcore() at boot time. At the same time, the crash kdump kernel is also limited within the "crashkernel" area via a command line argument "mem=", so as not to destroy the original kernel dump data. In the crash dump kernel environment, /proc/vmcore is used to access the primary kernel's memory with copy_oldmem_page(). I tested kdump on LoongArch machines (Loongson-3A5000) and it works as expected (suggested crashkernel parameter is "crashkernel=512M@2560M"), you may test it by triggering a crash through /proc/sysrq-trigger: $ sudo kexec -p /boot/vmlinux-kdump --reuse-cmdline --append="nr_cpus=1" # echo c > /proc/sysrq-trigger Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Use generic BUG() handlerYouling Tang
Inspired by commit 9fb7410f955("arm64/BUG: Use BRK instruction for generic BUG traps"), do similar for LoongArch to use generic BUG() handler. This patch uses the BREAK software breakpoint instruction to generate a trap instead, similarly to most other arches, with the generic BUG code generating the dmesg boilerplate. This allows bug metadata to be moved to a separate table and reduces the amount of inline code at BUG() and WARN() sites. This also avoids clobbering any registers before they can be dumped. To mitigate the size of the bug table further, this patch makes use of the existing infrastructure for encoding addresses within the bug table as 32-bit relative pointers instead of absolute pointers. (Note: this limits the max kernel size to 2GB.) Before patch: [ 3018.338013] lkdtm: Performing direct entry BUG [ 3018.342445] Kernel bug detected[#5]: [ 3018.345992] CPU: 2 PID: 865 Comm: cat Tainted: G D 6.0.0-rc6+ #35 After patch: [ 125.585985] lkdtm: Performing direct entry BUG [ 125.590433] ------------[ cut here ]------------ [ 125.595020] kernel BUG at drivers/misc/lkdtm/bugs.c:78! [ 125.600211] Oops - BUG[#1]: [ 125.602980] CPU: 3 PID: 410 Comm: cat Not tainted 6.0.0-rc6+ #36 Out-of-line file/line data information obtained compared to before. Signed-off-by: Youling Tang <tangyouling@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-10-12LoongArch: Refactor cache probe and flush methodsHuacai Chen
Current cache probe and flush methods have some drawbacks: 1, Assume there are 3 cache levels and only 3 levels; 2, Assume L1 = I + D, L2 = V, L3 = S, V is exclusive, S is inclusive. However, the fact is I + D, I + D + V, I + D + S and I + D + V + S are all valid. So, refactor the cache probe and flush methods to adapt more types of cache hierarchy. Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-09-29LoongArch: Fix and cleanup csr_era handling in do_ri()Huacai Chen
We don't emulate reserved instructions and just send a signal to the current process now. So we don't need to call compute_return_era() to add 4 (point to the next instruction) to csr_era in pt_regs. RA/ERA's backup/restore is cleaned up as well. Signed-off-by: Jun Yi <yijun@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-08-12LoongArch: Add prologue unwinder supportQing Zhang
It unwind the stack frame based on prologue code analyze. CONFIG_KALLSYMS is needed, at least the address and length of each function. Three stages when we do unwind, 1) unwind_start(), the prapare of unwinding, fill unwind_state. 2) unwind_done(), judge whether the unwind process is finished or not. 3) unwind_next_frame(), unwind the next frame. Dividing unwinder helps to add new unwinders in the future, e.g.: unwinder_frame, unwinder_orc, .etc. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-08-12LoongArch: Add guess unwinder supportQing Zhang
Name "guess unwinder" comes from x86, it scans the stack and reports every kernel text address it finds. Unwinders can be used by dump_stack() and other stacktrace functions. Three stages when we do unwind, 1) unwind_start(), the prapare of unwinding, fill unwind_state. 2) unwind_done(), judge whether the unwind process is finished or not. 3) unwind_next_frame(), unwind the next frame. Add get_stack_info() to get stack info. At present we have irq stack and task stack. The next_sp is the key info between two types of stacks. Dividing unwinder helps to add new unwinders in the future. Signed-off-by: Qing Zhang <zhangqing@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-25LoongArch: Make compute_return_era() return voidTiezhu Yang
compute_return_era() always returns 0, make it return void, and then no need to check its return value for its callers. Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-03LoongArch: Add Non-Uniform Memory Access (NUMA) supportHuacai Chen
Add Non-Uniform Memory Access (NUMA) support for LoongArch. LoongArch has 48-bit physical address, but the HyperTransport I/O bus only support 40-bit address, so we need a custom phys_to_dma() and dma_to_phys() to extract the 4-bit node id (bit 44~47) from Loongson-3's 48-bit physical address space and embed it into 40-bit. In the 40-bit dma address, node id offset can be read from the LS7A_DMA_CFG register. Reviewed-by: WANG Xuerui <git@xen0n.name> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2022-06-03LoongArch: Add exception/interrupt handlingHuacai Chen
Add the exception and interrupt handling machanism for basic LoongArch support. Reviewed-by: WANG Xuerui <git@xen0n.name> Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>