summaryrefslogtreecommitdiff
path: root/arch/arm64/kernel/entry-common.c
AgeCommit message (Collapse)Author
2021-07-15arm64: entry: add missing noinstrMark Rutland
We intend that all the early exception handling code is marked as `noinstr`, but we forgot this for __el0_error_handler_common(), which is called before we have completed entry from user mode. If it were instrumented, we could run into problems with RCU, lockdep, etc. Mark it as `noinstr` to prevent this. The few other functions in entry-common.c which do not have `noinstr` are called once we've completed entry, and are safe to instrument. Fixes: bb8e93a287a5 ("arm64: entry: convert SError handlers to C") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Joey Gouly <joey.gouly@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210714172801.16475-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: make NMI entry/exit functions staticMark Rutland
Now that we only call arm64_enter_nmi() and arm64_exit_nmi() from within entry-common.c, let's make these static to ensure this remains the case. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-19-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: split SDEI entryMark Rutland
We'd like to keep all the entry sequencing in entry-common.c, as this will allow us to ensure this is consistent, and free from any unsound instrumentation. Currently __sdei_handler() performs the NMI entry/exit sequences in sdei.c. Let's split the low-level entry sequence from the event handling, moving the former to entry-common.c and keeping the latter in sdei.c. The event handling function is renamed to do_sdei_event(), matching the do_${FOO}() pattern used for other exception handlers. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-18-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: split bad stack entryMark Rutland
We'd like to keep all the entry sequencing in entry-common.c, as this will allow us to ensure this is consistent, and free from any unsound instrumentation. Currently handle_bad_stack() performs the NMI entry sequence in traps.c. Let's split the low-level entry sequence from the reporting, moving the former to entry-common.c and keeping the latter in traps.c. To make it clear that reporting function never returns, it is renamed to panic_bad_stack(). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-17-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: fold el1_inv() into el1h_64_sync_handler()Mark Rutland
An unexpected synchronous exception from EL1h could happen at any time, and for robustness we should treat this as an NMI, making minimal assumptions about the context the exception was taken from. Currently el1_inv() assumes we can use enter_from_kernel_mode(), and also assumes that we should inherit the original DAIF value. Neither of these are desireable when we take an unexpected exception. Further, after el1_inv() calls __panic_unhandled(), the remainder of the function is unreachable, and therefore superfluous. Let's address this and simplify things by having el1h_64_sync_handler() call __panic_unhandled() directly, without any of the redundant logic. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Reported-by: Joey Gouly <joey.gouly@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-16-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: handle all vectors with CMark Rutland
We have 16 architectural exception vectors, and depending on kernel configuration we handle 8 or 12 of these with C code, with the remaining 8 or 4 of these handled as special cases in the entry assembly. It would be nicer if the entry assembly were uniform for all exceptions, and we deferred any specific handling of the exceptions to C code. This way the entry assembly can be more easily templated without ifdeffery or special cases, and it's easier to modify the handling of these cases in future (e.g. to dump additional registers other context). This patch reworks the entry code so that we always have a C handler for every architectural exception vector, with the entry assembly being completely uniform. We now have to handle exceptions from EL1t and EL1h, and also have to handle exceptions from AArch32 even when the kernel is built without CONFIG_COMPAT. To make this clear and to simplify templating, we rename the top-level exception handlers with a consistent naming scheme: asm: <el+sp>_<regsize>_<type> c: <el+sp>_<regsize>_<type>_handler .. where: <el+sp> is `el1t`, `el1h`, or `el0t` <regsize> is `64` or `32` <type> is `sync`, `irq`, `fiq`, or `error` ... e.g. asm: el1h_64_sync c: el1h_64_sync_handler ... with lower-level handlers simply using "el1" and "compat" as today. For unexpected exceptions, this information is passed to __panic_unhandled(), so it can report the specific vector an unexpected exception was taken from, e.g. | Unhandled 64-bit el1t sync exception For vectors we never expect to enter legitimately, the C code is generated using a macro to avoid code duplication. The exceptions are handled via __panic_unhandled(), replacing bad_mode() (which is removed). The `kernel_ventry` and `entry_handler` assembly macros are updated to handle the new naming scheme. In theory it should be possible to generate the entry functions at the same time as the vectors using a single table, but this will require reworking the linker script to split the two into separate sections, so for now we have separate tables. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-15-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: improve bad_mode()Mark Rutland
Our use of bad_mode() has a few rough edges: * AArch64 doesn't use the term "mode", and refers to "Execution states", "Exception levels", and "Selected stack pointer". * We log the exception type (SYNC/IRQ/FIQ/SError), but not the actual "mode" (though this can be decoded from the SPSR value). * We use bad_mode() as a second-level handler for unexpected synchronous exceptions, where the "mode" is legitimate, but the specific exception is not. * We dump the ESR value, but call this "code", and so it's not clear to all readers that this is the ESR. ... and all of this can be somewhat opaque to those who aren't extremely familiar with the code. Let's make this a bit clearer by having bad_mode() log "Unhandled ${TYPE} exception" rather than "Bad mode in ${TYPE} handler", using "ESR" rather than "code", and having the final panic() log "Unhandled exception" rather than "Bad mode". In future we'd like to log the specific architectural vector rather than just the type of exception, so we also split the core of bad_mode() out into a helper called __panic_unhandled(), which takes the vector as a string argument. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-13-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: move bad_mode() to entry-common.cMark Rutland
In subsequent patches we'll rework the way bad_mode() is called by exception entry code. In preparation for this, let's move bad_mode() itself into entry-common.c. Let's also mark it as noinstr (e.g. to prevent it being kprobed), and let's also make the `handler` array a local variable, as this is only use by bad_mode(), and will be removed entirely in a subsequent patch. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-12-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: convert IRQ+FIQ handlers to CMark Rutland
For various reasons we'd like to convert the bulk of arm64's exception triage logic to C. As a step towards that, this patch converts the EL1 and EL0 IRQ+FIQ triage logic to C. Separate C functions are added for the native and compat cases so that in subsequent patches we can handle native/compat differences in C. Since the triage functions can now call arm64_apply_bp_hardening() directly, the do_el0_irq_bp_hardening() wrapper function is removed. Since the user_exit_irqoff macro is now unused, it is removed. The user_enter_irqoff macro is still used by the ret_to_user code, and cannot be removed at this time. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-8-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: move NMI preempt logic to CMark Rutland
Currently portions of our preempt logic are written in C while other parts are written in assembly. Let's clean this up a little bit by moving the NMI preempt checks to C. For now, the preempt count (and need_resched) checking is left in assembly, and will be converted with the body of the IRQ handler in subsequent patches. Other than the increased lockdep coverage there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-6-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: move arm64_preempt_schedule_irq to entry-common.cMark Rutland
Subsequent patches will pull more of the IRQ entry handling into C. To keep this in one place, let's move arm64_preempt_schedule_irq() into entry-common.c along with the other entry management functions. We no longer need to include <linux/lockdep.h> in process.c, so the include directive is removed. There should be no functional change as a result of this patch. Reviewed-by Joey Gouly <joey.gouly@arm.com> Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-5-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: convert SError handlers to CMark Rutland
For various reasons we'd like to convert the bulk of arm64's exception triage logic to C. As a step towards that, this patch converts the EL1 and EL0 SError triage logic to C. Separate C functions are added for the native and compat cases so that in subsequent patches we can handle native/compat differences in C. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-06-07arm64: entry: unmask IRQ+FIQ after EL0 handlingMark Rutland
For non-fatal exceptions taken from EL0, we expect that at some point during exception handling it is possible to return to a regular process context with all exceptions unmasked (e.g. as we do in do_notify_resume()), and we generally aim to unmask exceptions wherever possible. While handling SError and debug exceptions from EL0, we need to leave some exceptions masked during handling. Handling SError requires us to mask SError (which also requires masking IRQ+FIQ), and handing debug exceptions requires us to mask debug (which also requires masking SError+IRQ+FIQ). Once do_serror() or do_debug_exception() has returned, we no longer need to mask exceptions, and can unmask them all, which is what we did prior to commit: 9034f6251572a474 ("arm64: Do not enable IRQs for ct_user_exit") ... where we had to mask IRQs as for context_tracking_user_exit() expected IRQs to be masked. Since then, we realised that our context tracking wasn't entirely correct, and reworked the entry code to fix this. As of commit: 23529049c6842382 ("arm64: entry: fix non-NMI user<->kernel transitions") ... we replaced the call to context_tracking_user_exit() with a call to user_exit_irqoff() as part of enter_from_user_mode(), which occurs earlier, before we run the body of the handler and unmask exceptions in DAIF. When we return to userspace, we go via ret_to_user(), which masks exceptions in DAIF prior to calling user_enter_irqoff() as part of exit_to_user_mode(). Thus, there's no longer a reason to leave IRQs or FIQs masked at the end of the EL0 debug or error handlers, as neither the user exit context tracking nor the user entry context tracking requires this. Let's bring these into line with other EL0 exception handlers and ensure that IRQ and FIQ are unmasked in DAIF at some point during the handler. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Joey Gouly <joey.gouly@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210607094624.34689-3-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2021-05-05arm64: entry: always set GIC_PRIO_PSR_I_SET during entryMark Rutland
Zenghui reports that booting a kernel with "irqchip.gicv3_pseudo_nmi=1" on the command line hits a warning during kernel entry, due to the way we manipulate the PMR. Early in the entry sequence, we call lockdep_hardirqs_off() to inform lockdep that interrupts have been masked (as the HW sets DAIF wqhen entering an exception). Architecturally PMR_EL1 is not affected by exception entry, and we don't set GIC_PRIO_PSR_I_SET in the PMR early in the exception entry sequence, so early in exception entry the PMR can indicate that interrupts are unmasked even though they are masked by DAIF. If DEBUG_LOCKDEP is selected, lockdep_hardirqs_off() will check that interrupts are masked, before we set GIC_PRIO_PSR_I_SET in any of the exception entry paths, and hence lockdep_hardirqs_off() will WARN() that something is amiss. We can avoid this by consistently setting GIC_PRIO_PSR_I_SET during exception entry so that kernel code sees a consistent environment. We must also update local_daif_inherit() to undo this, as currently only touches DAIF. For other paths, local_daif_restore() will update both DAIF and the PMR. With this done, we can remove the existing special cases which set this later in the entry code. We always use (GIC_PRIO_IRQON | GIC_PRIO_PSR_I_SET) for consistency with local_daif_save(), as this will warn if it ever encounters (GIC_PRIO_IRQOFF | GIC_PRIO_PSR_I_SET), and never sets this itself. This matches the gic_prio_kentry_setup that we have to retain for ret_to_user. The original splat from Zenghui's report was: | DEBUG_LOCKS_WARN_ON(!irqs_disabled()) | WARNING: CPU: 3 PID: 125 at kernel/locking/lockdep.c:4258 lockdep_hardirqs_off+0xd4/0xe8 | Modules linked in: | CPU: 3 PID: 125 Comm: modprobe Tainted: G W 5.12.0-rc8+ #463 | Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 | pstate: 604003c5 (nZCv DAIF +PAN -UAO -TCO BTYPE=--) | pc : lockdep_hardirqs_off+0xd4/0xe8 | lr : lockdep_hardirqs_off+0xd4/0xe8 | sp : ffff80002a39bad0 | pmr_save: 000000e0 | x29: ffff80002a39bad0 x28: ffff0000de214bc0 | x27: ffff0000de1c0400 x26: 000000000049b328 | x25: 0000000000406f30 x24: ffff0000de1c00a0 | x23: 0000000020400005 x22: ffff8000105f747c | x21: 0000000096000044 x20: 0000000000498ef9 | x19: ffff80002a39bc88 x18: ffffffffffffffff | x17: 0000000000000000 x16: ffff800011c61eb0 | x15: ffff800011700a88 x14: 0720072007200720 | x13: 0720072007200720 x12: 0720072007200720 | x11: 0720072007200720 x10: 0720072007200720 | x9 : ffff80002a39bad0 x8 : ffff80002a39bad0 | x7 : ffff8000119f0800 x6 : c0000000ffff7fff | x5 : ffff8000119f07a8 x4 : 0000000000000001 | x3 : 9bcdab23f2432800 x2 : ffff800011730538 | x1 : 9bcdab23f2432800 x0 : 0000000000000000 | Call trace: | lockdep_hardirqs_off+0xd4/0xe8 | enter_from_kernel_mode.isra.5+0x7c/0xa8 | el1_abort+0x24/0x100 | el1_sync_handler+0x80/0xd0 | el1_sync+0x6c/0x100 | __arch_clear_user+0xc/0x90 | load_elf_binary+0x9fc/0x1450 | bprm_execve+0x404/0x880 | kernel_execve+0x180/0x188 | call_usermodehelper_exec_async+0xdc/0x158 | ret_from_fork+0x10/0x18 Fixes: 23529049c684 ("arm64: entry: fix non-NMI user<->kernel transitions") Fixes: 7cd1ea1010ac ("arm64: entry: fix non-NMI kernel<->kernel transitions") Fixes: f0cd5ac1e4c5 ("arm64: entry: fix NMI {user, kernel}->kernel transitions") Fixes: 2a9b3e6ac69a ("arm64: entry: fix EL1 debug transitions") Link: https://lore.kernel.org/r/f4012761-026f-4e51-3a0c-7524e434e8b3@huawei.com Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reported-by: Zenghui Yu <yuzenghui@huawei.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210428111555.50880-1-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-04-11arm64: mte: Enable async tag check faultVincenzo Frascino
MTE provides a mode that asynchronously updates the TFSR_EL1 register when a tag check exception is detected. To take advantage of this mode the kernel has to verify the status of the register at: 1. Context switching 2. Return to user/EL0 (Not required in entry from EL0 since the kernel did not run) 3. Kernel entry from EL1 4. Kernel exit to EL1 If the register is non-zero a trace is reported. Add the required features for EL1 detection and reporting. Note: ITFSB bit is set in the SCTLR_EL1 register hence it guaranties that the indirect writes to TFSR_EL1 are synchronized at exception entry to EL1. On the context switch path the synchronization is guarantied by the dsb() in __switch_to(). The dsb(nsh) in mte_check_tfsr_exit() is provisional pending confirmation by the architects. Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Link: https://lore.kernel.org/r/20210315132019.33202-8-vincenzo.frascino@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2021-02-08arm64: entry: consolidate Cortex-A76 erratum 1463225 workaroundMark Rutland
The workaround for Cortex-A76 erratum 1463225 is split across the syscall and debug handlers in separate files. This structure currently forces us to do some redundant work for debug exceptions from EL0, is a little difficult to follow, and gets in the way of some future rework of the exception entry code as it requires exceptions to be unmasked late in the syscall handling path. To simplify things, and as a preparatory step for future rework of exception entry, this patch moves all the workaround logic into entry-common.c. As the debug handler only needs to run for EL1 debug exceptions, we no longer call it for EL0 debug exceptions, and no longer need to check user_mode(regs) as this is always false. For clarity cortex_a76_erratum_1463225_debug_handler() is changed to return bool. In the SVC path, the workaround is applied earlier, but this should have no functional impact as exceptions are still masked. In the debug path we run the fixup before explicitly disabling preemption, but we will not attempt to preempt before returning from the exception. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20210202120341.28858-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-12-09Merge remote-tracking branch 'arm64/for-next/fixes' into for-next/coreCatalin Marinas
* arm64/for-next/fixes: (26 commits) arm64: mte: fix prctl(PR_GET_TAGGED_ADDR_CTRL) if TCF0=NONE arm64: mte: Fix typo in macro definition arm64: entry: fix EL1 debug transitions arm64: entry: fix NMI {user, kernel}->kernel transitions arm64: entry: fix non-NMI kernel<->kernel transitions arm64: ptrace: prepare for EL1 irq/rcu tracking arm64: entry: fix non-NMI user<->kernel transitions arm64: entry: move el1 irq/nmi logic to C arm64: entry: prepare ret_to_user for function call arm64: entry: move enter_from_user_mode to entry-common.c arm64: entry: mark entry code as noinstr arm64: mark idle code as noinstr arm64: syscall: exit userspace before unmasking exceptions arm64: pgtable: Ensure dirty bit is preserved across pte_wrprotect() arm64: pgtable: Fix pte_accessible() ACPI/IORT: Fix doc warnings in iort.c arm64/fpsimd: add <asm/insn.h> to <asm/kprobes.h> to fix fpsimd build arm64: cpu_errata: Apply Erratum 845719 to KRYO2XX Silver arm64: proton-pack: Add KRYO2XX silver CPUs to spectre-v2 safe-list arm64: kpti: Add KRYO2XX gold/silver CPU cores to kpti safelist ... # Conflicts: # arch/arm64/include/asm/exception.h # arch/arm64/kernel/sdei.c
2020-11-30arm64: entry: fix EL1 debug transitionsMark Rutland
In debug_exception_enter() and debug_exception_exit() we trace hardirqs on/off while RCU isn't guaranteed to be watching, and we don't save and restore the hardirq state, and so may return with this having changed. Handle this appropriately with new entry/exit helpers which do the bare minimum to ensure this is appropriately maintained, without marking debug exceptions as NMIs. These are placed in entry-common.c with the other entry/exit helpers. In future we'll want to reconsider whether some debug exceptions should be NMIs, but this will require a significant refactoring, and for now this should prevent issues with lockdep and RCU. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marins <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-12-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30arm64: entry: fix NMI {user, kernel}->kernel transitionsMark Rutland
Exceptions which can be taken at (almost) any time are consdiered to be NMIs. On arm64 that includes: * SDEI events * GICv3 Pseudo-NMIs * Kernel stack overflows * Unexpected/unhandled exceptions ... but currently debug exceptions (BRKs, breakpoints, watchpoints, single-step) are not considered NMIs. As these can be taken at any time, kernel features (lockdep, RCU, ftrace) may not be in a consistent kernel state. For example, we may take an NMI from the idle code or partway through an entry/exit path. While nmi_enter() and nmi_exit() handle most of this state, notably they don't save/restore the lockdep state across an NMI being taken and handled. When interrupts are enabled and an NMI is taken, lockdep may see interrupts become disabled within the NMI code, but not see interrupts become enabled when returning from the NMI, leaving lockdep believing interrupts are disabled when they are actually disabled. The x86 code handles this in idtentry_{enter,exit}_nmi(), which will shortly be moved to the generic entry code. As we can't use either yet, we copy the x86 approach in arm64-specific helpers. All the NMI entrypoints are marked as noinstr to prevent any instrumentation handling code being invoked before the state has been corrected. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-11-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30arm64: entry: fix non-NMI kernel<->kernel transitionsMark Rutland
There are periods in kernel mode when RCU is not watching and/or the scheduler tick is disabled, but we can still take exceptions such as interrupts. The arm64 exception handlers do not account for this, and it's possible that RCU is not watching while an exception handler runs. The x86/generic entry code handles this by ensuring that all (non-NMI) kernel exception handlers call irqentry_enter() and irqentry_exit(), which handle RCU, lockdep, and IRQ flag tracing. We can't yet move to the generic entry code, and already hadnle the user<->kernel transitions elsewhere, so we add new kernel<->kernel transition helpers alog the lines of the generic entry code. Since we now track interrupts becoming masked when an exception is taken, local_daif_inherit() is modified to track interrupts becoming re-enabled when the original context is inherited. To balance the entry/exit paths, each handler masks all DAIF exceptions before exit_to_kernel_mode(). Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-10-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30arm64: entry: fix non-NMI user<->kernel transitionsMark Rutland
When built with PROVE_LOCKING, NO_HZ_FULL, and CONTEXT_TRACKING_FORCE will WARN() at boot time that interrupts are enabled when we call context_tracking_user_enter(), despite the DAIF flags indicating that IRQs are masked. The problem is that we're not tracking IRQ flag changes accurately, and so lockdep believes interrupts are enabled when they are not (and vice-versa). We can shuffle things so to make this more accurate. For kernel->user transitions there are a number of constraints we need to consider: 1) When we call __context_tracking_user_enter() HW IRQs must be disabled and lockdep must be up-to-date with this. 2) Userspace should be treated as having IRQs enabled from the PoV of both lockdep and tracing. 3) As context_tracking_user_enter() stops RCU from watching, we cannot use RCU after calling it. 4) IRQ flag tracing and lockdep have state that must be manipulated before RCU is disabled. ... with similar constraints applying for user->kernel transitions, with the ordering reversed. The generic entry code has enter_from_user_mode() and exit_to_user_mode() helpers to handle this. We can't use those directly, so we add arm64 copies for now (without the instrumentation markers which aren't used on arm64). These replace the existing user_exit() and user_exit_irqoff() calls spread throughout handlers, and the exception unmasking is left as-is. Note that: * The accounting for debug exceptions from userspace now happens in el0_dbg() and ret_to_user(), so this is removed from debug_exception_enter() and debug_exception_exit(). As user_exit_irqoff() wakes RCU, the userspace-specific check is removed. * The accounting for syscalls now happens in el0_svc(), el0_svc_compat(), and ret_to_user(), so this is removed from el0_svc_common(). This does not adversely affect the workaround for erratum 1463225, as this does not depend on any of the state tracking. * In ret_to_user() we mask interrupts with local_daif_mask(), and so we need to inform lockdep and tracing. Here a trace_hardirqs_off() is sufficient and safe as we have not yet exited kernel context and RCU is usable. * As PROVE_LOCKING selects TRACE_IRQFLAGS, the ifdeferry in entry.S only needs to check for the latter. * EL0 SError handling will be dealt with in a subsequent patch, as this needs to be treated as an NMI. Prior to this patch, booting an appropriately-configured kernel would result in spats as below: | DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled()) | WARNING: CPU: 2 PID: 1 at kernel/locking/lockdep.c:5280 check_flags.part.54+0x1dc/0x1f0 | Modules linked in: | CPU: 2 PID: 1 Comm: init Not tainted 5.10.0-rc3 #3 | Hardware name: linux,dummy-virt (DT) | pstate: 804003c5 (Nzcv DAIF +PAN -UAO -TCO BTYPE=--) | pc : check_flags.part.54+0x1dc/0x1f0 | lr : check_flags.part.54+0x1dc/0x1f0 | sp : ffff80001003bd80 | x29: ffff80001003bd80 x28: ffff66ce801e0000 | x27: 00000000ffffffff x26: 00000000000003c0 | x25: 0000000000000000 x24: ffffc31842527258 | x23: ffffc31842491368 x22: ffffc3184282d000 | x21: 0000000000000000 x20: 0000000000000001 | x19: ffffc318432ce000 x18: 0080000000000000 | x17: 0000000000000000 x16: ffffc31840f18a78 | x15: 0000000000000001 x14: ffffc3184285c810 | x13: 0000000000000001 x12: 0000000000000000 | x11: ffffc318415857a0 x10: ffffc318406614c0 | x9 : ffffc318415857a0 x8 : ffffc31841f1d000 | x7 : 647261685f706564 x6 : ffffc3183ff7c66c | x5 : ffff66ce801e0000 x4 : 0000000000000000 | x3 : ffffc3183fe00000 x2 : ffffc31841500000 | x1 : e956dc24146b3500 x0 : 0000000000000000 | Call trace: | check_flags.part.54+0x1dc/0x1f0 | lock_is_held_type+0x10c/0x188 | rcu_read_lock_sched_held+0x70/0x98 | __context_tracking_enter+0x310/0x350 | context_tracking_enter.part.3+0x5c/0xc8 | context_tracking_user_enter+0x6c/0x80 | finish_ret_to_user+0x2c/0x13cr Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-8-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30arm64: entry: move el1 irq/nmi logic to CMark Rutland
In preparation for reworking the EL1 irq/nmi entry code, move the existing logic to C. We no longer need the asm_nmi_enter() and asm_nmi_exit() wrappers, so these are removed. The new C functions are marked noinstr, which prevents compiler instrumentation and runtime probing. In subsequent patches we'll want the new C helpers to be called in all cases, so we don't bother wrapping the calls with ifdeferry. Even when the new C functions are stubs the trivial calls are unlikely to have a measurable impact on the IRQ or NMI paths anyway. Prototypes are added to <asm/exception.h> as otherwise (in some configurations) GCC will complain about the lack of a forward declaration. We already do this for existing function, e.g. enter_from_user_mode(). The new helpers are marked as noinstr (which prevents all instrumentation, tracing, and kprobes). Otherwise, there should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-7-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30arm64: entry: move enter_from_user_mode to entry-common.cMark Rutland
In later patches we'll want to extend enter_from_user_mode() and add a corresponding exit_to_user_mode(). As these will be common for all entries/exits from userspace, it'd be better for these to live in entry-common.c with the rest of the entry logic. This patch moves enter_from_user_mode() into entry-common.c. As with other functions in entry-common.c it is marked as noinstr (which prevents all instrumentation, tracing, and kprobes) but there are no other functional changes. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-5-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-30arm64: entry: mark entry code as noinstrMark Rutland
Functions in entry-common.c are marked as notrace and NOKPROBE_SYMBOL(), but they're still subject to other instrumentation which may rely on lockdep/rcu/context-tracking being up-to-date, and may cause nested exceptions (e.g. for WARN/BUG or KASAN's use of BRK) which will corrupt exceptions registers which have not yet been read. Prevent this by marking all functions in entry-common.c as noinstr to prevent compiler instrumentation. This also blacklists the functions for tracing and kprobes, so we don't need to handle that separately. Functions elsewhere will be dealt with in subsequent patches. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201130115950.22492-4-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-11-23arm64: expose FAR_EL1 tag bits in siginfoPeter Collingbourne
The kernel currently clears the tag bits (i.e. bits 56-63) in the fault address exposed via siginfo.si_addr and sigcontext.fault_address. However, the tag bits may be needed by tools in order to accurately diagnose memory errors, such as HWASan [1] or future tools based on the Memory Tagging Extension (MTE). Expose these bits via the arch_untagged_si_addr mechanism, so that they are only exposed to signal handlers with the SA_EXPOSE_TAGBITS flag set. [1] http://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html Signed-off-by: Peter Collingbourne <pcc@google.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://linux-review.googlesource.com/id/Ia8876bad8c798e0a32df7c2ce1256c4771c81446 Link: https://lore.kernel.org/r/0010296597784267472fa13b39f8238d87a72cf8.1605904350.git.pcc@google.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-09-14arm64: ptrauth: Introduce Armv8.3 pointer authentication enhancementsAmit Daniel Kachhap
Some Armv8.3 Pointer Authentication enhancements have been introduced which are mandatory for Armv8.6 and optional for Armv8.3. These features are, * ARMv8.3-PAuth2 - An enhanced PAC generation logic is added which hardens finding the correct PAC value of the authenticated pointer. * ARMv8.3-FPAC - Fault is generated now when the ptrauth authentication instruction fails in authenticating the PAC present in the address. This is different from earlier case when such failures just adds an error code in the top byte and waits for subsequent load/store to abort. The ptrauth instructions which may cause this fault are autiasp, retaa etc. The above features are now represented by additional configurations for the Address Authentication cpufeature and a new ESR exception class. The userspace fault received in the kernel due to ARMv8.3-FPAC is treated as Illegal instruction and hence signal SIGILL is injected with ILL_ILLOPN as the signal code. Note that this is different from earlier ARMv8.3 ptrauth where signal SIGSEGV is issued due to Pointer authentication failures. The in-kernel PAC fault causes kernel to crash. Signed-off-by: Amit Daniel Kachhap <amit.kachhap@arm.com> Reviewed-by: Dave Martin <Dave.Martin@arm.com> Link: https://lore.kernel.org/r/20200914083656.21428-4-amit.kachhap@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2020-07-08arm64: entry: Fix the typo in the comment of el1_dbg()Kevin Hao
The function name should be local_daif_mask(). Signed-off-by: Kevin Hao <haokexin@gmail.com> Acked-by: Mark Rutlamd <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200417103212.45812-2-haokexin@gmail.com Signed-off-by: Will Deacon <will@kernel.org>
2020-05-28Merge branch 'for-next/bti' into for-next/coreWill Deacon
Support for Branch Target Identification (BTI) in user and kernel (Mark Brown and others) * for-next/bti: (39 commits) arm64: vdso: Fix CFI directives in sigreturn trampoline arm64: vdso: Don't prefix sigreturn trampoline with a BTI C instruction arm64: bti: Fix support for userspace only BTI arm64: kconfig: Update and comment GCC version check for kernel BTI arm64: vdso: Map the vDSO text with guarded pages when built for BTI arm64: vdso: Force the vDSO to be linked as BTI when built for BTI arm64: vdso: Annotate for BTI arm64: asm: Provide a mechanism for generating ELF note for BTI arm64: bti: Provide Kconfig for kernel mode BTI arm64: mm: Mark executable text as guarded pages arm64: bpf: Annotate JITed code for BTI arm64: Set GP bit in kernel page tables to enable BTI for the kernel arm64: asm: Override SYM_FUNC_START when building the kernel with BTI arm64: bti: Support building kernel C code using BTI arm64: Document why we enable PAC support for leaf functions arm64: insn: Report PAC and BTI instructions as skippable arm64: insn: Don't assume unrecognized HINTs are skippable arm64: insn: Provide a better name for aarch64_insn_is_nop() arm64: insn: Add constants for new HINT instruction decode arm64: Disable old style assembly annotations ...
2020-05-05Merge branch 'for-next/bti-user' into for-next/btiWill Deacon
Merge in user support for Branch Target Identification, which narrowly missed the cut for 5.7 after a late ABI concern. * for-next/bti-user: arm64: bti: Document behaviour for dynamically linked binaries arm64: elf: Fix allnoconfig kernel build with !ARCH_USE_GNU_PROPERTY arm64: BTI: Add Kconfig entry for userspace BTI mm: smaps: Report arm64 guarded pages in smaps arm64: mm: Display guarded pages in ptdump KVM: arm64: BTI: Reset BTYPE when skipping emulated instructions arm64: BTI: Reset BTYPE when skipping emulated instructions arm64: traps: Shuffle code to eliminate forward declarations arm64: unify native/compat instruction skipping arm64: BTI: Decode BYTPE bits when printing PSTATE arm64: elf: Enable BTI at exec based on ELF program properties elf: Allow arch to tweak initial mmap prot flags arm64: Basic Branch Target Identification support ELF: Add ELF program property parsing support ELF: UAPI and Kconfig additions for ELF program properties
2020-04-28arm64: entry: remove unneeded semicolon in el1_sync_handler()Jason Yan
Fix the following coccicheck warning: arch/arm64/kernel/entry-common.c:97:2-3: Unneeded semicolon Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20200418081909.41471-1-yanaijie@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2020-03-16arm64: Basic Branch Target Identification supportDave Martin
This patch adds the bare minimum required to expose the ARMv8.5 Branch Target Identification feature to userspace. By itself, this does _not_ automatically enable BTI for any initial executable pages mapped by execve(). This will come later, but for now it should be possible to enable BTI manually on those pages by using mprotect() from within the target process. Other arches already using the generic mman.h are already using 0x10 for arch-specific prot flags, so we use that for PROT_BTI here. For consistency, signal handler entry points in BTI guarded pages are required to be annotated as such, just like any other function. This blocks a relatively minor attack vector, but comforming userspace will have the annotations anyway, so we may as well enforce them. Signed-off-by: Mark Brown <broonie@kernel.org> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2020-03-11arm64: entry: unmask IRQ in el0_sp()Mark Rutland
Currently, the EL0 SP alignment handler masks IRQs unnecessarily. It does so due to historic code sharing of the EL0 SP and PC alignment handlers, and branch predictor hardening applicable to the EL0 SP handler. We began masking IRQs in the EL0 SP alignment handler in commit: 5dfc6ed27710c42c ("arm64: entry: Apply BP hardening for high-priority synchronous exception") ... as this shared code with the EL0 PC alignment handler, and branch predictor hardening made it necessary to disable IRQs for early parts of the EL0 PC alignment handler. It was not necessary to mask IRQs during EL0 SP alignment exceptions, but it was not considered harmful to do so. This masking was carried forward into C code in commit: 582f95835a8fc812 ("arm64: entry: convert el0_sync to C") ... where the SP/PC cases were split into separate handlers, and the masking duplicated. Subsequently the EL0 PC alignment handler was refactored to perform branch predictor hardening before unmasking IRQs, in commit: bfe298745afc9548 ("arm64: entry-common: don't touch daif before bp-hardening") ... but the redundant masking of IRQs was not removed from the EL0 SP alignment handler. Let's do so now, and make it interruptible as with most other synchronous exception handlers. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: James Morse <james.morse@arm.com>
2020-01-17arm64: entry: cleanup el0 svc handler namingMark Rutland
For most of the exception entry code, <foo>_handler() is the first C function called from the entry assembly in entry-common.c, and external functions handling the bulk of the logic are called do_<foo>(). For consistency, apply this scheme to el0_svc_handler and el0_svc_compat_handler, renaming them to do_el0_svc and do_el0_svc_compat respectively. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2020-01-17arm64: entry: mark all entry code as notraceMark Rutland
Almost all functions in entry-common.c are marked notrace, with el1_undef and el1_inv being the only exceptions. We appear to have done this on the assumption that there were no exception registers that we needed to snapshot, and thus it was safe to run trace code that might result in further exceptions and clobber those registers. However, until we inherit the DAIF flags, our irq flag tracing is stale, and this discrepancy could set off warnings in some configurations. For example if CONFIG_DEBUG_LOCKDEP is selected and a trace function calls into any flag-checking locking routines. Given we don't expect to trigger el1_undef or el1_inv unless something is already wrong, any irqflag warnigns are liable to mask the information we'd actually care about. Let's keep things simple and mark el1_undef and el1_inv as notrace. Developers can trace do_undefinstr and bad_mode if they really want to monitor these cases. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: James Morse <james.morse@arm.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Will Deacon <will@kernel.org>
2019-10-28arm64: entry-common: don't touch daif before bp-hardeningJames Morse
The previous patches mechanically transformed the assembly version of entry.S to entry-common.c for synchronous exceptions. The C version of local_daif_restore() doesn't quite do the same thing as the assembly versions if pseudo-NMI is in use. In particular, | local_daif_restore(DAIF_PROCCTX_NOIRQ) will still allow pNMI to be delivered. This is not the behaviour do_el0_ia_bp_hardening() and do_sp_pc_abort() want as it should not be possible for the PMU handler to run as an NMI until the bp-hardening sequence has run. The bp-hardening calls were placed where they are because this was the first C code to run after the relevant exceptions. As we've now moved that point earlier, move the checks and calls earlier too. This makes it clearer that this stuff runs before any kind of exception, and saves modifying PSTATE twice. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-28arm64: entry: convert el0_sync to CMark Rutland
This is largely a 1-1 conversion of asm to C, with a couple of caveats. The el0_sync{_compat} switches explicitly handle all the EL0 debug cases, so el0_dbg doesn't have to try to bail out for unexpected EL1 debug ESR values. This also means that an unexpected vector catch from AArch32 is routed to el0_inv. We *could* merge the native and compat switches, which would make the diffstat negative, but I've tried to stay as close to the existing assembly as possible for the moment. Signed-off-by: Mark Rutland <mark.rutland@arm.com> [split out of a bigger series, added nokprobes. removed irq trace calls as the C helpers do this. renamed el0_dbg's use of FAR] Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2019-10-28arm64: entry: convert el1_sync to CMark Rutland
This patch converts the EL1 sync entry assembly logic to C code. Doing this will allow us to make changes in a slightly more readable way. A case in point is supporting kernel-first RAS. do_sea() should be called on the CPU that took the fault. Largely the assembly code is converted to C in a relatively straightforward manner. Since all sync sites share a common asm entry point, the ASM_BUG() instances are no longer required for effective backtraces back to assembly, and we don't need similar BUG() entries. The ESR_ELx.EC codes for all (supported) debug exceptions are now checked in the el1_sync_handler's switch statement, which renders the check in el1_dbg redundant. This both simplifies the el1_dbg handler, and makes the EL1 exception handling more robust to currently-unallocated ESR_ELx.EC encodings. Signed-off-by: Mark Rutland <mark.rutland@arm.com> [split out of a bigger series, added nokprobes, moved prototypes] Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Cc: Julien Thierry <julien.thierry.kdev@gmail.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>