summaryrefslogtreecommitdiff
path: root/arch/arm64/kernel/ptrace.c
AgeCommit message (Collapse)Author
5 daysMerge tag 'execve-v6.17' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull execve updates from Kees Cook: - Introduce regular REGSET note macros arch-wide (Dave Martin) - Remove arbitrary 4K limitation of program header size (Yin Fengwei) - Reorder function qualifiers for copy_clone_args_from_user() (Dishank Jogi) * tag 'execve-v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (25 commits) fork: reorder function qualifiers for copy_clone_args_from_user binfmt_elf: remove the 4k limitation of program header size binfmt_elf: Warn on missing or suspicious regset note names xtensa: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names um: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names x86/ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names sparc: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names sh: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names s390/ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names riscv: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names powerpc/ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names parisc: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names openrisc: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names nios2: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names MIPS: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names m68k: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names LoongArch: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names hexagon: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names csky: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names arm64: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note names ...
2025-07-14arm64: ptrace: Use USER_REGSET_NOTE_TYPE() to specify regset note namesDave Martin
Instead of having the core code guess the note name for each regset, use USER_REGSET_NOTE_TYPE() to pick the correct name from elf.h. This does not affect the correctness of switch(note_type) and similar code, since note type values known to Linux for coredump purposes were already required to be unique. Signed-off-by: Dave Martin <Dave.Martin@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Kees Cook <kees@kernel.org> Cc: Akihiko Odaki <akihiko.odaki@daynix.com> Cc: linux-arm-kernel@lists.infradead.org Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Link: https://lore.kernel.org/r/20250701135616.29630-7-Dave.Martin@arm.com Signed-off-by: Kees Cook <kees@kernel.org>
2025-06-12arm64/ptrace: Fix stack-out-of-bounds read in regs_get_kernel_stack_nth()Tengda Wu
KASAN reports a stack-out-of-bounds read in regs_get_kernel_stack_nth(). Call Trace: [ 97.283505] BUG: KASAN: stack-out-of-bounds in regs_get_kernel_stack_nth+0xa8/0xc8 [ 97.284677] Read of size 8 at addr ffff800089277c10 by task 1.sh/2550 [ 97.285732] [ 97.286067] CPU: 7 PID: 2550 Comm: 1.sh Not tainted 6.6.0+ #11 [ 97.287032] Hardware name: linux,dummy-virt (DT) [ 97.287815] Call trace: [ 97.288279] dump_backtrace+0xa0/0x128 [ 97.288946] show_stack+0x20/0x38 [ 97.289551] dump_stack_lvl+0x78/0xc8 [ 97.290203] print_address_description.constprop.0+0x84/0x3c8 [ 97.291159] print_report+0xb0/0x280 [ 97.291792] kasan_report+0x84/0xd0 [ 97.292421] __asan_load8+0x9c/0xc0 [ 97.293042] regs_get_kernel_stack_nth+0xa8/0xc8 [ 97.293835] process_fetch_insn+0x770/0xa30 [ 97.294562] kprobe_trace_func+0x254/0x3b0 [ 97.295271] kprobe_dispatcher+0x98/0xe0 [ 97.295955] kprobe_breakpoint_handler+0x1b0/0x210 [ 97.296774] call_break_hook+0xc4/0x100 [ 97.297451] brk_handler+0x24/0x78 [ 97.298073] do_debug_exception+0xac/0x178 [ 97.298785] el1_dbg+0x70/0x90 [ 97.299344] el1h_64_sync_handler+0xcc/0xe8 [ 97.300066] el1h_64_sync+0x78/0x80 [ 97.300699] kernel_clone+0x0/0x500 [ 97.301331] __arm64_sys_clone+0x70/0x90 [ 97.302084] invoke_syscall+0x68/0x198 [ 97.302746] el0_svc_common.constprop.0+0x11c/0x150 [ 97.303569] do_el0_svc+0x38/0x50 [ 97.304164] el0_svc+0x44/0x1d8 [ 97.304749] el0t_64_sync_handler+0x100/0x130 [ 97.305500] el0t_64_sync+0x188/0x190 [ 97.306151] [ 97.306475] The buggy address belongs to stack of task 1.sh/2550 [ 97.307461] and is located at offset 0 in frame: [ 97.308257] __se_sys_clone+0x0/0x138 [ 97.308910] [ 97.309241] This frame has 1 object: [ 97.309873] [48, 184) 'args' [ 97.309876] [ 97.310749] The buggy address belongs to the virtual mapping at [ 97.310749] [ffff800089270000, ffff800089279000) created by: [ 97.310749] dup_task_struct+0xc0/0x2e8 [ 97.313347] [ 97.313674] The buggy address belongs to the physical page: [ 97.314604] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x14f69a [ 97.315885] flags: 0x15ffffe00000000(node=1|zone=2|lastcpupid=0xfffff) [ 97.316957] raw: 015ffffe00000000 0000000000000000 dead000000000122 0000000000000000 [ 97.318207] raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000 [ 97.319445] page dumped because: kasan: bad access detected [ 97.320371] [ 97.320694] Memory state around the buggy address: [ 97.321511] ffff800089277b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 97.322681] ffff800089277b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 97.323846] >ffff800089277c00: 00 00 f1 f1 f1 f1 f1 f1 00 00 00 00 00 00 00 00 [ 97.325023] ^ [ 97.325683] ffff800089277c80: 00 00 00 00 00 00 00 00 00 f3 f3 f3 f3 f3 f3 f3 [ 97.326856] ffff800089277d00: f3 f3 00 00 00 00 00 00 00 00 00 00 00 00 00 00 This issue seems to be related to the behavior of some gcc compilers and was also fixed on the s390 architecture before: commit d93a855c31b7 ("s390/ptrace: Avoid KASAN false positives in regs_get_kernel_stack_nth()") As described in that commit, regs_get_kernel_stack_nth() has confirmed that `addr` is on the stack, so reading the value at `*addr` should be allowed. Use READ_ONCE_NOCHECK() helper to silence the KASAN check for this case. Fixes: 0a8ea52c3eb1 ("arm64: Add HAVE_REGS_AND_STACK_ACCESS_API feature") Signed-off-by: Tengda Wu <wutengda@huaweicloud.com> Link: https://lore.kernel.org/r/20250604005533.1278992-1-wutengda@huaweicloud.com [will: Use '*addr' as the argument to READ_ONCE_NOCHECK()] Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08arm64/fpsimd: ptrace: Gracefully handle errorsMark Rutland
Within sve_set_common() we do not handle error conditions correctly: * When writing to NT_ARM_SSVE, if sme_alloc() fails, the task will be left with task->thread.sme_state==NULL, but TIF_SME will be set and task->thread.fp_type==FP_STATE_SVE. This will result in a subsequent null pointer dereference when the task's state is loaded or otherwise manipulated. * When writing to NT_ARM_SSVE, if sve_alloc() fails, the task will be left with task->thread.sve_state==NULL, but TIF_SME will be set, PSTATE.SM will be set, and task->thread.fp_type==FP_STATE_FPSIMD. This is not a legitimate state, and can result in various problems, including a subsequent null pointer dereference and/or the task inheriting stale streaming mode register state the next time its state is loaded into hardware. * When writing to NT_ARM_SSVE, if the VL is changed but the resulting VL differs from that in the header, the task will be left with TIF_SME set, PSTATE.SM set, but task->thread.fp_type==FP_STATE_FPSIMD. This is not a legitimate state, and can result in various problems as described above. Avoid these problems by allocating memory earlier, and by changing the task's saved fp_type to FP_STATE_SVE before skipping register writes due to a change of VL. To make early returns simpler, I've moved the call to fpsimd_flush_task_state() earlier. As the tracee's state has already been saved, and the tracee is known to be blocked for the duration of sve_set_common(), it doesn't matter whether this is called at the start or the end. For consistency I've moved the setting of TIF_SVE earlier. This will be cleared when loading FPSIMD-only state, and so moving this has no resulting functional change. Note that we only allocate the memory for SVE state when SVE register contents are provided, avoiding unnecessary memory allocations for tasks which only use FPSIMD. Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Fixes: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE") Fixes: 5d0a8d2fba50 ("arm64/ptrace: Ensure that SME is set up for target when writing SSVE state") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Spickett <david.spickett@arm.com> Cc: Luis Machado <luis.machado@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-20-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08arm64/fpsimd: ptrace: Mandate SVE payload for streaming-mode stateMark Rutland
When a task has PSTATE.SM==1, reads of NT_ARM_SSVE are required to always present a header with SVE_PT_REGS_SVE, and register data in SVE format. Reads of NT_ARM_SSVE must never present register data in FPSIMD format. Within the kernel, we always expect streaming SVE data to be stored in SVE format. Currently a user can write to NT_ARM_SSVE with a header presenting SVE_PT_REGS_FPSIMD rather than SVE_PT_REGS_SVE, placing the task's FPSIMD/SVE data into an invalid state. To fix this we can either: (a) Forbid such writes. (b) Accept such writes, and immediately convert data into SVE format. Take the simple option and forbid such writes. Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Spickett <david.spickett@arm.com> Cc: Luis Machado <luis.machado@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-19-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08arm64/fpsimd: ptrace: Do not present register data for inactive modeMark Rutland
The SME ptrace ABI is written around the incorrect assumption that SVE_PT_REGS_FPSIMD and SVE_PT_REGS_SVE are independent bit flags, where it is possible for both to be clear. In reality they are different values for bit 0 of the header flags, where SVE_PT_REGS_FPSIMD is 0 and SVE_PT_REGS_SVE is 1. In cases where code was written expecting that neither bit flag would be set, the value is equivalent to SVE_PT_REGS_FPSIMD. One consequence of this is that reads of the NT_ARM_SVE or NT_ARM_SSVE will erroneously present data from the other mode: * When PSTATE.SM==1, reads of NT_ARM_SVE will present a header with SVE_PT_REGS_FPSIMD, and FPSIMD-formatted data from streaming mode. * When PSTATE.SM==0, reads of NT_ARM_SSVE will present a header with SVE_PT_REGS_FPSIMD, and FPSIMD-formatted data from non-streaming mode. The original intent was that no register data would be provided in these cases, as described in commit: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Luckily, debuggers do not consume the bogus register data. Both GDB and LLDB read the NT_ARM_SSVE regset before the NT_ARM_SVE regset, and assume that when the NT_ARM_SSVE header presents SVE_PT_REGS_FPSIMD, it is necessary to read register contents from the NT_ARM_SVE regset, regardless of whether the NT_ARM_SSVE regset provided bogus register data. Fix the code to stop presenting register data from the inactive mode. At the same time, make the manipulation of the flag clearer, and remove the bogus comment from sve_set_common(). I've given this a quick spin with GDB and LLDB, and both seem happy. Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Spickett <david.spickett@arm.com> Cc: Luis Machado <luis.machado@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-18-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08arm64/fpsimd: ptrace: Save task state before generating SVE headerMark Rutland
As sve_init_header_from_task() consumes the saved value of PSTATE.SM and the saved fp_type, both must be saved before the header is generated. When generating a coredump for the current task, sve_get_common() calls sve_init_header_from_task() before saving the task's state. Consequently the header may be bogus, and the contents of the regset may be misleading. Fix this by saving the task's state before generting the header. Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Fixes: b017a0cea627 ("arm64/ptrace: Use saved floating point state type to determine SVE layout") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Spickett <david.spickett@arm.com> Cc: Luis Machado <luis.machado@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-17-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08arm64/fpsimd: Clarify sve_sync_*() functionsMark Rutland
The sve_sync_{to,from}_fpsimd*() functions are intended to extract/insert the currently effective FPSIMD state of a task regardless of whether the task's state is saved in FPSIMD format or SVE format. Historically they were only used by ptrace, but sve_sync_to_fpsimd() is now used more widely, and sve_sync_from_fpsimd_zeropad() may be used more widely in future. When FPSIMD/SVE state tracking was changed across commits: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE") a0136be443d5 (arm64/fpsimd: Load FP state based on recorded data type") bbc6172eefdb ("arm64/fpsimd: SME no longer requires SVE register state") 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch") ... sve_sync_to_fpsimd() was updated to consider task->thread.fp_type rather than the task's TIF_SVE and PSTATE.SM, but (apparently due to an oversight) sve_sync_from_fpsimd_zeropad() was left as-is, leaving the two inconsistent. Due to this, sve_sync_from_fpsimd_zeropad() may copy state from task->thread.uw.fpsimd_state into task->thread.sve_state when task->thread.fp_type == FP_STATE_FPSIMD. This is redundant (but benign) as task->thread.uw.fpsimd_state is the effective state that will be restored, and task->thread.sve_state will not be consumed. For consistency, and to avoid the redundant work, it better for sve_sync_from_fpsimd_zeropad() to consider task->thread.fp_type alone, matching sve_sync_to_fpsimd(). The naming of both functions is somehat unfortunate, as it is unclear when and why they copy state. It would be better to describe them in terms of the effective state. Considering all of the above, clean this up: * Adjust sve_sync_from_fpsimd_zeropad() to consider task->thread.fp_type. * Update comments to clarify the intended semantics/usage. I've removed the description that task->thread.sve_state must have been allocated, as this is only necessary when task->thread.fp_type == FP_STATE_SVE, which itself implies that task->thread.sve_state must have been allocated. * Rename the functions to more clearly indicate when/why they copy state: - sve_sync_to_fpsimd() => fpsimd_sync_from_effective_state() - sve_sync_from_fpsimd_zeropad => fpsimd_sync_to_effective_state_zeropad() Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-7-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2025-05-08arm64/fpsimd: ptrace: Consistently handle partial writes to NT_ARM_(S)SVEMark Rutland
Partial writes to the NT_ARM_SVE and NT_ARM_SSVE regsets using an payload are handled inconsistently and non-deterministically. A comment within sve_set_common() indicates that we intended that a partial write would preserve any effective FPSIMD/SVE state which was not overwritten, but this has never worked consistently, and during syscalls the FPSIMD vector state may be non-deterministically preserved and may be erroneously migrated between streaming and non-streaming SVE modes. The simplest fix is to handle a partial write by consistently zeroing the remaining state. As detailed below I do not believe this will adversely affect any real usage. Neither GDB nor LLDB attempt partial writes to these regsets, and the documentation (in Documentation/arch/arm64/sve.rst) has always indicated that state preservation was not guaranteed, as is says: | The effect of writing a partial, incomplete payload is unspecified. When the logic was originally introduced in commit: 43d4da2c45b2 ("arm64/sve: ptrace and ELF coredump support") ... there were two potential behaviours, depending on TIF_SVE: * When TIF_SVE was clear, all SVE state would be zeroed, excluding the low 128 bits of vectors shared with FPSIMD, FPSR, and FPCR. * When TIF_SVE was set, all SVE state would be zeroed, including the low 128 bits of vectors shared with FPSIMD, but excluding FPSR and FPCR. Note that as writing to NT_ARM_SVE would set TIF_SVE, partial writes to NT_ARM_SVE would not be idempotent, and if a first write preserved the low 128 bits, a subsequent (potentially identical) partial write would discard the low 128 bits. When support for the NT_ARM_SSVE regset was added in commit: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") ... the above behaviour was retained for writes to the NT_ARM_SVE regset, though writes to the NT_ARM_SSVE would always zero the SVE registers and would not inherit FPSIMD register state. This happened as fpsimd_sync_to_sve() only copied the FPSIMD regs when TIF_SVE was clear and PSTATE.SM==0. Subsequently, when FPSIMD/SVE state tracking was changed across commits: baa8515281b3 ("arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVE") a0136be443d5 (arm64/fpsimd: Load FP state based on recorded data type") bbc6172eefdb ("arm64/fpsimd: SME no longer requires SVE register state") 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch") ... there was no corresponding update to the ptrace code, nor to fpsimd_sync_to_sve(), which stil considers TIF_SVE and PSTATE.SM rather than the saved fp_type. The saved state can be in the FPSIMD format regardless of whether TIF_SVE is set or clear, and the saved type can change non-deterministically during syscalls. Consequently a subsequent partial write to the NT_ARM_SVE or NT_ARM_SSVE regsets may non-deterministically preserve the FPSIMD state, and may migrate this state between streaming and non-streaming modes. Clean this up by never attempting to preserve ANY state when writing an SVE payload to the NT_ARM_SVE/NT_ARM_SSVE regsets, zeroing all relevant state including FPSR and FPCR. This simplifies the code, makes the behaviour deterministic, and avoids migrating state between streaming and non-streaming modes. As above, I do not believe this should adversely affect existing userspace applications. At the same time, remove fpsimd_sync_to_sve(). It is no longer used, doesn't do what its documentation implies, and gets in the way of other cleanups and fixes. Fixes: 43d4da2c45b2 ("arm64/sve: ptrace and ELF coredump support") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Spickett <david.spickett@arm.com> Cc: Luis Machado <luis.machado@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-6-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-12-05arm64: ptrace: fix partial SETREGSET for NT_ARM_GCSMark Rutland
Currently gcs_set() doesn't initialize the temporary 'user_gcs' variable, and a SETREGSET call with a length of 0, 8, or 16 will leave some portion of this uninitialized. Consequently some arbitrary uninitialized values may be written back to the relevant fields in task struct, potentially leaking up to 192 bits of memory from the kernel stack. The read is limited to a specific slot on the stack, and the issue does not provide a write mechanism. As gcs_set() rejects cases where user_gcs::features_enabled has bits set other than PR_SHADOW_STACK_SUPPORTED_STATUS_MASK, a SETREGSET call with a length of zero will randomly succeed or fail depending on the value of the uninitialized value, it isn't possible to leak the full 192 bits. With a length of 8 or 16, user_gcs::features_enabled can be initialized to an accepted value, making it practical to leak 128 or 64 bits. Fix this by initializing the temporary value before copying the regset from userspace, as for other regsets (e.g. NT_PRSTATUS, NT_PRFPREG, NT_ARM_SYSTEM_CALL). In the case of a zero-length or partial write, the existing contents of the fields which are not written to will be retained. To ensure that the extraction and insertion of fields is consistent across the GETREGSET and SETREGSET calls, new task_gcs_to_user() and task_gcs_from_user() helpers are added, matching the style of pac_address_keys_to_user() and pac_address_keys_from_user(). Before this patch: | # ./gcs-test | Attempting to write NT_ARM_GCS::user_gcs = { | .features_enabled = 0x0000000000000000, | .features_locked = 0x0000000000000000, | .gcspr_el0 = 0x900d900d900d900d, | } | SETREGSET(nt=0x410, len=24) wrote 24 bytes | | Attempting to read NT_ARM_GCS::user_gcs | GETREGSET(nt=0x410, len=24) read 24 bytes | Read NT_ARM_GCS::user_gcs = { | .features_enabled = 0x0000000000000000, | .features_locked = 0x0000000000000000, | .gcspr_el0 = 0x900d900d900d900d, | } | | Attempting partial write NT_ARM_GCS::user_gcs = { | .features_enabled = 0x0000000000000000, | .features_locked = 0x1de7ec7edbadc0de, | .gcspr_el0 = 0x1de7ec7edbadc0de, | } | SETREGSET(nt=0x410, len=8) wrote 8 bytes | | Attempting to read NT_ARM_GCS::user_gcs | GETREGSET(nt=0x410, len=24) read 24 bytes | Read NT_ARM_GCS::user_gcs = { | .features_enabled = 0x0000000000000000, | .features_locked = 0x000000000093e780, | .gcspr_el0 = 0xffff800083a63d50, | } After this patch: | # ./gcs-test | Attempting to write NT_ARM_GCS::user_gcs = { | .features_enabled = 0x0000000000000000, | .features_locked = 0x0000000000000000, | .gcspr_el0 = 0x900d900d900d900d, | } | SETREGSET(nt=0x410, len=24) wrote 24 bytes | | Attempting to read NT_ARM_GCS::user_gcs | GETREGSET(nt=0x410, len=24) read 24 bytes | Read NT_ARM_GCS::user_gcs = { | .features_enabled = 0x0000000000000000, | .features_locked = 0x0000000000000000, | .gcspr_el0 = 0x900d900d900d900d, | } | | Attempting partial write NT_ARM_GCS::user_gcs = { | .features_enabled = 0x0000000000000000, | .features_locked = 0x1de7ec7edbadc0de, | .gcspr_el0 = 0x1de7ec7edbadc0de, | } | SETREGSET(nt=0x410, len=8) wrote 8 bytes | | Attempting to read NT_ARM_GCS::user_gcs | GETREGSET(nt=0x410, len=24) read 24 bytes | Read NT_ARM_GCS::user_gcs = { | .features_enabled = 0x0000000000000000, | .features_locked = 0x0000000000000000, | .gcspr_el0 = 0x900d900d900d900d, | } Fixes: 7ec3b57cb29f ("arm64/ptrace: Expose GCS via ptrace and core files") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241205121655.1824269-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-05arm64: ptrace: fix partial SETREGSET for NT_ARM_POEMark Rutland
Currently poe_set() doesn't initialize the temporary 'ctrl' variable, and a SETREGSET call with a length of zero will leave this uninitialized. Consequently an arbitrary value will be written back to target->thread.por_el0, potentially leaking up to 64 bits of memory from the kernel stack. The read is limited to a specific slot on the stack, and the issue does not provide a write mechanism. Fix this by initializing the temporary value before copying the regset from userspace, as for other regsets (e.g. NT_PRSTATUS, NT_PRFPREG, NT_ARM_SYSTEM_CALL). In the case of a zero-length write, the existing contents of POR_EL1 will be retained. Before this patch: | # ./poe-test | Attempting to write NT_ARM_POE::por_el0 = 0x900d900d900d900d | SETREGSET(nt=0x40f, len=8) wrote 8 bytes | | Attempting to read NT_ARM_POE::por_el0 | GETREGSET(nt=0x40f, len=8) read 8 bytes | Read NT_ARM_POE::por_el0 = 0x900d900d900d900d | | Attempting to write NT_ARM_POE (zero length) | SETREGSET(nt=0x40f, len=0) wrote 0 bytes | | Attempting to read NT_ARM_POE::por_el0 | GETREGSET(nt=0x40f, len=8) read 8 bytes | Read NT_ARM_POE::por_el0 = 0xffff8000839c3d50 After this patch: | # ./poe-test | Attempting to write NT_ARM_POE::por_el0 = 0x900d900d900d900d | SETREGSET(nt=0x40f, len=8) wrote 8 bytes | | Attempting to read NT_ARM_POE::por_el0 | GETREGSET(nt=0x40f, len=8) read 8 bytes | Read NT_ARM_POE::por_el0 = 0x900d900d900d900d | | Attempting to write NT_ARM_POE (zero length) | SETREGSET(nt=0x40f, len=0) wrote 0 bytes | | Attempting to read NT_ARM_POE::por_el0 | GETREGSET(nt=0x40f, len=8) read 8 bytes | Read NT_ARM_POE::por_el0 = 0x900d900d900d900d Fixes: 175198199262 ("arm64/ptrace: add support for FEAT_POE") Cc: <stable@vger.kernel.org> # 6.12.x Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241205121655.1824269-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-05arm64: ptrace: fix partial SETREGSET for NT_ARM_FPMRMark Rutland
Currently fpmr_set() doesn't initialize the temporary 'fpmr' variable, and a SETREGSET call with a length of zero will leave this uninitialized. Consequently an arbitrary value will be written back to target->thread.uw.fpmr, potentially leaking up to 64 bits of memory from the kernel stack. The read is limited to a specific slot on the stack, and the issue does not provide a write mechanism. Fix this by initializing the temporary value before copying the regset from userspace, as for other regsets (e.g. NT_PRSTATUS, NT_PRFPREG, NT_ARM_SYSTEM_CALL). In the case of a zero-length write, the existing contents of FPMR will be retained. Before this patch: | # ./fpmr-test | Attempting to write NT_ARM_FPMR::fpmr = 0x900d900d900d900d | SETREGSET(nt=0x40e, len=8) wrote 8 bytes | | Attempting to read NT_ARM_FPMR::fpmr | GETREGSET(nt=0x40e, len=8) read 8 bytes | Read NT_ARM_FPMR::fpmr = 0x900d900d900d900d | | Attempting to write NT_ARM_FPMR (zero length) | SETREGSET(nt=0x40e, len=0) wrote 0 bytes | | Attempting to read NT_ARM_FPMR::fpmr | GETREGSET(nt=0x40e, len=8) read 8 bytes | Read NT_ARM_FPMR::fpmr = 0xffff800083963d50 After this patch: | # ./fpmr-test | Attempting to write NT_ARM_FPMR::fpmr = 0x900d900d900d900d | SETREGSET(nt=0x40e, len=8) wrote 8 bytes | | Attempting to read NT_ARM_FPMR::fpmr | GETREGSET(nt=0x40e, len=8) read 8 bytes | Read NT_ARM_FPMR::fpmr = 0x900d900d900d900d | | Attempting to write NT_ARM_FPMR (zero length) | SETREGSET(nt=0x40e, len=0) wrote 0 bytes | | Attempting to read NT_ARM_FPMR::fpmr | GETREGSET(nt=0x40e, len=8) read 8 bytes | Read NT_ARM_FPMR::fpmr = 0x900d900d900d900d Fixes: 4035c22ef7d4 ("arm64/ptrace: Expose FPMR via ptrace") Cc: <stable@vger.kernel.org> # 6.9.x Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241205121655.1824269-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-12-05arm64: ptrace: fix partial SETREGSET for NT_ARM_TAGGED_ADDR_CTRLMark Rutland
Currently tagged_addr_ctrl_set() doesn't initialize the temporary 'ctrl' variable, and a SETREGSET call with a length of zero will leave this uninitialized. Consequently tagged_addr_ctrl_set() will consume an arbitrary value, potentially leaking up to 64 bits of memory from the kernel stack. The read is limited to a specific slot on the stack, and the issue does not provide a write mechanism. As set_tagged_addr_ctrl() only accepts values where bits [63:4] zero and rejects other values, a partial SETREGSET attempt will randomly succeed or fail depending on the value of the uninitialized value, and the exposure is significantly limited. Fix this by initializing the temporary value before copying the regset from userspace, as for other regsets (e.g. NT_PRSTATUS, NT_PRFPREG, NT_ARM_SYSTEM_CALL). In the case of a zero-length write, the existing value of the tagged address ctrl will be retained. The NT_ARM_TAGGED_ADDR_CTRL regset is only visible in the user_aarch64_view used by a native AArch64 task to manipulate another native AArch64 task. As get_tagged_addr_ctrl() only returns an error value when called for a compat task, tagged_addr_ctrl_get() and tagged_addr_ctrl_set() should never observe an error value from get_tagged_addr_ctrl(). Add a WARN_ON_ONCE() to both to indicate that such an error would be unexpected, and error handlnig is not missing in either case. Fixes: 2200aa7154cb ("arm64: mte: ptrace: Add NT_ARM_TAGGED_ADDR_CTRL regset") Cc: <stable@vger.kernel.org> # 5.10.x Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241205121655.1824269-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-11-14Merge branches 'for-next/gcs', 'for-next/probes', 'for-next/asm-offsets', ↵Catalin Marinas
'for-next/tlb', 'for-next/misc', 'for-next/mte', 'for-next/sysreg', 'for-next/stacktrace', 'for-next/hwcap3', 'for-next/kselftest', 'for-next/crc32', 'for-next/guest-cca', 'for-next/haft' and 'for-next/scs', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: perf: Switch back to struct platform_driver::remove() perf: arm_pmuv3: Add support for Samsung Mongoose PMU dt-bindings: arm: pmu: Add Samsung Mongoose core compatible perf/dwc_pcie: Fix typos in event names perf/dwc_pcie: Add support for Ampere SoCs ARM: pmuv3: Add missing write_pmuacr() perf/marvell: Marvell PEM performance monitor support perf/arm_pmuv3: Add PMUv3.9 per counter EL0 access control perf/dwc_pcie: Convert the events with mixed case to lowercase perf/cxlpmu: Support missing events in 3.1 spec perf: imx_perf: add support for i.MX91 platform dt-bindings: perf: fsl-imx-ddr: Add i.MX91 compatible drivers perf: remove unused field pmu_node * for-next/gcs: (42 commits) : arm64 Guarded Control Stack user-space support kselftest/arm64: Fix missing printf() argument in gcs/gcs-stress.c arm64/gcs: Fix outdated ptrace documentation kselftest/arm64: Ensure stable names for GCS stress test results kselftest/arm64: Validate that GCS push and write permissions work kselftest/arm64: Enable GCS for the FP stress tests kselftest/arm64: Add a GCS stress test kselftest/arm64: Add GCS signal tests kselftest/arm64: Add test coverage for GCS mode locking kselftest/arm64: Add a GCS test program built with the system libc kselftest/arm64: Add very basic GCS test program kselftest/arm64: Always run signals tests with GCS enabled kselftest/arm64: Allow signals tests to specify an expected si_code kselftest/arm64: Add framework support for GCS to signal handling tests kselftest/arm64: Add GCS as a detected feature in the signal tests kselftest/arm64: Verify the GCS hwcap arm64: Add Kconfig for Guarded Control Stack (GCS) arm64/ptrace: Expose GCS via ptrace and core files arm64/signal: Expose GCS state in signal frames arm64/signal: Set up and restore the GCS context for signal handlers arm64/mm: Implement map_shadow_stack() ... * for-next/probes: : Various arm64 uprobes/kprobes cleanups arm64: insn: Simulate nop instruction for better uprobe performance arm64: probes: Remove probe_opcode_t arm64: probes: Cleanup kprobes endianness conversions arm64: probes: Move kprobes-specific fields arm64: probes: Fix uprobes for big-endian kernels arm64: probes: Fix simulate_ldr*_literal() arm64: probes: Remove broken LDR (literal) uprobe support * for-next/asm-offsets: : arm64 asm-offsets.c cleanup (remove unused offsets) arm64: asm-offsets: remove PREEMPT_DISABLE_OFFSET arm64: asm-offsets: remove DMA_{TO,FROM}_DEVICE arm64: asm-offsets: remove VM_EXEC and PAGE_SZ arm64: asm-offsets: remove MM_CONTEXT_ID arm64: asm-offsets: remove COMPAT_{RT_,SIGFRAME_REGS_OFFSET arm64: asm-offsets: remove VMA_VM_* arm64: asm-offsets: remove TSK_ACTIVE_MM * for-next/tlb: : TLB flushing optimisations arm64: optimize flush tlb kernel range arm64: tlbflush: add __flush_tlb_range_limit_excess() * for-next/misc: : Miscellaneous patches arm64: tls: Fix context-switching of tpidrro_el0 when kpti is enabled arm64/ptrace: Clarify documentation of VL configuration via ptrace acpi/arm64: remove unnecessary cast arm64/mm: Change protval as 'pteval_t' in map_range() arm64: uprobes: Optimize cache flushes for xol slot acpi/arm64: Adjust error handling procedure in gtdt_parse_timer_block() arm64: fix .data.rel.ro size assertion when CONFIG_LTO_CLANG arm64/ptdump: Test both PTE_TABLE_BIT and PTE_VALID for block mappings arm64/mm: Sanity check PTE address before runtime P4D/PUD folding arm64/mm: Drop setting PTE_TYPE_PAGE in pte_mkcont() ACPI: GTDT: Tighten the check for the array of platform timer structures arm64/fpsimd: Fix a typo arm64: Expose ID_AA64ISAR1_EL1.XS to sanitised feature consumers arm64: Return early when break handler is found on linked-list arm64/mm: Re-organize arch_make_huge_pte() arm64/mm: Drop _PROT_SECT_DEFAULT arm64: Add command-line override for ID_AA64MMFR0_EL1.ECV arm64: head: Drop SWAPPER_TABLE_SHIFT arm64: cpufeature: add POE to cpucap_is_possible() arm64/mm: Change pgattr_change_is_safe() arguments as pteval_t * for-next/mte: : Various MTE improvements selftests: arm64: add hugetlb mte tests hugetlb: arm64: add mte support * for-next/sysreg: : arm64 sysreg updates arm64/sysreg: Update ID_AA64MMFR1_EL1 to DDI0601 2024-09 * for-next/stacktrace: : arm64 stacktrace improvements arm64: preserve pt_regs::stackframe during exec*() arm64: stacktrace: unwind exception boundaries arm64: stacktrace: split unwind_consume_stack() arm64: stacktrace: report recovered PCs arm64: stacktrace: report source of unwind data arm64: stacktrace: move dump_backtrace() to kunwind_stack_walk() arm64: use a common struct frame_record arm64: pt_regs: swap 'unused' and 'pmr' fields arm64: pt_regs: rename "pmr_save" -> "pmr" arm64: pt_regs: remove stale big-endian layout arm64: pt_regs: assert pt_regs is a multiple of 16 bytes * for-next/hwcap3: : Add AT_HWCAP3 support for arm64 (also wire up AT_HWCAP4) arm64: Support AT_HWCAP3 binfmt_elf: Wire up AT_HWCAP3 at AT_HWCAP4 * for-next/kselftest: (30 commits) : arm64 kselftest fixes/cleanups kselftest/arm64: Try harder to generate different keys during PAC tests kselftest/arm64: Don't leak pipe fds in pac.exec_sign_all() kselftest/arm64: Corrupt P0 in the irritator when testing SSVE kselftest/arm64: Add FPMR coverage to fp-ptrace kselftest/arm64: Expand the set of ZA writes fp-ptrace does kselftets/arm64: Use flag bits for features in fp-ptrace assembler code kselftest/arm64: Enable build of PAC tests with LLVM=1 kselftest/arm64: Check that SVCR is 0 in signal handlers kselftest/arm64: Fix printf() compiler warnings in the arm64 syscall-abi.c tests kselftest/arm64: Fix printf() warning in the arm64 MTE prctl() test kselftest/arm64: Fix printf() compiler warnings in the arm64 fp tests kselftest/arm64: Fix build with stricter assemblers kselftest/arm64: Test signal handler state modification in fp-stress kselftest/arm64: Provide a SIGUSR1 handler in the kernel mode FP stress test kselftest/arm64: Implement irritators for ZA and ZT kselftest/arm64: Remove unused ADRs from irritator handlers kselftest/arm64: Correct misleading comments on fp-stress irritators kselftest/arm64: Poll less often while waiting for fp-stress children kselftest/arm64: Increase frequency of signal delivery in fp-stress kselftest/arm64: Fix encoding for SVE B16B16 test ... * for-next/crc32: : Optimise CRC32 using PMULL instructions arm64/crc32: Implement 4-way interleave using PMULL arm64/crc32: Reorganize bit/byte ordering macros arm64/lib: Handle CRC-32 alternative in C code * for-next/guest-cca: : Support for running Linux as a guest in Arm CCA arm64: Document Arm Confidential Compute virt: arm-cca-guest: TSM_REPORT support for realms arm64: Enable memory encrypt for Realms arm64: mm: Avoid TLBI when marking pages as valid arm64: Enforce bounce buffers for realm DMA efi: arm64: Map Device with Prot Shared arm64: rsi: Map unprotected MMIO as decrypted arm64: rsi: Add support for checking whether an MMIO is protected arm64: realm: Query IPA size from the RMM arm64: Detect if in a realm and set RIPAS RAM arm64: rsi: Add RSI definitions * for-next/haft: : Support for arm64 FEAT_HAFT arm64: pgtable: Warn unexpected pmdp_test_and_clear_young() arm64: Enable ARCH_HAS_NONLEAF_PMD_YOUNG arm64: Add support for FEAT_HAFT arm64: setup: name 'tcr2' register arm64/sysreg: Update ID_AA64MMFR1_EL1 register * for-next/scs: : Dynamic shadow call stack fixes arm64/scs: Drop unused prototype __pi_scs_patch_vmlinux() arm64/scs: Deal with 64-bit relative offsets in FDE frames arm64/scs: Fix handling of DWARF augmentation data in CIE/FDE frames
2024-11-12arm64/ptrace: Clarify documentation of VL configuration via ptraceMark Brown
When we configure SVE, SSVE or ZA via ptrace we allow the user to configure the vector length and specify any of the flags that are accepted when configuring via prctl(). This includes the S[VM]E_SET_VL_ONEXEC flag which defers the configuration of the VL until an exec(). We don't do anything to limit the provision of register data as part of configuring the _ONEXEC VL but as a function of the VL enumeration support we do this will be interpreted using the vector length currently configured for the process. This is all a bit surprising, and probably we should just not have allowed register data to be specified with _ONEXEC, but it's our ABI so let's add some explicit documentation in both the ABI documents and the source calling out what happens. The comments are also missing the fact that since SME does not have a mandatory 128 bit VL it is possible for VL enumeration to result in the configuration of a higher VL than was requested, cover that too. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241106-arm64-sve-ptrace-vl-set-v1-1-3b164e8b559c@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-10-04arm64/ptrace: Expose GCS via ptrace and core filesMark Brown
Provide a new register type NT_ARM_GCS reporting the current GCS mode and pointer for EL0. Due to the interactions with allocation and deallocation of Guarded Control Stacks we do not permit any changes to the GCS mode via ptrace, only GCSPR_EL0 may be changed. Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20241001-arm64-gcs-v13-27-222b78d87eee@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-09-04arm64/ptrace: add support for FEAT_POEJoey Gouly
Add a regset for POE containing POR_EL0. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20240822151113.1479789-21-joey.gouly@arm.com Signed-off-by: Will Deacon <will@kernel.org>
2024-04-03arm64/ptrace: Use saved floating point state type to determine SVE layoutMark Brown
The SVE register sets have two different formats, one of which is a wrapped version of the standard FPSIMD register set and another with actual SVE register data. At present we check TIF_SVE to see if full SVE register state should be provided when reading the SVE regset but if we were in a syscall we may have saved only floating point registers even though that is set. Fix this and simplify the logic by checking and using the format which we recorded when deciding if we should use FPSIMD or SVE format. Fixes: 8c845e273104 ("arm64/sve: Leave SVE enabled on syscall if we don't context switch") Cc: <stable@vger.kernel.org> # 6.2.x Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240325-arm64-ptrace-fp-type-v1-1-8dc846caf11f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-03-14Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "The major features are support for LPA2 (52-bit VA/PA with 4K and 16K pages), the dpISA extension and Rust enabled on arm64. The changes are mostly contained within the usual arch/arm64/, drivers/perf, the arm64 Documentation and kselftests. The exception is the Rust support which touches some generic build files. Summary: - Reorganise the arm64 kernel VA space and add support for LPA2 (at stage 1, KVM stage 2 was merged earlier) - 52-bit VA/PA address range with 4KB and 16KB pages - Enable Rust on arm64 - Support for the 2023 dpISA extensions (data processing ISA), host only - arm64 perf updates: - StarFive's StarLink (integrates one or more CPU cores with a shared L3 memory system) PMU support - Enable HiSilicon Erratum 162700402 quirk for HIP09 - Several updates for the HiSilicon PCIe PMU driver - Arm CoreSight PMU support - Convert all drivers under drivers/perf/ to use .remove_new() - Miscellaneous: - Don't enable workarounds for "rare" errata by default - Clean up the DAIF flags handling for EL0 returns (in preparation for NMI support) - Kselftest update for ptrace() - Update some of the sysreg field definitions - Slight improvement in the code generation for inline asm I/O accessors to permit offset addressing - kretprobes: acquire regs via a BRK exception (previously done via a trampoline handler) - SVE/SME cleanups, comment updates - Allow CALL_OPS+CC_OPTIMIZE_FOR_SIZE with clang (previously disabled due to gcc silently ignoring -falign-functions=N)" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (134 commits) Revert "mm: add arch hook to validate mmap() prot flags" Revert "arm64: mm: add support for WXN memory translation attribute" Revert "ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512" ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512 kselftest/arm64: Add 2023 DPISA hwcap test coverage kselftest/arm64: Add basic FPMR test kselftest/arm64: Handle FPMR context in generic signal frame parser arm64/hwcap: Define hwcaps for 2023 DPISA features arm64/ptrace: Expose FPMR via ptrace arm64/signal: Add FPMR signal handling arm64/fpsimd: Support FEAT_FPMR arm64/fpsimd: Enable host kernel access to FPMR arm64/cpufeature: Hook new identification registers up to cpufeature docs: perf: Fix build warning of hisi-pcie-pmu.rst perf: starfive: Only allow COMPILE_TEST for 64-bit architectures MAINTAINERS: Add entry for StarFive StarLink PMU docs: perf: Add description for StarFive's StarLink PMU dt-bindings: perf: starfive: Add JH8100 StarLink PMU perf: starfive: Add StarLink PMU support docs: perf: Update usage for target filter of hisi-pcie-pmu ...
2024-03-07Merge branches 'for-next/reorg-va-space', 'for-next/rust-for-arm64', ↵Catalin Marinas
'for-next/misc', 'for-next/daif-cleanup', 'for-next/kselftest', 'for-next/documentation', 'for-next/sysreg' and 'for-next/dpisa', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: (39 commits) docs: perf: Fix build warning of hisi-pcie-pmu.rst perf: starfive: Only allow COMPILE_TEST for 64-bit architectures MAINTAINERS: Add entry for StarFive StarLink PMU docs: perf: Add description for StarFive's StarLink PMU dt-bindings: perf: starfive: Add JH8100 StarLink PMU perf: starfive: Add StarLink PMU support docs: perf: Update usage for target filter of hisi-pcie-pmu drivers/perf: hisi_pcie: Merge find_related_event() and get_event_idx() drivers/perf: hisi_pcie: Relax the check on related events drivers/perf: hisi_pcie: Check the target filter properly drivers/perf: hisi_pcie: Add more events for counting TLP bandwidth drivers/perf: hisi_pcie: Fix incorrect counting under metric mode drivers/perf: hisi_pcie: Introduce hisi_pcie_pmu_get_event_ctrl_val() drivers/perf: hisi_pcie: Rename hisi_pcie_pmu_{config,clear}_filter() drivers/perf: hisi: Enable HiSilicon Erratum 162700402 quirk for HIP09 perf/arm_cspmu: Add devicetree support dt-bindings/perf: Add Arm CoreSight PMU perf/arm_cspmu: Simplify counter reset perf/arm_cspmu: Simplify attribute groups perf/arm_cspmu: Simplify initialisation ... * for-next/reorg-va-space: : Reorganise the arm64 kernel VA space in preparation for LPA2 support : (52-bit VA/PA). arm64: kaslr: Adjust randomization range dynamically arm64: mm: Reclaim unused vmemmap region for vmalloc use arm64: vmemmap: Avoid base2 order of struct page size to dimension region arm64: ptdump: Discover start of vmemmap region at runtime arm64: ptdump: Allow all region boundaries to be defined at boot time arm64: mm: Move fixmap region above vmemmap region arm64: mm: Move PCI I/O emulation region above the vmemmap region * for-next/rust-for-arm64: : Enable Rust support for arm64 arm64: rust: Enable Rust support for AArch64 rust: Refactor the build target to allow the use of builtin targets * for-next/misc: : Miscellaneous arm64 patches ARM64: Dynamically allocate cpumasks and increase supported CPUs to 512 arm64: Remove enable_daif macro arm64/hw_breakpoint: Directly use ESR_ELx_WNR for an watchpoint exception arm64: cpufeatures: Clean up temporary variable to simplify code arm64: Update setup_arch() comment on interrupt masking arm64: remove unnecessary ifdefs around is_compat_task() arm64: ftrace: Don't forbid CALL_OPS+CC_OPTIMIZE_FOR_SIZE with Clang arm64/sme: Ensure that all fields in SMCR_EL1 are set to known values arm64/sve: Ensure that all fields in ZCR_EL1 are set to known values arm64/sve: Document that __SVE_VQ_MAX is much larger than needed arm64: make member of struct pt_regs and it's offset macro in the same order arm64: remove unneeded BUILD_BUG_ON assertion arm64: kretprobes: acquire the regs via a BRK exception arm64: io: permit offset addressing arm64: errata: Don't enable workarounds for "rare" errata by default * for-next/daif-cleanup: : Clean up DAIF handling for EL0 returns arm64: Unmask Debug + SError in do_notify_resume() arm64: Move do_notify_resume() to entry-common.c arm64: Simplify do_notify_resume() DAIF masking * for-next/kselftest: : Miscellaneous arm64 kselftest patches kselftest/arm64: Test that ptrace takes effect in the target process * for-next/documentation: : arm64 documentation patches arm64/sme: Remove spurious 'is' in SME documentation arm64/fp: Clarify effect of setting an unsupported system VL arm64/sme: Fix cut'n'paste in ABI document arm64/sve: Remove bitrotted comment about syscall behaviour * for-next/sysreg: : sysreg updates arm64/sysreg: Update ID_AA64DFR0_EL1 register arm64/sysreg: Update ID_DFR0_EL1 register fields arm64/sysreg: Add register fields for ID_AA64DFR1_EL1 * for-next/dpisa: : Support for 2023 dpISA extensions kselftest/arm64: Add 2023 DPISA hwcap test coverage kselftest/arm64: Add basic FPMR test kselftest/arm64: Handle FPMR context in generic signal frame parser arm64/hwcap: Define hwcaps for 2023 DPISA features arm64/ptrace: Expose FPMR via ptrace arm64/signal: Add FPMR signal handling arm64/fpsimd: Support FEAT_FPMR arm64/fpsimd: Enable host kernel access to FPMR arm64/cpufeature: Hook new identification registers up to cpufeature
2024-03-07arm64/ptrace: Expose FPMR via ptraceMark Brown
Add a new regset to expose FPMR via ptrace. It is not added to the FPSIMD registers since that structure is exposed elsewhere without any allowance for extension we don't add there. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240306-arm64-2023-dpisa-v5-5-c568edc8ed7f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-02-28arm64: remove unnecessary ifdefs around is_compat_task()Leonardo Bras
Currently some parts of the codebase will test for CONFIG_COMPAT before testing is_compat_task(). is_compat_task() is a inlined function only present on CONFIG_COMPAT. On the other hand, for !CONFIG_COMPAT, we have in linux/compat.h: #define is_compat_task() (0) Since we have this define available in every usage of is_compat_task() for !CONFIG_COMPAT, it's unnecessary to keep the ifdefs, since the compiler is smart enough to optimize-out those snippets on CONFIG_COMPAT=n This requires some regset code as well as a few other defines to be made available on !CONFIG_COMPAT, so some symbols can get resolved before getting optimized-out. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Link: https://lore.kernel.org/r/20240109034651.478462-2-leobras@redhat.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2024-02-15arm64/sve: Lower the maximum allocation for the SVE ptrace regsetMark Brown
Doug Anderson observed that ChromeOS crashes are being reported which include failing allocations of order 7 during core dumps due to ptrace allocating storage for regsets: chrome: page allocation failure: order:7, mode:0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), nodemask=(null),cpuset=urgent,mems_allowed=0 ... regset_get_alloc+0x1c/0x28 elf_core_dump+0x3d8/0xd8c do_coredump+0xeb8/0x1378 with further investigation showing that this is: [ 66.957385] DOUG: Allocating 279584 bytes which is the maximum size of the SVE regset. As Doug observes it is not entirely surprising that such a large allocation of contiguous memory might fail on a long running system. The SVE regset is currently sized to hold SVE registers with a VQ of SVE_VQ_MAX which is 512, substantially more than the architectural maximum of 16 which we might see even in a system emulating the limits of the architecture. Since we don't expose the size we tell the regset core externally let's define ARCH_SVE_VQ_MAX with the actual architectural maximum and use that for the regset, we'll still overallocate most of the time but much less so which will be helpful even if the core is fixed to not require contiguous allocations. Specify ARCH_SVE_VQ_MAX in terms of the maximum value that can be written into ZCR_ELx.LEN (where this is set in the hardware). For consistency update the maximum SME vector length to be specified in the same style while we are at it. We could also teach the ptrace core about runtime discoverable regset sizes but that would be a more invasive change and this is being observed in practical systems. Reported-by: Doug Anderson <dianders@chromium.org> Signed-off-by: Mark Brown <broonie@kernel.org> Tested-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20240213-arm64-sve-ptrace-regset-size-v2-1-c7600ca74b9b@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2024-01-19Merge tag 'arm64-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fixes from Will Deacon: "I think the main one is fixing the dynamic SCS patching when full LTO is enabled (clang was silently getting this horribly wrong), but it's all good stuff. Rob just pointed out that the fix to the workaround for erratum #2966298 might not be necessary, but in the worst case it's harmless and since the official description leaves a little to be desired here, I've left it in. Summary: - Fix shadow call stack patching with LTO=full - Fix voluntary preemption of the FPSIMD registers from assembly code - Fix workaround for A520 CPU erratum #2966298 and extend to A510 - Fix SME issues that resulted in corruption of the register state - Minor fixes (missing includes, formatting)" * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64: Fix silcon-errata.rst formatting arm64/sme: Always exit sme_alloc() early with existing storage arm64/fpsimd: Remove spurious check for SVE support arm64/ptrace: Don't flush ZA/ZT storage when writing ZA via ptrace arm64: entry: simplify kernel_exit logic arm64: entry: fix ARM64_WORKAROUND_SPECULATIVE_UNPRIV_LOAD arm64: errata: Add Cortex-A510 speculative unprivileged load workaround arm64: Rename ARM64_WORKAROUND_2966298 arm64: fpsimd: Bring cond_yield asm macro in line with new rules arm64: scs: Work around full LTO issue with dynamic SCS arm64: irq: include <linux/cpumask.h>
2024-01-18arm64/ptrace: Don't flush ZA/ZT storage when writing ZA via ptraceMark Brown
When writing ZA we currently unconditionally flush the buffer used to store it as part of ensuring that it is allocated. Since this buffer is shared with ZT0 this means that a write to ZA when PSTATE.ZA is already set will corrupt the value of ZT0 on a SME2 system. Fix this by only flushing the backing storage if PSTATE.ZA was not previously set. This will mean that short or failed writes may leave stale data in the buffer, this seems as correct as our current behaviour and unlikely to be something that userspace will rely on. Fixes: f90b529bcbe5 ("arm64/sme: Implement ZT0 ptrace support") Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20240115-arm64-fix-ptrace-za-zt-v1-1-48617517028a@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2023-12-27rseq: Split out rseq.h from sched.hKent Overstreet
We're trying to get sched.h down to more or less just types only, not code - rseq can live in its own header. This helps us kill the dependency on preempt.h in sched.h. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-08-28Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "I think we have a bit less than usual on the architecture side, but that's somewhat balanced out by a large crop of perf/PMU driver updates and extensions to our selftests. CPU features and system registers: - Advertise hinted conditional branch support (FEAT_HBC) to userspace - Avoid false positive "SANITY CHECK" warning when xCR registers differ outside of the length field Documentation: - Fix macro name typo in SME documentation Entry code: - Unmask exceptions earlier on the system call entry path Memory management: - Don't bother clearing PTE_RDONLY for dirty ptes in pte_wrprotect() and pte_modify() Perf and PMU drivers: - Initial support for Coresight TRBE devices on ACPI systems (the coresight driver changes will come later) - Fix hw_breakpoint single-stepping when called from bpf - Fixes for DDR PMU on i.MX8MP SoC - Add NUMA-awareness to Hisilicon PCIe PMU driver - Fix locking dependency issue in Arm DMC620 PMU driver - Workaround Hisilicon erratum 162001900 in the SMMUv3 PMU driver - Add support for Arm CMN-700 r3 parts to the CMN PMU driver - Add support for recent Arm Cortex CPU PMUs - Update Hisilicon PMU maintainers Selftests: - Add a bunch of new features to the hwcap test (JSCVT, PMULL, AES, SHA1, etc) - Fix SSVE test to leave streaming-mode after grabbing the signal context - Add new test for SVE vector-length changes with SME enabled Miscellaneous: - Allow compiler to warn on suspicious looking system register expressions - Work around SDEI firmware bug by aborting any running handlers on a kernel crash - Fix some harmless warnings when building with W=1 - Remove some unused function declarations - Other minor fixes and cleanup" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (62 commits) drivers/perf: hisi: Update HiSilicon PMU maintainers arm_pmu: acpi: Add a representative platform device for TRBE arm_pmu: acpi: Refactor arm_spe_acpi_register_device() kselftest/arm64: Fix hwcaps selftest build hw_breakpoint: fix single-stepping when using bpf_overflow_handler arm64/sysreg: refactor deprecated strncpy kselftest/arm64: add jscvt feature to hwcap test kselftest/arm64: add pmull feature to hwcap test kselftest/arm64: add AES feature check to hwcap test kselftest/arm64: add SHA1 and related features to hwcap test arm64: sysreg: Generate C compiler warnings on {read,write}_sysreg_s arguments kselftest/arm64: build BTI tests in output directory perf/imx_ddr: don't enable counter0 if none of 4 counters are used perf/imx_ddr: speed up overflow frequency of cycle drivers/perf: hisi: Schedule perf session according to locality kselftest/arm64: fix a memleak in zt_regs_run() perf/arm-dmc620: Fix dmc620_pmu_irqs_lock/cpu_hotplug_lock circular lock dependency perf/smmuv3: Add MODULE_ALIAS for module auto loading perf/smmuv3: Enable HiSilicon Erratum 162001900 quirk for HIP08/09 kselftest/arm64: Size sycall-abi buffers for the actual maximum VL ...
2023-08-17arm64/ptrace: Ensure that the task sees ZT writes on first useMark Brown
When the value of ZT is set via ptrace we don't disable traps for SME. This means that when a the task has never used SME before then the value set via ptrace will never be seen by the target task since it will trigger a SME access trap which will flush the register state. Disable SME traps when setting ZT, this means we also need to allocate storage for SVE if it is not already allocated, for the benefit of streaming SVE. Fixes: f90b529bcbe5 ("arm64/sme: Implement ZT0 ptrace support") Signed-off-by: Mark Brown <broonie@kernel.org> Cc: <stable@vger.kernel.org> # 6.3.x Link: https://lore.kernel.org/r/20230816-arm64-zt-ptrace-first-use-v2-1-00aa82847e28@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-08-17arm64/ptrace: Ensure that SME is set up for target when writing SSVE stateMark Brown
When we use NT_ARM_SSVE to either enable streaming mode or change the vector length for a process we do not currently do anything to ensure that there is storage allocated for the SME specific register state. If the task had not previously used SME or we changed the vector length then the task will not have had TIF_SME set or backing storage for ZA/ZT allocated, resulting in inconsistent register sizes when saving state and spurious traps which flush the newly set register state. We should set TIF_SME to disable traps and ensure that storage is allocated for ZA and ZT if it is not already allocated. This requires modifying sme_alloc() to make the flush of any existing register state optional so we don't disturb existing state for ZA and ZT. Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Reported-by: David Spickett <David.Spickett@arm.com> Signed-off-by: Mark Brown <broonie@kernel.org> Cc: <stable@vger.kernel.org> # 5.19.x Link: https://lore.kernel.org/r/20230810-arm64-fix-ptrace-race-v1-1-a5361fad2bd6@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-08-04arm64/ptrace: Don't enable SVE when setting streaming SVEMark Brown
Systems which implement SME without also implementing SVE are architecturally valid but were not initially supported by the kernel, unfortunately we missed one issue in the ptrace code. The SVE register setting code is shared between SVE and streaming mode SVE. When we set full SVE register state we currently enable TIF_SVE unconditionally, in the case where streaming SVE is being configured on a system that supports vanilla SVE this is not an issue since we always initialise enough state for both vector lengths but on a system which only support SME it will result in us attempting to restore the SVE vector length after having set streaming SVE registers. Fix this by making the enabling of SVE conditional on setting SVE vector state. If we set streaming SVE state and SVE was not already enabled this will result in a SVE access trap on next use of normal SVE, this will cause us to flush our register state but this is fine since the only way to trigger a SVE access trap would be to exit streaming mode which will cause the in register state to be flushed anyway. Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-ssve-no-sve-v1-1-49df214bfb3e@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-08-03arm64/ptrace: Flush FP state when setting ZT0Mark Brown
When setting ZT0 via ptrace we do not currently force a reload of the floating point register state from memory, do that to ensure that the newly set value gets loaded into the registers on next task execution. The function was templated off the function for FPSIMD which due to our providing the option of embedding a FPSIMD regset within the SVE regset does not directly include the flush. Fixes: f90b529bcbe5 ("arm64/sme: Implement ZT0 ptrace support") Signed-off-by: Mark Brown <broonie@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230803-arm64-fix-ptrace-zt0-flush-v1-1-72e854eaf96e@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-07-27arm64/ptrace: Clean up error handling path in sve_set_common()Christophe JAILLET
All error handling paths go to 'out', except this one. Be consistent and also branch to 'out' here. Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/aa61301ed2dfd079b74b37f7fede5f179ac3087a.1689616473.git.christophe.jaillet@wanadoo.fr Signed-off-by: Will Deacon <will@kernel.org>
2023-02-21Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - Support for arm64 SME 2 and 2.1. SME2 introduces a new 512-bit architectural register (ZT0, for the look-up table feature) that Linux needs to save/restore - Include TPIDR2 in the signal context and add the corresponding kselftests - Perf updates: Arm SPEv1.2 support, HiSilicon uncore PMU updates, ACPI support to the Marvell DDR and TAD PMU drivers, reset DTM_PMU_CONFIG (ARM CMN) at probe time - Support for DYNAMIC_FTRACE_WITH_CALL_OPS on arm64 - Permit EFI boot with MMU and caches on. Instead of cleaning the entire loaded kernel image to the PoC and disabling the MMU and caches before branching to the kernel bare metal entry point, leave the MMU and caches enabled and rely on EFI's cacheable 1:1 mapping of all of system RAM to populate the initial page tables - Expose the AArch32 (compat) ELF_HWCAP features to user in an arm64 kernel (the arm32 kernel only defines the values) - Harden the arm64 shadow call stack pointer handling: stash the shadow stack pointer in the task struct on interrupt, load it directly from this structure - Signal handling cleanups to remove redundant validation of size information and avoid reading the same data from userspace twice - Refactor the hwcap macros to make use of the automatically generated ID registers. It should make new hwcaps writing less error prone - Further arm64 sysreg conversion and some fixes - arm64 kselftest fixes and improvements - Pointer authentication cleanups: don't sign leaf functions, unify asm-arch manipulation - Pseudo-NMI code generation optimisations - Minor fixes for SME and TPIDR2 handling - Miscellaneous updates: ARCH_FORCE_MAX_ORDER is now selectable, replace strtobool() to kstrtobool() in the cpufeature.c code, apply dynamic shadow call stack in two passes, intercept pfn changes in set_pte_at() without the required break-before-make sequence, attempt to dump all instructions on unhandled kernel faults * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (130 commits) arm64: fix .idmap.text assertion for large kernels kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests kselftest/arm64: Copy whole EXTRA context arm64: kprobes: Drop ID map text from kprobes blacklist perf: arm_spe: Print the version of SPE detected perf: arm_spe: Add support for SPEv1.2 inverted event filtering perf: Add perf_event_attr::config3 arm64/sme: Fix __finalise_el2 SMEver check drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable arm64/signal: Only read new data when parsing the ZT context arm64/signal: Only read new data when parsing the ZA context arm64/signal: Only read new data when parsing the SVE context arm64/signal: Avoid rereading context frame sizes arm64/signal: Make interface for restore_fpsimd_context() consistent arm64/signal: Remove redundant size validation from parse_user_sigframe() arm64/signal: Don't redundantly verify FPSIMD magic arm64/cpufeature: Use helper macros to specify hwcaps arm64/cpufeature: Always use symbolic name for feature value in hwcaps arm64/sysreg: Initial unsigned annotations for ID registers arm64/sysreg: Initial annotation of signed ID registers ...
2023-02-10Merge branches 'for-next/sysreg', 'for-next/sme', 'for-next/kselftest', ↵Catalin Marinas
'for-next/misc', 'for-next/sme2', 'for-next/tpidr2', 'for-next/scs', 'for-next/compat-hwcap', 'for-next/ftrace', 'for-next/efi-boot-mmu-on', 'for-next/ptrauth' and 'for-next/pseudo-nmi', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: perf: arm_spe: Print the version of SPE detected perf: arm_spe: Add support for SPEv1.2 inverted event filtering perf: Add perf_event_attr::config3 drivers/perf: fsl_imx8_ddr_perf: Remove set-but-not-used variable perf: arm_spe: Support new SPEv1.2/v8.7 'not taken' event perf: arm_spe: Use new PMSIDR_EL1 register enums perf: arm_spe: Drop BIT() and use FIELD_GET/PREP accessors arm64/sysreg: Convert SPE registers to automatic generation arm64: Drop SYS_ from SPE register defines perf: arm_spe: Use feature numbering for PMSEVFR_EL1 defines perf/marvell: Add ACPI support to TAD uncore driver perf/marvell: Add ACPI support to DDR uncore driver perf/arm-cmn: Reset DTM_PMU_CONFIG at probe drivers/perf: hisi: Extract initialization of "cpa_pmu->pmu" drivers/perf: hisi: Simplify the parameters of hisi_pmu_init() drivers/perf: hisi: Advertise the PERF_PMU_CAP_NO_EXCLUDE capability * for-next/sysreg: : arm64 sysreg and cpufeature fixes/updates KVM: arm64: Use symbolic definition for ISR_EL1.A arm64/sysreg: Add definition of ISR_EL1 arm64/sysreg: Add definition for ICC_NMIAR1_EL1 arm64/cpufeature: Remove 4 bit assumption in ARM64_FEATURE_MASK() arm64/sysreg: Fix errors in 32 bit enumeration values arm64/cpufeature: Fix field sign for DIT hwcap detection * for-next/sme: : SME-related updates arm64/sme: Optimise SME exit on syscall entry arm64/sme: Don't use streaming mode to probe the maximum SME VL arm64/ptrace: Use system_supports_tpidr2() to check for TPIDR2 support * for-next/kselftest: (23 commits) : arm64 kselftest fixes and improvements kselftest/arm64: Don't require FA64 for streaming SVE+ZA tests kselftest/arm64: Copy whole EXTRA context kselftest/arm64: Fix enumeration of systems without 128 bit SME for SSVE+ZA kselftest/arm64: Fix enumeration of systems without 128 bit SME kselftest/arm64: Don't require FA64 for streaming SVE tests kselftest/arm64: Limit the maximum VL we try to set via ptrace kselftest/arm64: Correct buffer size for SME ZA storage kselftest/arm64: Remove the local NUM_VL definition kselftest/arm64: Verify simultaneous SSVE and ZA context generation kselftest/arm64: Verify that SSVE signal context has SVE_SIG_FLAG_SM set kselftest/arm64: Remove spurious comment from MTE test Makefile kselftest/arm64: Support build of MTE tests with clang kselftest/arm64: Initialise current at build time in signal tests kselftest/arm64: Don't pass headers to the compiler as source kselftest/arm64: Remove redundant _start labels from FP tests kselftest/arm64: Fix .pushsection for strings in FP tests kselftest/arm64: Run BTI selftests on systems without BTI kselftest/arm64: Fix test numbering when skipping tests kselftest/arm64: Skip non-power of 2 SVE vector lengths in fp-stress kselftest/arm64: Only enumerate power of two VLs in syscall-abi ... * for-next/misc: : Miscellaneous arm64 updates arm64/mm: Intercept pfn changes in set_pte_at() Documentation: arm64: correct spelling arm64: traps: attempt to dump all instructions arm64: Apply dynamic shadow call stack patching in two passes arm64: el2_setup.h: fix spelling typo in comments arm64: Kconfig: fix spelling arm64: cpufeature: Use kstrtobool() instead of strtobool() arm64: Avoid repeated AA64MMFR1_EL1 register read on pagefault path arm64: make ARCH_FORCE_MAX_ORDER selectable * for-next/sme2: (23 commits) : Support for arm64 SME 2 and 2.1 arm64/sme: Fix __finalise_el2 SMEver check kselftest/arm64: Remove redundant _start labels from zt-test kselftest/arm64: Add coverage of SME 2 and 2.1 hwcaps kselftest/arm64: Add coverage of the ZT ptrace regset kselftest/arm64: Add SME2 coverage to syscall-abi kselftest/arm64: Add test coverage for ZT register signal frames kselftest/arm64: Teach the generic signal context validation about ZT kselftest/arm64: Enumerate SME2 in the signal test utility code kselftest/arm64: Cover ZT in the FP stress test kselftest/arm64: Add a stress test program for ZT0 arm64/sme: Add hwcaps for SME 2 and 2.1 features arm64/sme: Implement ZT0 ptrace support arm64/sme: Implement signal handling for ZT arm64/sme: Implement context switching for ZT0 arm64/sme: Provide storage for ZT0 arm64/sme: Add basic enumeration for SME2 arm64/sme: Enable host kernel to access ZT0 arm64/sme: Manually encode ZT0 load and store instructions arm64/esr: Document ISS for ZT0 being disabled arm64/sme: Document SME 2 and SME 2.1 ABI ... * for-next/tpidr2: : Include TPIDR2 in the signal context kselftest/arm64: Add test case for TPIDR2 signal frame records kselftest/arm64: Add TPIDR2 to the set of known signal context records arm64/signal: Include TPIDR2 in the signal context arm64/sme: Document ABI for TPIDR2 signal information * for-next/scs: : arm64: harden shadow call stack pointer handling arm64: Stash shadow stack pointer in the task struct on interrupt arm64: Always load shadow stack pointer directly from the task struct * for-next/compat-hwcap: : arm64: Expose compat ARMv8 AArch32 features (HWCAPs) arm64: Add compat hwcap SSBS arm64: Add compat hwcap SB arm64: Add compat hwcap I8MM arm64: Add compat hwcap ASIMDBF16 arm64: Add compat hwcap ASIMDFHM arm64: Add compat hwcap ASIMDDP arm64: Add compat hwcap FPHP and ASIMDHP * for-next/ftrace: : Add arm64 support for DYNAMICE_FTRACE_WITH_CALL_OPS arm64: avoid executing padding bytes during kexec / hibernation arm64: Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS arm64: ftrace: Update stale comment arm64: patching: Add aarch64_insn_write_literal_u64() arm64: insn: Add helpers for BTI arm64: Extend support for CONFIG_FUNCTION_ALIGNMENT ACPI: Don't build ACPICA with '-Os' Compiler attributes: GCC cold function alignment workarounds ftrace: Add DYNAMIC_FTRACE_WITH_CALL_OPS * for-next/efi-boot-mmu-on: : Permit arm64 EFI boot with MMU and caches on arm64: kprobes: Drop ID map text from kprobes blacklist arm64: head: Switch endianness before populating the ID map efi: arm64: enter with MMU and caches enabled arm64: head: Clean the ID map and the HYP text to the PoC if needed arm64: head: avoid cache invalidation when entering with the MMU on arm64: head: record the MMU state at primary entry arm64: kernel: move identity map out of .text mapping arm64: head: Move all finalise_el2 calls to after __enable_mmu * for-next/ptrauth: : arm64 pointer authentication cleanup arm64: pauth: don't sign leaf functions arm64: unify asm-arch manipulation * for-next/pseudo-nmi: : Pseudo-NMI code generation optimisations arm64: irqflags: use alternative branches for pseudo-NMI logic arm64: add ARM64_HAS_GIC_PRIO_RELAXED_SYNC cpucap arm64: make ARM64_HAS_GIC_PRIO_MASKING depend on ARM64_HAS_GIC_CPUIF_SYSREGS arm64: rename ARM64_HAS_IRQ_PRIO_MASKING to ARM64_HAS_GIC_PRIO_MASKING arm64: rename ARM64_HAS_SYSREG_GIC_CPUIF to ARM64_HAS_GIC_CPUIF_SYSREGS
2023-01-20arm64/sme: Implement ZT0 ptrace supportMark Brown
Implement support for a new note type NT_ARM64_ZT providing access to ZT0 when implemented. Since ZT0 is a register with constant size this is much simpler than for other SME state. As ZT0 is only accessible when PSTATE.ZA is set writes to ZT0 cause PSTATE.ZA to be set, the main alternative would be to return -EBUSY in this case but this seemed more constructive. Practical users are also going to be working with ZA anyway and have some understanding of the state. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-12-f2fa0aef982f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-20arm64/sme: Rename za_state to sme_stateMark Brown
In preparation for adding support for storage for ZT0 to the thread_struct rename za_state to sme_state. Since ZT0 is accessible when PSTATE.ZA is set just like ZA itself we will extend the allocation done for ZA to cover it, avoiding the need to further expand task_struct for non-SME tasks. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221208-arm64-sme2-v4-1-f2fa0aef982f@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-12arm64/ptrace: Use system_supports_tpidr2() to check for TPIDR2 supportMark Brown
We have a separate system_supports_tpidr2() to check for TPIDR2 support but were using system_supports_sme() in tls_set(). While these are currently identical let's use the specific check instead so we don't have any surprises in future. Reported-by: Will Deacon <will@kernel.org> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221208-arm64-tpidr2-ptrace-feat-v2-1-3760c895a574@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-01-05arm64: ptrace: Use ARM64_SME to guard the SME register enumerationsZenghui Yu
We currently guard REGSET_{SSVE, ZA} using ARM64_SVE for no good reason. Both enumerations would be pointless without ARM64_SME and create two empty entries in aarch64_regsets[] which would then become part of a process's native regset view (they should be ignored though). Switch to use ARM64_SME instead. Fixes: e12310a0d30f ("arm64/sme: Implement ptrace support for streaming mode SVE registers") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20221214135943.379-1-yuzenghui@huawei.com Signed-off-by: Will Deacon <will@kernel.org>
2022-12-12Merge tag 'mm-nonmm-stable-2022-12-12' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - A ptrace API cleanup series from Sergey Shtylyov - Fixes and cleanups for kexec from ye xingchen - nilfs2 updates from Ryusuke Konishi - squashfs feature work from Xiaoming Ni: permit configuration of the filesystem's compression concurrency from the mount command line - A series from Akinobu Mita which addresses bound checking errors when writing to debugfs files - A series from Yang Yingliang to address rapidio memory leaks - A series from Zheng Yejian to address possible overflow errors in encode_comp_t() - And a whole shower of singleton patches all over the place * tag 'mm-nonmm-stable-2022-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (79 commits) ipc: fix memory leak in init_mqueue_fs() hfsplus: fix bug causing custom uid and gid being unable to be assigned with mount rapidio: devices: fix missing put_device in mport_cdev_open kcov: fix spelling typos in comments hfs: Fix OOB Write in hfs_asc2mac hfs: fix OOB Read in __hfs_brec_find relay: fix type mismatch when allocating memory in relay_create_buf() ocfs2: always read both high and low parts of dinode link count io-mapping: move some code within the include guarded section kernel: kcsan: kcsan_test: build without structleak plugin mailmap: update email for Iskren Chernev eventfd: change int to __u64 in eventfd_signal() ifndef CONFIG_EVENTFD rapidio: fix possible UAF when kfifo_alloc() fails relay: use strscpy() is more robust and safer cpumask: limit visibility of FORCE_NR_CPUS acct: fix potential integer overflow in encode_comp_t() acct: fix accuracy loss for input value of encode_comp_t() linux/init.h: include <linux/build_bug.h> and <linux/stringify.h> rapidio: rio: fix possible name leak in rio_register_mport() rapidio: fix possible name leaks when rio_add_device() fails ...
2022-11-29arm64/fpsimd: SME no longer requires SVE register stateMark Brown
Now that we track the type of the stored register state separately to what is active in the task, it is valid to have the FPSIMD register state stored while in streaming mode. Remove the special case handling for SME when setting FPSIMD register state. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221115094640.112848-7-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-11-29arm64/fpsimd: Track the saved FPSIMD state type separately to TIF_SVEMark Brown
When we save the state for the floating point registers this can be done in the form visible through either the FPSIMD V registers or the SVE Z and P registers. At present we track which format is currently used based on TIF_SVE and the SME streaming mode state but particularly in the SVE case this limits our options for optimising things, especially around syscalls. Introduce a new enum which we place together with saved floating point state in both thread_struct and the KVM guest state which explicitly states which format is active and keep it up to date when we change it. At present we do not use this state except to verify that it has the expected value when loading the state, future patches will introduce functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221115094640.112848-3-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-11-15arm64: ptrace: user_regset_copyin_ignore() always returns 0Sergey Shtylyov
user_regset_copyin_ignore() always returns 0, so checking its result seems pointless -- don't do this anymore... Found by Linux Verification Center (linuxtesting.org) with the SVACE static analysis tool. Link: https://lkml.kernel.org/r/20221014212235.10770-4-s.shtylyov@omp.ru Signed-off-by: Sergey Shtylyov <s.shtylyov@omp.ru> Cc: Brian Cain <bcain@quicinc.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Jonas Bonn <jonas@southpole.se> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.osdn.me> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-10-06Merge tag 'arm64-upstream' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: - arm64 perf: DDR PMU driver for Alibaba's T-Head Yitian 710 SoC, SVE vector granule register added to the user regs together with SVE perf extensions documentation. - SVE updates: add HWCAP for SVE EBF16, update the SVE ABI documentation to match the actual kernel behaviour (zeroing the registers on syscall rather than "zeroed or preserved" previously). - More conversions to automatic system registers generation. - vDSO: use self-synchronising virtual counter access in gettimeofday() if the architecture supports it. - arm64 stacktrace cleanups and improvements. - arm64 atomics improvements: always inline assembly, remove LL/SC trampolines. - Improve the reporting of EL1 exceptions: rework BTI and FPAC exception handling, better EL1 undefs reporting. - Cortex-A510 erratum 2658417: remove BF16 support due to incorrect result. - arm64 defconfig updates: build CoreSight as a module, enable options necessary for docker, memory hotplug/hotremove, enable all PMUs provided by Arm. - arm64 ptrace() support for TPIDR2_EL0 (register provided with the SME extensions). - arm64 ftraces updates/fixes: fix module PLTs with mcount, remove unused function. - kselftest updates for arm64: simple HWCAP validation, FP stress test improvements, validation of ZA regs in signal handlers, include larger SVE and SME vector lengths in signal tests, various cleanups. - arm64 alternatives (code patching) improvements to robustness and consistency: replace cpucap static branches with equivalent alternatives, associate callback alternatives with a cpucap. - Miscellaneous updates: optimise kprobe performance of patching single-step slots, simplify uaccess_mask_ptr(), move MTE registers initialisation to C, support huge vmalloc() mappings, run softirqs on the per-CPU IRQ stack, compat (arm32) misalignment fixups for multiword accesses. * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (126 commits) arm64: alternatives: Use vdso/bits.h instead of linux/bits.h arm64/kprobe: Optimize the performance of patching single-step slot arm64: defconfig: Add Coresight as module kselftest/arm64: Handle EINTR while reading data from children kselftest/arm64: Flag fp-stress as exiting when we begin finishing up kselftest/arm64: Don't repeat termination handler for fp-stress ARM64: reloc_test: add __init/__exit annotations to module init/exit funcs arm64/mm: fold check for KFENCE into can_set_direct_map() arm64: ftrace: fix module PLTs with mcount arm64: module: Remove unused plt_entry_is_initialized() arm64: module: Make plt_equals_entry() static arm64: fix the build with binutils 2.27 kselftest/arm64: Don't enable v8.5 for MTE selftest builds arm64: uaccess: simplify uaccess_mask_ptr() arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header kselftest/arm64: Fix typo in hwcap check arm64: mte: move register initialization to C arm64: mm: handle ARM64_KERNEL_USES_PMD_MAPS in vmemmap_populate() arm64: dma: Drop cache invalidation from arch_dma_prep_coherent() arm64/sve: Add Perf extensions documentation ...
2022-09-30Merge branches 'for-next/doc', 'for-next/sve', 'for-next/sysreg', ↵Catalin Marinas
'for-next/gettimeofday', 'for-next/stacktrace', 'for-next/atomics', 'for-next/el1-exceptions', 'for-next/a510-erratum-2658417', 'for-next/defconfig', 'for-next/tpidr2_el0' and 'for-next/ftrace', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: arm64: asm/perf_regs.h: Avoid C++-style comment in UAPI header arm64/sve: Add Perf extensions documentation perf: arm64: Add SVE vector granule register to user regs MAINTAINERS: add maintainers for Alibaba' T-Head PMU driver drivers/perf: add DDR Sub-System Driveway PMU driver for Yitian 710 SoC docs: perf: Add description for Alibaba's T-Head PMU driver * for-next/doc: : Documentation/arm64 updates arm64/sve: Document our actual ABI for clearing registers on syscall * for-next/sve: : SVE updates arm64/sysreg: Add hwcap for SVE EBF16 * for-next/sysreg: (35 commits) : arm64 system registers generation (more conversions) arm64/sysreg: Fix a few missed conversions arm64/sysreg: Convert ID_AA64AFRn_EL1 to automatic generation arm64/sysreg: Convert ID_AA64DFR1_EL1 to automatic generation arm64/sysreg: Convert ID_AA64FDR0_EL1 to automatic generation arm64/sysreg: Use feature numbering for PMU and SPE revisions arm64/sysreg: Add _EL1 into ID_AA64DFR0_EL1 definition names arm64/sysreg: Align field names in ID_AA64DFR0_EL1 with architecture arm64/sysreg: Add defintion for ALLINT arm64/sysreg: Convert SCXTNUM_EL1 to automatic generation arm64/sysreg: Convert TIPDR_EL1 to automatic generation arm64/sysreg: Convert ID_AA64PFR1_EL1 to automatic generation arm64/sysreg: Convert ID_AA64PFR0_EL1 to automatic generation arm64/sysreg: Convert ID_AA64MMFR2_EL1 to automatic generation arm64/sysreg: Convert ID_AA64MMFR1_EL1 to automatic generation arm64/sysreg: Convert ID_AA64MMFR0_EL1 to automatic generation arm64/sysreg: Convert HCRX_EL2 to automatic generation arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 SME enumeration arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 BTI enumeration arm64/sysreg: Standardise naming of ID_AA64PFR1_EL1 fractional version fields arm64/sysreg: Standardise naming for MTE feature enumeration ... * for-next/gettimeofday: : Use self-synchronising counter access in gettimeofday() (if FEAT_ECV) arm64: vdso: use SYS_CNTVCTSS_EL0 for gettimeofday arm64: alternative: patch alternatives in the vDSO arm64: module: move find_section to header * for-next/stacktrace: : arm64 stacktrace cleanups and improvements arm64: stacktrace: track hyp stacks in unwinder's address space arm64: stacktrace: track all stack boundaries explicitly arm64: stacktrace: remove stack type from fp translator arm64: stacktrace: rework stack boundary discovery arm64: stacktrace: add stackinfo_on_stack() helper arm64: stacktrace: move SDEI stack helpers to stacktrace code arm64: stacktrace: rename unwind_next_common() -> unwind_next_frame_record() arm64: stacktrace: simplify unwind_next_common() arm64: stacktrace: fix kerneldoc comments * for-next/atomics: : arm64 atomics improvements arm64: atomic: always inline the assembly arm64: atomics: remove LL/SC trampolines * for-next/el1-exceptions: : Improve the reporting of EL1 exceptions arm64: rework BTI exception handling arm64: rework FPAC exception handling arm64: consistently pass ESR_ELx to die() arm64: die(): pass 'err' as long arm64: report EL1 UNDEFs better * for-next/a510-erratum-2658417: : Cortex-A510: 2658417: remove BF16 support due to incorrect result arm64: errata: remove BF16 HWCAP due to incorrect result on Cortex-A510 arm64: cpufeature: Expose get_arm64_ftr_reg() outside cpufeature.c arm64: cpufeature: Force HWCAP to be based on the sysreg visible to user-space * for-next/defconfig: : arm64 defconfig updates arm64: defconfig: Add Coresight as module arm64: Enable docker support in defconfig arm64: defconfig: Enable memory hotplug and hotremove config arm64: configs: Enable all PMUs provided by Arm * for-next/tpidr2_el0: : arm64 ptrace() support for TPIDR2_EL0 kselftest/arm64: Add coverage of TPIDR2_EL0 ptrace interface arm64/ptrace: Support access to TPIDR2_EL0 arm64/ptrace: Document extension of NT_ARM_TLS to cover TPIDR2_EL0 kselftest/arm64: Add test coverage for NT_ARM_TLS * for-next/ftrace: : arm64 ftraces updates/fixes arm64: ftrace: fix module PLTs with mcount arm64: module: Remove unused plt_entry_is_initialized() arm64: module: Make plt_equals_entry() static
2022-09-21arm64/ptrace: Support access to TPIDR2_EL0Mark Brown
SME introduces an additional EL0 register, TPIDR2_EL0, intended for use by userspace as part of the SME. Provide ptrace access to it through the existing NT_ARM_TLS regset used for TPIDR_EL0 by expanding it to two registers with TPIDR2_EL0 being the second one. Existing programs that query the size of the register set will be able to observe the increased size of the register set. Programs that assume the register set is single register will see no change. On systems that do not support SME TPIDR2_EL0 will read as 0 and writes will be ignored, support for SME should be queried via hwcaps as normal. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220829154921.837871-4-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-09arm64: stacktrace: rework stack boundary discoveryMark Rutland
In subsequent patches we'll want to acquire the stack boundaries ahead-of-time, and we'll need to be able to acquire the relevant stack_info regardless of whether we have an object the happens to be on the stack. This patch replaces the on_XXX_stack() helpers with stackinfo_get_XXX() helpers, with the caller being responsible for the checking whether an object is on a relevant stack. For the moment this is moved into the on_accessible_stack() functions, making these slightly larger; subsequent patches will remove the on_accessible_stack() functions and simplify the logic. The on_irq_stack() and on_task_stack() helpers are kept as these are used by IRQ entry sequences and stackleak respectively. As they're only used as predicates, the stack_info pointer parameter is removed in both cases. As the on_accessible_stack() functions are always passed a non-NULL info pointer, these now update info unconditionally. When updating the type to STACK_TYPE_UNKNOWN, the low/high bounds are also modified, but as these will not be consumed this should have no adverse affect. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Kalesh Singh <kaleshsingh@google.com> Reviewed-by: Madhavan T. Venkataraman <madvenka@linux.microsoft.com> Reviewed-by: Mark Brown <broonie@kernel.org> Cc: Fuad Tabba <tabba@google.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20220901130646.1316937-7-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-09-08arm64/ptrace: Don't clear calling process' TIF_SME on OOMMark Brown
If allocating memory for the target SVE state in za_set() fails we clear TIF_SME for the ptracing task which is obviously not correct. If we are here we know that the target task already had neither TIF_SVE nor TIF_SME set since we only need to allocate if either the target had not used either SVE or SME and had no need to allocate state before or we just changed the vector length with vec_set_vector_length() which clears TIF_ for us on allocation failure so just remove the clear entirely. Reported-by: Wang ShaoBo <bobo.shaobowang@huawei.com> Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220902132802.39682-1-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-08-23arm64/sme: Don't flush SVE register state when allocating SME storageMark Brown
Currently when taking a SME access trap we allocate storage for the SVE register state in order to be able to handle storage of streaming mode SVE. Due to the original usage in a purely SVE context the SVE register state allocation this also flushes the register state for SVE if storage was already allocated but in the SME context this is not desirable. For a SME access trap to be taken the task must not be in streaming mode so either there already is SVE register state present for regular SVE mode which would be corrupted or the task does not have TIF_SVE and the flush is redundant. Fix this by adding a flag to sve_alloc() indicating if we are in a SVE context and need to flush the state. Freshly allocated storage is always zeroed either way. Fixes: 8bd7f91c03d8 ("arm64/sme: Implement traps and syscall handling for SME") Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20220817182324.638214-4-broonie@kernel.org Signed-off-by: Will Deacon <will@kernel.org>
2022-05-16arm64/sme: Remove _EL0 from name of SVCR - FIXME sysreg.hMark Brown
The defines for SVCR call it SVCR_EL0 however the architecture calls the register SVCR with no _EL0 suffix. In preparation for generating the sysreg definitions rename to match the architecture, no functional change. Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20220510161208.631259-6-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-05-06arm64/sme: More sensibly define the size for the ZA register setMark Brown
Since the vector length configuration mechanism is identical between SVE and SME we share large elements of the code including the definition for the maximum vector length. Unfortunately when we were defining the ABI for SVE we included not only the actual maximum vector length of 2048 bits but also the value possible if all the bits reserved in the architecture for expansion of the LEN field were used, 16384 bits. This starts creating problems if we try to allocate anything for the ZA matrix based on the maximum possible vector length, as we do for the regset used with ptrace during the process of generating a core dump. While the maximum potential size for ZA with the current architecture is a reasonably managable 64K with the higher reserved limit ZA would be 64M which leads to entirely reasonable complaints from the memory management code when we try to allocate a buffer of that size. Avoid these issues by defining the actual maximum vector length for the architecture and using it for the SME regsets. Also use the full ZA_PT_SIZE() with the header rather than just the actual register payload when specifying the size, fixing support for the largest vector lengths now that we have this new, lower define. With the SVE maximum this did not cause problems due to the extra headroom we had. While we're at it add a comment clarifying why even though ZA is a single register we tell the regset code that it is a multi-register regset. Reported-by: Qian Cai <quic_qiancai@quicinc.com> Signed-off-by: Mark Brown <broonie@kernel.org> Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org> Link: https://lore.kernel.org/r/20220505221517.1642014-1-broonie@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>