path: root/arch/arm64/include/asm
Age  Commit message  Author
2023-06-12  KVM: arm64: Remove alternatives from sysreg accessors in VHE hypervisor context  (Marc Zyngier)
In the VHE hypervisor code, we should be using the remapped VHE accessors, no ifs, no buts. No need to generate any alternative. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230609162200.2024064-9-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
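For illustration, a minimal sketch of what this means in practice (the helper below is hypothetical; the real accessors live in KVM's sysreg headers): VHE-only hypervisor code can use the *_EL12 encodings unconditionally instead of selecting between EL1 and EL12 forms with a runtime-patched ALTERNATIVE.

    /* Hypothetical sketch: with HCR_EL2.E2H set, a guest's EL1 register
     * is reachable from EL2 via its _EL12 encoding, so no alternative
     * is needed to choose between the EL1 and EL12 forms. */
    static inline u64 vhe_read_guest_sctlr(void)
    {
            return read_sysreg_s(SYS_SCTLR_EL12);
    }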
2023-06-12  arm64: Use CPACR_EL1 format to set CPTR_EL2 when E2H is set  (Marc Zyngier)
When HCR_EL2.E2H is set, the CPTR_EL2 register takes the CPACR_EL1 format. Yes, this is good fun. Hack the bits of startup code that assume E2H=0 while setting up CPTR_EL2 to make them grok the CPACR_EL1 format. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230609162200.2024064-8-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
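A rough sketch of the difference (constant names are in the style of asm/kvm_arm.h, but treat the exact identifiers as assumptions): "don't trap FP/SIMD" is a 2-bit enable field in the CPACR_EL1 layout, but a trap bit plus RES1 bits in the traditional CPTR_EL2 layout.

    u64 cptr;

    if (read_sysreg(hcr_el2) & HCR_E2H) {
            /* CPACR_EL1 layout: FPEN (bits [21:20]) = 0b11 disables the trap */
            cptr = CPACR_EL1_FPEN;
    } else {
            /* CPTR_EL2 layout: clearing TFP (bit 10) disables the trap,
             * and the RES1 bits must remain set */
            cptr = CPTR_EL2_DEFAULT & ~CPTR_EL2_TFP;
    }
    write_sysreg(cptr, cptr_el2);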
2023-06-12  arm64: Allow EL1 physical timer access when running VHE  (Marc Zyngier)
To initialise the timer access from EL2 when HCR_EL2.E2H is set, we must make sure that the CNTHCTL_EL2 format used is the appropriate one. This amounts to shifting the timer/counter enable bits by 10 to the left. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230609162200.2024064-7-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
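A simplified sketch of the fix-up (bit names per CNTHCTL_EL2; this is illustrative C, not the actual startup assembly):

    /* E2H=0 layout: EL1PCTEN is bit 0 and EL1PCEN is bit 1.
     * E2H=1 layout: the equivalent enables sit at bits 10 and 11,
     * i.e. the same value shifted left by 10. */
    u64 cnthctl = CNTHCTL_EL1PCTEN | CNTHCTL_EL1PCEN;

    if (read_sysreg(hcr_el2) & HCR_E2H)
            cnthctl <<= 10;

    write_sysreg(cnthctl, cnthctl_el2);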
2023-06-12  arm64: Add KVM_HVHE capability and has_hvhe() predicate  (Marc Zyngier)
Expose a capability keying the hVHE feature as well as a new predicate testing it. Nothing is so far using it, and nothing is enabling it yet. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230609162200.2024064-5-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-12  arm64: Turn kaslr_feature_override into a generic SW feature override  (Marc Zyngier)
Disabling KASLR from the command line is implemented as a feature override. Repaint it slightly so that it can further be used as more generic infrastructure for SW override purposes. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230609162200.2024064-4-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-12  arm64: Prevent the use of is_kernel_in_hyp_mode() in hypervisor code  (Marc Zyngier)
Using is_kernel_in_hyp_mode() in hypervisor code is a pretty bad mistake. This helper only checks for CurrentEL being EL2, which is always true. Make the compilation fail if using the helper in hypervisor context. Whilst we're at it, flag the helper as __always_inline, which it really should be. Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230609162200.2024064-3-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
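The resulting helper looks roughly like this (a sketch of the shape of the change, not a verbatim copy of the patch):

    static __always_inline bool is_kernel_in_hyp_mode(void)
    {
            /* Refuse to build if this lands in nVHE/VHE hypervisor
             * objects, where the answer is known at compile time anyway. */
            BUILD_BUG_ON(__is_defined(__KVM_NVHE_HYPERVISOR__) ||
                         __is_defined(__KVM_VHE_HYPERVISOR__));
            return read_sysreg(CurrentEL) == CurrentEL_EL2;
    }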
2023-06-12  KVM: arm64: Drop is_kernel_in_hyp_mode() from __invalidate_icache_guest_page()  (Marc Zyngier)
It is pretty obvious that is_kernel_in_hyp_mode() doesn't make much sense in the hypervisor part of KVM, and should be reserved to the kernel side. However, mem_protect.c::invalidate_icache_guest_page() calls into __invalidate_icache_guest_page(), which uses is_kernel_in_hyp_mode(). Given that this is part of the pKVM side of the hypervisor, this helper can only return true. Nothing goes really bad, but __invalidate_icache_guest_page() could spell out what the actual check is: we cannot invalidate the cache if the i-cache is VPIPT and we're running at EL1. Drop the is_kernel_in_hyp_mode() check in favour of an explicit check against CurrentEL being EL1 or not. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230609162200.2024064-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
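In other words, the check becomes an explicit exception-level test, along these lines (sketch only):

    /* With a VPIPT i-cache we cannot invalidate the guest's lines
     * from EL1; only proceed when running at EL2. */
    if (icache_is_vpipt() && read_sysreg(CurrentEL) == CurrentEL_EL1)
            return;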
2023-06-12  KVM: arm64: Rewrite IMPDEF PMU version as NI  (Oliver Upton)
KVM allows userspace to write an IMPDEF PMU version to the corresponding 32bit and 64bit ID register fields for the sake of backwards compatibility with kernels that lacked commit 3d0dba5764b9 ("KVM: arm64: PMU: Move the ID_AA64DFR0_EL1.PMUver limit to VM creation"). Plumbing that IMPDEF PMU version through to the guest is getting in the way of progress, and really doesn't make any sense in the first place. Bite the bullet and reinterpret the IMPDEF PMU version as NI (0) for userspace writes. Additionally, spill the dirty details into a comment. Link: https://lore.kernel.org/r/20230609190054.1542113-5-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2023-06-12  KVM: arm64: Make vCPU feature flags consistent VM-wide  (Oliver Upton)
To date KVM has allowed userspace to construct asymmetric VMs where particular features may only be supported on a subset of vCPUs. This wasn't really the intended usage pattern, and it is a total pain in the ass to keep working in the kernel. What's more, this is at odds with CPU features in host userspace, where asymmetric features are largely hidden or disabled. It's time to put an end to the whole game. Require all vCPUs in the VM to have the same feature set, rejecting deviants in the KVM_ARM_VCPU_INIT ioctl. Preserve some of the vestiges of per-vCPU feature flags in case we need to reinstate the old behavior for some limited configurations. Yes, this is a sign of cowardice around a user-visible change. Hoist all of the 32-bit limitations into kvm_vcpu_init_check_features() to avoid nested attempts to acquire the config_lock, which won't end well. Link: https://lore.kernel.org/r/20230609190054.1542113-4-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
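An illustrative sketch of the VM-wide check (the flag helper and field names here are assumptions, not the patch's exact code): the first KVM_ARM_VCPU_INIT fixes the VM-wide feature set, and later vCPUs must request exactly the same set.

    if (kvm_vm_has_configured_features(kvm)) {      /* hypothetical helper */
            /* a later vCPU deviating from the recorded set is rejected */
            if (!bitmap_equal(kvm->arch.vcpu_features, features,
                              KVM_VCPU_MAX_FEATURES))
                    return -EINVAL;
    } else {
            /* first vCPU: record the VM-wide feature set */
            bitmap_copy(kvm->arch.vcpu_features, features,
                        KVM_VCPU_MAX_FEATURES);
    }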
2023-06-12  KVM: arm64: Separate out feature sanitisation and initialisation  (Oliver Upton)
kvm_vcpu_set_target() iteratively sanitises and copies feature flags in one go. This is rather odd, especially considering the fact that bitmap accessors can do the heavy lifting. A subsequent change will make vCPU features VM-wide, and fitting that into the present implementation is just a chore. Rework the whole thing to use bitmap accessors to sanitise and copy flags. Link: https://lore.kernel.org/r/20230609190054.1542113-2-oliver.upton@linux.dev Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
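Sketched with bitmap accessors (simplified; treat system_supported_vcpu_features() as the kind of helper this implies, not a confirmed name):

    DECLARE_BITMAP(features, KVM_VCPU_MAX_FEATURES);

    bitmap_from_arr32(features, init->features, KVM_VCPU_MAX_FEATURES);

    /* reject anything the system cannot support */
    if (!bitmap_subset(features, system_supported_vcpu_features(),
                       KVM_VCPU_MAX_FEATURES))
            return -ENOENT;

    bitmap_copy(vcpu->arch.features, features, KVM_VCPU_MAX_FEATURES);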
2023-06-09  thread_info: move function declarations to linux/thread_info.h  (Arnd Bergmann)
There are a few __weak functions in kernel/fork.c, which architectures can override. If there is no prototype, the compiler warns about them: kernel/fork.c:164:13: error: no previous prototype for 'arch_release_task_struct' [-Werror=missing-prototypes] kernel/fork.c:991:20: error: no previous prototype for 'arch_task_cache_init' [-Werror=missing-prototypes] kernel/fork.c:1086:12: error: no previous prototype for 'arch_dup_task_struct' [-Werror=missing-prototypes] There are already prototypes in a number of architecture specific headers that have addressed those warnings before, but it's much better to have these in a single place so the warning no longer shows up anywhere. Link: https://lkml.kernel.org/r/20230517131102.934196-14-arnd@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christoph Lameter <cl@linux.com> Cc: Dennis Zhou <dennis@kernel.org> Cc: Eric Paris <eparis@redhat.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Moore <paul@paul-moore.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Waiman Long <longman@redhat.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
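The pattern in question, in miniature (illustrative):

    /* include/linux/thread_info.h: one shared prototype, so the
     * -Wmissing-prototypes warning disappears everywhere */
    void arch_task_cache_init(void);

    /* kernel/fork.c: default definition, overridable per architecture */
    void __weak arch_task_cache_init(void) { }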
2023-06-09  cachestat: wire up cachestat for other architectures  (Nhat Pham)
cachestat was previously only wired up for x86 (and architectures using the generic unistd.h table): https://lore.kernel.org/lkml/20230503013608.2431726-1-nphamcs@gmail.com/ This patch wires cachestat in for all the other architectures. [nphamcs@gmail.com: wire up cachestat for arm64] Link: https://lkml.kernel.org/r/20230511092843.3896327-1-nphamcs@gmail.com Link: https://lkml.kernel.org/r/20230510195806.2902878-1-nphamcs@gmail.com Signed-off-by: Nhat Pham <nphamcs@gmail.com> Tested-by: Michael Ellerman <mpe@ellerman.id.au> [powerpc] Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Reviewed-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Heiko Carstens <hca@linux.ibm.com> [s390] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Chris Zankel <chris@zankel.net> Cc: David S. Miller <davem@davemloft.net> Cc: Helge Deller <deller@gmx.de> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Rich Felker <dalias@libc.org> Cc: Russell King <linux@armlinux.org.uk> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
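For arm64, the wiring amounts to a compat syscall table entry along these lines (a sketch based on the generic numbering, where cachestat is syscall 451):

    /* arch/arm64/include/asm/unistd.h */
    #define __NR_compat_syscalls    452

    /* arch/arm64/include/asm/unistd32.h */
    #define __NR_cachestat 451
    __SYSCALL(__NR_cachestat, sys_cachestat)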
2023-06-08  arm64: set __exception_irq_entry with __irq_entry as a default  (Youngmin Nam)
filter_irq_stacks() is supposed to cut irq-related entries from a call stack, and in_irqentry_text(), which filter_irq_stacks() calls, uses the __irqentry_text_start/end symbols to find irq entries in the call stack. But this doesn't work correctly: without CONFIG_FUNCTION_GRAPH_TRACER, the arm64 kernel doesn't place gic_handle_irq, the arm64 irq entry point, between __irqentry_text_start and __irqentry_text_end, as we discussed in the link below. https://lore.kernel.org/all/CACT4Y+aReMGLYua2rCLHgFpS9io5cZC04Q8GLs-uNmrn1ezxYQ@mail.gmail.com/#t This problem can produce unintentionally deep call stacks, especially with KASAN enabled, as below.

[ 2479.383395]I[0:launcher-loader: 1719] Stack depot reached limit capacity
[ 2479.383538]I[0:launcher-loader: 1719] WARNING: CPU: 0 PID: 1719 at lib/stackdepot.c:129 __stack_depot_save+0x464/0x46c
[ 2479.385693]I[0:launcher-loader: 1719] pstate: 624000c5 (nZCv daIF +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
[ 2479.385724]I[0:launcher-loader: 1719] pc : __stack_depot_save+0x464/0x46c
[ 2479.385751]I[0:launcher-loader: 1719] lr : __stack_depot_save+0x460/0x46c
[ 2479.385774]I[0:launcher-loader: 1719] sp : ffffffc0080073c0
[ 2479.385793]I[0:launcher-loader: 1719] x29: ffffffc0080073e0 x28: ffffffd00b78a000 x27: 0000000000000000
[ 2479.385839]I[0:launcher-loader: 1719] x26: 000000000004d1dd x25: ffffff891474f000 x24: 00000000ca64d1dd
[ 2479.385882]I[0:launcher-loader: 1719] x23: 0000000000000200 x22: 0000000000000220 x21: 0000000000000040
[ 2479.385925]I[0:launcher-loader: 1719] x20: ffffffc008007440 x19: 0000000000000000 x18: 0000000000000000
[ 2479.385969]I[0:launcher-loader: 1719] x17: 2065726568207475 x16: 000000000000005e x15: 2d2d2d2d2d2d2d20
[ 2479.386013]I[0:launcher-loader: 1719] x14: 5d39313731203a72 x13: 00000000002f6b30 x12: 00000000002f6af8
[ 2479.386057]I[0:launcher-loader: 1719] x11: 00000000ffffffff x10: ffffffb90aacf000 x9 : e8a74a6c16008800
[ 2479.386101]I[0:launcher-loader: 1719] x8 : e8a74a6c16008800 x7 : 00000000002f6b30 x6 : 00000000002f6af8
[ 2479.386145]I[0:launcher-loader: 1719] x5 : ffffffc0080070c8 x4 : ffffffd00b192380 x3 : ffffffd0092b313c
[ 2479.386189]I[0:launcher-loader: 1719] x2 : 0000000000000001 x1 : 0000000000000004 x0 : 0000000000000022
[ 2479.386231]I[0:launcher-loader: 1719] Call trace:
[ 2479.386248]I[0:launcher-loader: 1719] __stack_depot_save+0x464/0x46c
[ 2479.386273]I[0:launcher-loader: 1719] kasan_save_stack+0x58/0x70
[ 2479.386303]I[0:launcher-loader: 1719] save_stack_info+0x34/0x138
[ 2479.386331]I[0:launcher-loader: 1719] kasan_save_free_info+0x18/0x24
[ 2479.386358]I[0:launcher-loader: 1719] ____kasan_slab_free+0x16c/0x170
[ 2479.386385]I[0:launcher-loader: 1719] __kasan_slab_free+0x10/0x20
[ 2479.386410]I[0:launcher-loader: 1719] kmem_cache_free+0x238/0x53c
[ 2479.386435]I[0:launcher-loader: 1719] mempool_free_slab+0x1c/0x28
[ 2479.386460]I[0:launcher-loader: 1719] mempool_free+0x7c/0x1a0
[ 2479.386484]I[0:launcher-loader: 1719] bvec_free+0x34/0x80
[ 2479.386514]I[0:launcher-loader: 1719] bio_free+0x60/0x98
[ 2479.386540]I[0:launcher-loader: 1719] bio_put+0x50/0x21c
[ 2479.386567]I[0:launcher-loader: 1719] f2fs_write_end_io+0x4ac/0x4d0
[ 2479.386594]I[0:launcher-loader: 1719] bio_endio+0x2dc/0x300
[ 2479.386622]I[0:launcher-loader: 1719] __dm_io_complete+0x324/0x37c
[ 2479.386650]I[0:launcher-loader: 1719] dm_io_dec_pending+0x60/0xa4
[ 2479.386676]I[0:launcher-loader: 1719] clone_endio+0xf8/0x2f0
[ 2479.386700]I[0:launcher-loader: 1719] bio_endio+0x2dc/0x300
[ 2479.386727]I[0:launcher-loader: 1719] blk_update_request+0x258/0x63c
[ 2479.386754]I[0:launcher-loader: 1719] scsi_end_request+0x50/0x304
[ 2479.386782]I[0:launcher-loader: 1719] scsi_io_completion+0x88/0x160
[ 2479.386808]I[0:launcher-loader: 1719] scsi_finish_command+0x17c/0x194
[ 2479.386833]I[0:launcher-loader: 1719] scsi_complete+0xcc/0x158
[ 2479.386859]I[0:launcher-loader: 1719] blk_mq_complete_request+0x4c/0x5c
[ 2479.386885]I[0:launcher-loader: 1719] scsi_done_internal+0xf4/0x1e0
[ 2479.386910]I[0:launcher-loader: 1719] scsi_done+0x14/0x20
[ 2479.386935]I[0:launcher-loader: 1719] ufshcd_compl_one_cqe+0x578/0x71c
[ 2479.386963]I[0:launcher-loader: 1719] ufshcd_mcq_poll_cqe_nolock+0xc8/0x150
[ 2479.386991]I[0:launcher-loader: 1719] ufshcd_intr+0x868/0xc0c
[ 2479.387017]I[0:launcher-loader: 1719] __handle_irq_event_percpu+0xd0/0x348
[ 2479.387044]I[0:launcher-loader: 1719] handle_irq_event_percpu+0x24/0x74
[ 2479.387068]I[0:launcher-loader: 1719] handle_irq_event+0x74/0xe0
[ 2479.387091]I[0:launcher-loader: 1719] handle_fasteoi_irq+0x174/0x240
[ 2479.387118]I[0:launcher-loader: 1719] handle_irq_desc+0x7c/0x2c0
[ 2479.387147]I[0:launcher-loader: 1719] generic_handle_domain_irq+0x1c/0x28
[ 2479.387174]I[0:launcher-loader: 1719] gic_handle_irq+0x64/0x158
[ 2479.387204]I[0:launcher-loader: 1719] call_on_irq_stack+0x2c/0x54
[ 2479.387231]I[0:launcher-loader: 1719] do_interrupt_handler+0x70/0xa0
[ 2479.387258]I[0:launcher-loader: 1719] el1_interrupt+0x34/0x68
[ 2479.387283]I[0:launcher-loader: 1719] el1h_64_irq_handler+0x18/0x24
[ 2479.387308]I[0:launcher-loader: 1719] el1h_64_irq+0x68/0x6c
[ 2479.387332]I[0:launcher-loader: 1719] blk_attempt_bio_merge+0x8/0x170
[ 2479.387356]I[0:launcher-loader: 1719] blk_mq_attempt_bio_merge+0x78/0x98
[ 2479.387383]I[0:launcher-loader: 1719] blk_mq_submit_bio+0x324/0xa40
[ 2479.387409]I[0:launcher-loader: 1719] __submit_bio+0x104/0x138
[ 2479.387436]I[0:launcher-loader: 1719] submit_bio_noacct_nocheck+0x1d0/0x4a0
[ 2479.387462]I[0:launcher-loader: 1719] submit_bio_noacct+0x618/0x804
[ 2479.387487]I[0:launcher-loader: 1719] submit_bio+0x164/0x180
[ 2479.387511]I[0:launcher-loader: 1719] f2fs_submit_read_bio+0xe4/0x1c4
[ 2479.387537]I[0:launcher-loader: 1719] f2fs_mpage_readpages+0x888/0xa4c
[ 2479.387563]I[0:launcher-loader: 1719] f2fs_readahead+0xd4/0x19c
[ 2479.387587]I[0:launcher-loader: 1719] read_pages+0xb0/0x4ac
[ 2479.387614]I[0:launcher-loader: 1719] page_cache_ra_unbounded+0x238/0x288
[ 2479.387642]I[0:launcher-loader: 1719] do_page_cache_ra+0x60/0x6c
[ 2479.387669]I[0:launcher-loader: 1719] page_cache_ra_order+0x318/0x364
[ 2479.387695]I[0:launcher-loader: 1719] ondemand_readahead+0x30c/0x3d8
[ 2479.387722]I[0:launcher-loader: 1719] page_cache_sync_ra+0xb4/0xc8
[ 2479.387749]I[0:launcher-loader: 1719] filemap_read+0x268/0xd24
[ 2479.387777]I[0:launcher-loader: 1719] f2fs_file_read_iter+0x1a0/0x62c
[ 2479.387806]I[0:launcher-loader: 1719] vfs_read+0x258/0x34c
[ 2479.387831]I[0:launcher-loader: 1719] ksys_pread64+0x8c/0xd0
[ 2479.387857]I[0:launcher-loader: 1719] __arm64_sys_pread64+0x48/0x54
[ 2479.387881]I[0:launcher-loader: 1719] invoke_syscall+0x58/0x158
[ 2479.387909]I[0:launcher-loader: 1719] el0_svc_common+0xf0/0x134
[ 2479.387935]I[0:launcher-loader: 1719] do_el0_svc+0x44/0x114
[ 2479.387961]I[0:launcher-loader: 1719] el0_svc+0x2c/0x80
[ 2479.387985]I[0:launcher-loader: 1719] el0t_64_sync_handler+0x48/0x114
[ 2479.388010]I[0:launcher-loader: 1719] el0t_64_sync+0x190/0x194
[ 2479.388038]I[0:launcher-loader: 1719] Kernel panic - not syncing: kernel: panic_on_warn set ...

So let's set __exception_irq_entry with __irq_entry as a default. With this patch applied, we can see gic_handle_irq included in System.map as below.

* Before
ffffffc008010000 T __do_softirq
ffffffc008010000 T __irqentry_text_end
ffffffc008010000 T __irqentry_text_start
ffffffc008010000 T __softirqentry_text_start
ffffffc008010000 T _stext
ffffffc00801066c T __softirqentry_text_end
ffffffc008010670 T __entry_text_start

* After
ffffffc008010000 T __irqentry_text_start
ffffffc008010000 T _stext
ffffffc008010000 t gic_handle_irq
ffffffc00801013c t gic_handle_irq
ffffffc008010294 T __irqentry_text_end
ffffffc008010298 T __do_softirq
ffffffc008010298 T __softirqentry_text_start
ffffffc008010904 T __softirqentry_text_end
ffffffc008010908 T __entry_text_start

Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com> Signed-off-by: SEO HOYOUNG <hy50.seo@samsung.com> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230424010436.779733-1-youngmin.nam@samsung.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
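The change itself is small; roughly (a sketch of asm/exception.h before and after):

    /* Before: only in .irqentry.text when the graph tracer is enabled */
    #ifdef CONFIG_FUNCTION_GRAPH_TRACER
    #define __exception_irq_entry   __irq_entry
    #else
    #define __exception_irq_entry   __kprobes
    #endif

    /* After: always place irq entry points in .irqentry.text */
    #define __exception_irq_entry   __irq_entry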
2023-06-07  arm64: cpufeature: fold cpus_set_cap() into update_cpu_capabilities()  (Mark Rutland)
We only use cpus_set_cap() in update_cpu_capabilities(), where we open-code an analogous update to boot_cpucaps. Due to the way the cpucap_ptrs[] array is initialized, we know that the capability number cannot be greater than or equal to ARM64_NCAPS, so the warning is superfluous. Fold cpus_set_cap() into update_cpu_capabilities(), matching what we do for the boot_cpucaps, and making the relationship between the two a bit clearer. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230607164846.3967305-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-07  arm64: alternatives: use cpucap naming  (Mark Rutland)
To more clearly align the various users of the cpucap enumeration, this patch changes the alternative code to use the term `cpucap` in favour of `feature`. The alternative_has_feature_{likely,unlikely}() functions are renamed to alternative_has_cap_{likely,unlikely}() to more clearly align with the cpus_have_{const_,}cap() helpers. At the same time remove the stale comment referring to the "ARM64_CB bit", which is evidently a typo for ARM64_CB_PATCH, which was removed in commit: 4c0bd995d73ed889 ("arm64: alternatives: have callbacks take a cap") There should be no functional change as a result of this patch; this is purely a renaming exercise. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230607164846.3967305-3-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-07  arm64: standardise cpucap bitmap names  (Mark Rutland)
The 'cpu_hwcaps' and 'boot_capabilities' bitmaps are bitmaps over the same enumerated bits, but they are named wildly differently for no good reason. The terms 'hwcaps' and 'capabilities' have become ambiguous over time (e.g. due to clashes with ELF hwcaps and the structures used to manage feature detection), and it would be nicer to use 'cpucaps', matching the <asm/cpucaps.h> header the enumerated bit indices are defined in. While this isn't a functional problem, it makes the code harder than necessary to understand, and hard to extend with related functionality (e.g. per-cpu cpucap bitmaps). To that end, this patch renames `boot_capabilities` to `boot_cpucaps` and `cpu_hwcaps` to `system_cpucaps`. This more clearly indicates the relationship between the two and aligns with terminology used elsewhere in our feature management code. This change was scripted with:

| find . -type f -name '*.[chS]' -print0 | \
|   xargs -0 sed -i 's/\<boot_capabilities\>/boot_cpucaps/'
| find . -type f -name '*.[chS]' -print0 | \
|   xargs -0 sed -i 's/\<cpu_hwcaps\>/system_cpucaps/'

... and the instance of "cpu_hwcap" (without a trailing "s") in <asm/mmu_context.h> corrected manually to "system_cpucaps". Subsequent patches will adjust the naming of related functions to better align with the `cpucap` naming. There should be no functional change as a result of this patch; this is purely a renaming exercise. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230607164846.3967305-2-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64/sysreg: Convert OSECCR_EL1 to automatic generation  (Mark Brown)
Convert OSECCR_EL1 to automatic generation as per DDI0601 2023-03, no functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20230419-arm64-syreg-gen-v2-7-4c6add1f6257@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64/sysreg: Convert OSDTRTX_EL1 to automatic generation  (Mark Brown)
Convert OSDTRTX_EL1 to automatic generation as per DDI0601 2023-03. No functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20230419-arm64-syreg-gen-v2-6-4c6add1f6257@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64/sysreg: Convert OSDTRRX_EL1 to automatic generation  (Mark Brown)
Convert OSDTRRX_EL1 to automatic generation as per DDI0601 2023-03, no functional changes. Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20230419-arm64-syreg-gen-v2-5-4c6add1f6257@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64/sysreg: Convert OSLAR_EL1 to automatic generation  (Mark Brown)
Convert OSLAR_EL1 to automatic generation as per DDI0601 2023-03. No functional change. Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20230419-arm64-syreg-gen-v2-4-4c6add1f6257@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64/sysreg: Standardise naming of bitfield constants in OSL[AS]R_EL1  (Mark Brown)
Our standard scheme for naming the constants for bitfields in system registers includes _ELx in the name but not the SYS_, update the constants for OSL[AS]R_EL1 to follow this convention. Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20230419-arm64-syreg-gen-v2-3-4c6add1f6257@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64/sysreg: Convert MDSCR_EL1 to automatic register generation  (Mark Brown)
Convert MDSCR_EL1 to automatic register generation as per DDI0601 2023-03. No functional change. Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20230419-arm64-syreg-gen-v2-2-4c6add1f6257@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64/sysreg: Convert MDCCINT_EL1 to automatic register generation  (Mark Brown)
Convert MDCCINT_EL1 to automatic register generation as per DDI0601 2023-03. No functional change. Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Shaoqin Huang <shahuang@redhat.com> Signed-off-by: Mark Brown <broonie@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20230419-arm64-syreg-gen-v2-1-4c6add1f6257@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64: module: rework module VA range selection  (Mark Rutland)
Currently, the modules region is 128M in size, which is a problem for some large modules. Shanker reports [1] that the NVIDIA GPU driver alone can consume 110M of module space in some configurations. We'd like to make the modules region a full 2G such that we can always make use of a 2G range. It's possible to build kernel images which are larger than 128M in some configurations, such as when many debug options are selected and many drivers are built in. In these configurations, we can't legitimately select a base for a 128M module region, though we currently select a value for which allocation will fail. It would be nicer to have a diagnostic message in this case. Similarly, in theory it's possible to build a kernel image which is larger than 2G and which cannot support modules. While this isn't likely to be the case for any realistic kernel deployed in the field, it would be nice if we could print a diagnostic in this case. This patch reworks the module VA range selection to use a 2G range, and improves handling of cases where we cannot select legitimate module regions. We now attempt to select a 128M region and a 2G region:

* The 128M region is selected such that modules can use direct branches (with JUMP26/CALL26 relocations) to branch to kernel code and other modules, and so that modules can reference data and text (using PREL32 relocations) anywhere in the kernel image and other modules. This region covers the entire kernel image (rather than just the text) to ensure that all PREL32 relocations are in range even when the kernel data section is absurdly large. Where we cannot allocate from this region, we'll fall back to the full 2G region.

* The 2G region is selected such that modules can use direct branches with PLTs to branch to kernel code and other modules, and so that modules can reference data and text (with PREL32 relocations) in the kernel image and other modules. This region covers the entire kernel image, and the 128M region (if one is selected).

The two module regions are randomized independently while ensuring the constraints described above. [1] https://lore.kernel.org/linux-arm-kernel/159ceeab-09af-3174-5058-445bc8dcf85b@nvidia.com/ Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Shanker Donthineni <sdonthineni@nvidia.com> Cc: Will Deacon <will@kernel.org> Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> Link: https://lore.kernel.org/r/20230530110328.2213762-7-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
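The resulting allocation strategy can be sketched as follows (simplified; module_direct_base and module_plt_base stand for the 128M and 2G range bases chosen at boot):

    void *module_alloc(unsigned long size)
    {
            void *p = NULL;

            /* try the 128M range first: direct branches, no PLTs needed */
            if (module_direct_base)
                    p = __vmalloc_node_range(size, MODULE_ALIGN,
                                             module_direct_base,
                                             module_direct_base + SZ_128M,
                                             GFP_KERNEL | __GFP_NOWARN,
                                             PAGE_KERNEL, 0, NUMA_NO_NODE,
                                             __builtin_return_address(0));

            /* fall back to the 2G range: branches go via PLTs */
            if (!p && module_plt_base)
                    p = __vmalloc_node_range(size, MODULE_ALIGN,
                                             module_plt_base,
                                             module_plt_base + SZ_2G,
                                             GFP_KERNEL, PAGE_KERNEL, 0,
                                             NUMA_NO_NODE,
                                             __builtin_return_address(0));
            return p;
    }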
2023-06-06  arm64: module: mandate MODULE_PLTS  (Mark Rutland)
Contemporary kernels and modules can be relatively large, especially when common debug options are enabled. Using GCC 12.1.0, a v6.3-rc7 defconfig kernel is ~38M, and with PROVE_LOCKING + KASAN_INLINE enabled this expands to ~117M. Shanker reports [1] that the NVIDIA GPU driver alone can consume 110M of module space in some configurations. Both KASLR and ARM64_ERRATUM_843419 select MODULE_PLTS, so anyone wanting a kernel to have KASLR or run on Cortex-A53 will have MODULE_PLTS selected. This is the case in defconfig and distribution kernels (e.g. Debian, Android, etc). Practically speaking, this means we're very likely to need MODULE_PLTS and while it's almost guaranteed that MODULE_PLTS will be selected, it is possible to disable support, and we have to maintain some awkward special cases for such unusual configurations. This patch removes the MODULE_PLTS config option, with the support code always enabled if MODULES is selected. This results in a slight simplification, and will allow for further improvement in subsequent patches. For any config which currently selects MODULE_PLTS, there will be no functional change as a result of this patch. [1] https://lore.kernel.org/linux-arm-kernel/159ceeab-09af-3174-5058-445bc8dcf85b@nvidia.com/ Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Shanker Donthineni <sdonthineni@nvidia.com> Cc: Will Deacon <will@kernel.org> Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> Link: https://lore.kernel.org/r/20230530110328.2213762-6-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64: module: move module randomization to module.c  (Mark Rutland)
When CONFIG_RANDOMIZE_BASE=y, module_alloc_base is a variable which is configured by kaslr_module_init() in kaslr.c, and otherwise it is an expression defined in module.h. As kaslr_module_init() is no longer tightly coupled with the KASLR initialization code, we can centralize this in module.c. This patch moves kaslr_module_init() to module.c, making module_alloc_base a static variable, and removing redundant includes from kaslr.c. For the definition of struct arm64_ftr_override we must include <asm/cpufeature.h>, which was previously included transitively via another header. There should be no functional change as a result of this patch. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Will Deacon <will@kernel.org> Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> Link: https://lore.kernel.org/r/20230530110328.2213762-5-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64: kaslr: split kaslr/module initialization  (Mark Rutland)
Currently kaslr_init() handles a mixture of detecting/announcing whether KASLR is enabled, and randomizing the module region depending on whether KASLR is enabled. To make it easier to rework the module region initialization, split the KASLR initialization into two steps:

* kaslr_init() determines whether KASLR should be enabled, and announces this choice, recording this to a new global boolean variable. This is called from setup_arch() just before the existing call to kaslr_requires_kpti() so that this will always provide the expected result.

* kaslr_module_init() randomizes the module region when required. This is called as a subsys_initcall, where we previously called kaslr_init().

As a bonus, moving the KASLR reporting earlier makes it easier to spot and permits it to be logged via earlycon, making it easier to debug any issues that could be triggered by KASLR. Booting a v6.4-rc1 kernel with this patch applied, the log looks like:

| EFI stub: Booting Linux Kernel...
| EFI stub: Generating empty DTB
| EFI stub: Exiting boot services...
| [ 0.000000] Booting Linux on physical CPU 0x0000000000 [0x000f0510]
| [ 0.000000] Linux version 6.4.0-rc1-00006-g4763a8f8aeb3 (mark@lakrids) (aarch64-linux-gcc (GCC) 12.1.0, GNU ld (GNU Binutils) 2.38) #2 SMP PREEMPT Tue May 9 11:03:37 BST 2023
| [ 0.000000] KASLR enabled
| [ 0.000000] earlycon: pl11 at MMIO 0x0000000009000000 (options '')
| [ 0.000000] printk: bootconsole [pl11] enabled

Signed-off-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Will Deacon <will@kernel.org> Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> Link: https://lore.kernel.org/r/20230530110328.2213762-4-mark.rutland@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64: add encodings of PIRx_ELx registers  (Joey Gouly)
The encodings used in the permission indirection registers mean that the values that Linux puts in the PTEs do not need to be changed. The E0 values are replicated in E1, with the execute permissions removed. This is needed as the futex operations access user mappings with privileged loads/stores. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230606145859.697944-16-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64: disable EL2 traps for PIE  (Joey Gouly)
Disable trapping of the TCR2_EL1 and PIRx_EL1 registers, so they can be accessed by EL1. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230606145859.697944-15-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64: reorganise PAGE_/PROT_ macros  (Joey Gouly)
Make these macros available to assembly code, so they can be re-used by the PIE initialisation code. This involves adding some extra macros, prefixed with _, that are the raw values rather than `pgprot` values. A dummy value for PTE_MAYBE_NG is also provided, for use in assembly. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230606145859.697944-14-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64: add PTE_WRITE to PROT_SECT_NORMAL  (Joey Gouly)
With PIE enabled, PROT_SECT_NORMAL would map onto PAGE_KERNEL_RO. Add PTE_WRITE so that this maps onto PAGE_KERNEL, so that it is writable. Without PIE, this should enable DBM for PROT_SECT_NORMAL. However PTE_RDONLY is already cleared, so the DBM mechanism is not used, and it is always writable, so this is functionally equivalent. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230606145859.697944-13-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64: add PTE_UXN/PTE_WRITE to SWAPPER_*_FLAGS  (Joey Gouly)
With PIE enabled, the swapper PTEs would have a Permission Indirection Index (PIIndex) of 0. A PIIndex of 0 is not currently used by any other PTEs. To avoid using index 0 specifically for the swapper PTEs, mark them as PTE_UXN and PTE_WRITE, so that they map to a PAGE_KERNEL_EXEC equivalent. This also adds PTE_WRITE to KPTI_NG_PTE_FLAGS, which was tested by booting with kpti=on. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230606145859.697944-12-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  KVM: arm64: Save/restore PIE registers  (Joey Gouly)
Define the new system registers that PIE introduces and context switch them. The PIE feature is still hidden from the ID register, and not exposed to a VM. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230606145859.697944-10-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  KVM: arm64: Save/restore TCR2_EL1  (Joey Gouly)
Define the new system register TCR2_EL1 and context switch it. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: James Morse <james.morse@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Zenghui Yu <yuzenghui@huawei.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230606145859.697944-9-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64: cpufeature: add system register ID_AA64MMFR3  (Joey Gouly)
Add new system register ID_AA64MMFR3 to the cpufeature infrastructure. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230606145859.697944-6-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-06  arm64/sysreg: add PIR*_ELx registers  (Joey Gouly)
Add definitions of PIR_EL1, PIR_EL12, PIRE0_EL1, PIRE0_EL12, and PIR_EL2 registers. Signed-off-by: Joey Gouly <joey.gouly@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Mark Brown <broonie@kernel.org> Reviewed-by: Mark Brown <broonie@kernel.org> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Link: https://lore.kernel.org/r/20230606145859.697944-5-joey.gouly@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-05  arm64/arch_timer: Provide noinstr sched_clock_read() functions  (Peter Zijlstra)
With the intent to provide local_clock_noinstr(), a variant of local_clock() that's safe to be called from noinstr code (with the assumption that any such code will already be non-preemptible), prepare for things by providing a noinstr sched_clock_read() function. Specifically, preempt_enable_*() calls out to schedule(), which upsets noinstr validation efforts. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Michael Kelley <mikelley@microsoft.com> # Hyper-V Link: https://lore.kernel.org/r/20230519102715.435618812@infradead.org
2023-06-05  arm64/io: Always inline all of __raw_{read,write}[bwlq]()  (Peter Zijlstra)
The next patch will want to use __raw_readl() from a noinstr section and as such that needs to be marked __always_inline to avoid the compiler being a silly bugger. Turns out it already is, but its siblings are not. Finish the work started in commit e43f1331e2ef913b ("arm64: Ask the compiler to __always_inline functions used by KVM at HYP") for consistency's sake. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Valentin Schneider <vschneid@redhat.com> Tested-by: Michael Kelley <mikelley@microsoft.com> # Hyper-V Link: https://lore.kernel.org/r/20230519102715.368919762@infradead.org
2023-06-05  arm64: mops: detect and enable FEAT_MOPS  (Kristina Martsenko)
The Arm v8.8/9.3 FEAT_MOPS feature provides new instructions that perform a memory copy or set. Wire up the cpufeature code to detect the presence of FEAT_MOPS and enable it. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20230509142235.3284028-10-kristina.martsenko@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-05  arm64: mops: handle MOPS exceptions  (Kristina Martsenko)
The memory copy/set instructions added as part of FEAT_MOPS can take an exception (e.g. page fault) part-way through their execution and resume execution afterwards. If however the task is re-scheduled and execution resumes on a different CPU, then the CPU may take a new type of exception to indicate this. This is because the architecture allows two options (Option A and Option B) to implement the instructions and a heterogeneous system can have different implementations between CPUs. In this case the OS has to reset the registers and restart execution from the prologue instruction. The algorithm for doing this is provided as part of the Arm ARM. Add an exception handler for the new exception and wire it up for userspace tasks. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20230509142235.3284028-8-kristina.martsenko@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
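The handler's job can be sketched like this (the helper below and the exact PC adjustment are illustrative assumptions; the precise register-reset rules come from the Arm ARM):

    static void do_el0_mops(struct pt_regs *regs, unsigned long esr)
    {
            /* Undo any partial progress recorded in the size/src/dst
             * registers, following the Option A/B rules in the Arm ARM. */
            arm64_mops_reset_regs(&regs->user_regs, esr);

            /* Resume from the prologue instruction of the CPYx/SETx
             * sequence; how far to step back is derived from the ESR. */
            regs->pc -= mops_prologue_offset(esr);  /* hypothetical helper */
    }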
2023-06-05  arm64: mops: don't disable host MOPS instructions from EL2  (Kristina Martsenko)
To allow nVHE host EL0 and EL1 to use FEAT_MOPS instructions, configure EL2 to not cause these instructions to be treated as UNDEFINED. A VHE host is unaffected by this control. Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Link: https://lore.kernel.org/r/20230509142235.3284028-6-kristina.martsenko@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-05  KVM: arm64: switch HCRX_EL2 between host and guest  (Kristina Martsenko)
Switch the HCRX_EL2 register between host and guest configurations, in order to enable different features in the host and guest. Now that there are separate guest flags, we can also remove SMPME from the host flags, as SMPME is used for virtualizing SME priorities and has no use in the host. Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20230509142235.3284028-4-kristina.martsenko@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-05  KVM: arm64: initialize HCRX_EL2  (Kristina Martsenko)
ARMv8.7/9.2 adds a new hypervisor configuration register HCRX_EL2. Initialize the register to a safe value (all fields 0), to be robust against firmware that has not initialized it. This is also needed to ensure that the register is reinitialized after a kexec by a future kernel. In addition, move SMPME setup over to the new flags, as it would otherwise get overridden. It is safe to set the bit even if SME is not (uniformly) supported, as it will write to a RES0 bit (having no effect), and SME will be disabled by the cpufeature framework. (Similar to how e.g. the API bit is handled in HCR_HOST_NVHE_FLAGS.) Signed-off-by: Kristina Martsenko <kristina.martsenko@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Oliver Upton <oliver.upton@linux.dev> Link: https://lore.kernel.org/r/20230509142235.3284028-2-kristina.martsenko@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2023-06-05  locking/atomic: make atomic*_{cmp,}xchg optional  (Mark Rutland)
Most architectures define the atomic/atomic64 xchg and cmpxchg operations in terms of arch_xchg and arch_cmpxchg respectively. Add fallbacks for these cases and remove the trivial cases from arch code. On some architectures the existing definitions are kept as these are used to build other arch_atomic*() operations. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20230605070124.3741859-5-mark.rutland@arm.com
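The fallback pattern looks roughly like this (simplified from the generated atomic fallback headers):

    /* only provided when the architecture hasn't defined its own */
    #ifndef arch_atomic_xchg
    #define arch_atomic_xchg(v, new) \
            arch_xchg(&(v)->counter, (new))
    #endif

    #ifndef arch_atomic_cmpxchg
    #define arch_atomic_cmpxchg(v, old, new) \
            arch_cmpxchg(&(v)->counter, (old), (new))
    #endif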
2023-06-05  arch: Remove cmpxchg_double  (Peter Zijlstra)
No moar users, remove the monster. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.991907085@infradead.org
2023-06-05  percpu: Wire up cmpxchg128  (Peter Zijlstra)
In order to replace cmpxchg_double() with the newly minted cmpxchg128() family of functions, wire it up in this_cpu_cmpxchg(). Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.654945124@infradead.org
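After this, a 16-byte per-CPU cmpxchg can be written with the usual accessor; a sketch (assuming a naturally aligned u128 per-CPU variable; variable and helper names are made up):

    static DEFINE_PER_CPU_ALIGNED(u128, pcp_slot);

    static bool swap_slot(u128 old, u128 new)
    {
            /* this_cpu_cmpxchg() returns the value observed before the op */
            return this_cpu_cmpxchg(pcp_slot, old, new) == old;
    }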
2023-06-05  arch: Introduce arch_{,try_}cmpxchg128{,_local}()  (Peter Zijlstra)
For all architectures that currently support cmpxchg_double() implement the cmpxchg128() family of functions that is basically the same but with a saner interface. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Tested-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20230531132323.452120708@infradead.org
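Compared to cmpxchg_double()'s two-pointer interface, the 128-bit form operates on one naturally aligned 128-bit object. A usage sketch (assuming the instrumented try_cmpxchg128() wrapper that this family implies):

    union pair {
            u128 full;
            struct { u64 lo, hi; };
    };

    static void pair_inc(union pair *p)
    {
            u128 old = p->full, new;

            do {
                    union pair tmp = { .full = old };
                    tmp.lo++;
                    tmp.hi++;       /* both halves updated atomically */
                    new = tmp.full;
            } while (!try_cmpxchg128(&p->full, &old, new));
    }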
2023-06-04  KVM: arm64: PMU: Don't overwrite PMUSERENR with vcpu loaded  (Reiji Watanabe)
Currently, with VHE, KVM sets the ER, CR, SW and EN bits of PMUSERENR_EL0 to 1 on vcpu_load(), and saves and restores the register value for the host on vcpu_load() and vcpu_put(). If those bits are cleared on a pCPU with a vCPU loaded (armv8pmu_start() would do that when PMU counters are programmed for the guest), PMU access from the guest EL0 might be trapped to the guest EL1 directly regardless of the current PMUSERENR_EL0 value of the vCPU. Fix this by not letting armv8pmu_start() overwrite PMUSERENR_EL0 on the pCPU where PMUSERENR_EL0 for the guest is loaded, and instead updating the saved shadow register value for the host so that the value can be restored on vcpu_put() later. While vcpu_{put,load}() are manipulating PMUSERENR_EL0, disable IRQs to prevent a race condition between these processes and IPIs that attempt to update PMUSERENR_EL0 for the host EL0. Suggested-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Marc Zyngier <maz@kernel.org> Fixes: 83a7a4d643d3 ("arm64: perf: Enable PMU counter userspace access for perf event") Signed-off-by: Reiji Watanabe <reijiw@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20230603025035.3781797-3-reijiw@google.com
2023-06-04  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm  (Linus Torvalds)
Pull kvm fixes from Paolo Bonzini:

 "ARM:

  - Address some fallout of the locking rework, this time affecting the way the vgic is configured

  - Fix an issue where the page table walker frees a subtree and then proceeds with walking what it has just freed...

  - Check that a given PA donated to the guest is actually memory (only affecting pKVM)

  - Correctly handle MTE CMOs by Set/Way

  - Fix the reported address of a watchpoint forwarded to userspace

  - Fix the freeing of the root of stage-2 page tables

  - Stop creating spurious PMU events to perform detection of the default PMU and use the existing PMU list instead

 x86:

  - Fix a memslot lookup bug in the NX recovery thread that could theoretically let userspace bypass the NX hugepage mitigation

  - Fix a s/BLOCKING/PENDING bug in SVM's vNMI support

  - Account exit stats for fastpath VM-Exits that never leave the super tight run-loop

  - Fix an out-of-bounds bug in the optimized APIC map code, and add a regression test for the race"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: selftests: Add test for race in kvm_recalculate_apic_map()
  KVM: x86: Bail from kvm_recalculate_phys_map() if x2APIC ID is out-of-bounds
  KVM: x86: Account fastpath-only VM-Exits in vCPU stats
  KVM: SVM: vNMI pending bit is V_NMI_PENDING_MASK not V_NMI_BLOCKING_MASK
  KVM: x86/mmu: Grab memslot for correct address space in NX recovery worker
  KVM: arm64: Document default vPMU behavior on heterogeneous systems
  KVM: arm64: Iterate arm_pmus list to probe for default PMU
  KVM: arm64: Drop last page ref in kvm_pgtable_stage2_free_removed()
  KVM: arm64: Populate fault info for watchpoint
  KVM: arm64: Reload PTE after invoking walker callback on preorder traversal
  KVM: arm64: Handle trap of tagged Set/Way CMOs
  arm64: Add missing Set/Way CMO encodings
  KVM: arm64: Prevent unconditional donation of unmapped regions from the host
  KVM: arm64: vgic: Fix a comment
  KVM: arm64: vgic: Fix locking comment
  KVM: arm64: vgic: Wrap vgic_its_create() with config_lock
  KVM: arm64: vgic: Fix a circular locking issue
2023-06-01  KVM: arm64: pkvm: Add support for fragmented FF-A descriptors  (Quentin Perret)
FF-A memory descriptors may need to be sent in fragments when they don't fit in the mailboxes. Doing so involves using the FRAG_TX and FRAG_RX primitives defined in the FF-A protocol. Add support in the pKVM FF-A relayer for fragmented descriptors by monitoring outgoing FRAG_TX transactions and by buffering large descriptors on the reclaim path. Co-developed-by: Andrew Walbran <qwandor@google.com> Signed-off-by: Andrew Walbran <qwandor@google.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20230523101828.7328-11-will@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
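The transmit side of the relayer then becomes a loop along these lines (heavily simplified sketch; all helper names here are invented stand-ins for the FFA_MEM_SHARE and FFA_MEM_FRAG_TX calls, and error handling is elided):

    static int ffa_send_fragments(u64 handle, const char *desc,
                                  size_t total_len, size_t mbox_size)
    {
            /* the first chunk goes out with the MEM_SHARE/MEM_LEND call
             * itself; anything left over is streamed with FRAG_TX,
             * tagged by the handle of the transaction */
            size_t sent = min(mbox_size, total_len);

            while (sent < total_len) {
                    size_t n = min(mbox_size, total_len - sent);

                    memcpy(ffa_tx_buffer(), desc + sent, n);
                    ffa_call_mem_frag_tx(handle, n);    /* FFA_MEM_FRAG_TX */
                    sent += n;
            }

            return 0;
    }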