summaryrefslogtreecommitdiff
path: root/arch/arm64/kvm/mmu.c
AgeCommit message (Collapse)Author
2021-06-18Merge branch kvm-arm64/mmu/stage2-cmos into kvmarm-master/nextMarc Zyngier
Cache maintenance updates from Yanan Wang, moving the CMOs down into the page-table code. This ensures that we only issue them when actually performing a mapping rather than upfront. * kvm-arm64/mmu/stage2-cmos: KVM: arm64: Move guest CMOs to the fault handlers KVM: arm64: Tweak parameters of guest cache maintenance functions KVM: arm64: Introduce mm_ops member for structure stage2_attr_data KVM: arm64: Introduce two cache maintenance callbacks
2021-06-18KVM: arm64: Move guest CMOs to the fault handlersYanan Wang
We currently uniformly perform CMOs of D-cache and I-cache in function user_mem_abort before calling the fault handlers. If we get concurrent guest faults(e.g. translation faults, permission faults) or some really unnecessary guest faults caused by BBM, CMOs for the first vcpu are necessary while the others later are not. By moving CMOs to the fault handlers, we can easily identify conditions where they are really needed and avoid the unnecessary ones. As it's a time consuming process to perform CMOs especially when flushing a block range, so this solution reduces much load of kvm and improve efficiency of the stage-2 page table code. We can imagine two specific scenarios which will gain much benefit: 1) In a normal VM startup, this solution will improve the efficiency of handling guest page faults incurred by vCPUs, when initially populating stage-2 page tables. 2) After live migration, the heavy workload will be resumed on the destination VM, however all the stage-2 page tables need to be rebuilt at the moment. So this solution will ease the performance drop during resuming stage. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210617105824.31752-5-wangyanan55@huawei.com
2021-06-18KVM: arm64: Tweak parameters of guest cache maintenance functionsYanan Wang
Adjust the parameter "kvm_pfn_t pfn" of __clean_dcache_guest_page and __invalidate_icache_guest_page to "void *va", which paves the way for converting these two guest CMO functions into callbacks in structure kvm_pgtable_mm_ops. No functional change. Reviewed-by: Fuad Tabba <tabba@google.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210617105824.31752-4-wangyanan55@huawei.com
2021-06-01KVM: arm64: Try stage2 block mapping for host device MMIOKeqian Zhu
The MMIO region of a device maybe huge (GB level), try to use block mapping in stage2 to speedup both map and unmap. Compared to normal memory mapping, we should consider two more points when try block mapping for MMIO region: 1. For normal memory mapping, the PA(host physical address) and HVA have same alignment within PUD_SIZE or PMD_SIZE when we use the HVA to request hugepage, so we don't need to consider PA alignment when verifing block mapping. But for device memory mapping, the PA and HVA may have different alignment. 2. For normal memory mapping, we are sure hugepage size properly fit into vma, so we don't check whether the mapping size exceeds the boundary of vma. But for device memory mapping, we should pay attention to this. This adds get_vma_page_shift() to get page shift for both normal memory and device MMIO region, and check these two points when selecting block mapping size for MMIO region. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210507110322.23348-3-zhukeqian1@huawei.com
2021-06-01KVM: arm64: Remove the creation time's mapping of MMIO regionsKeqian Zhu
The MMIO regions may be unmapped for many reasons and can be remapped by stage2 fault path. Map MMIO regions at creation time becomes a minor optimization and makes these two mapping path hard to sync. Remove the mapping code while keep the useful sanity check. Signed-off-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210507110322.23348-2-zhukeqian1@huawei.com
2021-05-15KVM: arm64: Fix boolreturn.cocci warningskernel test robot
arch/arm64/kvm/mmu.c:1114:9-10: WARNING: return of 0/1 in function 'kvm_age_gfn' with return type bool arch/arm64/kvm/mmu.c:1084:9-10: WARNING: return of 0/1 in function 'kvm_set_spte_gfn' with return type bool arch/arm64/kvm/mmu.c:1127:9-10: WARNING: return of 0/1 in function 'kvm_test_age_gfn' with return type bool arch/arm64/kvm/mmu.c:1070:9-10: WARNING: return of 0/1 in function 'kvm_unmap_gfn_range' with return type bool Return statements in functions returning bool should use true/false instead of 1/0. Generated by: scripts/coccinelle/misc/boolreturn.cocci Fixes: cd4c71835228 ("KVM: arm64: Convert to the gfn-based MMU notifier callbacks") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: kernel test robot <lkp@intel.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210426223357.GA45871@cd4295a34ed8
2021-04-23Merge tag 'kvmarm-5.13' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.13 New features: - Stage-2 isolation for the host kernel when running in protected mode - Guest SVE support when running in nVHE mode - Force W^X hypervisor mappings in nVHE mode - ITS save/restore for guests using direct injection with GICv4.1 - nVHE panics now produce readable backtraces - Guest support for PTP using the ptp_kvm driver - Performance improvements in the S2 fault handler - Alexandru is now a reviewer (not really a new feature...) Fixes: - Proper emulation of the GICR_TYPER register - Handle the complete set of relocation in the nVHE EL2 object - Get rid of the oprofile dependency in the PMU code (and of the oprofile body parts at the same time) - Debug and SPE fixes - Fix vcpu reset
2021-04-17KVM: arm64: Convert to the gfn-based MMU notifier callbacksSean Christopherson
Move arm64 to the gfn-base MMU notifier APIs, which do the hva->gfn lookup in common code. No meaningful functional change intended, though the exact order of operations is slightly different since the memslot lookups occur before calling into arch code. Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210402005658.3024832-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17KVM: Move arm64's MMU notifier trace events to generic codeSean Christopherson
Move arm64's MMU notifier trace events into common code in preparation for doing the hva->gfn lookup in common code. The alternative would be to trace the gfn instead of hva, but that's not obviously better and could also be done in common code. Tracing the notifiers is also quite handy for debug regardless of architecture. Remove a completely redundant tracepoint from PPC e500. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20210326021957.1424875-10-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-13Merge branch 'kvm-arm64/memslot-fixes' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2021-04-07KVM: arm64: Don't retrieve memory slot again in page fault handlerGavin Shan
We needn't retrieve the memory slot again in user_mem_abort() because the corresponding memory slot has been passed from the caller. This would save some CPU cycles. For example, the time used to write 1GB memory, which is backed by 2MB hugetlb pages and write-protected, is dropped by 6.8% from 928ms to 864ms. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210316041126.81860-4-gshan@redhat.com
2021-04-07KVM: arm64: Use find_vma_intersection()Gavin Shan
find_vma_intersection() has been existing to search the intersected vma. This uses the function where it's applicable, to simplify the code. Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210316041126.81860-3-gshan@redhat.com
2021-04-07KVM: arm64: Hide kvm_mmu_wp_memory_region()Gavin Shan
We needn't expose the function as it's only used by mmu.c since it was introduced by commit c64735554c0a ("KVM: arm: Add initial dirty page locking support"). Signed-off-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210316041126.81860-2-gshan@redhat.com
2021-03-19KVM: arm64: Use kvm_arch in kvm_s2_mmuQuentin Perret
In order to make use of the stage 2 pgtable code for the host stage 2, change kvm_s2_mmu to use a kvm_arch pointer in lieu of the kvm pointer, as the host will have the former but not the latter. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-21-qperret@google.com
2021-03-19KVM: arm64: Use kvm_arch for stage 2 pgtableQuentin Perret
In order to make use of the stage 2 pgtable code for the host stage 2, use struct kvm_arch in lieu of struct kvm as the host will have the former but not the latter. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-20-qperret@google.com
2021-03-19KVM: arm64: Elevate hypervisor mappings creation at EL2Quentin Perret
Previous commits have introduced infrastructure to enable the EL2 code to manage its own stage 1 mappings. However, this was preliminary work, and none of it is currently in use. Put all of this together by elevating the mapping creation at EL2 when memory protection is enabled. In this case, the host kernel running at EL1 still creates _temporary_ EL2 mappings, only used while initializing the hypervisor, but frees them right after. As such, all calls to create_hyp_mappings() after kvm init has finished turn into hypercalls, as the host now has no 'legal' way to modify the hypevisor page tables directly. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-19-qperret@google.com
2021-03-19KVM: arm64: Factor memory allocation out of pgtable.cQuentin Perret
In preparation for enabling the creation of page-tables at EL2, factor all memory allocation out of the page-table code, hence making it re-usable with any compatible memory allocator. No functional changes intended. Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210319100146.1149909-7-qperret@google.com
2021-03-12KVM: arm64: Fix exclusive limit for IPA sizeMarc Zyngier
When registering a memslot, we check the size and location of that memslot against the IPA size to ensure that we can provide guest access to the whole of the memory. Unfortunately, this check rejects memslot that end-up at the exact limit of the addressing capability for a given IPA size. For example, it refuses the creation of a 2GB memslot at 0x8000000 with a 32bit IPA space. Fix it by relaxing the check to accept a memslot reaching the limit of the IPA space. Fixes: c3058d5da222 ("arm/arm64: KVM: Ensure memslots are within KVM_PHYS_SIZE") Reviewed-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Reviewed-by: Andrew Jones <drjones@redhat.com> Link: https://lore.kernel.org/r/20210311100016.3830038-3-maz@kernel.org
2021-01-25KVM: arm64: Mark the page dirty only if the fault is handled successfullyYanan Wang
We now set the pfn dirty and mark the page dirty before calling fault handlers in user_mem_abort(), so we might end up having spurious dirty pages if update of permissions or mapping has failed. Let's move these two operations after the fault handlers, and they will be done only if the fault has been handled successfully. When an -EAGAIN errno is returned from the map handler, we hope to the vcpu to enter guest directly instead of exiting back to userspace, so adjust the return value at the end of function. Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210114121350.123684-4-wangyanan55@huawei.com
2020-12-20Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvmLinus Torvalds
Pull KVM updates from Paolo Bonzini: "Much x86 work was pushed out to 5.12, but ARM more than made up for it. ARM: - PSCI relay at EL2 when "protected KVM" is enabled - New exception injection code - Simplification of AArch32 system register handling - Fix PMU accesses when no PMU is enabled - Expose CSV3 on non-Meltdown hosts - Cache hierarchy discovery fixes - PV steal-time cleanups - Allow function pointers at EL2 - Various host EL2 entry cleanups - Simplification of the EL2 vector allocation s390: - memcg accouting for s390 specific parts of kvm and gmap - selftest for diag318 - new kvm_stat for when async_pf falls back to sync x86: - Tracepoints for the new pagetable code from 5.10 - Catch VFIO and KVM irqfd events before userspace - Reporting dirty pages to userspace with a ring buffer - SEV-ES host support - Nested VMX support for wait-for-SIPI activity state - New feature flag (AVX512 FP16) - New system ioctl to report Hyper-V-compatible paravirtualization features Generic: - Selftest improvements" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits) KVM: SVM: fix 32-bit compilation KVM: SVM: Add AP_JUMP_TABLE support in prep for AP booting KVM: SVM: Provide support to launch and run an SEV-ES guest KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests KVM: SVM: Provide support for SEV-ES vCPU loading KVM: SVM: Provide support for SEV-ES vCPU creation/loading KVM: SVM: Update ASID allocation to support SEV-ES guests KVM: SVM: Set the encryption mask for the SVM host save area KVM: SVM: Add NMI support for an SEV-ES guest KVM: SVM: Guest FPU state save/restore not needed for SEV-ES guest KVM: SVM: Do not report support for SMM for an SEV-ES guest KVM: x86: Update __get_sregs() / __set_sregs() to support SEV-ES KVM: SVM: Add support for CR8 write traps for an SEV-ES guest KVM: SVM: Add support for CR4 write traps for an SEV-ES guest KVM: SVM: Add support for CR0 write traps for an SEV-ES guest KVM: SVM: Add support for EFER write traps for an SEV-ES guest KVM: SVM: Support string IO operations for an SEV-ES guest KVM: SVM: Support MMIO for an SEV-ES guest KVM: SVM: Create trace events for VMGEXIT MSR protocol processing KVM: SVM: Create trace events for VMGEXIT processing ...
2020-12-02KVM: arm64: Add usage of stage 2 fault lookup level in user_mem_abort()Yanan Wang
If we get a FSC_PERM fault, just using (logging_active && writable) to determine calling kvm_pgtable_stage2_map(). There will be two more cases we should consider. (1) After logging_active is configged back to false from true. When we get a FSC_PERM fault with write_fault and adjustment of hugepage is needed, we should merge tables back to a block entry. This case is ignored by still calling kvm_pgtable_stage2_relax_perms(), which will lead to an endless loop and guest panic due to soft lockup. (2) We use (FSC_PERM && logging_active && writable) to determine collapsing a block entry into a table by calling kvm_pgtable_stage2_map(). But sometimes we may only need to relax permissions when trying to write to a page other than a block. In this condition,using kvm_pgtable_stage2_relax_perms() will be fine. The ISS filed bit[1:0] in ESR_EL2 regesiter indicates the stage2 lookup level at which a D-abort or I-abort occurred. By comparing granule of the fault lookup level with vma_pagesize, we can strictly distinguish conditions of calling kvm_pgtable_stage2_relax_perms() or kvm_pgtable_stage2_map(), and the above two cases will be well considered. Suggested-by: Keqian Zhu <zhukeqian1@huawei.com> Signed-off-by: Yanan Wang <wangyanan55@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20201201201034.116760-4-wangyanan55@huawei.com
2020-11-27Merge branch 'kvm-arm64/el2-pc' into kvmarm-master/nextMarc Zyngier
Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-12Merge tag 'v5.10-rc1' into kvmarm-master/nextMarc Zyngier
Linux 5.10-rc1 Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10KVM: arm64: Make kvm_skip_instr() and co private to HYPMarc Zyngier
In an effort to remove the vcpu PC manipulations from EL1 on nVHE systems, move kvm_skip_instr() to be HYP-specific. EL1's intent to increment PC post emulation is now signalled via a flag in the vcpu structure. Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-10KVM: arm64: Move kvm_vcpu_trap_il_is32bit into kvm_skip_instr32()Marc Zyngier
There is no need to feed the result of kvm_vcpu_trap_il_is32bit() to kvm_skip_instr(), as only AArch32 has a variable length ISA, and this helper can equally be called from kvm_skip_instr32(), reducing the complexity at all the call sites. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2020-11-08Merge tag 'kvmarm-fixes-5.10-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for v5.10, take #2 - Fix compilation error when PMD and PUD are folded - Fix regresssion of the RAZ behaviour of ID_AA64ZFR0_EL1
2020-11-06KVM: arm64: Fix build error in user_mem_abort()Gavin Shan
The PUD and PMD are folded into PGD when the following options are enabled. In that case, PUD_SHIFT is equal to PMD_SHIFT and we fail to build with the indicated errors: CONFIG_ARM64_VA_BITS_42=y CONFIG_ARM64_PAGE_SHIFT=16 CONFIG_PGTABLE_LEVELS=3 arch/arm64/kvm/mmu.c: In function ‘user_mem_abort’: arch/arm64/kvm/mmu.c:798:2: error: duplicate case value case PMD_SHIFT: ^~~~ arch/arm64/kvm/mmu.c:791:2: note: previously used here case PUD_SHIFT: ^~~~ This fixes the issue by skipping the check on PUD huge page when PUD and PMD are folded into PGD. Fixes: 2f40c46021bbb ("KVM: arm64: Use fallback mapping sizes for contiguous huge page sizes") Reported-by: Eric Auger <eric.auger@redhat.com> Signed-off-by: Gavin Shan <gshan@redhat.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201103003009.32955-1-gshan@redhat.com
2020-10-30Merge tag 'kvmarm-fixes-5.10-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.10, take #1 - Force PTE mapping on device pages provided via VFIO - Fix detection of cacheable mapping at S2 - Fallback to PMD/PTE mappings for composite huge pages - Fix accounting of Stage-2 PGD allocation - Fix AArch32 handling of some of the debug registers - Simplify host HYP entry - Fix stray pointer conversion on nVHE TLB invalidation - Fix initialization of the nVHE code - Simplify handling of capabilities exposed to HYP - Nuke VCPUs caught using a forbidden AArch32 EL0
2020-10-29KVM: arm64: Force PTE mapping on fault resulting in a device mappingSantosh Shukla
VFIO allows a device driver to resolve a fault by mapping a MMIO range. This can be subsequently result in user_mem_abort() to try and compute a huge mapping based on the MMIO pfn, which is a sure recipe for things to go wrong. Instead, force a PTE mapping when the pfn faulted in has a device mapping. Fixes: 6d674e28f642 ("KVM: arm/arm64: Properly handle faulting of device mappings") Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Santosh Shukla <sashukla@nvidia.com> [maz: rewritten commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1603711447-11998-2-git-send-email-sashukla@nvidia.com
2020-10-29KVM: arm64: Use fallback mapping sizes for contiguous huge page sizesGavin Shan
Although huge pages can be created out of multiple contiguous PMDs or PTEs, the corresponding sizes are not supported at Stage-2 yet. Instead of failing the mapping, fall back to the nearer supported mapping size (CONT_PMD to PMD and CONT_PTE to PTE respectively). Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Gavin Shan <gshan@redhat.com> [maz: rewritten commit message] Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20201025230626.18501-1-gshan@redhat.com
2020-10-20Merge tag 'kvmarm-5.10' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for Linux 5.10 - New page table code for both hypervisor and guest stage-2 - Introduction of a new EL2-private host context - Allow EL2 to have its own private per-CPU variables - Support of PMU event filtering - Complete rework of the Spectre mitigation
2020-10-02KVM: arm64: Ensure user_mem_abort() return value is initialisedWill Deacon
If a change in the MMU notifier sequence number forces user_mem_abort() to return early when attempting to handle a stage-2 fault, we return uninitialised stack to kvm_handle_guest_abort(), which could potentially result in the injection of an external abort into the guest or a spurious return to userspace. Neither or these are what we want to do. Initialise 'ret' to 0 in user_mem_abort() so that bailing due to a change in the MMU notrifier sequence number is treated as though the fault was handled. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Gavin Shan <gshan@redhat.com> Cc: Alexandru Elisei <alexandru.elisei@arm.com> Link: https://lore.kernel.org/r/20200930102442.16142-1-will@kernel.org
2020-09-20Merge tag 'kvmarm-fixes-5.9-2' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master KVM/arm64 fixes for 5.9, take #2 - Fix handling of S1 Page Table Walk permission fault at S2 on instruction fetch - Cleanup kvm_vcpu_dabt_iswrite()
2020-09-18KVM: arm64: Assume write fault on S1PTW permission fault on instruction fetchMarc Zyngier
KVM currently assumes that an instruction abort can never be a write. This is in general true, except when the abort is triggered by a S1PTW on instruction fetch that tries to update the S1 page tables (to set AF, for example). This can happen if the page tables have been paged out and brought back in without seeing a direct write to them (they are thus marked read only), and the fault handling code will make the PT executable(!) instead of writable. The guest gets stuck forever. In these conditions, the permission fault must be considered as a write so that the Stage-1 update can take place. This is essentially the I-side equivalent of the problem fixed by 60e21a0ef54c ("arm64: KVM: Take S1 walks into account when determining S2 write faults"). Update kvm_is_write_fault() to return true on IABT+S1PTW, and introduce kvm_vcpu_trap_is_exec_fault() that only return true when no faulting on a S1 fault. Additionally, kvm_vcpu_dabt_iss1tw() is renamed to kvm_vcpu_abt_iss1tw(), as the above makes it plain that it isn't specific to data abort. Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Will Deacon <will@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20200915104218.1284701-2-maz@kernel.org
2020-09-18KVM: arm64: Fix doc warnings in mmu codeXiaofei Tan
Fix following warnings caused by mismatch bewteen function parameters and comments. arch/arm64/kvm/mmu.c:128: warning: Function parameter or member 'mmu' not described in '__unmap_stage2_range' arch/arm64/kvm/mmu.c:128: warning: Function parameter or member 'may_block' not described in '__unmap_stage2_range' arch/arm64/kvm/mmu.c:128: warning: Excess function parameter 'kvm' description in '__unmap_stage2_range' arch/arm64/kvm/mmu.c:499: warning: Function parameter or member 'writable' not described in 'kvm_phys_addr_ioremap' arch/arm64/kvm/mmu.c:538: warning: Function parameter or member 'mmu' not described in 'stage2_wp_range' arch/arm64/kvm/mmu.c:538: warning: Excess function parameter 'kvm' description in 'stage2_wp_range' Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/1600307269-50957-1-git-send-email-tanxiaofei@huawei.com
2020-09-18KVM: arm64: Do not flush memslot if FWB is supportedAlexandru Elisei
As a result of a KVM_SET_USER_MEMORY_REGION ioctl, KVM flushes the dcache for the memslot being changed to ensure a consistent view of memory between the host and the guest: the host runs with caches enabled, and it is possible for the data written by the hypervisor to still be in the caches, but the guest is running with stage 1 disabled, meaning data accesses are to Device-nGnRnE memory, bypassing the caches entirely. Flushing the dcache is not necessary when KVM enables FWB, because it forces the guest to uses cacheable memory accesses. The current behaviour does not change, as the dcache flush helpers execute the cache operation only if FWB is not enabled, but walking the stage 2 table is avoided. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200915170442.131635-1-alexandru.elisei@arm.com
2020-09-18KVM: arm64: Try PMD block mappings if PUD mappings are not supportedAlexandru Elisei
When userspace uses hugetlbfs for the VM memory, user_mem_abort() tries to use the same block size to map the faulting IPA in stage 2. If stage 2 cannot the same block mapping because the block size doesn't fit in the memslot or the memslot is not properly aligned, user_mem_abort() will fall back to a page mapping, regardless of the block size. We can do better for PUD backed hugetlbfs by checking if a PMD block mapping is supported before deciding to use a page. vma_pagesize is an unsigned long, use 1UL instead of 1ULL when assigning its value. Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200910133351.118191-1-alexandru.elisei@arm.com
2020-09-11Merge tag 'kvmarm-fixes-5.9-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for Linux 5.9, take #1 - Multiple stolen time fixes, with a new capability to match x86 - Fix for hugetlbfs mappings when PUD and PMD are the same level - Fix for hugetlbfs mappings when PTE mappings are enforced (dirty logging, for example) - Fix tracing output of 64bit values
2020-09-11KVM: arm64: Remove unused 'pgd' field from 'struct kvm_s2_mmu'Will Deacon
The stage-2 page-tables are entirely encapsulated by the 'pgt' field of 'struct kvm_s2_mmu', so remove the unused 'pgd' field. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-21-will@kernel.org
2020-09-11KVM: arm64: Remove unused page-table codeWill Deacon
Now that KVM is using the generic page-table code to manage the guest stage-2 page-tables, we can remove a bunch of unused macros, #defines and static inline functions from the old implementation. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-20-will@kernel.org
2020-09-11KVM: arm64: Check the pgt instead of the pgd when modifying page-tableWill Deacon
In preparation for removing the 'pgd' field of 'struct kvm_s2_mmu', update the few remaining users to check the 'pgt' field instead. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-19-will@kernel.org
2020-09-11KVM: arm64: Convert user_mem_abort() to generic page-table APIWill Deacon
Convert user_mem_abort() to call kvm_pgtable_stage2_relax_perms() when handling a stage-2 permission fault and kvm_pgtable_stage2_map() when handling a stage-2 translation fault, rather than walking the page-table manually. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-18-will@kernel.org
2020-09-11KVM: arm64: Convert memslot cache-flushing code to generic page-table APIQuentin Perret
Convert stage2_flush_memslot() to call the kvm_pgtable_stage2_flush() function of the generic page-table code instead of walking the page-table directly. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200911132529.19844-16-will@kernel.org
2020-09-11KVM: arm64: Convert write-protect operation to generic page-table APIQuentin Perret
Convert stage2_wp_range() to call the kvm_pgtable_stage2_wrprotect() function of the generic page-table code instead of walking the page-table directly. Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200911132529.19844-14-will@kernel.org
2020-09-11KVM: arm64: Convert page-aging and access faults to generic page-table APIWill Deacon
Convert the page-aging functions and access fault handler to use the generic page-table code instead of walking the page-table directly. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-12-will@kernel.org
2020-09-11KVM: arm64: Convert unmap_stage2_range() to generic page-table APIWill Deacon
Convert unmap_stage2_range() to use kvm_pgtable_stage2_unmap() instead of walking the page-table directly. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-10-will@kernel.org
2020-09-11KVM: arm64: Convert kvm_set_spte_hva() to generic page-table APIWill Deacon
Convert kvm_set_spte_hva() to use kvm_pgtable_stage2_map() instead of stage2_set_pte(). Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-9-will@kernel.org
2020-09-11KVM: arm64: Convert kvm_phys_addr_ioremap() to generic page-table APIWill Deacon
Convert kvm_phys_addr_ioremap() to use kvm_pgtable_stage2_map() instead of stage2_set_pte(). Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-8-will@kernel.org
2020-09-11KVM: arm64: Add support for creating kernel-agnostic stage-2 page tablesWill Deacon
Introduce alloc() and free() functions to the generic page-table code for guest stage-2 page-tables and plumb these into the existing KVM page-table allocator. Subsequent patches will convert other operations within the KVM allocator over to the generic code. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Gavin Shan <gshan@redhat.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-6-will@kernel.org
2020-09-11KVM: arm64: Use generic allocator for hyp stage-1 page-tablesWill Deacon
Now that we have a shiny new page-table allocator, replace the hyp page-table code with calls into the new API. This also allows us to remove the extended idmap code, as we can now simply ensure that the VA size is large enough to map everything we need. Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Cc: Quentin Perret <qperret@google.com> Link: https://lore.kernel.org/r/20200911132529.19844-5-will@kernel.org