summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2022-06-20KVM: Don't WARN if kvm_pfn_to_page() encounters a "reserved" pfnSean Christopherson
Drop a WARN_ON() if kvm_pfn_to_page() encounters a "reserved" pfn, which in this context means a struct page that has PG_reserved but is not a/the ZERO_PAGE and is not a ZONE_DEVICE page. The usage, via gfn_to_page(), in x86 is safe as gfn_to_page() is used only to retrieve a page from KVM-controlled memslot, but the usage in PPC and s390 operates on arbitrary gfns and thus memslots that can be backed by incompatible memory. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220429010416.2788472-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: nVMX: Use kvm_vcpu_map() to get/pin vmcs12's APIC-access pageSean Christopherson
Use kvm_vcpu_map() to get/pin the backing for vmcs12's APIC-access page, there's no reason it has to be restricted to 'struct page' backing. The APIC-access page actually doesn't need to be backed by anything, which is ironically why it got left behind by the series which introduced kvm_vcpu_map()[1]; the plan was to shove a dummy pfn into vmcs02[2], but that code never got merged. Switching the APIC-access page to kvm_vcpu_map() doesn't preclude using a magic pfn in the future, and will allow a future patch to drop kvm_vcpu_gpa_to_page(). [1] https://lore.kernel.org/all/1547026933-31226-1-git-send-email-karahmed@amazon.de [2] https://lore.kernel.org/lkml/1543845551-4403-1-git-send-email-karahmed@amazon.de Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220429010416.2788472-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: Avoid pfn_to_page() and vice versa when releasing pagesSean Christopherson
Invert the order of KVM's page/pfn release helpers so that the "inner" helper operates on a page instead of a pfn. As pointed out by Linus[*], converting between struct page and a pfn isn't necessarily cheap, and that's not even counting the overhead of is_error_noslot_pfn() and kvm_is_reserved_pfn(). Even if the checks were dirt cheap, there's no reason to convert from a page to a pfn and back to a page, just to mark the page dirty/accessed or to put a reference to the page. Opportunistically drop a stale declaration of kvm_set_page_accessed() from kvm_host.h (there was no implementation). No functional change intended. [*] https://lore.kernel.org/all/CAHk-=wifQimj2d6npq-wCi5onYPjzQg4vyO4tFcPJJZr268cRw@mail.gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220429010416.2788472-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: Don't set Accessed/Dirty bits for ZERO_PAGESean Christopherson
Don't set Accessed/Dirty bits for a struct page with PG_reserved set, i.e. don't set A/D bits for the ZERO_PAGE. The ZERO_PAGE (or pages depending on the architecture) should obviously never be written, and similarly there's no point in marking it accessed as the page will never be swapped out or reclaimed. The comment in page-flags.h is quite clear that PG_reserved pages should be managed only by their owner, and strictly following that mandate also simplifies KVM's logic. Fixes: 7df003c85218 ("KVM: fix overflow of zero page refcount with ksm running") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220429010416.2788472-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: Drop bogus "pfn != 0" guard from kvm_release_pfn()Sean Christopherson
Remove a check from kvm_release_pfn() to bail if the provided @pfn is zero. Zero is a perfectly valid pfn on most architectures, and should not be used to indicate an error or an invalid pfn. The bogus check was added by commit 917248144db5 ("x86/kvm: Cache gfn to pfn translation"), which also did the bad thing of zeroing the pfn and gfn to mark a cache invalid. Thankfully, that bad behavior was axed by commit 357a18ad230f ("KVM: Kill kvm_map_gfn() / kvm_unmap_gfn() and gfn_to_pfn_cache"). Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220429010416.2788472-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: x86/mmu: Use common logic for computing the 32/64-bit base PA maskSean Christopherson
Use common logic for computing PT_BASE_ADDR_MASK for 32-bit, 64-bit, and EPT paging. Both PAGE_MASK and the new-common logic are supsersets of what is actually needed for 32-bit paging. PAGE_MASK sets bits 63:12 and the former GUEST_PT64_BASE_ADDR_MASK sets bits 51:12, so regardless of which value is used, the result will always be bits 31:12. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614233328.3896033-9-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: x86/mmu: Truncate paging32's PT_BASE_ADDR_MASK to 32 bitsSean Christopherson
Truncate paging32's PT_BASE_ADDR_MASK to a pt_element_t, i.e. to 32 bits. Ignoring PSE huge pages, the mask is only used in conjunction with gPTEs, which are 32 bits, and so the address is limited to bits 31:12. PSE huge pages encoded PA bits 39:32 in PTE bits 20:13, i.e. need custom logic to handle their funky encoding regardless of PT_BASE_ADDR_MASK. Note, PT_LVL_OFFSET_MASK is somewhat confusing in that it computes the offset of the _gfn_, not of the gpa, i.e. not having bits 63:32 set in PT_BASE_ADDR_MASK is again correct. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614233328.3896033-8-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: x86/mmu: Use common macros to compute 32/64-bit paging masksPaolo Bonzini
Dedup the code for generating (most of) the per-type PT_* masks in paging_tmpl.h. The relevant macros only vary based on the number of bits per level, and that smidge of info is already provided in a common form as PT_LEVEL_BITS. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614233328.3896033-7-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: x86/mmu: Use separate namespaces for guest PTEs and shadow PTEsSean Christopherson
Separate the macros for KVM's shadow PTEs (SPTE) from guest 64-bit PTEs (PT64). SPTE and PT64 are _mostly_ the same, but the few differences are quite critical, e.g. *_BASE_ADDR_MASK must differentiate between host and guest physical address spaces, and SPTE_PERM_MASK (was PT64_PERM_MASK) is very much specific to SPTEs. Opportunistically (and temporarily) move most guest macros into paging.h to clearly associate them with shadow paging, and to ensure that they're not used as of this commit. A future patch will eliminate them entirely. Sadly, PT32_LEVEL_BITS is left behind in mmu_internal.h because it's needed for the quadrant calculation in kvm_mmu_get_page(). The quadrant calculation is hot enough (when using shadow paging with 32-bit guests) that adding a per-context helper is undesirable, and burying the computation in paging_tmpl.h with a forward declaration isn't exactly an improvement. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614233328.3896033-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: x86/mmu: Dedup macros for computing various page table masksSean Christopherson
Provide common helper macros to generate various masks, shifts, etc... for 32-bit vs. 64-bit page tables. Only the inputs differ, the actual calculations are identical. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614233328.3896033-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: x86/mmu: Bury 32-bit PSE paging helpers in paging_tmpl.hSean Christopherson
Move a handful of one-off macros and helpers for 32-bit PSE paging into paging_tmpl.h and hide them behind "PTTYPE == 32". Under no circumstance should anything but 32-bit shadow paging care about PSE paging. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614233328.3896033-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: VMX: Refactor 32-bit PSE PT creation to avoid using MMU macroSean Christopherson
Compute the number of PTEs to be filled for the 32-bit PSE page tables using the page size and the size of each entry. While using the MMU's PT32_ENT_PER_PAGE macro is arguably better in isolation, removing VMX's usage will allow a future namespacing cleanup to move the guest page table macros into paging_tmpl.h, out of the reach of code that isn't directly related to shadow paging. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614233328.3896033-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: x86: Use lapic_in_kernel() to query in-kernel APIC in APICv helperSean Christopherson
Use lapic_in_kernel() in kvm_vcpu_apicv_active() to take advantage of the kvm_has_noapic_vcpu static branch. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614230548.3852141-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: x86: Move "apicv_active" into "struct kvm_lapic"Sean Christopherson
Move the per-vCPU apicv_active flag into KVM's local APIC instance. APICv is fully dependent on an in-kernel local APIC, but that's not at all clear when reading the current code due to the flag being stored in the generic kvm_vcpu_arch struct. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614230548.3852141-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: x86: Check for in-kernel xAPIC when querying APICv for directed yieldSean Christopherson
Use kvm_vcpu_apicv_active() to check if APICv is active when seeing if a vCPU is a candidate for directed yield due to a pending ACPIv interrupt. This will allow moving apicv_active into kvm_lapic without introducing a potential NULL pointer deref (kvm_vcpu_apicv_active() effectively adds a pre-check on the vCPU having an in-kernel APIC). No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614230548.3852141-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: x86: Drop @vcpu parameter from kvm_x86_ops.hwapic_isr_update()Sean Christopherson
Drop the unused @vcpu parameter from hwapic_isr_update(). AMD/AVIC is unlikely to implement the helper, and VMX/APICv doesn't need the vCPU as it operates on the current VMCS. The result is somewhat odd, but allows for a decent amount of (future) cleanup in the APIC code. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614230548.3852141-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: SVM: Drop unused AVIC / kvm_x86_ops declarationsSean Christopherson
Drop a handful of unused AVIC function declarations whose implementations were removed during the conversion to optional static calls. No functional change intended. Fixes: abb6d479e226 ("KVM: x86: make several APIC virtualization callbacks optional") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614230548.3852141-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: nVMX: Update vmcs12 on BNDCFGS write, not at vmcs02=>vmcs12 syncSean Christopherson
Update vmcs12->guest_bndcfgs on intercepted writes to BNDCFGS from L2 instead of waiting until vmcs02 is synchronized to vmcs12. KVM always intercepts BNDCFGS accesses, so the only way the value in vmcs02 can change is via KVM's explicit VMWRITE during emulation. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614215831.3762138-6-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: nVMX: Save BNDCFGS to vmcs12 iff relevant controls are exposed to L1Sean Christopherson
Save BNDCFGS to vmcs12 (from vmcs02) if and only if at least of one of the load-on-entry or clear-on-exit fields for BNDCFGS is enumerated as an allowed-1 bit in vmcs12. Skipping the field avoids an unnecessary VMREAD when MPX is supported but not exposed to L1. Per Intel's SDM: If the processor supports either the 1-setting of the "load IA32_BNDCFGS" VM-entry control or that of the "clear IA32_BNDCFGS" VM-exit control, the contents of the IA32_BNDCFGS MSR are saved into the corresponding field. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614215831.3762138-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: nVMX: Rename nested.vmcs01_* fields to nested.pre_vmenter_*Sean Christopherson
Rename the fields in struct nested_vmx used to snapshot pre-VM-Enter values to reflect that they can hold L2's values when restoring nested state, e.g. if userspace restores MSRs before nested state. As crazy as it seems, restoring MSRs before nested state actually works (because KVM goes out if it's way to make it work), even though the initial MSR writes will hit vmcs01 despite holding L2 values. Add a related comment to vmx_enter_smm() to call out that using the common VM-Exit and VM-Enter helpers to emulate SMI and RSM is wrong and broken. The few MSRs that have snapshots _could_ be fixed by taking a snapshot prior to the forced VM-Exit instead of at forced VM-Enter, but that's just the tip of the iceberg as the rather long list of MSRs that aren't snapshotted (hello, VM-Exit MSR load list) can't be handled this way. Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614215831.3762138-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: nVMX: Snapshot pre-VM-Enter DEBUGCTL for !nested_run_pending caseSean Christopherson
If a nested run isn't pending, snapshot vmcs01.GUEST_IA32_DEBUGCTL irrespective of whether or not VM_ENTRY_LOAD_DEBUG_CONTROLS is set in vmcs12. When restoring nested state, e.g. after migration, without a nested run pending, prepare_vmcs02() will propagate nested.vmcs01_debugctl to vmcs02, i.e. will load garbage/zeros into vmcs02.GUEST_IA32_DEBUGCTL. If userspace restores nested state before MSRs, then loading garbage is a non-issue as loading DEBUGCTL will also update vmcs02. But if usersepace restores MSRs first, then KVM is responsible for propagating L2's value, which is actually thrown into vmcs01, into vmcs02. Restoring L2 MSRs into vmcs01, i.e. loading all MSRs before nested state is all kinds of bizarre and ideally would not be supported. Sadly, some VMMs do exactly that and rely on KVM to make things work. Note, there's still a lurking SMM bug, as propagating vmcs01's DEBUGCTL to vmcs02 across RSM may corrupt L2's DEBUGCTL. But KVM's entire VMX+SMM emulation is flawed as SMI+RSM should not toouch _any_ VMCS when use the "default treatment of SMIs", i.e. when not using an SMI Transfer Monitor. Link: https://lore.kernel.org/all/Yobt1XwOfb5M6Dfa@google.com Fixes: 8fcc4b5923af ("kvm: nVMX: Introduce KVM_CAP_NESTED_STATE") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614215831.3762138-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-20KVM: nVMX: Snapshot pre-VM-Enter BNDCFGS for !nested_run_pending caseSean Christopherson
If a nested run isn't pending, snapshot vmcs01.GUEST_BNDCFGS irrespective of whether or not VM_ENTRY_LOAD_BNDCFGS is set in vmcs12. When restoring nested state, e.g. after migration, without a nested run pending, prepare_vmcs02() will propagate nested.vmcs01_guest_bndcfgs to vmcs02, i.e. will load garbage/zeros into vmcs02.GUEST_BNDCFGS. If userspace restores nested state before MSRs, then loading garbage is a non-issue as loading BNDCFGS will also update vmcs02. But if usersepace restores MSRs first, then KVM is responsible for propagating L2's value, which is actually thrown into vmcs01, into vmcs02. Restoring L2 MSRs into vmcs01, i.e. loading all MSRs before nested state is all kinds of bizarre and ideally would not be supported. Sadly, some VMMs do exactly that and rely on KVM to make things work. Note, there's still a lurking SMM bug, as propagating vmcs01.GUEST_BNDFGS to vmcs02 across RSM may corrupt L2's BNDCFGS. But KVM's entire VMX+SMM emulation is flawed as SMI+RSM should not toouch _any_ VMCS when use the "default treatment of SMIs", i.e. when not using an SMI Transfer Monitor. Link: https://lore.kernel.org/all/Yobt1XwOfb5M6Dfa@google.com Fixes: 62cf9bd8118c ("KVM: nVMX: Fix emulation of VM_ENTRY_LOAD_BNDCFGS") Cc: stable@vger.kernel.org Cc: Lei Wang <lei4.wang@intel.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220614215831.3762138-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: x86/mmu: Use try_cmpxchg64 in fast_pf_fix_direct_spteUros Bizjak
Use try_cmpxchg64 instead of cmpxchg64 (*ptr, old, new) != old in fast_pf_fix_direct_spte. cmpxchg returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Jim Mattson <jmattson@google.com> Cc: Joerg Roedel <joro@8bytes.org> Message-Id: <20220520144635.63134-1-ubizjak@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: VMX: Use try_cmpxchg64 in pi_try_set_controlUros Bizjak
Use try_cmpxchg64 instead of cmpxchg64 (*ptr, old, new) != old in pi_try_set_control. cmpxchg returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg): b9: 88 44 24 60 mov %al,0x60(%rsp) bd: 48 89 c8 mov %rcx,%rax c0: c6 44 24 62 f2 movb $0xf2,0x62(%rsp) c5: 48 8b 74 24 60 mov 0x60(%rsp),%rsi ca: f0 49 0f b1 34 24 lock cmpxchg %rsi,(%r12) d0: 48 39 c1 cmp %rax,%rcx d3: 75 cf jne a4 <vmx_vcpu_pi_load+0xa4> patched: c1: 88 54 24 60 mov %dl,0x60(%rsp) c5: c6 44 24 62 f2 movb $0xf2,0x62(%rsp) ca: 48 8b 54 24 60 mov 0x60(%rsp),%rdx cf: f0 48 0f b1 13 lock cmpxchg %rdx,(%rbx) d4: 75 d5 jne ab <vmx_vcpu_pi_load+0xab> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Wanpeng Li <wanpengli@tencent.com> Cc: Jim Mattson <jmattson@google.com> Cc: Joerg Roedel <joro@8bytes.org> Reported-by: kernel test robot <lkp@intel.com> Message-Id: <20220520143737.62513-1-ubizjak@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: x86/mmu: Use try_cmpxchg64 in tdp_mmu_set_spte_atomicUros Bizjak
Use try_cmpxchg64 instead of cmpxchg64 (*ptr, old, new) != old in tdp_mmu_set_spte_atomic. cmpxchg returns success in ZF flag, so this change saves a compare after cmpxchg (and related move instruction in front of cmpxchg). Also, remove explicit assignment to iter->old_spte when cmpxchg fails, this is what try_cmpxchg does implicitly. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Reviewed-by: David Matlack <dmatlack@google.com> Message-Id: <20220518135111.3535-1-ubizjak@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: VMX: Skip filter updates for MSRs that KVM is already interceptingSean Christopherson
When handling userspace MSR filter updates, recompute interception for possible passthrough MSRs if and only if KVM wants to disabled interception. If KVM wants to intercept accesses, i.e. the associated bit is set in vmx->shadow_msr_intercept, then there's no need to set the intercept again as KVM will intercept the MSR regardless of userspace's wants. No functional change intended, the call to vmx_enable_intercept_for_msr() really is just a gigantic nop. Suggested-by: Aaron Lewis <aaronlewis@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220610214140.612025-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: x86/mmu: Drop unused CMPXCHG macro from paging_tmpl.hSean Christopherson
Drop the CMPXCHG macro from paging_tmpl.h, it's no longer used now that KVM uses a common uaccess helper to do 8-byte CMPXCHG. Fixes: f122dfe44768 ("KVM: x86: Use __try_cmpxchg_user() to update guest PTE A/D bits") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613225723.2734132-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: X86/SVM: Use root_level in svm_load_mmu_pgd()Lai Jiangshan
Use root_level in svm_load_mmu_pg() rather that looking up the root level in vcpu->arch.mmu->root_role.level. svm_load_mmu_pgd() has only one caller, kvm_mmu_load_pgd(), which always passes vcpu->arch.mmu->root_role.level as root_level. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Message-Id: <20220605063417.308311-7-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: X86/MMU: Remove useless mmu_topup_memory_caches() in kvm_mmu_pte_write()Lai Jiangshan
Since the commit c5e2184d1544("KVM: x86/mmu: Remove the defunct update_pte() paging hook"), kvm_mmu_pte_write() no longer uses the rmap cache. So remove mmu_topup_memory_caches() in it. Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Message-Id: <20220605063417.308311-6-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: Rename ack_flush() to ack_kick()Lai Jiangshan
Make it use the same verb as in kvm_kick_many_cpus(). Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Message-Id: <20220605063417.308311-5-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: X86/MMU: Remove unused PT32_DIR_BASE_ADDR_MASK from mmu.cLai Jiangshan
It is unused. Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com> Message-Id: <20220605063417.308311-3-jiangshanlai@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: s390: selftests: Fix memop extension capability checkJanis Schoetterl-Glausch
Fix the inverted logic of the memop extension capability check. Fixes: 97da92c0ff92 ("KVM: s390: selftests: Use TAP interface in the memop test") Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com> Message-Id: <20220614162635.3445019-1-scgl@linux.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: SVM: Hide SEV migration lockdep goo behind CONFIG_PROVE_LOCKINGSean Christopherson
Wrap the manipulation of @role and the manual mutex_{release,acquire}() invocations in CONFIG_PROVE_LOCKING=y to squash a clang-15 warning. When building with -Wunused-but-set-parameter and CONFIG_DEBUG_LOCK_ALLOC=n, clang-15 seees there's no usage of @role in mutex_lock_killable_nested() and yells. PROVE_LOCKING selects DEBUG_LOCK_ALLOC, and the only reason KVM manipulates @role is to make PROVE_LOCKING happy. To avoid true ugliness, use "i" and "j" to detect the first pass in the loops; the "idx" field that's used by kvm_for_each_vcpu() is guaranteed to be '0' on the first pass as it's simply the first entry in the vCPUs XArray, which is fully KVM controlled. kvm_for_each_vcpu() passes '0' for xa_for_each_range()'s "start", and xa_for_each_range() will not enter the loop if there's no entry at '0'. Fixes: 0c2c7c069285 ("KVM: SEV: Mark nested locking of vcpu->lock") Reported-by: kernel test robot <lkp@intel.com> Cc: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613214237.2538266-1-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: SEV: fix misplaced closing parenthesisPaolo Bonzini
This caused a warning on 32-bit systems, but undoubtedly would have acted funny on 64-bit as well. The fix was applied directly on merge in 5.19, see commit 24625f7d91fb ("Merge tag for-linus of git://git.kernel.org/pub/scm/virt/kvm/kvm"). Fixes: 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base") Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-15KVM: selftests: Remove the mismatched parameter commentsShaoqin Huang
There are some parameter being removed in function but the parameter comments still exist, so remove them. Signed-off-by: Shaoqin Huang <shaoqin.huang@intel.com> Message-Id: <20220614224126.211054-1-shaoqin.huang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-14KVM: selftests: Use kvm_has_cap(), not kvm_check_cap(), where possibleSean Christopherson
Replace calls to kvm_check_cap() that treat its return as a boolean with calls to kvm_has_cap(). Several instances of kvm_check_cap() were missed when kvm_has_cap() was introduced. Reported-by: Andrew Jones <drjones@redhat.com> Fixes: 3ea9b809650b ("KVM: selftests: Add kvm_has_cap() to provide syntactic sugar") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613161942.1586791-5-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-14KVM: selftests: Drop a duplicate TEST_ASSERT() in vm_nr_pages_required()Sean Christopherson
Remove a duplicate TEST_ASSERT() on the number of runnable vCPUs in vm_nr_pages_required() that snuck in during a rebase gone bad. Reported-by: Andrew Jones <drjones@redhat.com> Fixes: 6e1d13bf3815 ("KVM: selftests: Move per-VM/per-vCPU nr pages calculation to __vm_create()") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613161942.1586791-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-14KVM: selftests: Call a dummy helper in VM/vCPU ioctls() to enforce typeSean Christopherson
Replace the goofy static_assert on the size of the @vm/@vcpu parameters with a call to a dummy helper, i.e. let the compiler naturally complain about an incompatible type instead of homebrewing a poor replacement. Reported-by: Andrew Jones <drjones@redhat.com> Fixes: fcba483e8246 ("KVM: selftests: Sanity check input to ioctls() at build time") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613161942.1586791-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-14KVM: selftests: Add a missing apostrophe in comment to show ownershipSean Christopherson
Add an apostrophe in a comment about it being the caller's, not callers, responsibility to free an object. Reported-by: Andrew Jones <drjones@redhat.com> Fixes: 768e9a61856b ("KVM: selftests: Purge vm+vcpu_id == vcpu silliness") Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220613161942.1586791-2-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-14KVM: selftests: kvm_binary_stats_test: Fix index expressionsAndrew Jones
kvm_binary_stats_test accepts two arguments, the number of vms and number of vcpus. If these inputs are not equal then the test would likely crash for one reason or another due to using miscalculated indices for the vcpus array. Fix the index expressions by swapping the use of i and j. Signed-off-by: Andrew Jones <drjones@redhat.com> Message-Id: <20220614081041.2571511-1-drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-11KVM: selftests: Sanity check input to ioctls() at build timeSean Christopherson
Add a static assert to the KVM/VM/vCPU ioctl() helpers to verify that the size of the argument provided matches the expected size of the IOCTL. Because ioctl() ultimately takes a "void *", it's all too easy to pass in garbage and not detect the error until runtime. E.g. while working on a CPUID rework, selftests happily compiled when vcpu_set_cpuid() unintentionally passed the cpuid() function as the parameter to ioctl() (a local "cpuid" parameter was removed, but its use was not replaced with "vcpu->cpuid" as intended). Tweak a variety of benign issues that aren't compatible with the sanity check, e.g. passing a non-pointer for ioctls(). Note, static_assert() requires a string on older versions of GCC. Feed it an empty string to make the compiler happy. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-11KVM: selftests: Use TAP-friendly ksft_exit_skip() in __TEST_REQUIRESean Christopherson
Use the TAP-friendly ksft_exit_skip() instead of KVM's custom print_skip() when skipping a test via __TEST_REQUIRE. KVM's "skipping test" has no known benefit, whereas some setups rely on TAP output. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-11KVM: selftests: Add TEST_REQUIRE macros to reduce skipping copy+pasteSean Christopherson
Add TEST_REQUIRE() and __TEST_REQUIRE() to replace the myriad open coded instances of selftests exiting with KSFT_SKIP after printing an informational message. In addition to reducing the amount of boilerplate code in selftests, the UPPERCASE macro names make it easier to visually identify a test's requirements. Convert usage that erroneously uses something other than print_skip() and/or "exits" with '0' or some other non-KSFT_SKIP value. Intentionally drop a kvm_vm_free() in aarch64/debug-exceptions.c as part of the conversion. All memory and file descriptors are freed on process exit, so the explicit free is superfluous. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-11KVM: selftests: Add kvm_has_cap() to provide syntactic sugarSean Christopherson
Add kvm_has_cap() to wrap kvm_check_cap() and return a bool for the use cases where the caller only wants check if a capability is supported, i.e. doesn't care about the value beyond whether or not it's non-zero. The "check" terminology is somewhat ambiguous as the non-boolean return suggests that '0' might mean "success", i.e. suggests that the ioctl uses the 0/-errno pattern. Provide a wrapper instead of trying to find a new name for the raw helper; the "check" terminology is derived from the name of the ioctl, so using e.g. "get" isn't a clear win. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-11KVM: selftests: Return an 'unsigned int' from kvm_check_cap()Sean Christopherson
Return an 'unsigned int' instead of a signed 'int' from kvm_check_cap(), to make it more obvious that kvm_check_cap() can never return a negative value due to its assertion that the return is ">= 0". Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-11KVM: selftests: Drop DEFAULT_GUEST_PHY_PAGES, open code the magic numberSean Christopherson
Remove DEFAULT_GUEST_PHY_PAGES and open code the magic number (with a comment) in vm_nr_pages_required(). Exposing DEFAULT_GUEST_PHY_PAGES to tests was a symptom of the VM creation APIs not cleanly supporting tests that create runnable vCPUs, but can't do so immediately. Now that tests don't have to manually compute the amount of memory needed for basic operation, make it harder for tests to do things that should be handled by the framework, i.e. force developers to improve the framework instead of hacking around flaws in individual tests. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-11KVM: selftests: Trust that MAXPHYADDR > memslot0 in vmx_apic_access_testSean Christopherson
Use vm->max_gfn to compute the highest gpa in vmx_apic_access_test, and blindly trust that the highest gfn/gpa will be well above the memory carved out for memslot0. The existing check is beyond paranoid; KVM doesn't support CPUs with host.MAXPHYADDR < 32, and the selftests are all kinds of hosed if memslot0 overlaps the local xAPIC, which resides above "lower" (below 4gb) DRAM. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-11KVM: selftests: Move per-VM/per-vCPU nr pages calculation to __vm_create()Sean Christopherson
Handle all memslot0 size adjustments in __vm_create(). Currently, the adjustments reside in __vm_create_with_vcpus(), which means tests that call vm_create() or __vm_create() directly are left to their own devices. Some tests just pass DEFAULT_GUEST_PHY_PAGES and don't bother with any adjustments, while others mimic the per-vCPU calculations. For vm_create(), and thus __vm_create(), take the number of vCPUs that will be runnable to calculate that number of per-vCPU pages needed for memslot0. To give readers a hint that neither vm_create() nor __vm_create() create vCPUs, name the parameter @nr_runnable_vcpus instead of @nr_vcpus. That also gives readers a hint as to why tests that create larger numbers of vCPUs but never actually run those vCPUs can skip straight to the vm_create_barebones() variant. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-11KVM: selftests: Drop @num_percpu_pages from __vm_create_with_vcpus()Sean Christopherson
Drop @num_percpu_pages from __vm_create_with_vcpus(), all callers pass '0' and there's unlikely to be a test that allocates just enough memory that it needs a per-CPU allocation, but not so much that it won't just do its own memory management. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-11KVM: selftests: Drop @slot0_mem_pages from __vm_create_with_vcpus()Sean Christopherson
All callers of __vm_create_with_vcpus() pass DEFAULT_GUEST_PHY_PAGES for @slot_mem_pages; drop the param and just hardcode the "default" as the base number of pages for slot0. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>