Age | Commit message (Collapse) | Author |
|
* kvm-arm64/misc-5.15:
: Misc improvements for 5.15:
:
: - Account the number of VMID-wide TLB invalidations as
: remote TLB flushes
: - Fix comments in the VGIC code
: - Cleanup the PMU IMPDEF identification
: - Streamline the TGRAN2 usage
: - Avoid advertising a 52bit IPA range for non-64KB configs
: - Avoid spurious signalling when a HW-mapped interrupt is in the
: A+P state on entry, and in the P state on exit, but that the
: physical line is not pending anymore.
: - Bunch of minor cleanups
KVM: arm64: vgic: Resample HW pending state on deactivation
KVM: arm64: vgic: Drop WARN from vgic_get_irq
KVM: arm64: Drop unused REQUIRES_VIRT
KVM: arm64: Drop check_kvm_target_cpu() based percpu probe
KVM: arm64: Drop init_common_resources()
KVM: arm64: Use ARM64_MIN_PARANGE_BITS as the minimum supported IPA
arm64/mm: Add remaining ID_AA64MMFR0_PARANGE_ macros
KVM: arm64: Restrict IPA size to maximum 48 bits on 4K and 16K page size
arm64/mm: Define ID_AA64MMFR0_TGRAN_2_SHIFT
KVM: arm64: perf: Replace '0xf' instances with ID_AA64DFR0_PMUVER_IMP_DEF
KVM: arm64: Fix comments related to GICv2 PMR reporting
KVM: arm64: Count VMID-wide TLB invalidations
arm64/kexec: Test page size support with new TGRAN range values
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
ID_AA64DFR0_PMUVER_IMP_DEF which indicate implementation defined PMU, never
actually gets used although there are '0xf' instances scattered all around.
Just do the macro replacement to improve readability.
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: linux-perf-users@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: kvmarm@lists.cs.columbia.edu
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
In a KVM guest on arm64, performance counters interrupts have an
unnecessary overhead which slows down execution when using the "perf
record" command and limits the "perf record" sampling period.
The problem is that when a guest VM disables counters by clearing the
PMCR_EL0.E bit (bit 0), KVM will disable all counters defined in
PMCR_EL0 even if they are not enabled in PMCNTENSET_EL0.
KVM disables a counter by calling into the perf framework, in particular
by calling perf_event_create_kernel_counter() which is a time consuming
operation. So, for example, with a Neoverse N1 CPU core which has 6 event
counters and one cycle counter, KVM will always disable all 7 counters
even if only one is enabled.
This typically happens when using the "perf record" command in a guest
VM: perf will disable all event counters with PMCNTENTSET_EL0 and only
uses the cycle counter. And when using the "perf record" -F option with
a high profiling frequency, the overhead of KVM disabling all counters
instead of one on every counter interrupt becomes very noticeable.
The problem is fixed by having KVM disable only counters which are
enabled in PMCNTENSET_EL0. If a counter is not enabled in PMCNTENSET_EL0
then KVM will not enable it when setting PMCR_EL0.E and it will remain
disabled as long as it is not enabled in PMCNTENSET_EL0. So there is
effectively no need to disable a counter when clearing PMCR_EL0.E if it
is not enabled PMCNTENSET_EL0.
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
[maz: moved 'mask' close to the actual user, simplifying the patch]
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210712170345.660272-1-alexandre.chartre@oracle.com
Link: https://lore.kernel.org/r/20210719123902.1493805-4-maz@kernel.org
|
|
We always sanitise our PMU sysreg on the write side, so there
is no need to do it on the read side as well.
Drop the unnecessary masking.
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210719123902.1493805-3-maz@kernel.org
|
|
Restoring a guest with an active virtual PMU results in no perf
counters being instanciated on the host side. Not quite what
you'd expect from a restore.
In order to fix this, force a writeback of PMCR_EL0 on the first
run of a vcpu (using a new request so that it happens once the
vcpu has been loaded). This will in turn create all the host-side
counters that were missing.
Reported-by: Jinank Jain <jinankj@amazon.de>
Tested-by: Jinank Jain <jinankj@amazon.de>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/87wnrbylxv.wl-maz@kernel.org
Link: https://lore.kernel.org/r/b53dfcf9bbc4db7f96154b1cd5188d72b9766358.camel@amazon.de
|
|
According to ARM DDI 0487G.a, page D13-3895, setting the PMCR_EL0.P bit to
1 has the following effect:
"Reset all event counters accessible in the current Exception level, not
including PMCCNTR_EL0, to zero."
Similar behaviour is described for AArch32 on page G8-7022. Make it so.
Fixes: c01d6a18023b ("KVM: arm64: pmu: Only handle supported event counters")
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210618105139.83795-1-alexandru.elisei@arm.com
|
|
KVM/arm64 is the sole user of perf_num_counters(), and really
could do without it. Stop using the obsolete API by relying on
the existing probing code.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210414134409.1266357-2-maz@kernel.org
|
|
We currently find out about the presence of a HW PMU (or the handling
of that PMU by perf, which amounts to the same thing) in a fairly
roundabout way, by checking the number of counters available to perf.
That's good enough for now, but we will soon need to find about about
that on paths where perf is out of reach (in the world switch).
Instead, let's turn kvm_arm_support_pmu_v3() into a static key.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/r/20210209114844.3278746-2-maz@kernel.org
Message-Id: <20210305185254.3730990-5-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
|
|
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Instead of using a bunch of magic numbers, use the existing definitions
that have been added since 8673e02e58410 ("arm64: perf: Add support
for ARMv8.5-PMU 64-bit counters")
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Upgrading the PMU code from ARMv8.1 to ARMv8.4 turns out to be
pretty easy. All that is required is support for PMMIR_EL1, which
is read-only, and for which returning 0 is a valid option as long
as we don't advertise STALL_SLOT as an implemented event.
Let's just do that and adjust what we return to the guest.
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When running on v8.0 HW, make sure we don't try to advertise
events in the 0x4000-0x403f range.
Cc: stable@vger.kernel.org
Fixes: 88865beca9062 ("KVM: arm64: Mask out filtered events in PCMEID{0,1}_EL1")
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20210121105636.1478491-1-maz@kernel.org
|
|
KVM_ARM_VCPU_INIT ioctl calls kvm_reset_vcpu(), which in turn resets the
PMU with a call to kvm_pmu_vcpu_reset(). The function zeroes the PMU
chained counters bitmap and stops all the counters with a perf event
attached. Because it is called before the VCPU has had the chance to run,
no perf events are in use and none are released.
kvm_arm_pmu_v3_enable(), called by kvm_vcpu_first_run_init() only if the
VCPU has been initialized, also resets the PMU. kvm_pmu_vcpu_reset() in
this case does the exact same thing as the previous call, so remove it.
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201201150157.223625-6-alexandru.elisei@arm.com
|
|
The PMU ready state has no user left. Goodbye.
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We currently gate the update of the PMU state on the PMU being "ready".
The "ready" state is only set to true when the first vcpu run is
successful, and if it isn't, we never reach the update code.
So the "ready" state is never the right thing to check for, and it
should instead be the presence of the PMU feature, which makes
a bit more sense.
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
We accept to configure a PMU when a vcpu is created, even if the
HW (or the host) doesn't support it. This results in failures
when attributes get set, which is a bit odd as we should have
failed the vcpu creation the first place.
Move the check to the point where we check the vcpu feature set,
and fail early if we cannot support a PMU. This further simplifies
the attribute handling.
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
When enabling the PMU in kvm_arm_pmu_v3_enable(), KVM returns early if the
PMU flag created is false and skips any other checks. Because PMU emulation
is gated only on the VCPU feature being set, this makes it possible for
userspace to get away with setting the VCPU feature but not doing any
initialization for the PMU. Fix it by returning an error when trying to run
the VCPU if the PMU hasn't been initialized correctly.
The PMU is marked as created only if the interrupt ID has been set when
using an in-kernel irqchip. This means the same check in
kvm_arm_pmu_v3_enable() is redundant, remove it.
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201126144916.164075-1-alexandru.elisei@arm.com
|
|
There are a number of places where we check for the KVM_ARM_VCPU_PMU_V3
feature. Wrap this check into a new kvm_vcpu_has_pmu(), and use
it at the existing locations.
No functional change.
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
Pull KVM updates from Paolo Bonzini:
"For x86, there is a new alternative and (in the future) more scalable
implementation of extended page tables that does not need a reverse
map from guest physical addresses to host physical addresses.
For now it is disabled by default because it is still lacking a few of
the existing MMU's bells and whistles. However it is a very solid
piece of work and it is already available for people to hammer on it.
Other updates:
ARM:
- New page table code for both hypervisor and guest stage-2
- Introduction of a new EL2-private host context
- Allow EL2 to have its own private per-CPU variables
- Support of PMU event filtering
- Complete rework of the Spectre mitigation
PPC:
- Fix for running nested guests with in-kernel IRQ chip
- Fix race condition causing occasional host hard lockup
- Minor cleanups and bugfixes
x86:
- allow trapping unknown MSRs to userspace
- allow userspace to force #GP on specific MSRs
- INVPCID support on AMD
- nested AMD cleanup, on demand allocation of nested SVM state
- hide PV MSRs and hypercalls for features not enabled in CPUID
- new test for MSR_IA32_TSC writes from host and guest
- cleanups: MMU, CPUID, shared MSRs
- LAPIC latency optimizations ad bugfixes"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (232 commits)
kvm: x86/mmu: NX largepage recovery for TDP MMU
kvm: x86/mmu: Don't clear write flooding count for direct roots
kvm: x86/mmu: Support MMIO in the TDP MMU
kvm: x86/mmu: Support write protection for nesting in tdp MMU
kvm: x86/mmu: Support disabling dirty logging for the tdp MMU
kvm: x86/mmu: Support dirty logging for the TDP MMU
kvm: x86/mmu: Support changed pte notifier in tdp MMU
kvm: x86/mmu: Add access tracking for tdp_mmu
kvm: x86/mmu: Support invalidate range MMU notifier for TDP MMU
kvm: x86/mmu: Allocate struct kvm_mmu_pages for all pages in TDP MMU
kvm: x86/mmu: Add TDP MMU PF handler
kvm: x86/mmu: Remove disallowed_hugepage_adjust shadow_walk_iterator arg
kvm: x86/mmu: Support zapping SPTEs in the TDP MMU
KVM: Cache as_id in kvm_memory_slot
kvm: x86/mmu: Add functions to handle changed TDP SPTEs
kvm: x86/mmu: Allocate and free TDP MMU roots
kvm: x86/mmu: Init / Uninit the TDP MMU
kvm: x86/mmu: Introduce tdp_iter
KVM: mmu: extract spte.h and spte.c
KVM: mmu: Separate updating a PTE from kvm_set_pte_rmapp
...
|
|
As we can now hide events from the guest, let's also adjust its view of
PCMEID{0,1}_EL1 so that it can figure out why some common events are not
counting as they should.
The astute user can still look into the TRM for their CPU and find out
they've been cheated, though. Nobody's perfect.
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
It can be desirable to expose a PMU to a guest, and yet not want the
guest to be able to count some of the implemented events (because this
would give information on shared resources, for example.
For this, let's extend the PMUv3 device API, and offer a way to setup a
bitmap of the allowed events (the default being no bitmap, and thus no
filtering).
Userspace can thus allow/deny ranges of event. The default policy
depends on the "polarity" of the first filter setup (default deny if the
filter allows events, and default allow if the filter denies events).
This allows to setup exactly what is allowed for a given guest.
Note that although the ioctl is per-vcpu, the map of allowed events is
global to the VM (it can be setup from any vcpu until the vcpu PMU is
initialized).
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The PMU code suffers from a small defect where we assume that the event
number provided by the guest is always 16 bit wide, even if the CPU only
implements the ARMv8.0 architecture. This isn't really problematic in
the sense that the event number ends up in a system register, cropping
it to the right width, but still this needs fixing.
In order to make it work, let's probe the version of the PMU that the
guest is going to use. This is done by temporarily creating a kernel
event and looking at the PMUVer field that has been saved at probe time
in the associated arm_pmu structure. This in turn gets saved in the kvm
structure, and subsequently used to compute the event mask that gets
used throughout the PMU code.
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
The PMU emulation error handling is pretty messy when dealing with
attributes. Let's refactor it so that we have less duplication,
and that it is easy to extend later on.
A functional change is that kvm_arm_pmu_v3_init() used to return
-ENXIO when the PMU feature wasn't set. The error is now reported
as -ENODEV, matching the documentation. -ENXIO is still returned
when the interrupt isn't properly configured.
Signed-off-by: Marc Zyngier <maz@kernel.org>
|
|
kvm_vcpu_kick() is not NMI safe. When the overflow handler is called from
NMI context, defer waking the vcpu to an irq_work queue.
A vcpu can be freed while it's not running by kvm_destroy_vm(). Prevent
running the irq_work for a non-existent vcpu by calling irq_work_sync() on
the PMU destroy path.
[Alexandru E.: Added irq_work_sync()]
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Tested-by: Sumit Garg <sumit.garg@linaro.org> (Developerbox)
Cc: Julien Thierry <julien.thierry.kdev@gmail.com>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Suzuki K Pouloze <suzuki.poulose@arm.com>
Cc: kvm@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu
Link: https://lore.kernel.org/r/20200924110706.254996-6-alexandru.elisei@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
|
|
Now that the 32bit KVM/arm host is a distant memory, let's move the
whole of the KVM/arm64 code into the arm64 tree.
As they said in the song: Welcome Home (Sanitarium).
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200513104034.74741-1-maz@kernel.org
|