summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2021-05-27KVM: selftests: add shmem backing source typeAxel Rasmussen
This lets us run the demand paging test on top of a shmem-backed area. In follow-up commits, we'll 1) leverage this new capability to create an alias mapping, and then 2) use the alias mapping to exercise UFFD minor faults. Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-8-axelrasmussen@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: selftests: refactor vm_mem_backing_src_type flagsAxel Rasmussen
Each struct vm_mem_backing_src_alias has a flags field, which denotes the flags used to mmap() an area of that type. Previously, this field never included MAP_PRIVATE | MAP_ANONYMOUS, because vm_userspace_mem_region_add assumed that *all* types would always use those flags, and so it hardcoded them. In a follow-up commit, we'll add a new type: shmem. Areas of this type must not have MAP_PRIVATE | MAP_ANONYMOUS, and instead they must have MAP_SHARED. So, refactor things. Make it so that the flags field of struct vm_mem_backing_src_alias really is a complete set of flags, and don't add in any extras in vm_userspace_mem_region_add. This will let us easily tack on shmem. Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-7-axelrasmussen@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: selftests: allow different backing source typesAxel Rasmussen
Add an argument which lets us specify a different backing memory type for the test. The default is just to use anonymous, matching existing behavior. This is in preparation for testing UFFD minor faults. For that, we'll need to use a new backing memory type which is setup with MAP_SHARED. Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-6-axelrasmussen@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: selftests: compute correct demand paging sizeAxel Rasmussen
This is a preparatory commit needed before we can use different kinds of backing pages for guest memory. Previously, we used perf_test_args.host_page_size, which is the host's native page size (commonly 4K). For VM_MEM_SRC_ANONYMOUS this turns out to be okay, but in a follow-up commit we want to allow using different kinds of backing memory. Take VM_MEM_SRC_ANONYMOUS_HUGETLB for example. Without this change, if we used that backing page type, when we issued a UFFDIO_COPY ioctl we'd only do so with 4K, rather than the full 2M of a backing hugepage. In this case, UFFDIO_COPY returns -EINVAL (__mcopy_atomic_hugetlb checks the size). Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-5-axelrasmussen@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: selftests: simplify setup_demand_paging error handlingAxel Rasmussen
A small cleanup. Our caller writes: r = setup_demand_paging(...); if (r < 0) exit(-r); Since we're just going to exit anyway, instead of returning an error we can just re-use TEST_ASSERT. This makes the caller simpler, as well as the function itself - no need to write our branches, etc. Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-3-axelrasmussen@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: selftests: Print a message if /dev/kvm is missingDavid Matlack
If a KVM selftest is run on a machine without /dev/kvm, it will exit silently. Make it easy to tell what's happening by printing an error message. Opportunistically consolidate all codepaths that open /dev/kvm into a single function so they all print the same message. This slightly changes the semantics of vm_is_unrestricted_guest() by changing a TEST_ASSERT() to exit(KSFT_SKIP). However vm_is_unrestricted_guest() is only called in one place (x86_64/mmio_warning_test.c) and that is to determine if the test should be skipped or not. Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210511202120.1371800-1-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: selftests: trivial comment/logging fixesAxel Rasmussen
Some trivial fixes I found while touching related code in this series, factored out into a separate commit for easier reviewing: - s/gor/got/ and add a newline in demand_paging_test.c - s/backing_src/src_type/ in a comment to be consistent with the real function signature in kvm_util.c Signed-off-by: Axel Rasmussen <axelrasmussen@google.com> Message-Id: <20210519200339.829146-2-axelrasmussen@google.com> Reviewed-by: Ben Gardon <bgardon@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: selftests: Fix hang in hardware_disable_testDavid Matlack
If /dev/kvm is not available then hardware_disable_test will hang indefinitely because the child process exits before posting to the semaphore for which the parent is waiting. Fix this by making the parent periodically check if the child has exited. We have to be careful to forward the child's exit status to preserve a KSFT_SKIP status. I considered just checking for /dev/kvm before creating the child process, but there are so many other reasons why the child could exit early that it seemed better to handle that as general case. Tested: $ ./hardware_disable_test /dev/kvm not available, skipping test $ echo $? 4 $ modprobe kvm_intel $ ./hardware_disable_test $ echo $? 0 Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210514230521.2608768-1-dmatlack@google.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: selftests: Ignore CPUID.0DH.1H in get_cpuid_testDavid Matlack
Similar to CPUID.0DH.0H this entry depends on the vCPU's XCR0 register and IA32_XSS MSR. Since this test does not control for either before assigning the vCPU's CPUID, these entries will not necessarily match the supported CPUID exposed by KVM. This fixes get_cpuid_test on Cascade Lake CPUs. Suggested-by: Jim Mattson <jmattson@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210519211345.3944063-1-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: selftests: Fix 32-bit truncation of vm_get_max_gfn()David Matlack
vm_get_max_gfn() casts vm->max_gfn from a uint64_t to an unsigned int, which causes the upper 32-bits of the max_gfn to get truncated. Nobody noticed until now likely because vm_get_max_gfn() is only used as a mechanism to create a memslot in an unused region of the guest physical address space (the top), and the top of the 32-bit physical address space was always good enough. This fix reveals a bug in memslot_modification_stress_test which was trying to create a dummy memslot past the end of guest physical memory. Fix that by moving the dummy memslot lower. Fixes: 52200d0d944e ("KVM: selftests: Remove duplicate guest mode handling") Reviewed-by: Venkatesh Srinivas <venkateshs@chromium.org> Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20210521173828.1180619-1-dmatlack@google.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: selftests: add a memslot-related performance benchmarkMaciej S. Szmigiero
This benchmark contains the following tests: * Map test, where the host unmaps guest memory while the guest writes to it (maps it). The test is designed in a way to make the unmap operation on the host take a negligible amount of time in comparison with the mapping operation in the guest. The test area is actually split in two: the first half is being mapped by the guest while the second half in being unmapped by the host. Then a guest <-> host sync happens and the areas are reversed. * Unmap test which is broadly similar to the above map test, but it is designed in an opposite way: to make the mapping operation in the guest take a negligible amount of time in comparison with the unmap operation on the host. This test is available in two variants: with per-page unmap operation or a chunked one (using 2 MiB chunk size). * Move active area test which involves moving the last (highest gfn) memslot a bit back and forth on the host while the guest is concurrently writing around the area being moved (including over the moved memslot). * Move inactive area test which is similar to the previous move active area test, but now guest writes all happen outside of the area being moved. * Read / write test in which the guest writes to the beginning of each page of the test area while the host writes to the middle of each such page. Then each side checks the values the other side has written. This particular test is not expected to give different results depending on particular memslots implementation, it is meant as a rough sanity check and to provide insight on the spread of test results expected. Each test performs its operation in a loop until a test period ends (this is 5 seconds by default, but it is configurable). Then the total count of loops done is divided by the actual elapsed time to give the test result. The tests have a configurable memslot cap with the "-s" test option, by default the system maximum is used. Each test is repeated a particular number of times (by default 20 times), the best result achieved is printed. The test memory area is divided equally between memslots, the reminder is added to the last memslot. The test area size does not depend on the number of memslots in use. The tests also measure the time that it took to add all these memslots. The best result from the tests that use the whole test area is printed after all the requested tests are done. In general, these tests are designed to use as much memory as possible (within reason) while still doing 100+ loops even on high memslot counts with the default test length. Increasing the test runtime makes it increasingly more likely that some event will happen on the system during the test run, which might lower the test result. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <8d31bb3d92bc8fa33a9756fa802ee14266ab994e.1618253574.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: selftests: Keep track of memslots more efficientlyMaciej S. Szmigiero
The KVM selftest framework was using a simple list for keeping track of the memslots currently in use. This resulted in lookups and adding a single memslot being O(n), the later due to linear scanning of the existing memslot set to check for the presence of any conflicting entries. Before this change, benchmarking high count of memslots was more or less impossible as pretty much all the benchmark time was spent in the selftest framework code. We can simply use a rbtree for keeping track of both of gfn and hva. We don't need an interval tree for hva here as we can't have overlapping memslots because we allocate a completely new memory chunk for each new memslot. Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com> Reviewed-by: Andrew Jones <drjones@redhat.com> Message-Id: <b12749d47ee860468240cf027412c91b76dbe3db.1618253574.git.maciej.szmigiero@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27selftests: kvm: fix potential issue with ELF loadingPaolo Bonzini
vm_vaddr_alloc() sets up GVA to GPA mapping page by page; therefore, GPAs may not be continuous if same memslot is used for data and page table allocation. kvm_vm_elf_load() however expects a continuous range of HVAs (and thus GPAs) because it does not try to read file data page by page. Fix this mismatch by allocating memory in one step. Reported-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27selftests: kvm: make allocation of extra memory take effectZhenzhong Duan
The extra memory pages is missed to be allocated during VM creating. perf_test_util and kvm_page_table_test use it to alloc extra memory currently. Fix it by adding extra_mem_pages to the total memory calculation before allocate. Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Message-Id: <20210512043107.30076-1-zhenzhong.duan@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: X86: hyper-v: Task srcu lock when accessing kvm_memslots()Wanpeng Li
WARNING: suspicious RCU usage 5.13.0-rc1 #4 Not tainted ----------------------------- ./include/linux/kvm_host.h:710 suspicious rcu_dereference_check() usage! other info that might help us debug this: rcu_scheduler_active = 2, debug_locks = 1 1 lock held by hyperv_clock/8318: #0: ffffb6b8cb05a7d8 (&hv->hv_lock){+.+.}-{3:3}, at: kvm_hv_invalidate_tsc_page+0x3e/0xa0 [kvm] stack backtrace: CPU: 3 PID: 8318 Comm: hyperv_clock Not tainted 5.13.0-rc1 #4 Call Trace: dump_stack+0x87/0xb7 lockdep_rcu_suspicious+0xce/0xf0 kvm_write_guest_page+0x1c1/0x1d0 [kvm] kvm_write_guest+0x50/0x90 [kvm] kvm_hv_invalidate_tsc_page+0x79/0xa0 [kvm] kvm_gen_update_masterclock+0x1d/0x110 [kvm] kvm_arch_vm_ioctl+0x2a7/0xc50 [kvm] kvm_vm_ioctl+0x123/0x11d0 [kvm] __x64_sys_ioctl+0x3ed/0x9d0 do_syscall_64+0x3d/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae kvm_memslots() will be called by kvm_write_guest(), so we should take the srcu lock. Fixes: e880c6ea5 (KVM: x86: hyper-v: Prevent using not-yet-updated TSC page by secondary CPUs) Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1621339235-11131-4-git-send-email-wanpengli@tencent.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: X86: Fix vCPU preempted state from guest's point of viewWanpeng Li
Commit 66570e966dd9 (kvm: x86: only provide PV features if enabled in guest's CPUID) avoids to access pv tlb shootdown host side logic when this pv feature is not exposed to guest, however, kvm_steal_time.preempted not only leveraged by pv tlb shootdown logic but also mitigate the lock holder preemption issue. From guest's point of view, vCPU is always preempted since we lose the reset of kvm_steal_time.preempted before vmentry if pv tlb shootdown feature is not exposed. This patch fixes it by clearing kvm_steal_time.preempted before vmentry. Fixes: 66570e966dd9 (kvm: x86: only provide PV features if enabled in guest's CPUID) Reviewed-by: Sean Christopherson <seanjc@google.com> Cc: stable@vger.kernel.org Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1621339235-11131-3-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: X86: Bail out of direct yield in case of under-committed scenariosWanpeng Li
In case of under-committed scenarios, vCPUs can be scheduled easily; kvm_vcpu_yield_to adds extra overhead, and it is also common to see when vcpu->ready is true but yield later failing due to p->state is TASK_RUNNING. Let's bail out in such scenarios by checking the length of current cpu runqueue, which can be treated as a hint of under-committed instead of guarantee of accuracy. 30%+ of directed-yield attempts can now avoid the expensive lookups in kvm_sched_yield() in an under-committed scenario. Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1621339235-11131-2-git-send-email-wanpengli@tencent.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: PPC: exit halt polling on need_resched()Wanpeng Li
This is inspired by commit 262de4102c7bb8 (kvm: exit halt polling on need_resched() as well). Due to PPC implements an arch specific halt polling logic, we have to the need_resched() check there as well. This patch adds a helper function that can be shared between book3s and generic halt-polling loops. Reviewed-by: David Matlack <dmatlack@google.com> Reviewed-by: Venkatesh Srinivas <venkateshs@chromium.org> Cc: Ben Segall <bsegall@google.com> Cc: Venkatesh Srinivas <venkateshs@chromium.org> Cc: Jim Mattson <jmattson@google.com> Cc: David Matlack <dmatlack@google.com> Cc: Paul Mackerras <paulus@ozlabs.org> Cc: Suraj Jitindar Singh <sjitindarsingh@gmail.com> Signed-off-by: Wanpeng Li <wanpengli@tencent.com> Message-Id: <1621339235-11131-1-git-send-email-wanpengli@tencent.com> [Make the function inline. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-27KVM: arm64: Prevent mixed-width VM creationMarc Zyngier
It looks like we have tolerated creating mixed-width VMs since... forever. However, that was never the intention, and we'd rather not have to support that pointless complexity. Forbid such a setup by making sure all the vcpus have the same register width. Reported-by: Steven Price <steven.price@arm.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Acked-by: Mark Rutland <mark.rutland@arm.com> Link: https://lore.kernel.org/r/20210524170752.1549797-1-maz@kernel.org
2021-05-27KVM: arm64: Resolve all pending PC updates before immediate exitZenghui Yu
Commit 26778aaa134a ("KVM: arm64: Commit pending PC adjustemnts before returning to userspace") fixed the PC updating issue by forcing an explicit synchronisation of the exception state on vcpu exit to userspace. However, we forgot to take into account the case where immediate_exit is set by userspace and KVM_RUN will exit immediately. Fix it by resolving all pending PC updates before returning to userspace. Since __kvm_adjust_pc() relies on a loaded vcpu context, I moved the immediate_exit checking right after vcpu_load(). We will get some overhead if immediate_exit is true (which should hopefully be rare). Fixes: 26778aaa134a ("KVM: arm64: Commit pending PC adjustemnts before returning to userspace") Signed-off-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210526141831.1662-1-yuzenghui@huawei.com Cc: stable@vger.kernel.org # 5.11
2021-05-24KVM: SVM: make the avic parameter a boolPaolo Bonzini
Make it consistent with kvm_intel.enable_apicv. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-24KVM: VMX: Drop unneeded CONFIG_X86_LOCAL_APIC checkVitaly Kuznetsov
CONFIG_X86_LOCAL_APIC is always on when CONFIG_KVM (on x86) since commit e42eef4ba388 ("KVM: add X86_LOCAL_APIC dependency"). Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210518144339.1987982-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com>
2021-05-24KVM: SVM: Drop unneeded CONFIG_X86_LOCAL_APIC checkVitaly Kuznetsov
AVIC dependency on CONFIG_X86_LOCAL_APIC is dead code since commit e42eef4ba388 ("KVM: add X86_LOCAL_APIC dependency"). Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210518144339.1987982-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com>
2021-05-17Merge tag 'kvmarm-fixes-5.13-1' of ↵Paolo Bonzini
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 5.13, take #1 - Fix regression with irqbypass not restarting the guest on failed connect - Fix regression with debug register decoding resulting in overlapping access - Commit exception state on exit to usrspace - Fix the MMU notifier return values - Add missing 'static' qualifiers in the new host stage-2 code
2021-05-16Linux 5.13-rc2v5.13-rc2Linus Torvalds
2021-05-16Merge tag 'driver-core-5.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core fixes from Greg KH: "Here are two driver fixes for driver core changes that happened in 5.13-rc1. The clk driver fix resolves a many-reported issue with booting some devices, and the USB typec fix resolves the reported problem of USB systems on some embedded boards. Both of these have been in linux-next this week with no reported issues" * tag 'driver-core-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: clk: Skip clk provider registration when np is NULL usb: typec: tcpm: Don't block probing of consumers of "connector" nodes
2021-05-16Merge tag 'staging-5.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO driver fixes from Greg KH: "Here are some small IIO driver fixes and one Staging driver fix for 5.13-rc2. Nothing major, just some resolutions for reported problems: - gcc-11 bogus warning fix for rtl8723bs - iio driver tiny fixes All of these have been in linux-next for many days with no reported issues" * tag 'staging-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: iio: tsl2583: Fix division by a zero lux_val iio: core: return ENODEV if ioctl is unknown iio: core: fix ioctl handlers removal iio: gyro: mpu3050: Fix reported temperature value iio: hid-sensors: select IIO_TRIGGERED_BUFFER under HID_SENSOR_IIO_TRIGGER iio: proximity: pulsedlight: Fix rumtime PM imbalance on error iio: light: gp2ap002: Fix rumtime PM imbalance on error staging: rtl8723bs: avoid bogus gcc warning
2021-05-16Merge tag 'usb-5.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some small USB fixes for 5.13-rc2. They consist of a number of resolutions for reported issues: - typec fixes for found problems - xhci fixes and quirk additions - dwc3 driver fixes - minor fixes found by Coverity - cdc-wdm fixes for reported problems All of these have been in linux-next for a few days with no reported issues" * tag 'usb-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (28 commits) usb: core: hub: fix race condition about TRSMRCY of resume usb: typec: tcpm: Fix SINK_DISCOVERY current limit for Rp-default xhci: Add reset resume quirk for AMD xhci controller. usb: xhci: Increase timeout for HC halt xhci: Do not use GFP_KERNEL in (potentially) atomic context xhci: Fix giving back cancelled URBs even if halted endpoint can't reset xhci-pci: Allow host runtime PM as default for Intel Alder Lake xHCI usb: musb: Fix an error message usb: typec: tcpm: Fix wrong handling for Not_Supported in VDM AMS usb: typec: tcpm: Send DISCOVER_IDENTITY from dedicated work usb: typec: ucsi: Retrieve all the PDOs instead of just the first 4 usb: fotg210-hcd: Fix an error message docs: usb: function: Modify path name usb: dwc3: omap: improve extcon initialization usb: typec: ucsi: Put fwnode in any case during ->probe() usb: typec: tcpm: Fix wrong handling in GET_SINK_CAP usb: dwc2: Remove obsolete MODULE_ constants from platform.c usb: dwc3: imx8mp: fix error return code in dwc3_imx8mp_probe() usb: dwc3: imx8mp: detect dwc3 core node via compatible string usb: dwc3: gadget: Return success always for kick transfer in ep queue ...
2021-05-16Merge tag 'timers-urgent-2021-05-16' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fixes from Thomas Gleixner: "Two fixes for timers: - Use the ALARM feature check in the alarmtimer core code insted of the old method of checking for the set_alarm() callback. Drivers can have that callback set but the feature bit cleared. If such a RTC device is selected then alarms wont work. - Use a proper define to let the preprocessor check whether Hyper-V VDSO clocksource should be active. The code used a constant in an enum with #ifdef, which evaluates to always false and disabled the clocksource for VDSO" * tag 'timers-urgent-2021-05-16' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: clocksource/drivers/hyper-v: Re-enable VDSO_CLOCKMODE_HVCLOCK on X86 alarmtimer: Check RTC features instead of ops
2021-05-16Merge tag 'for-linus-5.13b-rc2-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - two patches for error path fixes - a small series for fixing a regression with swiotlb with Xen on Arm * tag 'for-linus-5.13b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/swiotlb: check if the swiotlb has already been initialized arm64: do not set SWIOTLB_NO_FORCE when swiotlb is required xen/arm: move xen_swiotlb_detect to arm/swiotlb-xen.h xen/unpopulated-alloc: fix error return code in fill_list() xen/gntdev: fix gntdev_mmap() error exit path
2021-05-16Merge tag 'x86_urgent_for_v5.13_rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: "The three SEV commits are not really urgent material. But we figured since getting them in now will avoid a huge amount of conflicts between future SEV changes touching tip, the kvm and probably other trees, sending them to you now would be best. The idea is that the tip, kvm etc branches for 5.14 will all base ontop of -rc2 and thus everything will be peachy. What is more, those changes are purely mechanical and defines movement so they should be fine to go now (famous last words). Summary: - Enable -Wundef for the compressed kernel build stage - Reorganize SEV code to streamline and simplify future development" * tag 'x86_urgent_for_v5.13_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot/compressed: Enable -Wundef x86/msr: Rename MSR_K8_SYSCFG to MSR_AMD64_SYSCFG x86/sev: Move GHCB MSR protocol and NAE definitions in a common header x86/sev-es: Rename sev-es.{ch} to sev.{ch}
2021-05-15Merge tag 'powerpc-5.13-3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Michael Ellerman: - Fix a regression in the conversion of the 64-bit BookE interrupt entry to C. - Fix KVM hosts running with the hash MMU since the recent KVM gfn changes. - Fix a deadlock in our paravirt spinlocks when hcall tracing is enabled. - Several fixes for oopses in our runtime code patching for security mitigations. - A couple of minor fixes for the recent conversion of 32-bit interrupt entry/exit to C. - Fix __get_user() causing spurious crashes in sigreturn due to a bad inline asm constraint, spotted with GCC 11. - A fix for the way we track IRQ masking state vs NMI interrupts when using the new scv system call entry path. - A couple more minor fixes. Thanks to Cédric Le Goater, Christian Zigotzky, Christophe Leroy, Naveen N. Rao, Nicholas Piggin Paul Menzel, and Sean Christopherson. * tag 'powerpc-5.13-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: powerpc/64e/interrupt: Fix nvgprs being clobbered powerpc/64s: Make NMI record implicitly soft-masked code as irqs disabled powerpc/64s: Fix stf mitigation patching w/strict RWX & hash powerpc/64s: Fix entry flush patching w/strict RWX & hash powerpc/64s: Fix crashes when toggling entry flush barrier powerpc/64s: Fix crashes when toggling stf barrier KVM: PPC: Book3S HV: Fix kvm_unmap_gfn_range_hv() for Hash MMU powerpc/legacy_serial: Fix UBSAN: array-index-out-of-bounds powerpc/signal: Fix possible build failure with unsafe_copy_fpr_{to/from}_user powerpc/uaccess: Fix __get_user() with CONFIG_CC_HAS_ASM_GOTO_OUTPUT powerpc/pseries: warn if recursing into the hcall tracing code powerpc/pseries: use notrace hcall variant for H_CEDE idle powerpc/pseries: Don't trace hcall tracing wrapper powerpc/pseries: Fix hcall tracing recursion in pv queued spinlocks powerpc/syscall: Calling kuap_save_and_lock() is wrong powerpc/interrupts: Fix kuep_unlock() call
2021-05-15Merge tag 'sched-urgent-2021-05-15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Fix an idle CPU selection bug, and an AMD Ryzen maximum frequency enumeration bug" * tag 'sched-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86, sched: Fix the AMD CPPC maximum performance value on certain AMD Ryzen generations sched/fair: Fix clearing of has_idle_cores flag in select_idle_cpu()
2021-05-15Merge tag 'objtool-urgent-2021-05-15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fixes from Ingo Molnar: "Fix a couple of endianness bugs that crept in" * tag 'objtool-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool/x86: Fix elf_add_alternative() endianness objtool: Fix elf_create_undef_symbol() endianness
2021-05-15Merge tag 'irq-urgent-2021-05-15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fix from Ingo Molnar: "Fix build warning on SH" * tag 'irq-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sh: Remove unused variable
2021-05-15Merge tag 'core-urgent-2021-05-15' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 stack randomization fix from Ingo Molnar: "Fix an assembly constraint that affected LLVM up to version 12" * tag 'core-urgent-2021-05-15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: stack: Replace "o" output with "r" input constraint
2021-05-15Merge branch 'akpm' (patches from Andrew)Linus Torvalds
Merge misc fixes from Andrew Morton: "13 patches. Subsystems affected by this patch series: resource, squashfs, hfsplus, modprobe, and mm (hugetlb, slub, userfaultfd, ksm, pagealloc, kasan, pagemap, and ioremap)" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm/ioremap: fix iomap_max_page_shift docs: admin-guide: update description for kernel.modprobe sysctl hfsplus: prevent corruption in shrinking truncate mm/filemap: fix readahead return types kasan: fix unit tests with CONFIG_UBSAN_LOCAL_BOUNDS enabled mm: fix struct page layout on 32-bit systems ksm: revert "use GET_KSM_PAGE_NOLOCK to get ksm page in remove_rmap_item_from_tree()" userfaultfd: release page in error path to avoid BUG_ON squashfs: fix divide error in calculate_skip() kernel/resource: fix return code check in __request_free_mem_region mm, slub: move slub_debug static key enabling outside slab_mutex mm/hugetlb: fix cow where page writtable in child mm/hugetlb: fix F_SEAL_FUTURE_WRITE
2021-05-15Merge tag 'arc-5.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc Pull ARC fixes from Vineet Gupta: - PAE fixes - syscall num check off-by-one bug - misc fixes * tag 'arc-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc: ARC: mm: Use max_high_pfn as a HIGHMEM zone border ARC: mm: PAE: use 40-bit physical page mask ARC: entry: fix off-by-one error in syscall number validation ARC: kgdb: add 'fallthrough' to prevent a warning arc: Fix typos/spellos
2021-05-15Merge tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull block fixes from Jens Axboe: - Fix for shared tag set exit (Bart) - Correct ioctl range for zoned ioctls (Damien) - Removed dead/unused function (Lin) - Fix perf regression for shared tags (Ming) - Fix out-of-bounds issue with kyber and preemption (Omar) - BFQ merge fix (Paolo) - Two error handling fixes for nbd (Sun) - Fix weight update in blk-iocost (Tejun) - NVMe pull request (Christoph): - correct the check for using the inline bio in nvmet (Chaitanya Kulkarni) - demote unsupported command warnings (Chaitanya Kulkarni) - fix corruption due to double initializing ANA state (me, Hou Pu) - reset ns->file when open fails (Daniel Wagner) - fix a NULL deref when SEND is completed with error in nvmet-rdma (Michal Kalderon) - Fix kernel-doc warning (Bart) * tag 'block-5.13-2021-05-14' of git://git.kernel.dk/linux-block: block/partitions/efi.c: Fix the efi_partition() kernel-doc header blk-mq: Swap two calls in blk_mq_exit_queue() blk-mq: plug request for shared sbitmap nvmet: use new ana_log_size instead the old one nvmet: seset ns->file when open fails nbd: share nbd_put and return by goto put_nbd nbd: Fix NULL pointer in flush_workqueue blkdev.h: remove unused codes blk_account_rq block, bfq: avoid circular stable merges blk-iocost: fix weight updates of inner active iocgs nvmet: demote fabrics cmd parse err msg to debug nvmet: use helper to remove the duplicate code nvmet: demote discovery cmd parse err msg to debug nvmet-rdma: Fix NULL deref when SEND is completed with error nvmet: fix inline bio check for passthru nvmet: fix inline bio check for bdev-ns nvme-multipath: fix double initialization of ANA state kyber: fix out of bounds access when preempted block: uapi: fix comment about block device ioctl
2021-05-15Merge tag 'io_uring-5.13-2021-05-14' of git://git.kernel.dk/linux-blockLinus Torvalds
Pull io_uring fixes from Jens Axboe: "Just a few minor fixes/changes: - Fix issue with double free race for linked timeout completions - Fix reference issue with timeouts - Remove last few places that make SQPOLL special, since it's just an io thread now. - Bump maximum allowed registered buffers, as we don't allocate as much anymore" * tag 'io_uring-5.13-2021-05-14' of git://git.kernel.dk/linux-block: io_uring: increase max number of reg buffers io_uring: further remove sqpoll limits on opcodes io_uring: fix ltout double free on completion race io_uring: fix link timeout refs
2021-05-15Merge tag 'erofs-for-5.13-rc2-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: "This mainly fixes 1 lcluster-sized pclusters for the big pcluster feature, which can be forcely generated by mkfs as a specific on-disk case for per-(sub)file compression strategies but missed to handle in runtime properly. Also, documentation updates are included to fix the broken illustration due to the ReST conversion by accident and complete the big pcluster introduction. Summary: - update documentation to fix the broken illustration due to ReST conversion by accident at that time and complete the big pcluster introduction - fix 1 lcluster-sized pclusters for the big pcluster feature" * tag 'erofs-for-5.13-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: fix 1 lcluster-sized pcluster for big pcluster erofs: update documentation about data compression erofs: fix broken illustration in documentation
2021-05-15Merge tag 'libnvdimm-fixes-5.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull libnvdimm fixes from Dan Williams: "A regression fix for a bootup crash condition introduced in this merge window and some other minor fixups: - Fix regression in ACPI NFIT table handling leading to crashes and driver load failures. - Move the nvdimm mailing list - Miscellaneous minor fixups" * tag 'libnvdimm-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: ACPI: NFIT: Fix support for variable 'SPA' structure size MAINTAINERS: Move nvdimm mailing list tools/testing/nvdimm: Make symbol '__nfit_test_ioremap' static libnvdimm: Remove duplicate struct declaration
2021-05-15Merge tag 'dax-fixes-5.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull dax fixes from Dan Williams: "A fix for a hang condition due to missed wakeups in the filesystem-dax core when exercised by virtiofs. This bug has been there from the beginning, but the condition has not triggered on other filesystems since they hold a lock over invalidation events" * tag 'dax-fixes-5.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: dax: Wake up all waiters after invalidating dax entry dax: Add a wakeup mode parameter to put_unlocked_entry() dax: Add an enum for specifying dax wakup mode
2021-05-15Merge tag 'drm-fixes-2021-05-15' of git://anongit.freedesktop.org/drm/drmLinus Torvalds
Pull more drm fixes from Dave Airlie: "Looks like I wasn't the only one not fully switched on this week. The msm pull has a missing tag so I missed it, and i915 team were a bit late. In my defence I did have a day with the roof of my home office removed, so was sitting at my kids desk. msm: - dsi regression fix - dma-buf pinning fix - displayport fixes - llc fix i915: - Fix active callback alignment annotations and subsequent crashes - Retract link training strategy to slow and wide, again - Avoid division by zero on gen2 - Use correct width reads for C0DRB3/C1DRB3 registers - Fix double free in pdp allocation failure path - Fix HDMI 2.1 PCON downstream caps check" * tag 'drm-fixes-2021-05-15' of git://anongit.freedesktop.org/drm/drm: drm/i915: Use correct downstream caps for check Src-Ctl mode for PCON drm/i915/overlay: Fix active retire callback alignment drm/i915: Fix crash in auto_retire drm/i915/gt: Fix a double free in gen8_preallocate_top_level_pdp drm/i915: Read C0DRB3/C1DRB3 as 16 bits again drm/i915: Avoid div-by-zero on gen2 drm/i915/dp: Use slow and wide link training for everything drm/msm/dp: initialize audio_comp when audio starts drm/msm/dp: check sink_count before update is_connected status drm/msm: fix minor version to indicate MSM_PARAM_SUSPENDS support drm/msm/dsi: fix msm_dsi_phy_get_clk_provider return code drm/msm/dsi: dsi_phy_28nm_8960: fix uninitialized variable access drm/msm: fix LLC not being enabled for mmu500 targets drm/msm: Do not unpin/evict exported dma-buf's
2021-05-15tty: vt: always invoke vc->vc_sw->con_resize callbackTetsuo Handa
syzbot is reporting OOB write at vga16fb_imageblit() [1], for resize_screen() from ioctl(VT_RESIZE) returns 0 without checking whether requested rows/columns fit the amount of memory reserved for the graphical screen if current mode is KD_GRAPHICS. ---------- #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <sys/ioctl.h> #include <linux/kd.h> #include <linux/vt.h> int main(int argc, char *argv[]) { const int fd = open("/dev/char/4:1", O_RDWR); struct vt_sizes vt = { 0x4100, 2 }; ioctl(fd, KDSETMODE, KD_GRAPHICS); ioctl(fd, VT_RESIZE, &vt); ioctl(fd, KDSETMODE, KD_TEXT); return 0; } ---------- Allow framebuffer drivers to return -EINVAL, by moving vc->vc_mode != KD_GRAPHICS check from resize_screen() to fbcon_resize(). Link: https://syzkaller.appspot.com/bug?extid=1f29e126cf461c4de3b3 [1] Reported-by: syzbot <syzbot+1f29e126cf461c4de3b3@syzkaller.appspotmail.com> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: syzbot <syzbot+1f29e126cf461c4de3b3@syzkaller.appspotmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-15KVM: arm64: Fix debug register indexingMarc Zyngier
Commit 03fdfb2690099 ("KVM: arm64: Don't write junk to sysregs on reset") flipped the register number to 0 for all the debug registers in the sysreg table, hereby indicating that these registers live in a separate shadow structure. However, the author of this patch failed to realise that all the accessors are using that particular index instead of the register encoding, resulting in all the registers hitting index 0. Not quite a valid implementation of the architecture... Address the issue by fixing all the accessors to use the CRm field of the encoding, which contains the debug register index. Fixes: 03fdfb2690099 ("KVM: arm64: Don't write junk to sysregs on reset") Reported-by: Ricardo Koller <ricarkol@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org
2021-05-15KVM: arm64: Commit pending PC adjustemnts before returning to userspaceMarc Zyngier
KVM currently updates PC (and the corresponding exception state) using a two phase approach: first by setting a set of flags, then by converting these flags into a state update when the vcpu is about to enter the guest. However, this creates a disconnect with userspace if the vcpu thread returns there with any exception/PC flag set. In this case, the exposed context is wrong, as userspace doesn't have access to these flags (they aren't architectural). It also means that these flags are preserved across a reset, which isn't expected. To solve this problem, force an explicit synchronisation of the exception state on vcpu exit to userspace. As an optimisation for nVHE systems, only perform this when there is something pending. Reported-by: Zenghui Yu <yuzenghui@huawei.com> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Tested-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org # 5.11
2021-05-15KVM: arm64: Move __adjust_pc out of lineMarc Zyngier
In order to make it easy to call __adjust_pc() from the EL1 code (in the case of nVHE), rename it to __kvm_adjust_pc() and move it out of line. No expected functional change. Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Tested-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org # 5.11
2021-05-15KVM: arm64: Mark the host stage-2 memory pools staticQuentin Perret
The host stage-2 memory pools are not used outside of mem_protect.c, mark them static. Fixes: 1025c8c0c6ac ("KVM: arm64: Wrap the host with a stage 2") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210514085640.3917886-3-qperret@google.com
2021-05-15KVM: arm64: Mark pkvm_pgtable_mm_ops staticQuentin Perret
It is not used outside of setup.c, mark it static. Fixes:f320bc742bc2 ("KVM: arm64: Prepare the creation of s1 mappings at EL2") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Quentin Perret <qperret@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20210514085640.3917886-2-qperret@google.com