linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2022-11-16	KVM: selftests: Add arch specific post vm creation hook	Vishal Annapurve
	Add arch specific API kvm_arch_vm_post_create to perform any required setup after VM creation. Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Reviewed-by: Peter Gonda <pgonda@google.com> Signed-off-by: Vishal Annapurve <vannapurve@google.com> Link: https://lore.kernel.org/r/20221115213845.3348210-4-vannapurve@google.com [sean: place x86's implementation by vm_arch_vcpu_add()] Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: Add arch specific initialization	Vishal Annapurve
	Introduce arch specific API: kvm_selftest_arch_init to allow each arch to handle initialization before running any selftest logic. Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Reviewed-by: Peter Gonda <pgonda@google.com> Signed-off-by: Vishal Annapurve <vannapurve@google.com> Link: https://lore.kernel.org/r/20221115213845.3348210-3-vannapurve@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: move common startup logic to kvm_util.c	Vishal Annapurve
	Consolidate common startup logic in one place by implementing a single setup function with __attribute((constructor)) for all selftests within kvm_util.c. This allows moving logic like: /* Tell stdout not to buffer its content */ setbuf(stdout, NULL); to a single file for all selftests. This will also allow any required setup at entry in future to be done in common main function. Link: https://lore.kernel.org/lkml/Ywa9T+jKUpaHLu%2Fl@google.com Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Reviewed-by: Peter Gonda <pgonda@google.com> Signed-off-by: Vishal Annapurve <vannapurve@google.com> Link: https://lore.kernel.org/r/20221115213845.3348210-2-vannapurve@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: Play nice with huge pages when getting PTEs/GPAs	Sean Christopherson
	Play nice with huge pages when getting PTEs and translating GVAs to GPAs, there's no reason to disallow using huge pages in selftests. Use PG_LEVEL_NONE to indicate that the caller doesn't care about the mapping level and just wants to get the pte+level. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-8-seanjc@google.com
2022-11-16	KVM: selftests: Use vm_get_page_table_entry() in addr_arch_gva2gpa()	Sean Christopherson
	Use vm_get_page_table_entry() in addr_arch_gva2gpa() to get the leaf PTE instead of manually walking page tables. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-7-seanjc@google.com
2022-11-16	KVM: selftests: Use virt_get_pte() when getting PTE pointer	Sean Christopherson
	Use virt_get_pte() in vm_get_page_table_entry() instead of open coding equivalent code. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-6-seanjc@google.com
2022-11-16	KVM: selftests: Verify parent PTE is PRESENT when getting child PTE	Sean Christopherson
	Verify the parent PTE is PRESENT when getting a child via virt_get_pte() so that the helper can be used for getting PTEs/GPAs without losing sanity checks that the walker isn't wandering into the weeds. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-5-seanjc@google.com
2022-11-16	KVM: selftests: Remove useless shifts when creating guest page tables	Sean Christopherson
	Remove the pointless shift from GPA=>GFN and immediately back to GFN=>GPA when creating guest page tables. Ignore the other walkers that have a similar pattern for the moment, they will be converted to use virt_get_pte() in the near future. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-4-seanjc@google.com
2022-11-16	KVM: selftests: Drop reserved bit checks from PTE accessor	Sean Christopherson
	Drop the reserved bit checks from the helper to retrieve a PTE, there's very little value in sanity checking the constructed page tables as any will quickly be noticed in the form of an unexpected #PF. The checks also place unnecessary restrictions on the usage of the helpers, e.g. if a test _wanted_ to set reserved bits for whatever reason. Removing the NX check in particular allows for the removal of the @vcpu param, which will in turn allow the helper to be reused nearly verbatim for addr_gva2gpa(). Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-3-seanjc@google.com
2022-11-16	KVM: selftests: Drop helpers to read/write page table entries	Sean Christopherson
	Drop vm_{g,s}et_page_table_entry() and instead expose the "inner" helper (was _vm_get_page_table_entry()) that returns a _pointer_ to the PTE, i.e. let tests directly modify PTEs instead of bouncing through helpers that just make life difficult. Opportunsitically use BIT_ULL() in emulator_error_test, and use the MAXPHYADDR define to set the "rogue" GPA bit instead of open coding the same value. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-2-seanjc@google.com
2022-11-16	KVM: selftests: Fix spelling mistake "begining" -> "beginning"	Colin Ian King
	There is a spelling mistake in an assert message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Jim Mattson <jmattson@google.com> Link: https://lore.kernel.org/r/20220928213458.64089-1-colin.i.king@gmail.com [sean: fix an ironic typo in the changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: Add ucall pool based implementation	Peter Gonda
	To play nice with guests whose stack memory is encrypted, e.g. AMD SEV, introduce a new "ucall pool" implementation that passes the ucall struct via dedicated memory (which can be mapped shared, a.k.a. as plain text). Because not all architectures have access to the vCPU index in the guest, use a bitmap with atomic accesses to track which entries in the pool are free/used. A list+lock could also work in theory, but synchronizing the individual pointers to the guest would be a mess. Note, there's no need to rewalk the bitmap to ensure success. If all vCPUs are simply allocating, success is guaranteed because there are enough entries for all vCPUs. If one or more vCPUs are freeing and then reallocating, success is guaranteed because vCPUs _always_ walk the bitmap from 0=>N; if vCPU frees an entry and then wins a race to re-allocate, then either it will consume the entry it just freed (bit is the first free bit), or the losing vCPU is guaranteed to see the freed bit (winner consumes an earlier bit, which the loser hasn't yet visited). Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Signed-off-by: Peter Gonda <pgonda@google.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-8-seanjc@google.com
2022-11-16	KVM: selftests: Drop now-unnecessary ucall_uninit()	Sean Christopherson
	Drop ucall_uninit() and ucall_arch_uninit() now that ARM doesn't modify the host's copy of ucall_exit_mmio_addr, i.e. now that there's no need to reset the pointer before potentially creating a new VM. The few calls to ucall_uninit() are all immediately followed by kvm_vm_free(), and that is likely always going to hold true, i.e. it's extremely unlikely a test will want to effectively disable ucall in the middle of a test. Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Tested-by: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-7-seanjc@google.com
2022-11-16	KVM: selftests: Make arm64's MMIO ucall multi-VM friendly	Sean Christopherson
	Fix a mostly-theoretical bug where ARM's ucall MMIO setup could result in different VMs stomping on each other by cloberring the global pointer. Fix the most obvious issue by saving the MMIO gpa into the VM. A more subtle bug is that creating VMs in parallel (on multiple tasks) could result in a VM using the wrong address. Synchronizing a global to a guest effectively snapshots the value on a per-VM basis, i.e. the "global" is already prepped to work with multiple VMs, but setting the global in the host is not thread-safe. To fix that bug, add write_guest_global() to allow stuffing a VM's copy of a "global" without modifying the host value. Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Tested-by: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-6-seanjc@google.com
2022-11-16	tools: Add atomic_test_and_set_bit()	Peter Gonda
	Add x86 and generic implementations of atomic_test_and_set_bit() to allow KVM selftests to atomically manage bitmaps. Note, the generic version is taken from arch_test_and_set_bit() as of commit 415d83249709 ("locking/atomic: Make test_and_*_bit() ordered on failure"). Signed-off-by: Peter Gonda <pgonda@google.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-5-seanjc@google.com
2022-11-16	KVM: selftests: Automatically do init_ucall() for non-barebones VMs	Sean Christopherson
	Do init_ucall() automatically during VM creation to kill two (three?) birds with one stone. First, initializing ucall immediately after VM creations allows forcing aarch64's MMIO ucall address to immediately follow memslot0. This is still somewhat fragile as tests could clobber the MMIO address with a new memslot, but it's safe-ish since tests have to be conversative when accounting for memslot0. And this can be hardened in the future by creating a read-only memslot for the MMIO page (KVM ARM exits with MMIO if the guest writes to a read-only memslot). Add a TODO to document that selftests can and should use a memslot for the ucall MMIO (doing so requires yet more rework because tests assumes thay can use all memslots except memslot0). Second, initializing ucall for all VMs prepares for making ucall initialization meaningful on all architectures. aarch64 is currently the only arch that needs to do any setup, but that will change in the future by switching to a pool-based implementation (instead of the current stack-based approach). Lastly, defining the ucall MMIO address from common code will simplify switching all architectures (except s390) to a common MMIO-based ucall implementation (if there's ever sufficient motivation to do so). Cc: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Tested-by: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-4-seanjc@google.com
2022-11-16	KVM: selftests: Consolidate boilerplate code in get_ucall()	Sean Christopherson
	Consolidate the actual copying of a ucall struct from guest=>host into the common get_ucall(). Return a host virtual address instead of a guest virtual address even though the addr_gva2hva() part could be moved to get_ucall() too. Conceptually, get_ucall() is invoked from the host and should return a host virtual address (and returning NULL for "nothing to see here" is far superior to returning 0). Use pointer shenanigans instead of an unnecessary bounce buffer when the caller of get_ucall() provides a valid pointer. Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Tested-by: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-3-seanjc@google.com
2022-11-16	KVM: selftests: Consolidate common code for populating ucall struct	Sean Christopherson
	Make ucall() a common helper that populates struct ucall, and only calls into arch code to make the actually call out to userspace. Rename all arch-specific helpers to make it clear they're arch-specific, and to avoid collisions with common helpers (one more on its way...) Add WRITE_ONCE() to stores in ucall() code (as already done to aarch64 code in commit 9e2f6498efbb ("selftests: KVM: Handle compiler optimizations in ucall")) to prevent clang optimizations breaking ucalls. Cc: Colton Lewis <coltonlewis@google.com> Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Tested-by: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-2-seanjc@google.com
2022-11-16	KVM: arm64: selftests: Disable single-step without relying on ucall()	Sean Christopherson
	Automatically disable single-step when the guest reaches the end of the verified section instead of using an explicit ucall() to ask userspace to disable single-step. An upcoming change to implement a pool-based scheme for ucall() will add an atomic operation (bit test and set) in the guest ucall code, and if the compiler generate "old school" atomics, e.g. 40e57c: c85f7c20 ldxr x0, [x1] 40e580: aa100011 orr x17, x0, x16 40e584: c80ffc31 stlxr w15, x17, [x1] 40e588: 35ffffaf cbnz w15, 40e57c <__aarch64_ldset8_sync+0x1c> the guest will hang as the local exclusive monitor is reset by eret, i.e. the stlxr will always fail due to the debug exception taken to EL2. Link: https://lore.kernel.org/all/20221006003409.649993-8-seanjc@google.com Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221117002350.2178351-3-seanjc@google.com Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
2022-11-16	KVM: arm64: selftests: Disable single-step with correct KVM define	Sean Christopherson
	Disable single-step by setting debug.control to KVM_GUESTDBG_ENABLE, not to SINGLE_STEP_DISABLE. The latter is an arbitrary test enum that just happens to have the same value as KVM_GUESTDBG_ENABLE, and so effectively disables single-step debug. No functional change intended. Cc: Reiji Watanabe <reijiw@google.com> Fixes: b18e4d4aebdd ("KVM: arm64: selftests: Add a test case for KVM_GUESTDBG_SINGLESTEP") Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221117002350.2178351-2-seanjc@google.com Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
2022-11-16	KVM: selftests: Rename perf_test_util symbols to memstress	David Matlack
	Replace the perf_test_ prefix on symbol names with memstress_ to match the new file name. "memstress" better describes the functionality proveded by this library, which is to provide functionality for creating and running a VM that stresses VM memory by reading and writing to guest memory on all vCPUs in parallel. "memstress" also contains the same number of chracters as "perf_test", making it a drop-in replacement in symbols, e.g. function names, without impacting line lengths. Also the lack of underscore between "mem" and "stress" makes it clear "memstress" is a noun. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221012165729.3505266-4-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: Rename pta (short for perf_test_args) to args	David Matlack
	Rename the local variables "pta" (which is short for perf_test_args) for args. "pta" is not an obvious acronym and using "args" mirrors "vcpu_args". Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221012165729.3505266-3-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: Rename perf_test_util.[ch] to memstress.[ch]	David Matlack
	Rename the perf_test_util.[ch] files to memstress.[ch]. Symbols are renamed in the following commit to reduce the amount of churn here in hopes of playiing nice with git's file rename detection. The name "memstress" was chosen to better describe the functionality proveded by this library, which is to create and run a VM that reads/writes to guest memory on all vCPUs in parallel. "memstress" also contains the same number of chracters as "perf_test", making it a drop-in replacement in symbols, e.g. function names, without impacting line lengths. Also the lack of underscore between "mem" and "stress" makes it clear "memstress" is a noun. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221012165729.3505266-2-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: randomize page access order	Colton Lewis
	Create the ability to randomize page access order with the -a argument. This includes the possibility that the same pages may be hit multiple times during an iteration or not at all. Population has random access as false to ensure all pages will be touched by population and avoid page faults in late dirty memory that would pollute the test results. Signed-off-by: Colton Lewis <coltonlewis@google.com> Reviewed-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20221107182208.479157-5-coltonlewis@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: randomize which pages are written vs read	Colton Lewis
	Randomize which pages are written vs read using the random number generator. Change the variable wr_fract and associated function calls to write_percent that now operates as a percentage from 0 to 100 where X means each page has an X% chance of being written. Change the -f argument to -w to reflect the new variable semantics. Keep the same default of 100% writes. Population always uses 100% writes to ensure all memory is actually populated and not just mapped to the zero page. The prevents expensive copy-on-write faults from occurring during the dirty memory iterations below, which would pollute the performance results. Each vCPU calculates its own random seed by adding its index to the seed provided. Signed-off-by: Colton Lewis <coltonlewis@google.com> Reviewed-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20221107182208.479157-4-coltonlewis@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: create -r argument to specify random seed	Colton Lewis
	Create a -r argument to specify a random seed. If no argument is provided, the seed defaults to 1. The random seed is set with perf_test_set_random_seed() and must be set before guest_code runs to apply. Signed-off-by: Colton Lewis <coltonlewis@google.com> Reviewed-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20221107182208.479157-3-coltonlewis@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: implement random number generator for guest code	Colton Lewis
	Implement random number generator for guest code to randomize parts of the test, making it less predictable and a more accurate reflection of reality. The random number generator chosen is the Park-Miller Linear Congruential Generator, a fancy name for a basic and well-understood random number generator entirely sufficient for this purpose. Signed-off-by: Colton Lewis <coltonlewis@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20221107182208.479157-2-coltonlewis@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: Allowing running dirty_log_perf_test on specific CPUs	Vipin Sharma
	Add a command line option, -c, to pin vCPUs to physical CPUs (pCPUs), i.e. to force vCPUs to run on specific pCPUs. Requirement to implement this feature came in discussion on the patch "Make page tables for eager page splitting NUMA aware" https://lore.kernel.org/lkml/YuhPT2drgqL+osLl@google.com/ This feature is useful as it provides a way to analyze performance based on the vCPUs and dirty log worker locations, like on the different NUMA nodes or on the same NUMA nodes. To keep things simple, implementation is intentionally very limited, either all of the vCPUs will be pinned followed by an optional main thread or nothing will be pinned. Signed-off-by: Vipin Sharma <vipinsh@google.com> Suggested-by: David Matlack <dmatlack@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221103191719.1559407-8-vipinsh@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: Add atoi_positive() and atoi_non_negative() for input validation	Vipin Sharma
	Many KVM selftests take command line arguments which are supposed to be positive (>0) or non-negative (>=0). Some tests do these validation and some missed adding the check. Add atoi_positive() and atoi_non_negative() to validate inputs in selftests before proceeding to use those values. Signed-off-by: Vipin Sharma <vipinsh@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221103191719.1559407-7-vipinsh@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: Shorten the test args in memslot_modification_stress_test.c	Vipin Sharma
	Change test args memslot_modification_delay and nr_memslot_modifications to delay and nr_iterations for simplicity. Signed-off-by: Vipin Sharma <vipinsh@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221103191719.1559407-6-vipinsh@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: Use SZ_* macros from sizes.h in max_guest_memory_test.c	Vipin Sharma
	Replace size_1gb defined in max_guest_memory_test.c with the SZ_1G, SZ_2G and SZ_4G from linux/sizes.h header file. Signed-off-by: Vipin Sharma <vipinsh@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221103191719.1559407-5-vipinsh@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: Add atoi_paranoid() to catch errors missed by atoi()	Vipin Sharma
	atoi() doesn't detect errors. There is no way to know that a 0 return is correct conversion or due to an error. Introduce atoi_paranoid() to detect errors and provide correct conversion. Replace all atoi() calls with atoi_paranoid(). Signed-off-by: Vipin Sharma <vipinsh@google.com> Suggested-by: David Matlack <dmatlack@google.com> Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221103191719.1559407-4-vipinsh@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: Put command line options in alphabetical order in ↵	Vipin Sharma
	dirty_log_perf_test There are 13 command line options and they are not in any order. Put them in alphabetical order to make it easy to add new options. No functional change intended. Signed-off-by: Vipin Sharma <vipinsh@google.com> Reviewed-by: Wei Wang <wei.w.wang@intel.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221103191719.1559407-3-vipinsh@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-16	KVM: selftests: Add missing break between -e and -g option in ↵	Vipin Sharma
	dirty_log_perf_test Passing -e option (Run VCPUs while dirty logging is being disabled) in dirty_log_perf_test also unintentionally enables -g (Do not enable KVM_CAP_MANUAL_DIRTY_LOG_PROTECT2). Add break between two switch case logic. Fixes: cfe12e64b065 ("KVM: selftests: Add an option to run vCPUs while disabling dirty logging") Signed-off-by: Vipin Sharma <vipinsh@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221103191719.1559407-2-vipinsh@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2022-11-09	selftests: kvm/x86: Test the flags in MSR filtering and MSR exiting	Aaron Lewis
	When using the flags in KVM_X86_SET_MSR_FILTER and KVM_CAP_X86_USER_SPACE_MSR it is expected that an attempt to write to any of the unused bits will fail. Add testing to walk over every bit in each of the flag fields in MSR filtering and MSR exiting to verify that unused bits return and error and used bits, i.e. valid bits, succeed. Signed-off-by: Aaron Lewis <aaronlewis@google.com> Message-Id: <20220921151525.904162-6-aaronlewis@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-09	KVM: allow compiling out SMM support	Paolo Bonzini
	Some users of KVM implement the UEFI variable store through a paravirtual device that does not require the "SMM lockbox" component of edk2; allow them to compile out system management mode, which is not a full implementation especially in how it interacts with nested virtualization. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20220929172016.319443-6-pbonzini@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-09	Merge tag 'kvm-s390-master-6.1-1' of ↵	Paolo Bonzini
	https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD A PCI allocation fix and a PV clock fix.
2022-11-09	tools/kvm_stat: update exit reasons for vmx/svm/aarch64/userspace	Rong Tao
	Update EXIT_REASONS from source, including VMX_EXIT_REASONS, SVM_EXIT_REASONS, AARCH64_EXIT_REASONS, USERSPACE_EXIT_REASONS. Signed-off-by: Rong Tao <rongtao@cestc.cn> Message-Id: <tencent_00082C8BFA925A65E11570F417F1CD404505@qq.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-09	tools/kvm_stat: fix incorrect detection of debugfs	Matthias Gerstner
	The first field in /proc/mounts can be influenced by unprivileged users through the widespread `fusermount` setuid-root program. Example: ``` user$ mkdir ~/mydebugfs user$ export _FUSE_COMMFD=0 user$ fusermount ~/mydebugfs -ononempty,fsname=debugfs user$ grep debugfs /proc/mounts debugfs /home/user/mydebugfs fuse rw,nosuid,nodev,relatime,user_id=1000,group_id=100 0 0 ``` If there is no debugfs already mounted in the system then this can be used by unprivileged users to trick kvm_stat into using a user controlled file system location for obtaining KVM statistics. Even though the root user is not allowed to access non-root FUSE mounts for security reasons, the unprivileged user can unmount the FUSE mount before kvm_stat uses the mounted path. If it wins the race, kvm_stat will read from the location where the FUSE mount resided. Note that the files in debugfs are only opened for reading, so the attacker can cause very large data to be read in by kvm_stat, or fake data to be processed, but there should be no viable way to turn this into a privilege escalation. The fix is simply to use the file system type field instead. Whitespace in the mount path is escaped in /proc/mounts thus no further safety measures in the parsing should be necessary to make this correct. Message-Id: <20221103135927.13656-1-matthias.gerstner@suse.de> Signed-off-by: Matthias Gerstner <matthias.gerstner@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-11-06	Merge tag 'cxl-fixes-for-6.1-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl fixes from Dan Williams: "Several fixes for CXL region creation crashes, leaks and failures. This is mainly fallout from the original implementation of dynamic CXL region creation (instantiate new physical memory pools) that arrived in v6.0-rc1. Given the theme of "failures in the presence of pass-through decoders" this also includes new regression test infrastructure for that case. Summary: - Fix region creation crash with pass-through decoders - Fix region creation crash when no decoder allocation fails - Fix region creation crash when scanning regions to enforce the increasing physical address order constraint that CXL mandates - Fix a memory leak for cxl_pmem_region objects, track 1:N instead of 1:1 memory-device-to-region associations. - Fix a memory leak for cxl_region objects when regions with active targets are deleted - Fix assignment of NUMA nodes to CXL regions by CFMWS (CXL Window) emulated proximity domains. - Fix region creation failure for switch attached devices downstream of a single-port host-bridge - Fix false positive memory leak of cxl_region objects by recycling recently used region ids rather than freeing them - Add regression test infrastructure for a pass-through decoder configuration - Fix some mailbox payload handling corner cases" * tag 'cxl-fixes-for-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/region: Recycle region ids cxl/region: Fix 'distance' calculation with passthrough ports tools/testing/cxl: Add a single-port host-bridge regression config tools/testing/cxl: Fix some error exits cxl/pmem: Fix cxl_pmem_region and cxl_memdev leak cxl/region: Fix cxl_region leak, cleanup targets at region delete cxl/region: Fix region HPA ordering validation cxl/pmem: Use size_add() against integer overflow cxl/region: Fix decoder allocation crash ACPI: NUMA: Add CXL CFMWS 'nodes' to the possible nodes set cxl/pmem: Fix failure to account for 8 byte header for writes to the device LSA. cxl/region: Fix null pointer dereference due to pass through decoder commit cxl/mbox: Add a check on input payload size
2022-11-04	tools/testing/cxl: Add a single-port host-bridge regression config	Dan Williams
	Jonathan reports that region creation fails when a single-port host-bridge connects to a multi-port switch. Mock up that configuration so a fix can be tested and regression tested going forward. Reported-by: Bobo WL <lmw.bobo@gmail.com> Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: http://lore.kernel.org/r/20221010172057.00001559@huawei.com Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/166752184838.947915.2167957540894293891.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04	tools/testing/cxl: Fix some error exits	Dan Williams
	Fix a few typos where 'goto err_port' was used rather than the object specific cleanup. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Vishal Verma <vishal.l.verma@intel.com> Link: https://lore.kernel.org/r/166752184255.947915.16163477849330181425.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-11-04	Merge tag 'landlock-6.1-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock fix from Mickaël Salaün: "Fix the test build for some distros" * tag 'landlock-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Build without static libraries
2022-11-03	Merge tag 'linux-kselftest-fixes-6.1-rc4' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull Kselftest fixes from Shuah Khan: "Fixes to the pidfd test" * tag 'linux-kselftest-fixes-6.1-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests/pidfd_test: Remove the erroneous ',' selftests: pidfd: Fix compling warnings ksefltests: pidfd: Fix wait_states: Test terminated by timeout
2022-11-02	selftests/pidfd_test: Remove the erroneous ','	Zhao Gongyi
	Remove the erroneous ',', otherwise it might result in wrong output and report: ... Bail out! (errno %d) test: Unexpected epoll_wait result (c=4208480, events=2) ... Fixes: 740378dc7834 ("pidfd: add polling selftests") Signed-off-by: Zhao Gongyi <zhaogongyi@huawei.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-11-01	Merge tag 'nolibc-urgent.2022.10.28a' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull nolibc fixes from Paul McKenney: "This contains a couple of fixes for string-function bugs" * tag 'nolibc-urgent.2022.10.28a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: tools/nolibc/string: Fix memcmp() implementation tools/nolibc: Fix missing strlen() definition and infinite loop with gcc-12
2022-11-01	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds
	Pull kvm fixes from Paolo Bonzini: "x86: - fix lock initialization race in gfn-to-pfn cache (+selftests) - fix two refcounting errors - emulator fixes - mask off reserved bits in CPUID - fix bug with disabling SGX RISC-V: - update MAINTAINERS" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86/xen: Fix eventfd error handling in kvm_xen_eventfd_assign() KVM: x86: smm: number of GPRs in the SMRAM image depends on the image format KVM: x86: emulator: update the emulation mode after CR0 write KVM: x86: emulator: update the emulation mode after rsm KVM: x86: emulator: introduce emulator_recalc_and_set_mode KVM: x86: emulator: em_sysexit should update ctxt->mode KVM: selftests: Mark "guest_saw_irq" as volatile in xen_shinfo_test KVM: selftests: Add tests in xen_shinfo_test to detect lock races KVM: Reject attempts to consume or refresh inactive gfn_to_pfn_cache KVM: Initialize gfn_to_pfn_cache locks in dedicated helper KVM: VMX: fully disable SGX if SECONDARY_EXEC_ENCLS_EXITING unavailable KVM: x86: Exempt pending triple fault from event injection sanity check MAINTAINERS: git://github -> https://github.com for kvm-riscv KVM: debugfs: Return retval of simple_attr_open() if it fails KVM: x86: Reduce refcount if single_open() fails in kvm_mmu_rmaps_stat_open() KVM: x86: Mask off reserved bits in CPUID.8000001FH KVM: x86: Mask off reserved bits in CPUID.8000001AH KVM: x86: Mask off reserved bits in CPUID.80000008H KVM: x86: Mask off reserved bits in CPUID.80000006H KVM: x86: Mask off reserved bits in CPUID.80000001H
2022-10-30	Merge tag 'char-misc-6.1-rc3' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc fixes from Greg KH: "Some small driver fixes for 6.1-rc3. They include: - iio driver bugfixes - counter driver bugfixes - coresight bugfixes, including a revert and then a second fix to get it right. All of these have been in linux-next with no reported problems" * tag 'char-misc-6.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (21 commits) misc: sgi-gru: use explicitly signed char coresight: cti: Fix hang in cti_disable_hw() Revert "coresight: cti: Fix hang in cti_disable_hw()" counter: 104-quad-8: Fix race getting function mode and direction counter: microchip-tcb-capture: Handle Signal1 read and Synapse coresight: cti: Fix hang in cti_disable_hw() coresight: Fix possible deadlock with lock dependency counter: ti-ecap-capture: fix IS_ERR() vs NULL check counter: Reduce DEFINE_COUNTER_ARRAY_POLARITY() to defining counter_array iio: bmc150-accel-core: Fix unsafe buffer attributes iio: adxl367: Fix unsafe buffer attributes iio: adxl372: Fix unsafe buffer attributes iio: at91-sama5d2_adc: Fix unsafe buffer attributes iio: temperature: ltc2983: allocate iio channels once tools: iio: iio_utils: fix digit calculation iio: adc: stm32-adc: fix channel sampling time init iio: adc: mcp3911: mask out device ID in debug prints iio: adc: mcp3911: use correct id bits iio: adc: mcp3911: return proper error code on failure to allocate trigger iio: adc: mcp3911: fix sizeof() vs ARRAY_SIZE() bug ...
2022-10-30	selftests: pidfd: Fix compling warnings	Li Zhijian
	Fix warnings and enable Wall. pidfd_wait.c: In function ‘wait_nonblock’: pidfd_wait.c:150:13: warning: unused variable ‘status’ [-Wunused-variable] 150 \| int pidfd, status = 0; \| ^~~~~~ ... pidfd_test.c: In function ‘child_poll_exec_test’: pidfd_test.c:438:1: warning: no return statement in function returning non-void [-Wreturn-type] 438 \| } \| ^ Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> v2: fix mistake assignment to pidfd Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-10-30	ksefltests: pidfd: Fix wait_states: Test terminated by timeout	Li Zhijian
	0Day/LKP observed that the kselftest blocks forever since one of the pidfd_wait doesn't terminate in 1 of 30 runs. After digging into the source, we found that it blocks at: ASSERT_EQ(sys_waitid(P_PIDFD, pidfd, &info, WCONTINUED, NULL), 0); wait_states has below testing flow: CHILD PARENT ---------------+-------------- 1 STOP itself 2 WAIT for CHILD STOPPED 3 SIGNAL CHILD to CONT 4 CONT 5 STOP itself 5' WAIT for CHILD CONT 6 WAIT for CHILD STOPPED The problem is that the kernel cannot ensure the order of 5 and 5', once 5 goes first, the test will fail. we can reproduce it by: $ while true; do make run_tests -C pidfd; done Introduce a blocking read in child process to make sure the parent can check its WCONTINUED. CC: Philip Li <philip.li@intel.com> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>