summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2023-08-31drm/amd/display: Roll back unit correctionOvidiu Bunea
[why] This Unit correction exposes a Replay corruption. [how] This reverts commit: commit dbd29029c7b5 ("drm/amd/display: Correct unit conversion for vstartup") Roll back unit conversion until Replay can fix their corruption. Fixes: dbd29029c7b5 ("drm/amd/display: Correct unit conversion for vstartup") Reviewed-by: Reza Amini <reza.amini@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31drm/amdgpu: Enable ras for mp0 v13_0_6 sriovYiPeng Chai
Enable ras for mp0 v13_0_6 sriov Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31drm/amdkfd: retry after EBUSY is returned from hmm_ranges_get_pagesAlex Sierra
if hmm_range_get_pages returns EBUSY error during svm_range_validate_and_map, within the context of a page fault interrupt. This should retry through svm_range_restore_pages callback. Therefore we treat this as EAGAIN error instead, and defer it to restore pages fallback. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31drm/amdgpu/jpeg - skip change of power-gating state for sriovSamir Dhume
Powergating is handled in the host driver. Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Samir Dhume <samir.dhume@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31drm/amd/pm: Add critical temp for GC v9.4.3Asad Kamal
Add critical temperature message support func for smu v13.0.6 and expose critical temperature as part of hw mon attributes for GC v9.4.3 v2: Added comment for pmfw version requirement & move the check to get_thermal_temperature_range function Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31drm/amd/pm: Update SMUv13.0.6 PMFW headersAsad Kamal
Update PMFW interface headers for updated metrics table and critical temperature message Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31drm/amdgpu: update gc_info v2_1 from discoveryLe Ma
Several new fields are exposed in gc_info v2_1 Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31drm/amdgpu: update mall info v2 from discoveryLe Ma
Mall info v2 is introduced in ip discovery Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31drm/amdgpu: Only support RAS EEPROM on dGPU platformCandice Li
RAS EEPROM device is only supported on dGPU platform for smu v13_0_6. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31Documentation/gpu: Update amdgpu documentationLijo Lazar
7957ec80ef97 ("drm/amdgpu: Add FRU sysfs nodes only if needed") moved the documentation for some of the sysfs nodes to amdgpu_fru_eeprom.c. Update the documentation accordingly. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31Merge tag 'v6.6-vfs.super.fixes.2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull more superblock follow-on fixes from Christian Brauner: "This contains two more small follow-up fixes for the super work this cycle. I went through all filesystems once more and detected two minor issues that still needed fixing: - Some filesystems support mtd devices (e.g., mount -t jffs2 mtd2 /mnt). The mtd infrastructure uses the sb->s_mtd pointer to find an existing superblock. When the mtd device is put and sb->s_mtd cleared the superblock can still be found fs_supers and so this risks a use-after-free. Add a small patch that aligns mtd with what we did for regular block devices and switch keying to rely on sb->s_dev. (This was tested with mtd devices and jffs2 as xfstests doesn't support mtd devices.) - Switch nfs back to rely on kill_anon_super() so the superblock is removed from the list of active supers before sb->s_fs_info is freed" * tag 'v6.6-vfs.super.fixes.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: NFS: switch back to using kill_anon_super mtd: key superblock by device number fs: export sget_dev()
2023-08-31drm/amdgpu/pm: Add notification for no DC supportBokun Zhang
- There is a DPM issue where if DC is not present, FCLK will stay at low level. We need to send a SMU message to configure the DPM - Reuse smu_v13_0_notify_display_change() for this purpose Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Bokun Zhang <bokun.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31drm/amd/display: Enable Replay for static screen use casesBhawanpreet Lakha
- Setup replay config on device init. - Enable replay if feature is enabled (prioritize replay over PSR, since it can be enabled in more usecases) - Add debug masks to enable replay on supported ASICs Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-08-31powerpc: Fix pud_mkwrite() definition after pte_mkwrite() API changesIngo Molnar
Fix up missed semantic mis-merge between commits 161e393c0f63 ("mm: Make pte_mkwrite() take a VMA") 27af67f35631 ("powerpc/book3s64/mm: enable transparent pud hugepage") where the newly introduced powerpc use of 'pte_mkwrite()' needs to use the 'novma()' versions as per commit 2f0584f3f4bd ("mm: Rename arch pte_mkwrite()'s to pte_mkwrite_novma()"). Fixes: df57721f9a63 ("Merge tag 'x86_shstk_for_6.6-rc1' of [...]") Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-08-31x86/fpu/xstate: Fix PKRU covert channelJim Mattson
When XCR0[9] is set, PKRU can be read and written from userspace with XSAVE and XRSTOR, even when CR4.PKE is clear. Clear XCR0[9] when protection keys are disabled. Reported-by: Tavis Ormandy <taviso@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20230831043228.1194256-1-jmattson@google.com
2023-08-31pstore: Base compression input buffer size on estimated compressed sizeArd Biesheuvel
Commit 1756ddea6916 ("pstore: Remove worst-case compression size logic") removed some clunky per-algorithm worst case size estimation routines on the basis that we can always store pstore records uncompressed, and these worst case estimations are about how much the size might inadvertently *increase* due to encapsulation overhead when the input cannot be compressed at all. So if compression results in a size increase, we just store the original data instead. However, it seems that the original code was misinterpreting these calculations as an estimation of how much uncompressed data might fit into a compressed buffer of a given size, and it was using the results to consume the input data in larger chunks than the pstore record size, relying on the compression to ensure that what ultimately gets stored fits into the available space. One result of this, as observed and reported by Linus, is that upgrading to a newer kernel that includes the given commit may result in pstore decompression errors reported in the kernel log. This is due to the fact that the existing records may unexpectedly decompress to a size that is larger than the pstore record size. Another potential problem caused by this change is that we may underutilize the fixed sized records on pstore backends such as ramoops. And on pstore backends with variable sized records such as EFI, we will end up creating many more entries than before to store the same amount of compressed data. So let's fix both issues, by bringing back the typical case estimation of how much ASCII text captured from the dmesg log might fit into a pstore record of a given size after compression. The original implementation used the computation given below for zlib: switch (size) { /* buffer range for efivars */ case 1000 ... 2000: cmpr = 56; break; case 2001 ... 3000: cmpr = 54; break; case 3001 ... 3999: cmpr = 52; break; /* buffer range for nvram, erst */ case 4000 ... 10000: cmpr = 45; break; default: cmpr = 60; break; } return (size * 100) / cmpr; We will use the previous worst-case of 60% for compression. For decompression go extra large (3x) so we make sure there's enough space for anything. While at it, rate limit the error message so we don't flood the log unnecessarily on systems that have accumulated a lot of pstore history. Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20230830212238.135900-1-ardb@kernel.org Co-developed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Kees Cook <keescook@chromium.org>
2023-08-31fbdev: mx3fb: Remove the driverFabio Estevam
The mx3fb driver does not support devicetree and i.MX has been converted to a DT-only platform since kernel 5.10. As there is no user for this driver anymore, just remove it. Signed-off-by: Fabio Estevam <festevam@denx.de> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Helge Deller <deller@gmx.de>
2023-08-31fbdev/core: Use list_for_each_entry() helperJinjie Ruan
Convert list_for_each() to list_for_each_entry() so that the pos list_head pointer and list_entry() call are no longer needed, which can reduce a few lines of code. No functional changed. Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Signed-off-by: Helge Deller <deller@gmx.de>
2023-08-31selftests/bpf: Include build flavors for install targetBjörn Töpel
When using the "install" or targets depending on install, e.g. "gen_tar", the BPF machine flavors weren't included. A command like: | make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- O=/workspace/kbuild \ | HOSTCC=gcc FORMAT= SKIP_TARGETS="arm64 ia64 powerpc sparc64 x86 sgx" \ | -C tools/testing/selftests gen_tar would not include bpf/no_alu32, bpf/cpuv4, or bpf/bpf-gcc. Include the BPF machine flavors for "install" make target. Signed-off-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20230831162954.111485-1-bjorn@kernel.org
2023-08-31bpf: Annotate bpf_long_memcpy with data_raceDaniel Borkmann
syzbot reported a data race splat between two processes trying to update the same BPF map value via syscall on different CPUs: BUG: KCSAN: data-race in bpf_percpu_array_update / bpf_percpu_array_update write to 0xffffe8fffe7425d8 of 8 bytes by task 8257 on cpu 1: bpf_long_memcpy include/linux/bpf.h:428 [inline] bpf_obj_memcpy include/linux/bpf.h:441 [inline] copy_map_value_long include/linux/bpf.h:464 [inline] bpf_percpu_array_update+0x3bb/0x500 kernel/bpf/arraymap.c:380 bpf_map_update_value+0x190/0x370 kernel/bpf/syscall.c:175 generic_map_update_batch+0x3ae/0x4f0 kernel/bpf/syscall.c:1749 bpf_map_do_batch+0x2df/0x3d0 kernel/bpf/syscall.c:4648 __sys_bpf+0x28a/0x780 __do_sys_bpf kernel/bpf/syscall.c:5241 [inline] __se_sys_bpf kernel/bpf/syscall.c:5239 [inline] __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5239 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd write to 0xffffe8fffe7425d8 of 8 bytes by task 8268 on cpu 0: bpf_long_memcpy include/linux/bpf.h:428 [inline] bpf_obj_memcpy include/linux/bpf.h:441 [inline] copy_map_value_long include/linux/bpf.h:464 [inline] bpf_percpu_array_update+0x3bb/0x500 kernel/bpf/arraymap.c:380 bpf_map_update_value+0x190/0x370 kernel/bpf/syscall.c:175 generic_map_update_batch+0x3ae/0x4f0 kernel/bpf/syscall.c:1749 bpf_map_do_batch+0x2df/0x3d0 kernel/bpf/syscall.c:4648 __sys_bpf+0x28a/0x780 __do_sys_bpf kernel/bpf/syscall.c:5241 [inline] __se_sys_bpf kernel/bpf/syscall.c:5239 [inline] __x64_sys_bpf+0x43/0x50 kernel/bpf/syscall.c:5239 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd value changed: 0x0000000000000000 -> 0xfffffff000002788 The bpf_long_memcpy is used with 8-byte aligned pointers, power-of-8 size and forced to use long read/writes to try to atomically copy long counters. It is best-effort only and no barriers are here since it _will_ race with concurrent updates from BPF programs. The bpf_long_memcpy() is called from bpf(2) syscall. Marco suggested that the best way to make this known to KCSAN would be to use data_race() annotation. Reported-by: syzbot+97522333291430dd277f@syzkaller.appspotmail.com Suggested-by: Marco Elver <elver@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Marco Elver <elver@google.com> Link: https://lore.kernel.org/bpf/000000000000d87a7f06040c970c@google.com Link: https://lore.kernel.org/bpf/57628f7a15e20d502247c3b55fceb1cb2b31f266.1693342186.git.daniel@iogearbox.net
2023-08-31Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-armLinus Torvalds
Pull ARM updates from Russell King: - Refactor VFP code and convert to C code (Ard Biesheuvel) - Fix hardware breakpoint single-stepping using bpf_overflow_handler - Make SMP stop calls asynchronous allowing panic from irq context to work - Fix for kernel-doc warnings for locomo * tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm: Revert part of ae1f8d793a19 ("ARM: 9304/1: add prototype for function called only from asm") ARM: 9318/1: locomo: move kernel-doc to prevent warnings ARM: 9317/1: kexec: Make smp stop calls asynchronous ARM: 9316/1: hw_breakpoint: fix single-stepping when using bpf_overflow_handler ARM: entry: Make asm coproc dispatch code NWFPE only ARM: iwmmxt: Use undef hook to enable coprocessor for task ARM: entry: Disregard Thumb undef exception in coproc dispatch ARM: vfp: Use undef hook for handling VFP exceptions ARM: kernel: Get rid of thread_info::used_cp[] array ARM: vfp: Reimplement VFP exception entry in C code ARM: vfp: Remove workaround for Feroceon CPUs ARM: vfp: Record VFP bounces as perf emulation faults
2023-08-31Merge tag 'powerpc-6.6-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Add HOTPLUG_SMT support (/sys/devices/system/cpu/smt) and honour the configured SMT state when hotplugging CPUs into the system - Combine final TLB flush and lazy TLB mm shootdown IPIs when using the Radix MMU to avoid a broadcast TLBIE flush on exit - Drop the exclusion between ptrace/perf watchpoints, and drop the now unused associated arch hooks - Add support for the "nohlt" command line option to disable CPU idle - Add support for -fpatchable-function-entry for ftrace, with GCC >= 13.1 - Rework memory block size determination, and support 256MB size on systems with GPUs that have hotpluggable memory - Various other small features and fixes Thanks to Andrew Donnellan, Aneesh Kumar K.V, Arnd Bergmann, Athira Rajeev, Benjamin Gray, Christophe Leroy, Frederic Barrat, Gautam Menghani, Geoff Levand, Hari Bathini, Immad Mir, Jialin Zhang, Joel Stanley, Jordan Niethe, Justin Stitt, Kajol Jain, Kees Cook, Krzysztof Kozlowski, Laurent Dufour, Liang He, Linus Walleij, Mahesh Salgaonkar, Masahiro Yamada, Michal Suchanek, Nageswara R Sastry, Nathan Chancellor, Nathan Lynch, Naveen N Rao, Nicholas Piggin, Nick Desaulniers, Omar Sandoval, Randy Dunlap, Reza Arbab, Rob Herring, Russell Currey, Sourabh Jain, Thomas Gleixner, Trevor Woerner, Uwe Kleine-König, Vaibhav Jain, Xiongfeng Wang, Yuan Tan, Zhang Rui, and Zheng Zengkai. * tag 'powerpc-6.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (135 commits) macintosh/ams: linux/platform_device.h is needed powerpc/xmon: Reapply "Relax frame size for clang" powerpc/mm/book3s64: Use 256M as the upper limit with coherent device memory attached powerpc/mm/book3s64: Fix build error with SPARSEMEM disabled powerpc/iommu: Fix notifiers being shared by PCI and VIO buses powerpc/mpc5xxx: Add missing fwnode_handle_put() powerpc/config: Disable SLAB_DEBUG_ON in skiroot powerpc/pseries: Remove unused hcall tracing instruction powerpc/pseries: Fix hcall tracepoints with JUMP_LABEL=n powerpc: dts: add missing space before { powerpc/eeh: Use pci_dev_id() to simplify the code powerpc/64s: Move CPU -mtune options into Kconfig powerpc/powermac: Fix unused function warning powerpc/pseries: Rework lppaca_shared_proc() to avoid DEBUG_PREEMPT powerpc: Don't include lppaca.h in paca.h powerpc/pseries: Move hcall_vphn() prototype into vphn.h powerpc/pseries: Move VPHN constants into vphn.h cxl: Drop unused detach_spa() powerpc: Drop zalloc_maybe_bootmem() powerpc/powernv: Use struct opal_prd_msg in more places ...
2023-08-31perf parse-events: Fix propagation of term's no_value when cloningIan Rogers
The no_value field in 'struct parse_events_term' indicates that the val variable isn't used, the case for an event name. Cloning wasn't propagating this, making cloned event name terms appearing to have a constant assinged to them. Working around the bug would check for a value of 1 assigned to value, but then this meant a user value of 1 couldn't be differentiated causing the value to be lost in debug printing and perf list. The change fixes the cloning and updates the "val.num ==/!= 1" tests to use no_value instead. To better check the no_value is set appropriately parameter comments are added for constant values. This found that no_value wasn't set correctly in parse_events_multi_pmu_add, which matters now that no_value is used to indicate an event name. Fixes: 7a6e91644708d514 ("perf parse-events: Make common term list to strbuf helper") Fixes: 99e7138eb7897aa0 ("perf tools: Fail on using multiple bits long terms without value") Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230831071421.2201358-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-31perf parse-events: Name the two term enumsIan Rogers
Name the enums used by 'struct parse_events_term' to parse_events__term_val_type and parse_events__term_type. This allows greater compile time error checking. Fix -Wswitch related issues by explicitly listing all enum values prior to default. Add config_term_name to safely look up a parse_events__term_type name, bounds checking the array access first. Add documentation to 'struct parse_events_terms' and reorder to save space. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230831071421.2201358-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-31perf list: Don't print Unit for "default_core"Ian Rogers
"default_core" was added as a way to demark JSON events whose PMU should be whatever the default core PMU is, previously this had been assumed to be "cpu" but that fails on s390 and ARM. 'perf list' displays the PMU in the event description to save storing it in JSON, but was still comparing against "cpu" and not "default_core", so update this. Fixes: d2045f87154bf67a ("perf jevents: Use "default_core" for events with no Unit") Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230831071421.2201358-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-31Merge tag 'x86_shstk_for_6.6-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 shadow stack support from Dave Hansen: "This is the long awaited x86 shadow stack support, part of Intel's Control-flow Enforcement Technology (CET). CET consists of two related security features: shadow stacks and indirect branch tracking. This series implements just the shadow stack part of this feature, and just for userspace. The main use case for shadow stack is providing protection against return oriented programming attacks. It works by maintaining a secondary (shadow) stack using a special memory type that has protections against modification. When executing a CALL instruction, the processor pushes the return address to both the normal stack and to the special permission shadow stack. Upon RET, the processor pops the shadow stack copy and compares it to the normal stack copy. For more information, refer to the links below for the earlier versions of this patch set" Link: https://lore.kernel.org/lkml/20220130211838.8382-1-rick.p.edgecombe@intel.com/ Link: https://lore.kernel.org/lkml/20230613001108.3040476-1-rick.p.edgecombe@intel.com/ * tag 'x86_shstk_for_6.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (47 commits) x86/shstk: Change order of __user in type x86/ibt: Convert IBT selftest to asm x86/shstk: Don't retry vm_munmap() on -EINTR x86/kbuild: Fix Documentation/ reference x86/shstk: Move arch detail comment out of core mm x86/shstk: Add ARCH_SHSTK_STATUS x86/shstk: Add ARCH_SHSTK_UNLOCK x86: Add PTRACE interface for shadow stack selftests/x86: Add shadow stack test x86/cpufeatures: Enable CET CR4 bit for shadow stack x86/shstk: Wire in shadow stack interface x86: Expose thread features in /proc/$PID/status x86/shstk: Support WRSS for userspace x86/shstk: Introduce map_shadow_stack syscall x86/shstk: Check that signal frame is shadow stack mem x86/shstk: Check that SSP is aligned on sigreturn x86/shstk: Handle signals for shadow stack x86/shstk: Introduce routines modifying shstk x86/shstk: Handle thread shadow stack x86/shstk: Add user-mode shadow stack support ...
2023-08-31x86/irq/i8259: Fix kernel-doc annotation warningVincenzo Palazzo
Fix this warning: arch/x86/kernel/i8259.c:235: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst * ELCR registers (0x4d0, 0x4d1) control edge/level of IRQ CC arch/x86/kernel/irqinit.o Signed-off-by: Vincenzo Palazzo <vincenzopalazzodev@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20230830131211.88226-1-vincenzopalazzodev@gmail.com
2023-08-31x86/speculation: Mark all Skylake CPUs as vulnerable to GDSDave Hansen
The Gather Data Sampling (GDS) vulnerability is common to all Skylake processors. However, the "client" Skylakes* are now in this list: https://www.intel.com/content/www/us/en/support/articles/000022396/processors.html which means they are no longer included for new vulnerabilities here: https://www.intel.com/content/www/us/en/developer/topic-technology/software-security-guidance/processors-affected-consolidated-product-cpu-model.html or in other GDS documentation. Thus, they were not included in the original GDS mitigation patches. Mark SKYLAKE and SKYLAKE_L as vulnerable to GDS to match all the other Skylake CPUs (which include Kaby Lake). Also group the CPUs so that the ones that share the exact same vulnerabilities are next to each other. Last, move SRBDS to the end of each line. This makes it clear at a glance that SKYLAKE_X is unique. Of the five Skylakes, it is the only "server" CPU and has a different implementation from the clients of the "special register" hardware, making it immune to SRBDS. This makes the diff much harder to read, but the resulting table is worth it. I very much appreciate the report from Michael Zhivich about this issue. Despite what level of support a hardware vendor is providing, the kernel very much needs an accurate and up-to-date list of vulnerable CPUs. More reports like this are very welcome. * Client Skylakes are CPUID 406E3/506E3 which is family 6, models 0x4E and 0x5E, aka INTEL_FAM6_SKYLAKE and INTEL_FAM6_SKYLAKE_L. Reported-by: Michael Zhivich <mzhivich@akamai.com> Fixes: 8974eb588283 ("x86/speculation: Add Gather Data Sampling mitigation") Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2023-08-31KVM: x86/mmu: Include mmu.h in spte.hSean Christopherson
Explicitly include mmu.h in spte.h instead of relying on the "parent" to include mmu.h. spte.h references a variety of macros and variables that are defined/declared in mmu.h, and so including spte.h before (or instead of) mmu.h will result in build errors, e.g. arch/x86/kvm/mmu/spte.h: In function ‘is_mmio_spte’: arch/x86/kvm/mmu/spte.h:242:23: error: ‘enable_mmio_caching’ undeclared 242 | likely(enable_mmio_caching); | ^~~~~~~~~~~~~~~~~~~ arch/x86/kvm/mmu/spte.h: In function ‘is_large_pte’: arch/x86/kvm/mmu/spte.h:302:22: error: ‘PT_PAGE_SIZE_MASK’ undeclared 302 | return pte & PT_PAGE_SIZE_MASK; | ^~~~~~~~~~~~~~~~~ arch/x86/kvm/mmu/spte.h: In function ‘is_dirty_spte’: arch/x86/kvm/mmu/spte.h:332:56: error: ‘PT_WRITABLE_MASK’ undeclared 332 | return dirty_mask ? spte & dirty_mask : spte & PT_WRITABLE_MASK; | ^~~~~~~~~~~~~~~~ Fixes: 5a9624affe7c ("KVM: mmu: extract spte.h and spte.c") Link: https://lore.kernel.org/r/20230808224059.2492476-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Use dummy root, backed by zero page, for !visible guest rootsSean Christopherson
When attempting to allocate a shadow root for a !visible guest root gfn, e.g. that resides in MMIO space, load a dummy root that is backed by the zero page instead of immediately synthesizing a triple fault shutdown (using the zero page ensures any attempt to translate memory will generate a !PRESENT fault and thus VM-Exit). Unless the vCPU is racing with memslot activity, KVM will inject a page fault due to not finding a visible slot in FNAME(walk_addr_generic), i.e. the end result is mostly same, but critically KVM will inject a fault only *after* KVM runs the vCPU with the bogus root. Waiting to inject a fault until after running the vCPU fixes a bug where KVM would bail from nested VM-Enter if L1 tried to run L2 with TDP enabled and a !visible root. Even though a bad root will *probably* lead to shutdown, (a) it's not guaranteed and (b) the CPU won't read the underlying memory until after VM-Enter succeeds. E.g. if L1 runs L2 with a VMX preemption timer value of '0', then architecturally the preemption timer VM-Exit is guaranteed to occur before the CPU executes any instruction, i.e. before the CPU needs to translate a GPA to a HPA (so long as there are no injected events with higher priority than the preemption timer). If KVM manages to get to FNAME(fetch) with a dummy root, e.g. because userspace created a memslot between installing the dummy root and handling the page fault, simply unload the MMU to allocate a new root and retry the instruction. Use KVM_REQ_MMU_FREE_OBSOLETE_ROOTS to drop the root, as invoking kvm_mmu_free_roots() while holding mmu_lock would deadlock, and conceptually the dummy root has indeeed become obsolete. The only difference versus existing usage of KVM_REQ_MMU_FREE_OBSOLETE_ROOTS is that the root has become obsolete due to memslot *creation*, not memslot deletion or movement. Reported-by: Reima Ishii <ishiir@g.ecc.u-tokyo.ac.jp> Cc: Yu Zhang <yu.c.zhang@linux.intel.com> Link: https://lore.kernel.org/r/20230729005200.1057358-6-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Disallow guest from using !visible slots for page tablesSean Christopherson
Explicitly inject a page fault if guest attempts to use a !visible gfn as a page table. kvm_vcpu_gfn_to_hva_prot() will naturally handle the case where there is no memslot, but doesn't catch the scenario where the gfn points at a KVM-internal memslot. Letting the guest backdoor its way into accessing KVM-internal memslots isn't dangerous on its own, e.g. at worst the guest can crash itself, but disallowing the behavior will simplify fixing how KVM handles !visible guest root gfns (immediately synthesizing a triple fault when loading the root is architecturally wrong). Link: https://lore.kernel.org/r/20230729005200.1057358-5-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Harden TDP MMU iteration against root w/o shadow pageSean Christopherson
Explicitly check that tdp_iter_start() is handed a valid shadow page to harden KVM against bugs, e.g. if KVM calls into the TDP MMU with an invalid or shadow MMU root (which would be a fatal KVM bug), the shadow page pointer will be NULL. Opportunistically stop the TDP MMU iteration instead of continuing on with garbage if the incoming root is bogus. Attempting to walk a garbage root is more likely to caused major problems than doing nothing. Cc: Yu Zhang <yu.c.zhang@linux.intel.com> Link: https://lore.kernel.org/r/20230729005200.1057358-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Harden new PGD against roots without shadow pagesSean Christopherson
Harden kvm_mmu_new_pgd() against NULL pointer dereference bugs by sanity checking that the target root has an associated shadow page prior to dereferencing said shadow page. The code in question is guaranteed to only see roots with shadow pages as fast_pgd_switch() explicitly frees the current root if it doesn't have a shadow page, i.e. is a PAE root, and that in turn prevents valid roots from being cached, but that's all very subtle. Link: https://lore.kernel.org/r/20230729005200.1057358-3-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Add helper to convert root hpa to shadow pageSean Christopherson
Add a dedicated helper for converting a root hpa to a shadow page in anticipation of using a "dummy" root to handle the scenario where KVM needs to load a valid shadow root (from hardware's perspective), but the guest doesn't have a visible root to shadow. Similar to PAE roots, the dummy root won't have an associated kvm_mmu_page and will need special handling when finding a shadow page given a root. Opportunistically retrieve the root shadow page in kvm_mmu_sync_roots() *after* verifying the root is unsync (the dummy root can never be unsync). Link: https://lore.kernel.org/r/20230729005200.1057358-2-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31drm/i915/gvt: Drop final dependencies on KVM internal detailsSean Christopherson
Open code gpa_to_gfn() in kvmgt_page_track_write() and drop KVMGT's dependency on kvm_host.h, i.e. include only kvm_page_track.h. KVMGT assumes "gfn == gpa >> PAGE_SHIFT" all over the place, including a few lines below in the same function with the same gpa, i.e. there's no reason to use KVM's helper for this one case. No functional change intended. Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-30-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Handle KVM bookkeeping in page-track APIs, not callersSean Christopherson
Get/put references to KVM when a page-track notifier is (un)registered instead of relying on the caller to do so. Forcing the caller to do the bookkeeping is unnecessary and adds one more thing for users to get wrong, e.g. see commit 9ed1fdee9ee3 ("drm/i915/gvt: Get reference to KVM iff attachment to VM is successful"). Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-29-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Drop @slot param from exported/external page-track APIsSean Christopherson
Refactor KVM's exported/external page-track, a.k.a. write-track, APIs to take only the gfn and do the required memslot lookup in KVM proper. Forcing users of the APIs to get the memslot unnecessarily bleeds KVM internals into KVMGT and complicates usage of the APIs. No functional change intended. Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-28-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Bug the VM if write-tracking is used but not enabledSean Christopherson
Bug the VM if something attempts to write-track a gfn, but write-tracking isn't enabled. The VM is doomed (and KVM has an egregious bug) if KVM or KVMGT wants to shadow guest page tables but can't because write-tracking isn't enabled. Tested-by: Yongwei Ma <yongwei.ma@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-27-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Assert that correct locks are held for page write-trackingSean Christopherson
When adding/removing gfns to/from write-tracking, assert that mmu_lock is held for write, and that either slots_lock or kvm->srcu is held. mmu_lock must be held for write to protect gfn_write_track's refcount, and SRCU or slots_lock must be held to protect the memslot itself. Tested-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-26-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Rename page-track APIs to reflect the new realitySean Christopherson
Rename the page-track APIs to capture that they're all about tracking writes, now that the facade of supporting multiple modes is gone. Opportunstically replace "slot" with "gfn" in anticipation of removing the @slot param from the external APIs. No functional change intended. Tested-by: Yongwei Ma <yongwei.ma@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-25-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Drop infrastructure for multiple page-track modesSean Christopherson
Drop "support" for multiple page-track modes, as there is no evidence that array-based and refcounted metadata is the optimal solution for other modes, nor is there any evidence that other use cases, e.g. for access-tracking, will be a good fit for the page-track machinery in general. E.g. one potential use case of access-tracking would be to prevent guest access to poisoned memory (from the guest's perspective). In that case, the number of poisoned pages is likely to be a very small percentage of the guest memory, and there is no need to reference count the number of access-tracking users, i.e. expanding gfn_track[] for a new mode would be grossly inefficient. And for poisoned memory, host userspace would also likely want to trap accesses, e.g. to inject #MC into the guest, and that isn't currently supported by the page-track framework. A better alternative for that poisoned page use case is likely a variation of the proposed per-gfn attributes overlay (linked), which would allow efficiently tracking the sparse set of poisoned pages, and by default would exit to userspace on access. Link: https://lore.kernel.org/all/Y2WB48kD0J4VGynX@google.com Cc: Ben Gardon <bgardon@google.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-24-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Use page-track notifiers iff there are external usersSean Christopherson
Disable the page-track notifier code at compile time if there are no external users, i.e. if CONFIG_KVM_EXTERNAL_WRITE_TRACKING=n. KVM itself now hooks emulated writes directly instead of relying on the page-track mechanism. Provide a stub for "struct kvm_page_track_notifier_node" so that including headers directly from the command line, e.g. for testing include guards, doesn't fail due to a struct having an incomplete type. Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-23-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Move KVM-only page-track declarations to internal headerSean Christopherson
Bury the declaration of the page-track helpers that are intended only for internal KVM use in a "private" header. In addition to guarding against unwanted usage of the internal-only helpers, dropping their definitions avoids exposing other structures that should be KVM-internal, e.g. for memslots. This is a baby step toward making kvm_host.h a KVM-internal header in the very distant future. Tested-by: Yongwei Ma <yongwei.ma@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-22-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86: Remove the unused page-track hook track_flush_slot()Yan Zhao
Remove ->track_remove_slot(), there are no longer any users and it's unlikely a "flush" hook will ever be the correct API to provide to an external page-track user. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-21-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31drm/i915/gvt: switch from ->track_flush_slot() to ->track_remove_region()Yan Zhao
Switch from the poorly named and flawed ->track_flush_slot() to the newly introduced ->track_remove_region(). From KVMGT's perspective, the two hooks are functionally equivalent, the only difference being that ->track_remove_region() is called only when KVM is 100% certain the memory region will be removed, i.e. is invoked slightly later in KVM's memslot modification flow. Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> [sean: handle name change, massage changelog, rebase] Tested-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-20-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86: Add a new page-track hook to handle memslot deletionYan Zhao
Add a new page-track hook, track_remove_region(), that is called when a memslot DELETE operation is about to be committed. The "remove" hook will be used by KVMGT and will effectively replace the existing track_flush_slot() altogether now that KVM itself doesn't rely on the "flush" hook either. The "flush" hook is flawed as it's invoked before the memslot operation is guaranteed to succeed, i.e. KVM might ultimately keep the existing memslot without notifying external page track users, a.k.a. KVMGT. In practice, this can't currently happen on x86, but there are no guarantees that won't change in the future, not to mention that "flush" does a very poor job of describing what is happening. Pass in the gfn+nr_pages instead of the slot itself so external users, i.e. KVMGT, don't need to exposed to KVM internals (memslots). This will help set the stage for additional cleanups to the page-track APIs. Opportunistically align the existing srcu_read_lock_held() usage so that the new case doesn't stand out like a sore thumb (and not aligning the new code makes bots unhappy). Cc: Zhenyu Wang <zhenyuw@linux.intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20230729013535.1070024-19-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31drm/i915/gvt: Don't bother removing write-protection on to-be-deleted slotSean Christopherson
When handling a slot "flush", don't call back into KVM to drop write protection for gfns in the slot. Now that KVM rejects attempts to move memory slots while KVMGT is attached, the only time a slot is "flushed" is when it's being removed, i.e. the memslot and all its write-tracking metadata is about to be deleted. Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-18-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86: Reject memslot MOVE operations if KVMGT is attachedSean Christopherson
Disallow moving memslots if the VM has external page-track users, i.e. if KVMGT is being used to expose a virtual GPU to the guest, as KVMGT doesn't correctly handle moving memory regions. Note, this is potential ABI breakage! E.g. userspace could move regions that aren't shadowed by KVMGT without harming the guest. However, the only known user of KVMGT is QEMU, and QEMU doesn't move generic memory regions. KVM's own support for moving memory regions was also broken for multiple years (albeit for an edge case, but arguably moving RAM is itself an edge case), e.g. see commit edd4fa37baa6 ("KVM: x86: Allocate new rmap and large page tracking when moving memslot"). Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-17-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: drm/i915/gvt: Drop @vcpu from KVM's ->track_write() hookSean Christopherson
Drop @vcpu from KVM's ->track_write() hook provided for external users of the page-track APIs now that KVM itself doesn't use the page-track mechanism. Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Reviewed-by: Zhi Wang <zhi.a.wang@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-16-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-08-31KVM: x86/mmu: Don't bounce through page-track mechanism for guest PTEsSean Christopherson
Don't use the generic page-track mechanism to handle writes to guest PTEs in KVM's MMU. KVM's MMU needs access to information that should not be exposed to external page-track users, e.g. KVM needs (for some definitions of "need") the vCPU to query the current paging mode, whereas external users, i.e. KVMGT, have no ties to the current vCPU and so should never need the vCPU. Moving away from the page-track mechanism will allow dropping use of the page-track mechanism for KVM's own MMU, and will also allow simplifying and cleaning up the page-track APIs. Reviewed-by: Yan Zhao <yan.y.zhao@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Link: https://lore.kernel.org/r/20230729013535.1070024-15-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>