summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2025-05-22arm64: dts: amlogic: dreambox: fix missing clkc_audio nodeChristian Hewitt
commit 0f67578587bb9e5a8eecfcdf6b8a501b5bd90526 upstream. Add the clkc_audio node to fix audio support on Dreambox One/Two. Fixes: 83a6f4c62cb1 ("arm64: dts: meson: add initial support for Dreambox One/Two") CC: stable@vger.kernel.org Suggested-by: Emanuel Strobel <emanuel.strobel@yahoo.com> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Link: https://lore.kernel.org/r/20250503084443.3704866-1-christianshewitt@gmail.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-22LoongArch: uprobes: Remove redundant code about resume_eraTiezhu Yang
commit 12614f794274f63fbdfe76771b2b332077d63848 upstream. arch_uprobe_skip_sstep() returns true if instruction was emulated, that is to say, there is no need to single step for the emulated instructions. regs->csr_era will point to the destination address directly after the exception, so the resume_era related code is redundant, just remove them. Cc: stable@vger.kernel.org Fixes: 19bc6cb64092 ("LoongArch: Add uprobes support") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-22LoongArch: uprobes: Remove user_{en,dis}able_single_step()Tiezhu Yang
commit 0b326b2371f94e798137cc1a3c5c2eef2bc69061 upstream. When executing the "perf probe" and "perf stat" test cases about some cryptographic algorithm, the output shows that "Trace/breakpoint trap". This is because it uses the software singlestep breakpoint for uprobes on LoongArch, and no need to use the hardware singlestep. So just remove the related function call to user_{en,dis}able_single_step() for uprobes on LoongArch. How to reproduce: Please make sure CONFIG_UPROBE_EVENTS is set and openssl supports sm2 algorithm, then execute the following command. cd tools/perf && make ./perf probe -x /usr/lib64/libcrypto.so BN_mod_mul_montgomery ./perf stat -e probe_libcrypto:BN_mod_mul_montgomery openssl speed sm2 Cc: stable@vger.kernel.org Fixes: 19bc6cb64092 ("LoongArch: Add uprobes support") Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-22LoongArch: Fix MAX_REG_OFFSET calculationHuacai Chen
commit 90436d234230e9a950ccd87831108b688b27a234 upstream. Fix MAX_REG_OFFSET calculation, make it point to the last register in 'struct pt_regs' and not to the marker itself, which could allow regs_get_register() to return an invalid offset. Cc: stable@vger.kernel.org Fixes: 803b0fc5c3f2baa6e5 ("LoongArch: Add process management") Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-22LoongArch: Save and restore CSR.CNTC for hibernationHuacai Chen
commit ceb9155d058a11242aa0572875c44e9713b1a2be upstream. Save and restore CSR.CNTC for hibernation which is similar to suspend. For host this is unnecessary because sched clock is ensured continuous, but for kvm guest sched clock isn't enough because rdtime.d should also be continuous. Host::rdtime.d = Host::CSR.CNTC + counter Guest::rdtime.d = Host::CSR.CNTC + Host::CSR.GCNTC + Guest::CSR.CNTC + counter so, Guest::rdtime.d = Host::rdtime.d + Host::CSR.GCNTC + Guest::CSR.CNTC To ensure Guest::rdtime.d continuous, Host::rdtime.d should be at first continuous, while Host::CSR.GCNTC / Guest::CSR.CNTC is maintained by KVM. Cc: stable@vger.kernel.org Signed-off-by: Xianglai Li <lixianglai@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-22LoongArch: Move __arch_cpu_idle() to .cpuidle.text sectionHuacai Chen
commit 3e245b7b74c3a2ead5fa4bad27cc275284c75189 upstream. Now arch_cpu_idle() is annotated with __cpuidle which means it is in the .cpuidle.text section, but __arch_cpu_idle() isn't. Thus, fix the missing .cpuidle.text section assignment for __arch_cpu_idle() in order to correct backtracing with nmi_backtrace(). The principle is similar to the commit 97c8580e85cf81c ("MIPS: Annotate cpu_wait implementations with __cpuidle") Cc: stable@vger.kernel.org Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-22LoongArch: Prevent cond_resched() occurring within kernel-fpuTianyang Zhang
commit 2468b0e3d5659dfde77f081f266e1111a981efb8 upstream. When CONFIG_PREEMPT_COUNT is not configured (i.e. CONFIG_PREEMPT_NONE/ CONFIG_PREEMPT_VOLUNTARY), preempt_disable() / preempt_enable() merely acts as a barrier(). However, in these cases cond_resched() can still trigger a context switch and modify the CSR.EUEN, resulting in do_fpu() exception being activated within the kernel-fpu critical sections, as demonstrated in the following path: dcn32_calculate_wm_and_dlg() DC_FP_START() dcn32_calculate_wm_and_dlg_fpu() dcn32_find_dummy_latency_index_for_fw_based_mclk_switch() dcn32_internal_validate_bw() dcn32_enable_phantom_stream() dc_create_stream_for_sink() kzalloc(GFP_KERNEL) __kmem_cache_alloc_node() __cond_resched() DC_FP_END() This patch is similar to commit d02198550423a0b (x86/fpu: Improve crypto performance by making kernel-mode FPU reliably usable in softirqs). It uses local_bh_disable() instead of preempt_disable() for non-RT kernels so it can avoid the cond_resched() issue, and also extend the kernel-fpu application scenarios to the softirq context. Cc: stable@vger.kernel.org Signed-off-by: Tianyang Zhang <zhangtianyang@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-22KVM: x86/mmu: Prevent installing hugepages when mem attributes are changingSean Christopherson
[ Upstream commit 9129633d568edd36aa22bf703b12835153cec985 ] When changing memory attributes on a subset of a potential hugepage, add the hugepage to the invalidation range tracking to prevent installing a hugepage until the attributes are fully updated. Like the actual hugepage tracking updates in kvm_arch_post_set_memory_attributes(), process only the head and tail pages, as any potential hugepages that are entirely covered by the range will already be tracked. Note, only hugepage chunks whose current attributes are NOT mixed need to be added to the invalidation set, as mixed attributes already prevent installing a hugepage, and it's perfectly safe to install a smaller mapping for a gfn whose attributes aren't changing. Fixes: 8dd2eee9d526 ("KVM: x86/mmu: Handle page fault for private memory") Cc: stable@vger.kernel.org Reported-by: Michael Roth <michael.roth@amd.com> Tested-by: Michael Roth <michael.roth@amd.com> Link: https://lore.kernel.org/r/20250430220954.522672-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-22KVM: Add member to struct kvm_gfn_range to indicate private/sharedIsaku Yamahata
[ Upstream commit dca6c88532322830d5d92486467fcc91b67a9ad8 ] Add new members to strut kvm_gfn_range to indicate which mapping (private-vs-shared) to operate on: enum kvm_gfn_range_filter attr_filter. Update the core zapping operations to set them appropriately. TDX utilizes two GPA aliases for the same memslots, one for memory that is for private memory and one that is for shared. For private memory, KVM cannot always perform the same operations it does on memory for default VMs, such as zapping pages and having them be faulted back in, as this requires guest coordination. However, some operations such as guest driven conversion of memory between private and shared should zap private memory. Internally to the MMU, private and shared mappings are tracked on separate roots. Mapping and zapping operations will operate on the respective GFN alias for each root (private or shared). So zapping operations will by default zap both aliases. Add fields in struct kvm_gfn_range to allow callers to specify which aliases so they can only target the aliases appropriate for their specific operation. There was feedback that target aliases should be specified such that the default value (0) is to operate on both aliases. Several options were considered. Several variations of having separate bools defined such that the default behavior was to process both aliases. They either allowed nonsensical configurations, or were confusing for the caller. A simple enum was also explored and was close, but was hard to process in the caller. Instead, use an enum with the default value (0) reserved as a disallowed value. Catch ranges that didn't have the target aliases specified by looking for that specific value. Set target alias with enum appropriately for these MMU operations: - For KVM's mmu notifier callbacks, zap shared pages only because private pages won't have a userspace mapping - For setting memory attributes, kvm_arch_pre_set_memory_attributes() chooses the aliases based on the attribute. - For guest_memfd invalidations, zap private only. Link: https://lore.kernel.org/kvm/ZivIF9vjKcuGie3s@google.com/ Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com> Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Message-ID: <20240718211230.1492011-3-rick.p.edgecombe@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Stable-dep-of: 9129633d568e ("KVM: x86/mmu: Prevent installing hugepages when mem attributes are changing") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-22arm64: dts: imx8mp-var-som: Fix LDO5 shutdown causing SD card timeoutHimanshu Bhavani
[ Upstream commit c6888983134e2ccc2db8ffd2720b0d4826d952e4 ] Fix SD card timeout issue caused by LDO5 regulator getting disabled after boot. The kernel log shows LDO5 being disabled, which leads to a timeout on USDHC2: [ 33.760561] LDO5: disabling [ 81.119861] mmc1: Timeout waiting for hardware interrupt. To prevent this, set regulator-boot-on and regulator-always-on for LDO5. Also add the vqmmc regulator to properly support 1.8V/3.3V signaling for USDHC2 using a GPIO-controlled regulator. Fixes: 6c2a1f4f71258 ("arm64: dts: imx8mp-var-som-symphony: Add Variscite Symphony board and VAR-SOM-MX8MP SoM") Signed-off-by: Himanshu Bhavani <himanshu.bhavani@siliconsignals.io> Acked-by: Tarang Raval <tarang.raval@siliconsignals.io> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-22riscv: dts: sophgo: fix DMA data-width configuration for CV18xxZe Huang
[ Upstream commit 3e6244429ba38f8dee3336b8b805948276b281ab ] The "snps,data-width" property[1] defines the AXI data width of the DMA controller as: width = 8 × (2^n) bits (0 = 8 bits, 1 = 16 bits, 2 = 32 bits, ..., 6 = 512 bits) where "n" is the value of "snps,data-width". For the CV18xx DMA controller, the correct AXI data width is 32 bits, corresponding to "snps,data-width = 2". Test results on Milkv Duo S can be found here [2]. Link: https://github.com/torvalds/linux/blob/master/Documentation/devicetree/bindings/dma/snps%2Cdw-axi-dmac.yaml#L74 [1] Link: https://gist.github.com/Sutter099/4fa99bb2d89e5af975983124704b3861 [2] Fixes: 514951a81a5e ("riscv: dts: sophgo: cv18xx: add DMA controller") Co-developed-by: Yu Yuan <yu.yuan@sjtu.edu.cn> Signed-off-by: Yu Yuan <yu.yuan@sjtu.edu.cn> Signed-off-by: Ze Huang <huangze@whut.edu.cn> Link: https://lore.kernel.org/r/20250428-duo-dma-config-v1-1-eb6ad836ca42@whut.edu.cn Signed-off-by: Inochi Amaoto <inochiama@gmail.com> Signed-off-by: Chen Wang <unicorn_wang@outlook.com> Signed-off-by: Chen Wang <wangchen20@iscas.ac.cn> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-22arm64: dts: rockchip: Assign RT5616 MCLK rate on rk3588-friendlyelec-cm3588Tom Vincent
[ Upstream commit 5e6a4ee9799b202fefa8c6264647971f892f0264 ] The Realtek RT5616 audio codec on the FriendlyElec CM3588 module fails to probe correctly due to the missing clock properties. This results in distorted analogue audio output. Assign MCLK to 12.288 MHz, which allows the codec to advertise most of the standard sample rates per other RK3588 devices. Fixes: e23819cf273c ("arm64: dts: rockchip: Add FriendlyElec CM3588 NAS board") Signed-off-by: Tom Vincent <linux@tlvince.com> Link: https://lore.kernel.org/r/20250417081753.644950-1-linux@tlvince.com Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-18x86/its: FineIBT-paranoid vs ITSPeter Zijlstra
commit e52c1dc7455d32c8a55f9949d300e5e87d011fa6 upstream. FineIBT-paranoid was using the retpoline bytes for the paranoid check, disabling retpolines, because all parts that have IBT also have eIBRS and thus don't need no stinking retpolines. Except... ITS needs the retpolines for indirect calls must not be in the first half of a cacheline :-/ So what was the paranoid call sequence: <fineibt_paranoid_start>: 0: 41 ba 78 56 34 12 mov $0x12345678, %r10d 6: 45 3b 53 f7 cmp -0x9(%r11), %r10d a: 4d 8d 5b <f0> lea -0x10(%r11), %r11 e: 75 fd jne d <fineibt_paranoid_start+0xd> 10: 41 ff d3 call *%r11 13: 90 nop Now becomes: <fineibt_paranoid_start>: 0: 41 ba 78 56 34 12 mov $0x12345678, %r10d 6: 45 3b 53 f7 cmp -0x9(%r11), %r10d a: 4d 8d 5b f0 lea -0x10(%r11), %r11 e: 2e e8 XX XX XX XX cs call __x86_indirect_paranoid_thunk_r11 Where the paranoid_thunk looks like: 1d: <ea> (bad) __x86_indirect_paranoid_thunk_r11: 1e: 75 fd jne 1d __x86_indirect_its_thunk_r11: 20: 41 ff eb jmp *%r11 23: cc int3 [ dhansen: remove initialization to false ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> [ Just a portion of the original commit, in order to fix a build issue in stable kernels due to backports ] Tested-by: Holger Hoffstätte <holger@applied-asynchrony.com> Link: https://lore.kernel.org/r/20250514113952.GB16434@noisy.programming.kicks-ass.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/its: Fix build errors when CONFIG_MODULES=nEric Biggers
commit 9f35e33144ae5377d6a8de86dd3bd4d995c6ac65 upstream. Fix several build errors when CONFIG_MODULES=n, including the following: ../arch/x86/kernel/alternative.c:195:25: error: incomplete definition of type 'struct module' 195 | for (int i = 0; i < mod->its_num_pages; i++) { Fixes: 872df34d7c51 ("x86/its: Use dynamic thunks for indirect branches") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Dave Hansen <dave.hansen@intel.com> Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/its: Use dynamic thunks for indirect branchesPeter Zijlstra
commit 872df34d7c51a79523820ea6a14860398c639b87 upstream. ITS mitigation moves the unsafe indirect branches to a safe thunk. This could degrade the prediction accuracy as the source address of indirect branches becomes same for different execution paths. To improve the predictions, and hence the performance, assign a separate thunk for each indirect callsite. This is also a defense-in-depth measure to avoid indirect branches aliasing with each other. As an example, 5000 dynamic thunks would utilize around 16 bits of the address space, thereby gaining entropy. For a BTB that uses 32 bits for indexing, dynamic thunks could provide better prediction accuracy over fixed thunks. Have ITS thunks be variable sized and use EXECMEM_MODULE_TEXT such that they are both more flexible (got to extend them later) and live in 2M TLBs, just like kernel code, avoiding undue TLB pressure. [ pawan: CONFIG_EXECMEM_ROX is not supported on backport kernel, made adjustments to set memory to RW and ROX ] Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/ibt: Keep IBT disabled during alternative patchingPawan Gupta
commit ebebe30794d38c51f71fe4951ba6af4159d9837d upstream. cfi_rewrite_callers() updates the fineIBT hash matching at the caller side, but except for paranoid-mode it relies on apply_retpoline() and friends for any ENDBR relocation. This could temporarily cause an indirect branch to land on a poisoned ENDBR. For instance, with para-virtualization enabled, a simple wrmsrl() could have an indirect branch pointing to native_write_msr() who's ENDBR has been relocated due to fineIBT: <wrmsrl>: push %rbp mov %rsp,%rbp mov %esi,%eax mov %rsi,%rdx shr $0x20,%rdx mov %edi,%edi mov %rax,%rsi call *0x21e65d0(%rip) # <pv_ops+0xb8> ^^^^^^^^^^^^^^^^^^^^^^^ Such an indirect call during the alternative patching could #CP if the caller is not *yet* adjusted for the new target ENDBR. To prevent a false #CP, keep CET-IBT disabled until all callers are patched. Patching during the module load does not need to be guarded by IBT-disable because the module code is not executed until the patching is complete. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/its: Align RETs in BHB clear sequence to avoid thunkingPawan Gupta
commit f0cd7091cc5a032c8870b4285305d9172569d126 upstream. The software mitigation for BHI is to execute BHB clear sequence at syscall entry, and possibly after a cBPF program. ITS mitigation thunks RETs in the lower half of the cacheline. This causes the RETs in the BHB clear sequence to be thunked as well, adding unnecessary branches to the BHB clear sequence. Since the sequence is in hot path, align the RET instructions in the sequence to avoid thunking. This is how disassembly clear_bhb_loop() looks like after this change: 0x44 <+4>: mov $0x5,%ecx 0x49 <+9>: call 0xffffffff81001d9b <clear_bhb_loop+91> 0x4e <+14>: jmp 0xffffffff81001de5 <clear_bhb_loop+165> 0x53 <+19>: int3 ... 0x9b <+91>: call 0xffffffff81001dce <clear_bhb_loop+142> 0xa0 <+96>: ret 0xa1 <+97>: int3 ... 0xce <+142>: mov $0x5,%eax 0xd3 <+147>: jmp 0xffffffff81001dd6 <clear_bhb_loop+150> 0xd5 <+149>: nop 0xd6 <+150>: sub $0x1,%eax 0xd9 <+153>: jne 0xffffffff81001dd3 <clear_bhb_loop+147> 0xdb <+155>: sub $0x1,%ecx 0xde <+158>: jne 0xffffffff81001d9b <clear_bhb_loop+91> 0xe0 <+160>: ret 0xe1 <+161>: int3 0xe2 <+162>: int3 0xe3 <+163>: int3 0xe4 <+164>: int3 0xe5 <+165>: lfence 0xe8 <+168>: pop %rbp 0xe9 <+169>: ret Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/its: Add support for RSB stuffing mitigationPawan Gupta
commit facd226f7e0c8ca936ac114aba43cb3e8b94e41e upstream. When retpoline mitigation is enabled for spectre-v2, enabling call-depth-tracking and RSB stuffing also mitigates ITS. Add cmdline option indirect_target_selection=stuff to allow enabling RSB stuffing mitigation. When retpoline mitigation is not enabled, =stuff option is ignored, and default mitigation for ITS is deployed. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/its: Add "vmexit" option to skip mitigation on some CPUsPawan Gupta
commit 2665281a07e19550944e8354a2024635a7b2714a upstream. Ice Lake generation CPUs are not affected by guest/host isolation part of ITS. If a user is only concerned about KVM guests, they can now choose a new cmdline option "vmexit" that will not deploy the ITS mitigation when CPU is not affected by guest/host isolation. This saves the performance overhead of ITS mitigation on Ice Lake gen CPUs. When "vmexit" option selected, if the CPU is affected by ITS guest/host isolation, the default ITS mitigation is deployed. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/its: Enable Indirect Target Selection mitigationPawan Gupta
commit f4818881c47fd91fcb6d62373c57c7844e3de1c0 upstream. Indirect Target Selection (ITS) is a bug in some pre-ADL Intel CPUs with eIBRS. It affects prediction of indirect branch and RETs in the lower half of cacheline. Due to ITS such branches may get wrongly predicted to a target of (direct or indirect) branch that is located in the upper half of the cacheline. Scope of impact =============== Guest/host isolation -------------------- When eIBRS is used for guest/host isolation, the indirect branches in the VMM may still be predicted with targets corresponding to branches in the guest. Intra-mode ---------- cBPF or other native gadgets can be used for intra-mode training and disclosure using ITS. User/kernel isolation --------------------- When eIBRS is enabled user/kernel isolation is not impacted. Indirect Branch Prediction Barrier (IBPB) ----------------------------------------- After an IBPB, indirect branches may be predicted with targets corresponding to direct branches which were executed prior to IBPB. This is mitigated by a microcode update. Add cmdline parameter indirect_target_selection=off|on|force to control the mitigation to relocate the affected branches to an ITS-safe thunk i.e. located in the upper half of cacheline. Also add the sysfs reporting. When retpoline mitigation is deployed, ITS safe-thunks are not needed, because retpoline sequence is already ITS-safe. Similarly, when call depth tracking (CDT) mitigation is deployed (retbleed=stuff), ITS safe return thunk is not used, as CDT prevents RSB-underflow. To not overcomplicate things, ITS mitigation is not supported with spectre-v2 lfence;jmp mitigation. Moreover, it is less practical to deploy lfence;jmp mitigation on ITS affected parts anyways. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/its: Add support for ITS-safe return thunkPawan Gupta
commit a75bf27fe41abe658c53276a0c486c4bf9adecfc upstream. RETs in the lower half of cacheline may be affected by ITS bug, specifically when the RSB-underflows. Use ITS-safe return thunk for such RETs. RETs that are not patched: - RET in retpoline sequence does not need to be patched, because the sequence itself fills an RSB before RET. - RET in Call Depth Tracking (CDT) thunks __x86_indirect_{call|jump}_thunk and call_depth_return_thunk are not patched because CDT by design prevents RSB-underflow. - RETs in .init section are not reachable after init. - RETs that are explicitly marked safe with ANNOTATE_UNRET_SAFE. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/its: Add support for ITS-safe indirect thunkPawan Gupta
commit 8754e67ad4ac692c67ff1f99c0d07156f04ae40c upstream. Due to ITS, indirect branches in the lower half of a cacheline may be vulnerable to branch target injection attack. Introduce ITS-safe thunks to patch indirect branches in the lower half of cacheline with the thunk. Also thunk any eBPF generated indirect branches in emit_indirect_jump(). Below category of indirect branches are not mitigated: - Indirect branches in the .init section are not mitigated because they are discarded after boot. - Indirect branches that are explicitly marked retpoline-safe. Note that retpoline also mitigates the indirect branches against ITS. This is because the retpoline sequence fills an RSB entry before RET, and it does not suffer from RSB-underflow part of the ITS. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/its: Enumerate Indirect Target Selection (ITS) bugPawan Gupta
commit 159013a7ca18c271ff64192deb62a689b622d860 upstream. ITS bug in some pre-Alderlake Intel CPUs may allow indirect branches in the first half of a cache line get predicted to a target of a branch located in the second half of the cache line. Set X86_BUG_ITS on affected CPUs. Mitigation to follow in later commits. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/speculation: Remove the extra #ifdef around CALL_NOSPECPawan Gupta
commit c8c81458863ab686cda4fe1e603fccaae0f12460 upstream. Commit: 010c4a461c1d ("x86/speculation: Simplify and make CALL_NOSPEC consistent") added an #ifdef CONFIG_MITIGATION_RETPOLINE around the CALL_NOSPEC definition. This is not required as this code is already under a larger #ifdef. Remove the extra #ifdef, no functional change. vmlinux size remains same before and after this change: CONFIG_MITIGATION_RETPOLINE=y: text data bss dec hex filename 25434752 7342290 2301212 35078254 217406e vmlinux.before 25434752 7342290 2301212 35078254 217406e vmlinux.after # CONFIG_MITIGATION_RETPOLINE is not set: text data bss dec hex filename 22943094 6214994 1550152 30708240 1d49210 vmlinux.before 22943094 6214994 1550152 30708240 1d49210 vmlinux.after Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Link: https://lore.kernel.org/r/20250320-call-nospec-extra-ifdef-v1-1-d9b084d24820@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/speculation: Add a conditional CS prefix to CALL_NOSPECPawan Gupta
commit 052040e34c08428a5a388b85787e8531970c0c67 upstream. Retpoline mitigation for spectre-v2 uses thunks for indirect branches. To support this mitigation compilers add a CS prefix with -mindirect-branch-cs-prefix. For an indirect branch in asm, this needs to be added manually. CS prefix is already being added to indirect branches in asm files, but not in inline asm. Add CS prefix to CALL_NOSPEC for inline asm as well. There is no JMP_NOSPEC for inline asm. Reported-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250228-call-nospec-v3-2-96599fed0f33@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/speculation: Simplify and make CALL_NOSPEC consistentPawan Gupta
commit cfceff8526a426948b53445c02bcb98453c7330d upstream. CALL_NOSPEC macro is used to generate Spectre-v2 mitigation friendly indirect branches. At compile time the macro defaults to indirect branch, and at runtime those can be patched to thunk based mitigations. This approach is opposite of what is done for the rest of the kernel, where the compile time default is to replace indirect calls with retpoline thunk calls. Make CALL_NOSPEC consistent with the rest of the kernel, default to retpoline thunk at compile time when CONFIG_MITIGATION_RETPOLINE is enabled. Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250228-call-nospec-v3-1-96599fed0f33@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/bhi: Do not set BHI_DIS_S in 32-bit modePawan Gupta
commit 073fdbe02c69c43fb7c0d547ec265c7747d4a646 upstream. With the possibility of intra-mode BHI via cBPF, complete mitigation for BHI is to use IBHF (history fence) instruction with BHI_DIS_S set. Since this new instruction is only available in 64-bit mode, setting BHI_DIS_S in 32-bit mode is only a partial mitigation. Do not set BHI_DIS_S in 32-bit mode so as to avoid reporting misleading mitigated status. With this change IBHF won't be used in 32-bit mode, also remove the CONFIG_X86_64 check from emit_spectre_bhb_barrier(). Suggested-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Josh Poimboeuf <jpoimboe@kernel.org> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/bpf: Add IBHF call at end of classic BPFDaniel Sneddon
commit 9f725eec8fc0b39bdc07dcc8897283c367c1a163 upstream. Classic BPF programs can be run by unprivileged users, allowing unprivileged code to execute inside the kernel. Attackers can use this to craft branch history in kernel mode that can influence the target of indirect branches. BHI_DIS_S provides user-kernel isolation of branch history, but cBPF can be used to bypass this protection by crafting branch history in kernel mode. To stop intra-mode attacks via cBPF programs, Intel created a new instruction Indirect Branch History Fence (IBHF). IBHF prevents the predicted targets of subsequent indirect branches from being influenced by branch history prior to the IBHF. IBHF is only effective while BHI_DIS_S is enabled. Add the IBHF instruction to cBPF jitted code's exit path. Add the new fence when the hardware mitigation is enabled (i.e., X86_FEATURE_CLEAR_BHB_HW is set) or after the software sequence (X86_FEATURE_CLEAR_BHB_LOOP) is being used in a virtual machine. Note that X86_FEATURE_CLEAR_BHB_HW and X86_FEATURE_CLEAR_BHB_LOOP are mutually exclusive, so the JIT compiler will only emit the new fence, not the SW sequence, when X86_FEATURE_CLEAR_BHB_HW is set. Hardware that enumerates BHI_NO basically has BHI_DIS_S protections always enabled, regardless of the value of BHI_DIS_S. Since BHI_DIS_S doesn't protect against intra-mode attacks, enumerate BHI bug on BHI_NO hardware as well. Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/bpf: Call branch history clearing sequence on exitDaniel Sneddon
commit d4e89d212d401672e9cdfe825d947ee3a9fbe3f5 upstream. Classic BPF programs have been identified as potential vectors for intra-mode Branch Target Injection (BTI) attacks. Classic BPF programs can be run by unprivileged users. They allow unprivileged code to execute inside the kernel. Attackers can use unprivileged cBPF to craft branch history in kernel mode that can influence the target of indirect branches. Introduce a branch history buffer (BHB) clearing sequence during the JIT compilation of classic BPF programs. The clearing sequence is the same as is used in previous mitigations to protect syscalls. Since eBPF programs already have their own mitigations in place, only insert the call on classic programs that aren't run by privileged users. Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18arm64: proton-pack: Add new CPUs 'k' values for branch mitigationJames Morse
commit efe676a1a7554219eae0b0dcfe1e0cdcc9ef9aef upstream. Update the list of 'k' values for the branch mitigation from arm's website. Add the values for Cortex-X1C. The MIDR_EL1 value can be found here: https://developer.arm.com/documentation/101968/0002/Register-descriptions/AArch> Link: https://developer.arm.com/documentation/110280/2-0/?lang=en Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18arm64: bpf: Only mitigate cBPF programs loaded by unprivileged usersJames Morse
commit f300769ead032513a68e4a02e806393402e626f8 upstream. Support for eBPF programs loaded by unprivileged users is typically disabled. This means only cBPF programs need to be mitigated for BHB. In addition, only mitigate cBPF programs that were loaded by an unprivileged user. Privileged users can also load the same program via eBPF, making the mitigation pointless. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18arm64: bpf: Add BHB mitigation to the epilogue for cBPF programsJames Morse
commit 0dfefc2ea2f29ced2416017d7e5b1253a54c2735 upstream. A malicious BPF program may manipulate the branch history to influence what the hardware speculates will happen next. On exit from a BPF program, emit the BHB mititgation sequence. This is only applied for 'classic' cBPF programs that are loaded by seccomp. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18arm64: proton-pack: Expose whether the branchy loop k valueJames Morse
commit a1152be30a043d2d4dcb1683415f328bf3c51978 upstream. Add a helper to expose the k value of the branchy loop. This is needed by the BPF JIT to generate the mitigation sequence in BPF programs. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18arm64: proton-pack: Expose whether the platform is mitigated by firmwareJames Morse
commit e7956c92f396a44eeeb6eaf7a5b5e1ad24db6748 upstream. is_spectre_bhb_fw_affected() allows the caller to determine if the CPU is known to need a firmware mitigation. CPUs are either on the list of CPUs we know about, or firmware has been queried and reported that the platform is affected - and mitigated by firmware. This helper is not useful to determine if the platform is mitigated by firmware. A CPU could be on the know list, but the firmware may not be implemented. Its affected but not mitigated. spectre_bhb_enable_mitigation() handles this distinction by checking the firmware state before enabling the mitigation. Add a helper to expose this state. This will be used by the BPF JIT to determine if calling firmware for a mitigation is necessary and supported. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18arm64: insn: Add support for encoding DSBJames Morse
commit 63de8abd97ddb9b758bd8f915ecbd18e1f1a87a0 upstream. To generate code in the eBPF epilogue that uses the DSB instruction, insn.c needs a heler to encode the type and domain. Re-use the crm encoding logic from the DMB instruction. Signed-off-by: James Morse <james.morse@arm.com> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18Revert "um: work around sched_yield not yielding in time-travel mode"Christian Lamparter
This reverts commit da780c4a075ba2deb05ae29f0af4a990578c7901 which is commit 887c5c12e80c8424bd471122d2e8b6b462e12874 upstream. Reason being that the patch depends on at least commit 0b8b2668f998 ("um: insert scheduler ticks when userspace does not yield") in order to build. Otherwise it fails with: | /usr/bin/ld: arch/um/kernel/skas/syscall.o: in function `handle_syscall': | linux-6.12.27/arch/um/kernel/skas/syscall.c:43:(.text+0xa2): undefined | reference to `tt_extra_sched_jiffies' | collect2: error: ld returned 1 exit status The author Benjamin Berg commented: "I think it is better to just not backport commit 0b8b2668f998 ("um: insert scheduler ticks when userspace does not yield")" Link: https://lore.kernel.org/linux-um/8ce0b6056a9726e540f61bce77311278654219eb.camel@sipsolutions.net/ Cc: <stable@vger.kernel.org> # 6.12.y Cc: Benjamin Berg <benjamin.berg@intel.com> Signed-off-by: Christian Lamparter <chunkeey@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18riscv: misaligned: enable IRQs while handling misaligned accessesClément Léger
[ Upstream commit 453805f0a28fc5091e46145e6560c776f7c7a611 ] We can safely reenable IRQs if coming from userspace. This allows to access user memory that could potentially trigger a page fault. Fixes: b686ecdeacf6 ("riscv: misaligned: Restrict user access to kernel memory") Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250422162324.956065-3-cleger@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-18riscv: misaligned: factorize trap handlingClément Léger
[ Upstream commit fd94de9f9e7aac11ec659e386b9db1203d502023 ] Since both load/store and user/kernel should use almost the same path and that we are going to add some code around that, factorize it. Signed-off-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250422162324.956065-2-cleger@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Stable-dep-of: 453805f0a28f ("riscv: misaligned: enable IRQs while handling misaligned accesses") Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-18riscv: misaligned: Add handling for ZCB instructionsNylon Chen
[ Upstream commit eb16b3727c05ed36420c90eca1e8f0e279514c1c ] Add support for the Zcb extension's compressed half-word instructions (C.LHU, C.LH, and C.SH) in the RISC-V misaligned access trap handler. Signed-off-by: Zong Li <zong.li@sifive.com> Signed-off-by: Nylon Chen <nylon.chen@sifive.com> Fixes: 956d705dd279 ("riscv: Unaligned load/store handling for M_MODE") Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Link: https://lore.kernel.org/r/20250411073850.3699180-2-nylon.chen@sifive.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-18MIPS: Fix MAX_REG_OFFSETThorsten Blum
[ Upstream commit c44572e0cc13c9afff83fd333135a0aa9b27ba26 ] Fix MAX_REG_OFFSET to point to the last register in 'pt_regs' and not to the marker itself, which could allow regs_get_register() to return an invalid offset. Fixes: 40e084a506eb ("MIPS: Add uprobes support.") Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-18x86/microcode: Consolidate the loader enablement checkingBorislav Petkov (AMD)
commit 5214a9f6c0f56644acb9d2cbb58facf1856d322b upstream. Consolidate the whole logic which determines whether the microcode loader should be enabled or not into a single function and call it everywhere. Well, almost everywhere - not in mk_early_pgtbl_32() because there the kernel is running without paging enabled and checking dis_ucode_ldr et al would require physical addresses and uglification of the code. But since this is 32-bit, the easier thing to do is to simply map the initrd unconditionally especially since that mapping is getting removed later anyway by zap_early_initrd_mapping() and avoid the uglification. In doing so, address the issue of old 486er machines without CPUID support, not booting current kernels. [ mingo: Fix no previous prototype for ‘microcode_loader_disabled’ [-Wmissing-prototypes] ] Fixes: 4c585af7180c1 ("x86/boot/32: Temporarily map initrd for microcode loading") Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: <stable@kernel.org> Link: https://lore.kernel.org/r/CANpbe9Wm3z8fy9HbgS8cuhoj0TREYEEkBipDuhgkWFvqX0UoVQ@mail.gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18arm64: cpufeature: Move arm64_use_ng_mappings to the .data section to ↵Yeoreum Yun
prevent wrong idmap generation commit 363cd2b81cfdf706bbfc9ec78db000c9b1ecc552 upstream. The PTE_MAYBE_NG macro sets the nG page table bit according to the value of "arm64_use_ng_mappings". This variable is currently placed in the .bss section. create_init_idmap() is called before the .bss section initialisation which is done in early_map_kernel(). Therefore, data/test_prot in create_init_idmap() could be set incorrectly through the PAGE_KERNEL -> PROT_DEFAULT -> PTE_MAYBE_NG macros. # llvm-objdump-21 --syms vmlinux-gcc | grep arm64_use_ng_mappings ffff800082f242a8 g O .bss 0000000000000001 arm64_use_ng_mappings The create_init_idmap() function disassembly compiled with llvm-21: // create_init_idmap() ffff80008255c058: d10103ff sub sp, sp, #0x40 ffff80008255c05c: a9017bfd stp x29, x30, [sp, #0x10] ffff80008255c060: a90257f6 stp x22, x21, [sp, #0x20] ffff80008255c064: a9034ff4 stp x20, x19, [sp, #0x30] ffff80008255c068: 910043fd add x29, sp, #0x10 ffff80008255c06c: 90003fc8 adrp x8, 0xffff800082d54000 ffff80008255c070: d280e06a mov x10, #0x703 // =1795 ffff80008255c074: 91400409 add x9, x0, #0x1, lsl #12 // =0x1000 ffff80008255c078: 394a4108 ldrb w8, [x8, #0x290] ------------- (1) ffff80008255c07c: f2e00d0a movk x10, #0x68, lsl #48 ffff80008255c080: f90007e9 str x9, [sp, #0x8] ffff80008255c084: aa0103f3 mov x19, x1 ffff80008255c088: aa0003f4 mov x20, x0 ffff80008255c08c: 14000000 b 0xffff80008255c08c <__pi_create_init_idmap+0x34> ffff80008255c090: aa082d56 orr x22, x10, x8, lsl #11 -------- (2) Note (1) is loading the arm64_use_ng_mappings value in w8 and (2) is set the text or data prot with the w8 value to set PTE_NG bit. If the .bss section isn't initialized, x8 could include a garbage value and generate an incorrect mapping. Annotate arm64_use_ng_mappings as __read_mostly so that it is placed in the .data section. Fixes: 84b04d3e6bdb ("arm64: kernel: Create initial ID map from C code") Cc: stable@vger.kernel.org # 6.9.x Tested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Yeoreum Yun <yeoreum.yun@arm.com> Link: https://lore.kernel.org/r/20250502180412.3774883-1-yeoreum.yun@arm.com [catalin.marinas@arm.com: use __read_mostly instead of __ro_after_init] [catalin.marinas@arm.com: slight tweaking of the code comment] Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18KVM: SVM: Forcibly leave SMM mode on SHUTDOWN interceptionMikhail Lobanov
commit a2620f8932fa9fdabc3d78ed6efb004ca409019f upstream. Previously, commit ed129ec9057f ("KVM: x86: forcibly leave nested mode on vCPU reset") addressed an issue where a triple fault occurring in nested mode could lead to use-after-free scenarios. However, the commit did not handle the analogous situation for System Management Mode (SMM). This omission results in triggering a WARN when KVM forces a vCPU INIT after SHUTDOWN interception while the vCPU is in SMM. This situation was reprodused using Syzkaller by: 1) Creating a KVM VM and vCPU 2) Sending a KVM_SMI ioctl to explicitly enter SMM 3) Executing invalid instructions causing consecutive exceptions and eventually a triple fault The issue manifests as follows: WARNING: CPU: 0 PID: 25506 at arch/x86/kvm/x86.c:12112 kvm_vcpu_reset+0x1d2/0x1530 arch/x86/kvm/x86.c:12112 Modules linked in: CPU: 0 PID: 25506 Comm: syz-executor.0 Not tainted 6.1.130-syzkaller-00157-g164fe5dde9b6 #0 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014 RIP: 0010:kvm_vcpu_reset+0x1d2/0x1530 arch/x86/kvm/x86.c:12112 Call Trace: <TASK> shutdown_interception+0x66/0xb0 arch/x86/kvm/svm/svm.c:2136 svm_invoke_exit_handler+0x110/0x530 arch/x86/kvm/svm/svm.c:3395 svm_handle_exit+0x424/0x920 arch/x86/kvm/svm/svm.c:3457 vcpu_enter_guest arch/x86/kvm/x86.c:10959 [inline] vcpu_run+0x2c43/0x5a90 arch/x86/kvm/x86.c:11062 kvm_arch_vcpu_ioctl_run+0x50f/0x1cf0 arch/x86/kvm/x86.c:11283 kvm_vcpu_ioctl+0x570/0xf00 arch/x86/kvm/../../../virt/kvm/kvm_main.c:4122 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl fs/ioctl.c:856 [inline] __x64_sys_ioctl+0x19a/0x210 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x35/0x80 arch/x86/entry/common.c:81 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Architecturally, INIT is blocked when the CPU is in SMM, hence KVM's WARN() in kvm_vcpu_reset() to guard against KVM bugs, e.g. to detect improper emulation of INIT. SHUTDOWN on SVM is a weird edge case where KVM needs to do _something_ sane with the VMCB, since it's technically undefined, and INIT is the least awful choice given KVM's ABI. So, double down on stuffing INIT on SHUTDOWN, and force the vCPU out of SMM to avoid any weirdness (and the WARN). Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Fixes: ed129ec9057f ("KVM: x86: forcibly leave nested mode on vCPU reset") Cc: stable@vger.kernel.org Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Mikhail Lobanov <m.lobanov@rosa.ru> Link: https://lore.kernel.org/r/20250414171207.155121-1-m.lobanov@rosa.ru [sean: massage changelog, make it clear this isn't architectural behavior] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18x86/mm: Eliminate window where TLB flushes may be inadvertently skippedDave Hansen
commit fea4e317f9e7e1f449ce90dedc27a2d2a95bee5a upstream. tl;dr: There is a window in the mm switching code where the new CR3 is set and the CPU should be getting TLB flushes for the new mm. But should_flush_tlb() has a bug and suppresses the flush. Fix it by widening the window where should_flush_tlb() sends an IPI. Long Version: === History === There were a few things leading up to this. First, updating mm_cpumask() was observed to be too expensive, so it was made lazier. But being lazy caused too many unnecessary IPIs to CPUs due to the now-lazy mm_cpumask(). So code was added to cull mm_cpumask() periodically[2]. But that culling was a bit too aggressive and skipped sending TLB flushes to CPUs that need them. So here we are again. === Problem === The too-aggressive code in should_flush_tlb() strikes in this window: // Turn on IPIs for this CPU/mm combination, but only // if should_flush_tlb() agrees: cpumask_set_cpu(cpu, mm_cpumask(next)); next_tlb_gen = atomic64_read(&next->context.tlb_gen); choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush); load_new_mm_cr3(need_flush); // ^ After 'need_flush' is set to false, IPIs *MUST* // be sent to this CPU and not be ignored. this_cpu_write(cpu_tlbstate.loaded_mm, next); // ^ Not until this point does should_flush_tlb() // become true! should_flush_tlb() will suppress TLB flushes between load_new_mm_cr3() and writing to 'loaded_mm', which is a window where they should not be suppressed. Whoops. === Solution === Thankfully, the fuzzy "just about to write CR3" window is already marked with loaded_mm==LOADED_MM_SWITCHING. Simply checking for that state in should_flush_tlb() is sufficient to ensure that the CPU is targeted with an IPI. This will cause more TLB flush IPIs. But the window is relatively small and I do not expect this to cause any kind of measurable performance impact. Update the comment where LOADED_MM_SWITCHING is written since it grew yet another user. Peter Z also raised a concern that should_flush_tlb() might not observe 'loaded_mm' and 'is_lazy' in the same order that switch_mm_irqs_off() writes them. Add a barrier to ensure that they are observed in the order they are written. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Rik van Riel <riel@surriel.com> Link: https://lore.kernel.org/oe-lkp/202411282207.6bd28eae-lkp@intel.com/ [1] Fixes: 6db2526c1d69 ("x86/mm/tlb: Only trim the mm_cpumask once a second") [2] Reported-by: Stephen Dolan <sdolan@janestreet.com> Cc: stable@vger.kernel.org Acked-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18s390/entry: Fix last breaking event handling in case of stack corruptionHeiko Carstens
[ Upstream commit ae952eea6f4a7e2193f8721a5366049946e012e7 ] In case of stack corruption stack_invalid() is called and the expectation is that register r10 contains the last breaking event address. This dependency is quite subtle and broke a couple of years ago without that anybody noticed. Fix this by getting rid of the dependency and read the last breaking event address from lowcore. Fixes: 56e62a737028 ("s390: convert to generic entry") Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-18s390/pci: Fix missing check for zpci_create_device() error returnNiklas Schnelle
commit 42420c50c68f3e95e90de2479464f420602229fc upstream. The zpci_create_device() function returns an error pointer that needs to be checked before dereferencing it as a struct zpci_dev pointer. Add the missing check in __clp_add() where it was missed when adding the scan_list in the fixed commit. Simply not adding the device to the scan list results in the previous behavior. Cc: stable@vger.kernel.org Fixes: 0467cdde8c43 ("s390/pci: Sort PCI functions prior to creating virtual busses") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-18arm64: dts: imx8mm-verdin: Link reg_usdhc2_vqmmc to usdhc2Wojciech Dubowik
commit 5591ce0069ddda97cdbbea596bed53e698f399c2 upstream. Define vqmmc regulator-gpio for usdhc2 with vin-supply coming from LDO5. Without this definition LDO5 will be powered down, disabling SD card after bootup. This has been introduced in commit f5aab0438ef1 ("regulator: pca9450: Fix enable register for LDO5"). Fixes: 6a57f224f734 ("arm64: dts: freescale: add initial support for verdin imx8m mini") Fixes: f5aab0438ef1 ("regulator: pca9450: Fix enable register for LDO5") Tested-by: Manuel Traut <manuel.traut@mt.com> Reviewed-by: Philippe Schenker <philippe.schenker@impulsing.ch> Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Cc: stable@vger.kernel.org Signed-off-by: Wojciech Dubowik <Wojciech.Dubowik@mt.com> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-09arm64: dts: st: Use 128kB size for aliased GIC400 register access on ↵Christian Bruel
stm32mp25 SoCs [ Upstream commit 06c231fe953a26f4bc9d7a37ba1b9b288a59c7c2 ] Adjust the size of 8kB GIC regions to 128kB so that each 4kB is mapped 16 times over a 64kB region. The offset is then adjusted in the irq-gic driver. see commit 12e14066f4835 ("irqchip/GIC: Add workaround for aliased GIC400") Fixes: 5d30d03aaf785 ("arm64: dts: st: introduce stm32mp25 SoCs family") Suggested-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Christian Bruel <christian.bruel@foss.st.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250415111654.2103767-3-christian.bruel@foss.st.com Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-09arm64: dts: st: Adjust interrupt-controller for stm32mp25 SoCsChristian Bruel
[ Upstream commit de2b2107d5a41a91ab603e135fb6e408abbee28e ] Use gic-400 compatible and remove address-cells = <1> on aarch64 Fixes: 5d30d03aaf785 ("arm64: dts: st: introduce stm32mp25 SoCs family") Signed-off-by: Christian Bruel <christian.bruel@foss.st.com> Link: https://lore.kernel.org/r/20250415111654.2103767-2-christian.bruel@foss.st.com Signed-off-by: Alexandre Torgue <alexandre.torgue@foss.st.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-09ARM: dts: opos6ul: add ksz8081 phy propertiesSébastien Szymanski
[ Upstream commit 6e1a7bc8382b0d4208258f7d2a4474fae788dd90 ] Commit c7e73b5051d6 ("ARM: imx: mach-imx6ul: remove 14x14 EVK specific PHY fixup") removed a PHY fixup that setted the clock mode and the LED mode. Make the Ethernet interface work again by doing as advised in the commit's log, set clock mode and the LED mode in the device tree. Fixes: c7e73b5051d6 ("ARM: imx: mach-imx6ul: remove 14x14 EVK specific PHY fixup") Signed-off-by: Sébastien Szymanski <sebastien.szymanski@armadeus.com> Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de> Signed-off-by: Shawn Guo <shawnguo@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>