Age | Commit message | Author
2023-08-08 | Linux 6.4.9 [v6.4.9] | Greg Kroah-Hartman
Tested-by: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86: fix backwards merge of GDS/SRSO bit | Greg Kroah-Hartman
Stable-tree-only change. Due to the way the GDS and SRSO patches flowed into the stable tree, there was a 50% chance of resolving the merge with the wrong value for this bit between GDS and SRSO. Of course, I lost that bet and chose the opposite of what Linus chose in commit 64094e7e3118 ("Merge tag 'gds-for-linus-2023-08-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip"). Fix this up by switching the values to match what is now in Linus's tree, as that is the correct value to mirror. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | xen/netback: Fix buffer overrun triggered by unusual packet | Ross Lagerwall
commit 534fc31d09b706a16d83533e16b5dc855caf7576 upstream. It is possible that a guest can send a packet that contains a head + 18 slots and yet has a len <= XEN_NETBACK_TX_COPY_LEN. This causes nr_slots to underflow in xenvif_get_requests() which then causes the subsequent loop's termination condition to be wrong, causing a buffer overrun of queue->tx_map_ops. Rework the code to account for the extra frag_overflow slots. This is CVE-2023-34319 / XSA-432. Fixes: ad7f402ae4f4 ("xen/netback: Ensure protocol headers don't fall in the non-linear area") Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com> Reviewed-by: Paul Durrant <paul@xen.org> Reviewed-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
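To illustrate the failure mode only (this is not the netback code itself): with an unsigned slot count, failing to account for the extra copied/overflow slots makes the subtraction wrap around instead of going negative, so a loop bounded by it runs far past the end of queue->tx_map_ops. A minimal, self-contained sketch of that pattern:

    #include <stdio.h>

    /* Illustration of the unsigned-underflow pattern described above. */
    int main(void)
    {
            unsigned int nr_slots = 2;   /* slots that remain for the frags */
            unsigned int head_slots = 4; /* slots consumed by the copied head */

            nr_slots -= head_slots;      /* wraps to a huge value, not -2 */
            printf("loop bound after underflow: %u\n", nr_slots);
            return 0;
    }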
2023-08-08 | x86/srso: Tie SBPB bit setting to microcode patch detection | Borislav Petkov (AMD)
commit 5a15d8348881e9371afdf9f5357a135489496955 upstream. The SBPB bit in MSR_IA32_PRED_CMD is supported only after a microcode patch has been applied, so set X86_FEATURE_SBPB only then. Otherwise, guests would attempt to set that bit and #GP on the MSR write. While at it, make SMT detection more robust as some guests - depending on how and which CPUID leaves they report - lead to cpu_smt_control getting set to CPU_SMT_NOT_SUPPORTED, but SRSO_NO should be set for any guest incarnation where one simply cannot do SMT, for whatever reason. Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation") Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Reported-by: Salvatore Bonaccorso <carnil@debian.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/srso: Add a forgotten NOENDBR annotation | Borislav Petkov (AMD)
Upstream commit: 3bbbe97ad83db8d9df06daf027b0840188de625d Fix: vmlinux.o: warning: objtool: .export_symbol+0x29e40: data relocation to !ENDBR: srso_untrain_ret_alias+0x0 Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/srso: Fix return thunks in generated code | Josh Poimboeuf
Upstream commit: 238ec850b95a02dcdff3edc86781aa913549282f Set X86_FEATURE_RETHUNK when enabling the SRSO mitigation so that generated code (e.g., ftrace, static call, eBPF) generates "jmp __x86_return_thunk" instead of RET. [ bp: Add a comment. ] Fixes: fb3bd914b3ec ("x86/srso: Add a Speculative RAS Overflow mitigation") Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/srso: Add IBPB on VMEXIT | Borislav Petkov (AMD)
Upstream commit: d893832d0e1ef41c72cdae444268c1d64a2be8ad Add the option to flush IBPB only on VMEXIT in order to protect against malicious guests when the software that runs on the hypervisor is otherwise trusted. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/srso: Add IBPB | Borislav Petkov (AMD)
Upstream commit: 233d6f68b98d480a7c42ebe78c38f79d44741ca9 Add the option to mitigate using IBPB on a kernel entry. Pull in the Retbleed alternative so that the IBPB call from there can be used. Also, if Retbleed mitigation is done using IBPB, the same mitigation can and must be used here. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/srso: Add SRSO_NO support | Borislav Petkov (AMD)
Upstream commit: 1b5277c0ea0b247393a9c426769fde18cff5e2f6 Add support for the CPUID flag which denotes that the CPU is not affected by SRSO. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/srso: Add IBPB_BRTYPE support | Borislav Petkov (AMD)
Upstream commit: 79113e4060aba744787a81edb9014f2865193854 Add support for the synthetic CPUID flag which "if this bit is 1, it indicates that MSR 49h (PRED_CMD) bit 0 (IBPB) flushes all branch type predictions from the CPU branch predictor." This flag is there so that this capability in guests can be detected easily (otherwise one would have to track microcode revisions which is impossible for guests). It is also needed only for Zen3 and -4. The other two (Zen1 and -2) always flush branch type predictions by default. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/srso: Add a Speculative RAS Overflow mitigation | Borislav Petkov (AMD)
Upstream commit: fb3bd914b3ec28f5fb697ac55c4846ac2d542855 Add a mitigation for the speculative return address stack overflow vulnerability found on AMD processors. The mitigation works by ensuring all RET instructions speculate to a controlled location, similar to how speculation is controlled in the retpoline sequence. To accomplish this, the __x86_return_thunk forces the CPU to mispredict every function return using a 'safe return' sequence. To ensure the safety of this mitigation, the kernel must ensure that the safe return sequence is itself free from attacker interference. In Zen3 and Zen4, this is accomplished by creating a BTB alias between the untraining function srso_untrain_ret_alias() and the safe return function srso_safe_ret_alias(), which results in evicting a potentially poisoned BTB entry and using that safe one for all function returns. In older Zen1 and Zen2, this is accomplished using a reinterpretation technique similar to the Retbleed one: srso_untrain_ret() and srso_safe_ret(). Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/bugs: Increase the x86 bugs vector size to two u32s | Borislav Petkov (AMD)
Upstream commit: 0e52740ffd10c6c316837c6c128f460f1aaba1ea It was always clear that the bug bits would eventually outgrow a single u32. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | Documentation/x86: Fix backwards on/off logic about YMM support | Dave Hansen
commit 1b0fc0345f2852ffe54fb9ae0e12e2ee69ad6a20 upstream These options clearly turn *off* XSAVE YMM support. Correct the typo. Reported-by: Ben Hutchings <ben@decadent.org.uk> Fixes: 553a5c03e90a ("x86/speculation: Add force option to GDS mitigation") Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/xen: Fix secondary processors' FPU initialization | Juergen Gross
commit fe3e0a13e597c1c8617814bf9b42ab732db5c26e upstream. Moving the call of fpu__init_cpu() from cpu_init() to start_secondary() broke Xen PV guests, as those don't call start_secondary() for APs. Call fpu__init_cpu() in Xen's cpu_bringup(), which is the Xen PV replacement of start_secondary(). Fixes: b81fac906a8f ("x86/fpu: Move FPU initialization into arch_cpu_finalize_init()") Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Acked-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230703130032.22916-1-jgross@suse.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
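A rough sketch of the shape of the fix (kernel-style pseudocode under the assumptions stated in the message, not the literal arch/x86/xen hunk):

    /* Xen PV AP bringup: APs never run start_secondary(), so do the
     * per-CPU FPU setup here instead. */
    static void cpu_bringup(void)
    {
            cpu_init();
            fpu__init_cpu();     /* the call start_secondary() would have made */
            /* ... remainder of bringup ... */
    }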
2023-08-08 | x86/mem_encrypt: Unbreak the AMD_MEM_ENCRYPT=n build | Thomas Gleixner
commit 0a9567ac5e6a40cdd9c8cd15b19a62a15250f450 upstream. Moving mem_encrypt_init() broke the AMD_MEM_ENCRYPT=n build because the declaration of that function was under #ifdef CONFIG_AMD_MEM_ENCRYPT and the obvious placement for the inline stub was the #else path. This is a leftover of commit 20f07a044a76 ("x86/sev: Move common memory encryption code to mem_encrypt.c") which made mem_encrypt_init() depend on X86_MEM_ENCRYPT without moving the prototype. That did not fail back then because there was no inline stub as the core init code had a weak function. Move both the declaration and the stub out of the CONFIG_AMD_MEM_ENCRYPT section and guard it with CONFIG_X86_MEM_ENCRYPT. Fixes: 439e17576eb4 ("init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init()") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Closes: https://lore.kernel.org/oe-kbuild-all/202306170247.eQtCJPE8-lkp@intel.com/ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | KVM: Add GDS_NO support to KVM | Daniel Sneddon
commit 81ac7e5d741742d650b4ed6186c4826c1a0631a7 upstream Gather Data Sampling (GDS) is a transient execution attack using gather instructions from the AVX2 and AVX512 extensions. This attack allows malicious code to infer data that was previously stored in vector registers. Systems that are not vulnerable to GDS will set the GDS_NO bit of the IA32_ARCH_CAPABILITIES MSR. This is useful for VM guests that may think they are on vulnerable systems that are, in fact, not affected. Guests that are running on affected hosts where the mitigation is enabled are protected as if they were running on an unaffected system. On all hosts that are not affected or that are mitigated, set the GDS_NO bit. Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/speculation: Add Kconfig option for GDS | Daniel Sneddon
commit 53cf5797f114ba2bd86d23a862302119848eff19 upstream Gather Data Sampling (GDS) is mitigated in microcode. However, on systems that haven't received the updated microcode, disabling AVX can act as a mitigation. Add a Kconfig option that uses the microcode mitigation if available and disables AVX otherwise. Setting this option has no effect on systems not affected by GDS. This is the equivalent of setting gather_data_sampling=force. Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/speculation: Add force option to GDS mitigation | Daniel Sneddon
commit 553a5c03e90a6087e88f8ff878335ef0621536fb upstream The Gather Data Sampling (GDS) vulnerability allows malicious software to infer stale data previously stored in vector registers. This may include sensitive data such as cryptographic keys. GDS is mitigated in microcode, and systems with up-to-date microcode are protected by default. However, any affected system that is running with older microcode will still be vulnerable to GDS attacks. Since the gather instructions used by the attacker are part of the AVX2 and AVX512 extensions, disabling these extensions prevents gather instructions from being executed, thereby mitigating GDS. Disabling AVX2 alone would be sufficient, but there is no control with that granularity: clearing XCR0[2] disables all of AVX, with no option to disable just AVX2. Add a kernel parameter gather_data_sampling=force that will enable the microcode mitigation if available, otherwise it will disable AVX on affected systems. This option is ignored if mitigations=off is on the command line. This is a *big* hammer. It is known to break buggy userspace that uses incomplete, buggy AVX enumeration. Unfortunately, such userspace does exist in the wild: https://www.mail-archive.com/bug-coreutils@gnu.org/msg33046.html [ dhansen: add some more ominous warnings about disabling AVX ] Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
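As a rough illustration of how such a boot parameter is typically wired up (a hedged sketch, not the actual bugs.c parser; the handler name and enum values are assumptions):

    /* Hypothetical early_param() handler for gather_data_sampling= */
    static int __init gds_parse_cmdline(char *str)
    {
            if (!str)
                    return -EINVAL;

            if (!strcmp(str, "off"))
                    gds_mitigation = GDS_MITIGATION_OFF;
            else if (!strcmp(str, "force"))
                    gds_mitigation = GDS_MITIGATION_FORCE; /* disable AVX if no ucode fix */

            return 0;
    }
    early_param("gather_data_sampling", gds_parse_cmdline);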
2023-08-08 | x86/speculation: Add Gather Data Sampling mitigation | Daniel Sneddon
commit 8974eb588283b7d44a7c91fa09fcbaf380339f3a upstream Gather Data Sampling (GDS) is a hardware vulnerability which allows unprivileged speculative access to data which was previously stored in vector registers. Intel processors that support AVX2 and AVX512 have gather instructions that fetch non-contiguous data elements from memory. On vulnerable hardware, when a gather instruction is transiently executed and encounters a fault, stale data from architectural or internal vector registers may get transiently stored to the destination vector register allowing an attacker to infer the stale data using typical side channel techniques like cache timing attacks. This mitigation is different from many earlier ones for two reasons. First, it is enabled by default and a bit must be set to *DISABLE* it. This is the opposite of normal mitigation polarity. This means GDS can be mitigated simply by updating microcode and leaving the new control bit alone. Second, GDS has a "lock" bit. This lock bit is there because the mitigation affects the hardware security features KeyLocker and SGX. It needs to be enabled and *STAY* enabled for these features to be mitigated against GDS. The mitigation is enabled in the microcode by default. Disable it by setting gather_data_sampling=off or by disabling all mitigations with mitigations=off. The mitigation status can be checked by reading: /sys/devices/system/cpu/vulnerabilities/gather_data_sampling Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/fpu: Move FPU initialization into arch_cpu_finalize_init() | Thomas Gleixner
commit b81fac906a8f9e682e513ddd95697ec7a20878d4 upstream Initializing the FPU during the early boot process is a pointless exercise. Early boot is convoluted and fragile enough. Nothing requires that the FPU is set up early. It has to be initialized before fork_init() because the task_struct size depends on the FPU register buffer size. Move the initialization to arch_cpu_finalize_init() which is the perfect place to do so. No functional change. This allows removing quite a bit of the custom early command line parsing, but that's subject to the next installment. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.902376621@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/fpu: Mark init functions __init | Thomas Gleixner
commit 1703db2b90c91b2eb2d699519fc505fe431dde0e upstream No point in keeping them around. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.841685728@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/fpu: Remove cpuinfo argument from init functions | Thomas Gleixner
commit 1f34bb2a24643e0087652d81078e4f616562738d upstream Nothing in the call chain requires it Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.783704297@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/init: Initialize signal frame size late | Thomas Gleixner
commit 54d9a91a3d6713d1332e93be13b4eaf0fa54349d upstream No point in doing this during really early boot. Move it to an early initcall so that it is set up before possible user mode helpers are started during device initialization. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.727330699@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | init, x86: Move mem_encrypt_init() into arch_cpu_finalize_init() | Thomas Gleixner
commit 439e17576eb47f26b78c5bbc72e344d4206d2327 upstream Invoke the X86ism mem_encrypt_init() from X86 arch_cpu_finalize_init() and remove the weak fallback from the core code. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.670360645@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | init: Invoke arch_cpu_finalize_init() earlier | Thomas Gleixner
commit 9df9d2f0471b4c4702670380b8d8a45b40b23a7d upstream X86 is reworking the boot process so that initializations which are not required during early boot can be moved into the late boot process and out of the fragile and restricted initial boot phase. arch_cpu_finalize_init() is the obvious place to do such initializations, but arch_cpu_finalize_init() is invoked too late in start_kernel() e.g. for initializing the FPU completely. fork_init() requires that the FPU is initialized as the size of task_struct on X86 depends on the size of the required FPU register buffer. Fortunately none of the init calls between calibrate_delay() and arch_cpu_finalize_init() is relevant for the functionality of arch_cpu_finalize_init(). Invoke it right after calibrate_delay() where everything which is relevant for arch_cpu_finalize_init() has been set up already. No functional change intended. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Rick Edgecombe <rick.p.edgecombe@intel.com> Link: https://lore.kernel.org/r/20230613224545.612182854@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
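Reduced to the calls named above, the resulting ordering in start_kernel() looks roughly like this (a sketch of the relevant sequence, not the full function):

    calibrate_delay();
    arch_cpu_finalize_init();   /* moved up: everything it needs is already set up */
    /* ... */
    fork_init();                /* can now size task_struct for the FPU buffer */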
2023-08-08 | init: Remove check_bugs() leftovers | Thomas Gleixner
commit 61235b24b9cb37c13fcad5b9596d59a1afdcec30 upstream Everything is converted over to arch_cpu_finalize_init(). Remove the check_bugs() leftovers including the empty stubs in asm-generic, alpha, parisc, powerpc and xtensa. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Link: https://lore.kernel.org/r/20230613224545.553215951@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | um/cpu: Switch to arch_cpu_finalize_init() | Thomas Gleixner
commit 9349b5cd0908f8afe95529fc7a8cbb1417df9b0c upstream check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Richard Weinberger <richard@nod.at> Link: https://lore.kernel.org/r/20230613224545.493148694@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | sparc/cpu: Switch to arch_cpu_finalize_init() | Thomas Gleixner
commit 44ade508e3bfac45ae97864587de29eb1a881ec0 upstream check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Sam Ravnborg <sam@ravnborg.org> Link: https://lore.kernel.org/r/20230613224545.431995857@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | sh/cpu: Switch to arch_cpu_finalize_init() | Thomas Gleixner
commit 01eb454e9bfe593f320ecbc9aaec60bf87cd453d upstream check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.371697797@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | mips/cpu: Switch to arch_cpu_finalize_init() | Thomas Gleixner
commit 7f066a22fe353a827a402ee2835e81f045b1574d upstream check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.312438573@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | m68k/cpu: Switch to arch_cpu_finalize_init() | Thomas Gleixner
commit 9ceecc2589b9d7cef6b321339ed8de484eac4b20 upstream check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://lore.kernel.org/r/20230613224545.254342916@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | loongarch/cpu: Switch to arch_cpu_finalize_init() | Thomas Gleixner
commit 9841c423164787feb8f1442f922b7d80a70c82f1 upstream check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.195288218@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | ia64/cpu: Switch to arch_cpu_finalize_init() | Thomas Gleixner
commit 6c38e3005621800263f117fb00d6787a76e16de7 upstream check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.137045745@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | ARM: cpu: Switch to arch_cpu_finalize_init() | Thomas Gleixner
commit ee31bb0524a2e7c99b03f50249a411cc1eaa411f upstream check_bugs() is about to be phased out. Switch over to the new arch_cpu_finalize_init() implementation. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224545.078124882@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | x86/cpu: Switch to arch_cpu_finalize_init() | Thomas Gleixner
commit 7c7077a72674402654f3291354720cd73cdf649e upstream check_bugs() is a dumping ground for finalizing the CPU bringup. Only parts of it have to do with actual CPU bugs. Split it apart into arch_cpu_finalize_init() and cpu_select_mitigations(). Fixup the bogus 32bit comments while at it. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20230613224545.019583869@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-08 | init: Provide arch_cpu_finalize_init() | Thomas Gleixner
commit 7725acaa4f0c04fbefb0e0d342635b967bb7d414 upstream check_bugs() has become a dumping ground for all sorts of activities to finalize the CPU initialization before running the rest of the init code. Most are empty, a few do actual bug checks, some do alternative patching and some cobble a CPU advertisement string together. Aside from that, the current implementation requires duplicated function declarations and mostly empty header files for them. Provide a new function arch_cpu_finalize_init(). Provide a generic declaration if CONFIG_ARCH_HAS_CPU_FINALIZE_INIT is selected and a stub inline otherwise. This requires a temporary #ifdef in start_kernel() which will be removed along with check_bugs() once the architectures are converted over. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20230613224544.957805717@linutronix.de Signed-off-by: Daniel Sneddon <daniel.sneddon@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
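The generic header side of this, roughly (a sketch of the declaration/stub pattern described above):

    /* Real declaration when the architecture opts in, empty inline stub otherwise. */
    #ifdef CONFIG_ARCH_HAS_CPU_FINALIZE_INIT
    void arch_cpu_finalize_init(void);
    #else
    static inline void arch_cpu_finalize_init(void) { }
    #endif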
2023-08-03 | Linux 6.4.8 [v6.4.8] | Greg Kroah-Hartman
Link: https://lore.kernel.org/r/20230801091925.659598007@linuxfoundation.org Tested-by: Ronald Warsow <rwarsow@gmx.de> Tested-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Conor Dooley <conor.dooley@microchip.com> Tested-by: SeongJae Park <sj@kernel.org> Tested-by: Shuah Khan <skhan@linuxfoundation.org> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Link: https://lore.kernel.org/r/20230802065501.780725463@linuxfoundation.org Tested-by: Chris Paterson (CIP) <chris.paterson2@renesas.com> Tested-by: Ronald Warsow <rwarsow@gmx.de> Tested-by: SeongJae Park <sj@kernel.org> Tested-by: Ron Economos <re@w6rz.net> Tested-by: Guenter Roeck <linux@roeck-us.net> Tested-by: Florian Fainelli <florian.fainelli@broadcom.com> Tested-by: Bagas Sanjaya <bagasdotme@gmail.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03 | dma-buf: fix an error pointer vs NULL bug | Dan Carpenter
commit 00ae1491f970acc454be0df63f50942d94825860 upstream. Smatch detected potential error pointer dereference. drivers/gpu/drm/drm_syncobj.c:888 drm_syncobj_transfer_to_timeline() error: 'fence' dereferencing possible ERR_PTR() The error pointer comes from dma_fence_allocate_private_stub(). One caller expected error pointers and one expected NULL pointers. Change it to return NULL and update the caller which expected error pointers, drm_syncobj_assign_null_handle(), to check for NULL instead. Fixes: f781f661e8c9 ("dma-buf: keep the signaling time of merged fences v3") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sumit Semwal <sumit.semwal@linaro.org> Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/b09f1996-3838-4fa2-9193-832b68262e43@moroto.mountain Cc: Jindong Yue <jindong.yue@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03 | dma-buf: keep the signaling time of merged fences v3 | Christian König
commit f781f661e8c99b0cb34129f2e374234d61864e77 upstream. Some Android CTS tests check whether the signaling time stays consistent across merges. v2: use the current time if the fence is still in the signaling path and the timestamp not yet available. v3: improve comment, fix one more case to use the correct timestamp Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230630120041.109216-1-christian.koenig@amd.com Cc: Jindong Yue <jindong.yue@nxp.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03 | mm/mempolicy: Take VMA lock before replacing policy | Jann Horn
commit 6c21e066f9256ea1df6f88768f6ae1080b7cf509 upstream. mbind() calls down into vma_replace_policy() without taking the per-VMA locks, replaces the VMA's vma->vm_policy pointer, and frees the old policy. That's bad; a concurrent page fault might still be using the old policy (in vma_alloc_folio()), resulting in use-after-free. Normally this will manifest as a use-after-free read first, but it can result in memory corruption, including because vma_alloc_folio() can call mpol_cond_put() on the freed policy, which conditionally changes the policy's refcount member. This bug is specific to CONFIG_NUMA, but it does also affect non-NUMA systems as long as the kernel was built with CONFIG_NUMA. Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it") Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03 | mm/memory-failure: fix hardware poison check in unpoison_memory() | Sidhartha Kumar
commit 6c54312f9689fbe27c70db5d42eebd29d04b672e upstream. It was pointed out[1] that using folio_test_hwpoison() is wrong as we need to check the individual page that has poison. folio_test_hwpoison() only checks the head page, so go back to using PageHWPoison(). User-visible effects include existing hwpoison-inject tests possibly failing, as unpoisoning a single subpage could lead to unpoisoning an entire folio. Memory unpoisoning could also not work as expected, as the function will break early due to only checking the head page and not the actually poisoned subpage. [1]: https://lore.kernel.org/lkml/ZLIbZygG7LqSI9xe@casper.infradead.org/ Link: https://lkml.kernel.org/r/20230717181812.167757-1-sidhartha.kumar@oracle.com Fixes: a6fddef49eef ("mm/memory-failure: convert unpoison_memory() to folios") Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
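Sketched, the distinction looks like this (illustrative, not the exact unpoison_memory() hunk): folio_test_hwpoison() reflects only the head page, while the poisoned subpage has to be tested directly.

    struct page *p = pfn_to_page(pfn);
    struct folio *folio = page_folio(p);

    if (!PageHWPoison(p))        /* was: !folio_test_hwpoison(folio) */
            goto unlock;         /* nothing poisoned on this particular subpage */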
2023-08-03 | mm: fix memory ordering for mm_lock_seq and vm_lock_seq | Jann Horn
commit b1f02b95758d05b799731d939e76a0bd6da312db upstream.

mm->mm_lock_seq effectively functions as a read/write lock; therefore it must be used with acquire/release semantics.

A specific example is the interaction between userfaultfd_register() and lock_vma_under_rcu(). userfaultfd_register() does the following from the point where it changes a VMA's flags to the point where concurrent readers are permitted again (in a simple scenario where only a single private VMA is accessed and no merging/splitting is involved):

    userfaultfd_register
      userfaultfd_set_vm_flags
        vm_flags_reset
          vma_start_write
            down_write(&vma->vm_lock->lock)
            vma->vm_lock_seq = mm_lock_seq [marks VMA as busy]
            up_write(&vma->vm_lock->lock)
          vm_flags_init [sets VM_UFFD_* in __vm_flags]
      vma->vm_userfaultfd_ctx.ctx = ctx
    mmap_write_unlock
      vma_end_write_all
        WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1) [unlocks VMA]

There are no memory barriers in between the __vm_flags update and the mm->mm_lock_seq update that unlocks the VMA, so the unlock can be reordered to above the `vm_flags_init()` call, which means from the perspective of a concurrent reader, a VMA can be marked as a userfaultfd VMA while it is not VMA-locked. That's bad, we definitely need a store-release for the unlock operation.

The non-atomic write to vma->vm_lock_seq in vma_start_write() is mostly fine because all accesses to vma->vm_lock_seq that matter are always protected by the VMA lock. There is a racy read in vma_start_read() though that can tolerate false-positives, so we should be using WRITE_ONCE() to keep things tidy and data-race-free (including for KCSAN).

On the other side, lock_vma_under_rcu() works as follows in the relevant region for locking and userfaultfd check:

    lock_vma_under_rcu
      vma_start_read
        vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq) [early bailout]
        down_read_trylock(&vma->vm_lock->lock)
        vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq) [main check]
      userfaultfd_armed
        checks vma->vm_flags & __VM_UFFD_FLAGS

Here, the interesting aspect is how far down the mm->mm_lock_seq read can be reordered - if this read is reordered down below the vma->vm_flags access, this could cause lock_vma_under_rcu() to partly operate on information that was read while the VMA was supposed to be locked. To prevent this kind of downwards bleeding of the mm->mm_lock_seq read, we need to read it with a load-acquire.

Some of the comment wording is based on suggestions by Suren.

BACKPORT WARNING: One of the functions changed by this patch (which I've written against Linus' tree) is vma_try_start_write(), but this function no longer exists in mm/mm-everything. I don't know whether the merged version of this patch will be ordered before or after the patch that removes vma_try_start_write(). If you're backporting this patch to a tree with vma_try_start_write(), make sure this patch changes that function.

Link: https://lkml.kernel.org/r/20230721225107.942336-1-jannh@google.com Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
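A minimal sketch of the acquire/release pairing called for above (vma_end_write_all() is named in the message; the reader-side helper name here is made up):

    /* Writer side (unlock): release, so all prior VMA updates are visible
     * before the sequence bump that lets readers back in. */
    static inline void vma_end_write_all(struct mm_struct *mm)
    {
            smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
    }

    /* Reader side: acquire, so later vm_flags reads cannot be hoisted above
     * the sequence check. */
    static inline bool vma_seq_matches(struct vm_area_struct *vma)
    {
            return READ_ONCE(vma->vm_lock_seq) ==
                   smp_load_acquire(&vma->vm_mm->mm_lock_seq);
    }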
2023-08-03 | mm: lock VMA in dup_anon_vma() before setting ->anon_vma | Jann Horn
commit d8ab9f7b644a2c9b64de405c1953c905ff219dc9 upstream. When VMAs are merged, dup_anon_vma() is called with `dst` pointing to the VMA that is being expanded to cover the area previously occupied by another VMA. This currently happens while `dst` is not write-locked. This means that, in the `src->anon_vma && !dst->anon_vma` case, as soon as the assignment `dst->anon_vma = src->anon_vma` has happened, concurrent page faults can happen on `dst` under the per-VMA lock. This is already icky in itself, since such page faults can now install pages into `dst` that are attached to an `anon_vma` that is not yet tied back to the `anon_vma` with an `anon_vma_chain`. But if `anon_vma_clone()` fails due to an out-of-memory error, things get much worse: `anon_vma_clone()` then reverts `dst->anon_vma` back to NULL, and `dst` remains completely unconnected to the `anon_vma`, even though we can have pages in the area covered by `dst` that point to the `anon_vma`. This means the `anon_vma` of such pages can be freed while the pages are still mapped into userspace, which leads to UAF when a helper like folio_lock_anon_vma_read() tries to look up the anon_vma of such a page. This theoretically is a security bug, but I believe it is really hard to actually trigger as an unprivileged user because it requires that you can make an order-0 GFP_KERNEL allocation fail, and the page allocator tries pretty hard to prevent that. I think doing the vma_start_write() call inside dup_anon_vma() is the most straightforward fix for now. For a kernel-assisted reproducer, see the notes section of the patch mail. Link: https://lkml.kernel.org/r/20230721034643.616851-1-jannh@google.com Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it") Signed-off-by: Jann Horn <jannh@google.com> Reviewed-by: Suren Baghdasaryan <surenb@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
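The shape of the fix as described (a hedged sketch; the real dup_anon_vma() handles additional cases):

    static int dup_anon_vma(struct vm_area_struct *dst,
                            struct vm_area_struct *src)
    {
            if (src->anon_vma && !dst->anon_vma) {
                    vma_start_write(dst);   /* keep per-VMA-lock faults out of dst
                                             * until the anon_vma is fully wired up */
                    dst->anon_vma = src->anon_vma;
                    return anon_vma_clone(dst, src);
            }
            return 0;
    }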
2023-08-03 | rbd: retrieve and check lock owner twice before blocklisting | Ilya Dryomov
commit 588159009d5b7a09c3e5904cffddbe4a4e170301 upstream. An attempt to acquire exclusive lock can race with the current lock owner closing the image:

1. lock is held by client123, rbd_lock() returns -EBUSY
2. get_lock_owner_info() returns client123 instance details
3. client123 closes the image, lock is released
4. find_watcher() returns 0 as there is no matching watcher anymore
5. client123 instance gets erroneously blocklisted

Particularly impacted is the mirror snapshot scheduler in snapshot-based mirroring, since it happens to open and close images a lot (images are opened only for as long as it takes to take the next mirror snapshot, the same client instance is used for all images). To reduce the potential for erroneous blocklisting, retrieve the lock owner again after find_watcher() returns 0. If it's still there, make sure it matches the previously detected lock owner. Cc: stable@vger.kernel.org # f38cb9d9c204: rbd: make get_lock_owner_info() return a single locker or NULL Cc: stable@vger.kernel.org # 8ff2c64c9765: rbd: harden get_lock_owner_info() a bit Cc: stable@vger.kernel.org Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
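Roughly, the retry/compare flow described above (illustrative pseudocode; locker_equal() and the blocklist call are placeholder names, not the actual rbd.c helpers):

    locker = get_lock_owner_info(rbd_dev);
    if (find_watcher(rbd_dev, locker) == 0) {
            refreshed = get_lock_owner_info(rbd_dev);
            if (!refreshed || !locker_equal(locker, refreshed))
                    goto again;                    /* owner changed or released: don't blocklist */
            blocklist_locker(rbd_dev, refreshed);  /* placeholder for the blocklist step */
    }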
2023-08-03 | rbd: harden get_lock_owner_info() a bit | Ilya Dryomov
commit 8ff2c64c9765446c3cef804fb99da04916603e27 upstream.

- we want the exclusive lock type, so test for it directly
- use sscanf() to actually parse the lock cookie and avoid admitting invalid handles
- bail if locker has a blank address

Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03 | rbd: make get_lock_owner_info() return a single locker or NULL | Ilya Dryomov
commit f38cb9d9c2045dad16eead4a2e1aedfddd94603b upstream. Make the "num_lockers can be only 0 or 1" assumption explicit and simplify the API by getting rid of output parameters in preparation for calling get_lock_owner_info() twice before blocklisting. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Dongsheng Yang <dongsheng.yang@easystack.cn> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03 | dm cache policy smq: ensure IO doesn't prevent cleaner policy progress | Joe Thornber
commit 1e4ab7b4c881cf26c1c72b3f56519e03475486fb upstream. When using the cleaner policy to decommission the cache, there is never any writeback started from the cache, as it is constantly delayed due to normal I/O keeping the device busy, meaning @idle=false was always being passed to clean_target_met(). Fix this by adding a specific 'cleaner' flag that is set when the cleaner policy is configured. This flag serves to always allow the cleaner's writeback work to be queued until the cache is decommissioned (even if the cache isn't idle). Reported-by: David Jeffery <djeffery@redhat.com> Fixes: b29d4986d0da ("dm cache: significant rework to leverage dm-bio-prison-v2") Cc: stable@vger.kernel.org Signed-off-by: Joe Thornber <ejt@redhat.com> Signed-off-by: Mike Snitzer <snitzer@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
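Conceptually the flag short-circuits the idle check (a sketch with a made-up helper name, not the exact dm-cache-policy-smq.c change):

    /* With the cleaner policy configured, behave as if the cache were idle
     * so delayed writeback can still drain it under normal I/O load. */
    static bool writeback_may_proceed(bool idle, bool cleaner_policy)
    {
            return idle || cleaner_policy;
    }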
2023-08-03 | drm/i915/dpt: Use shmem for dpt objects | Radhakrishna Sripada
commit 3844ed5e78823eebb5f0f1edefc403310693d402 upstream. DPT objects that are created from internal memory get evicted when there is memory pressure and do not get restored when pinned during scanout. The pinned page table entries look corrupted, and programming the display engine with the incorrect PTEs results in the DE throwing pipe faults. Create DPT objects from shmem and mark the object as dirty when pinning, so that the object is restored when the shrinker evicts an unpinned buffer object. v2: Unconditionally mark the dpt objects dirty during pinning (Chris). Fixes: 0dc987b699ce ("drm/i915/display: Add smem fallback allocation for dpt") Cc: <stable@vger.kernel.org> # v6.0+ Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Suggested-by: Chris Wilson <chris.p.wilson@intel.com> Signed-off-by: Fei Yang <fei.yang@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230718225118.2562132-1-radhakrishna.sripada@intel.com (cherry picked from commit e91a777a6e602ba0e3366e053e4e094a334a1244) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03 | ceph: never send metrics if disable_send_metrics is set | Xiubo Li
commit 50164507f6b7b7ed85d8c3ac0266849fbd908db7 upstream. Even when 'disable_send_metrics' is true, opening a session would still trigger sending the metrics for the first time. Cc: stable@vger.kernel.org Signed-off-by: Xiubo Li <xiubli@redhat.com> Reviewed-by: Venky Shankar <vshankar@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-03 | thermal: of: fix double-free on unregistration | Ahmad Fatoum
commit ac4436a5b20e0ef1f608a9ef46c08d5d142f8da6 upstream. Since commit 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone parameters structure"), thermal_zone_device_register() allocates a copy of the tzp argument and frees it when unregistering, so thermal_of_zone_register() now ends up leaking its original tzp and double-freeing the tzp copy. Fix this by locating tzp on stack instead. Fixes: 3d439b1a2ad3 ("thermal/core: Alloc-copy-free the thermal zone parameters structure") Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org> Cc: 6.4+ <stable@vger.kernel.org> # 6.4+: 8bcbb18c61d6: thermal: core: constify params in thermal_zone_device_register Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>