path: root/arch
Age | Commit message | Author
2025-06-04 | arm64: dts: qcom: sm8550: Add missing properties for cryptobam | Stephan Gerhold
commit 663cd2cad36da23cf1a3db7868fce9f1a19b2d61 upstream. num-channels and qcom,num-ees are required for BAM nodes without clock, because the driver cannot ensure the hardware is powered on when trying to obtain the information from the hardware registers. Specifying the node without these properties is unsafe and has caused early boot crashes for other SoCs before [1, 2]. Add the missing information from the hardware registers to ensure the driver can probe successfully without causing crashes. [1]: https://lore.kernel.org/r/CY01EKQVWE36.B9X5TDXAREPF@fairphone.com/ [2]: https://lore.kernel.org/r/20230626145959.646747-1-krzysztof.kozlowski@linaro.org/ Cc: stable@vger.kernel.org Fixes: 433477c3bf0b ("arm64: dts: qcom: sm8550: add QCrypto nodes") Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org> Link: https://lore.kernel.org/r/20250212-bam-dma-fixes-v1-3-f560889e65d8@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-04 | arm64: dts: qcom: sm8450: Add missing properties for cryptobam | Stephan Gerhold
commit 0fe6357229cb15a64b6413c62f1c3d4de68ce55f upstream. num-channels and qcom,num-ees are required for BAM nodes without clock, because the driver cannot ensure the hardware is powered on when trying to obtain the information from the hardware registers. Specifying the node without these properties is unsafe and has caused early boot crashes for other SoCs before [1, 2]. Add the missing information from the hardware registers to ensure the driver can probe successfully without causing crashes. [1]: https://lore.kernel.org/r/CY01EKQVWE36.B9X5TDXAREPF@fairphone.com/ [2]: https://lore.kernel.org/r/20230626145959.646747-1-krzysztof.kozlowski@linaro.org/ Cc: stable@vger.kernel.org Fixes: b92b0d2f7582 ("arm64: dts: qcom: sm8450: add crypto nodes") Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org> Link: https://lore.kernel.org/r/20250212-bam-dma-fixes-v1-2-f560889e65d8@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-04 | arm64: dts: qcom: sm8350: Fix typo in pil_camera_mem node | Alok Tiwari
commit 295217420a44403a33c30f99d8337fe7b07eb02b upstream. There is a typo in sm8350.dts where the node label mmeory@85200000 should be memory@85200000. This patch corrects the typo for clarity and consistency. Fixes: b7e8f433a673 ("arm64: dts: qcom: Add basic devicetree support for SM8350 SoC") Cc: stable@vger.kernel.org Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Link: https://lore.kernel.org/r/20250514114656.2307828-1-alok.a.tiwari@oracle.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-04 | arm64: dts: qcom: sa8775p: Remove cdsp compute-cb@10 | Karthik Sanagavarapu
commit d180c2bd3b43d55f30c9b99de68bc6bb8420d1c1 upstream. Remove the context bank compute-cb@10 because these SMMU ids are S2-only which is not used for S1 transaction. Fixes: f7b01bfb4b47 ("arm64: qcom: sa8775p: Add ADSP and CDSP0 fastrpc nodes") Cc: stable@kernel.org Signed-off-by: Karthik Sanagavarapu <quic_kartsana@quicinc.com> Signed-off-by: Ling Xu <quic_lxu5@quicinc.com> Link: https://lore.kernel.org/r/4c9de858fda7848b77ea8c528c9b9d53600ad21a.1739260973.git.quic_lxu5@quicinc.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-04 | arm64: dts: qcom: sa8775p: Remove extra entries from the iommus property | Ling Xu
commit eb73f500548a3205741330cbd7d0e209a7a6a9af upstream. Some entries evaluate to the same value when computing SID & ~MASK. Remove these extra entries from the iommus property for sa8775p to simplify it. Fixes: f7b01bfb4b47 ("arm64: qcom: sa8775p: Add ADSP and CDSP0 fastrpc nodes") Cc: stable@kernel.org Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Ling Xu <quic_lxu5@quicinc.com> Link: https://lore.kernel.org/r/49f463415c8fa2b08fbc2317e31493362056f403.1739260973.git.quic_lxu5@quicinc.com Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-04 | arm64: dts: qcom: ipq9574: Add missing properties for cryptobam | Stephan Gerhold
commit b4cd966edb2deb5c75fe356191422e127445b830 upstream. num-channels and qcom,num-ees are required for BAM nodes without clock, because the driver cannot ensure the hardware is powered on when trying to obtain the information from the hardware registers. Specifying the node without these properties is unsafe and has caused early boot crashes for other SoCs before [1, 2]. Add the missing information from the hardware registers to ensure the driver can probe successfully without causing crashes. [1]: https://lore.kernel.org/r/CY01EKQVWE36.B9X5TDXAREPF@fairphone.com/ [2]: https://lore.kernel.org/r/20230626145959.646747-1-krzysztof.kozlowski@linaro.org/ Cc: stable@vger.kernel.org Tested-by: Md Sadre Alam <quic_mdalam@quicinc.com> Fixes: ffadc79ed99f ("arm64: dts: qcom: ipq9574: Enable crypto nodes") Signed-off-by: Stephan Gerhold <stephan.gerhold@linaro.org> Link: https://lore.kernel.org/r/20250212-bam-dma-fixes-v1-6-f560889e65d8@linaro.org Signed-off-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-29 | x86/mm/init: Handle the special case of device private pages in add_pages(), to not increase max_pfn and trigger dma_addressing_limited() bounce buffers | Balbir Singh
commit 7170130e4c72ce0caa0cb42a1627c635cc262821 upstream. As Bert Karwatzki reported, the following recent commit causes a performance regression on AMD iGPU and dGPU systems: 7ffb791423c7 ("x86/kaslr: Reduce KASLR entropy on most x86 systems") It exposed a bug with nokaslr and zone device interaction. The root cause of the bug is that the GPU driver registers a zone device private memory region. When KASLR is disabled or the above commit is applied, direct_map_physmem_end is set much higher than 10 TiB, typically to the 64 TiB address. When zone device private memory is added to the system via add_pages(), it bumps up max_pfn to the same value. This causes dma_addressing_limited() to return true, since the device cannot address memory all the way up to max_pfn. This caused a regression for games played on the iGPU, as it resulted in the DMA32 zone being used for GPU allocations. Fix this by not bumping up max_pfn on x86 systems when pgmap is passed into add_pages(). The presence of pgmap is used to determine if device private memory is being added via add_pages(). More details: devm_request_mem_region() and request_free_mem_region() request device private memory. iomem_resource is passed as the base resource with start and end parameters. iomem_resource's end depends on several factors, including the platform and virtualization. On x86, for example on bare metal, this value is set to boot_cpu_data.x86_phys_bits. boot_cpu_data.x86_phys_bits can change depending on support for MKTME. By default it is set to the same as log2(direct_map_physmem_end), which is 46 to 52 bits depending on the number of levels in the page table. The allocation routines used iomem_resource's end and direct_map_physmem_end to figure out where to allocate the region. [ arch/powerpc is also impacted by this problem, but this patch does not fix the issue for PowerPC. ] Testing: 1. Tested on a virtual machine with test_hmm for zone device insertion 2. A previous version of this patch was tested by Bert, please see: https://lore.kernel.org/lkml/d87680bab997fdc9fb4e638983132af235d9a03a.camel@web.de/ [ mingo: Clarified the comments and the changelog. ] Reported-by: Bert Karwatzki <spasswolf@web.de> Tested-by: Bert Karwatzki <spasswolf@web.de> Fixes: 7ffb791423c7 ("x86/kaslr: Reduce KASLR entropy on most x86 systems") Signed-off-by: Balbir Singh <balbirs@nvidia.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Christoph Hellwig <hch@lst.de> Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Link: https://lore.kernel.org/r/20250401000752.249348-1-balbirs@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-29 | Fix mis-uses of 'cc-option' for warning disablement | Linus Torvalds
commit a79be02bba5c31f967885c7f3bf3a756d77d11d9 upstream. This was triggered by one of my mis-uses causing odd build warnings on sparc in linux-next, but while figuring out why the "obviously correct" use of cc-option caused such odd breakage, I found eight other cases of the same thing in the tree. The root cause is that 'cc-option' doesn't work for checking negative warning options (ie things like '-Wno-stringop-overflow') because gcc will silently accept options it doesn't recognize, and so 'cc-option' ends up thinking they are perfectly fine. And it all works, until you have a situation where _another_ warning is emitted. At that point the compiler will go "Hmm, maybe the user intended to disable this warning but used that wrong option that I didn't recognize", and generate a warning for the unrecognized negative option. Which explains why we have several cases of this in the tree: the 'cc-option' test really doesn't work for this situation, but most of the time it simply doesn't matter that it doesn't work. The reason my recently added case caused problems on sparc was pointed out by Thomas Weißschuh: the sparc build had a previous explicit warning that then triggered the new one. I think the best fix for this would be to make 'cc-option' a bit smarter about this situation, possibly by adding an intentional warning to the test case that then triggers the unrecognized option warning reliably. But the short-term fix is to replace 'cc-option' with an existing helper designed for this exact case: 'cc-disable-warning', which picks the negative warning but uses the positive form for testing the compiler support. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/all/20250422204718.0b4e3f81@canb.auug.org.au/ Explained-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-29 | Revert "arm64: dts: allwinner: h6: Use RSB for AXP805 PMIC connection" | Jernej Skrabec
[ Upstream commit 573f99c7585f597630f14596550c79e73ffaeef4 ] This reverts commit 531fdbeedeb89bd32018a35c6e137765c9cc9e97. Hardware that uses I2C wasn't designed with high speeds in mind, so communication with PMIC via RSB can intermittently fail. Go back to I2C as higher speed and efficiency isn't worth the trouble. Fixes: 531fdbeedeb8 ("arm64: dts: allwinner: h6: Use RSB for AXP805 PMIC connection") Link: https://github.com/LibreELEC/LibreELEC.tv/issues/7731 Signed-off-by: Jernej Skrabec <jernej.skrabec@gmail.com> Link: https://patch.msgid.link/20250413135848.67283-1-jernej.skrabec@gmail.com Signed-off-by: Chen-Yu Tsai <wens@csie.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | arm64: dts: marvell: uDPU: define pinctrl state for alarm LEDs | Gabor Juhos
commit b04f0d89e880bc2cca6a5c73cf287082c91878da upstream. The two alarm LEDs on the uDPU board stopped working since commit 78efa53e715e ("leds: Init leds class earlier"). The LEDs are driven by the GPIO{15,16} pins of the North Bridge GPIO controller. These pins are part of the 'spi_quad' pin group for which the 'spi' function is selected via the default pinctrl state of the 'spi' node. This is wrong however, since in order to allow controlling the LEDs, the pins should use the 'gpio' function. Before the commit mentioned above, the 'spi' function was selected first by the pinctrl core before probing the spi driver, but then it got overridden to 'gpio' implicitly via the devm_gpiod_get_index_optional() call from the 'leds-gpio' driver. After the commit, the LED subsystem gets initialized before the SPI subsystem, so the function of the pin group remains 'spi', which in turn prevents controlling the LEDs. Despite the change of the initialization order, the root cause is that the pinctrl state definition has been wrong since its initial commit 0d45062cfc89 ("arm64: dts: marvell: Add device tree for uDPU board"). To fix the problem, override the function in the 'spi_quad_pins' node to 'gpio' and move the pinctrl state definition from the 'spi' node into the 'leds' node. Cc: stable@vger.kernel.org # needs adjustment for < 6.1 Fixes: 0d45062cfc89 ("arm64: dts: marvell: Add device tree for uDPU board") Signed-off-by: Gabor Juhos <j4g8y7@gmail.com> Signed-off-by: Imre Kaloz <kaloz@openwrt.org> Signed-off-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-29 | perf/x86/intel: Fix segfault with PEBS-via-PT with sample_freq | Adrian Hunter
[ Upstream commit 99bcd91fabada0dbb1d5f0de44532d8008db93c6 ] Currently, using PEBS-via-PT with a sample frequency instead of a sample period, causes a segfault. For example: BUG: kernel NULL pointer dereference, address: 0000000000000195 <NMI> ? __die_body.cold+0x19/0x27 ? page_fault_oops+0xca/0x290 ? exc_page_fault+0x7e/0x1b0 ? asm_exc_page_fault+0x26/0x30 ? intel_pmu_pebs_event_update_no_drain+0x40/0x60 ? intel_pmu_pebs_event_update_no_drain+0x32/0x60 intel_pmu_drain_pebs_icl+0x333/0x350 handle_pmi_common+0x272/0x3c0 intel_pmu_handle_irq+0x10a/0x2e0 perf_event_nmi_handler+0x2a/0x50 That happens because intel_pmu_pebs_event_update_no_drain() assumes all the pebs_enabled bits represent counter indexes, which is not always the case. In this particular case, bits 60 and 61 are set for PEBS-via-PT purposes. The behaviour of PEBS-via-PT with sample frequency is questionable because although a PMI is generated (PEBS_PMI_AFTER_EACH_RECORD), the period is not adjusted anyway. Putting that aside, fix intel_pmu_pebs_event_update_no_drain() by passing the mask of counter bits instead of 'size'. Note, prior to the Fixes commit, 'size' would be limited to the maximum counter index, so the issue was not hit. Fixes: 722e42e45c2f1 ("perf/x86: Support counter mask") Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Ian Rogers <irogers@google.com> Cc: linux-perf-users@vger.kernel.org Link: https://lore.kernel.org/r/20250508134452.73960-1-adrian.hunter@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | x86/sev: Fix operator precedence in GHCB_MSR_VMPL_REQ_LEVEL macro | Seongman Lee
[ Upstream commit f7387eff4bad33d12719c66c43541c095556ae4e ] The GHCB_MSR_VMPL_REQ_LEVEL macro lacked parentheses around the bitmask expression, causing the shift operation to bind too early. As a result, when requesting VMPL1 (e.g., GHCB_MSR_VMPL_REQ_LEVEL(1)), incorrect values such as 0x000000016 were generated instead of the intended 0x100000016 (the requested VMPL level is specified in GHCBData[39:32]). Fix the precedence issue by grouping the masked value before applying the shift. [ bp: Massage commit message. ] Fixes: 34ff65901735 ("x86/sev: Use kernel provided SVSM Calling Areas") Signed-off-by: Seongman Lee <augustus92@kaist.ac.kr> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250511092329.12680-1-cloudlee1719@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | x86/Kconfig: make CFI_AUTO_DEFAULT depend on !RUST or Rust >= 1.88 | Paweł Anikiel
[ Upstream commit 5595c31c370957aabe739ac3996aedba8267603f ] Calling core::fmt::write() from rust code while FineIBT is enabled results in a kernel panic: [ 4614.199779] kernel BUG at arch/x86/kernel/cet.c:132! [ 4614.205343] Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI [ 4614.211781] CPU: 2 UID: 0 PID: 6057 Comm: dmabuf_dump Tainted: G U O 6.12.17-android16-0-g6ab38c534a43 #1 9da040f27673ec3945e23b998a0f8bd64c846599 [ 4614.227832] Tainted: [U]=USER, [O]=OOT_MODULE [ 4614.241247] RIP: 0010:do_kernel_cp_fault+0xea/0xf0 ... [ 4614.398144] RIP: 0010:_RNvXs5_NtNtNtCs3o2tGsuHyou_4core3fmt3num3impyNtB9_7Display3fmt+0x0/0x20 [ 4614.407792] Code: 48 f7 df 48 0f 48 f9 48 89 f2 89 c6 5d e9 18 fd ff ff 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 41 81 ea 14 61 af 2c 74 03 0f 0b 90 <66> 0f 1f 00 55 48 89 e5 48 89 f2 48 8b 3f be 01 00 00 00 5d e9 e7 [ 4614.428775] RSP: 0018:ffffb95acfa4ba68 EFLAGS: 00010246 [ 4614.434609] RAX: 0000000000000000 RBX: 0000000000000010 RCX: 0000000000000000 [ 4614.442587] RDX: 0000000000000007 RSI: ffffb95acfa4ba70 RDI: ffffb95acfa4bc88 [ 4614.450557] RBP: ffffb95acfa4bae0 R08: ffff0a00ffffff05 R09: 0000000000000070 [ 4614.458527] R10: 0000000000000000 R11: ffffffffab67eaf0 R12: ffffb95acfa4bcc8 [ 4614.466493] R13: ffffffffac5d50f0 R14: 0000000000000000 R15: 0000000000000000 [ 4614.474473] ? __cfi__RNvXs5_NtNtNtCs3o2tGsuHyou_4core3fmt3num3impyNtB9_7Display3fmt+0x10/0x10 [ 4614.484118] ? _RNvNtCs3o2tGsuHyou_4core3fmt5write+0x1d2/0x250 This happens because core::fmt::write() calls core::fmt::rt::Argument::fmt(), which currently has CFI disabled: library/core/src/fmt/rt.rs: 171 // FIXME: Transmuting formatter in new and indirectly branching to/calling 172 // it here is an explicit CFI violation. 173 #[allow(inline_no_sanitize)] 174 #[no_sanitize(cfi, kcfi)] 175 #[inline] 176 pub(super) unsafe fn fmt(&self, f: &mut Formatter<'_>) -> Result { This causes a Control Protection exception, because FineIBT has sealed off the original function's endbr64. 
This makes rust currently incompatible with FineIBT. Add a Kconfig dependency that prevents FineIBT from getting turned on by default if rust is enabled. [ Rust 1.88.0 (scheduled for 2025-06-26) should have this fixed [1], and thus we relaxed the condition with Rust >= 1.88. When `objtool` lands checking for this with e.g. [2], the plan is to ideally run that in upstream Rust's CI to prevent regressions early [3], since we do not control `core`'s source code. Alice tested the Rust PR backported to an older compiler. Peter would like that Rust provides a stable `core` which can be pulled into the kernel: "Relying on that much out of tree code is 'unfortunate'". - Miguel ] Signed-off-by: Paweł Anikiel <panikiel@google.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Acked-by: Peter Zijlstra <peterz@infradead.org> Link: https://github.com/rust-lang/rust/pull/139632 [1] Link: https://lore.kernel.org/rust-for-linux/20250410154556.GB9003@noisy.programming.kicks-ass.net/ [2] Link: https://github.com/rust-lang/rust/pull/139632#issuecomment-2801950873 [3] Link: https://lore.kernel.org/r/20250410115420.366349-1-panikiel@google.com Link: https://lore.kernel.org/r/att0-CANiq72kjDM0cKALVy4POEzhfdT4nO7tqz0Pm7xM+3=_0+L1t=A@mail.gmail.com [ Reduced splat. - Miguel ] Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | book3s64/radix: Fix compile errors when CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP=n | Ritesh Harjani (IBM)
[ Upstream commit 29bdc1f1c1df80868fb35bc69d1f073183adc6de ] Fix compile errors when CONFIG_ARCH_WANT_OPTIMIZE_DAX_VMEMMAP=n Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/8231763344223c193e3452eab0ae8ea966aff466.1741609795.git.donettom@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | perf/amd/ibs: Fix ->config to sample period calculation for OP PMU | Ravi Bangoria
[ Upstream commit 598bdf4fefff5af4ce6d26d16f7b2a20808fc4cb ] Instead of using standard perf_event_attr->freq=0 and ->sample_period fields, IBS event in 'sample period mode' can also be opened by setting period value directly in perf_event_attr->config in a MaxCnt bit-field format. IBS OP MaxCnt bits are defined as: (high bits) IbsOpCtl[26:20] = IbsOpMaxCnt[26:20] (low bits) IbsOpCtl[15:0] = IbsOpMaxCnt[19:4] Perf event sample period can be derived from MaxCnt bits as: sample_period = (high bits) | ((low_bits) << 4); However, current code just masks MaxCnt bits and shifts all of them, including high bits, which is incorrect. Fix it. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/r/20250115054438.1021-4-ravi.bangoria@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | perf/amd/ibs: Fix perf_ibs_op.cnt_mask for CurCnt | Ravi Bangoria
[ Upstream commit 46dcf85566170d4528b842bf83ffc350d71771fa ] IBS Op uses two counters: MaxCnt and CurCnt. MaxCnt is programmed with the desired sample period. IBS hw generates a sample when CurCnt reaches MaxCnt. The size of these counters used to be 20 bits, but they were later extended to 27 bits. The 7 bit extension is indicated by CPUID Fn8000_001B_EAX[6 / OpCntExt]. perf_ibs->cnt_mask variable contains bit masks for MaxCnt and CurCnt. But the IBS driver does not set the upper 7 bits of CurCnt in cnt_mask even when the OpCntExt CPUID bit is set. Fix this. The IBS driver uses cnt_mask[CurCnt] bits only while disabling an event. Fortunately, CurCnt bits are not read from the MSR while re-enabling the event; instead MaxCnt is programmed with the desired period and CurCnt is set to 0. Hence, we did not see any issues so far. Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lkml.kernel.org/r/20250115054438.1021-5-ravi.bangoria@amd.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | arm64: zynqmp: add clock-output-names property in clock nodes | Naman Trivedi
[ Upstream commit 385a59e7f7fb3438466a0712cc14672c708bbd57 ] Add the clock-output-names property to clock nodes, so that the resulting clock names do not change when a clock node name is changed. Also, replace underscores with hyphens in the clock node names as per the dt-schema rule. Signed-off-by: Naman Trivedi <naman.trivedimanojbhai@amd.com> Acked-by: Senthil Nathan Thangaraj <senthilnathan.thangaraj@amd.com> Link: https://lore.kernel.org/r/20241122095712.1166883-1-naman.trivedimanojbhai@amd.com Signed-off-by: Michal Simek <michal.simek@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | arch/powerpc/perf: Check the instruction type before creating sample with perf_mem_data_src | Athira Rajeev
[ Upstream commit 2ffb26afa64261139e608bf087a0c1fe24d76d4d ] perf mem report sometimes aborts as below (in some corner cases) on powerpc: # ./perf mem report 1>out *** stack smashing detected ***: terminated Aborted (core dumped) The backtrace is as below: __pthread_kill_implementation () raise () abort () __libc_message __fortify_fail __stack_chk_fail hist_entry.lvl_snprintf __sort__hpp_entry __hist_entry__snprintf hists.fprintf cmd_report cmd_mem Snippet of code which triggers the issue from tools/perf/util/sort.c static int hist_entry__lvl_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width) { char out[64]; perf_mem__lvl_scnprintf(out, sizeof(out), he->mem_info); return repsep_snprintf(bf, size, "%-*s", width, out); } The value of "out" is filled from the perf_mem_data_src value. Debugging this further showed that for some corner cases, "data_src" held a wrong value. This resulted in a longer string, causing the stack check to fail. The perf mem data source values are captured in the sample via the isa207_get_mem_data_src function. The initial check is to fetch the type of the sampled instruction. If the type of instruction is not valid (not a load/store instruction), the function returns. Since 'commit e16fd7f2cb1a ("perf: Use sample_flags for data_src")', the data_src field is not initialized by the perf_sample_data_init() function. If the PMU driver doesn't set the data_src value to zero when the type is not valid, this will result in an uninitialized value for data_src. The uninitialized value of data_src resulted in a stack check failure followed by an abort for "perf mem report". When requesting data source information in the sample, the instruction type is expected to be a load or store instruction. In ISA v3.0, due to a hardware limitation, there are corner cases where an instruction type other than load or store is observed. In ISA v3.0 and before, values "0" and "7" are considered reserved. In ISA v3.1, value "7" has been used to indicate "larx/stcx". Drop the sample if the instruction type has reserved values for this field, with an ISA version check. Initialize data_src to zero in isa207_get_mem_data_src if the instruction type is not load/store. Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250121131621.39054-1-atrajeev@linux.vnet.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | powerpc/pseries/iommu: create DDW for devices with DMA mask less than 64-bits | Gaurav Batra
[ Upstream commit 67dfc11982f7e3c37f0977e74671da2391b29181 ] Starting with PAPR level 2.13, the platform supports placing a PHB in limited address mode. Devices that support DMA masks less than 64 bits but greater than 32 bits are placed in limited address mode. In this mode, the starting DMA address returned by the DDW is 4GB. When the device driver calls dma_supported with a mask less than 64 bits, the PowerPC IOMMU driver places the PHB in Limited Addressing Mode before creating the DDW. Signed-off-by: Gaurav Batra <gbatra@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250108164814.73250-1-gbatra@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | powerpc/pseries/iommu: memory notifier incorrectly adds TCEs for pmemory | Gaurav Batra
[ Upstream commit 6aa989ab2bd0d37540c812b4270006ff794662e7 ] iommu_mem_notifier() is invoked when RAM is dynamically added/removed. This notifier call is responsible to add/remove TCEs from the Dynamic DMA Window (DDW) when TCEs are pre-mapped. TCEs are pre-mapped only for RAM and not for persistent memory (pmemory). For DMA buffers in pmemory, TCEs are dynamically mapped when the device driver instructs to do so. The issue is 'daxctl' command is capable of adding pmemory as "System RAM" after LPAR boot. The command to do so is - daxctl reconfigure-device --mode=system-ram dax0.0 --force This will dynamically add pmemory range to LPAR RAM eventually invoking iommu_mem_notifier(). The address range of pmemory is way beyond the Max RAM that the LPAR can have. Which means, this range is beyond the DDW created for the device, at device initialization time. As a result when TCEs are pre-mapped for the pmemory range, by iommu_mem_notifier(), PHYP HCALL returns H_PARAMETER. This failed the command, daxctl, to add pmemory as RAM. The solution is to not pre-map TCEs for pmemory. Signed-off-by: Gaurav Batra <gbatra@linux.ibm.com> Tested-by: Donet Tom <donettom@linux.ibm.com> Reviewed-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250130183854.92258-1-gbatra@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | x86/traps: Cleanup and robustify decode_bug() | Peter Zijlstra
[ Upstream commit c20ad96c9a8f0aeaf4e4057730a22de2657ad0c2 ] Notably, don't attempt to decode an immediate when MOD == 3. Additionally have it return the instruction length, such that WARN like bugs can more reliably skip to the correct instruction. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lore.kernel.org/r/20250207122546.721120726@infradead.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | x86/ibt: Handle FineIBT in handle_cfi_failure() | Peter Zijlstra
[ Upstream commit 882b86fd4e0d49bf91148dbadcdbece19ded40e6 ] Sami reminded me that FineIBT failure does not hook into the regular CFI failure case, and as such CFI_PERMISSIVE does not work. Reported-by: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Link: https://lkml.kernel.org/r/20250214092619.GB21726@noisy.programming.kicks-ass.net Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | MIPS: pm-cps: Use per-CPU variables as per-CPU, not per-core | Paul Burton
[ Upstream commit 00a134fc2bb4a5f8fada58cf7ff4259149691d64 ] The pm-cps code has up until now used per-CPU variables indexed by core, rather than CPU number, in order to share data amongst sibling CPUs (ie. VPs/threads in a core). This works fine for single cluster systems, but with multi-cluster systems a core number is no longer unique in the system, leading to sharing between CPUs that are not actually siblings. Avoid this issue by using per-CPU variables as they are more generally used - ie. access them using CPU numbers rather than core numbers. Sharing between siblings is then accomplished by: - Assigning the same pointer to entries for each sibling CPU for the nc_asm_enter & ready_count variables, which allow this by virtue of being per-CPU pointers. - Indexing by the first CPU set in a CPUs cpu_sibling_map in the case of pm_barrier, for which we can't use the previous approach because the per-CPU variable is not a pointer. Signed-off-by: Paul Burton <paulburton@kernel.org> Signed-off-by: Dragan Mladjenovic <dragan.mladjenovic@syrmia.com> Signed-off-by: Aleksandar Rikalo <arikalo@gmail.com> Tested-by: Serge Semin <fancer.lancer@gmail.com> Tested-by: Gregory CLEMENT <gregory.clement@bootlin.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | x86/locking: Use ALT_OUTPUT_SP() for percpu_{,try_}cmpxchg{64,128}_op() | Uros Bizjak
[ Upstream commit 4087e16b033140cf2ce509ec23503bddec818a16 ] percpu_{,try_}cmpxchg{64,128}() macros use CALL instruction inside asm statement in one of their alternatives. Use ALT_OUTPUT_SP() macro to add required dependence on %esp register. ALT_OUTPUT_SP() implements the above dependence by adding ASM_CALL_CONSTRAINT to its arguments. This constraint should be used for any inline asm which has a CALL instruction, otherwise the compiler may schedule the asm before the frame pointer gets set up by the containing function, causing objtool to print a "call without frame pointer save/setup" warning. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250214150929.5780-1-ubizjak@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | MIPS: Use arch specific syscall name match function | Bibo Mao
[ Upstream commit 756276ce78d5624dc814f9d99f7d16c8fd51076e ] On MIPS systems, most syscall function names begin with the prefix sys_. Some syscalls, such as clone/fork, are special: their function names begin with __sys_, since scratch registers need to be saved on the stack when these system calls happen. With the ftrace system call method, system call functions are declared with SYSCALL_DEFINEx, and the metadata of the system call symbol name begins with sys_. Here the MIPS-specific function arch_syscall_match_sym_name is used to compare function names between sys_call_table[] and the metadata of the syscall symbol. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | x86/kaslr: Reduce KASLR entropy on most x86 systems | Balbir Singh
[ Upstream commit 7ffb791423c7c518269a9aad35039ef824a40adb ] When CONFIG_PCI_P2PDMA=y (which is basically enabled on all large x86 distros), it maps the PFN's via a ZONE_DEVICE mapping using devm_memremap_pages(). The mapped virtual address range corresponds to the pci_resource_start() of the BAR address and size corresponding to the BAR length. When KASLR is enabled, the direct map range of the kernel is reduced to the size of physical memory plus additional padding. If the BAR address is beyond this limit, PCI peer to peer DMA mappings fail. Fix this by not shrinking the size of the direct map when CONFIG_PCI_P2PDMA=y. This reduces the total available entropy, but it's better than the current work around of having to disable KASLR completely. [ mingo: Clarified the changelog to point out the broad impact ... ] Signed-off-by: Balbir Singh <balbirs@nvidia.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Kees Cook <kees@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # drivers/pci/Kconfig Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lore.kernel.org/lkml/20250206023201.1481957-1-balbirs@nvidia.com/ Link: https://lore.kernel.org/r/20250206234234.1912585-1-balbirs@nvidia.com -- arch/x86/mm/kaslr.c | 10 ++++++++-- drivers/pci/Kconfig | 6 ++++++ 2 files changed, 14 insertions(+), 2 deletions(-) Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | x86/nmi: Add an emergency handler in nmi_desc & use it in nmi_shootdown_cpus() | Waiman Long
[ Upstream commit fe37c699ae3eed6e02ee55fbf5cb9ceb7fcfd76c ] Depending on the type of panics, it was found that the __register_nmi_handler() function can be called in NMI context from nmi_shootdown_cpus() leading to a lockdep splat: WARNING: inconsistent lock state inconsistent {INITIAL USE} -> {IN-NMI} usage. lock(&nmi_desc[0].lock); <Interrupt> lock(&nmi_desc[0].lock); Call Trace: _raw_spin_lock_irqsave __register_nmi_handler nmi_shootdown_cpus kdump_nmi_shootdown_cpus native_machine_crash_shutdown __crash_kexec In this particular case, the following panic message was printed before: Kernel panic - not syncing: Fatal hardware error! This message seemed to be given out from __ghes_panic() running in NMI context. The __register_nmi_handler() function which takes the nmi_desc lock with irq disabled shouldn't be called from NMI context as this can lead to deadlock. The nmi_shootdown_cpus() function can only be invoked once. After the first invocation, all other CPUs should be stuck in the newly added crash_nmi_callback() and cannot respond to a second NMI. Fix it by adding a new emergency NMI handler to the nmi_desc structure and provide a new set_emergency_nmi_handler() helper to set crash_nmi_callback() in any context. The new emergency handler will preempt other handlers in the linked list. That will eliminate the need to take any lock and serve the panic in NMI use case. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Rik van Riel <riel@surriel.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/r/20250206191844.131700-1-longman@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
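The idea of an emergency handler that can be installed without a lock and that preempts the regular handler list can be sketched with C11 atomics. This is a user-space illustration under stated assumptions, not the x86 NMI code: `struct nmi_desc_sketch`, `set_emergency_handler()`, `dispatch_nmi()` and the two example handlers are hypothetical names, and a fixed array stands in for the lock-protected linked list.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef int (*nmi_handler_t)(unsigned int type);

struct nmi_desc_sketch {
	_Atomic(nmi_handler_t) emerg_handler; /* written without taking a lock */
	nmi_handler_t regular[4];             /* stand-in for the locked handler list */
	size_t nr_regular;
};

/* Safe in any context in this sketch: a single atomic pointer store. */
static void set_emergency_handler(struct nmi_desc_sketch *desc, nmi_handler_t fn)
{
	atomic_store(&desc->emerg_handler, fn);
}

/* The emergency handler, if set, preempts every regular handler. */
static int dispatch_nmi(struct nmi_desc_sketch *desc, unsigned int type)
{
	nmi_handler_t emerg = atomic_load(&desc->emerg_handler);
	int handled = 0;

	if (emerg)
		return emerg(type);

	for (size_t i = 0; i < desc->nr_regular; i++)
		handled += desc->regular[i](type);
	return handled;
}

/* Example handlers used only to exercise the dispatcher. */
static int regular_handler(unsigned int type)
{
	(void)type;
	return 1;
}

static int crash_handler(unsigned int type)
{
	(void)type;
	return 100;
}
```

Because installing the handler never touches the list or its lock, the panic path in NMI context cannot deadlock on a lock that the interrupted code already holds.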
2025-05-29 | x86/build: Fix broken copy command in genimage.sh when making isoimage | Nir Lichtman
[ Upstream commit e451630226bd09dc730eedb4e32cab1cc7155ae8 ] Problem: Currently when running the "make isoimage" command there is an error related to wrong parameters passed to the cp command: "cp: missing destination file operand after 'arch/x86/boot/isoimage/'" This is caused because FDINITRDS is an empty array. Solution: Check if FDINITRDS is empty before executing the "cp" command, similar to how it is done in the case of hdimage. Signed-off-by: Nir Lichtman <nir@lichtman.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Michal Marek <michal.lkml@markovi.net> Link: https://lore.kernel.org/r/20250110120500.GA923218@lichtman.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | ARM: at91: pm: fix at91_suspend_finish for ZQ calibration | Li Bin
[ Upstream commit bc4722c3598d0e2c2dbf9609a3d3198993093e2b ] For sama7g5 and sama7d65 backup mode, we encountered a "ZQ calibrate error" while recalibrating the impedance in BootStrap. We found that the impedance value saved in at91_suspend_finish() before the DDR entered self-refresh mode did not match the resistor values. The ZDATA field in the DDR3PHY_ZQ0CR0 register uses a modified gray code to select the different impedance settings. However, these gray codes are incorrect; a workaround from the design team fixed the bug in the calibration logic. The ZDATA field contains four independent impedance elements, but the algorithm combined the four elements into one. The elements were fixed using properly shifted offsets. Signed-off-by: Li Bin <bin.li@microchip.com> [nicolas.ferre@microchip.com: fix indentation and combine 2 patches] Signed-off-by: Nicolas Ferre <nicolas.ferre@microchip.com> Tested-by: Ryan Wanner <Ryan.Wanner@microchip.com> Tested-by: Durai Manickam KR <durai.manickamkr@microchip.com> Tested-by: Andrei Simion <andrei.simion@microchip.com> Signed-off-by: Ryan Wanner <Ryan.Wanner@microchip.com> Link: https://lore.kernel.org/r/28b33f9bcd0ca60ceba032969fe054d38f2b9577.1740671156.git.Ryan.Wanner@microchip.com Signed-off-by: Claudiu Beznea <claudiu.beznea@tuxon.dev> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | x86/bugs: Make spectre user default depend on MITIGATION_SPECTRE_V2 | Breno Leitao
[ Upstream commit 98fdaeb296f51ef08e727a7cc72e5b5c864c4f4d ] Change the default value of spectre v2 in user mode to respect the CONFIG_MITIGATION_SPECTRE_V2 config option. Currently, user mode spectre v2 is set to auto (SPECTRE_V2_USER_CMD_AUTO) by default, even if CONFIG_MITIGATION_SPECTRE_V2 is disabled. Set the spectre_v2 value to auto (SPECTRE_V2_USER_CMD_AUTO) if the Spectre v2 config (CONFIG_MITIGATION_SPECTRE_V2) is enabled, otherwise set the value to none (SPECTRE_V2_USER_CMD_NONE). Important to say the command line argument "spectre_v2_user" overwrites the default value in both cases. When CONFIG_MITIGATION_SPECTRE_V2 is not set, users have the flexibility to opt-in for specific mitigations independently. In this scenario, setting spectre_v2= will not enable spectre_v2_user=, and command line options spectre_v2_user and spectre_v2 are independent when CONFIG_MITIGATION_SPECTRE_V2=n. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: David Kaplan <David.Kaplan@amd.com> Link: https://lore.kernel.org/r/20241031-x86_bugs_last_v2-v2-2-b7ff1dab840e@debian.org Signed-off-by: Sasha Levin <sashal@kernel.org>
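The selection rule described above (default follows the config option, command line always wins) can be summarized in a small sketch. The enum and function names here are hypothetical simplifications; the real code in arch/x86 has more command values and parses the command line itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of commands for illustration only. */
enum user_cmd_sketch { CMD_NONE, CMD_AUTO, CMD_FORCE };

/* Default depends on whether the Spectre v2 mitigation is configured in. */
static enum user_cmd_sketch default_user_cmd(bool mitigation_enabled)
{
	return mitigation_enabled ? CMD_AUTO : CMD_NONE;
}

/*
 * @cmdline: value given via "spectre_v2_user=" or NULL if absent.
 * An explicit command-line value overrides the default in both cases.
 */
static enum user_cmd_sketch select_user_cmd(bool mitigation_enabled,
					    const enum user_cmd_sketch *cmdline)
{
	if (cmdline)
		return *cmdline;
	return default_user_cmd(mitigation_enabled);
}
```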
2025-05-29 | ARM: tegra: Switch DSI-B clock parent to PLLD on Tegra114 | Svyatoslav Ryhel
[ Upstream commit 2b3db788f2f614b875b257cdb079adadedc060f3 ] PLLD is usually used as parent clock for internal video devices, like DSI for example, while PLLD2 is used as parent for HDMI. Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com> Link: https://lore.kernel.org/r/20250226105615.61087-3-clamor95@gmail.com Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | powerpc/prom_init: Fixup missing #size-cells on PowerBook6,7 | Andreas Schwab
[ Upstream commit 7e67ef889c9ab7246547db73d524459f47403a77 ] Similar to the PowerMac3,1, the PowerBook6,7 is missing the #size-cells property on the i2s node. Depends-on: commit 045b14ca5c36 ("of: WARN on deprecated #address-cells/#size-cells handling") Signed-off-by: Andreas Schwab <schwab@linux-m68k.org> Acked-by: Rob Herring (Arm) <robh@kernel.org> [maddy: added "commit" word in depends-on to avoid checkpatch error] Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/875xmizl6a.fsf@igel.home Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | arm64: tegra: Resize aperture for the IGX PCIe C5 slot | Jon Hunter
[ Upstream commit 6d4bfe6d86af1ef52bdb4592c9afb2037f24f2c4 ] Some discrete graphics cards such as the NVIDIA RTX A6000 support resizable BARs. When connecting an A6000 card to the NVIDIA IGX Orin platform, resizing the BAR1 aperture to 8GB fails because the current device-tree configuration for the PCIe C5 slot cannot support this. Fix this by updating the device-tree 'reg' and 'ranges' properties for the PCIe C5 slot to support this. Signed-off-by: Jon Hunter <jonathanh@nvidia.com> Link: https://lore.kernel.org/r/20250116151903.476047-1-jonathanh@nvidia.com Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | arm64: tegra: p2597: Fix gpio for vdd-1v8-dis regulator | Diogo Ivo
[ Upstream commit f34621f31e3be81456c903287f7e4c0609829e29 ] According to the board schematics the enable pin of this regulator is connected to gpio line #9 of the first instance of the TCA9539 GPIO expander, so adjust it. Signed-off-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Link: https://lore.kernel.org/r/20250224-diogo-gpio_exp-v1-1-80fb84ac48c6@tecnico.ulisboa.pt Signed-off-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | arm64/mm: Check PUD_TYPE_TABLE in pud_bad() | Ryan Roberts
[ Upstream commit bfb1d2b9021c21891427acc86eb848ccedeb274e ] pud_bad() is currently defined in terms of pud_table(). However, for some configs pud_table() is hard-coded to true, i.e. when using 64K base pages or when page table levels are less than 3. pud_bad() is intended to check that the pud is configured correctly. Hence let's open-code the same check that the full version of pud_table() uses into pud_bad(). Then it always performs the check regardless of the config. Cc: Will Deacon <will@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250221044227.1145393-7-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
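An open-coded type check of the kind described above might look like the following sketch. The `SK_`-prefixed constants and `sk_pud_bad()` are hypothetical names; the values assume the Arm translation table descriptor layout, where the low two bits are 0b11 for a table entry and 0b01 for a block entry, and are not copied from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed descriptor encoding: bits [1:0] select the entry type. */
#define SK_PUD_TYPE_MASK  UINT64_C(0x3)
#define SK_PUD_TYPE_TABLE UINT64_C(0x3) /* valid bit + table bit */
#define SK_PUD_TYPE_SECT  UINT64_C(0x1) /* valid bit only: block mapping */

/*
 * A PUD entry is "bad" for a table walk unless its type field says
 * "table". The check is performed unconditionally, independent of any
 * build configuration.
 */
static bool sk_pud_bad(uint64_t pudval)
{
	return (pudval & SK_PUD_TYPE_MASK) != SK_PUD_TYPE_TABLE;
}
```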
2025-05-29 | arm64/mm: Check pmd_table() in pmd_trans_huge() | Ryan Roberts
[ Upstream commit d1770e909898c108e8c7d30ca039053e8818a9c9 ] Check for pmd_table() in pmd_trans_huge() rather than just checking for the PMD_TABLE_BIT. But ensure all present-invalid entries are handled correctly by always setting PTE_VALID before checking with pmd_table(). Cc: Will Deacon <will@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com> Link: https://lore.kernel.org/r/20250221044227.1145393-8-anshuman.khandual@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | hypfs_create_cpu_files(): add missing check for hypfs_mkdir() failure | Al Viro
[ Upstream commit 00cdfdcfa0806202aea56b02cedbf87ef1e75df8 ] Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | riscv: Call secondary mmu notifier when flushing the tlb | Alexandre Ghiti
[ Upstream commit d9be2b9b60497a82aeceec3a98d8b37fdd2960f2 ] This is required to allow the IOMMU driver to correctly flush its own TLB. Reviewed-by: Clément Léger <cleger@rivosinc.com> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Link: https://lore.kernel.org/r/20250113142424.30487-1-alexghiti@rivosinc.com Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | um: Update min_low_pfn to match changes in uml_reserved | Tiwei Bie
[ Upstream commit e82cf3051e6193f61e03898f8dba035199064d36 ] When uml_reserved is updated, min_low_pfn must also be updated accordingly. Otherwise, min_low_pfn will not accurately reflect the lowest available PFN. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20250221041855.1156109-1-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | um: Store full CSGSFS and SS register from mcontext | Benjamin Berg
[ Upstream commit cef721e0d53d2b64f2ba177c63a0dfdd7c0daf17 ] Doing this allows using registers as retrieved from an mcontext to be pushed to a process using PTRACE_SETREGS. It is not entirely clear to me why CSGSFS was masked. Doing so creates issues when using the mcontext as process state in seccomp and simply copying the register appears to work perfectly fine for ptrace. Signed-off-by: Benjamin Berg <benjamin@sipsolutions.net> Link: https://patch.msgid.link/20250224181827.647129-2-benjamin@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | s390/tlb: Use mm_has_pgste() instead of mm_alloc_pgste() | Heiko Carstens
[ Upstream commit 9291ea091b29bb3e37c4b3416c7c1e49e472c7d5 ] An mm has pgstes only after s390_enable_sie() has been called, while mm_alloc_pgste() may be always true (e.g. via sysctl setting). Limit the calls to gmap_unlink() in pte_free_tlb() to those cases where there might be something to unlink. Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | x86/mm: Check return value from memblock_phys_alloc_range() | Philip Redkin
[ Upstream commit 631ca8909fd5c62b9fda9edda93924311a78a9c4 ] At least with CONFIG_PHYSICAL_START=0x100000, if there is < 4 MiB of contiguous free memory available at this point, the kernel will crash and burn because memblock_phys_alloc_range() returns 0 on failure, which leads memblock_phys_free() to throw the first 4 MiB of physical memory to the wolves. At a minimum it should fail gracefully with a meaningful diagnostic, but in fact everything seems to work fine without the weird reserve allocation. Signed-off-by: Philip Redkin <me@rarity.fan> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/94b3e98f-96a7-3560-1f76-349eb95ccf7f@rarity.fan Signed-off-by: Sasha Levin <sashal@kernel.org>
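The failure mode described above, where a 0 return value is passed on to the free routine, can be modeled with mock functions. Everything in this sketch is hypothetical (`mock_phys_alloc()`, `mock_phys_free()`, `reserve_then_release()`); it only illustrates checking the return value before freeing and does not reproduce the memblock API or the actual fix.

```c
#include <stdbool.h>
#include <stdint.h>

static unsigned int free_calls; /* counts calls to the mock free routine */
static uint64_t last_freed;     /* address passed to the last free call */

/* Mock allocator: returns @base on success, 0 on failure. */
static uint64_t mock_phys_alloc(uint64_t size, uint64_t available, uint64_t base)
{
	return size <= available ? base : 0;
}

/* Mock free routine: records what it was asked to release. */
static void mock_phys_free(uint64_t addr, uint64_t size)
{
	(void)size;
	free_calls++;
	last_freed = addr;
}

/*
 * Allocate a temporary range and release it again. Without the check,
 * a failed allocation would "free" @size bytes starting at address 0.
 */
static bool reserve_then_release(uint64_t size, uint64_t available, uint64_t base)
{
	uint64_t addr = mock_phys_alloc(size, available, base);

	if (!addr)
		return false; /* fail gracefully, never free address 0 */

	mock_phys_free(addr, size);
	return true;
}
```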
2025-05-29 | x86/microcode: Update the Intel processor flag scan check | Sohil Mehta
[ Upstream commit 7e6b0a2e4152f4046af95eeb46f8b4f9b2a7398d ] The Family model check to read the processor flag MSR is misleading and potentially incorrect. It doesn't consider Family while comparing the model number. The original check did have a Family number but it got lost/moved during refactoring. intel_collect_cpu_info() is called through multiple paths such as early initialization, CPU hotplug as well as IFS image load. Some of these flows would be error prone due to the ambiguous check. Correct the processor flag scan check to use a Family number and update it to a VFM based one to make it more readable. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Link: https://lore.kernel.org/r/20250219184133.816753-4-sohil.mehta@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | x86/smpboot: Fix INIT delay assignment for extended Intel Families | Sohil Mehta
[ Upstream commit 7a2ad752746bfb13e89a83984ecc52a48bae4969 ] Some old crusty CPUs need an extra delay that slows down booting. See the comment above 'init_udelay' for details. Newer CPUs don't need the delay. Right now, for Intel, Family 6 and only Family 6 skips the delay. That leaves out both the Family 15 (Pentium 4s) and brand new Family 18/19 models. The omission of Family 15 (Pentium 4s) seems like an oversight and 18/19 do not need the delay. Skip the delay on all Intel processors Family 6 and beyond. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250219184133.816753-11-sohil.mehta@intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
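The change in the family check can be shown as two small predicates. This is a simplified sketch: the function names are hypothetical, and the vendor check and the actual 'init_udelay' values are left out.

```c
#include <stdbool.h>

/* Before: only Family 6 skips the delay, so Family 15/18/19 still wait. */
static bool intel_needs_init_delay_old(unsigned int family)
{
	return family != 6;
}

/* After: every Intel family from 6 upwards skips the delay. */
static bool intel_needs_init_delay_new(unsigned int family)
{
	return family < 6;
}
```

With the second predicate, Family 15 (Pentium 4) and Family 18/19 are treated like Family 6, while families below 6 keep the delay.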
2025-05-29 | x86/stackprotector/64: Only export __ref_stack_chk_guard on CONFIG_SMP | Ingo Molnar
[ Upstream commit 91d5451d97ce35cbd510277fa3b7abf9caa4e34d ] The __ref_stack_chk_guard symbol doesn't exist on UP: <stdin>:4:15: error: ‘__ref_stack_chk_guard’ undeclared here (not in a function) Fix the #ifdef around the entry.S export. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20250123190747.745588-8-brgerst@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | x86/headers: Replace __ASSEMBLY__ with __ASSEMBLER__ in UAPI headers | Thomas Huth
[ Upstream commit 8a141be3233af7d4f7014ebc44d5452d46b2b1be ] __ASSEMBLY__ is only defined by the Makefile of the kernel, so this is not really useful for UAPI headers (unless the userspace Makefile defines it, too). Let's switch to __ASSEMBLER__ which gets set automatically by the compiler when compiling assembly code. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Brian Gerst <brgerst@gmail.com> Link: https://lore.kernel.org/r/20250310104256.123527-1-thuth@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
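A header guarded with `__ASSEMBLER__` might be laid out as in the sketch below. The `SKETCH_`/`sketch_` names are hypothetical; the only fact relied on is that GCC and Clang predefine `__ASSEMBLER__` when preprocessing assembly sources, so no build-system definition is needed.

```c
/* Plain constants are usable from both C and assembly sources. */
#define SKETCH_FLAG_A 0x1

#ifndef __ASSEMBLER__
/*
 * C-only declarations: hidden automatically when the compiler
 * preprocesses a .S file, without any -D flag from a Makefile.
 */
struct sketch_regs {
	unsigned long ip;
	unsigned long sp;
};

static inline unsigned long sketch_ip(const struct sketch_regs *regs)
{
	return regs->ip;
}
#endif /* !__ASSEMBLER__ */
```

A userspace project that includes such a header from assembly code gets the constants without tripping over the C declarations, even though it never defines `__ASSEMBLY__`.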
2025-05-29 | riscv: Allow NOMMU kernels to access all of RAM | Samuel Holland
[ Upstream commit 2c0391b29b27f315c1b4c29ffde66f50b29fab99 ] NOMMU kernels currently cannot access memory below the kernel link address. Remove this restriction by setting PAGE_OFFSET to the actual start of RAM, as determined from the devicetree. The kernel link address must be a constant, so keep using CONFIG_PAGE_OFFSET for that purpose. Signed-off-by: Samuel Holland <samuel.holland@sifive.com> Reviewed-by: Jesse Taube <mr.bossman075@gmail.com> Link: https://lore.kernel.org/r/20241026171441.3047904-3-samuel.holland@sifive.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | arm64: Add support for HIP09 Spectre-BHB mitigation | Jinqian Yang
[ Upstream commit e18c09b204e81702ea63b9f1a81ab003b72e3174 ] The HIP09 processor is vulnerable to the Spectre-BHB (Branch History Buffer) attack, which can be exploited to leak information through branch prediction side channels. This commit adds the MIDR of HIP09 to the list for software mitigation. Signed-off-by: Jinqian Yang <yangjinqian1@huawei.com> Link: https://lore.kernel.org/r/20250325141900.2057314-1-yangjinqian1@huawei.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-29 | x86/fred: Fix system hang during S4 resume with FRED enabled | Xin Li (Intel)
[ Upstream commit e5f1e8af9c9e151ecd665f6d2e36fb25fec3b110 ] Upon a wakeup from S4, the restore kernel starts and initializes the FRED MSRs as needed from its perspective. It then loads a hibernation image, including the image kernel, and attempts to load image pages directly into their original page frames used before hibernation unless those frames are currently in use. Once all pages are moved to their original locations, it jumps to a "trampoline" page in the image kernel. At this point, the image kernel takes control, but the FRED MSRs still contain values set by the restore kernel, which may differ from those set by the image kernel before hibernation. Therefore, the image kernel must ensure the FRED MSRs have the same values as before hibernation. Since these values depend only on the location of the kernel text and data, they can be recomputed from scratch. Reported-by: Xi Pardee <xi.pardee@intel.com> Reported-by: Todd Brandt <todd.e.brandt@intel.com> Tested-by: Todd Brandt <todd.e.brandt@intel.com> Suggested-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250401075728.3626147-1-xin@zytor.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2025-05-22 | arm64: dts: rockchip: Remove overdrive-mode OPPs from RK3588J SoC dtsi | Dragan Simic
commit e0bd7ecf6b2dc71215af699dffbf14bf0bc3d978 upstream. The differences in the vendor-approved CPU and GPU OPPs for the standard Rockchip RK3588 variant [1] and the industrial Rockchip RK3588J variant [2] come from the latter, presumably, supporting an extended temperature range that's usually associated with industrial applications, despite the two SoC variant datasheets specifying the same upper limit for the allowed ambient temperature for both variants. However, the lower temperature limit is specified much lower for the RK3588J variant. [1][2] To be on the safe side and to ensure maximum longevity of the RK3588J SoCs, only the CPU and GPU OPPs that are declared by the vendor to be always safe for this SoC variant may be provided. As explained by the vendor [3] and according to the RK3588J datasheet, [2] higher-frequency/higher-voltage CPU and GPU OPPs can be used as well, but at the risk of reducing the SoC lifetime expectancy. Presumably, using the higher OPPs may be safe only when not enjoying the assumed extended temperature range that the RK3588J, as an SoC variant targeted specifically at higher-temperature, industrial applications, is made (or binned) for. Anyone able to keep their RK3588J-based board outside the above-presumed extended temperature range at all times, and willing to take the associated risk of possibly reducing the SoC lifetime expectancy, is free to apply a DT overlay that adds the higher CPU and GPU OPPs. With all this and the downstream RK3588(J) DT definitions [4][5] in mind, let's delete the RK3588J CPU and GPU OPPs that are not considered belonging to the normal operation mode for this SoC variant. 
To quote the RK3588J datasheet [2], "normal mode means the chipset works under safety voltage and frequency; for the industrial environment, highly recommend to keep in normal mode, the lifetime is reasonably guaranteed", while "overdrive mode brings higher frequency, and the voltage will increase accordingly; under the overdrive mode for a long time, the chipset may shorten the lifetime, especially in high-temperature condition". To sum the RK3588J datasheet [2] and the vendor-provided DTs up, [4][5] the maximum allowed CPU core, GPU and NPU frequencies are as follows:

IP core     | Normal mode | Overdrive mode
------------+-------------+----------------
Cortex-A55  | 1,296 MHz   | 1,704 MHz
Cortex-A76  | 1,608 MHz   | 2,016 MHz
GPU         | 700 MHz     | 850 MHz
NPU         | 800 MHz     | 950 MHz

Unfortunately, when it comes to the actual voltages for the RK3588J CPU and GPU OPPs, there's a discrepancy between the RK3588J datasheet [2] and the downstream kernel code. [4][5] The RK3588J datasheet states that "the max. working voltage of CPU/GPU/NPU is 0.75 V under the normal mode", while the downstream kernel code actually allows voltage ranges that go up to 0.95 V, which is still within the voltage range allowed by the datasheet. However, the RK3588J datasheet also tells us to "strictly refer to the software configuration of SDK and the hardware reference design", so let's embrace the voltage ranges provided by the downstream kernel code, which also prevents the undesirable theoretical outcome of ending up with no usable OPPs on a particular board, as a result of the board's voltage regulator(s) being unable to deliver the exact voltages, for whatever reason. The above-described voltage ranges for the RK3588J CPU OPPs remain taken from the downstream kernel code [4][5] by picking the highest, worst-bin values, which ensure that all RK3588J bins will work reliably. 
Yes, with some power inevitably wasted as unnecessarily generated heat, but the reliability is paramount, together with the longevity. This deficiency may be revisited separately at some point in the future. The provided RK3588J CPU OPPs follow the slightly debatable "provide only the highest-frequency OPP from the same-voltage group" approach that's been established earlier, [6] as a result of the "same-voltage, lower-frequency" OPPs being considered inefficient from the IPA governor's standpoint, which may also be revisited separately at some point in the future. [1] https://wiki.friendlyelec.com/wiki/images/e/ee/Rockchip_RK3588_Datasheet_V1.6-20231016.pdf [2] https://wmsc.lcsc.com/wmsc/upload/file/pdf/v2/lcsc/2403201054_Rockchip-RK3588J_C22364189.pdf [3] https://lore.kernel.org/linux-rockchip/e55125ed-64fb-455e-b1e4-cebe2cf006e4@cherry.de/T/#u [4] https://raw.githubusercontent.com/rockchip-linux/kernel/604cec4004abe5a96c734f2fab7b74809d2d742f/arch/arm64/boot/dts/rockchip/rk3588s.dtsi [5] https://raw.githubusercontent.com/rockchip-linux/kernel/604cec4004abe5a96c734f2fab7b74809d2d742f/arch/arm64/boot/dts/rockchip/rk3588j.dtsi [6] https://lore.kernel.org/all/20240229-rk-dts-additions-v3-5-6afe8473a631@gmail.com/ Fixes: 667885a68658 ("arm64: dts: rockchip: Add OPP data for CPU cores on RK3588j") Fixes: a7b2070505a2 ("arm64: dts: rockchip: Split GPU OPPs of RK3588 and RK3588j") Cc: stable@vger.kernel.org Cc: Heiko Stuebner <heiko@sntech.de> Cc: Alexey Charkov <alchark@gmail.com> Helped-by: Quentin Schulz <quentin.schulz@cherry.de> Reviewed-by: Quentin Schulz <quentin.schulz@cherry.de> Signed-off-by: Dragan Simic <dsimic@manjaro.org> Link: https://lore.kernel.org/r/eeec0d30d79b019d111b3f0aa2456e69896b2caa.1742813866.git.dsimic@manjaro.org Signed-off-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>