summaryrefslogtreecommitdiff
path: root/arch
AgeCommit message (Collapse)Author
2023-01-04powerpc/rtas: avoid scheduling in rtas_os_term()Nathan Lynch
[ Upstream commit 6c606e57eecc37d6b36d732b1ff7e55b7dc32dd4 ] It's unsafe to use rtas_busy_delay() to handle a busy status from the ibm,os-term RTAS function in rtas_os_term(): Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b BUG: sleeping function called from invalid context at arch/powerpc/kernel/rtas.c:618 in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0 preempt_count: 2, expected: 0 CPU: 7 PID: 1 Comm: swapper/0 Tainted: G D 6.0.0-rc5-02182-gf8553a572277-dirty #9 Call Trace: [c000000007b8f000] [c000000001337110] dump_stack_lvl+0xb4/0x110 (unreliable) [c000000007b8f040] [c0000000002440e4] __might_resched+0x394/0x3c0 [c000000007b8f0e0] [c00000000004f680] rtas_busy_delay+0x120/0x1b0 [c000000007b8f100] [c000000000052d04] rtas_os_term+0xb8/0xf4 [c000000007b8f180] [c0000000001150fc] pseries_panic+0x50/0x68 [c000000007b8f1f0] [c000000000036354] ppc_panic_platform_handler+0x34/0x50 [c000000007b8f210] [c0000000002303c4] notifier_call_chain+0xd4/0x1c0 [c000000007b8f2b0] [c0000000002306cc] atomic_notifier_call_chain+0xac/0x1c0 [c000000007b8f2f0] [c0000000001d62b8] panic+0x228/0x4d0 [c000000007b8f390] [c0000000001e573c] do_exit+0x140c/0x1420 [c000000007b8f480] [c0000000001e586c] make_task_dead+0xdc/0x200 Use rtas_busy_delay_time() instead, which signals without side effects whether to attempt the ibm,os-term RTAS call again. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221118150751.469393-5-nathanl@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-04powerpc/rtas: avoid device tree lookups in rtas_os_term()Nathan Lynch
[ Upstream commit ed2213bfb192ab51f09f12e9b49b5d482c6493f3 ] rtas_os_term() is called during panic. Its behavior depends on a couple of conditions in the /rtas node of the device tree, the traversal of which entails locking and local IRQ state changes. If the kernel panics while devtree_lock is held, rtas_os_term() as currently written could hang. Instead of discovering the relevant characteristics at panic time, cache them in file-static variables at boot. Note the lookup for "ibm,extended-os-term" is converted to of_property_read_bool() since it is a boolean property, not an RTAS function token. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Reviewed-by: Nicholas Piggin <npiggin@gmail.com> Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com> [mpe: Incorporate suggested change from Nick] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221118150751.469393-4-nathanl@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31arm64: dts: qcom: sm8250: fix USB-DP PHY registersJohan Hovold
commit f8d8840c72b3df61b5252052b79020dabec01ab5 upstream. When adding support for the DisplayPort part of the QMP PHY the binding (and devicetree parser) for the (USB) child node was simply reused and this has lead to some confusion. The third DP register region is really the DP_PHY region, not "PCS" as the binding claims, and lie at offset 0x2a00 (not 0x2c00). Similarly, there likely are no "RX", "RX2" or "PCS_MISC" regions as there are for the USB part of the PHY (and in any case the Linux driver does not use them). Note that the sixth "PCS_MISC" region is not even in the binding. Fixes: 5aa0d1becd5b ("arm64: dts: qcom: sm8250: switch usb1 qmp phy to USB3+DP mode") Cc: stable@vger.kernel.org # 5.13 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221111094729.11842-3-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-12-31arm64: dts: qcom: sm6350: fix USB-DP PHY registersJohan Hovold
commit 347b9491c595d5091bfabe65cad2fd6eee786153 upstream. When adding support for the DisplayPort part of the QMP PHY the binding (and devicetree parser) for the (USB) child node was simply reused and this has lead to some confusion. The third DP register region is really the DP_PHY region, not "PCS" as the binding claims, and lie at offset 0x2a00 (not 0x2c00). Similarly, there likely are no "RX", "RX2" or "PCS_MISC" regions as there are for the USB part of the PHY (and in any case the Linux driver does not use them). Note that the sixth "PCS_MISC" region is not even in the binding. Fixes: 23737b9557fe ("arm64: dts: qcom: sm6350: Add USB1 nodes") Cc: stable@vger.kernel.org # 5.16 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221111094729.11842-2-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-12-31MIPS: ralink: mt7621: avoid to init common ralink reset controllerSergio Paracuellos
[ Upstream commit 76ce51798cb16738a4a28a6662e7344aaf7ef769 ] Commit 38a8553b0a22 ("clk: ralink: make system controller node a reset provider") make system controller a reset provider for mt7621 ralink SoCs. Ralink init code also tries to start previous common reset controller which at the end tries to find device tree node 'ralink,rt2880-reset'. mt7621 device tree file is not using at all this node anymore. Hence avoid to init this common reset controller for mt7621 ralink SoCs to avoid 'Failed to find reset controller node' boot error trace error. Fixes: 64b2d6ffff86 ("staging: mt7621-dts: align resets with binding documentation") Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31x86/apic: Handle no CONFIG_X86_X2APIC on systems with x2APIC enabled by BIOSMateusz Jończyk
[ Upstream commit e3998434da4f5b1f57f8d6a8a9f8502ee3723bae ] A kernel that was compiled without CONFIG_X86_X2APIC was unable to boot on platforms that have x2APIC already enabled in the BIOS before starting the kernel. The kernel was supposed to panic with an approprite error message in validate_x2apic() due to the missing X2APIC support. However, validate_x2apic() was run too late in the boot cycle, and the kernel tried to initialize the APIC nonetheless. This resulted in an earlier panic in setup_local_APIC() because the APIC was not registered. In my experiments, a panic message in setup_local_APIC() was not visible in the graphical console, which resulted in a hang with no indication what has gone wrong. Instead of calling panic(), disable the APIC, which results in a somewhat working system with the PIC only (and no SMP). This way the user is able to diagnose the problem more easily. Disabling X2APIC mode is not an option because it's impossible on systems with locked x2APIC. The proper place to disable the APIC in this case is in check_x2apic(), which is called early from setup_arch(). Doing this in __apic_intr_mode_select() is too late. Make check_x2apic() unconditionally available and remove the empty stub. Reported-by: Paul Menzel <pmenzel@molgen.mpg.de> Reported-by: Robert Elliott (Servers) <elliott@hpe.com> Signed-off-by: Mateusz Jończyk <mat.jonczyk@o2.pl> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/lkml/d573ba1c-0dc4-3016-712a-cc23a8a33d42@molgen.mpg.de Link: https://lore.kernel.org/lkml/20220911084711.13694-3-mat.jonczyk@o2.pl Link: https://lore.kernel.org/all/20221129215008.7247-1-mat.jonczyk@o2.pl Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31arm64: dts: qcom: sm6350: Add apps_smmu with streamID to SDHCI 1/2 nodesMarijn Suijten
[ Upstream commit 7372b944a6ba5ac86628eaacc89ed4f103435cb9 ] When enabling the APPS SMMU the mainline driver reconfigures the SMMU from its bootloader configuration, losing the stream mapping for (among which) the SDHCI hardware and breaking its ADMA feature. This feature can be disabled with: sdhci.debug_quirks=0x40 But it is of course desired to have this feature enabled and working through the SMMU. Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Luca Weiss <luca.weiss@fairphone.com> Tested-by: Luca Weiss <luca.weiss@fairphone.com> # sm7225-fairphone-fp4 Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20221030073232.22726-11-marijn.suijten@somainline.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31x86/hyperv: Remove unregister syscore call from Hyper-V cleanupGaurav Kohli
[ Upstream commit 32c97d980e2eef25465d453f2956a9ca68926a3c ] Hyper-V cleanup code comes under panic path where preemption and irq is already disabled. So calling of unregister_syscore_ops might schedule out the thread even for the case where mutex lock is free. hyperv_cleanup unregister_syscore_ops mutex_lock(&syscore_ops_lock) might_sleep Here might_sleep might schedule out this thread, where voluntary preemption config is on and this thread will never comes back. And also this was added earlier to maintain the symmetry which is not required as this can comes during crash shutdown path only. To prevent the same, removing unregister_syscore_ops function call. Signed-off-by: Gaurav Kohli <gauravkohli@linux.microsoft.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Link: https://lore.kernel.org/r/1669443291-2575-1-git-send-email-gauravkohli@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31ARM: dts: aspeed: rainier,everest: Move reserved memory regionsAdriana Kobylak
[ Upstream commit e184d42a6e085f95f5c4f1a4fbabebab2984cb68 ] Move the reserved regions to account for a decrease in DRAM when ECC is enabled. ECC takes 1/9th of memory. Running on HW with ECC off, u-boot prints: DRAM: already initialized, 1008 MiB (capacity:1024 MiB, VGA:16 MiB, ECC:off) And with ECC on, u-boot prints: DRAM: already initialized, 896 MiB (capacity:1024 MiB, VGA:16 MiB, ECC:on, ECC size:896 MiB) This implies that MCR54 is configured for ECC to be bounded at the bottom of a 16MiB VGA memory region: 1024MiB - 16MiB (VGA) = 1008MiB 1008MiB / 9 (for ECC) = 112MiB 1008MiB - 112MiB = 896MiB (available DRAM) The flash_memory region currently starts at offset 896MiB: 0xb8000000 (flash_memory offset) - 0x80000000 (base memory address) = 0x38000000 = 896MiB This is the end of the available DRAM with ECC enabled and therefore it needs to be moved. Since the flash_memory is 64MiB in size and needs to be 64MiB aligned, it can just be moved up by 64MiB and would sit right at the end of the available DRAM buffer. The ramoops region currently follows the flash_memory, but it can be moved to sit above flash_memory which would minimize the address-space fragmentation. Signed-off-by: Adriana Kobylak <anoo@us.ibm.com> Reviewed-by: Andrew Jeffery <andrew@aj.id.au> Link: https://lore.kernel.org/r/20220916195535.1020185-1-anoo@linux.ibm.com Signed-off-by: Joel Stanley <joel@jms.id.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31arm64: make is_ttbrX_addr() noinstr-safeMark Rutland
[ Upstream commit d8c1d798a2e5091128c391c6dadcc9be334af3f5 ] We use is_ttbr0_addr() in noinstr code, but as it's only marked as inline, it's theoretically possible for the compiler to place it out-of-line and instrument it, which would be problematic. Mark is_ttbr0_addr() as __always_inline such that that can safely be used from noinstr code. For consistency, do the same to is_ttbr1_addr(). Note that while is_ttbr1_addr() calls arch_kasan_reset_tag(), this is a macro (and its callees are either macros or __always_inline), so there is not a risk of transient instrumentation. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20221114144042.3001140-1-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31arm64: dts: mt8183: Fix Mali GPU clockChen-Yu Tsai
[ Upstream commit ad2631b5645a1d0ca9bf6fecf71f77e3b0071ee5 ] The actual clock feeding into the Mali GPU on the MT8183 is from the clock gate in the MFGCFG block, not CLK_TOP_MFGPLL_CK from the TOPCKGEN block, which itself is simply a pass-through placeholder for the MFGPLL in the APMIXEDSYS block. Fix the hardware description with the correct clock reference. Fixes: a8168cebf1bc ("arm64: dts: mt8183: Add node for the Mali GPU") Signed-off-by: Chen-Yu Tsai <wenst@chromium.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Link: https://lore.kernel.org/r/20220927101128.44758-2-angelogioacchino.delregno@collabora.com Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31powerpc/pseries/eeh: use correct API for error log sizeNathan Lynch
[ Upstream commit 9aafbfa5f57a4b75bafd3bed0191e8429c5fa618 ] rtas-error-log-max is not the name of an RTAS function, so rtas_token() is not the appropriate API for retrieving its value. We already have rtas_get_error_log_max() which returns a sensible value if the property is absent for any reason, so use that instead. Fixes: 8d633291b4fc ("powerpc/eeh: pseries platform EEH error log retrieval") Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> [mpe: Drop no-longer possible error handling as noticed by ajd] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221118150751.469393-6-nathanl@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31RISC-V: KVM: Fix reg_val check in kvm_riscv_vcpu_set_reg_config()Anup Patel
[ Upstream commit e482d9e33d5b0f222cbef7341dcd52cead6b9edc ] The reg_val check in kvm_riscv_vcpu_set_reg_config() should only be done for isa config register. Fixes: 9bfd900beeec ("RISC-V: KVM: Improve ISA extension by using a bitmap") Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Reviewed-by: Atish Patra <atishp@rivosinc.com> Signed-off-by: Anup Patel <anup@brainfault.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31powerpc/hv-gpci: Fix hv_gpci event listKajol Jain
[ Upstream commit 03f7c1d2a49acd30e38789cd809d3300721e9b0e ] Based on getPerfCountInfo v1.018 documentation, some of the hv_gpci events were deprecated for platform firmware that supports counter_info_version 0x8 or above. Fix the hv_gpci event list by adding a new attribute group called "hv_gpci_event_attrs_v6" and a "ENABLE_EVENTS_COUNTERINFO_V6" macro to enable these events for platform firmware that supports counter_info_version 0x6 or below. And assigning the hv_gpci event list based on output counter info version of underlying plaform. Fixes: 97bf2640184f ("powerpc/perf/hv-gpci: add the remaining gpci requests") Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Madhavan Srinivasan <maddy@linux.ibm.com> Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221130174513.87501-1-kjain@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31powerpc/83xx/mpc832x_rdb: call platform_device_put() in error case in ↵Yang Yingliang
of_fsl_spi_probe() [ Upstream commit 4d0eea415216fe3791da2f65eb41399e70c7bedf ] If platform_device_add() is not called or failed, it can not call platform_device_del() to clean up memory, it should call platform_device_put() in error case. Fixes: 26f6cb999366 ("[POWERPC] fsl_soc: add support for fsl_spi") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221029111626.429971-1-yangyingliang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31powerpc/perf: callchain validate kernel stack pointer boundsNicholas Piggin
[ Upstream commit 32c5209214bd8d4f8c4e9d9b630ef4c671f58e79 ] The interrupt frame detection and loads from the hypothetical pt_regs are not bounds-checked. The next-frame validation only bounds-checks STACK_FRAME_OVERHEAD, which does not include the pt_regs. Add another test for this. The user could set r1 to be equal to the address matching the first interrupt frame - STACK_INT_FRAME_SIZE, which is in the previous page due to the kernel redzone, and induce the kernel to load the marker from there. Possibly this could cause a crash at least. If the user could induce the previous page to contain a valid marker, then it might be able to direct perf to read specific memory addresses in a way that could be transmitted back to the user in the perf data. Fixes: 20002ded4d93 ("perf_counter: powerpc: Add callchain support") Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221127124942.1665522-4-npiggin@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31powerpc: dts: turris1x.dts: Add channel labels for temperature sensorPali Rohár
[ Upstream commit 67bbb62f61e810734da0a1577a9802ddaed24140 ] Channel 0 of SA56004ED chip refers to internal SA56004ED chip sensor (chip itself is located on the board) and channel 1 of SA56004ED chip refers to external sensor which is connected to temperature diode of the P2020 CPU. Fixes: 54c15ec3b738 ("powerpc: dts: Add DTS file for CZ.NIC Turris 1.x routers") Signed-off-by: Pali Rohár <pali@kernel.org> Reviewed-by: Marek Behún <kabel@kernel.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20220930123901.10251-1-pali@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31powerpc/pseries: fix plpks_read_var() code for different consumersNayna Jain
[ Upstream commit 1f622f3f80cbf8999ff5955a2fcfbd801a1f32e0 ] Even though plpks_read_var() is currently called to read variables owned by different consumers, it internally supports only OS consumer. Fix plpks_read_var() to handle different consumers correctly. Fixes: 2454a7af0f2a ("powerpc/pseries: define driver for Platform KeyStore") Signed-off-by: Nayna Jain <nayna@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221106205839.600442-7-nayna@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31powerpc/pseries: Return -EIO instead of -EINTR for H_ABORTED errorNayna Jain
[ Upstream commit bb8e4c7cb759b90a04f2e94056b50288ff46a0ed ] Some commands for eg. "cat" might continue to retry on encountering EINTR. This is not expected for original error code H_ABORTED. Map H_ABORTED to more relevant Linux error code EIO. Fixes: 2454a7af0f2a ("powerpc/pseries: define driver for Platform KeyStore") Signed-off-by: Nayna Jain <nayna@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221106205839.600442-4-nayna@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31powerpc/pseries: Fix the H_CALL error code in PLPKS driverNayna Jain
[ Upstream commit af223e1728c448073d1e12fe464bf344310edeba ] PAPR Spec defines H_P1 actually as H_PARAMETER and maps H_ABORTED to a different numerical value. Fix the error codes as per PAPR Specification. Fixes: 2454a7af0f2a ("powerpc/pseries: define driver for Platform KeyStore") Signed-off-by: Nayna Jain <nayna@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221106205839.600442-3-nayna@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31powerpc/pseries: fix the object owners enum value in plpks driverNayna Jain
[ Upstream commit 2330757e0be0acad88852e211dcd6106390a729b ] OS_VAR_LINUX enum in PLPKS driver should be 0x02 instead of 0x01. Fixes: 2454a7af0f2a ("powerpc/pseries: define driver for Platform KeyStore") Signed-off-by: Nayna Jain <nayna@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221106205839.600442-2-nayna@linux.ibm.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31powerpc/xive: add missing iounmap() in error path in ↵Yang Yingliang
xive_spapr_populate_irq_data() [ Upstream commit 8b49670f3bb3f10cd4d5a6dca17f5a31b173ecdc ] If remapping 'data->trig_page' fails, the 'data->eoi_mmio' need be unmapped before returning from xive_spapr_populate_irq_data(). Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20221017032333.1852406-1-yangyingliang@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31powerpc/xmon: Fix -Wswitch-unreachable warning in bpt_cmdsGustavo A. R. Silva
[ Upstream commit 1c4a4a4c8410be4a231a58b23e7a30923ff954ac ] When building with automatic stack variable initialization, GCC 12 complains about variables defined outside of switch case statements. Move the variable into the case that uses it, which silences the warning: arch/powerpc/xmon/xmon.c: In function ‘bpt_cmds’: arch/powerpc/xmon/xmon.c:1529:13: warning: statement will never be executed [-Wswitch-unreachable] 1529 | int mode; | ^~~~ Fixes: 09b6c1129f89 ("powerpc/xmon: Fix compile error with PPC_8xx=y") Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/YySE6FHiOcbWWR+9@work Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31powerpc/52xx: Fix a resource leak in an error handling pathChristophe JAILLET
[ Upstream commit 5836947613ef33d311b4eff6a32d019580a214f5 ] The error handling path of mpc52xx_lpbfifo_probe() has a request_irq() that is not balanced by a corresponding free_irq(). Add the missing call, as already done in the remove function. Fixes: 3c9059d79f5e ("powerpc/5200: add LocalPlus bus FIFO device driver") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/dec1496d46ccd5311d0f6e9f9ca4238be11bf6a6.1643440531.git.christophe.jaillet@wanadoo.fr Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31ftrace: Allow WITH_ARGS flavour of graph tracer with shadow call stackArd Biesheuvel
[ Upstream commit 38792972de4294163f44d6360fd221e6f2c22a05 ] The recent switch on arm64 from DYNAMIC_FTRACE_WITH_REGS to DYNAMIC_FTRACE_WITH_ARGS failed to take into account that we currently require the former in order to allow the function graph tracer to be enabled in combination with shadow call stacks. This means that this is no longer permitted at all, in spite of the fact that either flavour of ftrace works perfectly fine in this combination. So permit WITH_ARGS as well as WITH_REGS. Fixes: ddc9863e9e90 ("scs: Disable when function graph tracing is enabled") Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20221213132407.1485025-1-ardb@kernel.org Signed-off-by: Will Deacon <will@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31riscv: Fix P4D_SHIFT definition for 3-level page table modeAlexandre Ghiti
[ Upstream commit 71fc3621efc38ace9640ee6a0db3300900689592 ] RISC-V kernels support 3,4,5-level page tables at runtime by folding upper levels. In case of a 3-level page table, PGDIR is folded into P4D which in turn is folded into PUD: PGDIR_SHIFT value is correctly set to the same value as PUD_SHIFT, but P4D_SHIFT is not, then any use of P4D_SHIFT will access invalid address bits (all set to 1). Fix this by dynamically defining P4D_SHIFT value, like we already do for PGDIR_SHIFT. Fixes: d10efa21a937 ("riscv: mm: Control p4d's folding by pgtable_l5_enabled") Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com> Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com> Link: https://lore.kernel.org/r/20221201135128.1482189-2-alexghiti@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31RISC-V: Align the shadow stackPalmer Dabbelt
[ Upstream commit b003b3b77d65133a0011ae3b7b255347438c12f6 ] The standard RISC-V ABIs all require 16-byte stack alignment. We're only calling that one function on the shadow stack so I doubt it'd result in a real issue, but might as well keep this lined up. Fixes: 31da94c25aea ("riscv: add VMAP_STACK overflow detection") Reviewed-by: Jisheng Zhang <jszhang@kernel.org> Link: https://lore.kernel.org/r/20221130023515.20217-1-palmer@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31riscv: Fix crash during early errata patchingSamuel Holland
[ Upstream commit 0c49688174f5347c3f8012e84c0ffa0d2b2890c8 ] The patch function for the T-Head PBMT errata calls __pa_symbol() before relocation. This crashes when CONFIG_DEBUG_VIRTUAL is enabled, because __pa_symbol() forwards to __phys_addr_symbol(), and __phys_addr_symbol() checks against the absolute kernel start/end address. Fix this by checking against the kernel map instead of a symbol address. Fixes: a35707c3d850 ("riscv: add memory-type errata for T-Head") Reviewed-by: Heiko Stuebner <heiko@sntech.de> Tested-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Samuel Holland <samuel@sholland.org> Link: https://lore.kernel.org/r/20221126060920.65009-1-samuel@sholland.org Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31RISC-V: Fix MEMREMAP_WB for systems with SvpbmtAnup Patel
[ Upstream commit b91676fc16cd384a81e3af52c641aa61985cc231 ] Currently, the memremap() called with MEMREMAP_WB maps memory using the generic ioremap() function which breaks on system with Svpbmt because memory mapped using _PAGE_IOREMAP page attributes is treated as strongly-ordered non-cacheable IO memory. To address this, we implement RISC-V specific arch_memremap_wb() which maps memory using _PAGE_KERNEL page attributes resulting in write-back cacheable mapping on systems with Svpbmt. Fixes: ff689fd21cb1 ("riscv: add RISC-V Svpbmt extension support") Co-developed-by: Mayuresh Chitale <mchitale@ventanamicro.com> Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com> Signed-off-by: Anup Patel <apatel@ventanamicro.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://lore.kernel.org/r/20221114090536.1662624-2-apatel@ventanamicro.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31RISC-V: Fix unannoted hardirqs-on in return to userspace slow-pathAndrew Bresticker
[ Upstream commit b0f4c74eadbf69a3298f38566bfaa2e202541f2f ] The return to userspace path in entry.S may enable interrupts without the corresponding lockdep annotation, producing a splat[0] when DEBUG_LOCKDEP is enabled. Simply calling __trace_hardirqs_on() here gets a bit messy due to the use of RA to point back to ret_from_exception, so just move the whole slow-path loop into C. It's more readable and it lets us use local_irq_{enable,disable}(), avoiding the need for manual annotations altogether. [0]: ------------[ cut here ]------------ DEBUG_LOCKS_WARN_ON(!lockdep_hardirqs_enabled()) WARNING: CPU: 2 PID: 1 at kernel/locking/lockdep.c:5512 check_flags+0x10a/0x1e0 Modules linked in: CPU: 2 PID: 1 Comm: init Not tainted 6.1.0-rc4-00160-gb56b6e2b4f31 #53 Hardware name: riscv-virtio,qemu (DT) epc : check_flags+0x10a/0x1e0 ra : check_flags+0x10a/0x1e0 <snip> status: 0000000200000100 badaddr: 0000000000000000 cause: 0000000000000003 [<ffffffff808edb90>] lock_is_held_type+0x78/0x14e [<ffffffff8003dae2>] __might_resched+0x26/0x22c [<ffffffff8003dd24>] __might_sleep+0x3c/0x66 [<ffffffff80022c60>] get_signal+0x9e/0xa70 [<ffffffff800054a2>] do_notify_resume+0x6e/0x422 [<ffffffff80003c68>] ret_from_exception+0x0/0x10 irq event stamp: 44512 hardirqs last enabled at (44511): [<ffffffff808f901c>] _raw_spin_unlock_irqrestore+0x54/0x62 hardirqs last disabled at (44512): [<ffffffff80008200>] __trace_hardirqs_off+0xc/0x14 softirqs last enabled at (44472): [<ffffffff808f9fbe>] __do_softirq+0x3de/0x51e softirqs last disabled at (44467): [<ffffffff80017760>] irq_exit+0xd6/0x104 ---[ end trace 0000000000000000 ]--- possible reason: unannotated irqs-on. Signed-off-by: Andrew Bresticker <abrestic@rivosinc.com> Fixes: 3c4697982982 ("riscv: Enable LOCKDEP_SUPPORT & fixup TRACE_IRQFLAGS_SUPPORT") Link: https://lore.kernel.org/r/20221111223108.1976562-1-abrestic@rivosinc.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31riscv/mm: add arch hook arch_clear_hugepage_flagsTong Tiangen
[ Upstream commit d8bf77a1dc3079692f54be3087a5fd16d90027b0 ] With the PG_arch_1 we keep track if the page's data cache is clean, architecture rely on this property to treat new pages as dirty with respect to the data cache and perform the flushing before mapping the pages into userspace. This patch adds a new architecture hook, arch_clear_hugepage_flags,so that architectures which rely on the page flags being in a particular state for fresh allocations can adjust the flags accordingly when a page is freed into the pool. Fixes: 9e953cda5cdf ("riscv: Introduce huge page support for 32/64bit kernel") Signed-off-by: Tong Tiangen <tongtiangen@huawei.com> Link: https://lore.kernel.org/r/20221024094725.3054311-3-tongtiangen@huawei.com Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31riscv, bpf: Emit fixed-length instructions for BPF_PSEUDO_FUNCPu Lehui
[ Upstream commit b54b6003612a376e7be32cbc5c1af3754bbbbb3d ] For BPF_PSEUDO_FUNC instruction, verifier will refill imm with correct addresses of bpf_calls and then run last pass of JIT. Since the emit_imm of RV64 is variable-length, which will emit appropriate length instructions accorroding to the imm, it may broke ctx->offset, and lead to unpredictable problem, such as inaccurate jump. So let's fix it with fixed-length instructions. Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper") Suggested-by: Björn Töpel <bjorn@rivosinc.com> Signed-off-by: Pu Lehui <pulehui@huawei.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Björn Töpel <bjorn@kernel.org> Acked-by: Björn Töpel <bjorn@kernel.org> Link: https://lore.kernel.org/bpf/20221206091410.1584784-1-pulehui@huaweicloud.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31x86/boot: Skip realmode init code when running as Xen PV guestJuergen Gross
[ Upstream commit f1e525009493cbd569e7c8dd7d58157855f8658d ] When running as a Xen PV guest there is no need for setting up the realmode trampoline, as realmode isn't supported in this environment. Trying to setup the trampoline has been proven to be problematic in some cases, especially when trying to debug early boot problems with Xen requiring to keep the EFI boot-services memory mapped (some firmware variants seem to claim basically all memory below 1Mb for boot services). Introduce new x86_platform_ops operations for that purpose, which can be set to a NOP by the Xen PV specific kernel boot code. [ bp: s/call_init_real_mode/do_init_real_mode/ ] Fixes: 084ee1c641a0 ("x86, realmode: Relocator for realmode code") Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/r/20221123114523.3467-1-jgross@suse.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31MIPS: OCTEON: warn only once if deprecated link status is being usedLadislav Michl
[ Upstream commit 4c587a982603d7e7e751b4925809a1512099a690 ] Avoid flooding kernel log with warnings. Fixes: 2c0756d306c2 ("MIPS: OCTEON: warn if deprecated link status is being used") Signed-off-by: Ladislav Michl <ladis@linux-mips.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31MIPS: BCM63xx: Add check for NULL for clk in clk_enableAnastasia Belova
[ Upstream commit ee9ef11bd2a59c2fefaa0959e5efcdf040d7c654 ] Check clk for NULL before calling clk_enable_unlocked where clk is dereferenced. There is such check in other implementations of clk_enable. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: e7300d04bd08 ("MIPS: BCM63xx: Add support for the Broadcom BCM63xx family of SOCs.") Signed-off-by: Anastasia Belova <abelova@astralinux.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31x86/xen: Fix memory leak in xen_init_lock_cpu()Xiu Jianfeng
[ Upstream commit ca84ce153d887b1dc8b118029976cc9faf2a9b40 ] In xen_init_lock_cpu(), the @name has allocated new string by kasprintf(), if bind_ipi_to_irqhandler() fails, it should be freed, otherwise may lead to a memory leak issue, fix it. Fixes: 2d9e1e2f58b5 ("xen: implement Xen-specific spinlocks") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20221123155858.11382-3-xiujianfeng@huawei.com Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31x86/xen: Fix memory leak in xen_smp_intr_init{_pv}()Xiu Jianfeng
[ Upstream commit 69143f60868b3939ddc89289b29db593b647295e ] These local variables @{resched|pmu|callfunc...}_name saves the new string allocated by kasprintf(), and when bind_{v}ipi_to_irqhandler() fails, it goes to the @fail tag, and calls xen_smp_intr_free{_pv}() to free resource, however the new string is not saved, which cause a memory leak issue. fix it. Fixes: 9702785a747a ("i386: move xen") Signed-off-by: Xiu Jianfeng <xiujianfeng@huawei.com> Reviewed-by: Juergen Gross <jgross@suse.com> Link: https://lore.kernel.org/r/20221123155858.11382-2-xiujianfeng@huawei.com Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31uprobes/x86: Allow to probe a NOP instruction with 0x66 prefixOleg Nesterov
[ Upstream commit cefa72129e45313655d53a065b8055aaeb01a0c9 ] Intel ICC -hotpatch inserts 2-byte "0x66 0x90" NOP at the start of each function to reserve extra space for hot-patching, and currently it is not possible to probe these functions because branch_setup_xol_ops() wrongly rejects NOP with REP prefix as it treats them like word-sized branch instructions. Fixes: 250bbd12c2fe ("uprobes/x86: Refuse to attach uprobe to "word-sized" branch insns") Reported-by: Seiji Nishikawa <snishika@redhat.com> Suggested-by: Denys Vlasenko <dvlasenk@redhat.com> Signed-off-by: Oleg Nesterov <oleg@redhat.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://lore.kernel.org/r/20221204173933.GA31544@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31mips: ralink: mt7621: do not use kzalloc too earlyJohn Thomson
[ Upstream commit 7c18b64bba3bcad1be94b404f47b94a04b91ce79 ] With CONFIG_SLUB=y, following commit 6edf2576a6cc ("mm/slub: enable debugging memory wasting of kmalloc") mt7621 failed to boot very early, without showing any console messages. This exposed the pre-existing bug of mt7621.c using kzalloc before normal memory management was available. Prior to this slub change, there existed the unintended protection against "kmem_cache *s" being NULL as slab_pre_alloc_hook() happened to return NULL and bailed out of slab_alloc_node(). This allowed mt7621 prom_soc_init to fail in the soc_dev_init kzalloc, but continue booting without the SOC_BUS driver device registered. Console output from a DEBUG_ZBOOT vmlinuz kernel loading, with mm/slub modified to warn on kmem_cache zero or null: zimage at: 80B842A0 810B4BC0 Uncompressing Linux at load address 80001000 Copy device tree to address 80B80EE0 Now, booting the kernel... [ 0.000000] Linux version 6.1.0-rc3+ (john@john) (mipsel-buildroot-linux-gnu-gcc.br_real (Buildroot 2021.11-4428-g6b6741b) 12.2.0, GNU ld (GNU Binutils) 2.39) #73 SMP Wed Nov 2 05:10:01 AEST 2022 [ 0.000000] ------------[ cut here ]------------ [ 0.000000] WARNING: CPU: 0 PID: 0 at mm/slub.c:3416 kmem_cache_alloc+0x5a4/0x5e8 [ 0.000000] Modules linked in: [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.1.0-rc3+ #73 [ 0.000000] Stack : 810fff78 80084d98 00000000 00000004 00000000 00000000 80889d04 80c90000 [ 0.000000] 80920000 807bd328 8089d368 80923bd3 00000000 00000001 80889cb0 00000000 [ 0.000000] 00000000 00000000 807bd328 8084bcb1 00000002 00000002 00000001 6d6f4320 [ 0.000000] 00000000 80c97d3d 80c97d68 fffffffc 807bd328 00000000 00000000 00000000 [ 0.000000] 00000000 a0000000 80910000 8110a0b4 00000000 00000020 80010000 80010000 [ 0.000000] ... [ 0.000000] Call Trace: [ 0.000000] [<80008260>] show_stack+0x28/0xf0 [ 0.000000] [<8070c958>] dump_stack_lvl+0x60/0x80 [ 0.000000] [<8002e184>] __warn+0xc4/0xf8 [ 0.000000] [<8002e210>] warn_slowpath_fmt+0x58/0xa4 [ 0.000000] [<801c0fac>] kmem_cache_alloc+0x5a4/0x5e8 [ 0.000000] [<8092856c>] prom_soc_init+0x1fc/0x2b4 [ 0.000000] [<80928060>] prom_init+0x44/0xf0 [ 0.000000] [<80929214>] setup_arch+0x4c/0x6a8 [ 0.000000] [<809257e0>] start_kernel+0x88/0x7c0 [ 0.000000] [ 0.000000] ---[ end trace 0000000000000000 ]--- [ 0.000000] SoC Type: MediaTek MT7621 ver:1 eco:3 [ 0.000000] printk: bootconsole [early0] enabled Allowing soc_device_register to work exposed oops in the mt7621 phy pci, and pci controller drivers from soc_device_match_attr, due to missing sentinels in the quirks tables. These were fixed with: commit 819b885cd886 ("phy: ralink: mt7621-pci: add sentinel to quirks table") not yet applied ("PCI: mt7621: add sentinel to quirks table") Link: https://lore.kernel.org/linux-mm/becf2ac3-2a90-4f3a-96d9-a70f67c66e4a@app.fastmail.com/ Fixes: 71b9b5e0130d ("MIPS: ralink: mt7621: introduce 'soc_device' initialization") Signed-off-by: John Thomson <git@johnthomson.fastmail.com.au> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31mips: ralink: mt7621: soc queries and tests as functionsJohn Thomson
[ Upstream commit b4767d4c072583dec987225b6fe3f5524a735f42 ] Move the SoC register value queries and tests to specific functions, to remove repetition of logic No functional changes intended Signed-off-by: John Thomson <git@johnthomson.fastmail.com.au> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Stable-dep-of: 7c18b64bba3b ("mips: ralink: mt7621: do not use kzalloc too early") Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31mips: ralink: mt7621: define MT7621_SYSC_BASE with __iomemJohn Thomson
[ Upstream commit a2cab953b4c077cc02878d424466d3a6eac32aaf ] So that MT7621_SYSC_BASE can be used later in multiple functions without needing to repeat this __iomem declaration each time Signed-off-by: John Thomson <git@johnthomson.fastmail.com.au> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Stable-dep-of: 7c18b64bba3b ("mips: ralink: mt7621: do not use kzalloc too early") Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31perf/x86/intel/uncore: Fix reference count leak in __uncore_imc_init_box()Xiongfeng Wang
[ Upstream commit 17b8d847b92d815d1638f0de154654081d66b281 ] pci_get_device() will increase the reference count for the returned pci_dev, so tgl_uncore_get_mc_dev() will return a pci_dev with its reference count increased. We need to call pci_dev_put() to decrease the reference count before exiting from __uncore_imc_init_box(). Add pci_dev_put() for both normal and error path. Fixes: fdb64822443e ("perf/x86: Add Intel Tiger Lake uncore support") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20221118063137.121512-5-wangxiongfeng2@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31perf/x86/intel/uncore: Fix reference count leak in snr_uncore_mmio_map()Xiongfeng Wang
[ Upstream commit 8ebd16c11c346751b3944d708e6c181ed4746c39 ] pci_get_device() will increase the reference count for the returned pci_dev, so snr_uncore_get_mc_dev() will return a pci_dev with its reference count increased. We need to call pci_dev_put() to decrease the reference count. Let's add the missing pci_dev_put(). Fixes: ee49532b38dd ("perf/x86/intel/uncore: Add IMC uncore support for Snow Ridge") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20221118063137.121512-4-wangxiongfeng2@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31perf/x86/intel/uncore: Fix reference count leak in hswep_has_limit_sbox()Xiongfeng Wang
[ Upstream commit 1ff9dd6e7071a561f803135c1d684b13c7a7d01d ] pci_get_device() will increase the reference count for the returned 'dev'. We need to call pci_dev_put() to decrease the reference count. Since 'dev' is only used in pci_read_config_dword(), let's add pci_dev_put() right after it. Fixes: 9d480158ee86 ("perf/x86/intel/uncore: Remove uncore extra PCI dev HSWEP_PCI_PCU_3") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20221118063137.121512-3-wangxiongfeng2@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31perf/x86/intel/uncore: Fix reference count leak in sad_cfg_iio_topology()Xiongfeng Wang
[ Upstream commit c508eb042d9739bf9473526f53303721b70e9100 ] pci_get_device() will increase the reference count for the returned pci_dev, and also decrease the reference count for the input parameter *from* if it is not NULL. If we break the loop in sad_cfg_iio_topology() with 'dev' not NULL. We need to call pci_dev_put() to decrease the reference count. Since pci_dev_put() can handle the NULL input parameter, we can just add one pci_dev_put() right before 'return ret'. Fixes: c1777be3646b ("perf/x86/intel/uncore: Enable I/O stacks to IIO PMON mapping on SNR") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20221118063137.121512-2-wangxiongfeng2@huawei.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31MIPS: vpe-cmp: fix possible memory leak while module exitingYang Yingliang
[ Upstream commit c5ed1fe0801f0c66b0fbce2785239a5664629057 ] dev_set_name() allocates memory for name, it need be freed when module exiting, call put_device() to give up reference, so that it can be freed in kobject_cleanup() when the refcount hit to 0. The vpe_device is static, so remove kfree() from vpe_device_release(). Fixes: 17a1d523aa58 ("MIPS: APRP: Add VPE loader support for CMP platforms.") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31MIPS: vpe-mt: fix possible memory leak while module exitingYang Yingliang
[ Upstream commit 5822e8cc84ee37338ab0bdc3124f6eec04dc232d ] Afer commit 1fa5ae857bb1 ("driver core: get rid of struct device's bus_id string array"), the name of device is allocated dynamically, it need be freed when module exiting, call put_device() to give up reference, so that it can be freed in kobject_cleanup() when the refcount hit to 0. The vpe_device is static, so remove kfree() from vpe_device_release(). Fixes: 1fa5ae857bb1 ("driver core: get rid of struct device's bus_id string array") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31x86/split_lock: Add sysctl to control the misery modeGuilherme G. Piccoli
[ Upstream commit 727209376f4998bc84db1d5d8af15afea846a92b ] Commit b041b525dab9 ("x86/split_lock: Make life miserable for split lockers") changed the way the split lock detector works when in "warn" mode; basically, it not only shows the warn message, but also intentionally introduces a slowdown through sleeping plus serialization mechanism on such task. Based on discussions in [0], seems the warning alone wasn't enough motivation for userspace developers to fix their applications. This slowdown is enough to totally break some proprietary (aka. unfixable) userspace[1]. Happens that originally the proposal in [0] was to add a new mode which would warns + slowdown the "split locking" task, keeping the old warn mode untouched. In the end, that idea was discarded and the regular/default "warn" mode now slows down the applications. This is quite aggressive with regards proprietary/legacy programs that basically are unable to properly run in kernel with this change. While it is understandable that a malicious application could DoS by split locking, it seems unacceptable to regress old/proprietary userspace programs through a default configuration that previously worked. An example of such breakage was reported in [1]. Add a sysctl to allow controlling the "misery mode" behavior, as per Thomas suggestion on [2]. This way, users running legacy and/or proprietary software are allowed to still execute them with a decent performance while still observing the warning messages on kernel log. [0] https://lore.kernel.org/lkml/20220217012721.9694-1-tony.luck@intel.com/ [1] https://github.com/doitsujin/dxvk/issues/2938 [2] https://lore.kernel.org/lkml/87pmf4bter.ffs@tglx/ [ dhansen: minor changelog tweaks, including clarifying the actual problem ] Fixes: b041b525dab9 ("x86/split_lock: Make life miserable for split lockers") Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Tested-by: Andre Almeida <andrealmeid@igalia.com> Link: https://lore.kernel.org/all/20221024200254.635256-1-gpiccoli%40igalia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31x86/sgx: Reduce delay and interference of enclave releaseReinette Chatre
[ Upstream commit 7b72c823ddf8aaaec4e9fb28e6fbe4d511e7dad1 ] commit 8795359e35bc ("x86/sgx: Silence softlockup detection when releasing large enclaves") introduced a cond_resched() during enclave release where the EREMOVE instruction is applied to every 4k enclave page. Giving other tasks an opportunity to run while tearing down a large enclave placates the soft lockup detector but Iqbal found that the fix causes a 25% performance degradation of a workload run using Gramine. Gramine maintains a 1:1 mapping between processes and SGX enclaves. That means if a workload in an enclave creates a subprocess then Gramine creates a duplicate enclave for that subprocess to run in. The consequence is that the release of the enclave used to run the subprocess can impact the performance of the workload that is run in the original enclave, especially in large enclaves when SGX2 is not in use. The workload run by Iqbal behaves as follows: Create enclave (enclave "A") /* Initialize workload in enclave "A" */ Create enclave (enclave "B") /* Run subprocess in enclave "B" and send result to enclave "A" */ Release enclave (enclave "B") /* Run workload in enclave "A" */ Release enclave (enclave "A") The performance impact of releasing enclave "B" in the above scenario is amplified when there is a lot of SGX memory and the enclave size matches the SGX memory. When there is 128GB SGX memory and an enclave size of 128GB, from the time enclave "B" starts the 128GB SGX memory is oversubscribed with a combined demand for 256GB from the two enclaves. Before commit 8795359e35bc ("x86/sgx: Silence softlockup detection when releasing large enclaves") enclave release was done in a tight loop without giving other tasks a chance to run. Even though the system experienced soft lockups the workload (run in enclave "A") obtained good performance numbers because when the workload started running there was no interference. Commit 8795359e35bc ("x86/sgx: Silence softlockup detection when releasing large enclaves") gave other tasks opportunity to run while an enclave is released. The impact of this in this scenario is that while enclave "B" is released and needing to access each page that belongs to it in order to run the SGX EREMOVE instruction on it, enclave "A" is attempting to run the workload needing to access the enclave pages that belong to it. This causes a lot of swapping due to the demand for the oversubscribed SGX memory. Longer latencies are experienced by the workload in enclave "A" while enclave "B" is released. Improve the performance of enclave release while still avoiding the soft lockup detector with two enhancements: - Only call cond_resched() after XA_CHECK_SCHED iterations. - Use the xarray advanced API to keep the xarray locked for XA_CHECK_SCHED iterations instead of locking and unlocking at every iteration. This batching solution is copied from sgx_encl_may_map() that also iterates through all enclave pages using this technique. With this enhancement the workload experiences a 5% performance degradation when compared to a kernel without commit 8795359e35bc ("x86/sgx: Silence softlockup detection when releasing large enclaves"), an improvement to the reported 25% degradation, while still placating the soft lockup detector. Scenarios with poor performance are still possible even with these enhancements. For example, short workloads creating sub processes while running in large enclaves. Further performance improvements are pursued in user space through avoiding to create duplicate enclaves for certain sub processes, and using SGX2 that will do lazy allocation of pages as needed so enclaves created for sub processes start quickly and release quickly. Fixes: 8795359e35bc ("x86/sgx: Silence softlockup detection when releasing large enclaves") Reported-by: Md Iqbal Hossain <md.iqbal.hossain@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Md Iqbal Hossain <md.iqbal.hossain@intel.com> Link: https://lore.kernel.org/all/00efa80dd9e35dc85753e1c5edb0344ac07bb1f0.1667236485.git.reinette.chatre%40intel.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31alpha: fix syscall entry in !AUDUT_SYSCALL caseAl Viro
[ Upstream commit f7b2431a6d22f7a91c567708e071dfcd6d66db14 ] We only want to take the slow path if SYSCALL_TRACE or SYSCALL_AUDIT is set; on !AUDIT_SYSCALL configs the current tree hits it whenever _any_ thread flag (including NEED_RESCHED, NOTIFY_SIGNAL, etc.) happens to be set. Fixes: a9302e843944 "alpha: Enable system-call auditing support" Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Sasha Levin <sashal@kernel.org>