summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/msm/adreno
AgeCommit message (Collapse)Author
2023-03-20drm/msm/a6xx: Remove cx gdsc polling using 'reset'Akhil P Oommen
Remove the unused 'reset' interface which was supposed to help to ensure that cx gdsc has collapsed during gpu recovery. This is was not enabled so far due to missing gpucc driver support. Similar functionality using genpd framework will be implemented in the upcoming patch. This effectively reverts commit 1f6cca404918 ("drm/msm/a6xx: Ensure CX collapse during gpu recovery"). Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Patchwork: https://patchwork.freedesktop.org/patch/516470/ Link: https://lore.kernel.org/r/20230102161757.v5.4.I96e0bf9eaf96dd866111c1eec8a4c9b70fd7cbcb@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-03-20drm/msm/a6xx: Vote for cx gdsc from gpu driverAkhil P Oommen
When a device has multiple power domains, dev->power_domain is left empty during probe. That didn't cause any issue so far because we are freeloading on smmu driver's vote on cx gdsc. Instead of that, create a device_link between cx genpd device and gmu device to keep a vote from gpu driver. Before this patch: localhost ~ # cat /sys/kernel/debug/pm_genpd/pm_genpd_summary gx_gdsc on 0 /devices/genpd:1:3d6a000.gmu active 0 cx_gdsc on 0 /devices/platform/soc@0/3da0000.iommu active 0 After this patch: localhost ~ # cat /sys/kernel/debug/pm_genpd/pm_genpd_summary gx_gdsc on 0 /devices/genpd:1:3d6a000.gmu active 0 cx_gdsc on 0 /devices/platform/soc@0/3da0000.iommu active 0 /devices/genpd:0:3d6a000.gmu active 0 Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/516468/ Link: https://lore.kernel.org/r/20230102161757.v5.3.I7f545d8494dcdbe6e96a15fbe8aaf5bb0c003d50@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-03-10Merge tag 'drm-msm-fixes-2023-03-09' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-fixes msm-fixes for v6.3-rc2 - Fix for possible invalid ptr free in submit ioctl syncobj cleanup path. - Synchronize GMU removal in driver teardown path - a5xx preemption fixes - Fix runpm imbalance at unbind - DPU hw catalog fixes: - set DPU_MDP_PERIPH_0_REMOVED for sc8280xp as this is another chipset where the PERIPH_0 block of registers is not there - fix the DPU features supported in QCM2290 by comparing it with the downstream device tree - fix the length of registers in the sc7180_ctl from 0xe4 to 0x1dc - fix the max mixer line width for sm6115 and qcm2290 chipsets in the DPU catalog - fix the scaler version on sm8550, sc8280xp, sm8450, sm8250, sm8350 and sm6115. This was incorrectly populated on the SW version of the scaler library and not the scaler HW version - Drop dim layer support for msm8998 as its not indicated to be supported in the downstream DTSI - fix the DPU_CLK_CTRL bits for msm 8998 sspp blocks - Use DPU_CLK_CTRL_DMA* prefix instead of DPU_CLK_CTRL_CURSOR* for all chipsets for the DMA sspp blocks - fix the ping-pong block base address for sc7280 in the DPU HW catalog - Fix stack corruption issue in the dpu_hw_ctl_setup_blendstage() function as it was causing a negative left shift by protecting against an invalid index - Clear the DSPP reservations in dpu_rm_release(). This was missed out and as as result the DSPP was not released from the resource manager global state. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvH+VH_Wx3mFMG51CMnoiU06CM-+-WMhM73M42Qx7Bp4A@mail.gmail.com
2023-02-27Merge tag 'soc-drivers-6.3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc Pull ARM SoC driver updates from Arnd Bergmann: "As usual, there are lots of minor driver changes across SoC platforms from NXP, Amlogic, AMD Zynq, Mediatek, Qualcomm, Apple and Samsung. These usually add support for additional chip variations in existing drivers, but also add features or bugfixes. The SCMI firmware subsystem gains a unified raw userspace interface through debugfs, which can be used for validation purposes. Newly added drivers include: - New power management drivers for StarFive JH7110, Allwinner D1 and Renesas RZ/V2M - A driver for Qualcomm battery and power supply status - A SoC device driver for identifying Nuvoton WPCM450 chips - A regulator coupler driver for Mediatek MT81xxv" * tag 'soc-drivers-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (165 commits) power: supply: Introduce Qualcomm PMIC GLINK power supply soc: apple: rtkit: Do not copy the reg state structure to the stack soc: sunxi: SUN20I_PPU should depend on PM memory: renesas-rpc-if: Remove redundant division of dummy soc: qcom: socinfo: Add IDs for IPQ5332 and its variant dt-bindings: arm: qcom,ids: Add IDs for IPQ5332 and its variant dt-bindings: power: qcom,rpmpd: add RPMH_REGULATOR_LEVEL_LOW_SVS_L1 firmware: qcom_scm: Move qcom_scm.h to include/linux/firmware/qcom/ MAINTAINERS: Update qcom CPR maintainer entry dt-bindings: firmware: document Qualcomm SM8550 SCM dt-bindings: firmware: qcom,scm: add qcom,scm-sa8775p compatible soc: qcom: socinfo: Add Soc IDs for IPQ8064 and variants dt-bindings: arm: qcom,ids: Add Soc IDs for IPQ8064 and variants soc: qcom: socinfo: Add support for new field in revision 17 soc: qcom: smd-rpm: Add IPQ9574 compatible soc: qcom: pmic_glink: remove redundant calculation of svid soc: qcom: stats: Populate all subsystem debugfs files dt-bindings: soc: qcom,rpmh-rsc: Update to allow for generic nodes soc: qcom: pmic_glink: add CONFIG_NET/CONFIG_OF dependencies soc: qcom: pmic_glink: Introduce altmode support ...
2023-02-22drm/msm/adreno: fix runtime PM imbalance at unbindJohan Hovold
A recent commit moved enabling of runtime PM from adreno_gpu_init() to adreno_load_gpu() (called on first open()), which means that unbind() may now be called with runtime PM disabled in case the device was never opened in between. Make sure to only forcibly suspend and disable runtime PM at unbind() in case runtime PM has been enabled to prevent a disable count imbalance. This specifically avoids leaving runtime PM disabled when the device is later opened after a successful bind: msm_dpu ae01000.display-controller: [drm:adreno_load_gpu [msm]] *ERROR* Couldn't power up the GPU: -13 Fixes: 4b18299b3365 ("drm/msm/adreno: Defer enabling runpm until hw_init()") Reported-by: Bjorn Andersson <quic_bjorande@quicinc.com> Link: https://lore.kernel.org/lkml/20230203181245.3523937-1-quic_bjorande@quicinc.com Cc: stable@vger.kernel.org # 6.0 Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Patchwork: https://patchwork.freedesktop.org/patch/523549/ Link: https://lore.kernel.org/r/20230221101430.14546-2-johan+linaro@kernel.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-02-22drm/msm/a5xx: fix context faults during ring switchDmitry Baryshkov
The rptr_addr is set in the preempt_init_ring(), which is called from a5xx_gpu_init(). It uses shadowptr() to set the address, however the shadow_iova is not yet initialized at that time. Move the rptr_addr setting to the a5xx_preempt_hw_init() which is called after setting the shadow_iova, getting the correct value for the address. Fixes: 8907afb476ac ("drm/msm: Allow a5xx to mark the RPTR shadow as privileged") Suggested-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/522640/ Link: https://lore.kernel.org/r/20230214020956.164473-5-dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-02-22drm/msm/a5xx: fix the emptyness check in the preempt codeDmitry Baryshkov
Quoting Yassine: ring->memptrs->rptr is never updated and stays 0, so the comparison always evaluates to false and get_next_ring always returns ring 0 thinking it isn't empty. Fix this by calling get_rptr() instead of reading rptr directly. Reported-by: Yassine Oudjana <y.oudjana@protonmail.com> Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/522642/ Link: https://lore.kernel.org/r/20230214020956.164473-4-dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-02-22drm/msm/a5xx: fix highest bank bit for a530Dmitry Baryshkov
A530 has highest bank bit equal to 15 (like A540). Fix values written to REG_A5XX_RB_MODE_CNTL and REG_A5XX_TPL1_MODE_CNTL registers. Fixes: 1d832ab30ce6 ("drm/msm/a5xx: Add support for Adreno 508, 509, 512 GPUs") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/522639/ Link: https://lore.kernel.org/r/20230214020956.164473-3-dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-02-22drm/msm/a5xx: fix setting of the CP_PREEMPT_ENABLE_LOCAL registerDmitry Baryshkov
Rather than writing CP_PREEMPT_ENABLE_GLOBAL twice, follow the vendor kernel and set CP_PREEMPT_ENABLE_LOCAL register instead. a5xx_submit() will override it during submission, but let's get the sequence correct. Fixes: b1fc2839d2f9 ("drm/msm: Implement preemption for A5XX targets") Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/522638/ Link: https://lore.kernel.org/r/20230214020956.164473-2-dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-02-22drm/msm/a6xx: Make GPU destroy a bit saferDouglas Anderson
If, for whatever reason, we're trying process adreno_runtime_resume() at the same time that a6xx_destroy() is running then things can go boom. Specifically adreno_runtime_resume() will eventually call a6xx_pm_resume() and that may try to resume the gmu. Let's grab the GMU lock as we're destroying the GMU. That will solve the race because a6xx_pm_resume() grabs the same lock. That makes the access of `gmu->initialized` in a6xx_gmu_resume() safe. We'll also return an error code in a6xx_gmu_resume() if we see that `gmu->initialized` was false. If this happens we'll bail out of the rest of a6xx_pm_resume(), which is good because the rest of that function is also not good to do if we're racing with a6xx_destroy(). Signed-off-by: Douglas Anderson <dianders@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/521232/ Link: https://lore.kernel.org/r/20230202104822.1.I0e49003bf4dd1dead9be4a29dbee41f3b1236e48@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-02-22Merge tag 'drm-msm-fixes-2023-01-16' into msm-fixesRob Clark
Back-merge of previous cycles msm-fixes for kexec fix (to avoid merge conflict) Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-02-08firmware: qcom_scm: Move qcom_scm.h to include/linux/firmware/qcom/Elliot Berman
Move include/linux/qcom_scm.h to include/linux/firmware/qcom/qcom_scm.h. This removes 1 of a few remaining Qualcomm-specific headers into a more approciate subdirectory under include/. Suggested-by: Bjorn Andersson <andersson@kernel.org> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com> Reviewed-by: Guru Das Srinagesh <quic_gurus@quicinc.com> Acked-by: Mukesh Ojha <quic_mojha@quicinc.com> Signed-off-by: Bjorn Andersson <andersson@kernel.org> Link: https://lore.kernel.org/r/20230203210956.3580811-1-quic_eberman@quicinc.com
2023-02-03Merge tag 'drm-msm-next-2023-01-30' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-next msm-next for v6.3 There is one devfreq patch, maintainer acked to land via msm-next to avoid a build break on platforms that do not support PM_DEVFREQ. And otherwise the usual assortment: GPU: - Add MSM_SUBMIT_BO_NO_IMPLICIT - a2xx: Support to load legacy firmware - a6xx: GPU devcore dump updates for a650/a660 - GPU devfreq tuning and fixes DPU, DSI, MDSS: - Support for SM8350, SM8450 SM8550 and SC8280XP platform Core: - Added bindings for SM8150 (driver support already present) DPU: - Partial support for DSC on SM8150 and SM8250 - Fixed color transformation matrix being lost on suspend/resume - Include DSC blocks into register snapshot - Misc HW catalog fixes DP: - Support for DP on SDM845 and SC8280XP platforms - HPD fixes - Support for limiting DP link rate via DT property, this enables - Support for HBR3 rates. DSI: - Validate display modes according to the DSI OPP table - DSI PHY support for the SM6375 platform - Fixed byte intf clock selection for 14nm PHYs - Fix the case of empty OPP tables (fixing db410c) - DT schema rework and fixes HDMI: - Turn 8960 HDMI PHY into clock provider, - Make 8960 HDMI PHY use PXO clock from DT MDP5: - Schema conversion to YAML Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGv6zQ-zsgS+NG+WuV=tk51q9vA2QdKqYhNgiXQddAdZjA@mail.gmail.com
2023-01-22Merge branch 'msm-next-lumag' into HEADDmitry Baryshkov
Merge display-related changes targeting Qualcomm DRM MSM driver. Notable changes: DPU, DSI, MDSS: - Support for SM8350, SM8450 SM8550 and SC8280XP platform Core: - Added bindings for SM8150 (driver support already present) DPU: - Partial support for DSC on SM8150 and SM8250 - Fixed color transformation matrix being lost on suspend/resume DP: - Support for DP on SDM845 and SC8280XP platforms - HPD fixes - Support for limiting DP link rate via DT property, this enables support for HBR3 rates. DSI: - Validate display modes according to the DSI OPP table - DSI PHY support for the SM6375 platform - Fixed byte intf clock selection for 14nm PHYs MDP5: - Schema conversion to YAML Misc fixes as usual Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2023-01-20Merge tag 'drm-msm-fixes-2023-01-16' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-fixes msm-fixes for v6.3-rc5 Two GPU fixes which were meant to be part of the previous pull request, but I'd forgotten to fetch from gitlab after the MR was merged so that git tag was applied to the wrong commit. - kexec shutdown fix - fix potential double free Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGskguoVsz2wqAK2k+f32LwcVY5JC6+e2RwLqZswz3RY2Q@mail.gmail.com
2023-01-16drm/msm/gpu: Add devfreq tuning debugfsRob Clark
Make the handful of tuning knobs available visible via debugfs. v2: select DEVFREQ_GOV_SIMPLE_ONDEMAND because for some reason struct devfreq_simple_ondemand_data depends on this Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/517784/ Link: https://lore.kernel.org/r/20230110231447.1939101-2-robdclark@gmail.com Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
2023-01-16drm/msm/a6xx: Update ROQ size in coredumpAkhil P Oommen
Since RoQ size differs between generations, calculate dynamically the RoQ size while capturing coredump. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/515610/ Link: https://lore.kernel.org/r/20221221203925.v2.4.I07f22966395eb045f6b312710f53890d5d7e69d4@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-01-16drm/msm/a6xx: Update a6xx gpu coredumpAkhil P Oommen
Update gpu coredump for a660/a650 family of gpus with the extra information available. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/515608/ Link: https://lore.kernel.org/r/20221221203925.v2.3.Ifbfce6d693b202dac92006345bb825e7c5aee9c6@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-01-16drm/msm/adreno: Fix null ptr access in adreno_gpu_cleanup()Akhil P Oommen
Fix the below kernel panic due to null pointer access: [ 18.504431] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000048 [ 18.513464] Mem abort info: [ 18.516346] ESR = 0x0000000096000005 [ 18.520204] EC = 0x25: DABT (current EL), IL = 32 bits [ 18.525706] SET = 0, FnV = 0 [ 18.528878] EA = 0, S1PTW = 0 [ 18.532117] FSC = 0x05: level 1 translation fault [ 18.537138] Data abort info: [ 18.540110] ISV = 0, ISS = 0x00000005 [ 18.544060] CM = 0, WnR = 0 [ 18.547109] user pgtable: 4k pages, 39-bit VAs, pgdp=0000000112826000 [ 18.553738] [0000000000000048] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000 [ 18.562690] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP **Snip** [ 18.696758] Call trace: [ 18.699278] adreno_gpu_cleanup+0x30/0x88 [ 18.703396] a6xx_destroy+0xc0/0x130 [ 18.707066] a6xx_gpu_init+0x308/0x424 [ 18.710921] adreno_bind+0x178/0x288 [ 18.714590] component_bind_all+0xe0/0x214 [ 18.718797] msm_drm_bind+0x1d4/0x614 [ 18.722566] try_to_bring_up_aggregate_device+0x16c/0x1b8 [ 18.728105] __component_add+0xa0/0x158 [ 18.732048] component_add+0x20/0x2c [ 18.735719] adreno_probe+0x40/0xc0 [ 18.739300] platform_probe+0xb4/0xd4 [ 18.743068] really_probe+0xfc/0x284 [ 18.746738] __driver_probe_device+0xc0/0xec [ 18.751129] driver_probe_device+0x48/0x110 [ 18.755421] __device_attach_driver+0xa8/0xd0 [ 18.759900] bus_for_each_drv+0x90/0xdc [ 18.763843] __device_attach+0xfc/0x174 [ 18.767786] device_initial_probe+0x20/0x2c [ 18.772090] bus_probe_device+0x40/0xa0 [ 18.776032] deferred_probe_work_func+0x94/0xd0 [ 18.780686] process_one_work+0x190/0x3d0 [ 18.784805] worker_thread+0x280/0x3d4 [ 18.788659] kthread+0x104/0x1c0 [ 18.791981] ret_from_fork+0x10/0x20 [ 18.795654] Code: f9400408 aa0003f3 aa1f03f4 91142015 (f9402516) [ 18.801913] ---[ end trace 0000000000000000 ]--- [ 18.809039] Kernel panic - not syncing: Oops: Fatal exception Fixes: 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}") Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/515605/ Link: https://lore.kernel.org/r/20221221203925.v2.1.Ib978de92c4bd000b515486aad72e96c2481f84d0@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-01-16drm/msm/a2xx: support loading legacy (iMX) firmwareDmitry Baryshkov
Support loading A200 firmware generated from the iMX firmware header files. The firmware lacks protection support, however it allows GPU to function properly while using the firmware files with clear license which allows redistribution. Cc: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/516443/ Link: https://lore.kernel.org/r/20230101155753.779176-1-dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-01-13Merge tag 'drm-msm-fixes-2023-01-12' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-fixes msm-fixes for v6.3-rc4 Display Fixes: - Fix the documentation for dpu_encoder_phys_wb_init() and dpu_encoder_phys_wb_setup_fb() APIs to address doc warnings - Remove vcca-supply and vdds-supply as mandatory for 14nm PHY and 10nm PHY DT schemas respectively as they are not present on some SOCs using these PHYs - Add the dsi-phy-regulator-ldo-mode to dsi-phy-28nm.yaml as it was missed out during txt to yaml migration - Remove operating-points-v2 and power-domain as a required property for the DSI controller as thats not the case for every SOC - Fix the description from display escape clock to display core clock in the dsi controller yaml - Fix the memory leak for mdp1-mem path for the cases when we return early after failing to get mdp0-mem ICC paths for msm - Fix error handling path in msm_hdmi_dev_probe() to release the phy ref count when devm_pm_runtime_enable() fails - Fix the dp_aux_isr() routine to make sure it doesnt incorrectly signal the aux transaction as complete if the ISR was not an AUX isr. This fixes a big hitter stability bug on chromebooks. - Add protection against null pointer dereference when there is no kms object as in the case of headless adreno GPU in the shutdown path. GPU Fixes: - a5xx: fix quirks to actually be a bitmask and not overwrite each other - a6xx: fix gx halt sequence to avoid 1000ms hang on some devices - kexec shutdown fix - fix potential double free Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGv7=in_MHW3kdkhqh7ZFoVCmnikmr29YYHCXR=7aOEneg@mail.gmail.com
2023-01-11drm/msm/gpu: Fix potential double-freeRob Clark
If userspace was calling the MSM_SET_PARAM ioctl on multiple threads to set the COMM or CMDLINE param, it could trigger a race causing the previous value to be kfree'd multiple times. Fix this by serializing on the gpu lock. Signed-off-by: Rob Clark <robdclark@chromium.org> Fixes: d4726d770068 ("drm/msm: Add a way to override processes comm/cmdline") Patchwork: https://patchwork.freedesktop.org/patch/517778/ Link: https://lore.kernel.org/r/20230110212903.1925878-1-robdclark@gmail.com
2023-01-11adreno: Shutdown the GPU properlyJoel Fernandes (Google)
During kexec on ARM device, we notice that device_shutdown() only calls pm_runtime_force_suspend() while shutting down the GPU. This means the GPU kthread is still running and further, there maybe active submits. This causes all kinds of issues during a kexec reboot: Warning from shutdown path: [ 292.509662] WARNING: CPU: 0 PID: 6304 at [...] adreno_runtime_suspend+0x3c/0x44 [ 292.509863] Hardware name: Google Lazor (rev3 - 8) with LTE (DT) [ 292.509872] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 292.509881] pc : adreno_runtime_suspend+0x3c/0x44 [ 292.509891] lr : pm_generic_runtime_suspend+0x30/0x44 [ 292.509905] sp : ffffffc014473bf0 [...] [ 292.510043] Call trace: [ 292.510051] adreno_runtime_suspend+0x3c/0x44 [ 292.510061] pm_generic_runtime_suspend+0x30/0x44 [ 292.510071] pm_runtime_force_suspend+0x54/0xc8 [ 292.510081] adreno_shutdown+0x1c/0x28 [ 292.510090] platform_shutdown+0x2c/0x38 [ 292.510104] device_shutdown+0x158/0x210 [ 292.510119] kernel_restart_prepare+0x40/0x4c And here from GPU kthread, an SError OOPs: [ 192.648789] el1h_64_error+0x7c/0x80 [ 192.648812] el1_interrupt+0x20/0x58 [ 192.648833] el1h_64_irq_handler+0x18/0x24 [ 192.648854] el1h_64_irq+0x7c/0x80 [ 192.648873] local_daif_inherit+0x10/0x18 [ 192.648900] el1h_64_sync_handler+0x48/0xb4 [ 192.648921] el1h_64_sync+0x7c/0x80 [ 192.648941] a6xx_gmu_set_oob+0xbc/0x1fc [ 192.648968] a6xx_hw_init+0x44/0xe38 [ 192.648991] msm_gpu_hw_init+0x48/0x80 [ 192.649013] msm_gpu_submit+0x5c/0x1a8 [ 192.649034] msm_job_run+0xb0/0x11c [ 192.649058] drm_sched_main+0x170/0x434 [ 192.649086] kthread+0x134/0x300 [ 192.649114] ret_from_fork+0x10/0x20 Fix by calling adreno_system_suspend() in the device_shutdown() path. [ Applied Rob Clark feedback on fixing adreno_unbind() similarly, also tested as above. ] Cc: Rob Clark <robdclark@chromium.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ricardo Ribalda <ribalda@chromium.org> Cc: Ross Zwisler <zwisler@kernel.org> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Reviewed-by: Ricardo Ribalda <ribalda@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/517633/ Link: https://lore.kernel.org/r/20230109222547.1368644-1-joel@joelfernandes.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-01-05drm/msm/a6xx: Avoid gx gbit halt during rpm suspendAkhil P Oommen
As per the downstream driver, gx gbif halt is required only during recovery sequence. So lets avoid it during regular rpm suspend. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/515279/ Link: https://lore.kernel.org/r/20221216223253.1.Ice9c47bfeb1fddb8dc377a3491a043a3ee7fca7d@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2023-01-05drm/msm/adreno: Make adreno quirks not overwrite each otherKonrad Dybcio
So far the adreno quirks have all been assigned with an OR operator, which is problematic, because they were assigned consecutive integer values, which makes checking them with an AND operator kind of no bueno.. Switch to using BIT(n) so that only the quirks that the programmer chose are taken into account when evaluating info->quirks & ADRENO_QUIRK_... Fixes: 370063ee427a ("drm/msm/adreno: Add A540 support") Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Marijn Suijten <marijn.suijten@somainline.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Signed-off-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/516456/ Link: https://lore.kernel.org/r/20230102100201.77286-1-konrad.dybcio@linaro.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-11-30Merge tag 'drm-msm-next-2022-11-28' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-next msm-next for v6.2 (the gpu/gem bits) - Remove exclusive-fence hack that caused over-synchronization - Fix speed-bin detection vs. probe-defer - Enable clamp_to_idle on 7c3 - Improved hangcheck detection Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvT1h_S4d=YRgphgR8i7aMaxQaNW8mru7QaoUo9uiUk2A@mail.gmail.com
2022-11-17drm/msm: Hangcheck progress detectionRob Clark
If the hangcheck timer expires, check if the fw's position in the cmdstream has advanced (changed) since last timer expiration, and allow it up to three additional "extensions" to it's alotted time. The intention is to continue to catch "shader stuck in a loop" type hangs quickly, but allow more time for things that are actually making forward progress. Because we need to sample the CP state twice to detect if there has not been progress, this also cuts the the timer's duration in half. v2: Fix typo (REG_A6XX_CP_CSQ_IB2_STAT), add comment v3: Only halve hangcheck timer duration for generations which support progress detection (hdanton); removed unused a5xx progress (without knowing how to adjust for data buffered in ROQ it is too likely to report a false negative) v4: Comment updates to better describe the total hangcheck duration when progress detection is applied Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Tested-by: Chia-I Wu <olvaffe@gmail.com> # dEQP-GLES2.functional.flush_finish.wait Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/511584/ Link: https://lore.kernel.org/r/20221114193049.1533391-3-robdclark@gmail.com
2022-11-17drm/msm/adreno: Simplify read64/write64 helpersRob Clark
The _HI reg is always following the _LO reg, so no need to pass these offsets seprately. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/511581/ Link: https://lore.kernel.org/r/20221114193049.1533391-2-robdclark@gmail.com
2022-11-17drm/msm: Enable clamp_to_idle for 7c3Rob Clark
This was overlooked. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/511693/ Link: https://lore.kernel.org/r/20221115155535.1615278-1-robdclark@gmail.com
2022-11-17drm/msm/a6xx: Fix speed-bin detection vs probe-deferRob Clark
If we get an error (other than -ENOENT) we need to propagate that up the stack. Otherwise if the nvmem driver hasn't probed yet, we'll end up end up claiming that we support all the OPPs which is not likely to be true (and on some generations impossible to be true, ie. if there are conflicting OPPs). v2: Update commit msg, gc unused label, etc v3: Add previously missing \n's Fixes: fe7952c629da ("drm/msm: Add speed-bin support to a618 gpu") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/511690/ Link: https://lore.kernel.org/r/20221115154637.1613968-1-robdclark@gmail.com
2022-11-03drm/msm: remove duplicated code from a6xx_create_address_spaceDmitry Baryshkov
The function a6xx_create_address_space() is mostly a copy of adreno_iommu_create_address_space() with added quirk setting. Rework these two functions to be a thin wrappers around a common helper. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/509614/ Link: https://lore.kernel.org/r/20221102175449.452283-3-dmitry.baryshkov@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2022-11-03drm/msm: move domain allocation into msm_iommu_new()Dmitry Baryshkov
After the msm_iommu instance is created, the IOMMU domain is completely handled inside the msm_iommu code. Move the iommu_domain_alloc() call into the msm_iommu_new() to simplify callers code. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/509615/ Link: https://lore.kernel.org/r/20221102175449.452283-2-dmitry.baryshkov@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2022-10-14drm/msm/a6xx: Remove state objects from list before freeingRob Clark
Technically it worked as it was before, only because it was using the _safe version of the iterator. But it is sloppy practice to leave dangling pointers. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/507017/ Link: https://lore.kernel.org/r/20221013225520.371226-4-robdclark@gmail.com
2022-10-14drm/msm/a6xx: Skip snapshotting unused GMU buffersRob Clark
Some buffers are unused on certain sub-generations of a6xx. So just skip them. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/507013/ Link: https://lore.kernel.org/r/20221013225520.371226-3-robdclark@gmail.com
2022-10-14drm/msm/a6xx: Fix kvzalloc vs state_kcalloc usageRob Clark
adreno_show_object() is a trap! It will re-allocate the pointer it is passed on first call, when the data is ascii85 encoded, using kvmalloc/ kvfree(). Which means the data *passed* to it must be kvmalloc'd, ie. we cannot use the state_kcalloc() helper. This partially reverts commit ec8f1813bf8d ("drm/msm/a6xx: Replace kcalloc() with kvzalloc()"), but adds the missing kvfree() to fix the memory leak that was present previously. And adds a warning comment. Fixes: ec8f1813bf8d ("drm/msm/a6xx: Replace kcalloc() with kvzalloc()") Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/20 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/507014/ Link: https://lore.kernel.org/r/20221013225520.371226-2-robdclark@gmail.com
2022-09-30drm/msm/gpu: Fix crash during system suspend after unbindAkhil P Oommen
In adreno_unbind, we should clean up gpu device's drvdata to avoid accessing a stale pointer during system suspend. Also, check for NULL ptr in both system suspend/resume callbacks. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/505075/ Link: https://lore.kernel.org/r/20220928124830.2.I5ee0ac073ccdeb81961e5ec0cce5f741a7207a71@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-09-30drm/msm/a6xx: Replace kcalloc() with kvzalloc()Akhil P Oommen
In order to reduce chance of allocation failure while capturing a6xx gpu state, use kvzalloc() instead of kcalloc() in state_kcalloc(). Indirectly, this patch helps to fix leaking memory allocated for gmu_debug object. Fixes: b859f9b009b (drm/msm/gpu: Snapshot GMU debug buffer) Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/505074/ Link: https://lore.kernel.org/r/20220928124830.1.I8ea24a8d586b4978823b848adde000f92f74d5c2@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-09-01drm/msm/a6xx: Handle GMU prepare-slumber hfi failureAkhil P Oommen
When prepare-slumber hfi fails, we should follow a6xx_gmu_force_off() sequence. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/498401/ Link: https://lore.kernel.org/r/20220819015030.v5.7.I54815c7c36b80d4725cd054e536365250454452f@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-08-28drm/msm/a6xx: Improve gpu recovery sequenceAkhil P Oommen
We can do a few more things to improve our chance at a successful gpu recovery, especially during a hangcheck timeout: 1. Halt CP and GMU core 2. Do RBBM GBIF HALT sequence 3. Do a soft reset of GPU core Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/498400/ Link: https://lore.kernel.org/r/20220819015030.v5.6.Idf2ba51078e87ae7ceb75cc77a5bd4ff2bd31eab@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-08-28drm/msm/a6xx: Ensure CX collapse during gpu recoveryAkhil P Oommen
Because there could be transient votes from other drivers/tz/hyp which may keep the cx gdsc enabled, we should poll until cx gdsc collapses. We can use the reset framework to poll for cx gdsc collapse from gpucc clk driver. This feature requires support from the platform's gpucc driver. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de> Patchwork: https://patchwork.freedesktop.org/patch/498397/ Link: https://lore.kernel.org/r/20220819015030.v5.5.I176567525af2b9439a7e485d0ca130528666a55c@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-08-28drm/msm: Fix cx collapse issue during recoveryAkhil P Oommen
There are some hardware logic under CX domain. For a successful recovery, we should ensure cx headswitch collapses to ensure all the stale states are cleard out. This is especially true to for a6xx family where we can GMU co-processor. Currently, cx doesn't collapse due to a devlink between gpu and its smmu. So the *struct gpu device* needs to be runtime suspended to ensure that the iommu driver removes its vote on cx gdsc. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/498398/ Link: https://lore.kernel.org/r/20220819015030.v5.4.I4ac27a0b34ea796ce0f938bb509e257516bc6f57@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-08-28drm/msm: De-open-code some CP_EVENT_WRITERob Clark
Replace some open coding to improve readability. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/499272/ Link: https://lore.kernel.org/r/20220821155441.1092134-1-robdclark@gmail.com
2022-07-06drm/msm/adreno: Defer enabling runpm until hw_init()Rob Clark
To avoid preventing the display from coming up before the rootfs is mounted, without resorting to packing fw in the initrd, the GPU has this limbo state where the device is probed, but we aren't ready to start sending commands to it. This is particularly problematic for a6xx, since the GMU (which requires fw to be loaded) is the one that is controlling the power/clk/icc votes. So defer enabling runpm until we are ready to call gpu->hw_init(), as that is a point where we know we have all the needed fw and are ready to start sending commands to the coproc's. Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/489337/ Link: https://lore.kernel.org/r/20220613182036.2567963-1-robdclark@gmail.com
2022-07-06drm/msm/adreno: Do not propagate void return valuesGeert Uytterhoeven
With sparse ("make C=2"), lots of error: return expression in void function messages are seen. Fix this by removing the return statements to propagate void return values. Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Guenter Roeck <linux@roeck-us.net> Patchwork: https://patchwork.freedesktop.org/patch/492529/ Link: https://lore.kernel.org/r/0083bc7e23753c19902580b902582ae499b44dbf.1657113388.git.geert@linux-m68k.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm/gpu: Add GEM debug label to devcoreRob Clark
When trying to understand an iova fault devcore, once you figure out which buffer we accessed beyond the end of, it is useful to see the buffer's debug label. Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/491910/ Link: https://lore.kernel.org/r/20220629211919.563585-3-robdclark@gmail.com
2022-07-06drm/msm: Fix %d vs %uRob Clark
In debugging fence rollover, I noticed that GPU state capture and devcore dumps were showing me negative fence numbers. Let's fix that and some related signed vs unsigned confusion. Signed-off-by: Rob Clark <robdclark@chromium.org> Patchwork: https://patchwork.freedesktop.org/patch/489621/ Link: https://lore.kernel.org/r/20220615163532.3013035-1-robdclark@gmail.com
2022-07-06drm/msm/adreno: Allow larger address space sizeRob Clark
The restriction to 4G was strictly to work around 64b math bug in some versions of SQE firmware. This appears to be fixed in a650+ SQE fw, so allow a larger address space size on these devices. Also, add a modparam override for debugging and igt. v2: Send the right version of the patch (ie. the one that actually compiles) Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/487601/ Link: https://lore.kernel.org/r/20220529180428.2577832-1-robdclark@gmail.com
2022-07-06drm/msm/adreno: Fix up formattingKonrad Dybcio
Leading spaces are not something checkpatch likes, and it says so when they are present. Use tabs consistently to indent function body and unwrap a 83-char-long line, as 100 is cool nowadays. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/487592/ Link: https://lore.kernel.org/r/20220528160353.157870-4-konrad.dybcio@somainline.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm/a6xx: Add speedbin support for A619 GPUKonrad Dybcio
There are various SKUs of A619, ranging from 565 MHz to 850 MHz, depending on the bin. Add support for distinguishing them, so that proper frequency ranges can be applied, depending on the HW. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/487590/ Link: https://lore.kernel.org/r/20220528160353.157870-3-konrad.dybcio@somainline.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm/adreno: Add A619 supportKonrad Dybcio
Add support for the Adreno 619 GPU, as found in Snapdragon 690 (SM6350), 480 (SM4350) and 750G (SM7225). Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/487588/ Link: https://lore.kernel.org/r/20220528160353.157870-2-konrad.dybcio@somainline.org Signed-off-by: Rob Clark <robdclark@chromium.org>