summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/msm/adreno
AgeCommit message (Collapse)Author
2022-07-06drm/msm/adreno: Remove dead codeKonrad Dybcio
This BUG_ON will never be reached, and there is a comment 20 above explaining why. Signed-off-by: Konrad Dybcio <konrad.dybcio@somainline.org> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/487586/ Link: https://lore.kernel.org/r/20220528160353.157870-1-konrad.dybcio@somainline.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-07-06drm/msm: Avoid unclocked GMU register access in 6xx gpu_busyDouglas Anderson
From testing on sc7180-trogdor devices, reading the GMU registers needs the GMU clocks to be enabled. Those clocks get turned on in a6xx_gmu_resume(). Confusingly enough, that function is called as a result of the runtime_pm of the GPU "struct device", not the GMU "struct device". Unfortunately the current a6xx_gpu_busy() grabs a reference to the GMU's "struct device". The fact that we were grabbing the wrong reference was easily seen to cause crashes that happen if we change the GPU's pm_runtime usage to not use autosuspend. It's also believed to cause some long tail GPU crashes even with autosuspend. We could look at changing it so that we do pm_runtime_get_if_in_use() on the GPU's "struct device", but then we run into a different problem. pm_runtime_get_if_in_use() will return 0 for the GPU's "struct device" the whole time when we're in the "autosuspend delay". That is, when we drop the last reference to the GPU but we're waiting a period before actually suspending then we'll think the GPU is off. One reason that's bad is that if the GPU didn't actually turn off then the cycle counter doesn't lose state and that throws off all of our calculations. Let's change the code to keep track of the suspend state of devfreq. msm_devfreq_suspend() is always called before we actually suspend the GPU and msm_devfreq_resume() after we resume it. This means we can use the suspended state to know if we're powered or not. NOTE: one might wonder when exactly our status function is called when devfreq is supposed to be disabled. The stack crawl I captured was: msm_devfreq_get_dev_status devfreq_simple_ondemand_func devfreq_update_target qos_notifier_call qos_max_notifier_call blocking_notifier_call_chain pm_qos_update_target freq_qos_apply apply_constraint __dev_pm_qos_update_request dev_pm_qos_update_request msm_devfreq_idle_work Fixes: eadf79286a4b ("drm/msm: Check for powered down HW in the devfreq callbacks") Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Patchwork: https://patchwork.freedesktop.org/patch/489124/ Link: https://lore.kernel.org/r/20220610124639.v4.1.Ie846c5352bc307ee4248d7cab998ab3016b85d06@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-28Merge tag 'drm-msm-fixes-2022-06-28' into msm-next-stagingRob Clark
Merge v5.19 fixes to avoid merge conflicts Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-06-18drm/msm: Don't overwrite hw fence in hw_initRob Clark
Prior to the last commit, this could result in setting the GPU written fence value back to an older value, if we had missed updating completed_fence prior to suspend. This was mostly harmless as the GPU would eventually overwrite it again with the correct value. But we should just not do this. Instead just leave a sanity check that the fence looks plausible (in case the GPU scribbled on memory). Reported-by: Steev Klimaszewski <steev@kali.org> Fixes: 95d1deb02a9c ("drm/msm/gem: Add fenced vma unpin") Signed-off-by: Rob Clark <robdclark@chromium.org> Tested-by: Steev Klimaszewski <steev@kali.org> Patchwork: https://patchwork.freedesktop.org/patch/490138/ Link: https://lore.kernel.org/r/20220618161120.3451993-2-robdclark@gmail.com
2022-06-07drm/msm: Fix double pm_runtime_disable() callMaximilian Luz
Following commit 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}"), any call to adreno_unbind() will disable runtime PM twice, as indicated by the call trees below: adreno_unbind() -> pm_runtime_force_suspend() -> pm_runtime_disable() adreno_unbind() -> gpu->funcs->destroy() [= aNxx_destroy()] -> adreno_gpu_cleanup() -> pm_runtime_disable() Note that pm_runtime_force_suspend() is called right before gpu->funcs->destroy() and both functions are called unconditionally. With recent addition of the eDP AUX bus code, this problem manifests itself when the eDP panel cannot be found yet and probing is deferred. On the first probe attempt, we disable runtime PM twice as described above. This then causes any later probe attempt to fail with [drm:adreno_load_gpu [msm]] *ERROR* Couldn't power up the GPU: -13 preventing the driver from loading. As there seem to be scenarios where the aNxx_destroy() functions are not called from adreno_unbind(), simply removing pm_runtime_disable() from inside adreno_unbind() does not seem to be the proper fix. This is what commit 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}") intended to fix. Therefore, instead check whether runtime PM is still enabled, and only disable it in that case. Fixes: 17e822f7591f ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}") Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com> Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: Rob Clark <robdclark@gmail.com> Link: https://lore.kernel.org/r/20220606211305.189585-1-luzmaximilian@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-05-20Merge tag 'msm-next-5.19-fixes' of ↵Dave Airlie
https://gitlab.freedesktop.org/abhinavk/msm into drm-next 5.19 fixes for msm-next - Limiting WB modes to max sspp linewidth - Fixing the supported rotations to add 180 back for IGT - Fix to handle pm_runtime_get_sync() errors to avoid unclocked access in the bind() path for dpu driver - Fix the irq_free() without request issue which was a big-time hitter in the CI-runs. Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Signed-off-by: Dave Airlie <airlied@redhat.com> From: Abhinav Kumar <quic_abhinavk@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/b011d51d-d634-123e-bf5f-27219ee33151@quicinc.com
2022-05-18drm/msm/a6xx: Fix refcount leak in a6xx_gpu_initMiaoqian Lin
of_parse_phandle() returns a node pointer with refcount incremented, we should use of_node_put() on it when not need anymore. a6xx_gmu_init() passes the node to of_find_device_by_node() and of_dma_configure(), of_find_device_by_node() will takes its reference, of_dma_configure() doesn't need the node after usage. Add missing of_node_put() to avoid refcount leak. Fixes: 4b565ca5a2cb ("drm/msm: Add A6XX device support") Signed-off-by: Miaoqian Lin <linmq006@gmail.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Link: https://lore.kernel.org/r/20220512121955.56937-1-linmq006@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-05-11Merge tag 'drm-msm-next-2022-05-09' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-next - Fourcc modifier for tiled but not compressed layouts - Support for userspace allocated IOVA (GPU virtual address) - Devfreq clamp_to_idle fix - DPU: DSC (Display Stream Compression) support - DPU: inline rotation support on SC7280 - DPU: update DP timings to follow vendor recommendations - DP, DPU: add support for wide bus (on newer chipsets) - DP: eDP support - Merge DPU1 and MDP5 MDSS driver, make dpu/mdp device the master component - MDSS: optionally reset the IP block at the bootup to drop bootloader state - Properly register and unregister internal bridges in the DRM framework - Complete DPU IRQ cleanup - DP: conversion to use drm_bridge and drm_bridge_connector - eDP: drop old eDP parts again - DPU: writeback support - Misc small fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGvJCr_1D8d0dgmyQC5HD4gmXeZw=bFV_CNCfceZbpMxRw@mail.gmail.com
2022-05-02drm/msm: Fix null pointer dereferences without iommuLuca Weiss
Check if 'aspace' is set before using it as it will stay null without IOMMU, such as on msm8974. Fixes: bc2112583a0b ("drm/msm/gpu: Track global faults per address-space") Signed-off-by: Luca Weiss <luca@z3ntu.xyz> Link: https://lore.kernel.org/r/20220421203455.313523-1-luca@z3ntu.xyz Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-21drm/msm: simplify gpu_busy callbackChia-I Wu
Move tracking and busy time calculation to msm_devfreq_get_dev_status. Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Cc: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220416003314.59211-2-olvaffe@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-21drm/msm: Add a way for userspace to allocate GPU iovaRob Clark
The motivation at this point is mainly native userspace mesa driver in a VM guest. The one remaining synchronous "hotpath" is buffer allocation, because guest needs to wait to know the bo's iova before it can start emitting cmdstream/state that references the new bo. By allocating the iova in the guest userspace, we no longer need to wait for a response from the host, but can just rely on the allocation request being processed before the cmdstream submission. Allocation failures (OoM, etc) would just be treated as context-lost (ie. GL_GUILTY_CONTEXT_RESET) or subsequent allocations (or readpix, etc) can raise GL_OUT_OF_MEMORY. v2: Fix inuse check v3: Change mismatched iova case to -EBUSY Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Link: https://lore.kernel.org/r/20220411215849.297838-11-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-21drm/msm/gem: Drop PAGE_SHIFT for address space mmRob Clark
Get rid of all the unnecessary conversion between address/size and page offsets. It just confuses things. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220411215849.297838-6-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-21drm/msm/gpu: Drop duplicate fence counterRob Clark
The ring seqno counter duplicates the fence-context last_fence counter. They end up getting incremented in lock-step, on the same scheduler thread, but the split just makes things less obvious. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220411215849.297838-3-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-21drm/msm: Add a way to override processes comm/cmdlineRob Clark
In the cause of using the GPU via virtgpu, the host side process is really a sort of proxy, and not terribly interesting from the PoV of crash/fault logging. Add a way to override these per process so that we can see the guest process's name. v2: Handle kmalloc failure, add comment to explain kstrdup returns NULL if passed NULL [Dan Carpenter] Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220317165144.222101-4-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-21drm/msm: Add support for pointer paramsRob Clark
The 64b value field is already suffient to hold a pointer instead of immediate, but we also need a length field. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220317165144.222101-2-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-11drm/msm/gpu: Avoid -Wunused-function with !CONFIG_PM_SLEEPNathan Chancellor
When building with CONFIG_PM=y and CONFIG_PM_SLEEP=n (such as ARCH=riscv allmodconfig), the following warnings/errors occur: drivers/gpu/drm/msm/adreno/adreno_device.c:679:12: error: 'adreno_system_resume' defined but not used [-Werror=unused-function] 679 | static int adreno_system_resume(struct device *dev) | ^~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/msm/adreno/adreno_device.c:655:12: error: 'adreno_system_suspend' defined but not used [-Werror=unused-function] 655 | static int adreno_system_suspend(struct device *dev) | ^~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors These functions are only used in SET_SYSTEM_SLEEP_PM_OPS(), which evaluates to empty when CONFIG_PM_SLEEP is not set, making these functions unused. To resolve this, use the SYSTEM_SLEEP_PM_OPS() and RUNTIME_PM_OPS() macros, which were introduced in commit 1a3c7bb08826 ("PM: core: Add new *_PM_OPS macros, deprecate old ones"). They are designed to avoid these compiler warnings while still guarding their use on CONFIG_PM{,_SLEEP}=y. Fixes: 7e4167c9e021 ("drm/msm/gpu: Park scheduler threads for system suspend") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Link: https://lore.kernel.org/r/20220411181249.2758344-1-nathan@kernel.org Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-04-11drm/msm: Fix range size vs end confusionRob Clark
The fourth param is size, rather than range_end. Note that we could increase the address space size if we had a way to prevent buffers from spanning a 4G split, mostly just to avoid fw bugs with 64b math. Fixes: 84c31ee16f90 ("drm/msm/a6xx: Add support for per-instance pagetables") Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220407202836.1211268-1-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-03-24drm/msm/gpu: Remove mutex from wait_event conditionRob Clark
The mutex wasn't really protecting anything before. Before the previous patch we could still be racing with the scheduler's kthread, as that is not necessarily frozen yet. Now that we've parked the sched threads, the only race is with jobs retiring, and that is harmless, ie. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220310234611.424743-4-robdclark@gmail.com
2022-03-24drm/msm/gpu: Park scheduler threads for system suspendRob Clark
In the system suspend path, we don't want to be racing with the scheduler kthreads pushing additional queued up jobs to the hw queue (ringbuffer). So park them first. While we are at it, move the wait for active jobs to complete into the new system- suspend path. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220310234611.424743-3-robdclark@gmail.com
2022-03-24drm/msm/gpu: Rename runtime suspend/resume functionsRob Clark
Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220310234611.424743-2-robdclark@gmail.com
2022-03-08drm/msm/adreno: fix cast in adreno_get_param()Dan Carpenter
These casts need to happen before the shift. The only time it would matter would be if "rev.core" is >= 128. In that case the sign bit would be extended and we do not want that. Fixes: afab9d91d872 ("drm/msm/adreno: Expose speedbin to userspace") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Link: https://lore.kernel.org/r/20220307133105.GA17534@kili Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-03-05drm/msm/a6xx: Fix missing ARRAY_SIZE() checkRob Clark
Fixes: f6d62d091cfd ("drm/msm/a6xx: add support for Adreno 660 GPU") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220305173405.914989-1-robdclark@gmail.com
2022-03-04drm/msm/a6xx: Zap counters across context switchRob Clark
Any app controlled perfcntr collection (GL_AMD_performance_monitor, etc) does not require counters to maintain state across context switches. So clear them if systemwide profiling is not active. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220304005317.776110-5-robdclark@gmail.com
2022-03-04drm/msm: Add SYSPROF param (v2)Rob Clark
Add a SYSPROF param for system profiling tools like Mesa's pps-producer (perfetto) to control behavior related to system-wide performance counter collection. In particular, for profiling, one wants to ensure that GPU context switches do not effect perfcounter state, and might want to suppress suspend (which would cause counters to lose state). v2: Swap the order in msm_file_private_set_sysprof() [sboyd] and initialize the sysprof_active refcount to one (because the under/ overflow checking in refcount_t doesn't expect a 0->1 transition) meaning that values greater than 1 means sysprof is active. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220304005317.776110-4-robdclark@gmail.com
2022-03-04drm/msm: Add SET_PARAM ioctlRob Clark
It was always expected to have a use for this some day, so we left a placeholder. Now we do. (And I expect another use in the not too distant future when we start allowing userspace to allocate GPU iova.) Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220304005317.776110-3-robdclark@gmail.com
2022-03-04drm/msm: Update generated headersRob Clark
Update headers from mesa commit: commit 7e63fa2bb13cf14b765ad06d046789ee1879b5ef Author: Rob Clark <robclark@freedesktop.org> AuthorDate: Wed Mar 2 17:11:10 2022 -0800 freedreno/registers: Add a couple regs we need for kernel Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15221> Signed-off-by: Rob Clark <robdclark@chromium.org> [for display bits:] Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com> Link: https://lore.kernel.org/r/20220304005317.776110-2-robdclark@gmail.com
2022-02-25drm/msm/adreno: Expose speedbin to userspaceAkhil P Oommen
Expose speedbin through MSM_PARAM_CHIP_ID parameter to help userspace identify the sku. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Link: https://lore.kernel.org/r/20220226005021.v2.4.I86c32730e08cba9e5c83f02ec17885124d45fa56@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-02-25drm/msm/a6xx: Add support for 7c3 SKUsAkhil P Oommen
Add support for 7c3 SKU detection using speedbin fuse. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Link: https://lore.kernel.org/r/20220226005021.v2.3.I6e89c014eb17f090f716fba662bdd33073920804@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-02-25drm/msm/adreno: Generate name from chipid for 7c3Akhil P Oommen
Use a gpu name which is sprintf'ed from the chipid for 7c3 gpu instead of hardcoding one. This helps to avoid code churn in case of a gpu rename. Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Link: https://lore.kernel.org/r/20220226005021.v2.2.I9436e0e300f76b2e6c34136a0b902e8cfd73e0d6@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-02-20drm/msm/gpu: Track global faults per address-spaceRob Clark
Other processes don't need to know about faults that they are isolated from by virtue of address space isolation. They are only interested in whether some of their state might have been corrupted. But to be safe, also track unattributed faults. This case should really never happen unless there is a kernel bug (and that would never happen, right?) v2: Instead of adding a new param, just change the behavior of the existing param to match what userspace actually wants [anholt] Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5934 Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220201161618.778455-3-robdclark@gmail.com Reviewed-by: Emma Anholt <emma@anholt.net> Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-02-20drm/msm/gpu: Add ctx to get_param()Rob Clark
Prep work for next patch. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20220201161618.778455-2-robdclark@gmail.com Reviewed-by: Emma Anholt <emma@anholt.net> Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-02-18drm/msm: drop dbgname argument from msm_ioremap*()Dmitry Baryshkov
msm_ioremap() functions take additional argument dbgname which is now unused. Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Reviewed-by: Stephen Boyd <swboyd@chromium.org> Link: https://lore.kernel.org/r/20220105232700.444170-2-dmitry.baryshkov@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2022-01-25drm/msm/gpu: Wait for idle before suspendingRob Clark
System suspend uses pm_runtime_force_suspend(), which cheekily bypasses the runpm reference counts. This doesn't actually work so well when the GPU is active. So add a reasonable delay waiting for the GPU to become idle. Alternatively we could just return -EBUSY in this case, but that has the disadvantage of causing system suspend to fail. v2: s/ret/remaining [sboyd], and switch to using active_submits count to ensure we aren't racing with submit cleanup (and devfreq idle work getting scheduled, etc) v3: fix inverted logic Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20220108180913.814448-2-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2022-01-25drm/msm/a6xx: Add missing suspend_count incrementRob Clark
Reported-by: Danylo Piliaiev <dpiliaiev@igalia.com> Fixes: 3ab1c5cc3939 ("drm/msm: Add param for userspace to query suspend count") Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20220113163215.215367-1-robdclark@gmail.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
2021-12-17drm/msm/a5xx: Fix missing CP_PROTECT for SMMU on A540Vladimir Lypak
A CP_PROTECT entry for SMMU registers is missing for A540. According to downstream sources its length is same as on A530 - 0x20000 bytes. On all other revisions SMMU region length is 0x10000 bytes. Despite this, we setup region of length 0x20000 on all revisions. This doesn't cause any issues on those GPUs. As for preventing accesses to the region from protected mode it was tested to work the same. This patch drops the "if" condition in setup of CP_PROTECT entry because it already includes all supported revisions except A540. Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com> Link: https://lore.kernel.org/r/20211212160333.980343-2-vladimir.lypak@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-12-17drm/msm/a5xx: Add support for Adreno 506 GPUVladimir Lypak
This GPU is found on SoCs such as MSM8953 (650 MHz), SDM450 (600 MHz), SDM632 (725 MHz). Signed-off-by: Vladimir Lypak <vladimir.lypak@gmail.com> Link: https://lore.kernel.org/r/20211212160333.980343-1-vladimir.lypak@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-12-13drm/msm/a6xx: Skip crashdumper state if GPU needs_hw_initRob Clark
I am seeing some crash logs which imply that we are trying to use crashdumper hw to read back GPU state when the GPU isn't initialized. This doesn't go well (for example, GPU could be in 32b address mode and ignoring the upper bits of buffer that it is trying to dump state to). I'm not *quite* sure how we get into this state in the first place, but lets not make a bad situation worse by triggering iova fault crashes. While we're at it, also add the information about whether the GPU is initialized to the devcore dump to make this easier to see in the logs (which makes the WARN_ON() redundant and even harmful because it fills up the small bit of dmesg we get with the crash report). Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20211209193118.1163248-1-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-12-06drm/msm: Allocate msm_drm_private early and pass it as driver dataAngeloGioacchino Del Regno
In preparation for registering the mdss interrupt controller earlier, move the allocation of msm_drm_private from component bind time to msm_drv probe; this also allows us to use the devm variant of kzalloc. Since it is not right to allocate the drm_device at probe time (as it should exist only when all components are bound, and taken down when components get cleaned up), the only way to make this happen is to pass a pointer to msm_drm_private as driver data (like done in many other DRM drivers), instead of one to drm_device like it's currently done in this driver. This is also simplifying some bind/unbind functions around drm/msm, as some of them are using drm_device just to grab a pointer to the msm_drm_private structure, which we now retrieve in one call. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://lore.kernel.org/r/20211201105210.24970-2-angelogioacchino.delregno@collabora.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-29drm/msm/gpu: Name GMU bosRob Clark
Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20211124214151.1427022-5-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-29drm/msm/gpu: Add a comment in a6xx_gmu_init()Rob Clark
If you don't realize is_a650_family() also encompasses a660 family, you'd think that the debug buffer is double allocated. Add a comment to make this more clear. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20211124214151.1427022-11-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-29drm/msm/gpu: Snapshot GMU debug bufferRob Clark
It appears to be a GMU fw build option whether it does anything with debug and log buffers, but if they are all zeros it won't add anything to the devcore size. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20211124214151.1427022-10-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-29drm/msm/gpu: Also snapshot GMU HFI bufferRob Clark
This also includes a history of start index of the last 8 messages on each queue, since parsing backwards to decode recently sent HFI messages is hard(ish). Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20211124214151.1427022-9-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-29drm/msm/gpu: Make a6xx_get_gmu_log() more genericRob Clark
Turn it into a thing we can use to snapshot other GMU buffers. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20211124214151.1427022-8-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-29drm/msm/gpu: Add some WARN_ON()sRob Clark
We don't expect either of these conditions to ever be true, so let's get shouty if they are. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20211124214151.1427022-6-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-28drm/msm/a6xx: Capture gmu log in devcoredumpAkhil P Oommen
Capture gmu log in coredump to enhance debugging. Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org> Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20211124214151.1427022-2-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-28drm/msm/adreno: Name the shadow bufferRob Clark
This was the one GPU related kernel buffer which was not given a debug name. Let's fix that. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20211115191514.310472-1-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-28drm/msm: Add debugfs to disable hw err handlingRob Clark
Add a debugfs interface to ignore hw error irqs, in order to force fallback to sw hangcheck mechanism. Because the hw error detection is pretty good on newer gens, we need this for igt tests to test the sw hang detection. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org> Link: https://lore.kernel.org/r/20211109181117.591148-6-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-28drm/msm: Remove struct_mutex usageRob Clark
The remaining struct_mutex usage is just to serialize various gpu related things (submit/retire/recover/fault/etc), so replace struct_mutex with gpu->lock. Signed-off-by: Rob Clark <robdclark@chromium.org> Link: https://lore.kernel.org/r/20211109181117.591148-4-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-28drm/msm: Drop priv->lastctxRob Clark
cur_ctx_seqno already does the same thing, but handles the edge cases where a refcnt'd context can live after lastclose. So let's not have two ways to do the same thing. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Akhil P Oommen <akhilpo@codeaurora.org> Link: https://lore.kernel.org/r/20211109181117.591148-3-robdclark@gmail.com Signed-off-by: Rob Clark <robdclark@chromium.org>
2021-11-21drm/msm/a6xx: Fix uinitialized use of gpu_scidAkhil P Oommen
Avoid a possible uninitialized use of gpu_scid variable to fix the below smatch warning: drivers/gpu/drm/msm/adreno/a6xx_gpu.c:1480 a6xx_llc_activate() error: uninitialized symbol 'gpu_scid'. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Akhil P Oommen <akhilpo@codeaurora.org> Link: https://lore.kernel.org/r/20211118154903.3.Ie4ac321feb10168af569d9c2b4cf6828bed8122c@changeid Signed-off-by: Rob Clark <robdclark@chromium.org>