summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/panfrost/panfrost_job.c
AgeCommit message (Collapse)Author
2025-07-14drm/panfrost: Fix scheduler workqueue bugPhilipp Stanner
When the GPU scheduler was ported to using a struct for its initialization parameters, it was overlooked that panfrost creates a distinct workqueue for timeout handling. The pointer to this new workqueue is not initialized to the struct, resulting in NULL being passed to the scheduler, which then uses the system_wq for timeout handling. Set the correct workqueue to the init args struct. Cc: stable@vger.kernel.org # 6.15+ Fixes: 796a9f55a8d1 ("drm/sched: Use struct for drm_sched_init() params") Reported-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Closes: https://lore.kernel.org/dri-devel/b5d0921c-7cbf-4d55-aa47-c35cd7861c02@igalia.com/ Signed-off-by: Philipp Stanner <phasta@kernel.org> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250709102957.100849-2-phasta@kernel.org
2025-02-12drm/sched: Use struct for drm_sched_init() paramsPhilipp Stanner
drm_sched_init() has a great many parameters and upcoming new functionality for the scheduler might add even more. Generally, the great number of parameters reduces readability and has already caused one missnaming, addressed in: commit 6f1cacf4eba7 ("drm/nouveau: Improve variable name in nouveau_sched_init()"). Introduce a new struct for the scheduler init parameters and port all users. Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Acked-by: Matthew Brost <matthew.brost@intel.com> # for Xe Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> # for Panfrost and Panthor Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com> # for Etnaviv Reviewed-by: Frank Binns <frank.binns@imgtec.com> # for Imagination Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> # for Sched Reviewed-by: Maíra Canal <mcanal@igalia.com> # for v3d Reviewed-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> # for amdxdna Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250211111422.21235-2-phasta@kernel.org
2024-09-06drm/sched: add optional errno to drm_sched_start()Christian König
The current implementation of drm_sched_start uses a hardcoded -ECANCELED to dispose of a job when the parent/hw fence is NULL. This results in drm_sched_job_done being called with -ECANCELED for each job with a NULL parent in the pending list, making it difficult to distinguish between recovery methods, whether a queue reset or a full GPU reset was used. To improve this, we first try a soft recovery for timeout jobs and use the error code -ENODATA. If soft recovery fails, we proceed with a queue reset, where the error code remains -ENODATA for the job. Finally, for a full GPU reset, we use error codes -ECANCELED or -ETIME. This patch adds an error code parameter to drm_sched_start, allowing us to differentiate between queue reset and GPU reset failures. This enables user mode and test applications to validate the expected correctness of the requested operation. After a successful queue reset, the only way to continue normal operation is to call drm_sched_job_done with the specific error code -ENODATA. v1: Initial implementation by Jesse utilized amdgpu_device_lock_reset_domain and amdgpu_device_unlock_reset_domain to allow user mode to track the queue reset status and distinguish between queue reset and GPU reset. v2: Christian suggested using the error codes -ENODATA for queue reset and -ECANCELED or -ETIME for GPU reset, returned to amdgpu_cs_wait_ioctl. v3: To meet the requirements, we introduce a new function drm_sched_start_ex with an additional parameter to set dma_fence_set_error, allowing us to handle the specific error codes appropriately and dispose of bad jobs with the selected error code depending on whether it was a queue reset or GPU reset. v4: Alex suggested using a new name, drm_sched_start_with_recovery_error, which more accurately describes the function's purpose. Additionally, it was recommended to add documentation details about the new method. v5: Fixed declaration of new function drm_sched_start_with_recovery_error.(Alex) v6 (chk): rebase on upstream changes, cleanup the commit message, drop the new function again and update all callers, apply the errno also to scheduler fences with hw fences v7 (chk): rebased Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240826122541.85663-1-christian.koenig@amd.com
2024-09-02drm/panfrost: Add cycle counter job requirementMary Guillemard
Extend the uAPI with a new job requirement flag for cycle counters. This requirement is used by userland to indicate that a job requires cycle counters or system timestamp to be propagated. (for use with write value timestamp jobs) We cannot enable cycle counters unconditionally as this would result in an increase of GPU power consumption. As a result, they should be left off unless required by the application. If a job requires cycle counters or system timestamps propagation, we must enable cycle counting before issuing a job and disable it right after the job completes. Since this extends the uAPI and because userland needs a way to advertise features like VK_KHR_shader_clock conditionally, we bumps the driver minor version. v2: - Rework commit message - Squash uAPI changes and implementation in this commit - Simplify changes based on Steven Price comments v3: - Add Steven Price r-b - Fix a codestyle issue Signed-off-by: Mary Guillemard <mary.guillemard@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Tested-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240819080224.24914-3-mary.guillemard@collabora.com
2024-07-25drm/scheduler: remove full_recover from drm_sched_startChristian König
This was basically just another one of amdgpus hacks. The parameter allowed to restart the scheduler without turning fence signaling on again. That this is absolutely not a good idea should be obvious by now since the fences will then just sit there and never signal. While at it cleanup the code a bit. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240722083816.99685-1-christian.koenig@amd.com
2024-03-11drm/panfrost: Replace fdinfo's profiling debugfs knob with sysfsAdrián Larumbe
Debugfs isn't always available in production builds that try to squeeze every single byte out of the kernel image, but we still need a way to toggle the timestamp and cycle counter registers so that jobs can be profiled for fdinfo's drm engine and cycle calculations. Drop the debugfs knob and replace it with a sysfs file that accomplishes the same functionality, and document its ABI in a separate file. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240306015819.822128-2-adrian.larumbe@collabora.com
2023-12-05drm/panfrost: Synchronize and disable interrupts before powering offAngeloGioacchino Del Regno
To make sure that we don't unintentionally perform any unclocked and/or unpowered R/W operation on GPU registers, before turning off clocks and regulators we must make sure that no GPU, JOB or MMU ISR execution is pending: doing that requires to add a mechanism to synchronize the interrupts on suspend. Add functions panfrost_{gpu,job,mmu}_suspend_irq() which will perform interrupts masking and ISR execution synchronization, and then call those in the panfrost_device_runtime_suspend() handler in the exact sequence of job (may require mmu!) -> mmu -> gpu. As a side note, JOB and MMU suspend_irq functions needed some special treatment: as their interrupt handlers will unmask interrupts, it was necessary to add an `is_suspended` bitmap which is used to address the possible corner case of unintentional IRQ unmasking because of ISR execution after a call to synchronize_irq(). At resume, clear each is_suspended bit in the reset path of JOB/MMU to allow unmasking the interrupts. Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Tested-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231204114215.54575-4-angelogioacchino.delregno@collabora.com
2023-11-10drm/sched: implement dynamic job-flow controlDanilo Krummrich
Currently, job flow control is implemented simply by limiting the number of jobs in flight. Therefore, a scheduler is initialized with a credit limit that corresponds to the number of jobs which can be sent to the hardware. This implies that for each job, drivers need to account for the maximum job size possible in order to not overflow the ring buffer. However, there are drivers, such as Nouveau, where the job size has a rather large range. For such drivers it can easily happen that job submissions not even filling the ring by 1% can block subsequent submissions, which, in the worst case, can lead to the ring run dry. In order to overcome this issue, allow for tracking the actual job size instead of the number of jobs. Therefore, add a field to track a job's credit count, which represents the number of credits a job contributes to the scheduler's credit limit. Signed-off-by: Danilo Krummrich <dakr@redhat.com> Reviewed-by: Luben Tuikov <ltuikov89@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231110001638.71750-1-dakr@redhat.com
2023-11-01drm/sched: Convert drm scheduler to use a work queue rather than kthreadMatthew Brost
In Xe, the new Intel GPU driver, a choice has made to have a 1 to 1 mapping between a drm_gpu_scheduler and drm_sched_entity. At first this seems a bit odd but let us explain the reasoning below. 1. In Xe the submission order from multiple drm_sched_entity is not guaranteed to be the same completion even if targeting the same hardware engine. This is because in Xe we have a firmware scheduler, the GuC, which allowed to reorder, timeslice, and preempt submissions. If a using shared drm_gpu_scheduler across multiple drm_sched_entity, the TDR falls apart as the TDR expects submission order == completion order. Using a dedicated drm_gpu_scheduler per drm_sched_entity solve this problem. 2. In Xe submissions are done via programming a ring buffer (circular buffer), a drm_gpu_scheduler provides a limit on number of jobs, if the limit of number jobs is set to RING_SIZE / MAX_SIZE_PER_JOB we get flow control on the ring for free. A problem with this design is currently a drm_gpu_scheduler uses a kthread for submission / job cleanup. This doesn't scale if a large number of drm_gpu_scheduler are used. To work around the scaling issue, use a worker rather than kthread for submission / job cleanup. v2: - (Rob Clark) Fix msm build - Pass in run work queue v3: - (Boris) don't have loop in worker v4: - (Tvrtko) break out submit ready, stop, start helpers into own patch v5: - (Boris) default to ordered work queue v6: - (Luben / checkpatch) fix alignment in msm_ringbuffer.c - (Luben) s/drm_sched_submit_queue/drm_sched_wqueue_enqueue - (Luben) Update comment for drm_sched_wqueue_enqueue - (Luben) Positive check for submit_wq in drm_sched_init - (Luben) s/alloc_submit_wq/own_submit_wq v7: - (Luben) s/drm_sched_wqueue_enqueue/drm_sched_run_job_queue v8: - (Luben) Adjust var names / comments Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Link: https://lore.kernel.org/r/20231031032439.1558703-3-matthew.brost@intel.com Signed-off-by: Luben Tuikov <ltuikov89@gmail.com>
2023-10-26drm/sched: Convert the GPU scheduler to variable number of run-queuesLuben Tuikov
The GPU scheduler has now a variable number of run-queues, which are set up at drm_sched_init() time. This way, each driver announces how many run-queues it requires (supports) per each GPU scheduler it creates. Note, that run-queues correspond to scheduler "priorities", thus if the number of run-queues is set to 1 at drm_sched_init(), then that scheduler supports a single run-queue, i.e. single "priority". If a driver further sets a single entity per run-queue, then this creates a 1-to-1 correspondence between a scheduler and a scheduled entity. Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Clark <robdclark@gmail.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Danilo Krummrich <dakr@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Emma Anholt <emma@anholt.net> Cc: etnaviv@lists.freedesktop.org Cc: lima@lists.freedesktop.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20231023032251.164775-1-luben.tuikov@amd.com
2023-10-04drm/panfrost: Add fdinfo support GPU load metricsAdrián Larumbe
The drm-stats fdinfo tags made available to user space are drm-engine, drm-cycles, drm-max-freq and drm-curfreq, one per job slot. This deviates from standard practice in other DRM drivers, where a single set of key:value pairs is provided for the whole render engine. However, Panfrost has separate queues for fragment and vertex/tiler jobs, so a decision was made to calculate bus cycles and workload times separately. Maximum operating frequency is calculated at devfreq initialisation time. Current frequency is made available to user space because nvtop uses it when performing engine usage calculations. It is important to bear in mind that both GPU cycle and kernel time numbers provided are at best rough estimations, and always reported in excess from the actual figure because of two reasons: - Excess time because of the delay between the end of a job processing, the subsequent job IRQ and the actual time of the sample. - Time spent in the engine queue waiting for the GPU to pick up the next job. To avoid race conditions during enablement/disabling, a reference counting mechanism was introduced, and a job flag that tells us whether a given job increased the refcount. This is necessary, because user space can toggle cycle counting through a debugfs file, and a given job might have been in flight by the time cycle counting was disabled. The main goal of the debugfs cycle counter knob is letting tools like nvtop or IGT's gputop switch it at any time, to avoid power waste in case no engine usage measuring is necessary. Also add a documentation file explaining the possible values for fdinfo's engine keystrings and Panfrost-specific drm-curfreq-<keystr> pairs. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230929181616.2769345-3-adrian.larumbe@collabora.com
2023-08-21drm/panfrost: Do not check for 0 return after calling platform_get_irq_byname()Ruan Jinjie
It is not possible for platform_get_irq_byname() to return 0. Use the return value from platform_get_irq_byname(). Signed-off-by: Ruan Jinjie <ruanjinjie@huawei.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230803040401.3067484-2-ruanjinjie@huawei.com
2023-08-10drm/panfrost: Sync IRQ by job's timeout handlerDmitry Osipenko
Panfrost IRQ handler may stuck for a long time, for example this happens when there is a bad HDMI connection and HDMI handler takes a long time to finish processing, holding Panfrost. Make Panfrost's job timeout handler to sync IRQ before checking fence signal status in order to prevent spurious job timeouts due to a slow IRQ processing. Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Tested-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> # MediaTek MT8192 and MT8195 Chromebooks Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230807000444.14926-1-dmitry.osipenko@collabora.com
2022-08-08drm/panfrost: Add support for devcoredumpAdrián Larumbe
In the event of a job timeout, debug dump information will be written into /sys/class/devcoredump. Inspired by etnaviv's similar feature. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220729144610.2105223-3-adrian.larumbe@collabora.com
2022-06-10Merge tag 'drm-misc-fixes-2022-05-26' of ↵Dave Airlie
git://anongit.freedesktop.org/drm/drm-misc into drm-fixes A use-after-free fix for panfrost, and a DT invalid configuration fix for ti-sn65dsi83 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <maxime@cerno.tech> Link: https://patchwork.freedesktop.org/patch/msgid/20220526090532.nvhlmwev5qgln3nb@houat
2022-05-25drm/panfrost: Job should reference MMU not file_privSteven Price
For a while now it's been allowed for a MMU context to outlive it's corresponding panfrost_priv, however the job structure still references panfrost_priv to get hold of the MMU context. If panfrost_priv has been freed this is a use-after-free which I've been able to trigger resulting in a splat. To fix this, drop the reference to panfrost_priv in the job structure and add a direct reference to the MMU structure which is what's actually needed. Fixes: 7fdc48cc63a3 ("drm/panfrost: Make sure MMU context lifetime is not bound to panfrost_priv") Signed-off-by: Steven Price <steven.price@arm.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220519152003.81081-1-steven.price@arm.com
2022-04-07dma-buf: specify usage while adding fences to dma_resv obj v7Christian König
Instead of distingting between shared and exclusive fences specify the fence usage while adding fences. Rework all drivers to use this interface instead and deprecate the old one. v2: some kerneldoc comments suggested by Daniel v3: fix a missing case in radeon v4: rebase on nouveau changes, fix lockdep and temporary disable warning v5: more documentation updates v6: separate internal dma_resv changes from this patch, avoids to disable warning temporary, rebase on upstream changes v7: fix missed case in lima driver, minimize changes to i915_gem_busy_ioctl Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220407085946.744568-3-christian.koenig@amd.com
2022-04-06dma-buf/drivers: make reserving a shared slot mandatory v4Christian König
Audit all the users of dma_resv_add_excl_fence() and make sure they reserve a shared slot also when only trying to add an exclusive fence. This is the next step towards handling the exclusive fence like a shared one. v2: fix missed case in amdgpu v3: and two more radeon, rename function v4: add one more case to TTM, fix i915 after rebase Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220406075132.3263-2-christian.koenig@amd.com
2022-02-23drm/sched: Add device pointer to drm_gpu_schedulerJiawei Gu
Add device pointer so scheduler's printing can use DRM_DEV_ERROR() instead, which makes life easier under multiple GPU scenario. v2: amend all calls of drm_sched_init() v3: fill dev pointer for all drm_sched_init() calls Signed-off-by: Jiawei Gu <Jiawei.Gu@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220221095705.5290-1-Jiawei.Gu@amd.com
2021-08-30drm/panfrost: use scheduler dependency trackingDaniel Vetter
Just deletes some code that's now more shared. Note that thanks to the split into drm_sched_job_init/arm we can now easily pull the _init() part from under the submission lock way ahead where we're adding the sync file in-fences as dependencies. v2: Correctly clean up the partially set up job, now that job_init() and job_arm() are apart (Emma). v3: Rebased over renamed functions for adding depdencies Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Emma Anholt <emma@anholt.net> Reviewed-by: Steven Price <steven.price@arm.com> (v3) Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: Emma Anholt <emma@anholt.net> Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-8-daniel.vetter@ffwll.ch
2021-08-30drm/sched: drop entity parameter from drm_sched_push_jobDaniel Vetter
Originally a job was only bound to the queue when we pushed this, but now that's done in drm_sched_job_init, making that parameter entirely redundant. Remove it. The same applies to the context parameter in lima_sched_context_queue_task, simplify that too. v2: Rebase on top of msm adopting drm/sched Reviewed-by: Christian König <christian.koenig@amd.com> Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Steven Price <steven.price@arm.com> (v1) Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v1) Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Emma Anholt <emma@anholt.net> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Nirmoy Das <nirmoy.das@amd.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Chen Li <chenli@uniontech.com> Cc: Lee Jones <lee.jones@linaro.org> Cc: Deepak R Varma <mh12gx2825@gmail.com> Cc: Kevin Wang <kevin1.wang@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: "Marek Olšák" <marek.olsak@amd.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Dennis Li <Dennis.Li@amd.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: etnaviv@lists.freedesktop.org Cc: lima@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: Rob Clark <robdclark@gmail.com> Cc: Sean Paul <sean@poorly.run> Cc: Melissa Wen <mwen@igalia.com> Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20210805104705.862416-6-daniel.vetter@ffwll.ch
2021-08-30drm/sched: Split drm_sched_job_initDaniel Vetter
This is a very confusingly named function, because not just does it init an object, it arms it and provides a point of no return for pushing a job into the scheduler. It would be nice if that's a bit clearer in the interface. But the real reason is that I want to push the dependency tracking helpers into the scheduler code, and that means drm_sched_job_init must be called a lot earlier, without arming the job. v2: - don't change .gitignore (Steven) - don't forget v3d (Emma) v3: Emma noticed that I leak the memory allocated in drm_sched_job_init if we bail out before the point of no return in subsequent driver patches. To be able to fix this change drm_sched_job_cleanup() so it can handle being called both before and after drm_sched_job_arm(). Also improve the kerneldoc for this. v4: - Fix the drm_sched_job_cleanup logic, I inverted the booleans, as usual (Melissa) - Christian pointed out that drm_sched_entity_select_rq() also needs to be moved into drm_sched_job_arm, which made me realize that the job->id definitely needs to be moved too. Shuffle things to fit between job_init and job_arm. v5: Reshuffle the split between init/arm once more, amdgpu abuses drm_sched.ready to signal gpu reset failures. Also document this somewhat. (Christian) v6: Rebase on top of the msm drm/sched support. Note that the drm_sched_job_init() call is completely misplaced, and hence also the split-out drm_sched_entity_push_job(). I've put in a FIXME which the next patch will address. v7: Drop the FIXME in msm, after discussions with Rob I agree it shouldn't be a problem where it is now. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Melissa Wen <mwen@igalia.com> Cc: Melissa Wen <melissa.srw@gmail.com> Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Steven Price <steven.price@arm.com> (v2) Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (v5) Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Kees Cook <keescook@chromium.org> Cc: Adam Borowski <kilobyte@angband.pl> Cc: Nick Terrell <terrelln@fb.com> Cc: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Sami Tolvanen <samitolvanen@google.com> Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Nirmoy Das <nirmoy.das@amd.com> Cc: Deepak R Varma <mh12gx2825@gmail.com> Cc: Lee Jones <lee.jones@linaro.org> Cc: Kevin Wang <kevin1.wang@amd.com> Cc: Chen Li <chenli@uniontech.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: "Marek Olšák" <marek.olsak@amd.com> Cc: Dennis Li <Dennis.Li@amd.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Cc: Sonny Jiang <sonny.jiang@amd.com> Cc: Boris Brezillon <boris.brezillon@collabora.com> Cc: Tian Tao <tiantao6@hisilicon.com> Cc: etnaviv@lists.freedesktop.org Cc: lima@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Cc: Emma Anholt <emma@anholt.net> Cc: Rob Clark <robdclark@gmail.com> Cc: Sean Paul <sean@poorly.run> Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Link: https://patchwork.freedesktop.org/patch/msgid/20210817084917.3555822-1-daniel.vetter@ffwll.ch
2021-08-26drm/panfrost: Use upper/lower_32_bits helpersAlyssa Rosenzweig
Use upper_32_bits/lower_32_bits helpers instead of open-coding them. This is easier to scan quickly compared to bitwise manipulation, and it is pleasingly symmetric. I noticed this when debugging lock_region, which had a particularly "creative" way of writing upper_32_bits. v2: Use helpers for one more call site and add review tag (Steven). Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Rob Herring <robh@kernel.org> (v1) Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210825153348.4980-1-alyssa.rosenzweig@collabora.com
2021-07-01drm/panfrost: Queue jobs on the hardwareSteven Price
The hardware has a set of '_NEXT' registers that can hold a second job while the first is executing. Make use of these registers to enqueue a second job per slot. v5: * Fix a comment in panfrost_job_init() v3: * Fix the done/err job dequeuing logic to get a valid active state * Only enable the second slot on GPUs supporting jobchain disambiguation * Split interrupt handling in sub-functions Signed-off-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-16-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Kill in-flight jobs on FD closeBoris Brezillon
If the process who submitted these jobs decided to close the FD before the jobs are done it probably means it doesn't care about the result. v5: * Add a panfrost_exception_is_fault() helper and the DRM_PANFROST_EXCEPTION_MAX_NON_FAULT value v4: * Don't disable/restore irqs when taking the job_lock (not needed since this lock is never taken from an interrupt context) v3: * Set fence error to ECANCELED when a TERMINATED exception is received Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-15-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Don't reset the GPU on job faults unless we really have toBoris Brezillon
If we can recover from a fault without a reset there's no reason to issue one. v3: * Drop the mention of Valhall requiring a reset on JOB_BUS_FAULT * Set the fence error to -EINVAL instead of having per-exception error codes Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-14-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Make sure job interrupts are masked before resettingBoris Brezillon
This is not yet needed because we let active jobs be killed during by the reset and we don't really bother making sure they can be restarted. But once we start adding soft-stop support, controlling when we deal with the remaining interrrupts and making sure those are handled before the reset is issued gets tricky if we keep job interrupts active. Let's prepare for that and mask+flush job IRQs before issuing a reset. v4: * Add a comment explaining why we WARN_ON(!job) in the irq handler * Keep taking the job_lock when evicting stalled jobs v3: * New patch Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-11-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Simplify the reset serialization logicBoris Brezillon
Now that we can pass our own workqueue to drm_sched_init(), we can use an ordered workqueue on for both the scheduler timeout tdr and our own reset work (which we use when the reset is not caused by a fault/timeout on a specific job, like when we have AS_ACTIVE bit stuck). This guarantees that the timeout handlers and reset handler can't run concurrently which drastically simplifies the locking. v5: * Don't call cancel_delayed_timeout() in the reset path (those works are canceled in drm_sched_stop()) v4: * Actually pass the reset workqueue to drm_sched_init() * Don't call cancel_work_sync() in panfrost_reset(). It will deadlock since it might be called from the reset work, which is executing and cancel_work_sync() will wait for the handler to return. Checking the reset pending status should avoid spurious resets v3: * New patch Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-10-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Use a threaded IRQ for job interruptsBoris Brezillon
This should avoid switching to interrupt context when the GPU is under heavy use. v3: * Don't take the job_lock in panfrost_job_handle_irq() Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-9-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Expose a helper to trigger a GPU resetBoris Brezillon
Expose a helper to trigger a GPU reset so we can easily trigger reset operations outside the job timeout handler. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-8-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Drop the pfdev argument passed to panfrost_exception_name()Boris Brezillon
Currently unused. We'll add it back if we need per-GPU definitions. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-6-boris.brezillon@collabora.com
2021-07-01drm/panfrost: Make ->run_job() return an ERR_PTR() when appropriateBoris Brezillon
If the fence creation fail, we can return the error pointer directly. The core will update the fence error accordingly. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-4-boris.brezillon@collabora.com
2021-07-01drm/sched: Allow using a dedicated workqueue for the timeout/fault tdrBoris Brezillon
Mali Midgard/Bifrost GPUs have 3 hardware queues but only a global GPU reset. This leads to extra complexity when we need to synchronize timeout works with the reset work. One solution to address that is to have an ordered workqueue at the driver level that will be used by the different schedulers to queue their timeout work. Thanks to the serialization provided by the ordered workqueue we are guaranteed that timeout handlers are executed sequentially, and can thus easily reset the GPU from the timeout handler without extra synchronization. v5: * Add a new paragraph to the timedout_job() method v3: * New patch v4: * Actually use the timeout_wq to queue the timeout work Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Christian König <christian.koenig@amd.com> Cc: Qiang Yu <yuq825@gmail.com> Cc: Emma Anholt <emma@anholt.net> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210630062751.2832545-3-boris.brezillon@collabora.com
2021-06-24drm/panfrost: Make sure MMU context lifetime is not bound to panfrost_privBoris Brezillon
Jobs can be in-flight when the file descriptor is closed (either because the process did not terminate properly, or because it didn't wait for all GPU jobs to be finished), and apparently panfrost_job_close() does not cancel already running jobs. Let's refcount the MMU context object so it's lifetime is no longer bound to the FD lifetime and running jobs can finish properly without generating spurious page faults. Reported-by: Icecream95 <ixn@keemail.me> Fixes: 7282f7645d06 ("drm/panfrost: Implement per FD address spaces") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210621133907.1683899-2-boris.brezillon@collabora.com
2021-06-23drm/panfrost: Fix implicit syncDaniel Vetter
Currently this has no practial relevance I think because there's not many who can pull off a setup with panfrost and another gpu in the same system. But the rules are that if you're setting an exclusive fence, indicating a gpu write access in the implicit fencing system, then you need to wait for all fences, not just the previous exclusive fence. panfrost against itself has no problem, because it always sets the exclusive fence (but that's probably something that will need to be fixed for vulkan and/or multi-engine gpus, or you'll suffer badly). Also no problem with that against display. With the prep work done to switch over to the dependency helpers this is now a oneliner. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: "Christian König" <christian.koenig@amd.com> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20210622165511.3169559-7-daniel.vetter@ffwll.ch
2021-06-23drm/panfrost: Use xarray and helpers for depedency trackingDaniel Vetter
More consistency and prep work for the next patch. Aside: I wonder whether we shouldn't just move this entire xarray business into the scheduler so that not everyone has to reinvent the same wheels. Cc'ing some scheduler people for this too. v2: Correctly handle sched_lock since Lucas pointed out it's needed. v3: Rebase, dma_resv_get_excl_unlocked got renamed v4: Don't leak job references on failure (Steven). Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: "Christian König" <christian.koenig@amd.com> Cc: Luben Tuikov <luben.tuikov@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Lee Jones <lee.jones@linaro.org> Cc: Steven Price <steven.price@arm.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210622165511.3169559-6-daniel.vetter@ffwll.ch
2021-06-23drm/panfrost: Shrink sched_lockDaniel Vetter
drm/scheduler requires a lock between _init and _push_job, but the reservation lock dance doesn't. So shrink the critical section a notch. v2: Lucas pointed out how this should really work, I got it all wrong in v1. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Daniel Vetter <daniel.vetter@intel.com> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210622165511.3169559-5-daniel.vetter@ffwll.ch
2021-06-06dma-buf: rename dma_resv_get_excl_rcu to _unlockedChristian König
That describes much better what the function is doing here. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210602111714.212426-6-christian.koenig@amd.com
2021-02-05drm/scheduler: provide scheduler score externallyChristian König
Allow multiple schedulers to share the load balancing score. This is useful when one engine has different hw rings. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-and-Tested-by: Leo Liu <leo.liu@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210204144405.2737-1-christian.koenig@amd.com
2021-01-29drm/scheduler: Job timeout handler returns status (v3)Luben Tuikov
This patch does not change current behaviour. The driver's job timeout handler now returns status indicating back to the DRM layer whether the device (GPU) is no longer available, such as after it's been unplugged, or whether all is normal, i.e. current behaviour. All drivers which make use of the drm_sched_backend_ops' .timedout_job() callback have been accordingly renamed and return the would've-been default value of DRM_GPU_SCHED_STAT_NOMINAL to restart the task's timeout timer--this is the old behaviour, and is preserved by this patch. v2: Use enum as the status of a driver's job timeout callback method. v3: Return scheduler/device information, rather than task information. Cc: Alexander Deucher <Alexander.Deucher@amd.com> Cc: Andrey Grodzovsky <Andrey.Grodzovsky@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Lucas Stach <l.stach@pengutronix.de> Cc: Russell King <linux+etnaviv@armlinux.org.uk> Cc: Christian Gmeiner <christian.gmeiner@gmail.com> Cc: Qiang Yu <yuq825@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Eric Anholt <eric@anholt.net> Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Steven Price <steven.price@arm.com> Signed-off-by: Christian König <christian.koenig@amd.com> Link: https://patchwork.freedesktop.org/patch/415095/
2020-11-16drm/panfrost: Move the GPU reset bits outside the timeout handlerBoris Brezillon
We've fixed many races in panfrost_job_timedout() but some remain. Instead of trying to fix it again, let's simplify the logic and move the reset bits to a separate work scheduled when one of the queue reports a timeout. v5: - Simplify panfrost_scheduler_stop() (Steven Price) - Always restart the queue in panfrost_scheduler_start() even if the status is corrupted (Steven Price) v4: - Rework the logic to prevent a race between drm_sched_start() (reset work) and drm_sched_job_timedout() (timeout work) - Drop Steven's R-b - Add dma_fence annotation to the panfrost_reset() function (Daniel Vetter) v3: - Replace the atomic_cmpxchg() by an atomic_xchg() (Robin Murphy) - Add Steven's R-b v2: - Use atomic_cmpxchg() to conditionally schedule the reset work (Steven Price) Fixes: 1a11a88cfd9a ("drm/panfrost: Fix job timeout handling") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201105151704.2010667-1-boris.brezillon@collabora.com
2020-11-03drm/panfrost: Remove unused variables in panfrost_job_close()Boris Brezillon
Commit a17d609e3e21 ("drm/panfrost: Don't corrupt the queue mutex on open/close") left unused variables behind, thus generating a warning at compilation time. Remove those variables. Fixes: a17d609e3e21 ("drm/panfrost: Don't corrupt the queue mutex on open/close") Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201101173817.831769-1-boris.brezillon@collabora.com
2020-10-30drm/panfrost: Don't corrupt the queue mutex on open/closeSteven Price
The mutex within the panfrost_queue_state should have the lifetime of the queue, however it was erroneously initialised/destroyed during panfrost_job_{open,close} which is called every time a client opens/closes the drm node. Move the initialisation/destruction to panfrost_job_{init,fini} where it belongs. Fixes: 1a11a88cfd9a ("drm/panfrost: Fix job timeout handling") Signed-off-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201029170047.30564-1-steven.price@arm.com
2020-10-08drm/panfrost: Fix job timeout handlingBoris Brezillon
If more than two jobs end up timeout-ing concurrently, only one of them (the one attached to the scheduler acquiring the lock) is fully handled. The other one remains in a dangling state where it's no longer part of the scheduling queue, but still blocks something in scheduler, leading to repetitive timeouts when new jobs are queued. Let's make sure all bad jobs are properly handled by the thread acquiring the lock. v3: - Add Steven's R-b - Don't take the sched_lock when stopping the schedulers v2: - Fix the subject prefix - Stop the scheduler before returning from panfrost_job_timedout() - Call cancel_delayed_work_sync() after drm_sched_stop() to make sure no timeout handlers are in flight when we reset the GPU (Steven Price) - Make sure we release the reset lock before restarting the schedulers (Steven Price) Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver") Cc: <stable@vger.kernel.org> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201002122506.1374183-1-boris.brezillon@collabora.com
2020-08-07drm/panfrost: introduce panfrost_devfreq structClément Péron
Introduce a proper panfrost_devfreq to deal with devfreq variables. Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Clément Péron <peron.clem@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200710095409.407087-5-peron.clem@gmail.com
2020-08-07drm/panfrost: don't use pfdevfreq.busy_count to know if hw is idleClément Péron
This use devfreq variable that will be lock with spinlock in future patches. We should either introduce a function to access this one but as devfreq is optional let's just remove it. Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Clément Péron <peron.clem@gmail.com> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20200710095409.407087-4-peron.clem@gmail.com
2020-06-19drm/panfrost: Fix runtime PM imbalance on errorDinghao Liu
The caller expects panfrost_job_hw_submit() to increase runtime PM usage counter. The refcount decrement on the error branch of WARN_ON() will break the counter balance and needs to be removed. Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200522134109.27204-1-dinghao.liu@zju.edu.cn
2020-06-19drm/panfrost: Fix inbalance of devfreq record_busy/idle()Steven Price
The calls to panfrost_devfreq_record_busy() and panfrost_devfreq_record_idle() must be balanced to ensure that the devfreq utilisation is correctly reported. But there are two cases where this doesn't work correctly. In panfrost_job_hw_submit() if pm_runtime_get_sync() fails or the WARN_ON() fires then no call to panfrost_devfreq_record_busy() is made, but when the job times out the corresponding _record_idle() call is still made in panfrost_job_timedout(). Move the call up to ensure that it always happens. Secondly panfrost_job_timedout() only makes a single call to panfrost_devfreq_record_idle() even if it is cleaning up multiple jobs. Move the call inside the loop to ensure that the number of _record_idle() calls matches the number of _record_busy() calls. Fixes: 9e62b885f715 ("drm/panfrost: Simplify devfreq utilisation tracking") Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200522153653.40754-1-steven.price@arm.com
2020-05-19drm/panfrost: remove _unlocked suffix in drm_gem_object_put_unlockedEmil Velikov
Spelling out _unlocked for each and every driver is a annoying. Especially if we consider how many drivers, do not know (or need to) about the horror stories involving struct_mutex. Just drop the suffix. It makes the API cleaner. Done via the following script: __from=drm_gem_object_put_unlocked __to=drm_gem_object_put for __file in $(git grep --name-only $__from); do sed -i "s/$__from/$__to/g" $__file; done Cc: Rob Herring <robh@kernel.org> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Steven Price <steven.price@arm.com> Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Acked-by: Sam Ravnborg <sam@ravnborg.org> Reviewed-by: Steven Price <steven.price@arm.com> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20200515095118.2743122-28-emil.l.velikov@gmail.com
2020-03-11Merge v5.6-rc5 into drm-nextDave Airlie
Requested my mripard for some misc patches that need this as a base. Signed-off-by: Dave Airlie <airlied@redhat.com>