summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2017-06-06drm/amdgpu/gfx9: new queue policy, take first 2 queues of each pipeAlex Deucher
Instead of taking the first pipe and giving the rest to kfd, take the first 2 queues of each pipe. Effectively, amdgpu and amdkfd own the same number of queues. But because the queues are spread over multiple pipes the hardware will be able to better handle concurrent compute workloads. amdgpu goes from 1 pipe to 4 pipes, i.e. from 1 compute threads to 4 amdkfd goes from 3 pipe to 4 pipes, i.e. from 3 compute threads to 4 gfx9 was missed when this patch set was rebased to include gfx9. Acked-by: Tom St Denis <tom.stdenis@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu/gfx9: allocate queues horizontally across pipesAlex Deucher
Pipes provide better concurrency than queues, therefore we want to make sure that apps use queues from different pipes whenever possible. Optimize for the trivial case where an app will consume rings in order, therefore we don't want adjacent rings to belong to the same pipe. gfx9 was missed when these patches were rebased. Reviewed-by: Tom St Denis <tom.stdenis@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amd/powerplay: fix memory leak in cz_hwmgr backendHawking Zhang
vddc_dep_on_dal_pwrl is allocated and initialized in cz_hwmgr_backend_init Thus free the memory in cz_hwmgr_backend_fini Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2017-06-06drm/amd/powerplay: fix memory leak in rv_hwmgr backendHawking Zhang
vddc_dep_on_dal_pwrl and vq_budgeting_table are allocated and initialized in rv_hwmgr_backend_init. Thus free the memory in rv_hwmgr_backend_fini Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amd/powerplay: add sclk and mclk overdrive for vega10Eric Huang
For overclocking sclk and mclk. Signed-off-by: Eric Huang <JinHuiEric.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amd/powerplay: fix populate dpm level failed when s3 on vega10.Rex Zhu
As the min clk may be large than boot level can support. in this case, just ignore the min clk. Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: update to use RREG32_SOC15/WREG32_SOC15 for gmc9Huang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: update to use RREG32_SOC15/WREG32_SOC15 for mmhubHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: update to use RREG32_SOC15/WREG32_SOC15 for gfxhubHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: fix the gart table cleared issue for S3Huang Rui
Something writes over the first 8 MB so reserve this on vega10 until we root cause it. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: add ip block number printsHuang Rui
User is able to follow the ip block number to write the ip_block_mask for selecting the one which user would like to enable. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: add ip name print for selecting ips with ip_block_maskHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: remove mmhub ipHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: remove gfxhub ipHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: export mmhub get clockgating into gmcHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: export mmhub set clockgating into gmcHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: export mmhub sw_init into gmcHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: export gfxhub sw_init into gmcHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: fix to miss program invalidation at resumeHuang Rui
This patch moves invalidation into gart enable function from hw_init. Because we would like align the sequence calling between init and resume. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: abstract setup vmid config for gfxhub/mmhubHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: abstract disable identity aperture for gfxhub/mmhubHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: abstract system domain enablement for gfxhub/mmhubHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: abstract cache initialization for gfxhub/mmhubHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: abstract TLB initialization for gfxhub/mmhubHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: abstract system aperture initialization for gfxhub/mmhubHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: abstract gart aperture initialization for gfxhub/mmhubHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-06drm/amdgpu: abstract gart table initialization for gfxhub/mmhubHuang Rui
Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-01drm/amdgpu: add saved_bo to save vce 4.0 context when suspendLeo Liu
We are using PSP to resume firmware after suspend, and it is resumed at where it got suspended, so we'd better save the the context. Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-01drm/amdgpu: use existing function amdgpu_bo_create_kernelLeo Liu
To simplify vce bo create Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-01drm/amdgpu: add vcpu_bo cpu address for vceLeo Liu
Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-01drm/amdgpu: Move compute vm bug logic to amdgpu_vm.cAlex Xie
In review, Christian would like to keep the logic inside amdgpu_vm.c with a cost of slightly slower. The loop is still optimized out with this patch. v2: remove the if statement. Now it is not slower. Signed-off-by: Alex Xie <AlexBin.Xie@amd.com> Reviewed-by: Christian König <christian.koeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-01drm/amd/powerplay: enable CKS by default on vega10.Rex Zhu
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-01drm/amd/powerplay: Align with VBIOS to support AVFS parameters.Rex Zhu
Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-06-01drm/amd/powerplay: Add floor DCEF for DS on boot.Rex Zhu
Use the vbios to look up the default frequencies for socclk and dcefclk. Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: use LRU mapping policy for SDMA enginesAndres Rodriguez
Spreading the load across multiple SDMA engines can increase memory transfer performance. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: guarantee bijective mapping of ring ids for LRU v3Andres Rodriguez
Depending on usage patterns, the current LRU policy may create a non-injective mapping between userspace ring ids and kernel rings. This behaviour is undesired as apps that attempt to fill all HW blocks would be unable to reach some of them. This change forces the LRU policy to create bijective mappings only. v2: compress ring_blacklist v3: simplify amdgpu_ring_is_blacklisted() logic Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: implement lru amdgpu_queue_mgr policy for compute v4Andres Rodriguez
Use an LRU policy to map usermode rings to HW compute queues. Most compute clients use one queue, and usually the first queue available. This results in poor pipe/queue work distribution when multiple compute apps are running. In most cases pipe 0 queue 0 is the only queue that gets used. In order to better distribute work across multiple HW queues, we adopt a policy to map the usermode ring ids to the LRU HW queue. This fixes a large majority of multi-app compute workloads sharing the same HW queue, even though 7 other queues are available. v2: use ring->funcs->type instead of ring->hw_ip v3: remove amdgpu_queue_mapper_funcs v4: change ring_lru_list_lock to spinlock, grab only once in lru_get() Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: untie user ring ids from kernel ring ids v6Andres Rodriguez
Add amdgpu_queue_mgr, a mechanism that allows disjointing usermode's ring ids from the kernel's ring ids. The queue manager maintains a per-file descriptor map of user ring ids to amdgpu_ring pointers. Once a map is created it is permanent (this is required to maintain FIFO execution guarantees for a context's ring). Different queue map policies can be configured for each HW IP. Currently all HW IPs use the identity mapper, i.e. kernel ring id is equal to the user ring id. The purpose of this mechanism is to distribute the load across multiple queues more effectively for HW IPs that support multiple rings. Userspace clients are unable to check whether a specific resource is in use by a different client. Therefore, it is up to the kernel driver to make the optimal choice. v2: remove amdgpu_queue_mapper_funcs v3: made amdgpu_queue_mgr per context instead of per-fd v4: add context_put on error paths v5: rebase and include new IPs UVD_ENC & VCN_* v6: drop unused amdgpu_ring_is_valid_index (Alex) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: workaround tonga HW bug in HQD programming sequenceAndres Rodriguez
Tonga based asics may experience hangs when an HQD's EOP parameters are modified. Workaround this HW issue by avoiding writes to these registers for tonga asics. Based on the following ROCm commit: 2a0fb8 - drm/amdgpu: Synchronize KFD HQD load protocol with CP scheduler From the ROCm git repository: https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver.git CC: Jay Cornwall <Jay.Cornwall@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: condense mqd programming sequenceAndres Rodriguez
The MQD structure matches the reg layout. Take advantage of this to simplify HQD programming. Note that the ACTIVE field still needs to be programmed last. Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: new queue policy, take first 2 queues of each pipe v2Andres Rodriguez
Instead of taking the first pipe and giving the rest to kfd, take the first 2 queues of each pipe. Effectively, amdgpu and amdkfd own the same number of queues. But because the queues are spread over multiple pipes the hardware will be able to better handle concurrent compute workloads. amdgpu goes from 1 pipe to 4 pipes, i.e. from 1 compute threads to 4 amdkfd goes from 3 pipe to 4 pipes, i.e. from 3 compute threads to 4 v2: fix policy comment Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: avoid KIQ clashing with compute or KFD queues v2Andres Rodriguez
Instead of picking an arbitrary queue for KIQ, search for one according to policy. The queue must be unused. Also report the KIQ as an unavailable resource to KFD. In testing I ran into KCQ initialization issues when using pipes 2/3 of MEC2 for the KIQ. Therefore the policy disallows grabbing one of these. v2: fix (ring.me + 1) to (ring.me -1) in amdgpu_amdkfd_device_init Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: remove hardcoded queue_mask in PACKET3_SET_RESOURCESAndres Rodriguez
The assumption that we are only using the first pipe no longer holds. Instead, calculate the queue_mask from the queue_bitmap. Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: allocate queues horizontally across pipesAndres Rodriguez
Pipes provide better concurrency than queues, therefore we want to make sure that apps use queues from different pipes whenever possible. Optimize for the trivial case where an app will consume rings in order, therefore we don't want adjacent rings to belong to the same pipe. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: remove duplicate magic constants from amdgpu_amdkfd_gfx*.cAndres Rodriguez
This information is already available in adev. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdkfd: allow split HQD on per-queue granularity v5Andres Rodriguez
Update the KGD to KFD interface to allow sharing pipes with queue granularity instead of pipe granularity. This allows for more interesting pipe/queue splits. v2: fix overflow check for res.queue_mask v3: fix shift overflow when setting res.queue_mask v4: fix comment in is_pipeline_enabled() v5: clamp res.queue_mask to the first MEC only Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: teach amdgpu how to enable interrupts for any pipe v3Andres Rodriguez
The current implementation is hardcoded to enable ME1/PIPE0 interrupts only. This patch allows amdgpu to enable interrupts for any pipe of ME1. v2: added gfx9 support v3: use soc15_grbm_select for gfx9 Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: allow split of queues with kfd at queue granularity v4Andres Rodriguez
Previously the queue/pipe split with kfd operated with pipe granularity. This patch allows amdgpu to take ownership of an arbitrary set of queues. It also consolidates the last few magic numbers in the compute initialization process into mec_init. v2: support for gfx9 v3: renamed AMDGPU_MAX_QUEUES to AMDGPU_MAX_COMPUTE_QUEUES v4: fix off-by-one in num_mec checks in *_compute_queue_acquire Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/amdgpu: take ownership of per-pipe configuration v3Andres Rodriguez
Make amdgpu the owner of all per-pipe state of the HQDs. This change will allow us to split the queues between kfd and amdgpu with a queue granularity instead of pipe granularity. This patch fixes kfd allocating an HDP_EOP region for its 3 pipes which goes unused. v2: support for gfx9 v3: fix gfx7 HPD intitialization Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-31drm/radeon: take ownership of pipe initializationAndres Rodriguez
Take ownership of pipe initialization away from KFD. Note that hpd_eop_gpu_addr was already large enough to accomodate all pipes. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>