summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
AgeCommit message (Collapse)Author
2018-01-23drm/amdgpu: Avoid leaking PM domain on driver unbind (v2)Alex Deucher
We only support vga_switcheroo and runtime pm on PX/HG systems so forcing runpm to 1 doesn't do anything useful anyway. Only call vga_switcheroo_init_domain_pm_ops() for PX/HG so that the cleanup path is correct as well. This mirrors what radeon does as well. v2: rework the patch originally sent by Lukas (Alex) Acked-by: Lukas Wunner <lukas@wunner.de> Reported-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> (v1) Cc: stable@vger.kernel.org
2018-01-23drm/amdgpu: Reenable manual GPU reset from sysfsAndrey Grodzovsky
Otherwise it keeps rejecting the reset. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-01-10drm/amdgpu: fix 64bit BAR detectionChristian König
Windows added by the BIOS are not marked as 64bit because they are usually not changeable anyway. This fixes large BAR support on my new Ryzen build system. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: rename amdgpu_get_pcie_infoAlex Deucher
add device to the name for consistency. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: move amdgpu_need_backup to amdgpu_object.cAlex Deucher
It's the only place it's used. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: rename amdgpu_gpu_recoverAlex Deucher
add device to the name for consistency. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: move dummy page functions to amdgpu_gart.cAlex Deucher
It's the only place they are used. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: rename amdgpu_need_postAlex Deucher
add device to the name for consistency. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: rename ip block helper functionsAlex Deucher
add device to the name for consistency. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: move fw_reserve functions to amdgpu_ttm.cAlex Deucher
It's the only place they are used. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: rename amdgpu_*_location functionsAlex Deucher
add device to the name for consistency. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: move amdgpu_doorbell_get_kfd_info to amdgpu_amdkfd.cAlex Deucher
It's the only place it's used. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: rename amdgpu_pci_config_resetAlex Deucher
add device for consistency with other functions in this file. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: rename amdgpu_program_register_sequenceAlex Deucher
add device for consistency with other functions in this file. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: rename amdgpu_wb_* functionsAlex Deucher
add device for consistency. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: move debugfs functions to their own fileAlex Deucher
amdgpu_device.c was getting pretty cluttered. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: rename amdgpu_suspend to amdgpu_device_ip_suspendAlex Deucher
for consistency with the other functions in that file. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: use consistent naming for static funcs in amdgpu_device.cAlex Deucher
Prefix the functions with device or device_ip for functions which deal with ip blocks for consistency. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-18drm/amdgpu: move atom functions from amdgpu_device.cAlex Deucher
and move them to amdgpu_atombios.c for consistency. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-15drm/amdgpu: Simplify amdgpu_lockup_timeout usage.Andrey Grodzovsky
With introduction of amdgpu_gpu_recovery we don't need any more to rely on amdgpu_lockup_timeout == 0 for disabling GPU reset. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-15drm/amdgpu: Add gpu_recovery parameterAndrey Grodzovsky
Add new parameter to control GPU recovery procedure. v2: Add auto logic where reset is disabled for bare metal and enabled for SR-IOV. Allow forced reset from debugfs. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-13drm/amdgpu: drop scratch regs save and restore from GPU reset handlingAlex Deucher
The expectation is that the base driver doesn't mess with these. Some components interact with these directly so let the components handle these directly. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-13drm/amdgpu: drop scratch regs save and restore from S3/S4 handlingAlex Deucher
The expectation is that the base driver doesn't mess with these. Some components interact with these directly so let the components handle these directly. Reviewed-by: Harry Wentland <harry.wentland@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-12drm/amdgpu: no need to evict VRAM in device_finiMonk Liu
this VRAM evict is not needed and also cost 2seconds to finish because the IRQ is software side disabled before it. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-12drm/amdgpu: add amdgpu_evict_vram debugfs fileChristian König
Torture test for MM and VM support, can be used to evict all VRAM while the system is under load. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-12drm/amdgpu: cleanup debugfs handling a bitChristian König
Remove the superflous .debugfs_init callback and register all files in amdgpu_device.c in just one function. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-07drm: move amd_gpu_scheduler into common locationLucas Stach
This moves and renames the AMDGPU scheduler to a common location in DRM in order to facilitate re-use by other drivers. This is mostly a straight forward rename with no code changes. One notable exception is the function to_drm_sched_fence(), which is no longer a inline header function to avoid the need to export the drm_sched_fence_ops_scheduled and drm_sched_fence_ops_finished structures. Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/amdgpu: allow specifying vm_block_size for multi level PDs v2Christian König
This patch allows specifying the vm_block_size even when multi level page directories are active. v2: fix signed/unsigned compare warning Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/amdgpu: move validation of the VM size into the VM codeChristian König
This moves validation of the VM size parameter into amdgpu_vm_adjust_size(). Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/amdgpu: allow non pot VM size valuesChristian König
The VM size actually doesn't need to be a power of two. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/ttm: use an operation context for ttm_bo_mem_space v2Christian König
Instead of specifying interruptible and no_wait_gpu manually. v2: rebase Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Tested-by: Michel Dänzer <michel.daenzer@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/amdgpu: align GTT start to 4GB v2Christian König
For VCE to work properly the start of the GTT space must be aligned to a 4GB boundary. v2: add comment why we do this Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/amdgpu: remove VRAM size reduction v2Christian König
Remove some outdated comments and all code which tries to reduce the VRAM size mapped into the MC. This is superfluous and misleading since we never actually program the size. v2: handle gmc_v6_0.c as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/amdgpu: require a root bus window above 4GB for BAR resizeChristian König
Don't even try to resize the BAR when there is no window above 4GB. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/amdgpu:show error message if fail on event4Monk Liu
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/amdgpu:free CSA in unified placeMonk Liu
instead of doing it in each GFX ip's sw_fini Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/amdgpu:cleanup unused stack varMonk Liu
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/amdgpu:fix NULL pointer access during drv removeMonk Liu
NULL pointer is because original logic will step into set_pde_pte() even after the gart.ptr is freed due to there are twice gart_unbind() on all gart area. also, there are other minor fixes: 1,since gart_init only create dummy page, the corresponding gart_fini shouldn't do more like unbinding all GART, this is unnecessary because in driver fini stage all GART unbinding had already been done during each IP's SW_FINI (GMC's SW_FINI is the last one called), so remove the step for the GART unbinding in gart_fini(). 2,gart_fini() is already invoked during each GMC IP's gart_fini routine,e.g. gmc_vx_0_gart_fini(), so no need to manually call it during ttm_fini(). 3,amdgpu_gem_force_release() should be put ahead of amdgpu_vm_manager_fini() Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-06drm/amdgpu: revise retry init to fully cleanup driverPixel Ding
Retry at drm_dev_register instead of amdgpu_device_init. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Pixel Ding <Pixel.Ding@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04drm/amdgpu:read VRAMLOST from gimMonk Liu
Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04drm/amdgpu: bypass FB resizing for SRIOV VFpding
It introduces 900ms latency in exclusive mode which causes failure of driver loading. Host can resize the BAR before guest staring, so the resizing is not necessary here. Signed-off-by: Pixel Ding <Pixel.Ding@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04drm/amdgpu: release exclusive mode after hw_initpding
Signed-off-by: pding <Pixel.Ding@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04drm/amdkfd: initialise kfd inside amdgpu_device_initpding
Also finalize kfd inside amdgpu_device_fini. kfd device_init needs SRIOV exclusive accessing. Try to gather exclusive accessing to reduce time consuming. Signed-off-by: pding <Pixel.Ding@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04drm/amdgpu: resize VRAM BAR for CPU access v6Christian König
Try to resize BAR0 to let CPU access all of VRAM. v2: rebased, style cleanups, disable mem decode before resize, handle gmc_v9 as well, round size up to power of two. v3: handle gmc_v6 as well, release and reassign all BARs in the driver. v4: rename new function to amdgpu_device_resize_fb_bar, reenable mem decoding only if all resources are assigned. v5: reorder resource release, return -ENODEV instead of BUG_ON(). v6: squash in rebase fix Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04drm/amdgpu: refine SR-IOV firmware VRAM reservation to protect dataHorace Chen
The previous solution will create a zero buffer on the system domain and then move the zeroes to the VRAM. This will break the original data on the VRAM. Refine the code to create bo on VRAM domain directly and then remove and re-create mem node to the exact position before bo_pin. This can avoid breaking the data and will not cause eviction. Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: monk liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04drm/amdgpu: retry init if exclusive mode request is failedpding
This is caused of that hypervisor fails to handle request, one known issue is MMIO unblocking timeout. In theory we can retry init here. Signed-off-by: pding <Pixel.Ding@amd.com> Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04drm/amdgpu: move GART recovery into GTT manager v2Christian König
The GTT manager handles the GART address space anyway, so it is completely pointless to keep the same information around twice. v2: rebased Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04drm/amdgpu:cleanup in_sriov_reset and lock_resetMonk Liu
since now gpu reset is unified with gpu_recover for both bare-metal and SR-IOV: 1)rename in_sriov_reset to in_gpu_reset 2)move lock_reset from adev->virt to adev Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04drm/amdgpu:implement new GPU recover(v3)Monk Liu
1,new imple names amdgpu_gpu_recover which gives more hint on what it does compared with gpu_reset 2,gpu_recover unify bare-metal and SR-IOV, only the asic reset part is implemented differently 3,gpu_recover will increase hang job karma and mark its entity/context as guilty if exceeds limit V2: 4,in scheduler main routine the job from guilty context will be immedialy fake signaled after it poped from queue and its fence be set with "-ECANCELED" error 5,in scheduler recovery routine all jobs from the guilty entity would be dropped 6,in run_job() routine the real IB submission would be skipped if @skip parameter equales true or there was VRAM lost occured. V3: 7,replace deprecated gpu reset, use new gpu recover Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-12-04drm/amdgpu: retry init if it fails due to exclusive mode timeout (v3)pding
The exclusive mode has real-time limitation in reality, such like being done in 300ms. It's easy observed if running many VF/VMs in single host with heavy CPU workload. If we find the init fails due to exclusive mode timeout, try it again. v2: - rewrite the condition for readable value. v3: - fix typo, add comments for sleep Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: pding <Pixel.Ding@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>