summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdkfd/kfd_device.c
AgeCommit message (Collapse)Author
2020-10-30drm/amdkfd: Fix getting unique_id in topologyKent Russell
Since the unique_id is now obtained in amdgpu in smu_late_init, topology misses getting the value during KFD device initialization. To work around this, we use amdgpu_amdkfd_get_unique_id to get the unique_id at read time. Due to this, we can remove unique_id from the kfd_dev structure, since we only need it in the KFD node properties struct Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-12drm/amdkfd: Add kfd2kgd_funcs for dimgrey_cavefish kfd supportChengming Gui
Add KFD support. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-12drm/amdkfd: Support dimgrey_cavefish KFD (v2)Chengming Gui
Add KFD support for dimgrey cavefish. v2: rebase (Alex) Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-10-05drm/amdkfd: add Van Gogh KFD supportHuang Rui
This patch is to add GFX10 based APU Van Gogh KFD support. We will treat Van Gogh as "dgpu" (bypass IOMMU v2). Signed-off-by: Huang Rui <ray.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-25drm/amdgpu: store noretry parameter per driver instanceAlex Deucher
This will allow us to have different defaults per asic in a future patch. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-22drm/amdkfd: Move process doorbell allocation into kfd deviceMukul Joshi
Move doorbell allocation for a process into kfd device and allocate doorbell space in each PDD during process creation. Currently, KFD manages its own doorbell space but for some devices, amdgpu would allocate the complete doorbell space instead of leaving a chunk of doorbell space for KFD to manage. In a system with mix of such devices, KFD would need to request process doorbell space based on the type of device, either from amdgpu or from its own doorbell space. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15drm/amdkfd: Fix -Wunused-const-variable warningYueHaibing
If KFD_SUPPORT_IOMMU_V2 is not set, gcc warns: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_device.c:121:37: warning: ‘raven_device_info’ defined but not used [-Wunused-const-variable=] static const struct kfd_device_info raven_device_info = { ^~~~~~~~~~~~~~~~~ As Huang Rui suggested, Raven already has the fallback path, so it should be out of IOMMU v2 flag. Suggested-by: Huang Rui <ray.huang@amd.com> Signed-off-by: YueHaibing <yuehaibing@huawei.com> Acked-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-31drm/amdkfd: Add GPU reset SMI eventMukul Joshi
Add support for reporting GPU reset events through SMI. KFD would report both pre and post GPU reset events. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26drm/amdkfd: call amdgpu_amdkfd_get_hive_id directlyFelix Kuehling
No need to use a function pointer because the implementation is not ASIC-specific. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26drm/amdkfd: call amdgpu_amdkfd_get_unique_id directlyFelix Kuehling
No need to use a function pointer because the implementation is not ASIC-specific. This fixes missing support due to a missing function pointer on Arcturus. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-26drm/amdkfd: implement the dGPU fallback path for apu (v6)Huang Rui
We still have a few iommu issues which need to address, so force raven as "dgpu" path for the moment. This is to add the fallback path to bypass IOMMU if IOMMU v2 is disabled or ACPI CRAT table not correct. v2: Use ignore_crat parameter to decide whether it will go with IOMMUv2. v3: Align with existed thunk, don't change the way of raven, only renoir will use "dgpu" path by default. v4: don't update global ignore_crat in the driver, and revise fallback function if CRAT is broken. v5: refine acpi crat good but no iommu support case, and rename the title. v6: fix the issue of dGPU initialized firstly, just modify the report value in the node_show(). Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-27drm/amdkfd: Add thermal throttling SMI eventMukul Joshi
Add support for reporting thermal throttling events through SMI. Also, add a counter to count the number of throttling interrupts observed and report the count in the SMI event message. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15drm/amdkfd: Provide SMI events watchAmber Lin
When the compute is malfunctioning or performance drops, the system admin will use SMI (System Management Interface) tool to monitor/diagnostic what went wrong. This patch provides an event watch interface for the user space to register devices and subscribe events they are interested. After registered, the user can use annoymous file descriptor's poll function with wait-time specified and wait for events to happen. Once an event happens, the user can use read() to retrieve information related to the event. VM fault event is done in this patch. v2: - remove UNREGISTER and add event ENABLE/DISABLE - correct kfifo usage - move event message API to kfd_ioctl.h v3: send the event msg in text than in binary v4: support multiple clients v5: move events enablement from ioctl to fd write v6: sparse fix Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15drm/amdkfd: Add kfd2kgd_funcs for navy_flounder kfd supportChengming Gui
Add callbacks to KGD for navy flounder. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-15drm/amdkfd: Support navy_flounder KFDChengming Gui
Add KFD support for Navy Flounder. Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-02drm/amdkfd: Add Arcturus GWS support and fix VG10Joseph Greathouse
Add support for GWS in Arcturus, which needs MEC2 firmware #48 or above. Fix the MEC2 version check for Vega 10 GWS support, since Vega 10 firmware adds 0x8000 to the actual firmware revision. We were previously declaring support where it did not exist. Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amdkfd: Add eviction debug messagesFelix Kuehling
Use WARN to print messages with backtrace when evictions are triggered. This can help determine the root cause of evictions and help spot driver bugs triggering evictions unintentionally, or help with performance tuning by avoiding conditions that cause evictions in a specific workload. The messages are controlled by a new module parameter that can be changed at runtime: echo Y > /sys/module/amdgpu/parameters/debug_evictions echo N > /sys/module/amdgpu/parameters/debug_evictions Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amdkfd: sienna_cichlid virtual function supportshaoyunl
amdkfd add support for sienna_cichlid virtual function Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amdkfd: Add Sienna_Cichlid trap handler supportJay Cornwall
- Replace SQC stores with TCP stores - Synchronize with MSG_SAVEWAVE via lgkmcnt - HW_REG_IB_STS is now read-only Signed-off-by: Jay Cornwall <jay.cornwall@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-07-01drm/amdkfd: Support Sienna_Cichlid KFD v4Yong Zhao
v4: drop get_tile_config, comment out other callbacks Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-28drm/amdkfd: Enable GWS based on FW SupportJoseph Greathouse
Rather than only enabling GWS support based on the hws_gws_support modparm, also check whether the GPU's HWS firmware supports GWS. Leave the old modparm in place in case users want to test GWS on GPUs not yet in the support list. v2: fix broken syntax from the first patch. Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-04-01drm/amdkfd: kfree the wrong pointerJack Zhang
Originally, it kfrees the wrong pointer for mem_obj. It would cause memory leak under stress test. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Nirmoy Das <nirmoy.das@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-26drm/amd: Extend ROCt to surface UUID for devices that have themDivya Shikre
Devices from Arcturus onwards will have their UUID exposed to Thunk. Adding neccessary functions to the kernel to propagate the uuid. Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-02-12drm/amdkfd: refactor runtime pm for bacoRajneesh Bhardwaj
So far the kfd driver implemented same routines for runtime and system wide suspend and resume (s2idle or mem). During system wide suspend the kfd aquires an atomic lock that prevents any more user processes to create queues and interact with kfd driver and amd gpu. This mechanism created problem when amdgpu device is runtime suspended with BACO enabled. Any application that relies on kfd driver fails to load because the driver reports a locked kfd device since gpu is runtime suspended. However, in an ideal case, when gpu is runtime suspended the kfd driver should be able to: - auto resume amdgpu driver whenever a client requests compute service - prevent runtime suspend for amdgpu while kfd is in use This change refactors the amdgpu and amdkfd drivers to support BACO and runtime power management. Reviewed-by: Oak Zeng <oak.zeng@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-01-07drm/amdkfd: Improve HWS hang detection and handlingFelix Kuehling
Move HWS hang detection into unmap_queues_cpsch to catch hangs in all cases. If this happens during a reset, don't schedule another reset because the reset already in progress is expected to take care of it. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: shaoyunl <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-12-18drm/amdkfd: queue kfd interrupt work to different CPUPhilip Yang
Because queue_work schedule the work on the same CPU the interrupt handler is running, if there are many interrupts pending, it takes longer time for work queue to start, or even worse system will hang. v2: queue work to same NUMA node for better cache locality v3: handle cpumask_next wraparound case Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Eric Huang <JinhuiEric.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-13drm/amdgpu: remove set but not used variable 'count'yu kuai
Fixes gcc '-Wunused-but-set-variable' warning: drivers/gpu/drm/amd/amdkfd/kfd_device.c: In function ‘kgd2kfd_post_reset’: drivers/gpu/drm/amd/amdkfd/kfd_device.c:745:11: warning: variable ‘count’ set but not used [-Wunused-but-set-variable] 'count' is never used, so can be removed. Thus 'atomic_dec_return' can be replaced as 'atomic_dec' Fixes: e42051d2133b ("drm/amdkfd: Implement GPU reset handlers in KFD") Signed-off-by: yu kuai <yukuai3@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-25drm/amdkfd: don't use dqm lock during device reset/suspend/resumePhilip Yang
If device reset/suspend/resume failed for some reason, dqm lock is hold forever and this causes deadlock. Below is a kernel backtrace when application open kfd after suspend/resume failed. Instead of holding dqm lock in pre_reset and releasing dqm lock in post_reset, add dqm->sched_running flag which is modified in dqm->ops.start and dqm->ops.stop. The flag doesn't need lock protection because write/read are all inside dqm lock. For HWS case, map_queues_cpsch and unmap_queues_cpsch checks sched_running flag before sending the updated runlist. v2: For no-HWS case, when device is stopped, don't call load/destroy_mqd for eviction, restore and create queue, and avoid debugfs dump hdqs. Backtrace of dqm lock deadlock: [Thu Oct 17 16:43:37 2019] INFO: task rocminfo:3024 blocked for more than 120 seconds. [Thu Oct 17 16:43:37 2019] Not tainted 5.0.0-rc1-kfd-compute-rocm-dkms-no-npi-1131 #1 [Thu Oct 17 16:43:37 2019] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [Thu Oct 17 16:43:37 2019] rocminfo D 0 3024 2947 0x80000000 [Thu Oct 17 16:43:37 2019] Call Trace: [Thu Oct 17 16:43:37 2019] ? __schedule+0x3d9/0x8a0 [Thu Oct 17 16:43:37 2019] schedule+0x32/0x70 [Thu Oct 17 16:43:37 2019] schedule_preempt_disabled+0xa/0x10 [Thu Oct 17 16:43:37 2019] __mutex_lock.isra.9+0x1e3/0x4e0 [Thu Oct 17 16:43:37 2019] ? __call_srcu+0x264/0x3b0 [Thu Oct 17 16:43:37 2019] ? process_termination_cpsch+0x24/0x2f0 [amdgpu] [Thu Oct 17 16:43:37 2019] process_termination_cpsch+0x24/0x2f0 [amdgpu] [Thu Oct 17 16:43:37 2019] kfd_process_dequeue_from_all_devices+0x42/0x60 [amdgpu] [Thu Oct 17 16:43:37 2019] kfd_process_notifier_release+0x1be/0x220 [amdgpu] [Thu Oct 17 16:43:37 2019] __mmu_notifier_release+0x3e/0xc0 [Thu Oct 17 16:43:37 2019] exit_mmap+0x160/0x1a0 [Thu Oct 17 16:43:37 2019] ? __handle_mm_fault+0xba3/0x1200 [Thu Oct 17 16:43:37 2019] ? exit_robust_list+0x5a/0x110 [Thu Oct 17 16:43:37 2019] mmput+0x4a/0x120 [Thu Oct 17 16:43:37 2019] do_exit+0x284/0xb20 [Thu Oct 17 16:43:37 2019] ? handle_mm_fault+0xfa/0x200 [Thu Oct 17 16:43:37 2019] do_group_exit+0x3a/0xa0 [Thu Oct 17 16:43:37 2019] __x64_sys_exit_group+0x14/0x20 [Thu Oct 17 16:43:37 2019] do_syscall_64+0x4f/0x100 [Thu Oct 17 16:43:37 2019] entry_SYSCALL_64_after_hwframe+0x44/0xa9 Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07drm/amdkfd: fix the build when CIK support is disabledAlex Deucher
Add proper ifdefs around CIK code in kfd setup. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-07drm/amdkfd: Fix a && vs || typoDan Carpenter
In the current code if "device_info" is ever NULL then the kernel will Oops so probably || was intended instead of &&. Fixes: e392c887df97 ("drm/amdkfd: Use array to probe kfd2kgd_calls") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03drm/amdkfd: Use array to probe kfd2kgd_callsYong Zhao
This is the same idea as the kfd device info probe and move all the probe control together for easy maintenance. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03drm/amd: Pass drm_device to kfdHarish Kasiviswanathan
kfd needs drm_device to call into drm_cgroup functions Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03drm/amdkfd: use navi12 specific family id for navi12 code pathshaoyunl
Keep the same use of CHIP_IDs for navi12 in kfd Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03drm/amdkfd: Add NAVI12 support from kfd sideshaoyunl
Add device info for both navi12 PF and VF Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16drm/amdkfd: fix the missed asic name while inited renoir_device_infoHuang Rui
This patch fixes null pointer issue below, I missed to init the asic renior name while I rebase the patches. [ 106.004250] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 106.004254] #PF: supervisor read access in kernel mode [ 106.004256] #PF: error_code(0x0000) - not-present page [ 106.004257] PGD 0 P4D 0 [ 106.004261] Oops: 0000 [#1] SMP NOPTI [ 106.004264] CPU: 3 PID: 1422 Comm: modprobe Not tainted 5.2.0-rc1-custom #1 [ 106.004266] Hardware name: AMD Celadon-RN/Celadon-RN, BIOS WCD9814N_Weekly_19_08_1 08/14/2019 [ 106.004272] RIP: 0010:strncpy+0x12/0x30 [ 106.004274] Code: c1 c0 11 48 c1 c6 15 48 31 d0 48 c1 c2 20 31 c2 89 d0 31 f0 41 5c 5d c3 55 48 85 d2 48 89 f8 48 89 e5 74 1e 48 01 fa 48 89 f9 <44> 0f b6 06 41 80 f8 01 44 88 01 48 83 de ff 48 83 c1 01 48 39 d1 [ 106.004278] RSP: 0018:ffffc092c1fd37a8 EFLAGS: 00010286 [ 106.004281] RAX: ffff9e943466a28c RBX: 00000000000036ed RCX: ffff9e943466a28c [ 106.004283] RDX: ffff9e943466a2ac RSI: 0000000000000000 RDI: ffff9e943466a28c [ 106.004285] RBP: ffffc092c1fd37a8 R08: ffff9e943d100000 R09: 0000000000000228 [ 106.004287] R10: ffff9e94418dc5a8 R11: ffff9e944746c0d0 R12: 0000000000000000 [ 106.004289] R13: ffff9e943fa1ec00 R14: ffff9e943466a200 R15: ffff9e943466a200 [ 106.004291] FS: 00007f7a022c5540(0000) GS:ffff9e9447ac0000(0000) knlGS:0000000000000000 [ 106.004294] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 106.004296] CR2: 0000000000000000 CR3: 00000001ff0b0000 CR4: 0000000000340ee0 [ 106.004298] Call Trace: [ 106.004382] kfd_topology_add_device+0x150/0x610 [amdgpu] [ 106.004445] kgd2kfd_device_init+0x2e0/0x4f0 [amdgpu] [ 106.004509] amdgpu_amdkfd_device_init+0x14c/0x1b0 [amdgpu] Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-and-Tested-by: Aaron Liu <aaron.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16drm/amdkfd: add renoir kfd device info (v2)Huang Rui
This patch inits renoir kfd device info, so we treat renoir as "dgpu" (bypass iommu v2). Will enable needs_iommu_device till renoir iommu is ready. v2: rebase and align the drm-next Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-16drm/amdkfd: Support Navi14 in KFDYong Zhao
Initial support of Navi14 in KFD. The device IDs will be added later. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13drm/amdkfd: Fix a building error when KFD_SUPPORT_IOMMU_V2 is turned offYong Zhao
The issue was accidentally introduced recently. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-09-13drm/amdkfd: Query kfd device info by CHIP id instead of pci device idYong Zhao
This optimizes out the pci device id usage in KFD and makes the code more maintainable. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-23amd/amdkfd: add Arcturus vf DID supportFrank.Min
Add the virtual function PCI device id. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Frank.Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-08-21drm/amdkfd: Fill the name field in node topology with asic name v2Yong Zhao
The name field in node topology has not been used. We re-purpose it to hold the asic name, which can be queried by user space applications through sysfs. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18drm/amdkfd: Add device id for real asicsOak Zeng
Add pci device ids. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18drm/amdkfd: Add arcturus CWSR trap handlerOak Zeng
CWSR (compute wave save/restore) is used for preempting compute queues. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18drm/amdkfd: Set number of xgmi optimized SDMA engines for arcturusOak Zeng
some sdma engines are optimized for xgmi on arcturus. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18drm/amdkfd: Change arcturus sdma engines numberOak Zeng
Arcturus has 8 sdma engines Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Yong Zhao <yong.zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-18amd/amdkfd: Add ASIC ARCTURUS to kfdYong Zhao
Add initial support for ARCTURUS to kfd. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-02drm/amdkfd: remove an unused variableJack Xiao
Just for cleanup. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-07-01drm/amdkfd: remove duplicated PCIE atomics requestJack Xiao
Since amdgpu has always requested PCIE atomics, kfd don't need duplicated PCIE atomics enablement. Referring to amdgpu request result is enough. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-27drm/amdkfd: remove unnecessary warning message on gpu resetshaoyunl
In XGMI configuration, more than one asic can be reset at same time, kfd is able to handle this and no need to trigger the warning Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-06-21drm/amdkfd: Add navi10 support to amdkfd. (v3)Philip Cox
KFD (kernel fusion driver) is the kernel driver for the compute backend for usermode compute stack. v2: squash in updates (Alex) v3: squash in rebase fixes (Alex) Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Philip Cox <Philip.Cox@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>