summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
AgeCommit message (Collapse)Author
2025-06-30drm/amdkfd: Don't call mmput from MMU notifier callbackPhilip Yang
If the process is exiting, the mmput inside mmu notifier callback from compactd or fork or numa balancing could release the last reference of mm struct to call exit_mmap and free_pgtable, this triggers deadlock with below backtrace. The deadlock will leak kfd process as mmu notifier release is not called and cause VRAM leaking. The fix is to take mm reference mmget_non_zero when adding prange to the deferred list to pair with mmput in deferred list work. If prange split and add into pchild list, the pchild work_item.mm is not used, so remove the mm parameter from svm_range_unmap_split and svm_range_add_child. The backtrace of hung task: INFO: task python:348105 blocked for more than 64512 seconds. Call Trace: __schedule+0x1c3/0x550 schedule+0x46/0xb0 rwsem_down_write_slowpath+0x24b/0x4c0 unlink_anon_vmas+0xb1/0x1c0 free_pgtables+0xa9/0x130 exit_mmap+0xbc/0x1a0 mmput+0x5a/0x140 svm_range_cpu_invalidate_pagetables+0x2b/0x40 [amdgpu] mn_itree_invalidate+0x72/0xc0 __mmu_notifier_invalidate_range_start+0x48/0x60 try_to_unmap_one+0x10fa/0x1400 rmap_walk_anon+0x196/0x460 try_to_unmap+0xbb/0x210 migrate_page_unmap+0x54d/0x7e0 migrate_pages_batch+0x1c3/0xae0 migrate_pages_sync+0x98/0x240 migrate_pages+0x25c/0x520 compact_zone+0x29d/0x590 compact_zone_order+0xb6/0xf0 try_to_compact_pages+0xbe/0x220 __alloc_pages_direct_compact+0x96/0x1a0 __alloc_pages_slowpath+0x410/0x930 __alloc_pages_nodemask+0x3a9/0x3e0 do_huge_pmd_anonymous_page+0xd7/0x3e0 __handle_mm_fault+0x5e3/0x5f0 handle_mm_fault+0xf7/0x2e0 hmm_vma_fault.isra.0+0x4d/0xa0 walk_pmd_range.isra.0+0xa8/0x310 walk_pud_range+0x167/0x240 walk_pgd_range+0x55/0x100 __walk_page_range+0x87/0x90 walk_page_range+0xf6/0x160 hmm_range_fault+0x4f/0x90 amdgpu_hmm_range_get_pages+0x123/0x230 [amdgpu] amdgpu_ttm_tt_get_user_pages+0xb1/0x150 [amdgpu] init_user_pages+0xb1/0x2a0 [amdgpu] amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x543/0x7d0 [amdgpu] kfd_ioctl_alloc_memory_of_gpu+0x24c/0x4e0 [amdgpu] kfd_ioctl+0x29d/0x500 [amdgpu] Fixes: fa582c6f3684 ("drm/amdkfd: Use mmget_not_zero in MMU notifier") Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a29e067bd38946f752b0ef855f3dfff87e77bec7) Cc: stable@vger.kernel.org
2025-06-30amdkfd: MTYPE_UC for ext-coherent system memoryDavid Yat Sin
Set memory mtype to UC host memory when ext-coherent flag is set and memory is registered as a SVM allocation. Reviewed-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: David Yat Sin <David.YatSin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5d14fdab4778c29cfd39e62c3ce84d232b4a7d8c)
2025-05-22drm/amdkfd: Change svm_range_get_info return typeAndrey Vatoropin
Static analysis shows that pointer "svms" cannot be NULL because it points to the object "struct svm_range_list". Remove the extra NULL check. It is meaningless and harms the readability of the code. In the function svm_range_get_info() there is no possibility of failure. Therefore, the caller of the function svm_range_get_info() does not need a return value. Change the function svm_range_get_info() return type from "int" to "void". Since the function svm_range_get_info() has a return type of "void". The caller of the function svm_range_get_info() does not need a return value. Delete extra code. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Andrey Vatoropin <a.vatoropin@crpt.ru> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdkfd: Drop workaround for GC v9.4.3 revID 0Apurv Mishra
Remove workaround code for the early engineering samples GC v9.4.3 SOCs with revID 0 Reviewed-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13drm/amdgpu: Restore uncached behaviour on GFX12David Belanger
Always use MTYPE_UC if UNCACHED flag is specified. This makes kernarg region uncached and it restores usermode cache disable debug flag functionality. Do not set MTYPE_UC for COHERENT flag, on GFX12 coherence is handled by shader code. Signed-off-by: David Belanger <david.belanger@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-13drm/amdgpu: Fix the race condition for draining retry faultEmily Deng
Issue: In the scenario where svm_range_restore_pages is called, but svm->checkpoint_ts has not been set and the retry fault has not been drained, svm_range_unmap_from_cpu is triggered and calls svm_range_free. Meanwhile, svm_range_restore_pages continues execution and reaches svm_range_from_addr. This results in a "failed to find prange..." error, causing the page recovery to fail. How to fix: Move the timestamp check code under the protection of svm->lock. v2: Make sure all right locks are released before go out. v3: Directly goto out_unlock_svms, and return -EAGAIN. v4: Refine code. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-07drm/amdgpu: Fix missing drain retry fault the last entryEmily Deng
While the entry get in svm_range_unmap_from_cpu is the last entry, and the entry is page fault, it also need to be dropped. So for equal case, it also need to be dropped. v2: Only modify the svm_range_restore_pages. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Xiaogang Chen<xiaogang.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12drm/amdkfd: add a new flag to manage where VRAM allocations goAlex Deucher
On big and small APUs we send KFD VRAM allocations to GTT since the carve out is either non-existent or relatively small. However, if someone sets the carve out size to be relatively large, we may end up using GTT rather than VRAM. No change of logic with this patch, but it allows the driver to determine which logic to use based on the carve out size in the future. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12drm/amdkfd: Have kfd driver use same PASID values from graphic driverXiaogang Chen
Current kfd driver has its own PASID value for a kfd process and uses it to locate vm at interrupt handler or mapping between kfd process and vm. That design is not working when a physical gpu device has multiple spatial partitions, ex: adev in CPX mode. This patch has kfd driver use same pasid values that graphic driver generated which is per vm per pasid. These pasid values are passed to fw/hardware. We do not need change interrupt handler though more pasid values are used. Also, pasid values at log are replaced by user process pid; pasid values are not exposed to user. Users see their process pids that have meaning in user space. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd: update mtype flags for gfx 9.5.0Alex Sierra
Update mtype flags to meet gfx 9.5.0 requirements for remote GPU memory and system memory. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: Set proper MTYPE for GC 9.5.0Alex Sierra
GC 9.5.0 local memory MTYPE default should be set as RW. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd: define gc ip version local variableAlex Sierra
For better readability. Also leftover orphaned code. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-15drm/amdkfd: Accounting pdd vram_usage for svmPhilip Yang
Process device data pdd->vram_usage is read by rocm-smi via sysfs, this is currently missing the svm_bo usage accounting, so "rocm-smi --showpids" per process VRAM usage report is incorrect. Add pdd->vram_usage accounting when svm_bo allocation and release, change to atomic64_t type because it is updated outside process mutex now. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-07drm/amdkfd: Output migrate end event if migrate failedPhilip Yang
If page migration failed, also output migrate end event to match with migrate start event, with failure error_code added to the end of the migrate message macro. This will not break uAPI because application uses old message macro sscanf drop and ignore the error_code. Output GPU page fault restore end event if migration failed. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: James Zhu <James.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-06drm/amdgpu: Surface svm_default_granularity, a RW module parameterRamesh Errabolu
Enables users to update SVM's default granularity, used in buffer migration and handling of recoverable page faults. Param value is set in terms of log(numPages(buffer)), e.g. 9 for a 2 MIB buffer Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-29drm/amdkfd: Don't drain ih1 for APUYifan Zhang
ih1 is not initialized for APUs. Don't drain it or NULL pointer error will be triggered. Fixes: 6ef29715ac06 ("drm/amdkfd: Change kfd/svm page fault drain handling") Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-23drm/amdkfd: Change kfd/svm page fault drain handlingXiaogang Chen
When app unmap vm ranges(munmap) kfd/svm starts drain pending page fault and not handle any incoming pages fault of this process until a deferred work item got executed by default system wq. The time period of "not handle page fault" can be long and is unpredicable. That is advese to kfd performance on page faults recovery. This patch uses time stamp of incoming page fault to decide to drop or recover page fault. When app unmap vm ranges kfd records each gpu device's ih ring current time stamp. These time stamps are used at kfd page fault recovery routine. Any page fault happened on unmapped ranges after unmap events is application bug that accesses vm range after unmap. It is not driver work to cover that. By using time stamp of page fault do not need drain page faults at deferred work. So, the time period that kfd does not handle page faults is reduced and can be controlled. Signed-off-by: Xiaogang.Chen <Xiaogang.Chen@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-24drm/amdkfd: Validate user queue svm memory residencyPhilip Yang
Queue CWSR area maybe registered to GPU as svm memory, create queue to ensure svm mapped to GPU with KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag. Add queue_refcount to struct svm_range, to track queue CWSR area usage. Because unmap mmu notifier callback return value is ignored, if application unmap the CWSR area while queue is active, pr_warn message in dmesg log. To be safe, evict user queue. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05drm/amdkfd: remove dead code in the function svm_range_get_pte_flagsJesse Zhang
The varible uncached set false, the condition uncached cannot be true. So remove the dead code, mapping flags will set the flag AMDGPU_VM_MTYPE_UC in else. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-29drm/amdkfd: simplify APU VRAM handlingAlex Deucher
With commit 89773b85599a ("drm/amdkfd: Let VRAM allocations go to GTT domain on small APUs") big and small APU "VRAM" handling in KFD was unified. Since AMD_IS_APU is set for both big and small APUs, we can simplify the checks in the code. v2: clean up a few more places (Lang) Acked-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Lang Yu <Lang.Yu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-17drm/amdkfd: Add GFX1201 to svm_range_get_pte_flags functionSreekant Somasekharan
GFX1201 was missed in the commit below. Adding it in. Fixes: 628e1ace2379 ("drm/amdkfd: mark GFX12 system and peer GPU memory mappings as MTYPE_NC") Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-13drm/amd/amdkfd: Fix a resource leak in svm_range_validate_and_map()Ramesh Errabolu
Analysis of code by Coverity, a static code analyser, has identified a resource leak in the symbol hmm_range. This leak occurs when one of the prior steps before it is released encounters an error. Signed-off-by: Ramesh Errabolu <Ramesh.Errabolu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-13drm/amdkfd: Remove arbitrary timeout for hmm_range_faultPhilip Yang
On system with khugepaged enabled and user cases with THP buffer, the hmm_range_fault may takes > 15 seconds to return -EBUSY, the arbitrary timeout value is not accurate, cause memory allocation failure. Remove the arbitrary timeout value, return EAGAIN to application if hmm_range_fault return EBUSY, then userspace libdrm and Thunk will call ioctl again. Change EAGAIN to debug message as this is not error. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-08drm/amdkfd: Let VRAM allocations go to GTT domain on small APUsLang Yu
Small APUs(i.e., consumer, embedded products) usually have a small carveout device memory which can't satisfy most compute workloads memory allocation requirements. We can't even run a Basic MNIST Example with a default 512MB carveout. https://github.com/pytorch/examples/tree/main/mnist. Error Log: "torch.cuda.OutOfMemoryError: HIP out of memory. Tried to allocate 84.00 MiB. GPU 0 has a total capacity of 512.00 MiB of which 0 bytes is free. Of the allocated memory 103.83 MiB is allocated by PyTorch, and 22.17 MiB is reserved by PyTorch but unallocated" Though we can change BIOS settings to enlarge carveout size, which is inflexible and may bring complaint. On the other hand, the memory resource can't be effectively used between host and device. The solution is MI300A approach, i.e., let VRAM allocations go to GTT. Then device and host can flexibly and effectively share memory resource. v2: Report local_mem_size_private as 0. (Felix) Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-02drm/amdkfd: mark GFX12 system and peer GPU memory mappings as MTYPE_NCSreekant Somasekharan
Due to a HW bug, the system memory mappings and peer GPU mappings on GFX12 need to be marked as MTYPE_NC. Cc: Joe Greathouse <joseph.greathouse@amd.com> Cc: David Belanger <david.belanger@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-02drm/amd/amdkfd: Add GFX12 PTE flag to SVM get PTE functionSreekant Somasekharan
Add new GFX12 PTE flag AMDGPU_PTE_IS_PTE to svm_range_get_pte_flags function. This resolves the issues related to SVM enablement in GFX12. Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-02drm/amdgpu: Add gfx v9_4_4 ip blockHawking Zhang
Add gfx v9_4_4 ip block support Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-26drm/amdkfd: Add VRAM accounting for SVM migrationMukul Joshi
Do VRAM accounting when doing migrations to vram to make sure there is enough available VRAM and migrating to VRAM doesn't evict other possible non-unified memory BOs. If migrating to VRAM fails, driver can fall back to using system memory seamlessly. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-15drm/amdgpu: Auto-validate DMABuf imports in compute VMsFelix Kuehling
DMABuf imports in compute VMs are not wrapped in a kgd_mem object on the process_info->kfd_bo_list. There is no explicit KFD API call to validate them or add eviction fences to them. This patch automatically validates and fences dymanic DMABuf imports when they are added to a compute VM. Revalidation after evictions is handled in the VM code. v2: * Renamed amdgpu_vm_validate_evicted_bos to amdgpu_vm_validate * Eliminated evicted_user state, use evicted state for VM BOs and user BOs * Fixed and simplified amdgpu_vm_fence_imports, depends on reserved BOs * Moved dma_resv_reserve_fences for amdgpu_vm_fence_imports into amdgpu_vm_validate, outside the vm->status_lock * Added dummy version of amdgpu_amdkfd_bo_validate_and_fence for builds without KFD v4: Eliminate amdgpu_vm_fence_imports. It's not needed because the reservation with its fences is shared with the export, as long as all imports are from KFD, with the exports already reserved, validated and fenced by the KFD restore worker. v5: Reintroduced separate evicted_user state to simplify the state machine and CS error handling when amdgpu_vm_validate is called without a ticket. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-15drm/amdkfd: Fix 'node' NULL check in 'svm_range_get_range_boundaries()'Srinivasan Shanmugam
Range interval [start, last] is ordered by rb_tree, rb_prev, rb_next return value still needs NULL check, thus modified from "node" to "rb_node". Fixes the below: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2691 svm_range_get_range_boundaries() warn: can 'node' even be NULL? Suggested-by: Philip Yang <Philip.Yang@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-09drm/amdkfd: Fix lock dependency warning with srcuPhilip Yang
====================================================== WARNING: possible circular locking dependency detected 6.5.0-kfd-yangp #2289 Not tainted ------------------------------------------------------ kworker/0:2/996 is trying to acquire lock: (srcu){.+.+}-{0:0}, at: __synchronize_srcu+0x5/0x1a0 but task is already holding lock: ((work_completion)(&svms->deferred_list_work)){+.+.}-{0:0}, at: process_one_work+0x211/0x560 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #3 ((work_completion)(&svms->deferred_list_work)){+.+.}-{0:0}: __flush_work+0x88/0x4f0 svm_range_list_lock_and_flush_work+0x3d/0x110 [amdgpu] svm_range_set_attr+0xd6/0x14c0 [amdgpu] kfd_ioctl+0x1d1/0x630 [amdgpu] __x64_sys_ioctl+0x88/0xc0 -> #2 (&info->lock#2){+.+.}-{3:3}: __mutex_lock+0x99/0xc70 amdgpu_amdkfd_gpuvm_restore_process_bos+0x54/0x740 [amdgpu] restore_process_helper+0x22/0x80 [amdgpu] restore_process_worker+0x2d/0xa0 [amdgpu] process_one_work+0x29b/0x560 worker_thread+0x3d/0x3d0 -> #1 ((work_completion)(&(&process->restore_work)->work)){+.+.}-{0:0}: __flush_work+0x88/0x4f0 __cancel_work_timer+0x12c/0x1c0 kfd_process_notifier_release_internal+0x37/0x1f0 [amdgpu] __mmu_notifier_release+0xad/0x240 exit_mmap+0x6a/0x3a0 mmput+0x6a/0x120 do_exit+0x322/0xb90 do_group_exit+0x37/0xa0 __x64_sys_exit_group+0x18/0x20 do_syscall_64+0x38/0x80 -> #0 (srcu){.+.+}-{0:0}: __lock_acquire+0x1521/0x2510 lock_sync+0x5f/0x90 __synchronize_srcu+0x4f/0x1a0 __mmu_notifier_release+0x128/0x240 exit_mmap+0x6a/0x3a0 mmput+0x6a/0x120 svm_range_deferred_list_work+0x19f/0x350 [amdgpu] process_one_work+0x29b/0x560 worker_thread+0x3d/0x3d0 other info that might help us debug this: Chain exists of: srcu --> &info->lock#2 --> (work_completion)(&svms->deferred_list_work) Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock((work_completion)(&svms->deferred_list_work)); lock(&info->lock#2); lock((work_completion)(&svms->deferred_list_work)); sync(srcu); Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-09drm/amdkfd: Fix lock dependency warningFelix Kuehling
====================================================== WARNING: possible circular locking dependency detected 6.5.0-kfd-fkuehlin #276 Not tainted ------------------------------------------------------ kworker/8:2/2676 is trying to acquire lock: ffff9435aae95c88 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}, at: __flush_work+0x52/0x550 but task is already holding lock: ffff9435cd8e1720 (&svms->lock){+.+.}-{3:3}, at: svm_range_deferred_list_work+0xe8/0x340 [amdgpu] which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&svms->lock){+.+.}-{3:3}: __mutex_lock+0x97/0xd30 kfd_ioctl_alloc_memory_of_gpu+0x6d/0x3c0 [amdgpu] kfd_ioctl+0x1b2/0x5d0 [amdgpu] __x64_sys_ioctl+0x86/0xc0 do_syscall_64+0x39/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd -> #1 (&mm->mmap_lock){++++}-{3:3}: down_read+0x42/0x160 svm_range_evict_svm_bo_worker+0x8b/0x340 [amdgpu] process_one_work+0x27a/0x540 worker_thread+0x53/0x3e0 kthread+0xeb/0x120 ret_from_fork+0x31/0x50 ret_from_fork_asm+0x11/0x20 -> #0 ((work_completion)(&svm_bo->eviction_work)){+.+.}-{0:0}: __lock_acquire+0x1426/0x2200 lock_acquire+0xc1/0x2b0 __flush_work+0x80/0x550 __cancel_work_timer+0x109/0x190 svm_range_bo_release+0xdc/0x1c0 [amdgpu] svm_range_free+0x175/0x180 [amdgpu] svm_range_deferred_list_work+0x15d/0x340 [amdgpu] process_one_work+0x27a/0x540 worker_thread+0x53/0x3e0 kthread+0xeb/0x120 ret_from_fork+0x31/0x50 ret_from_fork_asm+0x11/0x20 other info that might help us debug this: Chain exists of: (work_completion)(&svm_bo->eviction_work) --> &mm->mmap_lock --> &svms->lock Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&svms->lock); lock(&mm->mmap_lock); lock(&svms->lock); lock((work_completion)(&svm_bo->eviction_work)); I believe this cannot really lead to a deadlock in practice, because svm_range_evict_svm_bo_worker only takes the mmap_read_lock if the BO refcount is non-0. That means it's impossible that svm_range_bo_release is running concurrently. However, there is no good way to annotate this. To avoid the problem, take a BO reference in svm_range_schedule_evict_svm_bo instead of in the worker. That way it's impossible for a BO to get freed while eviction work is pending and the cancel_work_sync call in svm_range_bo_release can be eliminated. v2: Use svm_bo_ref_unless_zero and explained why that's safe. Also removed redundant checks that are already done in amdkfd_fence_enable_signaling. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-01-09Merge tag 'amd-drm-next-6.8-2024-01-05' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.8-2024-01-05: amdgpu: - VRR fixes - PSR-SU fixes - SubVP fixes - DCN 3.5 fixes - Documentation updates - DMCUB fixes - DML2 fixes - UMC 12.0 updates - GPUVM fix - Misc code cleanups and whitespace cleanups - DP MST fix - Let KFD sync with GPUVM fences - GFX11 reset fix - SMU 13.0.6 fixes - VSC fix for DP/eDP - Navi12 display fix - RN/CZN system aperture fix - DCN 2.1 bandwidth validation fix - DCN INIT cleanup amdkfd: - SVM fixes - Revert TBA/TMA location change Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240105220522.4976-1-alexander.deucher@amd.com
2023-12-20Merge tag 'drm-msm-next-2023-12-15' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/msm into drm-next Updates for v6.8: Core: - Add support for SDM670, SM8650 - Handle the CFG interconnect to fix the obscure hangs / timeouts on register write - Kconfig fix for QMP dependency - DT schema fixes DPU: - Add support for SDM670, SM8650 - Enable SmartDMA on SM8350 and SM8450 - Correct UBWC settings for SC8280XP - Fix catalog settings for SC8180X - Actually make use of the version to switch between QSEED3/3LITE/4 scalers - Use devres-managed and drm-managed allocations where appropriate - misc other fixes - Enabled YUV writeback on SC7280, SM8250 - Enabled writeback on SM8350, SM8450 - CRC fix when encoder is selected as the input source - other misc fixes MDP4: - Use devres-managed and drm-managed allocations where appropriate - flush vblank event on CRTC disable MDP5: - Use devres-managed and drm-managed allocations where appropriate DP: - Add support for SM8650 - Enable PM runtime support - Merge msm-specific debugfs dir with the generic one - Described DisplayPort on SM8150 in DeviceTree bindings - Moved dp_display_get_next_bridge() to probe() DSI: - Add support for SM8650 - Enable PM runtime support GPU/GEM: - demote userspace triggerable warnings to debug - add GEM object metadata UAPI - move GPU devcoredumps to GPU device - fix hangcheck to skip retired submits - expose UBWC config to userspace - fix a680 chip-id - drm_exec conversion - drm/ci: remove rebase-merge directory (to unblock CI) [airlied: fix drm_exec/amd interaction] Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <robdclark@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGs9auYqmo-7NSd9FsbNBCDf7aBevd=4xkcF3A5G_OGvMQ@mail.gmail.com
2023-12-19drm/amdkfd: Use partial hmm page walk during buffer validation in SVMXiaogang Chen
SVM uses hmm page walk to valid buffer before map to gpu vm. After have partial migration/mapping do validation on same vm range as migration/map do instead of whole svm range that can be very large. This change is expected to improve svm code performance. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-15drm/amdkfd: svm range always mapped flag not working on APUPhilip Yang
On gfx943 APU there is no VRAM and page migration, queue CWSR area, svm range with always mapped flag, is not mapped to GPU correctly. This works fine if retry fault on CWSR area can be recovered, but could cause deadlock if there is another retry fault recover waiting for CWSR to finish. Fix this by mapping svm range with always mapped flag to GPU with ACCESS attribute if XNACK ON. There is side effect, because all GPUs have ACCESS attribute by default on new svm range with XNACK on, the CWSR area will be mapped to all GPUs after this change. This side effect will be fixed with Thunk change to set CWSR svm range with ACCESS_IN_PLACE attribute on the GPU that user queue is created. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-12-10drm/exec: Pass in initial # of objectsRob Clark
In cases where the # is known ahead of time, it is silly to do the table resize dance. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Christian König <christian.koenig@amd.com> Patchwork: https://patchwork.freedesktop.org/patch/568338/
2023-12-06drm/amdkfd: Use partial migrations/mapping for GPU/CPU page faults in SVMXiaogang Chen
This patch implements partial migration/mapping for gpu/cpu page faults in SVM according to migration granularity(default 2MB). A svm range may include pages from both system ram and vram of one gpu now. These chagnes are expected to improve migration performance and reduce mmu callback and TLB flush workloads. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-29drm/amdkfd: Run restore_workers on freezable WQsFelix Kuehling
Make restore workers freezable so we don't have to explicitly flush them in suspend and GPU reset code paths, and we don't accidentally try to restore BOs while the GPU is suspended. Not having to flush restore_work also helps avoid lock/fence dependencies in the GPU reset case where we're not allowed to wait for fences. A side effect of this is, that we can now have multiple concurrent threads trying to signal the same eviction fence. Rework eviction fence signaling and replacement to account for that. The GPU reset path can no longer rely on restore_process_worker to resume queues because evict/restore workers can run independently of it. Instead call a new restore_process_helper directly. This is an RFC and request for testing. v2: - Reworked eviction fence signaling - Introduced restore_process_helper v3: - Handle unsignaled eviction fences in restore_process_bos Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Tested-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-09drm/amdgpu: Change extended-scope MTYPE on GC 9.4.3David Yat Sin
Change local memory type to MTYPE_UC on revision id 0 Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-27drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systemsDavid Francis
On gfx943 APU, EXT_COHERENT should give MTYPE_CC for local and MTYPE_UC for nonlocal memory. On NUMA systems, local memory gets the local mtype, set by an override callback. If EXT_COHERENT is set, memory will be set as MTYPE_UC by default, with local memory MTYPE_CC. Add an option in the override function for this case, and add a check to ensure it is not used on UNCACHED memory. V2: Combined APU and NUMA code into one patch V3: Fixed a potential nullptr in amdgpu_vm_bo_update Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26Revert "drm/amdkfd: Use partial migrations in GPU page faults"Philip Yang
This reverts commit dc427a473e5d119232ddb27530920d9796cdea70. The change prevents migrating the entire range to VRAM because retry fault restore_pages map the remaining system memory range to GPUs. It will work correctly to submit together with partial mapping to GPU patch later. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26Revert "drm/amdkfd:remove unused code"Philip Yang
This reverts commit f9caf6cdd5cc1f4006fd7b6b113658c0b0159f23. Needed for the next revert patch. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdkfd: Address 'remap_list' not described in 'svm_range_add'Srinivasan Shanmugam
Fixes the below: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:2073: warning: Function parameter or member 'remap_list' not described in 'svm_range_add' Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26drm/amdkfd: Fix shift out-of-bounds issueJesse Zhang
[ 567.613292] shift exponent 255 is too large for 64-bit type 'long unsigned int' [ 567.614498] CPU: 5 PID: 238 Comm: kworker/5:1 Tainted: G OE 6.2.0-34-generic #34~22.04.1-Ubuntu [ 567.614502] Hardware name: AMD Splinter/Splinter-RPL, BIOS WS43927N_871 09/25/2023 [ 567.614504] Workqueue: events send_exception_work_handler [amdgpu] [ 567.614748] Call Trace: [ 567.614750] <TASK> [ 567.614753] dump_stack_lvl+0x48/0x70 [ 567.614761] dump_stack+0x10/0x20 [ 567.614763] __ubsan_handle_shift_out_of_bounds+0x156/0x310 [ 567.614769] ? srso_alias_return_thunk+0x5/0x7f [ 567.614773] ? update_sd_lb_stats.constprop.0+0xf2/0x3c0 [ 567.614780] svm_range_split_by_granularity.cold+0x2b/0x34 [amdgpu] [ 567.615047] ? srso_alias_return_thunk+0x5/0x7f [ 567.615052] svm_migrate_to_ram+0x185/0x4d0 [amdgpu] [ 567.615286] do_swap_page+0x7b6/0xa30 [ 567.615291] ? srso_alias_return_thunk+0x5/0x7f [ 567.615294] ? __free_pages+0x119/0x130 [ 567.615299] handle_pte_fault+0x227/0x280 [ 567.615303] __handle_mm_fault+0x3c0/0x720 [ 567.615311] handle_mm_fault+0x119/0x330 [ 567.615314] ? lock_mm_and_find_vma+0x44/0x250 [ 567.615318] do_user_addr_fault+0x1a9/0x640 [ 567.615323] exc_page_fault+0x81/0x1b0 [ 567.615328] asm_exc_page_fault+0x27/0x30 [ 567.615332] RIP: 0010:__get_user_8+0x1c/0x30 Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Suggested-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20drm/amdkfd: remap unaligned svm ranges that have splitAlex Sierra
Split SVM ranges that have been mapped into 2MB page table entries, require to be remap in case the split has happened in a non-aligned VA. [WHY]: This condition causes the 2MB page table entries be split into 4KB PTEs. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20drm/amdkfd:remove unused codeJesse Zhang
Function svm_range_split_by_grinity is not used, so it is removed. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Suggested-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19drm/amdkfd: clean up some inconsistent indentingJiapeng Chong
No functional modification involved. drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_svm.c:305 svm_range_free() warn: inconsistent indenting. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6804 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-18Merge tag 'amd-drm-next-6.7-2023-10-13' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.7-2023-10-13: amdgpu: - DC replay fixes - Misc code cleanups and spelling fixes - Documentation updates - RAS EEPROM Updates - FRU EEPROM Updates - IP discovery updates - SR-IOV fixes - RAS updates - DC PQ fixes - SMU 13.0.6 updates - GC 11.5 Support - NBIO 7.11 Support - GMC 11 Updates - Reset fixes - SMU 11.5 Updates - SMU 13.0 OD support - Use flexible arrays for bo list handling - W=1 Fixes - SubVP fixes - DPIA fixes - DCN 3.5 Support - Devcoredump fixes - VPE 6.1 support - VCN 4.0 Updates - S/G display fixes - DML fixes - DML2 Support - MST fixes - VRR fixes - Enable seamless boot in more cases - Enable content type property for HDMI - OLED fixes - Rework and clean up GPUVM TLB flushing - DC ODM fixes - DP 2.x fixes - AGP aperture fixes - SDMA firmware loading cleanups - Cyan Skillfish GPU clock counter fix - GC 11 GART fix - Cache GPU fault info for userspace queries - DC cursor check fixes - eDP fixes - DC FP handling fixes - Variable sized array fixes - SMU 13.0.x fixes - IB start and size alignment fixes for VCN - SMU 14 Support - Suspend and resume sequence rework - vkms fix amdkfd: - GC 11 fixes - GC 10 fixes - Doorbell fixes - CWSR fixes - SVM fixes - Clean up GC info enumeration - Rework memory limit handling - Coherent memory handling fixes - Use partial migrations in GPU faults - TLB flush fixes - DMA unmap fixes - GC 9.4.3 fixes - SQ interrupt fix - GTT mapping fix - GC 11.5 Support radeon: - Misc code cleanups - W=1 Fixes - Fix possible buffer overflow - Fix possible NULL pointer dereference UAPI: - Add EXT_COHERENT memory allocation flags. These allow for system scope atomics. Proposed userspace: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88 - Add support for new VPE engine. This is a memory to memory copy engine with advanced scaling, CSC, and color management features Proposed mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25713 - Add INFO IOCTL interface to query GPU faults Proposed Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238 Proposed libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/298 Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231013175758.1735031-1-alexander.deucher@amd.com
2023-10-05drm/amdkfd: Use partial migrations in GPU page faultsXiaogang Chen
This patch implements partial migration in gpu page fault according to migration granularity(default 2MB) and not split svm range in cpu page fault handling. A svm range may include pages from both system ram and vram of one gpu now. These chagnes are expected to improve migration performance and reduce mmu callback and TLB flush workloads. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>