summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
AgeCommit message (Collapse)Author
2025-07-16drm/amdgpu: rework gmc_v9_0_get_coherence_flags v2Christian König
Avoid using the mapping here. v2: use amdgpu_xgmi_same_hive() as suggested by Felix Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-07-04Merge tag 'amd-drm-next-6.17-2025-07-01' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.17-2025-07-01: amdgpu: - FAMS2 fixes - OLED fixes - Misc cleanups - AUX fixes - DMCUB updates - SR-IOV hibernation support - RAS updates - DP tunneling fixes - DML2 fixes - Backlight improvements - Suspend improvements - Use scaling for non-native modes on eDP - SDMA 4.4.x fixes - PCIe DPM fixes - SDMA 5.x fixes - Cleaner shader updates for GC 9.x - Remove fence slab - ISP genpd support - Parition handling rework - SDMA FW checks for userq support - Add missing firmware declaration - Fix leak in amdgpu_ctx_mgr_entity_fini() - Freesync fix - Ring reset refactoring - Legacy dpm verbosity changes amdkfd: - GWS fix - mtype fix for ext coherent system memory - MMU notifier fix - gfx7/8 fix radeon: - CS validation support for additional GL extensions - Bump driver version for new CS validation checks From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250701194707.32905-1-alexander.deucher@amd.com Signed-off-by: Dave Airlie <airlied@redhat.com>
2025-06-24drm/amdgpu: Convert init_mem_ranges into common helpersHawking Zhang
They can be shared across multiple products Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-24drm/amdgpu: Generalize is_multi_chiplet with a common helper v2Hawking Zhang
It is not necessary to be ip generation specific v2: rename the helper to is_multi_aid (Lijo) Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-24drm/amdgpu: Convert query_memory_partition into common helpersHawking Zhang
The query_memory_partition does not need to remain as soc specific callbacks. They can be shared across multiple products Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-24drm/amdgpu: Move MAX_MEM_RANGES to amdgpu_gmc.hHawking Zhang
This relocation allows MAX_MEM_RANGES to be shared across multiple products Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-18drm/amdgpu: enable pdb0 for hibernation on SRIOVSamuel Zhang
When switching to new GPU index after hibernation and then resume, VRAM offset of each VRAM BO will be changed, and the cached gpu addresses needed to updated. This is to enable pdb0 and switch to use pdb0-based virtual gpu address by default in amdgpu_bo_create_reserved(). since the virtual addresses do not change, this can avoid the need to update all cached gpu addresses all over the codebase. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Samuel Zhang <guoqing.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-06-17drm: amdgpu: Create amdgpu_vm_print_task_info()André Almeida
To avoid repetitive code in amdgpu, create a function that prints the content of struct amdgpu_task_info. Reviewed-by: Christian König <christian.koenig@amd.com> Link: https://lore.kernel.org/r/20250617124949.2151549-3-andrealmeid@igalia.com Signed-off-by: André Almeida <andrealmeid@igalia.com>
2025-05-13drm/amdgpu: set vram type for GC 9.5.0Tao Zhou
Set vram type so we can take different actions according to the type. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-05-13drm/amdgpu: implement get_retire_flip_bits for UMC v12Tao Zhou
The RAS bad page retire flip bits can be set per vram type, vram vendor and NPS mode. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-11drm/amdgpu: Use the right function for hdp flushLijo Lazar
There are a few prechecks made before HDP flush like a flush is not required on APU bare metal. Using hdp callback directly bypasses those checks. Use amdgpu_device_flush_hdp which takes care of prechecks. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdkfd: Drop workaround for GC v9.4.3 revID 0Apurv Mishra
Remove workaround code for the early engineering samples GC v9.4.3 SOCs with revID 0 Reviewed-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-27drm/amdgpu: Fix logic to fetch supported NPS modesLijo Lazar
Correct the logic to find supported NPS modes from firmware. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reported-by: Ava Zhang <niandong.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Fixes: 30eb41f5d1a7 ("drm/amdgpu: Use firmware supported NPS modes") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-25drm/amdgpu: update the handle ptr in is_idleSunil Khatri
Update the *handle to amdgpu_ip_block ptr for all functions pointers of is_idle. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19drm/amdgpu: update the handle ptr in get_clockgating_stateSunil Khatri
Update the *handle to amdgpu_ip_block ptr for all functions pointers of get_clockgating_state. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-19drm/amdgpu: Use firmware supported NPS modesLijo Lazar
If firmware supported NPS modes are available through CAP register, use those values for supported NPS modes. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12drm/amdgpu: Add support for umc 12.5.0/mmhub 1.8.1Mangesh Gadre
Add new IP version support Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12drm/amdgpu: Clean up IP version checks in gmcv9.0Lijo Lazar
Clean up some IP version checks in gmcv9.0 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-02-12drm/amdgpu: Use active umc info from discoveryLijo Lazar
There could be configs where some UMC instances are harvested. This information is obtained through discovery data and populated in umc.active_mask. Avoid reassigning this as AID mask, instead use the mask directly while iterating through umc instances. This is to avoid accesses to harvested UMC instances. v2: fix warning (Alex) Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu: Optimize gfx v9 GPU page fault handlingPhilip Yang
After GPU page fault, there are lots of page fault interrupts generated at short period even with CAM filter enabled because the fault address is different. Each page fault copy to KFD ih fifo to send event to user space by KFD interrupt worker, this could cause KFD ih fifo overflow while other processes generate events at same time. KFD process is aborted after GPU page fault, we only need one GPU page fault interrupt sent to KFD ih fifo to send memory exception event to user space. Incease KFD ih fifo size to 2 times of IH primary ring size, to handle the burst events case. This patch handle the gfx v9 path, cover retry on/off and CAM filter on/off cases. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: Init mmhub v1_8_1 ras funcHawking Zhang
reuse mmhub v1_8 ras functuion Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: Enable xgmi for gfx v9_5_0Shiwu Zhang
Enable xgmi for gfx v9_5_0 Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd: update mtype flags for gfx 9.5.0Alex Sierra
Update mtype flags to meet gfx 9.5.0 requirements for remote GPU memory and system memory. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: add initial support for gfx950Le Ma
add gfx950 basic support Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amd: define gc ip version local variableAlex Sierra
For better readability. Also leftover orphaned code. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: pass ip_block in set_clockgating_stateBoyuan Zhang
Pass ip_block instead of adev in set_clockgating_state() callback functions. Modify set_clockgating_state()for all correspoding ip blocks. v2: remove all changes for is_idle(), remove type casting Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-10drm/amdgpu: pass ip_block in set_powergating_stateBoyuan Zhang
Pass ip_block instead of adev in set_powergating_state callback function. Modify set_powergating_state ip functions for all correspoding ip blocks. v2: fix a ip block index error. v3: remove type casting Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-08drm/amdgpu: fix check in gmc_v9_0_get_vm_pte()Christian König
The coherency flags can only be determined when the BO is locked and that in turn is only guaranteed when the mapping is validated. Fix the check, move the resource check into the function and add an assert that the BO is locked. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: d1a372af1c3d ("drm/amdgpu: Set MTYPE in PTE based on BO flags") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-24amdgpu: Don't print L2 status if there's nothing to printKent Russell
If a 2nd fault comes in before the 1st is handled, the 1st fault will clear out the FAULT STATUS registers before the 2nd fault is handled. Thus we get a lot of zeroes. If status=0, just skip the L2 fault status information, to avoid confusion of why some VM fault status prints in dmesg are all zeroes. Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-15drm/amdgpu: Fetch NPS mode for GCv9.4.3 VFsLijo Lazar
Use the memory ranges published in discovery table to deduce NPS mode of GC v9.4.3 VFs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Vignesh Chander <Vignesh.Chander@amd.com> Tested-by: Vignesh Chander <Vignesh.Chander@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-15drm/amdgpu: Add NPS switch support for GC 9.4.3Lijo Lazar
Add dynamic NPS switch support for GC 9.4.3 variants. Only GC v9.4.3 and GC v9.4.4 currently support this. NPS switch is only supported if an SOC supports multiple NPS modes. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-07drm/amdgpu: update the handle ptr in hw_finiSunil Khatri
Update the *handle to amdgpu_ip_block ptr for all functions pointers of hw_fini. Also update the ip_block ptr where ever needed as there were cyclic dependency of hw_fini on suspend and some followed clean up. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-07drm/amdgpu: update the handle ptr in hw_initSunil Khatri
Update the *handle to amdgpu_ip_block ptr for all functions pointers of hw_init. Also update the ip_block ptr where ever needed as there were cyclic dependency of hw_init on resume. v2: squash in isp fix Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-07drm/amdgpu: update the handle ptr in resumeSunil Khatri
Update the *handle to amdgpu_ip_block ptr for all functions pointers of resume. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-07drm/amdgpu: update the handle ptr in suspendSunil Khatri
Update the *handle to amdgpu_ip_block ptr for all functions pointers of suspend. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-07drm/amdgpu: update the handle ptr in wait_for_idleSunil Khatri
Update the *handle to amdgpu_ip_block ptr for all functions pointers of wait_for_idle. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-01drm/amdgpu: update the handle ptr in soft_resetSunil Khatri
Update the *handle to amdgpu_ip_block ptr for all functions pointers of soft_reset. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-01drm/amdgpu: update the handle ptr in sw_finiSunil Khatri
update the *handle to amdgpu_ip_block ptr for all functions pointers of sw_fini. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-01drm/amdgpu: update the handle ptr in sw_initSunil Khatri
update the *handle to amdgpu_ip_block ptr for all functions pointers of sw_init. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-01drm/amdgpu: update the handle ptr in late_initSunil Khatri
Update the ptr handle to amdgpu_ip_block ptr in all the functions of late_init function ptr. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-10-01drm/amdgpu: update the handle ptr in early_initSunil Khatri
update the handle ptr to amdgpu_ip_block ptr for all functions pointers on early_init. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-09-26drm/amdgpu: Refactor XGMI reset on init handlingLijo Lazar
Use XGMI hive information to rely on resetting XGMI devices on initialization rather than using mgpu structure. mgpu structure may have other devices as well. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <feifxu@amd.com> Acked-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-08-23drm/amdkfd: Change kfd/svm page fault drain handlingXiaogang Chen
When app unmap vm ranges(munmap) kfd/svm starts drain pending page fault and not handle any incoming pages fault of this process until a deferred work item got executed by default system wq. The time period of "not handle page fault" can be long and is unpredicable. That is advese to kfd performance on page faults recovery. This patch uses time stamp of incoming page fault to decide to drop or recover page fault. When app unmap vm ranges kfd records each gpu device's ih ring current time stamp. These time stamps are used at kfd page fault recovery routine. Any page fault happened on unmapped ranges after unmap events is application bug that accesses vm range after unmap. It is not driver work to cover that. By using time stamp of page fault do not need drain page faults at deferred work. So, the time period that kfd does not handle page faults is reduced and can be controlled. Signed-off-by: Xiaogang.Chen <Xiaogang.Chen@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-27drm/amdgpu: Fix register access violationHawking Zhang
fault_status is read only register. fault_cntl is not accessible from guest environment. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-06-05drm/amdgpu: Update the impelmentation of AMDGPU_PTE_MTYPE_VG10Shane Xiao
This patch changes the implementation of AMDGPU_PTE_MTYPE_VG10, clear the bits before setting the new one. Suggested-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: longlyao <Longlong.Yao@amd.com> Signed-off-by: Shane Xiao <shane.xiao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-23drm/amd/amdgpu: fix the inst passed to amdgpu_virt_rlcg_reg_rwVictor Zhao
the inst passed to amdgpu_virt_rlcg_reg_rw should be physical instance. Fix the miss matched code. Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-17drm/amdgpu: Use NPS ranges from discovery tableLijo Lazar
Add GMC API to fetch NPS range information from discovery table. Use NPS range information in GMC 9.4.3 SOCs when available, otherwise fallback to software method. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-17drm/amdgpu: Fix memory range calculationLijo Lazar
Consider the 16M reserved region also before range calculation for GMC 9.4.3 SOCs. Fixes: a433f1f59484 ("drm/amdgpu: Initialize memory ranges for GC 9.4.3") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-05-02drm/amdgpu: Add gfx v9_4_4 ip blockHawking Zhang
Add gfx v9_4_4 ip block support Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-04-09drm/amdgpu: retire UMC v12 mca_addr_to_paTao Zhou
RAS TA will handle it, the function is useless. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>