summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)Author
2023-06-09drm/amdgpu: Use new atomfirmware init for GC 9.4.3Lijo Lazar
Use the new atomfirmware initialization logic for GC 9.4.3 based ASICs also. ASIC init logic doesn't consider boot clocks during init. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: Fix CP_HYP_XCP_CTL register programming in CPX modeMukul Joshi
Currently, in CPX mode, the CP_HYP_XCP_CTL register is programmed incorrectly with the number of XCCs in the partition. As a result, HIQ doesn't work in CPX mode. Fix this by programming the correct number of XCCs in a partition, which is 1, in CPX mode. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdkfd: Update SDMA queue management for GFX9.4.3Mukul Joshi
This patch updates SDMA queue management for multi XCC in GFX9.4.3. - Allocate/deallocate SDMA queues from the correct SDMA engines based on the partition mode. - Updates the kgd2kfd interface to fetch the correct SDMA register addresses. - It also fixes dumping correct SDMA queue info in debugfs. v2: squash in fix "drm/amdkfd: Fix XGMI SDMA user-mode queue allocation" Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: Fix VM fault reporting on XCC1Mukul Joshi
Fix VM fault reporting and clear VM fault register for XCC1. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: Add XCC inst to PASID TLB flushingMukul Joshi
Add XCC instance to select the correct KIQ ring when flushing TLBs on a multi-XCC setup. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdkfd: Add XCC instance to kgd2kfd interface (v3)Mukul Joshi
Gfx 9 starts to have multiple XCC instances in one device. Add instance parameter to kgd2kfd functions where XCC instance was hard coded as 0. Also, update code to pass the correct instance number when running on a multi-XCC setup. v2: introduce the XCC instance to gfx v11 (Morris) v3: rebase (Alex) Signed-off-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Morris Zhang <Shiwu.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdkfd: Update MQD management on multi XCC setupMukul Joshi
Update MQD management for both HIQ and user-mode compute queues on a multi XCC setup. MQDs needs to be allocated, initialized, loaded and destroyed for each XCC in the KFD node. v2: squash in fix "drm/amdkfd: Fix SDMA+HIQ HQD allocation on GFX9.4.3" Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Amber Lin <Amber.Lin@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdkfd: Introduce kfd_node struct (v5)Mukul Joshi
Introduce a new structure, kfd_node, which will now represent a compute node. kfd_node is carved out of kfd_dev structure. kfd_dev struct now will become the parent of kfd_node, and will store common resources such as doorbells, GTT sub-alloctor etc. kfd_node struct will store all resources specific to a compute node, such as device queue manager, interrupt handling etc. This is the first step in adding compute partition support in KFD. v2: introduce kfd_node struct to gc v11 (Hawking) v3: make reference to kfd_dev struct through kfd_node (Morris) v4: use kfd_node instead for kfd isr/mqd functions (Morris) v5: rebase (Alex) Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Morris Zhang <Shiwu.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: Add mode2 reset logic for v13.0.6Lijo Lazar
Mode2 reset for v13.0.6 has similar workflow as v13.0.2 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: Add some XCC programmingLijo Lazar
Add additional XCC programming sequences. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: add node_id to physical id conversion in EOP handlerLe Ma
A new field nodeid in interrupt cookie indicates the node ID. Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Shiwu Zhang <shiwu.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: enable the ring and IB test for slave kcqShiwu Zhang
With the mec FW update to utilize the mqd base set by driver for kcq mapping, slave kcq ring test and IB test can be re-enabled. Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: support gc v9_4_3 ring_test running on all xccHawking Zhang
Each xcc has its own sratch_reg offset Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: fix vcn doorbell range settingJames Zhu
Should use vcn_ring0_1 instead of doorbell index to set nbio doorbell range. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/jpeg: enable jpeg doorbell for jpeg4.0.3James Zhu
Enable jpeg doorbell for jpeg4.0.3. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/vcn: enable vcn doorbell for vcn4.0.3James Zhu
Enable vcn doorbell for vcn4.0.3. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/nbio: update vcn doorbell rangeJames Zhu
VCN4.0.3 used up to 16 doorbells per partition. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/jpeg: add multiple jpeg rings support for vcn4_0_3James Zhu
Add multiple jpeg rings support for vcn4_0_3 Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/jpeg: add multiple jpeg rings supportJames Zhu
Add multiple jpeg rings support. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/vcn: enable vcn DPG mode for VCN4_0_3James Zhu
Enable vcn DPG mode for VCN4_0_3. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/vcn: enable vcn pg for VCN4_0_3James Zhu
Enable vcn pg for VCN4_0_3. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/vcn: enable vcn cg for VCN4_0_3James Zhu
Enable vcn cg for VCN4_0_3. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/jpeg: enable jpeg pg for VCN4_0_3James Zhu
Enable jpeg pg for VCN4_0_3. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/jpeg: enable jpeg cg for VCN4_0_3James Zhu
Enable jpeg cg for VCN4_0_3. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/vcn: add vcn support for VCN4_0_3James Zhu
Add vcn support for VCN4_0_3. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/jpeg: add jpeg support for VCN4_0_3James Zhu
Add jpeg support for VCN4_0_3. v2: squash in delayed work typo fix (Alex) Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: add VCN4_0_3 firmwareJames Zhu
Add VCN4_0_3 firmware. v2: fix fw name (Alex) Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/: add more macro to support offset variantJames Zhu
Add more macro to support offset variant and simplify macro SOC15_WAIT_ON_RREG. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: Use the correct API to read registerLijo Lazar
Use SOC15 API so that the register offset is calculated correctly. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: Add kgd2kfd for GC 9.4.3Amber Lin
New GC (v9.4.3) and ATHUB (v1.8.0) versions are used. Add kgd_gfx_v9_4_3_* functions if registers in use of kgd_gfx_v9_* functions are changed or have different offset. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: alloc vm inv engines for every vmhubShiwu Zhang
There are AMDGPU_MAX_VMHUBS of vmhub in maximum and need to init the vm_inv_engs for all of them. In this way, the below error can be ruled out. [ 217.317752] amdgpu 0000:02:00.0: amdgpu: no VM inv eng for ring sdma0 Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Christian Koenig <Christian.Koenig@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: override partition mode through module parameterShiwu Zhang
Add a module parameter to override the partition mode. Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: make the WREG32_SOC15_xx macro to support multi GCShiwu Zhang
To write regs on different GCDs, use the inst index. Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com> Reviewed-by: Le Ma <Le.Ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: add sysfs node for compute partition modeLe Ma
Add current/available compute partitin mode sysfs node. v2: make the sysfs node as IP independent one in amdgpu_gfx.c Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: assign different AMDGPU_GFXHUB for rings on each xccLe Ma
Pass the xcc_id to AMDGPU_GFXHUB(x) Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: init vmhubs bitmask for GC 9.4.3Le Ma
Each XCD owns one GFXHUB. v2: switch to the new VMHUB layout Signed-off-by: Le Ma <le.ma@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: add bitmask to iterate vmhubsLe Ma
As the layout of VMHUB definition has been changed to cover multiple XCD/AID case, the original num_vmhubs is not appropriate to do vmhub iteration any more. Drop num_vmhubs and introduce vmhubs_mask instead. v2: switch to the new VMHUB layout v3: use DECLARE_BITMAP to define vmhubs_mask Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: assign register address for vmhub object on each XCDLe Ma
Each XCD has its own gfxhub. v2: switch to the new VMHUB layout v3: fix mistake Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: introduce vmhub definition for multi-partition cases (v3)Hawking Zhang
v1: Each partition has its own gfxhub or mmhub. adjust the num of MAX_VMHUBS and the GFXHUB/MMHUB layout (Le) v2: re-design the AMDGPU_GFXHUB/AMDGPU_MMHUB layout (Le) v3: apply the gfxhub/mmhub layout to new IPs (Hawking) v4: fix up gmc11 (Alex) v5: rebase (Alex) Signed-off-by: Le Ma <le.ma@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: skip disabling fence driver src_irqs when device is unpluggedGuchun Chen
When performing device unbind or halt, we have disabled all irqs at the very begining like amdgpu_pci_remove or amdgpu_device_halt. So amdgpu_irq_put for irqs stored in fence driver should not be called any more, otherwise, below calltrace will arrive. [ 139.114088] WARNING: CPU: 2 PID: 1550 at drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c:616 amdgpu_irq_put+0xf6/0x110 [amdgpu] [ 139.114655] Call Trace: [ 139.114655] <TASK> [ 139.114657] amdgpu_fence_driver_hw_fini+0x93/0x130 [amdgpu] [ 139.114836] amdgpu_device_fini_hw+0xb6/0x350 [amdgpu] [ 139.114955] amdgpu_driver_unload_kms+0x51/0x70 [amdgpu] [ 139.115075] amdgpu_pci_remove+0x63/0x160 [amdgpu] [ 139.115193] ? __pm_runtime_resume+0x64/0x90 [ 139.115195] pci_device_remove+0x3a/0xb0 [ 139.115197] device_remove+0x43/0x70 [ 139.115198] device_release_driver_internal+0xbd/0x140 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: improve wait logic at fence pollingAlex Sierra
Accomplish this by reading the seq number right away instead of sleep for 5us. There are certain cases where the fence is ready almost immediately. Sleep number granularity was also reduced as the majority of the kiq tlb flush takes between 2us to 6us. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/gmc11: implement get_vbios_fb_size()Alex Deucher
Implement get_vbios_fb_size() so we can properly reserve the vbios splash screen to avoid potential artifacts on the screen during the transition from the pre-OS console to the OS console. Acked-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: Differentiate between Raven2 and Raven/Picasso according to ↵Jesse Zhang
revision id Due to the raven2 and raven/picasso maybe have the same GC_HWIP version. So differentiate them by revision id. Signed-off-by: shanshengwang <shansheng.wang@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: change gfx 11.0.4 external_id rangeYifan Zhang
gfx 11.0.4 range starts from 0x80. Fixes: 311d52367d0a ("drm/amdgpu: add soc21 common ip block support for GC 11.0.4") Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reported-by: Yogesh Mohan Marimuthu <Yogesh.Mohanmarimuthu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amd/amdgpu: Fix warnings in amdgpu _object, _ring.cSrinivasan Shanmugam
Fix below warnings reported by checkpatch: WARNING: Prefer 'unsigned int' to bare use of 'unsigned' WARNING: static const char * array should probably be static const char * const WARNING: space prohibited between function name and open parenthesis '(' WARNING: braces {} are not necessary for single statement blocks WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/gfx11: Adjust gfxoff before powergating on gfx11 as wellGuilherme G. Piccoli
(Bas: speculative change to mirror gfx10/gfx9) Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/gfx10: Disable gfxoff before disabling powergating.Bas Nieuwenhuizen
Otherwise we get a full system lock (looks like a FW mess). Copied the order from the GFX9 powergating code. Fixes: 366468ff6c34 ("drm/amdgpu: Allow GfxOff on Vangogh as default") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2545 Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu: release correct lock in amdgpu_gfx_enable_kgq()Dan Carpenter
This function was releasing the incorrect lock on the error path. Reported-by: kernel test robot <lkp@intel.com> Fixes: 1156e1a60f02 ("drm/amdgpu: add [en/dis]able_kgq() functions") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/jpeg: Remove harvest checking for JPEG3Saleemkhan Jamadar
Register CC_UVD_HARVESTING is obsolete for JPEG 3.1.2 Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdgpu/gfx: disable gfx9 cp_ecc_error_irq only when enabling legacy gfx rasGuchun Chen
gfx9 cp_ecc_error_irq is only enabled when legacy gfx ras is assert. So in gfx_v9_0_hw_fini, interrupt disablement for cp_ecc_error_irq should be executed under such condition, otherwise, an amdgpu_irq_put calltrace will occur. [ 7283.170322] RIP: 0010:amdgpu_irq_put+0x45/0x70 [amdgpu] [ 7283.170964] RSP: 0018:ffff9a5fc3967d00 EFLAGS: 00010246 [ 7283.170967] RAX: ffff98d88afd3040 RBX: ffff98d89da20000 RCX: 0000000000000000 [ 7283.170969] RDX: 0000000000000000 RSI: ffff98d89da2bef8 RDI: ffff98d89da20000 [ 7283.170971] RBP: ffff98d89da20000 R08: ffff98d89da2ca18 R09: 0000000000000006 [ 7283.170973] R10: ffffd5764243c008 R11: 0000000000000000 R12: 0000000000001050 [ 7283.170975] R13: ffff98d89da38978 R14: ffffffff999ae15a R15: ffff98d880130105 [ 7283.170978] FS: 0000000000000000(0000) GS:ffff98d996f00000(0000) knlGS:0000000000000000 [ 7283.170981] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 7283.170983] CR2: 00000000f7a9d178 CR3: 00000001c42ea000 CR4: 00000000003506e0 [ 7283.170986] Call Trace: [ 7283.170988] <TASK> [ 7283.170989] gfx_v9_0_hw_fini+0x1c/0x6d0 [amdgpu] [ 7283.171655] amdgpu_device_ip_suspend_phase2+0x101/0x1a0 [amdgpu] [ 7283.172245] amdgpu_device_suspend+0x103/0x180 [amdgpu] [ 7283.172823] amdgpu_pmops_freeze+0x21/0x60 [amdgpu] [ 7283.173412] pci_pm_freeze+0x54/0xc0 [ 7283.173419] ? __pfx_pci_pm_freeze+0x10/0x10 [ 7283.173425] dpm_run_callback+0x98/0x200 [ 7283.173430] __device_suspend+0x164/0x5f0 v2: drop gfx11 as it's fixed in a different solution by retiring cp_ecc_irq funcs(Hawking) Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522 Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>