summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)Author
2025-04-08drm/amdgpu: Reset RAS table if header is invalidLijo Lazar
If a valid header is not found during RAS eeprom init, consider it as new and reset RAS table info. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu: add loop bits for NPS2 page retirementTao Zhou
Support NPS2 RAS. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amd/amdgpu: decouple ASPM with pcie dpmKenneth Feng
ASPM doesn't need to be disabled if pcie dpm is disabled. So ASPM can be independantly enabled. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08amd/amdgpu: Init vcn hardware per instance for vcn 4.0.3Ruili Ji
Add interface for hardware init by vcn instance. v2: fix code format Reviewed-by: Sonny Jiang <sonny.jiang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Ruili Ji <ruiliji2@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu: Disable ACA on VFsVictor Skvortsov
VFs query RAS error counts directly from host with AMDGPU_RAS_VIRT_ERROR_COUNT_QUERY. When ACA is enabled, an unusable aca_sysfs is created rather than amdgpu_ras_sysfs_create() Likewise, VFs depend on host support to query CPERs, rather than ACA component. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: Zhigang Luo <Zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu: use "irq" in place of "interrupt" in DCE6/8 as in DCE10/11Alexandre Demers
"interrupt" becomes "irq" in: dce_vX_0_set_hpd_interrupt_state() dce_vX_0_set_crtc_interrupt_state() dce_vX_0_set_pageflip_interrupt_state() It is easier when going through the code to just change the DCE number in the functions' name to find and compare them across DCE versions. Also, it standardizes function mapping inside a given structure where .set and .process are both set to functions with a "_irq" suffix. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu: fix typos in DCEsAlexandre Demers
In DCE6, DCE8, DCE10, DCE11, "hdp" is replaced by "hpd" and replace "type" by "hpd" for a uniform parameter naming usage across DCEs. In link_factory.c, there is a missing "p" to "types" Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu/mes12: optimize MES pipe FW version fetchingAlex Deucher
Don't fetch it again if we already have it. It seems the registers don't reliably have the value at resume in some cases. Fixes: 785f0f9fe742 ("drm/amdgpu: Add mes v12_0 ip block support (v4)") Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu: cancel gfx idle work in device suspend for s0ixAlex Deucher
This is normally handled in the gfx IP suspend callbacks, but for S0ix, those are skipped because we don't want to touch gfx. So handle it in device suspend. Fixes: b9467983b774 ("drm/amdgpu: add dynamic workload profile switching for gfx10") Fixes: 963537ca2325 ("drm/amdgpu: add dynamic workload profile switching for gfx11") Fixes: 5f95a1549555 ("drm/amdgpu: add dynamic workload profile switching for gfx12") Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu/gfx12: dump full CP packet header FIFOsAlex Deucher
In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu/gfx11: dump full CP packet header FIFOsAlex Deucher
In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-08drm/amdgpu/gfx10: dump full CP packet header FIFOsAlex Deucher
In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx9.4.3: dump full CP packet header FIFOsAlex Deucher
In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx9: dump full CP packet header FIFOsAlex Deucher
In dev core dump, dump the full header fifo for each queue. Each FIFO has 8 entries. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: Fix CPER error handling on VFsVictor Skvortsov
CPER read will loop infinitely if an error is encountered and the more bit is set. Add error checks to break upon failure. v2: added function pointer checks Suggested-by: Tony Yi <Tony.Yi@amd.com> Signed-off-by: Victor Skvortsov <Victor.Skvortsov@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: Fix the comment to avoid warningSunil Khatri
Fix the below comment warning drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c:541: warning: Function parameter or struct member 'adev' not described in 'amdgpu_sdma_register_on_reset_callbacks' Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: Fix xgmi v6.4.1 link status reportingLijo Lazar
Use the right register offsets for getting link status. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: Add basic validation for RAS headerLijo Lazar
If RAS header read from EEPROM is corrupted, it could result in trying to allocate huge memory for reading the records. Add some validation to header fields. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdkfd: Drop workaround for GC v9.4.3 revID 0Apurv Mishra
Remove workaround code for the early engineering samples GC v9.4.3 SOCs with revID 0 Reviewed-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: huge sid.h cleanup, drop substituted defines.Alexandre Demers
Now that we are using the proper defines, cleanup useless old "substituted" defines. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: move si.c away from sid.hAlexandre Demers
Replace defines by the ones added earlier to GFX6, SMU6 and DCE6 Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: enable FW workaround for VCN 4_0_5Boyuan Zhang
Enabling VCN FW workaround for drm key injection through shared memory for vcn 4_0_5 Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: Add indirect L1_TLB_CNTL reg programming for VFsVictor Skvortsov
VFs on some IP versions are unable to access this register directly. This register must be programmed before PSP ring is setup, so use PSP VF mailbox directly. PSP will broadcast the register value to all VF assigned instances. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: Zhigang Luo <Zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx12: Implement the gfx12 kgq pipe resetPrike Liang
Implement the GFX12 kgq pipe reset, and temporarily disable the GFX12 pipe reset until the CPFW fully support it. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx11: Implement the GFX11 KCQ pipe resetPrike Liang
Implement the GFX11 compute pipe reset. As the GFX11 CPFW still hasn't fully supported pipe reset yet, therefore disable the KCQ pipe reset temporarily. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx11: Implement the GFX11 KGQ pipe resetPrike Liang
Implement the kernel graphics queue pipe reset,and the driver will fallback to pipe reset when the queue reset fails. However, the ME FW hasn't fully supported pipe reset yet so disable the KGQ pipe reset temporarily. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx11: fix CSIB handlingAlex Deucher
We shouldn't return after the last section. We need to update the rest of the CSIB. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx10: fix CSIB handlingAlex Deucher
We shouldn't return after the last section. We need to update the rest of the CSIB. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx9: fix CSIB handlingAlex Deucher
We shouldn't return after the last section. We need to update the rest of the CSIB. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx8: fix CSIB handlingAlex Deucher
We shouldn't return after the last section. We need to update the rest of the CSIB. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx7: fix CSIB handlingAlex Deucher
We shouldn't return after the last section. We need to update the rest of the CSIB. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx6: fix CSIB handlingAlex Deucher
We shouldn't return after the last section. We need to update the rest of the CSIB. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx: assign the actual me0 queues per pipeAlex Deucher
Set the actual number of queues per pipe for ME0 (gfx). This way we will dump all of the queues properly in dev core dumps. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx: decouple the number of kgqs from the hwAlex Deucher
The driver currently sets up one kgq per pipe. As such adev->gfx.me.num_queue_per_pipe is hardcoded to 1 everywhere. This is fine for kernel queues, but when we enable user queues we need to know that actual number of queues per pipe. Decouple the kgq setup from the actual hardware count. For dev core dumps and user queues, we want to know the actual number of queues per pipe. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx: make amdgpu_gfx_me_queue_to_bit() staticAlex Deucher
It's not used outside of amdgpu_gfx.c. Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu/gfx10: Add Cleaner Shader Support for GFX10.3.x GPUsSrinivasan Shanmugam
Enable the cleaner shader for other GFX10.3.x series of GPUs to provide data isolation between GPU workloads. The cleaner shader is responsible for clearing the Local Data Store (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which helps prevent data leakage and ensures accurate computation results. This update extends cleaner shader support to GFX10.3.x GPUs, previously available for GFX10.3.0. It enhances security by clearing GPU memory between processes and maintains a consistent GPU state across KGD and KFD workloads. Cc: Mario Sopena-Novales <mario.novales@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: drop some dead codeAlex Deucher
Drop the cgs smu firmware code for SI, it's not used. The smu firmware fetching for SI is done in si_dpm.c. Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: continue cleaning up sid.h and si_enums.hAlexandre Demers
Remove more duplicated defines and move some in sid.h for coherence with CIK. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amd/amdgpu: Fix typoAnanta Srikar
Fixes a typo in the word "version" in an error message. Signed-off-by: Ananta Srikar <srikarananta01@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: Replace deprecated function strcpy() with strscpy()Andres Urian Florez
Instead of using the strcpy() deprecated function to populate the fw_name, use the strscpy() function Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy Signed-off-by: Andres Urian Florez <andres.emb.sys@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: add rebar parameterAlex Deucher
Add a new parameter to disable BAR resizing. Note that this only disables the driver from attempting to resize the BAR, The BIOS may have resized the BAR at boot. Some teams have found this useful in debugging P2P DMA issues on systems where the available MMIO space did not allow for all of the GPUs present to resize their BARs. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: cleanup DCE6 a bit moreAlexandre Demers
Use shifts already available in DCE6's defines, masks and shifts. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: keep removing sid.h dependency from si_dma.cAlexandre Demers
Move and rename DMA_SEM_INCOMPLETE_TIMER_CNTL and DMA_SEM_WAIT_FAIL_TIMER_CNTL in oss_1_0_d.h Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: move si_dma.c away from sid.h and si_enums.hAlexandre Demers
Replace defines for the ones in oss_1_0_d.h and oss_1_0_sh_mask.h Taking the opportunity to add some comments taken from cik_sdma.c Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: make GFX6 easier to readAlexandre Demers
Just fix the style and add a comment for reading easiness Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: move DCE6 away from sid.h and si_enums.h definesAlexandre Demers
This cleans up DCE6. I added some minor tweaks taken from CIK to exit early v2: minor fixes (Alex) Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: use GRPH_SECONDARY_SURFACE_ADDRESS_MASK with ↵Alexandre Demers
GRPH_SECONDARY_SURFACE_ADDRESS in DCE6 It seems a copy-paste error: since we are working with mmGRPH_SECONDARY_SURFACE_ADDRESS, GRPH_SECONDARY_SURFACE_ADDRESS__GRPH_SECONDARY_SURFACE_ADDRESS_MASK should be used. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: move si_ih.c away from sid.h definesAlexandre Demers
They are properly defined under oss_1_0_d.h Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: remove PACKET3 duplicated defines from si_enums.hAlexandre Demers
PACKET3 is already in sid.h, as it is done under cikd.h for CIK Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-04-07drm/amdgpu: use proper defines, shifts and masks in DCE6 codeAlexandre Demers
By replacing VGA_VSTATUS_CNTL by VGA_RENDER_CONTROL__VGA_VSTATUS_CNTL_MASK, we also need to fix its usage in GMC6. Note: VGA_VSTATUS_CNTL's binary value was inverted in dce_6_0_sh_mask.h, so we need to invert its value where it was used. Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>