summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)Author
2025-03-26drm/amdgpu: stop unmapping MQD for kernel queues v3Christian König
This looks unnecessary and actually extremely harmful since using kmap() is not possible while inside the ring reset. Remove all the extra mapping and unmapping of the MQDs. v2: also fix debugfs v3: fix coding style typo Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26Revert "drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA"Jesse.zhang@amd.com
this temporarily reverts commit 6ec04e38b2f6 ("drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMA") it cause a regression. Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26drm/amdgpu: Parse all deferred errors with UMC aca handleXiang Liu
We should only increase the deferred errors in UMC block. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26drm/amdgpu: Update ta ras blockStanley.Yang
Update ta ra block to keep sync with RAS TA. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26drm/amdgpu: Add NPS2 to DPX compatible modeLijo Lazar
Compute partition DPX is possible in NPS2 mode. Update the compatible modes for DPX. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26drm/amdgpu: Use correct gfx deferred error countXiang Liu
In the case of parsing GFX deferred error from SMU corrected error channel, the error count should be set to 1 instead of parsing from MISC0 register, which is 0. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26drm/amd/display: Actually do immediate vblank disableLeo Li
[Why] The `vblank_config.offdelay` field follows the same semantics as the `drm_vblank_offdelay` parameter. Setting it to 0 will never disable vblank. [How] Set `offdelay` to a positive number. Fixes: e45b6716de4b ("drm/amd/display: use a more lax vblank enable policy for DCN35+") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-03-26drm/amd/display: prevent hang on link training failBrendan Tam
[Why] When link training fails, the phy clock will be disabled. However, in enable_streams, it is assumed that link training succeeded and the mux selects the phy clock, causing a hang when a register write is made. [How] When enable_stream is hit, check if link training failed. If it did, fall back to the ref clock to avoid a hang and keep the system in a recoverable state. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Brendan Tam <Brendan.Tam@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-03-26Revert "drm/amd/display: dml2 soc dscclk use DPM table clk setting"Charlene Liu
[why] this dscclk use DCN defined per DPM level will cause a DCFCLK increase. needs to follow up. This reverts commit 15b959534a39530a21d378190557cc8d1eab7b09 Reviewed-by: Yihan Zhu <yihan.zhu@amd.com> Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26drm/amd/display: Increase vblank offdelay for PSR panelsLeo Li
[Why] Depending on when the HW latching event (vupdate) of double-buffered registers happen relative to the PSR SDP (signals panel psr enter/exit) deadline, and how bad the Panel clock has drifted since the last ALPM off event, there can be up to 3 frames of delay between sending the PSR exit cmd to DMUB fw, and when the panel starts displaying live frames. This can manifest as micro-stuttering when userspace commit patterns cause rapid toggling of the DRM vblank counter, since PSR enter/exit is hooked up to DRM vblank disable/enable respectively. In the ideal world, the panel should present the live frame immediately on PSR exit cmd. But due to HW design and PSR limitations, immediate exit can only happen by chance, when: 1. PSR exit cmd is ack'd by FW before HW latching (vupdate) event, and 2. Panel's SDP deadline -- determined by it's PSR Start Delay in DPCD 71h -- is after the vupdate event. The PSR exit SDP can then be sent immediately after HW latches. Otherwise, we have to wait 1 frame. And 3. There is negligible drift between the panel's clock and source clock. Otherwise, there can be up to 1 frame of drift. Note that this delay is not expected with Panel Replay. [How] Since PSR power savings can be quite substantial, and there are a lot of systems in the wild with PSR panels, It'll be nice to have a middle ground that balances user experience with power savings. A simple way to achieve this is by extending the vblank offdelay, such that additional PSR exit delays will be less perceivable. We can set: 20/100 * offdelay_ms = 3_frames_ms => offdelay_ms = 5 * 3_frames_ms This ensures that `3_frames_ms` will only be experienced as a 20% delay on top how long the panel has been static, and thus make the delay less perceivable. If this ends up being too high of a percentage, it can be dropped further in a future change. Fixes: 537ef0f88897 ("drm/amd/display: use new vblank enable policy for DCN35+") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-03-26drm/amd: Handle being compiled without SI or CIK support betterMario Limonciello
If compiled without SI or CIK support but amdgpu tries to load it will run into failures with uninitialized callbacks. Show a nicer message in this case and fail probe instead. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4050 Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-03-26drm/amd/pm: Add zero RPM enabled OD setting support for SMU14.0.2Tomasz Pakuła
Hook up zero RPM enable for 9070 and 9070 XT based on RDNA3 (smu 13.0.0 and 13.0.7) code. Tested on 9070 XT Hellhound Signed-off-by: Tomasz Pakuła <tomasz.pakula.oficjalny@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.12.x Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-26drm/amd/pm: Prevent division by zeroDenis Arefev
The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: c52dcf49195d ("drm/amd/pp: Avoid divide-by-zero in fan_ctrl_set_fan_speed_rpm") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-03-26drm/amd/pm: Prevent division by zeroDenis Arefev
The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: c52dcf49195d ("drm/amd/pp: Avoid divide-by-zero in fan_ctrl_set_fan_speed_rpm") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-03-26drm/amd/pm: Prevent division by zeroDenis Arefev
The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 031db09017da ("drm/amd/powerplay/vega20: enable fan RPM and pwm settings V2") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-03-26drm/amd/pm: Prevent division by zeroDenis Arefev
The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: b64625a303de ("drm/amd/pm: correct the address of Arcturus fan related registers") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-03-26drm/amd/pm: Prevent division by zeroDenis Arefev
The user can set any speed value. If speed is greater than UINT_MAX/8, division by zero is possible. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: c05d1c401572 ("drm/amd/swsmu: add aldebaran smu13 ip support (v3)") Signed-off-by: Denis Arefev <arefev@swemel.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2025-03-25Merge tag 'timers-cleanups-2025-03-23' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer cleanups from Thomas Gleixner: "A treewide hrtimer timer cleanup hrtimers are initialized with hrtimer_init() and a subsequent store to the callback pointer. This turned out to be suboptimal for the upcoming Rust integration and is obviously a silly implementation to begin with. This cleanup replaces the hrtimer_init(T); T->function = cb; sequence with hrtimer_setup(T, cb); The conversion was done with Coccinelle and a few manual fixups. Once the conversion has completely landed in mainline, hrtimer_init() will be removed and the hrtimer::function becomes a private member" * tag 'timers-cleanups-2025-03-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (100 commits) wifi: rt2x00: Switch to use hrtimer_update_function() io_uring: Use helper function hrtimer_update_function() serial: xilinx_uartps: Use helper function hrtimer_update_function() ASoC: fsl: imx-pcm-fiq: Switch to use hrtimer_setup() RDMA: Switch to use hrtimer_setup() virtio: mem: Switch to use hrtimer_setup() drm/vmwgfx: Switch to use hrtimer_setup() drm/xe/oa: Switch to use hrtimer_setup() drm/vkms: Switch to use hrtimer_setup() drm/msm: Switch to use hrtimer_setup() drm/i915/request: Switch to use hrtimer_setup() drm/i915/uncore: Switch to use hrtimer_setup() drm/i915/pmu: Switch to use hrtimer_setup() drm/i915/perf: Switch to use hrtimer_setup() drm/i915/gvt: Switch to use hrtimer_setup() drm/i915/huc: Switch to use hrtimer_setup() drm/amdgpu: Switch to use hrtimer_setup() stm class: heartbeat: Switch to use hrtimer_setup() i2c: Switch to use hrtimer_setup() iio: Switch to use hrtimer_setup() ...
2025-03-25drm/display: dp: change drm_dp_dpcd_read_link_status() return valueDmitry Baryshkov
drm_dp_dpcd_read_link_status() follows the "return error code or number of bytes read" protocol, with the code returning less bytes than requested in case of some errors. However most of the drivers interpreted that as "return error code in case of any error". Switch drm_dp_dpcd_read_link_status() to drm_dp_dpcd_read_data() and make it follow that protocol too. Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20250324-drm-rework-dpcd-access-v4-2-e80ff89593df@oss.qualcomm.com Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
2025-03-21drm/amd/pm: Update feature list for smu_v13_0_6Asad Kamal
Update feature list for smu_v13_0_6 to show vcn & smu deep sleep feature enable status Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: Add parameter documentation for amdgpu_sync_fenceSrinivasan Shanmugam
The 'flags' parameter, which specifies memory allocation behavior while creating a sync entry, Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/amdgpu_sync.c:162: warning: Function parameter or struct member 'flags' not described in 'amdgpu_sync_fence' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/discovery: optionally use fw based ip discoveryAlex Deucher
On chips without native IP discovery support, use the fw binary if available, otherwise we can continue without it. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Flora Cui <flora.cui@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/discovery: use specific ip_discovery.bin for legacy asicsFlora Cui
vega10/vega12/vega20/raven/raven2/picasso/arcturus/aldebaran Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/discovery: check ip_discovery fw file availableFlora Cui
Signed-off-by: Flora Cui <flora.cui@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amd/pm: Remove unnecessay UQ10 to UINT conversionAsad Kamal
Few of the metrics data for smu_v13_0_12 has not been reported in Q10 format, remove UQ10 to UINT conversion for those Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amd/pm: Remove unnecessay UQ10 to UINT conversionAsad Kamal
Few of the metrics data for smu_v13_0_6 has not been reported in Q10 format, remove UQ10 to UINT conversion for those v2: Move smu_v13_0_12 changes to separate patch(Kevin) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/sdma_v4_4_2: update VM flush implementation for SDMAJesse.zhang@amd.com
This commit updates the VM flush implementation for the SDMA engine. - Added a new function `sdma_v4_4_2_get_invalidate_req` to construct the VM_INVALIDATE_ENG0_REQ register value for the specified VMID and flush type. This function ensures that all relevant page table cache levels (L1 PTEs, L2 PTEs, and L2 PDEs) are invalidated. - Modified the `sdma_v4_4_2_ring_emit_vm_flush` function to use the new `sdma_v4_4_2_get_invalidate_req` function. The updated function emits the necessary register writes and waits to perform a VM flush for the specified VMID. It updates the PTB address registers and issues a VM invalidation request using the specified VM invalidation engine. - Included the necessary header file `gc/gc_9_0_sh_mask.h` to provide access to the required register definitions. v2: vm flush by the vm inalidation packet (Lijo) v3: code stle and define thh macro for the vm invalidation packet (Christian) v4: Format definition sdma vm invalidate packet (Lijo) Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: Optimize VM invalidation engine allocation and synchronize GPU ↵Jesse.zhang@amd.com
TLB flush - Modify the VM invalidation engine allocation logic to handle SDMA page rings. SDMA page rings now share the VM invalidation engine with SDMA gfx rings instead of allocating a separate engine. This change ensures efficient resource management and avoids the issue of insufficient VM invalidation engines. - Add synchronization for GPU TLB flush operations in gmc_v9_0.c. Use spin_lock and spin_unlock to ensure thread safety and prevent race conditions during TLB flush operations. This improves the stability and reliability of the driver, especially in multi-threaded environments. v2: replace the sdma ring check with a function `amdgpu_sdma_is_page_queue` to check if a ring is an SDMA page queue.(Lijo) v3: Add GC version check, only enabled on GC9.4.3/9.4.4/9.5.0 v4: Fix code style and add more detailed description (Christian) v5: Remove dependency on vm_inv_eng loop order, explicitly lookup shared inv_eng(Christian/Lijo) v6: Added search shared ring function amdgpu_sdma_get_shared_ring (Lijo) Suggested-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amd/amdgpu: Increase max rings to enable SDMA page ringJesse.zhang@amd.com
Increase the maximum number of rings supported by the AMDGPU driver from 133 to 149. This change is necessary to enable support for the SDMA page ring. Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: Decode deferred error type in gfx aca bank parserXiang Liu
In the case of injecting uncorrected error with background workload, the deferred error among uncorrected errors need to be specified by checking the deferred and poison bits of status register. v2: refine checking for deferred error v2: log possiable DEs among CEs v2: generate CPER records for DEs among UEs Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.5 GPUsSrinivasan Shanmugam
Enable the cleaner shader for GFX11.5.0/11.5.1 GPUs to provide data isolation between GPU workloads. The cleaner shader is responsible for clearing the Local Data Store (LDS), Vector General Purpose Registers (VGPRs), and Scalar General Purpose Registers (SGPRs), which helps prevent data leakage and ensures accurate computation results. This update extends cleaner shader support to GFX11.5.0/11.5.1 GPUs, previously available for GFX11.0.3. It enhances security by clearing GPU memory between processes and maintains a consistent GPU state across KGD and KFD workloads. Cc: Mario Sopena-Novales <mario.novales@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/mes: clean up SDMA HQD loopAlex Deucher
Follow the same logic as the other IP types. Reviewed-by: Prike Liang <Prike.Liang@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/mes: enable compute pipes across all MECAlex Deucher
Enable pipes on both MECs for MES. Fixes: 745f46b6a99f ("drm/amdgpu: enable mes v12 self test") Acked-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/mes: drop MES 10.x leftoversAlex Deucher
Leftover from MES bring up. There is no production MES support for MES 10.x. The rest of the MES 10.x code has already been removed so drop this. Acked-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/mes: optimize compute loop handlingAlex Deucher
Break when we get to the end of the supported pipes rather than continuing the loop. Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/sdma: guilty tracking is per instanceAlex Deucher
The gfx and page queues are per instance, so track them per instance. v2: drop extra parameter (Lijo) Fixes: fdbfaaaae06b ("drm/amdgpu: Improve SDMA reset logic with guilty queue tracking") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/sdma: fix engine reset handlingAlex Deucher
Move the kfd suspend/resume code into the caller. That is where the KFD is likely to detect a reset so on the KFD side there is no need to call them. Also add a mutex to lock the actual reset sequence. v2: make the locking per instance Fixes: bac38ca8c475 ("drm/amdkfd: implement per queue sdma reset for gfx 9.4+") Reviewed-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: remove invalid usage of sched.readyChristian König
I can't count how often I had to remove this nonsense. Probably doesn't need an explanation any more. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: add cleaner shader trace pointChristian König
Note when the cleaner shader is executed. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: add isolation trace pointChristian König
Note when we switch from one isolation owner to another. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: stop reserving VMIDs to enforce isolationChristian König
That was quite troublesome for gang submit. Completely drop this approach and enforce the isolation separately. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: rework how the cleaner shader is emitted v3Christian König
Instead of emitting the cleaner shader for every job which has the enforce_isolation flag set only emit it for the first submission from every client. v2: add missing NULL check v3: fix another NULL pointer deref Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: rework how isolation is enforced v2Christian König
Limiting the number of available VMIDs to enforce isolation causes some issues with gang submit and applying certain HW workarounds which require multiple VMIDs to work correctly. So instead start to track all submissions to the relevant engines in a per partition data structure and use the dma_fences of the submissions to enforce isolation similar to what a VMID limit does. v2: use ~0l for jobs without isolation to distinct it from kernel submissions which uses NULL for the owner. Add some warning when we are OOM. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: overwrite signaled fence in amdgpu_syncChristian König
This allows using amdgpu_sync even without peeking into the fences for a long time. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: use GFP_NOWAIT for memory allocationsChristian König
In the critical submission path memory allocations can't wait for reclaim since that can potentially wait for submissions to finish. Finally clean that up and mark most memory allocations in the critical path with GFP_NOWAIT. The only exception left is the dma_fence_array() used when no VMID is available, but that will be cleaned up later on. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amd/amdgpu: Revert "drm/amd/amdgpu: shorten the gfx idle worker timeout"Kenneth Feng
This reverts commit 55ff973fe1c053de143969cfc8b34baff084084a. Reason for revert: this causes some tests fail with call trace. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu/sdam: Skip SDMA queue reset for SRIOVAhmad Rehman
For SRIOV, skip the SDMA queue reset and return error. The engine/queue reset failure will trigger FLR in the sequence. v2: do not add queue reset support mask for sriov Signed-off-by: Ahmad Rehman <Ahmad.Rehman@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: Add support to load PSP TA v13.0.12 for SRIOVAhmad Rehman
Add case for 13.0.12. Signed-off-by: Ahmad Rehman <ahrehman@amd.com> Reviewed-by: Vignesh.Chander@amd.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdgpu: Enable amdgpu_ras_resume for gfx 9.5.0Ellen Pan
This enables ras to be resumed after gpu recovery on mi350 sriov. Signed-off-by: Ellen Pan <yunru.pan@amd.com> Reviewed-by: Ahmad Rehman <Ahmad.Rehman@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-03-21drm/amdkfd: set precise mem ops caps to disabled for gfx 11 and 12Jonathan Kim
Clause instructions with precise memory enabled currently hang the shader so set capabilities flag to disabled since it's unsafe to use for debugging. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Tested-by: Lancelot Six <lancelot.six@amd.com> Reviewed-by: Harish Kasiviswanathan <harish.kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>