summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdgpu
AgeCommit message (Collapse)Author
2024-09-12drm/amdgpu: handle gfx12 in amdgpu_display_verify_sizesMarek Olšák
[ Upstream commit 8dd1426e2c80e32ac1995007330c8f95ffa28ebb ] It verified GFX9-11 swizzle modes on GFX12, which has undefined behavior. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12drm/amdgpu: Set no_hw_access when VF request full GPU failsYifan Zha
[ Upstream commit 33f23fc3155b13c4a96d94a0a22dc26db767440b ] [Why] If VF request full GPU access and the request failed, the VF driver can get stuck accessing registers for an extended period during the unload of KMS. [How] Set no_hw_access flag when VF request for full GPU access fails This prevents further hardware access attempts, avoiding the prolonged stuck state. Signed-off-by: Yifan Zha <Yifan.Zha@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12drm/amdgpu: check for LINEAR_ALIGNED correctly in check_tiling_flags_gfx6Marek Olšák
[ Upstream commit 11317d2963fa79767cd7c6231a00a9d77f2e0f54 ] Fix incorrect check. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12drm/amdgpu: clear RB_OVERFLOW bit when enabling interruptsDanijel Slivka
[ Upstream commit afbf7955ff01e952dbdd465fa25a2ba92d00291c ] Why: Setting IH_RB_WPTR register to 0 will not clear the RB_OVERFLOW bit if RB_ENABLE is not set. How to fix: Set WPTR_OVERFLOW_CLEAR bit after RB_ENABLE bit is set. The RB_ENABLE bit is required to be set, together with WPTR_OVERFLOW_ENABLE bit so that setting WPTR_OVERFLOW_CLEAR bit would clear the RB_OVERFLOW. Signed-off-by: Danijel Slivka <danijel.slivka@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-12drm/amdgpu: Fix smatch static checker warningHawking Zhang
[ Upstream commit bdbdc7cecd00305dc844a361f9883d3a21022027 ] adev->gfx.imu.funcs could be NULL Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu: update type of buf size to u32 for eeprom functionsTao Zhou
[ Upstream commit 2aadb520bfacec12527effce3566f8df55e5d08e ] Avoid overflow issue. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu: the warning dereferencing obj for nbio_v7_4Jesse Zhang
[ Upstream commit d190b459b2a4304307c3468ed97477b808381011 ] if ras_manager obj null, don't print NBIO err data Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Suggested-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu: fix the waring dereferencing hiveJesse Zhang
[ Upstream commit 1940708ccf5aff76de4e0b399f99267c93a89193 ] Check the amdgpu_hive_info *hive that maybe is NULL. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu: fix dereference after null checkJesse Zhang
[ Upstream commit b1f7810b05d1950350ac2e06992982974343e441 ] check the pointer hive before use. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu: fix mc_data out-of-bounds read warningTim Huang
[ Upstream commit 51dfc0a4d609fe700750a62f41447f01b8c9ea50 ] Clear warning that read mc_data[i-1] may out-of-bounds. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu: fix ucode out-of-bounds read warningTim Huang
[ Upstream commit 8944acd0f9db33e17f387fdc75d33bb473d7936f ] Clear warning that read ucode[] may out-of-bounds. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu: Fix out-of-bounds read of df_v1_7_channel_numberMa Jun
[ Upstream commit d768394fa99467bcf2703bde74ddc96eeb0b71fa ] Check the fb_channel_number range to avoid the array out-of-bounds read error Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu: Fix out-of-bounds write warningMa Jun
[ Upstream commit be1684930f5262a622d40ce7a6f1423530d87f89 ] Check the ring type value to fix the out-of-bounds write warning Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amd/amdgpu: Check tbo resource pointerAsad Kamal
[ Upstream commit 6cd2b872643bb29bba01a8ac739138db7bd79007 ] Validate tbo resource pointer, skip if NULL Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu: avoid reading vf2pf info size from FBZhigang Luo
[ Upstream commit 3bcc0ee14768d886cedff65da72d83d375a31a56 ] VF can't access FB when host is doing mode1 reset. Using sizeof to get vf2pf info size, instead of reading it from vf2pf header stored in FB. Signed-off-by: Zhigang Luo <Zhigang.Luo@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu: fix overflowed array index read warningTim Huang
[ Upstream commit ebbc2ada5c636a6a63d8316a3408753768f5aa9f ] Clear overflowed array index read warning by cast operation. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-08drm/amdgpu: Fix uninitialized variable warning in amdgpu_afmt_acrMa Jun
[ Upstream commit c0d6bd3cd209419cc46ac49562bef1db65d90e70 ] Assign value to clock to fix the warning below: "Using uninitialized value res. Field res.clock is uninitialized" Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-09-04drm/amdgpu: Using uninitialized value *size when calling amdgpu_vce_cs_relocJesse Zhang
commit 88a9a467c548d0b3c7761b4fd54a68e70f9c0944 upstream. Initialize the size before calling amdgpu_vce_cs_reloc, such as case 0x03000001. V2: To really improve the handling we would actually need to have a separate value of 0xffffffff.(Christian) Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org> Signed-off-by: Vamsi Krishna Brahmajosyula <vamsi-krishna.brahmajosyula@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29drm/amdgpu/vcn: not pause dpg for unified queueBoyuan Zhang
commit 7d75ef3736a025db441be652c8cc8e84044a215f upstream. For unified queue, DPG pause for encoding is done inside VCN firmware, so there is no need to pause dpg based on ring type in kernel. For VCN3 and below, pausing DPG for encoding in kernel is still needed. v2: add more comments v3: update commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29drm/amdgpu/vcn: identify unified queue in sw initBoyuan Zhang
commit ecfa23c8df7ef3ea2a429dfe039341bf792e95b4 upstream. Determine whether VCN using unified queue in sw_init, instead of calling functions later on. v2: fix coding style Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29drm/amdgpu: Validate TA binary sizeCandice Li
commit c99769bceab4ecb6a067b9af11f9db281eea3e2a upstream. Add TA binary size validation to avoid OOB write. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c0a04e3570d72aaf090962156ad085e37c62e442) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29drm/amdkfd: reserve the BO before validating itLang Yu
[ Upstream commit 0c93bd49576677ae1a18817d5ec000ef031d5187 ] Fix a warning. v2: Avoid unmapping attachment repeatedly when ERESTARTSYS. v3: Lock the BO before accessing ttm->sg to avoid race conditions.(Felix) [ 41.708711] WARNING: CPU: 0 PID: 1463 at drivers/gpu/drm/ttm/ttm_bo.c:846 ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.708989] Call Trace: [ 41.708992] <TASK> [ 41.708996] ? show_regs+0x6c/0x80 [ 41.709000] ? ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.709008] ? __warn+0x93/0x190 [ 41.709014] ? ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.709024] ? report_bug+0x1f9/0x210 [ 41.709035] ? handle_bug+0x46/0x80 [ 41.709041] ? exc_invalid_op+0x1d/0x80 [ 41.709048] ? asm_exc_invalid_op+0x1f/0x30 [ 41.709057] ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu] [ 41.709185] ? ttm_bo_validate+0x146/0x1b0 [ttm] [ 41.709197] ? amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x2c/0x80 [amdgpu] [ 41.709337] ? srso_alias_return_thunk+0x5/0x7f [ 41.709346] kfd_mem_dmaunmap_attachment+0x9e/0x1e0 [amdgpu] [ 41.709467] amdgpu_amdkfd_gpuvm_dmaunmap_mem+0x56/0x80 [amdgpu] [ 41.709586] kfd_ioctl_unmap_memory_from_gpu+0x1b7/0x300 [amdgpu] [ 41.709710] kfd_ioctl+0x1ec/0x650 [amdgpu] [ 41.709822] ? __pfx_kfd_ioctl_unmap_memory_from_gpu+0x10/0x10 [amdgpu] [ 41.709945] ? srso_alias_return_thunk+0x5/0x7f [ 41.709949] ? tomoyo_file_ioctl+0x20/0x30 [ 41.709959] __x64_sys_ioctl+0x9c/0xd0 [ 41.709967] do_syscall_64+0x3f/0x90 [ 41.709973] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Fixes: 101b8104307e ("drm/amdkfd: Move dma unmapping after TLB flush") Signed-off-by: Lang Yu <Lang.Yu@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-29drm/amdgpu: fix dereference null return value for the function ↵Jesse Zhang
amdgpu_vm_pt_parent [ Upstream commit 511a623fb46a6cf578c61d4f2755783c48807c77 ] The pointer parent may be NULLed by the function amdgpu_vm_pt_parent. To make the code more robust, check the pointer parent. Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-29drm/amdkfd: Move dma unmapping after TLB flushPhilip Yang
[ Upstream commit 101b8104307eac734f2dfa4d3511430b0b631c73 ] Otherwise GPU may access the stale mapping and generate IOMMU IO_PAGE_FAULT. Move this to inside p->mutex to prevent multiple threads mapping and unmapping concurrently race condition. After kfd_mem_dmaunmap_attachment is removed from unmap_bo_from_gpuvm, kfd_mem_dmaunmap_attachment is called if failed to map to GPUs, and before free the mem attachment in case failed to unmap from GPUs. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-29drm/amd/amdgpu/imu_v11_0: Increase buffer size to ensure all possible values ↵Lee Jones
can be stored [ Upstream commit a728342ae4ec2a7fdab0038b11427579424f133e ] Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/imu_v11_0.c: In function ‘imu_v11_0_init_microcode’: drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:52:54: warning: ‘_imu.bin’ directive output may be truncated writing 8 bytes into a region of size between 4 and 33 [-Wformat-truncation=] drivers/gpu/drm/amd/amdgpu/imu_v11_0.c:52:9: note: ‘snprintf’ output between 16 and 45 bytes into a destination of size 40 Signed-off-by: Lee Jones <lee@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-29drm/amdgpu/jpeg2: properly set atomics vmid fieldAlex Deucher
commit e414a304f2c5368a84f03ad34d29b89f965a33c9 upstream. This needs to be set as well if the IB uses atomics. Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 35c628774e50b3784c59e8ca7973f03bcb067132) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-29drm/amdgpu: Actually check flags for all context ops.Bas Nieuwenhuizen
commit 0573a1e2ea7e35bff08944a40f1adf2bb35cea61 upstream. Missing validation ... Checked libdrm and it clears all the structs, so we should be safe to just check everything. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c6b86421f1f9ddf9d706f2453159813ee39d0cf9) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-14drm/amdgpu: Add lock around VF RLCG interfaceVictor Skvortsov
[ Upstream commit e864180ee49b4d30e640fd1e1d852b86411420c9 ] flush_gpu_tlb may be called from another thread while device_gpu_recover is running. Both of these threads access registers through the VF RLCG interface during VF Full Access. Add a lock around this interface to prevent race conditions between these threads. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-14drm/amdgpu: Fix the null pointer dereference to ras_managerMa Jun
[ Upstream commit 4c11d30c95576937c6c35e6f29884761f2dddb43 ] Check ras_manager before using it Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03drm/amdgpu/sdma5.2: Update wptr registers as well as doorbellAlex Deucher
commit a03ebf116303e5d13ba9a2b65726b106cb1e96f6 upstream. We seem to have a case where SDMA will sometimes miss a doorbell if GFX is entering the powergating state when the doorbell comes in. To workaround this, we can update the wptr via MMIO, however, this is only safe because we disallow gfxoff in begin_ring() for SDMA 5.2 and then allow it again in end_ring(). Enable this workaround while we are root causing the issue with the HW team. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440 Tested-by: Friedrich Vock <friedrich.vock@gmx.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org (cherry picked from commit f2ac52634963fc38e4935e11077b6f7854e5d700) Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-08-03drm/amdgpu: Remove GC HW IP 9.3.0 from noretry=1Tim Van Patten
[ Upstream commit 1446226d32a45bb7c4f63195a59be8c08defe658 ] The following commit updated gmc->noretry from 0 to 1 for GC HW IP 9.3.0: commit 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1") This causes the device to hang when a page fault occurs, until the device is rebooted. Instead, revert back to gmc->noretry=0 so the device is still responsive. Fixes: 5f3854f1f4e2 ("drm/amdgpu: add more cases to noretry=1") Signed-off-by: Tim Van Patten <timvp@google.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-08-03drm/amdgpu: Check if NBIO funcs are NULL in amdgpu_device_baco_exitFriedrich Vock
[ Upstream commit 0cdb3f9740844b9d95ca413e3fcff11f81223ecf ] The special case for VM passthrough doesn't check adev->nbio.funcs before dereferencing it. If GPUs that don't have an NBIO block are passed through, this leads to a NULL pointer dereference on startup. Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de> Fixes: 1bece222eabe ("drm/amdgpu: Clear doorbell interrupt status for Sienna Cichlid") Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-27drm/amdgpu: Fix signedness bug in sdma_v4_0_process_trap_irq()Dan Carpenter
commit 6769a23697f17f9bf9365ca8ed62fe37e361a05a upstream. The "instance" variable needs to be signed for the error handling to work. Fixes: 8b2faf1a4f3b ("drm/amdgpu: add error handle to avoid out-of-bounds") Reviewed-by: Bob Zhou <bob.zhou@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Siddh Raman Pant <siddh.raman.pant@oracle.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-11drm/amdgpu: fix uninitialized scalar variable warningTim Huang
[ Upstream commit 9a5f15d2a29d06ce5bd50919da7221cda92afb69 ] Clear warning that uses uninitialized value fw_size. Signed-off-by: Tim Huang <Tim.Huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amdgpu: Initialize timestamp for some legacy SOCsMa Jun
[ Upstream commit 2e55bcf3d742a4946d862b86e39e75a95cc6f1c0 ] Initialize the interrupt timestamp for some legacy SOCs to fix the coverity issue "Uninitialized scalar variable" Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-11drm/amdgpu: Fix uninitialized variable warningsMa Jun
[ Upstream commit 60c448439f3b5db9431e13f7f361b4074d0e8594 ] return 0 to avoid returning an uninitialized variable r Signed-off-by: Ma Jun <Jun.Ma2@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05drm/amdgpu/atomfirmware: fix parsing of vram_infoAlex Deucher
commit f6f49dda49db72e7a0b4ca32c77391d5ff5ce232 upstream. v3.x changed the how vram width was encoded. The previous implementation actually worked correctly for most boards. Fix the implementation to work correctly everywhere. This fixes the vram width reported in the kernel log on some boards. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-05drm/amdgpu: avoid using null object of framebufferJulia Zhang
commit bcfa48ff785bd121316592b131ff6531e3e696bb upstream. Instead of using state->fb->obj[0] directly, get object from framebuffer by calling drm_gem_fb_get_obj() and return error code when object is null to avoid using null object of framebuffer. Reported-by: Fusheng Huang <fusheng.huang@ecarxgroup.com> Signed-off-by: Julia Zhang <Julia.Zhang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-05drm/amdgpu: Fix pci state save during mode-1 resetLijo Lazar
[ Upstream commit 74fa02c4a5ea1ade5156a6ce494d3ea83881c2d8 ] Cache the PCI state before bus master is disabled. The saved state is later used for other cases like restoring config space after mode-2 reset. Fixes: 5c03e5843e6b ("drm/amdgpu:add smu mode1/2 support for aldebaran") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-07-05drm/amd/amdgpu: Fix style errors in amdgpu_drv.c & amdgpu_device.cSrinivasan Shanmugam
[ Upstream commit 47fc644f801e4414753a9b7e87ed41f991cd68c3 ] Fix following checkpatch style errors in amdgpu_drv.c & amdgpu_device.c ERROR: exactly one space required after that #ifdef ERROR: spaces required around that '+=' (ctx:WxV) ERROR: space required before the open brace '{' ERROR: spaces required around that '||' (ctx:VxE) ERROR: space prohibited before that close parenthesis ')' ERROR: space required before the open parenthesis '(' ERROR: space required before the open brace '{' ERROR: code indent should use tabs where possible Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: 74fa02c4a5ea ("drm/amdgpu: Fix pci state save during mode-1 reset") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-16drm/amdgpu/atomfirmware: add intergrated info v2.3 tableLi Ma
commit e64e8f7c178e5228e0b2dbb504b9dc75953a319f upstream. [Why] The vram width value is 0. Because the integratedsysteminfo table in VBIOS has updated to 2.3. [How] Driver needs a new intergrated info v2.3 table too. Then the vram width value will be correct. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-16drm/amdgpu: add error handle to avoid out-of-boundsBob Zhou
commit 8b2faf1a4f3b6c748c0da36cda865a226534d520 upstream. if the sdma_v4_0_irq_id_to_seq return -EINVAL, the process should be stop to avoid out-of-bounds read, so directly return -EINVAL. Signed-off-by: Bob Zhou <bob.zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12drm/amdgpu/mes: fix use-after-free issueJack Xiao
[ Upstream commit 948255282074d9367e01908b3f5dcf8c10fc9c3d ] Delete fence fallback timer to fix the ramdom use-after-free issue. v2: move to amdgpu_mes.c Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12drm/amdgpu: Fix the ring buffer size for queue VM flushPrike Liang
[ Upstream commit fe93b0927bc58cb1d64230f45744e527d9d8482c ] Here are the corrections needed for the queue ring buffer size calculation for the following cases: - Remove the KIQ VM flush ring usage. - Add the invalidate TLBs packet for gfx10 and gfx11 queue. - There's no VM flush and PFP sync, so remove the gfx9 real ring and compute ring buffer usage. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-06-12drm/amdgpu: Update BO eviction prioritiesFelix Kuehling
[ Upstream commit b0b13d532105e0e682d95214933bb8483a063184 ] Make SVM BOs more likely to get evicted than other BOs. These BOs opportunistically use available VRAM, but can fall back relatively seamlessly to system memory. It also avoids SVM migrations evicting other, more important BOs as they will evict other SVM allocations first. Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-25drm/amdgpu: Fix possible NULL dereference in ↵Srinivasan Shanmugam
amdgpu_ras_query_error_status_helper() commit b8d55a90fd55b767c25687747e2b24abd1ef8680 upstream. Return invalid error code -EINVAL for invalid block id. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1183 amdgpu_ras_query_error_status_helper() error: we previously assumed 'info' could be null (see line 1176) Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> [Ajay: applied AMDGPU_RAS_BLOCK_COUNT condition to amdgpu_ras_query_error_status() as amdgpu_ras_query_error_status_helper() not present in v6.6, v6.1 amdgpu_ras_query_error_status_helper() was introduced in 8cc0f5669eb6] Signed-off-by: Ajay Kaher <ajay.kaher@broadcom.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17drm/amdgpu: once more fix the call oder in amdgpu_ttm_move() v2Christian König
commit d3a9331a6591e9df64791e076f6591f440af51c3 upstream. This reverts drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap. The basic problem here is that after the move the old location is simply not available any more. Some fixes were suggested, but essentially we should call the move notification before actually moving things because only this way we have the correct order for DMA-buf and VM move notifications as well. Also rework the statistic handling so that we don't update the eviction counter before the move. v2: add missing NULL check Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: 94aeb4117343 ("drm/amdgpu: fix ftrace event amdgpu_bo_move always move on same heap") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3171 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> CC: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17drm/amdgpu: Refine IB schedule error loggingLijo Lazar
[ Upstream commit 4b18a91faf1752f9bd69a4ed3aed2c8f6e5b0528 ] Downgrade to debug information when IBs are skipped. Also, use dev_* to identify the device. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-05-02drm/amdgpu: Fix leak when GPU memory allocation failsMukul Joshi
commit 25e9227c6afd200bed6774c866980b8e36d033af upstream. Free the sync object if the memory allocation fails for any reason. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-02drm/amdgpu/sdma5.2: use legacy HDP flush for SDMA2/3Alex Deucher
commit 9792b7cc18aaa0c2acae6af5d0acf249bcb1ab0d upstream. This avoids a potential conflict with firmwares with the newer HDP flush mechanism. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>