summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2025-01-06drm/amd/display: correct type mismatches in comparisons in DML2Natanel Roizenman
[Why] Comparisons were made between unsigned char and unsigned int. [How] Corrected by changing variable types. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Natanel Roizenman <Natanel.Roizenman@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06drm/amd/display: Adjust dm to use supported interfaces for setting multiple ↵Wayne Lin
crc windows [Why & How] We actually have the capability to calculate independent CRC for 2 crc window at the same time. Extend dm with the capability by having array to configure/maintain multiple crc windows. Add the flexibility but use 1st CRC instance only for now. Can change to use the 2nd CRC instance if needed. Reviewed-by: HaoPing Liu <haoping.liu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06drm/amd/display: Extend dc_stream_get_crc to support 2nd crc engineWayne Lin
[Why & How] Since now we can set multiple crc windows for secure display, add a new input parameter for dc_stream_get_crc to indicate to fetch crc from which crc engine. Reviewed-by: HaoPing Liu <haoping.liu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06drm/amd/display: Add support for setting multiple CRC windows in dcWayne Lin
[Why & How] Have to support multiple CRC windows setting to dmub. Add new dmub forward functions for supporting/forwarding multiple crc windows setting to dmub. Reviewed-by: HaoPing Liu <haoping.liu@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06drm/amd/display: Add expanded HBlank field to dc_crtc_timingGeorge Shen
[Why] For DP HBlank expansion/reduction, the HBlank parameters of the original EDID timing needs to be notified to the sink in order for the timing to be reduced back to the original HBlank size. [How] Add parameter in dc_crtc_timing to track the increased HBlank. Reviewed-by: Michael Strauss <michael.strauss@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06drm/amd/display: Add DP required HBlank size calc to link interfaceGeorge Shen
[Why] Some features, such as HBlank expansion/reduction, needs to know how much HBlank is required to support basic audio. [How] Add interface to link to calculate required HBlank size for a given link + timing combination to support basic audio (i.e. 2-channel 48KHz). Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06drm/amd/display: Parse RECEIVE_PORT0_CAP capabilities from DPCDGeorge Shen
[Why] DPCD register RECEIVE_PORT0_CAP contains HBlank expansion/reduction capabilities of a DP device. These capabilities are required to enable HBlank expansion/reduction logic. [How] Read raw RECEIVE_PORT0_CAP register values and store parsed fields. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06drm/amd/display: Cleanup outdated interfaces in dcn401_clk_mgrDillon Varone
[WHY&HOW] - Remove legacy update clocks sequence - FCLK P-State allow message is not required Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06drm/amd/display: power up all gating blocks when releasing hw DCN35Yihan Zhu
[WHY & HOW] Driver disable will deallocate framebuffer to reset IPS state, this will cause IPS start with INIT state to blindly power gate ONO region to break power sequence. All the gating blocks should be powered up when releasing hw to ensure all the power optimizations are identical to pre-OS. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Duncan Ma <duncan.ma@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06drm/amd/display: update sequential pg logic DCN35Yihan Zhu
[WHY & HOW] No check for HUBP/DPP power gating when DSC instance is still running. Avoid HUBP/DPP to power gate when corresponding DSC block is still running in the power gating calculation. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Duncan Ma <duncan.ma@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-01-06drm/amdkfd: fixed page fault when enable MES shader debuggerJesse.zhang@amd.com
Initialize the process context address before setting the shader debugger. [ 260.781212] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0) [ 260.781236] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10 [ 260.781255] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A40 [ 260.781270] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5) [ 260.781284] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0 [ 260.781296] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0 [ 260.781308] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x4 [ 260.781320] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0 [ 260.781332] amdgpu 0000:03:00.0: amdgpu: RW: 0x1 [ 260.782017] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0) [ 260.782039] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10 [ 260.782058] amdgpu 0000:03:00.0: amdgpu: GCVM_L2_PROTECTION_FAULT_STATUS:0x00040A41 [ 260.782073] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: CPC (0x5) [ 260.782087] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x1 [ 260.782098] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0 [ 260.782110] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x4 [ 260.782122] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0 [ 260.782137] amdgpu 0000:03:00.0: amdgpu: RW: 0x1 [ 260.782155] amdgpu 0000:03:00.0: amdgpu: [gfxhub] page fault (src_id:0 ring:32 vmid:0 pasid:0) [ 260.782166] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000000000000 from client 10 Fixes: 438b39ac74e2 ("drm/amdkfd: pause autosuspend when creating pdd") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3849 Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-19drm/amd/display: Reapply fdedd77b0eb3Nathan Chancellor
Commit 2563391e57b5 ("drm/amd/display: DML2.1 resynchronization") blew away the compiler warning fix from commit 2fde4fdddc1f ("drm/amd/display: Avoid -Wenum-float-conversion in add_margin_and_round_to_dfs_grainularity()"), causing the warning to reappear. drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c:183:58: error: arithmetic between enumeration type 'enum dentist_divider_range' and floating-point type 'double' [-Werror,-Wenum-float-conversion] 183 | divider = (unsigned int)(DFS_DIVIDER_RANGE_SCALE_FACTOR * (vco_freq_khz / clock_khz)); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Apply the fix again to resolve the warning. Re-apply again after commit be4e3509314a ("drm/amd/display: DML21 Reintegration For Various Fixes") This should be making its way back to the original DML trees this time. (Alex) Fixes: be4e3509314a ("drm/amd/display: DML21 Reintegration For Various Fixes") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3841 Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-19drm/amd/display: fix divide error in DM plane scale calcsMelissa Wen
dm_get_plane_scale doesn't take into account plane scaled size equal to zero, leading to a kernel oops due to division by zero. Fix by setting out-scale size as zero when the dst size is zero, similar to what is done by drm_calc_scale(). This issue started with the introduction of cursor ovelay mode that uses this function to assess cursor mode changes via dm_crtc_get_cursor_mode() before checking plane state. [Dec17 17:14] Oops: divide error: 0000 [#1] PREEMPT SMP NOPTI [ +0.000018] CPU: 5 PID: 1660 Comm: surface-DP-1 Not tainted 6.10.0+ #231 [ +0.000007] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0131 01/30/2024 [ +0.000004] RIP: 0010:dm_get_plane_scale+0x3f/0x60 [amdgpu] [ +0.000553] Code: 44 0f b7 41 3a 44 0f b7 49 3e 83 e0 0f 48 0f a3 c2 73 21 69 41 28 e8 03 00 00 31 d2 41 f7 f1 31 d2 89 06 69 41 2c e8 03 00 00 <41> f7 f0 89 07 e9 d7 d8 7e e9 44 89 c8 45 89 c1 41 89 c0 eb d4 66 [ +0.000005] RSP: 0018:ffffa8df0de6b8a0 EFLAGS: 00010246 [ +0.000006] RAX: 00000000000003e8 RBX: ffff9ac65c1f6e00 RCX: ffff9ac65d055500 [ +0.000003] RDX: 0000000000000000 RSI: ffffa8df0de6b8b0 RDI: ffffa8df0de6b8b4 [ +0.000004] RBP: ffff9ac64e7a5800 R08: 0000000000000000 R09: 0000000000000a00 [ +0.000003] R10: 00000000000000ff R11: 0000000000000054 R12: ffff9ac6d0700010 [ +0.000003] R13: ffff9ac65d054f00 R14: ffff9ac65d055500 R15: ffff9ac64e7a60a0 [ +0.000004] FS: 00007f869ea00640(0000) GS:ffff9ac970080000(0000) knlGS:0000000000000000 [ +0.000004] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000003] CR2: 000055ca701becd0 CR3: 000000010e7f2000 CR4: 0000000000350ef0 [ +0.000004] Call Trace: [ +0.000007] <TASK> [ +0.000006] ? __die_body.cold+0x19/0x27 [ +0.000009] ? die+0x2e/0x50 [ +0.000007] ? do_trap+0xca/0x110 [ +0.000007] ? do_error_trap+0x6a/0x90 [ +0.000006] ? dm_get_plane_scale+0x3f/0x60 [amdgpu] [ +0.000504] ? exc_divide_error+0x38/0x50 [ +0.000005] ? dm_get_plane_scale+0x3f/0x60 [amdgpu] [ +0.000488] ? asm_exc_divide_error+0x1a/0x20 [ +0.000011] ? dm_get_plane_scale+0x3f/0x60 [amdgpu] [ +0.000593] dm_crtc_get_cursor_mode+0x33f/0x430 [amdgpu] [ +0.000562] amdgpu_dm_atomic_check+0x2ef/0x1770 [amdgpu] [ +0.000501] drm_atomic_check_only+0x5e1/0xa30 [drm] [ +0.000047] drm_mode_atomic_ioctl+0x832/0xcb0 [drm] [ +0.000050] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm] [ +0.000047] drm_ioctl_kernel+0xb3/0x100 [drm] [ +0.000062] drm_ioctl+0x27a/0x4f0 [drm] [ +0.000049] ? __pfx_drm_mode_atomic_ioctl+0x10/0x10 [drm] [ +0.000055] amdgpu_drm_ioctl+0x4e/0x90 [amdgpu] [ +0.000360] __x64_sys_ioctl+0x97/0xd0 [ +0.000010] do_syscall_64+0x82/0x190 [ +0.000008] ? __pfx_drm_mode_createblob_ioctl+0x10/0x10 [drm] [ +0.000044] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? drm_ioctl_kernel+0xb3/0x100 [drm] [ +0.000040] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? __check_object_size+0x50/0x220 [ +0.000007] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? drm_ioctl+0x2a4/0x4f0 [drm] [ +0.000039] ? __pfx_drm_mode_createblob_ioctl+0x10/0x10 [drm] [ +0.000043] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? __pm_runtime_suspend+0x69/0xc0 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? amdgpu_drm_ioctl+0x71/0x90 [amdgpu] [ +0.000366] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? syscall_exit_to_user_mode+0x77/0x210 [ +0.000007] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? do_syscall_64+0x8e/0x190 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000006] ? do_syscall_64+0x8e/0x190 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000007] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000008] RIP: 0033:0x55bb7cd962bc [ +0.000007] Code: 4c 89 6c 24 18 4c 89 64 24 20 4c 89 74 24 28 0f 57 c0 0f 11 44 24 30 89 c7 48 8d 54 24 08 b8 10 00 00 00 be bc 64 38 c0 0f 05 <49> 89 c7 48 83 3b 00 74 09 4c 89 c7 ff 15 62 64 99 00 48 83 7b 18 [ +0.000005] RSP: 002b:00007f869e9f4da0 EFLAGS: 00000217 ORIG_RAX: 0000000000000010 [ +0.000007] RAX: ffffffffffffffda RBX: 00007f869e9f4fb8 RCX: 000055bb7cd962bc [ +0.000004] RDX: 00007f869e9f4da8 RSI: 00000000c03864bc RDI: 000000000000003b [ +0.000003] RBP: 000055bb9ddcbcc0 R08: 00007f86541b9920 R09: 0000000000000009 [ +0.000004] R10: 0000000000000004 R11: 0000000000000217 R12: 00007f865406c6b0 [ +0.000003] R13: 00007f86541b5290 R14: 00007f865410b700 R15: 000055bb9ddcbc18 [ +0.000009] </TASK> Fixes: 1b04dcca4fb1 ("drm/amd/display: Introduce overlay cursor mode") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3729 Reported-by: Fabio Scaccabarozzi <fsvm88@gmail.com> Co-developed-by: Fabio Scaccabarozzi <fsvm88@gmail.com> Signed-off-by: Fabio Scaccabarozzi <fsvm88@gmail.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-19drm/amd/display: increase MAX_SURFACES to the value supported by hwMelissa Wen
As the hw supports up to 4 surfaces, increase the maximum number of surfaces to prevent the DC error when trying to use more than three planes. [drm:dc_state_add_plane [amdgpu]] *ERROR* Surface: can not attach plane_state 000000003e2cb82c! Maximum is: 3 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3693 Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-19drm/amd/display: fix page fault due to max surface definition mismatchMelissa Wen
DC driver is using two different values to define the maximum number of surfaces: MAX_SURFACES and MAX_SURFACE_NUM. Consolidate MAX_SURFACES as the unique definition for surface updates across DC. It fixes page fault faced by Cosmic users on AMD display versions that support two overlay planes, since the introduction of cursor overlay mode. [Nov26 21:33] BUG: unable to handle page fault for address: 0000000051d0f08b [ +0.000015] #PF: supervisor read access in kernel mode [ +0.000006] #PF: error_code(0x0000) - not-present page [ +0.000005] PGD 0 P4D 0 [ +0.000007] Oops: Oops: 0000 [#1] PREEMPT SMP NOPTI [ +0.000006] CPU: 4 PID: 71 Comm: kworker/u32:6 Not tainted 6.10.0+ #300 [ +0.000006] Hardware name: Valve Jupiter/Jupiter, BIOS F7A0131 01/30/2024 [ +0.000007] Workqueue: events_unbound commit_work [drm_kms_helper] [ +0.000040] RIP: 0010:copy_stream_update_to_stream.isra.0+0x30d/0x750 [amdgpu] [ +0.000847] Code: 8b 10 49 89 94 24 f8 00 00 00 48 8b 50 08 49 89 94 24 00 01 00 00 8b 40 10 41 89 84 24 08 01 00 00 49 8b 45 78 48 85 c0 74 0b <0f> b6 00 41 88 84 24 90 64 00 00 49 8b 45 60 48 85 c0 74 3b 48 8b [ +0.000010] RSP: 0018:ffffc203802f79a0 EFLAGS: 00010206 [ +0.000009] RAX: 0000000051d0f08b RBX: 0000000000000004 RCX: ffff9f964f0a8070 [ +0.000004] RDX: ffff9f9710f90e40 RSI: ffff9f96600c8000 RDI: ffff9f964f000000 [ +0.000004] RBP: ffffc203802f79f8 R08: 0000000000000000 R09: 0000000000000000 [ +0.000005] R10: 0000000000000000 R11: 0000000000000000 R12: ffff9f96600c8000 [ +0.000004] R13: ffff9f9710f90e40 R14: ffff9f964f000000 R15: ffff9f96600c8000 [ +0.000004] FS: 0000000000000000(0000) GS:ffff9f9970000000(0000) knlGS:0000000000000000 [ +0.000005] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000005] CR2: 0000000051d0f08b CR3: 00000002e6a20000 CR4: 0000000000350ef0 [ +0.000005] Call Trace: [ +0.000011] <TASK> [ +0.000010] ? __die_body.cold+0x19/0x27 [ +0.000012] ? page_fault_oops+0x15a/0x2d0 [ +0.000014] ? exc_page_fault+0x7e/0x180 [ +0.000009] ? asm_exc_page_fault+0x26/0x30 [ +0.000013] ? copy_stream_update_to_stream.isra.0+0x30d/0x750 [amdgpu] [ +0.000739] ? dc_commit_state_no_check+0xd6c/0xe70 [amdgpu] [ +0.000470] update_planes_and_stream_state+0x49b/0x4f0 [amdgpu] [ +0.000450] ? srso_return_thunk+0x5/0x5f [ +0.000009] ? commit_minimal_transition_state+0x239/0x3d0 [amdgpu] [ +0.000446] update_planes_and_stream_v2+0x24a/0x590 [amdgpu] [ +0.000464] ? srso_return_thunk+0x5/0x5f [ +0.000009] ? sort+0x31/0x50 [ +0.000007] ? amdgpu_dm_atomic_commit_tail+0x159f/0x3a30 [amdgpu] [ +0.000508] ? srso_return_thunk+0x5/0x5f [ +0.000009] ? amdgpu_crtc_get_scanout_position+0x28/0x40 [amdgpu] [ +0.000377] ? srso_return_thunk+0x5/0x5f [ +0.000009] ? drm_crtc_vblank_helper_get_vblank_timestamp_internal+0x160/0x390 [drm] [ +0.000058] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? dma_fence_default_wait+0x8c/0x260 [ +0.000010] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? wait_for_completion_timeout+0x13b/0x170 [ +0.000006] ? srso_return_thunk+0x5/0x5f [ +0.000005] ? dma_fence_wait_timeout+0x108/0x140 [ +0.000010] ? commit_tail+0x94/0x130 [drm_kms_helper] [ +0.000024] ? process_one_work+0x177/0x330 [ +0.000008] ? worker_thread+0x266/0x3a0 [ +0.000006] ? __pfx_worker_thread+0x10/0x10 [ +0.000004] ? kthread+0xd2/0x100 [ +0.000006] ? __pfx_kthread+0x10/0x10 [ +0.000006] ? ret_from_fork+0x34/0x50 [ +0.000004] ? __pfx_kthread+0x10/0x10 [ +0.000005] ? ret_from_fork_asm+0x1a/0x30 [ +0.000011] </TASK> Fixes: 1b04dcca4fb1 ("drm/amd/display: Introduce overlay cursor mode") Suggested-by: Leo Li <sunpeng.li@amd.com> Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3693 Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-19drm/amd/display: Remove unnecessary amdgpu_irq_get/putAlex Hung
[WHY & HOW] commit 7fb363c57522 ("drm/amd/display: Let drm_crtc_vblank_on/off manage interrupts") lets drm_crtc_vblank_* to manage interrupts in amdgpu_dm_crtc_set_vblank, and amdgpu_irq_get/put do not need to be called here. Part of that patch got lost somehow, so fix it up. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu: Handle NULL bo->tbo.resource (again) in amdgpu_vm_bo_updateMichel Dänzer
Third time's the charm, I hope? Fixes: d3116756a710 ("drm/ttm: rename bo->mem and make it a pointer") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3837 Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/admgpu: replace kmalloc() and memcpy() with kmemdup()Mirsad Todorovac
The static analyser tool gave the following advice: ./drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:1266:7-14: WARNING opportunity for kmemdup → 1266 tmp = kmalloc(used_size, GFP_KERNEL); 1267 if (!tmp) 1268 return -ENOMEM; 1269 → 1270 memcpy(tmp, &host_telemetry->body.error_count, used_size); Replacing kmalloc() + memcpy() with kmemdump() doesn't change semantics. Original code works without fault, so this is not a bug fix but proposed improvement. Link: https://lwn.net/Articles/198928/ Fixes: 84a2947ecc85 ("drm/amdgpu: Implement virt req_ras_err_count") Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: Xinhui Pan <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: Zhigang Luo <Zhigang.Luo@amd.com> Cc: Victor Skvortsov <victor.skvortsov@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Yunxiang Li <Yunxiang.Li@amd.com> Cc: Jack Xiao <Jack.Xiao@amd.com> Cc: Vignesh Chander <Vignesh.Chander@amd.com> Cc: Danijel Slivka <danijel.slivka@amd.com> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Mirsad Todorovac <mtodorovac69@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: Fix NULL pointer dereference in dmub_tracebuffer_showSrinivasan Shanmugam
It corrects the issue by checking if 'adev->dm.dmub_srv' is NULL before accessing its 'meta_info' member. This ensures that we do not dereference a NULL pointer. Fixes the below: drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_debugfs.c:917 dmub_tracebuffer_show() warn: address of 'adev->dm.dmub_srv->meta_info' is non-NULL drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm_debugfs.c 901 static int dmub_tracebuffer_show(struct seq_file *m, void *data) 902 { 903 struct amdgpu_device *adev = m->private; 904 struct dmub_srv_fb_info *fb_info = adev->dm.dmub_fb_info; 905 struct dmub_fw_meta_info *fw_meta_info = &adev->dm.dmub_srv->meta_info; ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Even if adev->dm.dmub_srv is NULL, the address of ->meta_info can't be NULL 906 struct dmub_debugfs_trace_entry *entries; 907 uint8_t *tbuf_base; 908 uint32_t tbuf_size, max_entries, num_entries, first_entry, i; 909 910 if (!fb_info) 911 return 0; 912 913 tbuf_base = (uint8_t *)fb_info->fb[DMUB_WINDOW_5_TRACEBUFF].cpu_addr; 914 if (!tbuf_base) 915 return 0; 916 --> 917 tbuf_size = fw_meta_info ? fw_meta_info->trace_buffer_size : ^^^^^^^^^^^^ Always non-NULL 918 DMUB_TRACE_BUFFER_SIZE; 919 max_entries = (tbuf_size - sizeof(struct dmub_debugfs_trace_header)) / 920 sizeof(struct dmub_debugfs_trace_entry); 921 922 num_entries = v2: Initialize struct dmub_fw_meta_info *fw_meta_info to NULL (Dan Carpenter) Fixes: 5a498172c8d0 ("drm/amd/display: Make DMCUB tracebuffer debugfs chronological") Cc: Leo Li <sunpeng.li@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu: Show warning message if IH ring overflowPhilip Yang
If IH primary ring and KFD ih fifo overflows, we may miss CP, SDMA interrupts and cause application soft hang. Show warning message with ring name if overflow happens. Add function to get ih ring name to avoid duplicating it. To keep warning message consistent between GPU generations, change all *_ih.c except ASICs older than Vega which has only one ih ring. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdkfd: Improve signal event slow pathPhilip Yang
If event slot is not signaled, kfd_signal_event_interrupt goes to slow path to scan all event slots to find the signaled event, this is needed for old ASICs that don't have the event ID or the event IDs are incorrect in the IH payload. There is case that GPU signal the same event twice, then driver process the first event interrupt, set_event and event slot is auto-reset, then for the second event interrupt, KFD goes to slow path as event is not signaled, just drop the second event interrupt because the application only need wakeup once. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdkfd: Queue interrupt work to different CPUPhilip Yang
For CPX mode, each KFD node has interrupt worker to process ih_fifo to send events to user space. Currently all interrupt workers of same adev queue to same CPU, all workers execution are actually serialized and this cause KFD ih_fifo overflow when CPU usage is high. Use per-GPU unbounded highpri queue with number of workers equals to number of partitions, let queue_work select the next CPU round robin among the local CPUs of same NUMA. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu: Optimize gfx v9 GPU page fault handlingPhilip Yang
After GPU page fault, there are lots of page fault interrupts generated at short period even with CAM filter enabled because the fault address is different. Each page fault copy to KFD ih fifo to send event to user space by KFD interrupt worker, this could cause KFD ih fifo overflow while other processes generate events at same time. KFD process is aborted after GPU page fault, we only need one GPU page fault interrupt sent to KFD ih fifo to send memory exception event to user space. Incease KFD ih fifo size to 2 times of IH primary ring size, to handle the burst events case. This patch handle the gfx v9 path, cover retry on/off and CAM filter on/off cases. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdkfd: KFD interrupt access ih_fifo data in-placePhilip Yang
To handle 40000 to 80000 interrupts per second running CPX mode with 4 streams/queues per KFD node, KFD interrupt handler becomes the performance bottleneck. Remove the kfifo_out memcpy overhead by accessing ih_fifo data in-place and updating rptr with kfifo_skip_count. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu: partially revert "reduce reset time"Christian König
This partially reverts commit 194eb174cbe4fe2b3376ac30acca2dc8c8beca00. This commit introduced a new state variable into adev without even remotely worrying about CPU barriers. Since we already have the amdgpu_in_reset() function exactly for this use case partially revert that. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu: set the VM pointer to NULL in amdgpu_job_prepareChristian König
As soon as the prepare phase is completed the VM might be released, better set it to NULL. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu: fix amdgpu_coredumpChristian König
The VM pointer might already be outdated when that function is called. Use the PASID instead to gather the information instead. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu: Enable psp v14_0_3 RAS support for non-SRIOV configurations.Candice Li
Enable psp v14_0_3 RAS support for non-SRIOV configurations. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu: Don't enable sdma 4.4.5 CTXEMPTY interruptPhilip Yang
The sdma context empty interrupt is dropped in amdgpu_irq_dispatch as unregistered interrupt src_id 243, this interrupt accounts to 1/3 of total interrupts and causes IH primary ring overflow when running stressful benchmark application. Disable this interrupt has no side effect found. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu: Fix potential integer overflow in scheduler mask calculationsKarol Przybylski
The use of 1 << i in scheduler mask calculations can result in an unintentional integer overflow due to the expression being evaluated as a 32-bit signed integer. This patch replaces 1 << i with 1ULL << i to ensure the operation is performed as a 64-bit unsigned integer, preventing overflow Discovered in coverity scan, CID 1636393, 1636175, 1636007, 1635853 Fixes: c5c63d9cb5d3 ("drm/amdgpu: add amdgpu_gfx_sched_mask and amdgpu_compute_sched_mask debugfs") Signed-off-by: Karol Przybylski <karprzy7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu/smu14.0.2: fix IP version checkAlex Deucher
Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu/gfx12: fix IP version checkAlex Deucher
Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu/mmhub4.1: fix IP version checkAlex Deucher
Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu/nbio7.11: fix IP version checkAlex Deucher
Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu/nbio7.0: fix IP version checkAlex Deucher
Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu/nbio7.7: fix IP version checkAlex Deucher
Use the helper function rather than reading it directly. Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: 3.2.314Aric Cyr
DC 3.2.314 contains some improvements as summarized below: * Update DML21 code. * Fixes for FAMS2 interface. * HDMI fixes. * Compilation warning fixes. Signed-off-by: Aric Cyr <aric.cyr@amd.com> Acked-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: Disable MPC rate control on ODM pipe updateGeorge Shen
[Why] Seamless boot skips MPC init for the active pipe, resulting in stale MPC rate control state being retained. This will cause issues since other logic assumes it is disabled (as DCN30 and newer does not need it). [How] Disable MPC rate control on ODM pipe update to cover the seamless boot case. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: George Shen <george.shen@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: Block Invalid TMDS operationChris Park
[Why] When sink type is TMDS, PHY programming does not block against pixel clock greater than 600MHz. [How] Based on sink type, block greater than 600MHz phy programming. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Chris Park <chris.park@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: Re-validate streams on commit_streamsDillon Varone
To prevent invalid HW programming, streams should be revalidated first before committing to HW. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18Revert "drm/amd/display: Fix green screen issue after suspend"Rodrigo Siqueira
This reverts commit 87b7ebc2e16c14d32a912f18206a4d6cc9abc3e8. A long time ago, we had an issue with the Raven system when it was connected to two displays: one with DP and another with HDMI. After the system woke up from suspension, we saw a solid green screen caused by an underflow generated by bad DCC metadata. To workaround this issue, the 'commit 87b7ebc2e16c ("drm/amd/display: Fix green screen issue after suspend")' was introduced to disable the DCC for a few frames after in the resume phase. However, in hindsight, this solution was probably a workaround at the kernel level for some issues from another part (probably other driver components or user space). After applying this patch and trying to reproduce the green issue in a similar hardware system but using the latest kernel and userspace, we cannot see the issue, which makes this workaround obsolete and creates extra unnecessary complexity to the code; for all of this reason, this commit reverts the original change. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: Fix uninitialized variables in amdgpu_dm_debugfsAlex Hung
[WHAT] Some fields in struct dc_link_settings and link_training_settings are not initialized and using them can cause unexpected results. [HOW] Initialize struct dc_link_settings and link_training_settings to zero. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: Apply (some) policy for DML2 formulation on DCN35/DCN351Nicholas Kazlauskas
[Why] Dropping the entirety of dml2_policy_build_synthetic_soc_states exposes an issue for states that cannot be filled via bbox_overrides and rely on the default parameters that may or may not be present depending on the DM. For amdgpu_dm this results in missing parameters for most of the struct in higher states: - sr_exit_time_us - sr_enter_plus_exit_time_us - sr_exit_z8_time_us - sr_enter_plus_exit_z8_time_us - urgent_latency_pixel_data_only_us - urgent_latency_pixel_mixed_with_vm_data_us - urgent_latency_vm_data_only_us - dram_clock_change_latency_us - fclk_change_latency_us - usr_retraining_latency_us - writeback_latency_us - urgent_latency_adjustment_fabric_clock_component_us - urgent_latency_adjustment_fabric_clock_reference_mhz - dscclk_mhz - phyclk_mhz - phyclk_d18_mhz - phyclk_d32_mhz - use_ideal_dram_bw_strobe [How] Copy from the first state, applying a minimal policy to set max clocks for SOC independent values. Then copy the SOC dependent ones from the states modified by bbox_overrides. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: delete legacy codeShunlu Zhang
Delete unused code. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Jun Lei <jun.lei@amd.com> Signed-off-by: Shunlu Zhang <Shunlu.Zhang@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: Add new message for DF throttling optimization on dcn401Dillon Varone
[WHY] When effective bandwidth from the SoC is enough to perform SubVP prefetchs, then DF throttling is not required. [HOW] Provide SMU the required clocks for which DF throttling is not required. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Alvin Lee <alvin.lee2@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: DML21 Reintegration For Various FixesAustin Zheng
Reintegrate latest DML21 code. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Austin Zheng <Austin.Zheng@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: Fix brightness adjustment on MiniLEDHarry VanZyllDeJong
[Why] Older Asics were changed to target new DCN while still needing older support causing brightness adjustments to fail. [How] Reverted the DCN targets on required DCNs Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Iswara Nagulendran <iswara.nagulendran@amd.com> Signed-off-by: Harry VanZyllDeJong <hvanzyll@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: Fix Mode Cutoff in DSC Passthrough to DP2.1 MonitorFangzhi Zuo
Source --> DP2.1 MST hub --> DP1.4/2.1 monitor When change from DP1.4 to DP2.1 from monitor manual, modes higher than 4k120 are all cutoff by mode validation. Switch back to DP1.4 gets all the modes up to 4k240 available to be enabled by dsc passthrough. [why] Compared to DP1.4 link from hub to monitor, DP2.1 link has larger full_pbn value that causes overflow in the process of doing conversion from pbn to kbps. [how] Change the data type accordingly to fit into the data limit during conversion calculation. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: init dc_power_stateCharlene Liu
Initialize the power state for dc use Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Chris Park <chris.park@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amd/display: initialize uninitialized variableMeera Patel
[WHY] There is one uninitialized variable in file dc/hwss/dcn401/dcn401_hwseq.c, which trigger com compile warnings. [HOW] Initialize the unininitialized variable. Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Reviewed-by: Ariel Bernstein <eric.bernstein@amd.com> Signed-off-by: Meera Patel <meera.patel@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>