summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd
AgeCommit message (Collapse)Author
2024-12-14drm/amd/display: disable SG displays on cyan skillfishAlex Deucher
[ Upstream commit 66369db7fdd7d58d78673bf83d2b87ea623efb63 ] These parts were mainly for compute workloads, but they have a display that was available for the console. These chips should support SG display, but I don't know that the support was ever validated on Linux so disable it by default. It can still be enabled by setting sg_display=1 for those that want to play with it. These systems also generally had large carve outs so SG display was less of a factor. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3356 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14drm/amd/display: Fix garbage or black screen when resetting otgZhongwei
[ Upstream commit ffa1e31f70d2e97c121709b44a8960f5d7becb10 ] [Why] For some EDP to MIPI panel, disabling OTG when link is alive like boot case, the converter might output garbage or show no display because our GPU is not sending required pixel data. Alos Dig fifo underflow was found which might cause garbage, when resetting otg for other types of EDP panels. [How] Skipping resetting OTG if the dig fifo is on. Make sure that the otg for the pipe is the one that the dig fifo is selecting via the FE mask. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Zhongwei <Zhongwei.Zhang@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14drm/amd/display: skip disable CRTC in seemless bootup caseFudongwang
[ Upstream commit 0e37e4b9afbd08df1f00a70bbb4d1ec273d18c9e ] Resync FIFO is a workaround to write the same value to DENTIST_DISPCLK_CNTL register after programming OTG_PIXEL_RATE_DIV register, in case seemless boot, there is no OTG_PIXEL_RATE_DIV register update, so skip CRTC disable when resync FIFO to avoid random FIFO error and garbage. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Fudongwang <Fudong.Wang@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14drm/amd/display: Fix out-of-bounds access in 'dcn21_link_encoder_create'Srinivasan Shanmugam
[ Upstream commit 63de35a8fcfca59ae8750d469a7eb220c7557baf ] An issue was identified in the dcn21_link_encoder_create function where an out-of-bounds access could occur when the hpd_source index was used to reference the link_enc_hpd_regs array. This array has a fixed size and the index was not being checked against the array's bounds before accessing it. This fix adds a conditional check to ensure that the hpd_source index is within the valid range of the link_enc_hpd_regs array. If the index is out of bounds, the function now returns NULL to prevent undefined behavior. References: [ 65.920507] ------------[ cut here ]------------ [ 65.920510] UBSAN: array-index-out-of-bounds in drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn21/dcn21_resource.c:1312:29 [ 65.920519] index 7 is out of range for type 'dcn10_link_enc_hpd_registers [5]' [ 65.920523] CPU: 3 PID: 1178 Comm: modprobe Tainted: G OE 6.8.0-cleanershaderfeatureresetasdntipmi200nv2132 #13 [ 65.920525] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS WMJ0429N_Weekly_20_04_2 04/29/2020 [ 65.920527] Call Trace: [ 65.920529] <TASK> [ 65.920532] dump_stack_lvl+0x48/0x70 [ 65.920541] dump_stack+0x10/0x20 [ 65.920543] __ubsan_handle_out_of_bounds+0xa2/0xe0 [ 65.920549] dcn21_link_encoder_create+0xd9/0x140 [amdgpu] [ 65.921009] link_create+0x6d3/0xed0 [amdgpu] [ 65.921355] create_links+0x18a/0x4e0 [amdgpu] [ 65.921679] dc_create+0x360/0x720 [amdgpu] [ 65.921999] ? dmi_matches+0xa0/0x220 [ 65.922004] amdgpu_dm_init+0x2b6/0x2c90 [amdgpu] [ 65.922342] ? console_unlock+0x77/0x120 [ 65.922348] ? dev_printk_emit+0x86/0xb0 [ 65.922354] dm_hw_init+0x15/0x40 [amdgpu] [ 65.922686] amdgpu_device_init+0x26a8/0x33a0 [amdgpu] [ 65.922921] amdgpu_driver_load_kms+0x1b/0xa0 [amdgpu] [ 65.923087] amdgpu_pci_probe+0x1b7/0x630 [amdgpu] [ 65.923087] local_pci_probe+0x4b/0xb0 [ 65.923087] pci_device_probe+0xc8/0x280 [ 65.923087] really_probe+0x187/0x300 [ 65.923087] __driver_probe_device+0x85/0x130 [ 65.923087] driver_probe_device+0x24/0x110 [ 65.923087] __driver_attach+0xac/0x1d0 [ 65.923087] ? __pfx___driver_attach+0x10/0x10 [ 65.923087] bus_for_each_dev+0x7d/0xd0 [ 65.923087] driver_attach+0x1e/0x30 [ 65.923087] bus_add_driver+0xf2/0x200 [ 65.923087] driver_register+0x64/0x130 [ 65.923087] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu] [ 65.923087] __pci_register_driver+0x61/0x70 [ 65.923087] amdgpu_init+0x7d/0xff0 [amdgpu] [ 65.923087] do_one_initcall+0x49/0x310 [ 65.923087] ? kmalloc_trace+0x136/0x360 [ 65.923087] do_init_module+0x6a/0x270 [ 65.923087] load_module+0x1fce/0x23a0 [ 65.923087] init_module_from_file+0x9c/0xe0 [ 65.923087] ? init_module_from_file+0x9c/0xe0 [ 65.923087] idempotent_init_module+0x179/0x230 [ 65.923087] __x64_sys_finit_module+0x5d/0xa0 [ 65.923087] do_syscall_64+0x76/0x120 [ 65.923087] entry_SYSCALL_64_after_hwframe+0x6e/0x76 [ 65.923087] RIP: 0033:0x7f2d80f1e88d [ 65.923087] Code: 5b 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 b5 0f 00 f7 d8 64 89 01 48 [ 65.923087] RSP: 002b:00007ffc7bc1aa78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 65.923087] RAX: ffffffffffffffda RBX: 0000564c9c1db130 RCX: 00007f2d80f1e88d [ 65.923087] RDX: 0000000000000000 RSI: 0000564c9c1e5480 RDI: 000000000000000f [ 65.923087] RBP: 0000000000040000 R08: 0000000000000000 R09: 0000000000000002 [ 65.923087] R10: 000000000000000f R11: 0000000000000246 R12: 0000564c9c1e5480 [ 65.923087] R13: 0000564c9c1db260 R14: 0000000000000000 R15: 0000564c9c1e54b0 [ 65.923087] </TASK> [ 65.923927] ---[ end trace ]--- Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14drm/amdgpu/hdp5.2: do a posting read when flushing HDPAlex Deucher
commit f756dbac1ce1d5f9a2b35e3b55fa429cf6336437 upstream. Need to read back to make sure the write goes through. Cc: David Belanger <david.belanger@amd.com> Reviewed-by: Frank Min <frank.min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14drm/amdgpu/hdp7.0: do a posting read when flushing HDPAlex Deucher
commit 689275140cb8e9f8ae59e545086fce51fb0b994a upstream. Need to read back to make sure the write goes through. Cc: David Belanger <david.belanger@amd.com> Reviewed-by: Frank Min <frank.min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14drm/amdgpu/hdp5.0: do a posting read when flushing HDPAlex Deucher
commit cf424020e040be35df05b682b546b255e74a420f upstream. Need to read back to make sure the write goes through. Cc: David Belanger <david.belanger@amd.com> Reviewed-by: Frank Min <frank.min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14drm/amdgpu/hdp4.0: do a posting read when flushing HDPAlex Deucher
commit c9b8dcabb52afe88413ff135a0953e3cc4128483 upstream. Need to read back to make sure the write goes through. Cc: David Belanger <david.belanger@amd.com> Reviewed-by: Frank Min <frank.min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14drm/amdgpu/hdp6.0: do a posting read when flushing HDPAlex Deucher
commit abe1cbaec6cfe9fde609a15cd6a12c812282ce77 upstream. Need to read back to make sure the write goes through. Cc: David Belanger <david.belanger@amd.com> Reviewed-by: Frank Min <frank.min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14drm/amd/display: Add a left edge pixel if in YCbCr422 or YCbCr420 and odmPeterson Guo
commit 63e7ee677c74e981257cedfdd8543510d09096ba upstream. [WHY] On some cards when odm is used, the monitor will have 2 separate pipes split vertically. When compression is used on the YCbCr colour space on the second pipe to have correct colours, we need to read a pixel from the end of first pipe to accurately display colours. Hardware was programmed properly to account for this extra pixel but it was not calculated properly in software causing a split screen on some monitors. [HOW] The fix adjusts the second pipe's viewport and timings if the pixel encoding is YCbCr422 or YCbCr420. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: George Shen <george.shen@amd.com> Signed-off-by: Peterson Guo <peterson.guo@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14drm/amd/display: Limit VTotal range to max hw cap minus fpDillon Varone
commit a29997b7ac1f5c816b543e0c56aa2b5b56baac24 upstream. [WHY & HOW] Hardware does not support the VTotal to be between fp2 lines of the maximum possible VTotal, so add a capability flag to track it and apply where necessary. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Jun Lei <jun.lei@amd.com> Reviewed-by: Anthony Koo <anthony.koo@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14drm/amd/display: Correct prefetch calculationLo-an Chen
commit 24d3749c11d949972d8c22e75567dc90ff5482e7 upstream. [WHY] The minimum value of the dst_y_prefetch_equ was not correct in prefetch calculation whice causes OPTC underflow. [HOW] Add the min operation of dst_y_prefetch_equ in prefetch calculation. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Lo-an Chen <lo-an.chen@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14drm/amd/pm: fix and simplify workload handlingAlex Deucher
commit 1443dd3c67f6d1a8bd1f810e598e2f0c6f19205c upstream. smu->workload_mask is IP specific and should not be messed with in the common code. The mask bits vary across SMU versions. Move all handling of smu->workload_mask in to the backends and simplify the code. Store the user's preference in smu->power_profile_mode which will be reflected in sysfs. For internal driver profile switches for KFD or VCN, just update the workload mask so that the user's preference is retained. Remove all of the extra now unused workload related elements in the smu structure. v2: use refcounts for workload profiles v3: rework based on feedback from Lijo v4: fix the refcount on failure, drop backend mask v5: rework custom handling v6: handle failure cleanup with custom profile v7: Update documentation Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: Kenneth Feng <kenneth.feng@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: stable@vger.kernel.org # 6.11.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14drm/amdkfd: add MEC version that supports no PCIe atomics for GFX12Sreekant Somasekharan
commit 33114f1057ea5cf40e604021711a9711a060fcb6 upstream. Add MEC version from which alternate support for no PCIe atomics is provided so that device is not skipped during KFD device init in GFX1200/GFX1201. Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14drm/amdkfd: hard-code cacheline for gc943,gc944David Yat Sin
commit 55ed120dcfdde2478c3ebfa1c0ac4ed1e430053b upstream. Cacheline size is not available in IP discovery for gc943,gc944. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-14drm/amd/display: Ignore scalar validation failure if pipe is phantomChris Park
[ Upstream commit c33a93201ca07119de90e8c952fbdf65920ab55d ] [Why] There are some pipe scaler validation failure when the pipe is phantom and causes crash in DML validation. Since, scalar parameters are not as important in phantom pipe and we require this plane to do successful MCLK switches, the failure condition can be ignored. [How] Ignore scalar validation failure if the pipe validation is marked as phantom pipe. Cc: stable@vger.kernel.org # 6.11+ Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Chris Park <chris.park@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-14drm/amd/display: calculate final viewport before TAP optimizationYihan Zhu
[ Upstream commit e982310c9ce074e428abc260dc3cba1b1ea62b78 ] Viewport size excess surface size observed sometime with some timings or resizing the MPO video window to cause MPO unsupported. Calculate final viewport size first with a 100x100 dummy viewport to get the max TAP support and then re-run final viewport calculation if TAP value changed. Removed obsolete preliminary viewport calculation for TAP validation. Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: c33a93201ca0 ("drm/amd/display: Ignore scalar validation failure if pipe is phantom") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-09drm/amd/display: Remove PIPE_DTO_SRC_SEL programming from set_dtbclk_dtoOvidiu Bunea
commit a3e6079bd93d5c66a43bf6a5f90e5b98465dc7b3 upstream. There are cases where an OTG is remapped from driving a regular HDMI display to a DP/eDP display. There are also cases where DTBCLK needs to be enabled for HPO, but DTBCLK DTO programming may be done while OTG is still enabled which is dangerous as the PIPE_DTO_SRC_SEL programming may change the pixel clock generator source for a mapped and running OTG and cause it to hang. Remove the PIPE_DTO_SRC_SEL programming from this sequence since it is already done in program_pixel_clk(). Additionally, make sure that program_pixel_clk sets DTBCLK DTO as source for special HDMI cases. Cc: stable@vger.kernel.org # 6.11+ Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Ovidiu Bunea <Ovidiu.Bunea@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-09drm/amd/display: update pipe selection policy to check head pipeYihan Zhu
commit 8fef253c94a5312b9150b2ff8e633b331bac7e88 upstream. [Why] No check on head pipe during the dml to dc hw mapping will allow illegal pipe usage. This will result in a wrong pipe topology to cause mpcc tree totally mess up then cause a display hang. [How] Avoid to use the pipe is head in all check and avoid ODM slice during preferred pipe check. Cc: stable@vger.kernel.org Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Yihan Zhu <Yihan.Zhu@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-09drm/amd/display: Fix handling of plane refcountJoshua Aberback
commit 27227a234c1487cb7a684615f0749c455218833a upstream. [Why] The mechanism to backup and restore plane states doesn't maintain refcount, which can cause issues if the refcount of the plane changes in between backup and restore operations, such as memory leaks if the refcount was supposed to go down, or double frees / invalid memory accesses if the refcount was supposed to go up. [How] Cache and re-apply current refcount when restoring plane states. Cc: stable@vger.kernel.org Reviewed-by: Josip Pavic <josip.pavic@amd.com> Signed-off-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-09drm/amd/pm: Remove arcturus min power limitLijo Lazar
commit da868898cf4c5ddbd1f7406e356edce5d7211eb5 upstream. As per power team, there is no need to impose a lower bound on arcturus power limit. Any unreasonable limit set will result in frequent throttling. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-09drm/amd/pm: disable pcie speed switching on Intel platform for smu v14.0.2/3Kenneth Feng
commit b0df0e777874549c128b43f7bf4989a2ed24b37a upstream. disable pcie speed switching on Intel platform for smu v14.0.2/3 based on Intel's requirement. v2: align the setting with smu v13. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-09drm/amd/pm: update current_socclk and current_uclk in gpu_metrics on smu v13.0.7Umio Yasuno
commit 2abf2f7032df4c4e7f6cf7906da59d0e614897d6 upstream. These were missed before. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3751 Signed-off-by: Umio Yasuno <coelacanth_dream@protonmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-09drm/amd: Fix initialization mistake for NBIO 7.11 devicesMario Limonciello
commit 349af06a3abd0bb3787ee2daf3ac508412fe8dcc upstream. There is a strapping issue on NBIO 7.11.x that can lead to spurious PME events while in the D0 state. Cc: stable@vger.kernel.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20241118174611.10700-2-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-09drm/amd/pm: skip setting the power source on smu v14.0.2/3Kenneth Feng
commit 76c7f08094767b5df3b60e18d1bdecddd4a5c844 upstream. skip setting power source on smu v14.0.2/3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-09drm/amdgpu: fix usage slab after freeVitaly Prosyak
commit b61badd20b443eabe132314669bb51a263982e5c upstream. [ +0.000021] BUG: KASAN: slab-use-after-free in drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched] [ +0.000027] Read of size 8 at addr ffff8881b8605f88 by task amd_pci_unplug/2147 [ +0.000023] CPU: 6 PID: 2147 Comm: amd_pci_unplug Not tainted 6.10.0+ #1 [ +0.000016] Hardware name: ASUS System Product Name/ROG STRIX B550-F GAMING (WI-FI), BIOS 1401 12/03/2020 [ +0.000016] Call Trace: [ +0.000008] <TASK> [ +0.000009] dump_stack_lvl+0x76/0xa0 [ +0.000017] print_report+0xce/0x5f0 [ +0.000017] ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched] [ +0.000019] ? srso_return_thunk+0x5/0x5f [ +0.000015] ? kasan_complete_mode_report_info+0x72/0x200 [ +0.000016] ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched] [ +0.000019] kasan_report+0xbe/0x110 [ +0.000015] ? drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched] [ +0.000023] __asan_report_load8_noabort+0x14/0x30 [ +0.000014] drm_sched_entity_flush+0x6cb/0x7a0 [gpu_sched] [ +0.000020] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? __kasan_check_write+0x14/0x30 [ +0.000016] ? __pfx_drm_sched_entity_flush+0x10/0x10 [gpu_sched] [ +0.000020] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? __kasan_check_write+0x14/0x30 [ +0.000013] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? enable_work+0x124/0x220 [ +0.000015] ? __pfx_enable_work+0x10/0x10 [ +0.000013] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? free_large_kmalloc+0x85/0xf0 [ +0.000016] drm_sched_entity_destroy+0x18/0x30 [gpu_sched] [ +0.000020] amdgpu_vce_sw_fini+0x55/0x170 [amdgpu] [ +0.000735] ? __kasan_check_read+0x11/0x20 [ +0.000016] vce_v4_0_sw_fini+0x80/0x110 [amdgpu] [ +0.000726] amdgpu_device_fini_sw+0x331/0xfc0 [amdgpu] [ +0.000679] ? mutex_unlock+0x80/0xe0 [ +0.000017] ? __pfx_amdgpu_device_fini_sw+0x10/0x10 [amdgpu] [ +0.000662] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? __kasan_check_write+0x14/0x30 [ +0.000013] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? mutex_unlock+0x80/0xe0 [ +0.000016] amdgpu_driver_release_kms+0x16/0x80 [amdgpu] [ +0.000663] drm_minor_release+0xc9/0x140 [drm] [ +0.000081] drm_release+0x1fd/0x390 [drm] [ +0.000082] __fput+0x36c/0xad0 [ +0.000018] __fput_sync+0x3c/0x50 [ +0.000014] __x64_sys_close+0x7d/0xe0 [ +0.000014] x64_sys_call+0x1bc6/0x2680 [ +0.000014] do_syscall_64+0x70/0x130 [ +0.000014] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? irqentry_exit_to_user_mode+0x60/0x190 [ +0.000015] ? srso_return_thunk+0x5/0x5f [ +0.000014] ? irqentry_exit+0x43/0x50 [ +0.000012] ? srso_return_thunk+0x5/0x5f [ +0.000013] ? exc_page_fault+0x7c/0x110 [ +0.000015] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000014] RIP: 0033:0x7ffff7b14f67 [ +0.000013] Code: ff e8 0d 16 02 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 c3 48 83 ec 18 89 7c 24 0c e8 73 ba f7 ff [ +0.000026] RSP: 002b:00007fffffffe378 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 [ +0.000019] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ffff7b14f67 [ +0.000014] RDX: 0000000000000000 RSI: 00007ffff7f6f47a RDI: 0000000000000003 [ +0.000014] RBP: 00007fffffffe3a0 R08: 0000555555569890 R09: 0000000000000000 [ +0.000014] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fffffffe5c8 [ +0.000013] R13: 00005555555552a9 R14: 0000555555557d48 R15: 00007ffff7ffd040 [ +0.000020] </TASK> [ +0.000016] Allocated by task 383 on cpu 7 at 26.880319s: [ +0.000014] kasan_save_stack+0x28/0x60 [ +0.000008] kasan_save_track+0x18/0x70 [ +0.000007] kasan_save_alloc_info+0x38/0x60 [ +0.000007] __kasan_kmalloc+0xc1/0xd0 [ +0.000007] kmalloc_trace_noprof+0x180/0x380 [ +0.000007] drm_sched_init+0x411/0xec0 [gpu_sched] [ +0.000012] amdgpu_device_init+0x695f/0xa610 [amdgpu] [ +0.000658] amdgpu_driver_load_kms+0x1a/0x120 [amdgpu] [ +0.000662] amdgpu_pci_probe+0x361/0xf30 [amdgpu] [ +0.000651] local_pci_probe+0xe7/0x1b0 [ +0.000009] pci_device_probe+0x248/0x890 [ +0.000008] really_probe+0x1fd/0x950 [ +0.000008] __driver_probe_device+0x307/0x410 [ +0.000007] driver_probe_device+0x4e/0x150 [ +0.000007] __driver_attach+0x223/0x510 [ +0.000006] bus_for_each_dev+0x102/0x1a0 [ +0.000007] driver_attach+0x3d/0x60 [ +0.000006] bus_add_driver+0x2ac/0x5f0 [ +0.000006] driver_register+0x13d/0x490 [ +0.000008] __pci_register_driver+0x1ee/0x2b0 [ +0.000007] llc_sap_close+0xb0/0x160 [llc] [ +0.000009] do_one_initcall+0x9c/0x3e0 [ +0.000008] do_init_module+0x241/0x760 [ +0.000008] load_module+0x51ac/0x6c30 [ +0.000006] __do_sys_init_module+0x234/0x270 [ +0.000007] __x64_sys_init_module+0x73/0xc0 [ +0.000006] x64_sys_call+0xe3/0x2680 [ +0.000006] do_syscall_64+0x70/0x130 [ +0.000007] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000015] Freed by task 2147 on cpu 6 at 160.507651s: [ +0.000013] kasan_save_stack+0x28/0x60 [ +0.000007] kasan_save_track+0x18/0x70 [ +0.000007] kasan_save_free_info+0x3b/0x60 [ +0.000007] poison_slab_object+0x115/0x1c0 [ +0.000007] __kasan_slab_free+0x34/0x60 [ +0.000007] kfree+0xfa/0x2f0 [ +0.000007] drm_sched_fini+0x19d/0x410 [gpu_sched] [ +0.000012] amdgpu_fence_driver_sw_fini+0xc4/0x2f0 [amdgpu] [ +0.000662] amdgpu_device_fini_sw+0x77/0xfc0 [amdgpu] [ +0.000653] amdgpu_driver_release_kms+0x16/0x80 [amdgpu] [ +0.000655] drm_minor_release+0xc9/0x140 [drm] [ +0.000071] drm_release+0x1fd/0x390 [drm] [ +0.000071] __fput+0x36c/0xad0 [ +0.000008] __fput_sync+0x3c/0x50 [ +0.000007] __x64_sys_close+0x7d/0xe0 [ +0.000007] x64_sys_call+0x1bc6/0x2680 [ +0.000007] do_syscall_64+0x70/0x130 [ +0.000007] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ +0.000014] The buggy address belongs to the object at ffff8881b8605f80 which belongs to the cache kmalloc-64 of size 64 [ +0.000020] The buggy address is located 8 bytes inside of freed 64-byte region [ffff8881b8605f80, ffff8881b8605fc0) [ +0.000028] The buggy address belongs to the physical page: [ +0.000011] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1b8605 [ +0.000008] anon flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff) [ +0.000007] page_type: 0xffffefff(slab) [ +0.000009] raw: 0017ffffc0000000 ffff8881000428c0 0000000000000000 dead000000000001 [ +0.000006] raw: 0000000000000000 0000000000200020 00000001ffffefff 0000000000000000 [ +0.000006] page dumped because: kasan: bad access detected [ +0.000012] Memory state around the buggy address: [ +0.000011] ffff8881b8605e80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ +0.000015] ffff8881b8605f00: 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc [ +0.000015] >ffff8881b8605f80: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ +0.000013] ^ [ +0.000011] ffff8881b8606000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc [ +0.000014] ffff8881b8606080: fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb fb [ +0.000013] ================================================================== The issue reproduced on VG20 during the IGT pci_unplug test. The root cause of the issue is that the function drm_sched_fini is called before drm_sched_entity_kill. In drm_sched_fini, the drm_sched_rq structure is freed, but this structure is later accessed by each entity within the run queue, leading to invalid memory access. To resolve this, the order of cleanup calls is updated: Before: amdgpu_fence_driver_sw_fini amdgpu_device_ip_fini After: amdgpu_device_ip_fini amdgpu_fence_driver_sw_fini This updated order ensures that all entities in the IPs are cleaned up first, followed by proper cleanup of the schedulers. Additional Investigation: During debugging, another issue was identified in the amdgpu_vce_sw_fini function. The vce.vcpu_bo buffer must be freed only as the final step in the cleanup process to prevent any premature access during earlier cleanup stages. v2: Using Christian suggestion call drm_sched_entity_destroy before drm_sched_fini. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Vitaly Prosyak <vitaly.prosyak@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-09drm/amd: Add some missing straps from NBIO 7.11.0Mario Limonciello
commit 902fbbf429b8213232b18de0ddfd5c0f3851cb8f upstream. Earlier ASICs have strap information exported, and this is missing for NBIO 7.11.0. Cc: stable@vger.kernel.org Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Fixes: ca8c68142ad8 ("drm/amdgpu: add nbio 7.11 registers") Link: https://lore.kernel.org/r/20241118174611.10700-1-mario.limonciello@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-09drm/amdgpu/pm: add gen5 display to the user on smu v14.0.2/3Kenneth Feng
commit 6719ab8234ce4b0c0e9aa93aaa94961e5b2bc852 upstream. add gen5 display to the user on smu v14.0.2/3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.11.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-09drm/amdkfd: Use the correct wptr sizeLijo Lazar
commit cdc6705f98ea3f854a60ba8c9b19228e197ae384 upstream. Write pointer could be 32-bit or 64-bit. Use the correct size during initialization. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-12-05drm/amd/display: Fix null check for pipe_ctx->plane_state in hwss_setup_dppZicheng Qu
[ Upstream commit 2bc96c95070571c6c824e0d4c7783bee25a37876 ] This commit addresses a null pointer dereference issue in hwss_setup_dpp(). The issue could occur when pipe_ctx->plane_state is null. The fix adds a check to ensure `pipe_ctx->plane_state` is not null before accessing. This prevents a null pointer dereference. Fixes: 0baae6246307 ("drm/amd/display: Refactor fast update to use new HWSS build sequence") Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Zicheng Qu <quzicheng@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amd/display: Fix null check for pipe_ctx->plane_state in dcn20_program_pipeZicheng Qu
[ Upstream commit 6a057072ddd127255350357dd880903e8fa23f36 ] This commit addresses a null pointer dereference issue in dcn20_program_pipe(). Previously, commit 8e4ed3cf1642 ("drm/amd/display: Add null check for pipe_ctx->plane_state in dcn20_program_pipe") partially fixed the null pointer dereference issue. However, in dcn20_update_dchubp_dpp(), the variable pipe_ctx is passed in, and plane_state is accessed again through pipe_ctx. Multiple if statements directly call attributes of plane_state, leading to potential null pointer dereference issues. This patch adds necessary null checks to ensure stability. Fixes: 8e4ed3cf1642 ("drm/amd/display: Add null check for pipe_ctx->plane_state in dcn20_program_pipe") Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Zicheng Qu <quzicheng@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amdkfd: Fix wrong usage of INIT_WORK()Yuan Can
[ Upstream commit 21cae8debc6a1d243f64fa82cd1b41cb612b5c61 ] In kfd_procfs_show(), the sdma_activity_work_handler is a local variable and the sdma_activity_work_handler.sdma_activity_work should initialize with INIT_WORK_ONSTACK() instead of INIT_WORK(). Fixes: 32cb59f31362 ("drm/amdkfd: Track SDMA utilization per process") Signed-off-by: Yuan Can <yuancan@huawei.com> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amdgpu: Fix map/unmap queue logicLijo Lazar
[ Upstream commit fa31798582882740f2b13d19e1bd43b4ef918e2f ] In current logic, it calls ring_alloc followed by a ring_test. ring_test in turn will call another ring_alloc. This is illegal usage as a ring_alloc is expected to be closed properly with a ring_commit. Change to commit the map/unmap queue packet first followed by a ring_test. Add a comment about the usage of ring_test. Also, reorder the current pre-condition checks of job hang or kiq ring scheduler not ready. Without them being met, it is not useful to attempt ring or memory allocations. Fixes tag refers to the original patch which introduced this issue which then got carried over into newer code. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Fixes: 6c10b5cc4eaa ("drm/amdgpu: Remove duplicate code in gfx_v8_0.c") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amdgpu: fix ACA bank count boundary check errorYang Wang
[ Upstream commit 2bb7dced1c2f8c0e705cc74840f776406db492c3 ] fix ACA bank count boundary check error. Fixes: f5e4cc8461c4 ("drm/amdgpu: implement RAS ACA driver framework") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amdkfd: Use dynamic allocation for CU occupancy array in ↵Srinivasan Shanmugam
'kfd_get_cu_occupancy()' [ Upstream commit 922f0e00017b09d9d47e3efac008c8b20ed546a0 ] The `kfd_get_cu_occupancy` function previously declared a large `cu_occupancy` array as a local variable, which could lead to stack overflows due to excessive stack usage. This commit replaces the static array allocation with dynamic memory allocation using `kcalloc`, thereby reducing the stack size. This change avoids the risk of stack overflows in kernel space, in scenarios where `AMDGPU_MAX_QUEUES` is large. The allocated memory is freed using `kfree` before the function returns to prevent memory leaks. Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c: In function ‘kfd_get_cu_occupancy’: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_process.c:322:1: warning: the frame size of 1056 bytes is larger than 1024 bytes [-Wframe-larger-than=] 322 | } | ^ Fixes: 6ae9e1aba97e ("drm/amdkfd: Update logic for CU occupancy calculations") Cc: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Suggested-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amdgpu: Fix the memory allocation issue in amdgpu_discovery_get_nps_info()Li Huafei
[ Upstream commit a1144da794adedb9447437c57d69add56494309d ] Fix two issues with memory allocation in amdgpu_discovery_get_nps_info() for mem_ranges: - Add a check for allocation failure to avoid dereferencing a null pointer. - As suggested by Christophe, use kvcalloc() for memory allocation, which checks for multiplication overflow. Additionally, assign the output parameters nps_type and range_cnt after the kvcalloc() call to prevent modifying the output parameters in case of an error return. Fixes: b194d21b9bcc ("drm/amdgpu: Use NPS ranges from discovery table") Suggested-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Li Huafei <lihuafei1@huawei.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amd/display: Reduce HPD Detection Interval for IPSFangzhi Zuo
[ Upstream commit a88b19b13fb41a3fa03ec67b5f57cc267fbfb160 ] Fix DP Compliance test 4.2.1.3, 4.2.2.8, 4.3.1.12, 4.3.1.13 when IPS enabled. Original HPD detection interval is set to 5s which violates DP compliance. Reduce the interval parameter, such that link training can be finished within 5 seconds. Fixes: afca033f10d3 ("drm/amd/display: Add periodic detection for IPS") Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amd/display: Increase idle worker HPD detection timeRoman Li
[ Upstream commit 60612f75992d96955fb7154468c58d5d168cf1ab ] [Why] Idle worker thread waits HPD_DETECTION_TIME for HPD processing complete. Some displays require longer time for that. [How] Increase HPD_DETECTION_TIME to 100ms. Reviewed-by: Sun peng Li <sunpeng.li@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Stable-dep-of: a88b19b13fb4 ("drm/amd/display: Reduce HPD Detection Interval for IPS") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amd/display: fix a memleak issue when driver is removedAurabindo Pillai
[ Upstream commit d4f36e5fd800de7db74c1c4e62baf24a091a5ff6 ] Running "modprobe amdgpu" the second time (followed by a modprobe -r amdgpu) causes a call trace like: [ 845.212163] Memory manager not clean during takedown. [ 845.212170] WARNING: CPU: 4 PID: 2481 at drivers/gpu/drm/drm_mm.c:999 drm_mm_takedown+0x2b/0x40 [ 845.212177] Modules linked in: amdgpu(OE-) amddrm_ttm_helper(OE) amddrm_buddy(OE) amdxcp(OE) amd_sched(OE) drm_exec drm_suballoc_helper drm_display_helper i2c_algo_bit amdttm(OE) amdkcl(OE) cec rc_core sunrpc qrtr intel_rapl_msr intel_rapl_common snd_hda_codec_hdmi edac_mce_amd snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_usb_audio snd_hda_codec snd_usbmidi_lib kvm_amd snd_hda_core snd_ump mc snd_hwdep kvm snd_pcm snd_seq_midi snd_seq_midi_event irqbypass crct10dif_pclmul snd_rawmidi polyval_clmulni polyval_generic ghash_clmulni_intel sha256_ssse3 sha1_ssse3 snd_seq aesni_intel crypto_simd snd_seq_device cryptd snd_timer mfd_aaeon asus_nb_wmi eeepc_wmi joydev asus_wmi snd ledtrig_audio sparse_keymap ccp wmi_bmof input_leds k10temp i2c_piix4 platform_profile rapl soundcore gpio_amdpt mac_hid binfmt_misc msr parport_pc ppdev lp parport efi_pstore nfnetlink dmi_sysfs ip_tables x_tables autofs4 hid_logitech_hidpp hid_logitech_dj hid_generic usbhid hid ahci xhci_pci igc crc32_pclmul libahci xhci_pci_renesas video [ 845.212284] wmi [last unloaded: amddrm_ttm_helper(OE)] [ 845.212290] CPU: 4 PID: 2481 Comm: modprobe Tainted: G W OE 6.8.0-31-generic #31-Ubuntu [ 845.212296] RIP: 0010:drm_mm_takedown+0x2b/0x40 [ 845.212300] Code: 1f 44 00 00 48 8b 47 38 48 83 c7 38 48 39 f8 75 09 31 c0 31 ff e9 90 2e 86 00 55 48 c7 c7 d0 f6 8e 8a 48 89 e5 e8 f5 db 45 ff <0f> 0b 5d 31 c0 31 ff e9 74 2e 86 00 66 0f 1f 84 00 00 00 00 00 90 [ 845.212302] RSP: 0018:ffffb11302127ae0 EFLAGS: 00010246 [ 845.212305] RAX: 0000000000000000 RBX: ffff92aa5020fc08 RCX: 0000000000000000 [ 845.212307] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 845.212309] RBP: ffffb11302127ae0 R08: 0000000000000000 R09: 0000000000000000 [ 845.212310] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000004 [ 845.212312] R13: ffff92aa50200000 R14: ffff92aa5020fb10 R15: ffff92aa5020faa0 [ 845.212313] FS: 0000707dd7c7c080(0000) GS:ffff92b93de00000(0000) knlGS:0000000000000000 [ 845.212316] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 845.212318] CR2: 00007d48b0aee200 CR3: 0000000115a58000 CR4: 0000000000f50ef0 [ 845.212320] PKRU: 55555554 [ 845.212321] Call Trace: [ 845.212323] <TASK> [ 845.212328] ? show_regs+0x6d/0x80 [ 845.212333] ? __warn+0x89/0x160 [ 845.212339] ? drm_mm_takedown+0x2b/0x40 [ 845.212344] ? report_bug+0x17e/0x1b0 [ 845.212350] ? handle_bug+0x51/0xa0 [ 845.212355] ? exc_invalid_op+0x18/0x80 [ 845.212359] ? asm_exc_invalid_op+0x1b/0x20 [ 845.212366] ? drm_mm_takedown+0x2b/0x40 [ 845.212371] amdgpu_gtt_mgr_fini+0xa9/0x130 [amdgpu] [ 845.212645] amdgpu_ttm_fini+0x264/0x340 [amdgpu] [ 845.212770] amdgpu_bo_fini+0x2e/0xc0 [amdgpu] [ 845.212894] gmc_v12_0_sw_fini+0x2a/0x40 [amdgpu] [ 845.213036] amdgpu_device_fini_sw+0x11a/0x590 [amdgpu] [ 845.213159] amdgpu_driver_release_kms+0x16/0x40 [amdgpu] [ 845.213302] devm_drm_dev_init_release+0x5e/0x90 [ 845.213305] devm_action_release+0x12/0x30 [ 845.213308] release_nodes+0x42/0xd0 [ 845.213311] devres_release_all+0x97/0xe0 [ 845.213314] device_unbind_cleanup+0x12/0x80 [ 845.213317] device_release_driver_internal+0x230/0x270 [ 845.213319] ? srso_alias_return_thunk+0x5/0xfbef5 This is caused by lost memory during early init phase. First time driver is removed, memory is freed but when second time the driver is inserted, VBIOS dmub is not active, since the PSP policy is to retain the driver loaded version on subsequent warm boots. Hence, communication with VBIOS DMUB fails. Fix this by aborting further communication with vbios dmub and release the memory immediately. Fixes: f59549c7e705 ("drm/amd/display: free bo used for dmub bounding box") Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amdgpu/gfx9: Add Cleaner Shader Deinitialization in gfx_v9_0 ModuleSrinivasan Shanmugam
[ Upstream commit e47cb9d2533200d49dd5364d4a148119492f8a3d ] This commit addresses an omission in the previous patch related to the cleaner shader support for GFX9 hardware. Specifically, it adds the necessary deinitialization code for the cleaner shader in the gfx_v9_0_sw_fini function. The added line amdgpu_gfx_cleaner_shader_sw_fini(adev); ensures that any allocated resources for the cleaner shader are freed correctly, avoiding potential memory leaks and ensuring that the GPU state is clean for the next initialization sequence. Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Fixes: c2e70d307f44 ("drm/amdgpu/gfx9: Implement cleaner shader support for GFX9 hardware") Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amdgpu: Fix JPEG v4.0.3 register writeLijo Lazar
[ Upstream commit 8c50bf9beb889fd2bdcbf95b27a5d101eede51fc ] EXTERNAL_REG_INTERNAL_OFFSET/EXTERNAL_REG_WRITE_ADDR should be used in pairs. If an external register shouldn't be written, both packets shouldn't be sent. Fixes: a78b48146972 ("drm/amdgpu: Skip PCTL0_MMHUB_DEEPSLEEP_IB write in jpegv4.0.3 under SRIOV") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amd/display: Fix incorrect DSC recompute triggerFangzhi Zuo
[ Upstream commit 4641169a8c95d9efc35d2d3c55c3948f3b375ff9 ] A stream without dsc_aux should not be eliminated from the dsc determination. Whether it needs a dsc recompute depends on whether its mode has changed or not. Eliminating such a no-dsc stream from the dsc determination policy will end up with inconsistencies in the new dc_state when compared to the current dc_state, triggering a dsc recompute that should not have happened. Reviewed-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-12-05drm/amd/display: Skip Invalid Streams from DSC PolicyFangzhi Zuo
[ Upstream commit 9afeda04964281e9f708b92c2a9c4f8a1387b46e ] Streams with invalid new connector state should be elimiated from dsc policy. Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-11-16Revert "drm/amd/pm: correct the workload setting"Alex Deucher
This reverts commit 74e1006430a5377228e49310f6d915628609929e. This causes a regression in the workload selection. A more extensive fix is being worked on. For now, revert. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Fixes: 74e1006430a5 ("drm/amd/pm: correct the workload setting") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-11-12drm/amd: Fix initialization mistake for NBIO 7.7.0Vijendar Mukunda
There is a strapping issue on NBIO 7.7.0 that can lead to spurious PME events while in the D0 state. Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20241112161142.28974-1-mario.limonciello@amd.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 447a54a0f79c9a409ceaa17804bdd2e0206397b9) Cc: stable@vger.kernel.org
2024-11-12Revert "drm/amd/display: parse umc_info or vram_info based on ASIC"Alex Deucher
This reverts commit 694c79769cb384bca8b1ec1d1e84156e726bd106. This was not the root cause. Revert. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: aurabindo.pillai@amd.com Cc: hamishclaxton@gmail.com (cherry picked from commit 3c2296b1eec55b50c64509ba15406142d4a958dc) Cc: stable@vger.kernel.org # 6.11.x
2024-11-12drm/amd/display: Fix failure to read vram info due to static BP_RESULTHamish Claxton
The static declaration causes the check to fail. Remove it. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Fixes: 00c391102abc ("drm/amd/display: Add misc DC changes for DCN401") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Hamish Claxton <hamishclaxton@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: aurabindo.pillai@amd.com Cc: hamishclaxton@gmail.com (cherry picked from commit 91314e7dfd83345b8b820b782b2511c9c32866cd) Cc: stable@vger.kernel.org # 6.11.x
2024-11-12drm/amdgpu: enable GTT fallback handling for dGPUs onlyChristian König
That is just a waste of time on APUs. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3704 Fixes: 216c1282dde3 ("drm/amdgpu: use GTT only as fallback for VRAM|GTT") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e8fc090d322346e5ce4c4cfe03a8100e31f61c3c) Cc: stable@vger.kernel.org
2024-11-11drm/amdgpu/mes12: correct kiq unmap latencyJack Xiao
Correct kiq unmap queue timeout value. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit cfe98204a06329b6b7fce1b828b7d620473181ff) Cc: stable@vger.kernel.org # 6.11.x
2024-11-11drm/amdgpu: fix check in gmc_v9_0_get_vm_pte()Christian König
The coherency flags can only be determined when the BO is locked and that in turn is only guaranteed when the mapping is validated. Fix the check, move the resource check into the function and add an assert that the BO is locked. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: d1a372af1c3d ("drm/amdgpu: Set MTYPE in PTE based on BO flags") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1b4ca8546f5b5c482717bedb8e031227b1541539) Cc: stable@vger.kernel.org