linux/linux-stable.git - Linux kernel stable tree

Age	Commit message (Collapse)	Author
2021-08-24	drm/amdgpu: switch from 'pci_' to 'dma_' API	Christophe JAILLET
	The wrappers in include/linux/pci-dma-compat.h should go away. The patch has been generated with the coccinelle script below. It has been compile tested. @@ @@ - PCI_DMA_BIDIRECTIONAL + DMA_BIDIRECTIONAL @@ @@ - PCI_DMA_TODEVICE + DMA_TO_DEVICE @@ @@ - PCI_DMA_FROMDEVICE + DMA_FROM_DEVICE @@ @@ - PCI_DMA_NONE + DMA_NONE @@ expression e1, e2, e3; @@ - pci_alloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3; @@ - pci_zalloc_consistent(e1, e2, e3) + dma_alloc_coherent(&e1->dev, e2, e3, GFP_) @@ expression e1, e2, e3, e4; @@ - pci_free_consistent(e1, e2, e3, e4) + dma_free_coherent(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_single(e1, e2, e3, e4) + dma_map_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_single(e1, e2, e3, e4) + dma_unmap_single(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4, e5; @@ - pci_map_page(e1, e2, e3, e4, e5) + dma_map_page(&e1->dev, e2, e3, e4, e5) @@ expression e1, e2, e3, e4; @@ - pci_unmap_page(e1, e2, e3, e4) + dma_unmap_page(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_map_sg(e1, e2, e3, e4) + dma_map_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_unmap_sg(e1, e2, e3, e4) + dma_unmap_sg(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_cpu(e1, e2, e3, e4) + dma_sync_single_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_single_for_device(e1, e2, e3, e4) + dma_sync_single_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_cpu(e1, e2, e3, e4) + dma_sync_sg_for_cpu(&e1->dev, e2, e3, e4) @@ expression e1, e2, e3, e4; @@ - pci_dma_sync_sg_for_device(e1, e2, e3, e4) + dma_sync_sg_for_device(&e1->dev, e2, e3, e4) @@ expression e1, e2; @@ - pci_dma_mapping_error(e1, e2) + dma_mapping_error(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_dma_mask(e1, e2) + dma_set_mask(&e1->dev, e2) @@ expression e1, e2; @@ - pci_set_consistent_dma_mask(e1, e2) + dma_set_coherent_mask(&e1->dev, e2) Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24	drm/amdkfd: CWSR with sw scheduler on Aldebaran and Arcturus	Mukul Joshi
	Program trap handler settings to enable CWSR with software scheduler on Aldebaran and Arcturus. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24	drm/amdgpu/OLAND: clip the ref divider max value	Shashank Sharma
	This patch limits the ref_div_max value to 100, during the calculation of PLL feedback reference divider. With current value (128), the produced fb_ref_div value generates unstable output at particular frequencies. Radeon driver limits this value at 100. On Oland, when we try to setup mode 2048x1280@60 (a bit weird, I know), it demands a clock of 221270 Khz. It's been observed that the PLL calculations using values 128 and 100 are vastly different, and look like this: +------------------------------------------+ \|Parameter \|AMDGPU \|Radeon \| \| \| \| \| +-------------+----------------------------+ \|Clock feedback \| \| \|divider max \| 128 \| 100 \| \|cap value \| \| \| \| \| \| \| \| \| \| \| +------------------------------------------+ \|ref_div_max \| \| \| \| \| 42 \| 20 \| \| \| \| \| \| \| \| \| +------------------------------------------+ \|ref_div \| 42 \| 20 \| \| \| \| \| +------------------------------------------+ \|fb_div \| 10326 \| 8195 \| +------------------------------------------+ \|fb_div \| 1024 \| 163 \| +------------------------------------------+ \|fb_dev_p \| 4 \| 9 \| \|frac fb_de^_p\| \| \| +----------------------------+-------------+ With ref_div_max value clipped at 100, AMDGPU driver can also drive videmode 2048x1280@60 (221Mhz) and produce proper output without any blanking and distortion on the screen. PS: This value was changed from 128 to 100 in Radeon driver also, here: https://github.com/freedesktop/drm-tip/commit/4b21ce1b4b5d262e7d4656b8ececc891fc3cb806 V1: Got acks from: Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com> V2: - Restricting the changes only for OLAND, just to avoid any regression for other cards. - Changed unsigned -> unsigned int to make checkpatch quiet. V3: Apply the change on SI family (not only oland) (Christian) Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Eddy Qin <Eddy.Qin@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24	drm/amd/display: refactor riommu invalidation wa	Eric Yang
	[Why] A cleaner solution, only done once on boot. [How] Remove previous workaround and configure an extra vmid one time on boot Reviewed-by: Kazlauskas Nicholas <Nicholas.Kazlauskas@amd.com> Acked-by: Solomon Chiu <solomon.chiu@amd.com> Signed-off-by: Eric Yang <Eric.Yang2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-24	drm/amdgpu: Fix build with missing pm_suspend_target_state module export	Borislav Petkov
	Building a randconfig here triggered: ERROR: modpost: "pm_suspend_target_state" [drivers/gpu/drm/amd/amdgpu/amdgpu.ko] undefined! because the module export of that symbol happens in kernel/power/suspend.c which is enabled with CONFIG_SUSPEND. The ifdef guards in amdgpu_acpi_is_s0ix_supported(), however, test for CONFIG_PM_SLEEP which is defined like this: config PM_SLEEP def_bool y depends on SUSPEND \|\| HIBERNATE_CALLBACKS and that randconfig has: # CONFIG_SUSPEND is not set CONFIG_HIBERNATE_CALLBACKS=y leading to the module export missing. Change the ifdeffery to depend directly on CONFIG_SUSPEND. Fixes: 5706cb3c910c ("drm/amdgpu: fix checking pmops when PM_SLEEP is not enabled") Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lkml.kernel.org/r/YSP6Lv53QV0cOAsd@zn.tnic Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-08-20	drm/amdgpu: Cancel delayed work when GFXOFF is disabled	Michel Dänzer
	schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when the disable count transitions from 0 to 1, and only schedule the delayed work on the reverse transition, not if the disable count was already 0. This makes sure the delayed work doesn't run at unexpected times, and allows it to be lock-free. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. v3: * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König) v4: * Fix race condition between amdgpu_gfx_off_ctrl incrementing adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off checking for it to be 0 (Evan Quan) Cc: stable@vger.kernel.org Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3 Acked-by: Christian König <christian.koenig@amd.com> # v3 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-20	drm/amdgpu: use the preferred pin domain after the check	Christian König
	For some reason we run into an use case where a BO is already pinned into GTT, but should be pinned into VRAM\|GTT again. Handle that case gracefully as well. Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-08-20	drm/amdgpu: Cancel delayed work when GFXOFF is disabled	Michel Dänzer
	schedule_delayed_work does not push back the work if it was already scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms after the first time GFXOFF was disabled and re-enabled, even if GFXOFF was disabled and re-enabled again during those 100 ms. This resulted in frame drops / stutter with the upcoming mutter 41 release on Navi 14, due to constantly enabling GFXOFF in the HW and disabling it again (for getting the GPU clock counter). To fix this, call cancel_delayed_work_sync when the disable count transitions from 0 to 1, and only schedule the delayed work on the reverse transition, not if the disable count was already 0. This makes sure the delayed work doesn't run at unexpected times, and allows it to be lock-free. v2: * Use cancel_delayed_work_sync & mutex_trylock instead of mod_delayed_work. v3: * Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König) v4: * Fix race condition between amdgpu_gfx_off_ctrl incrementing adev->gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off checking for it to be 0 (Evan Quan) Cc: stable@vger.kernel.org Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> # v3 Acked-by: Christian König <christian.koenig@amd.com> # v3 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-20	drm/amdgpu: use the preferred pin domain after the check	Christian König
	For some reason we run into an use case where a BO is already pinned into GTT, but should be pinned into VRAM\|GTT again. Handle that case gracefully as well. Reviewed-by: Shashank Sharma <Shashank.sharma@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2021-08-20	drm/amd/pm: a quick fix for "divided by zero" error	Evan Quan
	Considering Arcturus is a dedicated ASIC for computing, it will be more proper to drop the support for fan speed reading and setting. That's on the TODO list. Signed-off-by: Evan Quan <evan.quan@amd.com> Reported-by: Rui Teng <rui.teng@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-19	isystem: ship and use stdarg.h	Alexey Dobriyan
	Ship minimal stdarg.h (1 type, 4 macros) as <linux/stdarg.h>. stdarg.h is the only userspace header commonly used in the kernel. GPL 2 version of <stdarg.h> can be extracted from http://archive.debian.org/debian/pool/main/g/gcc-4.2/gcc-4.2_4.2.4.orig.tar.gz Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-08-19	isystem: trim/fixup stdarg.h and other headers	Alexey Dobriyan
	Delete/fixup few includes in anticipation of global -isystem compile option removal. Note: crypto/aegis128-neon-inner.c keeps <stddef.h> due to redefinition of uintptr_t error (one definition comes from <stddef.h>, another from <linux/types.h>). Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2021-08-18	drm/amd/display: Use DCN30 watermark calc for DCN301	Zhan Liu
	[why] dcn301_calculate_wm_and_dl() causes flickering when external monitor is connected. This issue has been fixed before by commit 0e4c0ae59d7e ("drm/amdgpu/display: drop dcn301_calculate_wm_and_dl for now"), however part of the fix was gone after commit 2cbcb78c9ee5 ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next"). [how] Use dcn30_calculate_wm_and_dlg() instead as in the original fix. Fixes: 2cbcb78c9ee5 ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next") Signed-off-by: Nikola Cornij <nikola.cornij@amd.com> Reviewed-by: Zhan Liu <zhan.liu@amd.com> Tested-by: Zhan Liu <zhan.liu@amd.com> Tested-by: Oliver Logush <oliver.logush@amd.com> Signed-off-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18	drm/amd/pm: Fix spelling mistake "firwmare" -> "firmware"	Colin Ian King
	There is a spelling mistake in a dev_err error message. Fix it. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18	drm/amd/amdgpu:flush ttm delayed work before cancel_sync	YuBiao Wang
	[Why] In some cases when we unload driver, warning call trace will show up in vram_mgr_fini which claims that LRU is not empty, caused by the ttm bo inside delay deleted queue. [How] We should flush delayed work to make sure the delay deleting is done. Signed-off-by: YuBiao Wang <YuBiao.Wang@amd.com> Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18	drm/amd: consolidate TA shared memory structures	Candice Li
	Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18	drm/amdgpu: increase max xgmi physical node for aldebaran	Hawking Zhang
	aldebaran supports up to 16 xgmi physical nodes. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18	drm/amdgpu: disable BACO support for 699F:C7 polaris12 SKU temporarily	Evan Quan
	We have a S3 issue on that SKU with BACO enabled. Will bring back this when that root caused. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18	drm/amd/display: Use DCN30 watermark calc for DCN301	Zhan Liu
	[why] dcn301_calculate_wm_and_dl() causes flickering when external monitor is connected. This issue has been fixed before by commit 0e4c0ae59d7e ("drm/amdgpu/display: drop dcn301_calculate_wm_and_dl for now"), however part of the fix was gone after commit 2cbcb78c9ee5 ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next"). [how] Use dcn30_calculate_wm_and_dlg() instead as in the original fix. Fixes: 2cbcb78c9ee5 ("Merge tag 'amd-drm-next-5.13-2021-03-23' of https://gitlab.freedesktop.org/agd5f/linux into drm-next") Signed-off-by: Nikola Cornij <nikola.cornij@amd.com> Reviewed-by: Zhan Liu <zhan.liu@amd.com> Tested-by: Zhan Liu <zhan.liu@amd.com> Tested-by: Oliver Logush <oliver.logush@amd.com> Signed-off-by: Zhan Liu <zhan.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18	drm/amdgpu: correct MMSCH 1.0 version	Zhigang Luo
	MMSCH 1.0 doesn't have major/minor version, only verison. Signed-off-by: Zhigang Luo <zhigang.luo@amd.com> Reviewed by Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-18	drm/amdgpu: get extended xgmi topology data	Jonathan Kim
	The TA has a limit to the amount of data that can be retrieved from GET_TOPOLOGY. For setups that exceed this limit, the xGMI topology needs to be re-initialized and data needs to be re-fetched from the extended link records by setting a flag in the shared command buffer. The number of hops and the number of links must be accumulated by the driver. Other data points are all fetched from the first request. Because the TA has already exceeded its link record limit, it cannot hold bidirectional information. Otherwise the driver would have to do more than two fetches so the driver has to reflect the topology information in the opposite direction. v2: squashed with internal reviewed fix Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Hawking Zhang <hawking.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/display: 3.2.149	Aric Cyr
	This version brings along following fixes: - Ensure DCN save init registers after VM setup - Fix multi-display support for idle opt workqueue - Use vblank control events for PSR enable/disable - Create default dc_sink when fail reading EDID under MST Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Aric Cyr <aric.cyr@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/display: [FW Promotion] Release 0.0.79	Anthony Koo
	Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Anthony Koo <Anthony.Koo@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/display: Guard vblank wq flush with DCN guards	Nicholas Kazlauskas
	[Why] Compilation of the workqueue fails if not building with the DCN config option set. [How] Guard calls to the flush with the DCN config option to fix the build. Reviewed-by: Roman Li <Roman.Li@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/display: Ensure DCN save after VM setup	Jake Wang
	[Why] DM initializes VM context after DMCUB initialization. This results in loss of DCN_VM_CONTEXT registers after z10. [How] Notify DMCUB when VM setup is complete, and have DMCUB save init registers. v2: squash in CONFIG_DRM_AMD_DC_DCN3_1 fix Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Jake Wang <haonan.wang2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/display: Ensure DCN save after VM setup	Jake Wang
	[Why] DM initializes VM context after DMCUB initialization. This results in loss of DCN_VM_CONTEXT registers after z10. [How] Notify DMCUB when VM setup is complete, and have DMCUB save init registers. v2: squash in CONFIG_DRM_AMD_DC_DCN3_1 fix Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Jake Wang <haonan.wang2@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amdkfd: fix random KFDSVMRangeTest.SetGetAttributesTest test failure	Yifan Zhang
	KFDSVMRangeTest.SetGetAttributesTest randomly fails in stress test. Note: Google Test filter = KFDSVMRangeTest.* [==========] Running 18 tests from 1 test case. [----------] Global test environment set-up. [----------] 18 tests from KFDSVMRangeTest [ RUN ] KFDSVMRangeTest.BasicSystemMemTest [ OK ] KFDSVMRangeTest.BasicSystemMemTest (30 ms) [ RUN ] KFDSVMRangeTest.SetGetAttributesTest [ ] Get default atrributes /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:154: Failure Value of: expectedDefaultResults[i] Actual: 4294967295 Expected: outputAttributes[i].value Which is: 0 /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:154: Failure Value of: expectedDefaultResults[i] Actual: 4294967295 Expected: outputAttributes[i].value Which is: 0 /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:152: Failure Value of: expectedDefaultResults[i] Actual: 4 Expected: outputAttributes[i].type Which is: 2 [ ] Setting/Getting atrributes [ FAILED ] the root cause is that svm work queue has not finished when svm_range_get_attr is called, thus some garbage svm interval tree data make svm_range_get_attr get wrong result. Flush work queue before iterate svm interval tree. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: change the workload type for some cards	Kenneth Feng
	change the workload type for some cards as it is needed. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	Revert "drm/amd/pm: fix workload mismatch on vega10"	Kenneth Feng
	This reverts commit 0979d43259e13846d86ba17e451e17fec185d240. Revert this because it does not apply to all the cards. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/display: Use vblank control events for PSR enable/disable	Nicholas Kazlauskas
	[Why] PSR can disable the HUBP along with the OTG when PSR is active. We'll hit a pageflip timeout when the OTG is disable because we're no longer updating the CRTC vblank counter and the pflip high IRQ will not fire on the flip. In order to flip the page flip timeout occur we should modify the enter/exit conditions to match DRM requirements. [How] Use our deferred handlers for DRM vblank control to notify DMCU(B) when it can enable or disable PSR based on whether vblank is disabled or enabled respectively. We'll need to pass along the stream with the notification now because we want to access the CRTC state while the CRTC is locked to get the stream state prior to the commit. Retain a reference to the stream so it remains safe to continue to access and release that reference once we're done with it. Enable/disable logic follows what we were previously doing in update_planes. The workqueue has to be flushed before programming streams or planes to ensure that we exit out of idle optimizations and PSR before these events occur if necessary. To keep the skip count logic the same to avoid FBCON PSR enablement requires copying the allow condition onto the DM IRQ parameters - a field that we can actually access from the worker. Reviewed-by: Roman Li <Roman.Li@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/display: Fix multi-display support for idle opt workqueue	Nicholas Kazlauskas
	[Why] The current implementation for idle optimization support only has a single work item that gets reshuffled into the system workqueue whenever we receive an enable or disable event. We can have mismatched events if the work hasn't been processed or if we're getting control events from multiple displays at once. This fixes this issue and also makes the implementation usable for PSR control - which will be addressed in another patch. [How] We need to be able to flush remaining work out on demand for driver stop and psr disable so create a driver specific workqueue instead of using the system one. The workqueue will be single threaded to guarantee the ordering of enable/disable events. Refactor the queue to allocate the control work and deallocate it after processing it. Pass the acrtc directly to make it easier to handle psr enable/disable in a later patch. Rename things to indicate that it's not just MALL specific. Reviewed-by: Roman Li <Roman.Li@amd.com> Acked-by: Wayne Lin <wayne.lin@amd.com> Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/display: Create dc_sink when EDID fail	Wayne Lin
	[Why] While reading remote EDID via Startech 1-to-4 hub, occasionally we won't get response in time and won't light up corresponding monitor. Ideally, we can still add generic modes for userspace to choose to try to light up the monitor and which is done in drm_helper_probe_single_connector_modes(). So the main problem here is that we fail .mode_valid since we don't create remote dc_sink for this case. [How] Also add default dc_sink if we can't get the EDID. Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Wayne Lin <Wayne.Lin@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: correct the address of Arcturus fan related registers	Evan Quan
	These registers have different address from other SMU V11 ASICs. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: drop unnecessary manual mode check	Evan Quan
	As the fan control was guarded under manual mode before fan speed RPM/PWM setting. Thus the extra check is totally redundant. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: drop the unnecessary intermediate percent-based transition	Evan Quan
	Currently, the readout of fan speed pwm is transited into percent-based and then pwm-based. However, the transition into percent-based is totally unnecessary and make the final output less accurate. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: correct the fan speed RPM retrieving	Evan Quan
	The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed PWM and RPM is not true for SMU11 ASICs. So, we need a new way to retrieving the fan speed RPM. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: correct the fan speed PWM retrieving	Evan Quan
	The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed PWM and RPM is not true for SMU11 ASICs. So, we need a new way to retrieving the fan speed PWM. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: record the RPM and PWM based fan speed settings	Evan Quan
	As the relationship "PWM = RPM / smu->fan_max_rpm" between fan speed PWM and RPM is not true for SMU11 ASICs. So, both the RPM and PWM settings need to be saved. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: correct the fan speed RPM setting	Evan Quan
	The relationship "PWM = RPM / smu->fan_max_rpm" between fan speed PWM and RPM is not true for SMU11 ASICs. So, we need a new way to perform the fan speed RPM setting. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/amdgpu: remove unnecessary RAS context field	Candice Li
	Delete ras_if->name in the RAS ctx structure and remove related lines. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amdkfd: fix random KFDSVMRangeTest.SetGetAttributesTest test failure	Yifan Zhang
	KFDSVMRangeTest.SetGetAttributesTest randomly fails in stress test. Note: Google Test filter = KFDSVMRangeTest.* [==========] Running 18 tests from 1 test case. [----------] Global test environment set-up. [----------] 18 tests from KFDSVMRangeTest [ RUN ] KFDSVMRangeTest.BasicSystemMemTest [ OK ] KFDSVMRangeTest.BasicSystemMemTest (30 ms) [ RUN ] KFDSVMRangeTest.SetGetAttributesTest [ ] Get default atrributes /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:154: Failure Value of: expectedDefaultResults[i] Actual: 4294967295 Expected: outputAttributes[i].value Which is: 0 /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:154: Failure Value of: expectedDefaultResults[i] Actual: 4294967295 Expected: outputAttributes[i].value Which is: 0 /home/yifan/brahma/libhsakmt/tests/kfdtest/src/KFDSVMRangeTest.cpp:152: Failure Value of: expectedDefaultResults[i] Actual: 4 Expected: outputAttributes[i].type Which is: 2 [ ] Setting/Getting atrributes [ FAILED ] the root cause is that svm work queue has not finished when svm_range_get_attr is called, thus some garbage svm interval tree data make svm_range_get_attr get wrong result. Flush work queue before iterate svm interval tree. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: change the workload type for some cards	Kenneth Feng
	change the workload type for some cards as it is needed. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	Revert "drm/amd/pm: fix workload mismatch on vega10"	Kenneth Feng
	This reverts commit 0979d43259e13846d86ba17e451e17fec185d240. Revert this because it does not apply to all the cards. Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/amdgpu: consolidate PSP TA context	Candice Li
	Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amdgpu: Add MB_REQ_MSG_READY_TO_RESET response when VF get FLR notification.	Jiange Zhao
	When guest received FLR notification from host, it would lock adapter into reset state. There will be no more job submission and hardware access after that. Then it should send a response to host that it has prepared for host reset. Signed-off-by: Jiange Zhao <Jiange.Zhao@amd.com> Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: change pp_dpm_sclk/mclk/fclk attribute is RO for aldebaran	Kevin Wang
	the following clock is only support voltage DPM, change attribute to RO: 1. pp_dpm_sclk 2. pp_dpm_mclk 3. pp_dpm_fclk Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: change smu msg's attribute to allow working under sriov	Kevin Wang
	the following message is allowed in sriov mode: 1. GetEnabledSmuFeaturesLow 2. GetEnabledSmuFeaturesHigh Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: change return value in aldebaran_get_power_limit()	Kevin Wang
	v1: 1. change return value to avoid smu driver probe fails when FEATURE_PPT is not enabled. 2. if FEATURE_PPT is not enabled, set power limit value to 0. v2: instead dev_err with dev_warn Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: skip to load smu microcode on sriov for aldebaran	Kevin Wang
	v1: 1. skip to load smu firmware in sriov mode for aldebaran chip 2. using vbios pptable if in sriov mode. v2: clean up smu driver code in sriov code path Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-08-16	drm/amd/pm: correct DPM_XGMI/VCN_DPM feature name	Kevin Wang
	the following feature is wrong, it will cause sysnode of pp_features show error: 1. DPM_XGMI 2. VCN_DPM Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>