summaryrefslogtreecommitdiff
AgeCommit message (Collapse)Author
2024-07-24drm/amdgpu: Fix eeprom max record countStanley.Yang
The eeprom table is empty before initializing, set eeprom table version first before initializing. Changed from V1: Reuse amdgpu_ras_set_eeprom_table_version function Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 015b8a2fdf39a4c288ff24e7b715b8d9198e56dc)
2024-07-24drm/amdgpu: fix ras UE error injection failure issueYiPeng Chai
The ras command shared memory is allocated from VRAM and the response status of the command buffer will not be zero due to gpu being in fatal error state after ras UE error injection. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8284951a6e79c6806c675e5f68a4cd425dd56bc4)
2024-07-24drm/amd/display: Remove ASSERT if significance is zero in math_ceil2Rodrigo Siqueira
In the DML math_ceil2 function, there is one ASSERT if the significance is equal to zero. However, significance might be equal to zero sometimes, and this is not an issue for a ceil function, but the current ASSERT will trigger warnings in those cases. This commit removes the ASSERT if the significance is equal to zero to avoid unnecessary noise. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com> Signed-off-by: Rodrigo Siqueira <rodrigo.siqueira@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 332315885d3ccc6d8fe99700f3c2e4c24aa65ab7)
2024-07-24drm/amd/display: Check for NULL pointerSung Joon Kim
[why & how] Need to make sure plane_state is initialized before accessing its members. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Reviewed-by: Xi (Alex) Liu <xi.liu@amd.com> Signed-off-by: Sung Joon Kim <sungjoon.kim@amd.com> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 295d91cbc700651782a60572f83c24861607b648)
2024-07-24drm/amdgpu/vcn: Use offsets local to VCN/JPEG in VFJane Jian
For VCN/JPEG 4.0.3, use only the local addressing scheme. - Mask bit higher than AID0 range v2 remain the case for mmhub use master XCC Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit caaf576292f8ccef5cdc0ac16e77b87dbf6e17ab)
2024-07-24drm/amdgpu: Add empty HDP flush function to VCN v4.0.3Lijo Lazar
VCN 4.0.3 does not HDP flush with RRMT enabled. Instead, mmsch will do the HDP flush. This change is necessary for VCN v4.0.3, no need for backward compatibility Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 49cfaebe48e97500a68d5322a8194736b0a2c3cf)
2024-07-24drm/amdgpu: Add empty HDP flush function to JPEG v4.0.3Lijo Lazar
JPEG v4.0.3 doesn't support HDP flush when RRMT is enabled. Instead, mmsch fw will do the flush. This change is necessary for JPEG v4.0.3, no need for backward compatibility Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Jane Jian <Jane.Jian@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 585e3fdb36f59c5cfed0ae06c852dc1df22b1d60)
2024-07-24drm/amd/amdgpu: Fix uninitialized variable warningsMa Ke
Return 0 to avoid returning an uninitialized variable r. Cc: stable@vger.kernel.org Fixes: 230dd6bb6117 ("drm/amd/amdgpu: implement mode2 reset on smu_v13_0_10") Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 6472de66c0aa18d50a4b5ca85f8272e88a737676)
2024-07-24drm/amdgpu: Fix atomics on GFX12David Belanger
If PCIe supports atomics, configure register to prevent DF from breaking atomics in separate load/store operations. Signed-off-by: David Belanger <david.belanger@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 666f14cab21b17ccc1bdfe1e82458aa429b3b7e0)
2024-07-24drm/amdgpu/sdma5.2: Update wptr registers as well as doorbellAlex Deucher
We seem to have a case where SDMA will sometimes miss a doorbell if GFX is entering the powergating state when the doorbell comes in. To workaround this, we can update the wptr via MMIO, however, this is only safe because we disallow gfxoff in begin_ring() for SDMA 5.2 and then allow it again in end_ring(). Enable this workaround while we are root causing the issue with the HW team. Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/3440 Tested-by: Friedrich Vock <friedrich.vock@gmx.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org (cherry picked from commit f2ac52634963fc38e4935e11077b6f7854e5d700)
2024-07-22Merge tag 'amd-drm-fixes-6.11-2024-07-18' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-fixes-6.11-2024-07-18: amdgpu: - Bump driver version for GFX12 DCC - DC documention warning fixes - VCN unified queue power fix - SMU fix - RAS fix - Display corruption fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240718215258.79356-1-alexander.deucher@amd.com
2024-07-22Merge tag 'drm-misc-next-fixes-2024-07-19' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next Two fixes for v3d to fix an array indexing on newer V3D revisions. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240719-emerald-newt-of-skill-89b54a@houat
2024-07-22Merge tag 'drm-xe-next-fixes-2024-07-18' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next - Xe_exec ioctl minor fix on sync entry cleanup upon error (Ashutosh) - SRIOV: limit VF LMEM provisioning (Michal) - Wedge mode fixes (Brost) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Zpk6CI0FDoTJwkSb@intel.com
2024-07-22Merge tag 'drm-intel-next-fixes-2024-07-18' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/i915/kernel into drm-next - Reset intel_dp->link_trained before retraining the link [dp] (Imre Deak) - Don't switch the LTTPR mode on an active link [dp] (Imre Deak) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/ZpjgtowjpUZoHvrl@linux
2024-07-18drm/xe: Don't suspend device upon wedgeMatthew Brost
When wedging a device we shouldn't be suspending device as state for debug will be lost. Also this appears to not work as the below stack trace pops upon trying to resume a wedged device: [ 304.245044] INFO: task cat:12115 blocked for more than 151 seconds. [ 304.251333] Tainted: G W 6.10.0-rc7-xe+ #3518 [ 304.257617] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 304.265459] task:cat state:D stack:13384 pid:12115 tgid:12115 ppid:3986 flags:0x00000006 [ 304.265465] Call Trace: [ 304.265467] <TASK> [ 304.265469] __schedule+0x3c4/0xdf0 [ 304.265478] schedule+0x3c/0x140 [ 304.265481] rpm_resume+0x1cc/0x740 [ 304.265484] ? __pfx_autoremove_wake_function+0x10/0x10 [ 304.265489] __pm_runtime_resume+0x49/0x80 [ 304.265494] guc_info+0x6b/0xb0 [xe] [ 304.265538] ? __pfx___drm_printfn_seq_file+0x10/0x10 [ 304.265541] ? __pfx___drm_puts_seq_file+0x10/0x10 [ 304.265545] seq_read_iter+0x111/0x4c0 [ 304.265551] seq_read+0xfc/0x140 [ 304.265556] full_proxy_read+0x58/0x80 [ 304.265560] vfs_read+0xa7/0x360 [ 304.265563] ? find_held_lock+0x2b/0x80 [ 304.265568] ksys_read+0x64/0xe0 [ 304.265571] do_syscall_64+0x68/0x140 [ 304.265575] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 304.265578] RIP: 0033:0x7f4254d14992 [ 304.265580] RSP: 002b:00007ffc558666f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [ 304.265583] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f4254d14992 [ 304.265584] RDX: 0000000000020000 RSI: 00007f4254ebb000 RDI: 0000000000000003 [ 304.265586] RBP: 00007f4254ebb000 R08: 00007f4254eba010 R09: 00007f4254eba010 [ 304.265587] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000 [ 304.265588] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000 [ 304.265593] </TASK> [ 304.265594] Showing all locks held in the system: [ 304.265598] 1 lock held by khungtaskd/57: [ 304.265599] #0: ffffffff8273b860 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x36/0x1c0 [ 304.265607] 3 locks held by kworker/6:1/90: [ 304.265610] 1 lock held by in:imklog/547: [ 304.265611] #0: ffff88810498cd88 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x76/0xc0 [ 304.265620] 1 lock held by dmesg/1310: v2: Drop local 'err' variable (Jonathan) Fixes: 8ed9aaae39f3 ("drm/xe: Force wedged state and block GT reset upon any GPU hang") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-2-matthew.brost@intel.com (cherry picked from commit 452bca0edbd0764ca0284239d5438b3edd305ab3) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-18drm/xe: Wedge the entire deviceMatthew Brost
Wedge the entire device, not just GT which may have triggered the wedge. To implement this, cleanup the layering so xe_device_declare_wedged() calls into the lower layers (GT) to ensure entire device is wedged. While we are here, also signal any pending GT TLB invalidations upon wedging device. Lastly, short circuit reset wait if device is wedged. v2: - Short circuit reset wait if device is wedged (Local testing) Fixes: 8ed9aaae39f3 ("drm/xe: Force wedged state and block GT reset upon any GPU hang") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240716063902.1390130-1-matthew.brost@intel.com (cherry picked from commit 7dbe8af13c189f5937e87e9fb924d5bbc49e6f71) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-18drm/xe/pf: Limit fair VF LMEM provisioningMichal Wajdeczko
Due to the current design of the BO and VRAM manager, any object with XE_BO_FLAG_PINNED flag, which the PF driver uses during VF LMEM provisionining, is created with the TTM_PL_FLAG_CONTIGUOUS flag, which may cause VRAM fragmentation that prevents subsequent allocations of larger objects, like fair VF LMEM provisioning. To avoid such failures, round down fair VF LMEM provisioning size to next power of two size, to compensate what xe_ttm_vram_mgr is doing to achieve contiguous allocations. Fixes: ac6598aed1b3 ("drm/xe/pf: Add support to configure SR-IOV VFs") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711192320.1198-2-michal.wajdeczko@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit 4c3fe5eae46b92e2fd961b19f7779608352e5368) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-18drm/xe/exec: Fix minor bug related to xe_sync_entry_cleanupAshutosh Dixit
Increment num_syncs after xe_sync_entry_parse() is successful to ensure the xe_sync_entry_cleanup() logic under "err_syncs" label works correctly. v2: Use the same pattern as that in xe_vm.c (Matt Brost) Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711211203.3728180-1-ashutosh.dixit@intel.com (cherry picked from commit 43a6faa6d9b5e9139758200a79fe9c8f4aaa0c8d) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-18Merge tag 'amd-drm-next-6.11-2024-07-12' of ↵Dave Airlie
https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-6.11-2024-07-12: amdgpu: - RAS fixes - SMU fixes - GC 12 updates - SR-IOV fixes - IH 7 updates - DCC fixes - GC 11.5 fixes - DP MST fixes - GFX 9.4.4 fixes - SMU 14 updates - Documentation updates - MAINTAINERS updates - PSR SU fix - Misc small fixes Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240712171637.2581787-1-alexander.deucher@amd.com
2024-07-17drm/amd/display: fix corruption with high refresh rates on DCN 3.0Alex Deucher
This reverts commit bc87d666c05a13e6d4ae1ddce41fc43d2567b9a2 and the register changes from commit 6d4279cb99ac4f51d10409501d29969f687ac8dc. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3412 Cc: mikhail.v.gavrilov@gmail.com Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com> Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org # 6.10.x
2024-07-16Documentation/amdgpu: Fix duplicate declarationRodrigo Siqueira
Address the below kernel doc warning: Documentation/gpu/amdgpu/display/display-manager:134: drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h:3: WARNING: Duplicate C declaration, also defined at gpu/amdgpu/display/dcn-blocks:101. Declaration is '.. c:struct:: mpcc_blnd_cfg'. Documentation/gpu/amdgpu/display/display-manager:146: drivers/gpu/drm/amd/display/dc/inc/hw/mpc.h:3: WARNING: Duplicate C declaration, also defined at gpu/amdgpu/display/dcn-blocks:3. Declaration is '.. c:enum:: mpcc_alpha_blend_mode'. To address the above warnings, this commit uses the 'no-identifiers' option in the dcn-blocks to avoid duplication with the previous use of this function doc in the display-manager file. Finally, replaces the deprecated ':function:' in favor of ':identifiers:'. Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16Documentation/gpu: Remove undocumented files from dcn-blockshubbub.hRodrigo Siqueira
The dchubbub.h and hubp.h do not have any meaningful documentation; for this reason, this commit removes those files from the dcn-blocks documentation. Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16drm/amd/display: Add simple struct doc to remove doc build warningRodrigo Siqueira
This commit is a part of a series that addresses the following build warning for opp: ./drivers/gpu/drm/amd/display/dc/inc/hw/opp.h:1: warning: no structured comments found ./drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h:1: warning: no structured comments found This commit fixes this issue by adding a simple kernel-doc to a struct in the opp.h and the dpp.h files. Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16Documentation/gpu: Adjust DCN documentation pathsRodrigo Siqueira
When building the kernel-doc, it has the following complaints: Documentation/gpu/amdgpu/display/dcn-blocks:23: drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h:3: WARNING: Duplicate C declaration, also defined at gpu/amdgpu/display/dcn-blocks:3. Declaration is '.. c:struct:: surface_flip_registers'. Documentation/gpu/amdgpu/display/dcn-blocks:35: drivers/gpu/drm/amd/display/dc/inc/hw/hubp.h:3: WARNING: Duplicate C declaration, also defined at gpu/amdgpu/display/dcn-blocks:3. Declaration is '.. c:struct:: surface_flip_registers'. This error happened due to a copy-and-paste where the same file path was duplicated multiple times to a different set of blocks. This commit addresses this issue by using the correct file path. Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16Documentation/gpu: Remove ':export:' option from DCN documentationRodrigo Siqueira
This commit reduces, but does not fix, all the occurrences and some of the documentation warnings related to the 'no structured comments.' This was caused by the wrong use of the ':export:' option in the DCN kernel-doc, so this commit drops the usage of those options. Acked-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16drm/amd/display: Move DIO documentation to the right placeRodrigo Siqueira
When building the kernel-doc, it complains with the below warning: ./drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio.h:1: warning: no structured comments found ./drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio.h:1: warning: no structured comments found This warning was caused by the wrong use of the ':export:' and the lack of function documentation in the file pointed under the ':internal:'. This commit addresses those issues by relocating the overview documentation to the correct C file, removing the ':export:' options, and adding two simple kernel-doc to ensure that ':internal:' does not have any warning. Cc: Alex Deucher <alexander.deucher@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/dri-devel/20240715085918.68f5ecc9@canb.auug.org.au/ Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16drm/amd/swsmu: enable Pstates profile levels for SMU v14.0.4Li Ma
Enables following UMD stable Pstates profile levels of power_dpm_force_performance_level for SMU v14.0.4. - profile_peak - profile_min_mclk - profile_min_sclk - profile_standard Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Tim Huang <tim.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16drm/amd/pm: early return if disabling DPMS for GFX IP v11.5.2Tim Huang
This was intended to add support for GFX IP v11.5.2, but it needs to be applied to all GFX11 and subsequent APUs. Therefore the code should be revised to accommodate this. Signed-off-by: Tim Huang <tim.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16drm/amdgpu: add mutex to protect ras shared memoryYiPeng Chai
Add mutex to protect ras shared memory. v2: Add TA_RAS_COMMAND__TRIGGER_ERROR command call status check. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16drm/amd/display: Add function banner for idle_workqueueRoman Li
[Why] htmldocs warning: drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h: warning: Function parameter or struct member 'idle_workqueue' not described in 'amdgpu_display_manager'. [How] Add comment section for idle_workqueue with param description. Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Link: https://lore.kernel.org/dri-devel/20240715090211.736a9b4d@canb.auug.org.au/ Signed-off-by: Roman Li <Roman.Li@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16drm/amd/display: Add doc entry for program_3dlut_sizeAlex Hung
Fixes the warning: Function parameter or struct member 'program_3dlut_size' not described in 'mpc_funcs' Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/dri-devel/20240715090445.7e9387ec@canb.auug.org.au/ Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16drm/amdgpu/vcn: not pause dpg for unified queueBoyuan Zhang
For unified queue, DPG pause for encoding is done inside VCN firmware, so there is no need to pause dpg based on ring type in kernel. For VCN3 and below, pausing DPG for encoding in kernel is still needed. v2: add more comments v3: update commit message Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16drm/amdgpu/vcn: identify unified queue in sw initBoyuan Zhang
Determine whether VCN using unified queue in sw_init, instead of calling functions later on. v2: fix coding style Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16drm/i915/dp: Don't switch the LTTPR mode on an active linkImre Deak
Switching to transparent mode leads to a loss of link synchronization, so prevent doing this on an active link. This happened at least on an Intel N100 system / DELL UD22 dock, the LTTPR residing either on the host or the dock. To fix the issue, keep the current mode on an active link, adjusting the LTTPR count accordingly (resetting it to 0 in transparent mode). v2: Adjust code comment during link training about reiniting the LTTPRs. (Ville) Fixes: 7b2a4ab8b0ef ("drm/i915: Switch to LTTPR transparent mode link training") Reported-and-tested-by: Gareth Yu <gareth.yu@intel.com> Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10902 Cc: <stable@vger.kernel.org> # v5.15+ Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708190029.271247-3-imre.deak@intel.com (cherry picked from commit 211ad49cf8ccfdc798a719b4d1e000d0a8a9e588) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2024-07-16drm/i915/dp: Reset intel_dp->link_trained before retraining the linkImre Deak
Regularly retraining a link during an atomic commit happens with the given pipe/link already disabled and hence intel_dp->link_trained being false. Ensure this also for retraining a DP SST link via direct calls to the link training functions (vs. an actual commit as for DP MST). So far nothing depended on this, however the next patch will depend on link_trained==false for changing the LTTPR mode to non-transparent. Cc: <stable@vger.kernel.org> # v5.15+ Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708190029.271247-2-imre.deak@intel.com (cherry picked from commit a4d5ce61765c08ab364aa4b327f6739b646e6cfa) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2024-07-15drm/amd/display: fix doc entry for bb_from_dmubAurabindo Pillai
Fixes the warning: Function parameter or struct member 'bb_from_dmub' not described in 'amdgpu_display_manager' Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-15drm/amd: Bump KMS_DRIVER_MINOR versionAurabindo Pillai
Increase the KMS minor version to indicate GFX12 DCC support since this contains a major change in how DCC is managed across IPs like GFX, DCN etc. This will be used mainly by userspace like Mesa to figure out DCC support on GFX12 hardware. v2: fix version number (Alex) Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-15drm/v3d: Fix Indirect Dispatch configuration for V3D 7.1.6 and laterMaíra Canal
`args->cfg[4]` is configured in Indirect Dispatch using the number of batches. Currently, for all V3D tech versions, `args->cfg[4]` equals the number of batches subtracted by 1. But, for V3D 7.1.6 and later, we must not subtract 1 from the number of batches. Implement the fix by checking the V3D tech version and revision. Fixes several `dEQP-VK.synchronization*` CTS tests related to Indirect Dispatch. Fixes: 18b8413b25b7 ("drm/v3d: Create a CPU job extension for a indirect CSD job") Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240714145243.1223131-2-mcanal@igalia.com
2024-07-15drm/v3d: Add V3D tech revision to the device informationMaíra Canal
The V3D tech revision can be a useful information when configuring jobs. Therefore, expose it in the `struct v3d_dev` with the V3D tech version. Signed-off-by: Maíra Canal <mcanal@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240714145243.1223131-1-mcanal@igalia.com
2024-07-12drm/amdgpu/mes12: add missing opcode stringAlex Deucher
Fixes the indexing of the string array. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-12drm/amdgpu/mes11: update opcode stringsAlex Deucher
Add new packet. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-12Revert "drm/amd/display: Reset freesync config before update new state"Leo Li
This change caused PSR SU panels to not read from their remote fb, preventing us from entering self-refresh. It is a regression. This reverts commit eb6dfbb7a9c67c7d9bcdb9f9b9131270e2144e3d. Signed-off-by: Leo Li <sunpeng.li@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit dc1000bf463d1d89f66d6b5369cf76603f32c4d3)
2024-07-12drm/omap: Restrict compile testing to PAGE_SIZE less than 64KBNathan Chancellor
Prior to commit dc6fcaaba5a5 ("drm/omap: Allow build with COMPILE_TEST=y"), it was only possible to build the omapdrm driver with a 4KB page size. After that change, when the PAGE_SIZE is 64KB or larger, clang points out that the driver has some assumptions around the page size implicitly by passing PAGE_SIZE to a parameter with a type of u16: drivers/gpu/drm/omapdrm/omap_gem.c:758:7: error: implicit conversion from 'unsigned long' to 'u16' (aka 'unsigned short') changes value from 65536 to 0 [-Werror,-Wconstant-conversion] 757 | block = tiler_reserve_2d(fmt, omap_obj->width, omap_obj->height, | ~~~~~~~~~~~~~~~~ 758 | PAGE_SIZE); | ^~~~~~~~~ arch/powerpc/include/asm/page.h:25:34: note: expanded from macro 'PAGE_SIZE' 25 | #define PAGE_SIZE (ASM_CONST(1) << PAGE_SHIFT) | ~~~~~~~~~~~~~^~~~~~~~~~~~~ drivers/gpu/drm/omapdrm/omap_gem.c:1504:44: error: implicit conversion from 'unsigned long' to 'u16' (aka 'unsigned short') changes value from 65536 to 0 [-Werror,-Wconstant-conversion] 1504 | block = tiler_reserve_2d(fmts[i], w, h, PAGE_SIZE); | ~~~~~~~~~~~~~~~~ ^~~~~~~~~ arch/powerpc/include/asm/page.h:25:34: note: expanded from macro 'PAGE_SIZE' 25 | #define PAGE_SIZE (ASM_CONST(1) << PAGE_SHIFT) | ~~~~~~~~~~~~~^~~~~~~~~~~~~ 2 errors generated. As there is a lot of use of a u16 type throughout this driver and it will only ever be run on hardware that has a 4KB page size, just restrict compile testing to when the page size is less than 64KB (as no other issues have been discussed and it keeps compile testing relatively more available). Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240620-omapdrm-restrict-compile-test-to-sub-64kb-page-size-v1-1-5e56de71ffca@kernel.org
2024-07-12Merge tag 'drm-xe-next-fixes-2024-07-11' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next UAPI Changes: - Rename xe perf layer as xe observation layer (Ashutosh) Driver Changes: - Drop trace_xe_hw_fence_free (Brost) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/Zo_3ustogPDVKZwu@intel.com
2024-07-12Merge tag 'drm-misc-next-fixes-2024-07-11' of ↵Dave Airlie
https://gitlab.freedesktop.org/drm/misc/kernel into drm-next A fix for fbdev on big endian systems, a condition fix for a sharp panel at removal, and a fix for qxl to prevent unpinned buffer access under certain conditions. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240711-benign-rich-mouflon-2eeafe@houat
2024-07-11drm/xe: Drop trace_xe_hw_fence_freeMatthew Brost
fence->ctx may be stale memory when trace_xe_hw_fence_free is called resuling UAF bug when deriving the device name. This tracepoint is not all that useful, so just drop it. Fixes: 501c4255c409 ("drm/xe/trace: Print device_id in xe_trace events") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240708211008.956384-1-matthew.brost@intel.com (cherry picked from commit caaf1f44a6a27bae33eee189842c4d8fc21c3b02) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-11drm/xe/uapi: Rename xe perf layer as xe observation layerAshutosh Dixit
In Xe, the perf layer allows capture of HW counter streams. These HW counters are generally performance related but don't have to be necessarily so. Also, the name "perf" is a carryover from i915 and is not preferred. Here we propose the name "observation" for this common layer which allows capture of different types of these counter streams. v2: Rename observability layer to observation layer (Lucas/Rodrigo) v3: Rename sysctl file to "observation_paranoid" (Jose) Fixes: 52c2e956dceb ("drm/xe/perf/uapi: "Perf" layer to support multiple perf counter stream types") Fixes: fe8929bdf835 ("drm/xe/perf/uapi: Add perf_stream_paranoid sysctl") Acked-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Acked-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240703164801.2561423-1-ashutosh.dixit@intel.com (cherry picked from commit 8169b2097d88d99d7e4a72e20e4b549efe9eb8d7) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-10drm/amdgpu: remove exp hw support check for gfx12Alex Deucher
Enable it by default. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completedYiPeng Chai
The problem case is as follows: 1. GPU A triggers a gpu ras reset, and GPU A drives GPU B to also perform a gpu ras reset. 2. After gpu B ras reset started, gpu B queried a DE data. Since the DE data was queried in the ras reset thread instead of the page retirement thread, bad page retirement work would not be triggered. Then even if all gpu resets are completed, the bad pages will be cached in RAM until GPU B's bad page retirement work is triggered again and then saved to eeprom. This patch can save the bad pages to eeprom in time after gpu ras reset is completed. v2: 1. Add the above description to code comments. 2. Reuse existing function. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10drm/amdgpu: flush all cached ras bad pages to eepromYiPeng Chai
Before uninstalling gpu driver, flush all cached ras bad pages to eeprom. v2: Put the same code into a function and reuse the function. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>