Age | Commit message | Author |
|
Fix typos
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Found some typos while exploring radeon code.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Found some typos while exploring amdgpu code.
Signed-off-by: Alexandre Demers <alexandre.f.demers@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Update parameter description in the vcn_v5_0_0_is_idle function
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1231: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v5_0_0_is_idle'
drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1231: warning: Excess function parameter 'handle' description in 'vcn_v5_0_0_is_idle'
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The hang_hws debugfs interface is used by the GPU reset test with HWS. With
MES this crashes the kernel with a NULL pointer access because
dqm->packet_mgr is not set up for the MES path.
Skip GPUs with MES for now; the MES hang_hws debugfs interface will be
supported later.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If the GPU is in reset, destroy_queue returns -EIO. pqm_destroy_queue should
still delete the queue from process_queue_list and free the resource.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
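A minimal user-space sketch of the cleanup rule described above, not the driver code. The list type, `add_queue`, `hw_destroy_queue`, and the choice to pass -EIO back to the caller are all hypothetical stand-ins; only the idea (still unlink and free the queue when the destroy step reports -EIO) comes from the commit message.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical singly linked list standing in for process_queue_list. */
struct queue_node {
	int qid;
	struct queue_node *next;
};

/* Hypothetical helper: push a new queue entry onto the list. */
static int add_queue(struct queue_node **head, int qid)
{
	struct queue_node *node = malloc(sizeof(*node));

	if (!node)
		return -ENOMEM;
	node->qid = qid;
	node->next = *head;
	*head = node;
	return 0;
}

/* Hypothetical stand-in for the hardware destroy step: fails with -EIO
 * while the GPU is in reset. */
static int hw_destroy_queue(int qid, int gpu_in_reset)
{
	(void)qid;
	return gpu_in_reset ? -EIO : 0;
}

/* Unlink and free the entry on success and also on -EIO, so that a GPU
 * reset does not leave a stale entry behind. Other errors keep the entry.
 * Returning the -EIO to the caller is an assumption of this sketch. */
static int destroy_queue(struct queue_node **head, int qid, int gpu_in_reset)
{
	struct queue_node **link = head;
	struct queue_node *victim;
	int ret;

	while (*link && (*link)->qid != qid)
		link = &(*link)->next;
	if (!*link)
		return -EINVAL;

	ret = hw_destroy_queue(qid, gpu_in_reset);
	if (ret != 0 && ret != -EIO)
		return ret;

	victim = *link;
	*link = victim->next;
	free(victim);
	return ret;
}
```

Calling `destroy_queue(&head, qid, 1)` in this sketch reports -EIO but the entry is gone from the list afterwards.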
|
|
The previous references to a non-existent `adev` parameter have been
removed & corrected to reflect the use of the `vinst` pointer, which
points to the VCN instance structure, in the below files:
- vcn_v1_0.c
- vcn_v2_0.c
- vcn_v3_0.c
Fixes the below with gcc W=1:
drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c:624: warning: Function parameter or struct member 'vinst' not described in 'vcn_v1_0_enable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c:624: warning: Excess function parameter 'adev' description in 'vcn_v1_0_enable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:376: warning: Function parameter or struct member 'vinst' not described in 'vcn_v2_0_mc_resume'
drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:376: warning: Excess function parameter 'adev' description in 'vcn_v2_0_mc_resume'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:776: warning: Function parameter or struct member 'vinst' not described in 'vcn_v3_0_disable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:776: warning: Excess function parameter 'adev' description in 'vcn_v3_0_disable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:776: warning: Excess function parameter 'inst' description in 'vcn_v3_0_disable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:965: warning: Function parameter or struct member 'vinst' not described in 'vcn_v3_0_enable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:965: warning: Excess function parameter 'adev' description in 'vcn_v3_0_enable_clock_gating'
drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:965: warning: Excess function parameter 'inst' description in 'vcn_v3_0_enable_clock_gating'
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If the HW scheduler hangs and a mode1 reset is used to recover the GPU, KFD
signals user space to abort the processes. After the aborted processes exit,
user queues can still use the GPU to access system memory before the h/w is
reset, while the KFD cleanup worker frees system memory and VRAM.
This is a use-after-free race: KFD allocates and reuses the freed
system memory, and a user queue writes to the same system memory, corrupting
the data structure and crashing the driver.
To fix this race, the KFD cleanup worker terminates user queues, then flushes
the reset_domain wq to wait for any ongoing GPU reset to complete, and then
frees outstanding BOs.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If KFD release_work waits for the GPU reset to finish, there is a WARNING:
possible circular locking dependency detected
#2 kfd_create_process
kfd_process_mutex
flush kfd release work
#1 kfd release work
wait for amdgpu reset work
#0 amdgpu_device_gpu_reset
kgd2kfd_pre_reset
kfd_process_mutex
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock((work_completion)(&p->release_work));
lock((wq_completion)kfd_process_wq);
lock((work_completion)(&p->release_work));
lock((wq_completion)amdgpu-reset-dev);
To fix this, KFD process creation moves the flush of the release work outside
kfd_process_mutex.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
With the GPU reset-domain worker implemented, the KFD hw_exception worker is
no longer needed; just call amdgpu_amdkfd_gpu_reset directly from
kfd_hws_hang.
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add support for xgmi_v6_4_1 and use it in appropriate places
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add APIs to initialize XGMI speed and width details and to get the max
bandwidth supported. It is assumed that a device only supports the same
generation of XGMI links with uniform width.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Move definitions related to xgmi to amdgpu_xgmi header
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add fan abnormal detection on smu v14.0.2 and smu v14.0.3
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Since kfd now uses pasid values from the graphics driver, the kfd pasid
functions are no longer needed.
Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If queue size is less than minimum, clamp it to minimum to prevent
underflow when writing queue mqd.
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Reviewed-by: Jay Cornwall <jay.cornwall@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
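A small sketch of why the clamp matters, under stated assumptions. `MIN_QUEUE_SIZE_BYTES` and the `log2(dwords) - 1` encoding are illustrative choices for this example, not values taken from the commit; the point is that an unsigned "minus one" on a too-small size wraps around unless the size is clamped first.

```c
#include <stdint.h>

/* Hypothetical minimum; the driver's actual constant is not stated in the
 * commit message. */
#define MIN_QUEUE_SIZE_BYTES 256u

/* Raise any request below the minimum up to the minimum. */
static uint32_t clamp_queue_size(uint32_t requested_bytes)
{
	return requested_bytes < MIN_QUEUE_SIZE_BYTES ?
		MIN_QUEUE_SIZE_BYTES : requested_bytes;
}

/* Illustrative size field: log2(size in dwords) - 1. With an unclamped
 * size below 8 bytes the log2 would be 0 and the subtraction would wrap
 * to 0xffffffff; clamping first keeps the result well defined. */
static uint32_t encode_queue_size(uint32_t requested_bytes)
{
	uint32_t dwords = clamp_queue_size(requested_bytes) / 4;
	uint32_t log2 = 0;

	while (dwords > 1) {
		dwords >>= 1;
		log2++;
	}
	return log2 - 1;
}
```

In this sketch a request of 0 bytes is treated the same as a request of `MIN_QUEUE_SIZE_BYTES`.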
|
|
Prior to the addition of ring reset, the debug option
`debug_disable_soft_recovery` could be used to force a full device
reset. Now that we have ring reset, create a debug option to disable
them in amdgpu, forcing the driver to go with the full device
reset path again when both options are combined.
This option is useful for testing and debugging purposes when one wants
to test the full reset from userspace.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
resource_build_scaling_params
A null pointer dereference could occur when pipe_ctx->plane_state
is null. The fix adds a check to ensure 'pipe_ctx->plane_state' is not
null before accessing it.
Found by code review.
Fixes: 3be5262e353b ("drm/amd/display: Rename more dc_surface stuff to plane_state")
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Ma Ke <make24@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
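The guard described above can be sketched as follows. The struct layouts and the function name `build_scaling_params` are simplified, hypothetical stand-ins for the display code; only the "check plane_state before dereferencing it" pattern reflects the commit.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the real display structures. */
struct plane_state {
	int src_width;
	int src_height;
};

struct pipe_ctx {
	struct plane_state *plane_state;
};

/* Bail out early when there is no plane attached, instead of
 * dereferencing a null plane_state. */
static bool build_scaling_params(const struct pipe_ctx *pipe_ctx)
{
	if (pipe_ctx == NULL || pipe_ctx->plane_state == NULL)
		return false;

	return pipe_ctx->plane_state->src_width > 0 &&
	       pipe_ctx->plane_state->src_height > 0;
}
```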
|
|
Some items were defined in both the general and DC glossaries.
Remove the duplicate entries.
Fixes: 2df30ae0ba0b ("Documentation/gpu: Add acronyms for some firmware components")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Rodrigo Siqueira <siqueira@igalia.com>
Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
They are noops on GFX11 for most firmware versions. KFD already
handles its own queues and they should already be unmapped at this
point so even if this runs, it's not doing anything.
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
There is a spelling mistake and a grammatical error in a dev_err
message. Fix it.
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In the case of a poison inband log, the error type needs to be specified
by checking the deferred or poison bit of the status register.
v2: check both deferred and poison bit
Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
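A hedged sketch of the v2 behaviour (check both bits). The bit positions, the enum, and the function name are hypothetical and chosen only for illustration; the real register layout is not given in the commit message.

```c
#include <stdint.h>

/* Hypothetical bit positions in the status register. */
#define STATUS_POISON_BIT   (1ULL << 43)
#define STATUS_DEFERRED_BIT (1ULL << 44)

/* Hypothetical classification used only in this sketch. */
enum err_kind {
	ERR_KIND_OTHER,
	ERR_KIND_POISON,
};

/* Treat the record as a poison/deferred error if either bit is set. */
static enum err_kind classify_status(uint64_t status)
{
	if (status & (STATUS_DEFERRED_BIT | STATUS_POISON_BIT))
		return ERR_KIND_POISON;
	return ERR_KIND_OTHER;
}
```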
|
|
Add the trap irq processing for page queue of sdma442
Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by and Tested-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Disable gfxoff on the specific sku based on the requirement
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Change the DMESG reporting of unknown errors to "Boot Controller
Generic Error" to align with the RAS SPEC and provide more clarity
to customers.
Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Correct the logic to find supported NPS modes from firmware.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reported-by: Ava Zhang <niandong.zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Fixes: 30eb41f5d1a7 ("drm/amdgpu: Use firmware supported NPS modes")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The fru_id field is disabled because of mismatched definitions
between the CPER spec and the driver.
Signed-off-by: Xiang Liu <xiang.liu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
'svm_range_cpu_invalidate_pagetables'
This commit addresses a circular locking dependency in the
svm_range_cpu_invalidate_pagetables function. The function previously
held a lock while determining whether to perform an unmap or eviction
operation, which could lead to deadlocks.
Fixes the below:
[ 223.418794] ======================================================
[ 223.418820] WARNING: possible circular locking dependency detected
[ 223.418845] 6.12.0-amdstaging-drm-next-lol-050225 #14 Tainted: G U OE
[ 223.418869] ------------------------------------------------------
[ 223.418889] kfdtest/3939 is trying to acquire lock:
[ 223.418906] ffff8957552eae38 (&dqm->lock_hidden){+.+.}-{3:3}, at: evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[ 223.419302]
but task is already holding lock:
[ 223.419303] ffff8957556b83b0 (&prange->lock){+.+.}-{3:3}, at: svm_range_cpu_invalidate_pagetables+0x9d/0x850 [amdgpu]
[ 223.419447] Console: switching to colour dummy device 80x25
[ 223.419477] [IGT] amd_basic: executing
[ 223.419599]
which lock already depends on the new lock.
[ 223.419611]
the existing dependency chain (in reverse order) is:
[ 223.419621]
-> #2 (&prange->lock){+.+.}-{3:3}:
[ 223.419636] __mutex_lock+0x85/0xe20
[ 223.419647] mutex_lock_nested+0x1b/0x30
[ 223.419656] svm_range_validate_and_map+0x2f1/0x15b0 [amdgpu]
[ 223.419954] svm_range_set_attr+0xe8c/0x1710 [amdgpu]
[ 223.420236] svm_ioctl+0x46/0x50 [amdgpu]
[ 223.420503] kfd_ioctl_svm+0x50/0x90 [amdgpu]
[ 223.420763] kfd_ioctl+0x409/0x6d0 [amdgpu]
[ 223.421024] __x64_sys_ioctl+0x95/0xd0
[ 223.421036] x64_sys_call+0x1205/0x20d0
[ 223.421047] do_syscall_64+0x87/0x140
[ 223.421056] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 223.421068]
-> #1 (reservation_ww_class_mutex){+.+.}-{3:3}:
[ 223.421084] __ww_mutex_lock.constprop.0+0xab/0x1560
[ 223.421095] ww_mutex_lock+0x2b/0x90
[ 223.421103] amdgpu_amdkfd_alloc_gtt_mem+0xcc/0x2b0 [amdgpu]
[ 223.421361] add_queue_mes+0x3bc/0x440 [amdgpu]
[ 223.421623] unhalt_cpsch+0x1ae/0x240 [amdgpu]
[ 223.421888] kgd2kfd_start_sched+0x5e/0xd0 [amdgpu]
[ 223.422148] amdgpu_amdkfd_start_sched+0x3d/0x50 [amdgpu]
[ 223.422414] amdgpu_gfx_enforce_isolation_handler+0x132/0x270 [amdgpu]
[ 223.422662] process_one_work+0x21e/0x680
[ 223.422673] worker_thread+0x190/0x330
[ 223.422682] kthread+0xe7/0x120
[ 223.422690] ret_from_fork+0x3c/0x60
[ 223.422699] ret_from_fork_asm+0x1a/0x30
[ 223.422708]
-> #0 (&dqm->lock_hidden){+.+.}-{3:3}:
[ 223.422723] __lock_acquire+0x16f4/0x2810
[ 223.422734] lock_acquire+0xd1/0x300
[ 223.422742] __mutex_lock+0x85/0xe20
[ 223.422751] mutex_lock_nested+0x1b/0x30
[ 223.422760] evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[ 223.423025] kfd_process_evict_queues+0x8a/0x1d0 [amdgpu]
[ 223.423285] kgd2kfd_quiesce_mm+0x43/0x90 [amdgpu]
[ 223.423540] svm_range_cpu_invalidate_pagetables+0x4a7/0x850 [amdgpu]
[ 223.423807] __mmu_notifier_invalidate_range_start+0x1f5/0x250
[ 223.423819] copy_page_range+0x1e94/0x1ea0
[ 223.423829] copy_process+0x172f/0x2ad0
[ 223.423839] kernel_clone+0x9c/0x3f0
[ 223.423847] __do_sys_clone+0x66/0x90
[ 223.423856] __x64_sys_clone+0x25/0x30
[ 223.423864] x64_sys_call+0x1d7c/0x20d0
[ 223.423872] do_syscall_64+0x87/0x140
[ 223.423880] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 223.423891]
other info that might help us debug this:
[ 223.423903] Chain exists of:
&dqm->lock_hidden --> reservation_ww_class_mutex --> &prange->lock
[ 223.423926] Possible unsafe locking scenario:
[ 223.423935] CPU0 CPU1
[ 223.423942] ---- ----
[ 223.423949] lock(&prange->lock);
[ 223.423958] lock(reservation_ww_class_mutex);
[ 223.423970] lock(&prange->lock);
[ 223.423981] lock(&dqm->lock_hidden);
[ 223.423990]
*** DEADLOCK ***
[ 223.423999] 5 locks held by kfdtest/3939:
[ 223.424006] #0: ffffffffb82b4fc0 (dup_mmap_sem){.+.+}-{0:0}, at: copy_process+0x1387/0x2ad0
[ 223.424026] #1: ffff89575eda81b0 (&mm->mmap_lock){++++}-{3:3}, at: copy_process+0x13a8/0x2ad0
[ 223.424046] #2: ffff89575edaf3b0 (&mm->mmap_lock/1){+.+.}-{3:3}, at: copy_process+0x13e4/0x2ad0
[ 223.424066] #3: ffffffffb82e76e0 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}, at: copy_page_range+0x1cea/0x1ea0
[ 223.424088] #4: ffff8957556b83b0 (&prange->lock){+.+.}-{3:3}, at: svm_range_cpu_invalidate_pagetables+0x9d/0x850 [amdgpu]
[ 223.424365]
stack backtrace:
[ 223.424374] CPU: 0 UID: 0 PID: 3939 Comm: kfdtest Tainted: G U OE 6.12.0-amdstaging-drm-next-lol-050225 #14
[ 223.424392] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 223.424401] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS PRO WIFI/X570 AORUS PRO WIFI, BIOS F36a 02/16/2022
[ 223.424416] Call Trace:
[ 223.424423] <TASK>
[ 223.424430] dump_stack_lvl+0x9b/0xf0
[ 223.424441] dump_stack+0x10/0x20
[ 223.424449] print_circular_bug+0x275/0x350
[ 223.424460] check_noncircular+0x157/0x170
[ 223.424469] ? __bfs+0xfd/0x2c0
[ 223.424481] __lock_acquire+0x16f4/0x2810
[ 223.424490] ? srso_return_thunk+0x5/0x5f
[ 223.424505] lock_acquire+0xd1/0x300
[ 223.424514] ? evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[ 223.424783] __mutex_lock+0x85/0xe20
[ 223.424792] ? evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[ 223.425058] ? srso_return_thunk+0x5/0x5f
[ 223.425067] ? mark_held_locks+0x54/0x90
[ 223.425076] ? evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[ 223.425339] ? srso_return_thunk+0x5/0x5f
[ 223.425350] mutex_lock_nested+0x1b/0x30
[ 223.425358] ? mutex_lock_nested+0x1b/0x30
[ 223.425367] evict_process_queues_cpsch+0x43/0x210 [amdgpu]
[ 223.425631] kfd_process_evict_queues+0x8a/0x1d0 [amdgpu]
[ 223.425893] kgd2kfd_quiesce_mm+0x43/0x90 [amdgpu]
[ 223.426156] svm_range_cpu_invalidate_pagetables+0x4a7/0x850 [amdgpu]
[ 223.426423] ? srso_return_thunk+0x5/0x5f
[ 223.426436] __mmu_notifier_invalidate_range_start+0x1f5/0x250
[ 223.426450] copy_page_range+0x1e94/0x1ea0
[ 223.426461] ? srso_return_thunk+0x5/0x5f
[ 223.426474] ? srso_return_thunk+0x5/0x5f
[ 223.426484] ? lock_acquire+0xd1/0x300
[ 223.426494] ? copy_process+0x1718/0x2ad0
[ 223.426502] ? srso_return_thunk+0x5/0x5f
[ 223.426510] ? sched_clock_noinstr+0x9/0x10
[ 223.426519] ? local_clock_noinstr+0xe/0xc0
[ 223.426528] ? copy_process+0x1718/0x2ad0
[ 223.426537] ? srso_return_thunk+0x5/0x5f
[ 223.426550] copy_process+0x172f/0x2ad0
[ 223.426569] kernel_clone+0x9c/0x3f0
[ 223.426577] ? __schedule+0x4c9/0x1b00
[ 223.426586] ? srso_return_thunk+0x5/0x5f
[ 223.426594] ? sched_clock_noinstr+0x9/0x10
[ 223.426602] ? srso_return_thunk+0x5/0x5f
[ 223.426610] ? local_clock_noinstr+0xe/0xc0
[ 223.426619] ? schedule+0x107/0x1a0
[ 223.426629] __do_sys_clone+0x66/0x90
[ 223.426643] __x64_sys_clone+0x25/0x30
[ 223.426652] x64_sys_call+0x1d7c/0x20d0
[ 223.426661] do_syscall_64+0x87/0x140
[ 223.426671] ? srso_return_thunk+0x5/0x5f
[ 223.426679] ? common_nsleep+0x44/0x50
[ 223.426690] ? srso_return_thunk+0x5/0x5f
[ 223.426698] ? trace_hardirqs_off+0x52/0xd0
[ 223.426709] ? srso_return_thunk+0x5/0x5f
[ 223.426717] ? syscall_exit_to_user_mode+0xcc/0x200
[ 223.426727] ? srso_return_thunk+0x5/0x5f
[ 223.426736] ? do_syscall_64+0x93/0x140
[ 223.426748] ? srso_return_thunk+0x5/0x5f
[ 223.426756] ? up_write+0x1c/0x1e0
[ 223.426765] ? srso_return_thunk+0x5/0x5f
[ 223.426775] ? srso_return_thunk+0x5/0x5f
[ 223.426783] ? trace_hardirqs_off+0x52/0xd0
[ 223.426792] ? srso_return_thunk+0x5/0x5f
[ 223.426800] ? syscall_exit_to_user_mode+0xcc/0x200
[ 223.426810] ? srso_return_thunk+0x5/0x5f
[ 223.426818] ? do_syscall_64+0x93/0x140
[ 223.426826] ? syscall_exit_to_user_mode+0xcc/0x200
[ 223.426836] ? srso_return_thunk+0x5/0x5f
[ 223.426844] ? do_syscall_64+0x93/0x140
[ 223.426853] ? srso_return_thunk+0x5/0x5f
[ 223.426861] ? irqentry_exit+0x6b/0x90
[ 223.426869] ? srso_return_thunk+0x5/0x5f
[ 223.426877] ? exc_page_fault+0xa7/0x2c0
[ 223.426888] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 223.426898] RIP: 0033:0x7f46758eab57
[ 223.426906] Code: ba 04 00 f3 0f 1e fa 64 48 8b 04 25 10 00 00 00 45 31 c0 31 d2 31 f6 bf 11 00 20 01 4c 8d 90 d0 02 00 00 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 41 89 c0 85 c0 75 2c 64 48 8b 04 25 10 00
[ 223.426930] RSP: 002b:00007fff5c3e5188 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[ 223.426943] RAX: ffffffffffffffda RBX: 00007f4675f8c040 RCX: 00007f46758eab57
[ 223.426954] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[ 223.426965] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 223.426975] R10: 00007f4675e81a50 R11: 0000000000000246 R12: 0000000000000001
[ 223.426986] R13: 00007fff5c3e5470 R14: 00007fff5c3e53e0 R15: 00007fff5c3e5410
[ 223.427004] </TASK>
v2: To resolve this issue, the allocation of the process context buffer
(`proc_ctx_bo`) has been moved from the `add_queue_mes` function to the
`pqm_create_queue` function. This change ensures that the buffer is
allocated only when the first queue for a process is created and only if
the Micro Engine Scheduler (MES) is enabled. (Felix)
v3: Fix typo s/Memory Execution Scheduler (MES)/Micro Engine Scheduler
in commit message. (Lijo)
Fixes: 438b39ac74e2 ("drm/amdkfd: pause autosuspend when creating pdd")
Cc: Jesse Zhang <jesse.zhang@amd.com>
Cc: Yunxiang Li <Yunxiang.Li@amd.com>
Cc: Philip Yang <Philip.Yang@amd.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
AMDISP GPIO control uses a dedicated pinctrl driver,
and requires MFD hotadd GPIO resources.
Co-developed-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Signed-off-by: Benjamin Chan <benjamin.chan@amd.com>
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
They are noops on GFX12. There is no suspend/resume all support
in firmware so the function doesn't do anything. KFD already
handles its own queues and they should already be unmapped at this
point so even if this runs, it's not doing anything.
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The last use of optc3_fpu_set_vrr_m_const() was removed in 2022's
commit 64f991590ff4 ("drm/amd/display: Fix a compilation failure on PowerPC
caused by FPU code")
which removed the only caller (with a similar name).
Remove it.
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
DRM_ERROR() is no longer preferred. Replace DRM_ERROR() usage
with drm_err() in isp driver.
Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
All dce command_table_helpers share a collection of copy-pasted
functions: phy_id_to_atom, clock_source_id_to_atom_phy_clk_src_id,
and engine_bp_to_atom.
This patch removes the duplicated copies by moving them to
command_table_helper.c and making the command_table_helpers
call the functions implemented in command_table_helper.c
instead.
The changes were not tested on actual hardware. I am only able
to verify that the changes keep the code compilable, and I did my
best to check repeatedly that I am not actually changing any code.
This is version 4 of the patch. It addresses the review comments about
the licence in the new files and about the From email matching the
Signed-off-by email, as well as the comments about using
command_table_helper instead of creating a dce_common.
Signed-off-by: Luan Icaro Pinto Arcanjo <luanicaro@usp.br>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
To properly handle multiple GPUs.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If each instance uses the same fw image, only store one
copy in the driver.
Acked-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No need for an IP specific version.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No need for an IP specific version.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No need for an IP specific version.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No need for an IP specific version.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No need for an IP specific version.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No need for an IP specific version.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No need for an IP specific version.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No need for an IP specific version.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
No need for an IP specific version.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
It's common for all VCN variants.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use the vcn instance power gating callbacks rather than
the IP powergating callback. This limits power gating to
only the instance in use rather than all of the instances.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Rework the code as a vcn instance callback.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Rework the code as a vcn instance callback.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Rework the code as a vcn instance callback.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Rework the code as a vcn instance callback.
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|