Age | Commit message (Collapse) | Author |
|
Don't fetch it again if we already have it. It seems the
registers don't reliably have the value at resume in some
cases.
Fixes: 785f0f9fe742 ("drm/amdgpu: Add mes v12_0 ip block support (v4)")
Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 9e7b08d239c2f21e8f417854f81e5ff40edbebff)
Cc: stable@vger.kernel.org # 6.12.x
|
|
This is normally handled in the gfx IP suspend callbacks, but
for S0ix, those are skipped because we don't want to touch
gfx. So handle it in device suspend.
Fixes: b9467983b774 ("drm/amdgpu: add dynamic workload profile switching for gfx10")
Fixes: 963537ca2325 ("drm/amdgpu: add dynamic workload profile switching for gfx11")
Fixes: 5f95a1549555 ("drm/amdgpu: add dynamic workload profile switching for gfx12")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 906ad451675155380c1dc1881a244ebde8e8df0a)
Cc: stable@vger.kernel.org
|
|
When the parameter is set, disable user submissions
to kernel queues.
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When the parameter is set, disable user submissions
to kernel queues.
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
For SDMA, we still need kernel queues for paging so
they need to be initialized, but we no not want to
accept submissions from userspace when disable_kq
is set.
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Plumb in support for disabling kernel queues.
v2: use ring counts per Felix' suggestion
v3: fix stream fault handler, enable EOP interrupts
v4: fix MEC interrupt offset (Sunil)
v5: clean up after removing extra sched.ready settings
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Plumb in support for disabling kernel queues in
GFX11. We have to bring up a GFX queue briefly in
order to initialize the clear state. After that
we can disable it.
v2: use ring counts per Felix' suggestion
v3: fix stream fault handler, enable EOP interrupts
v4: fix MEC interrupt offset (Sunil)
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
If we don't have kernel queues, the vmids can be used by
the MES for user queues.
Acked-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Make all resources available to user queues.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Suggested-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add proper checks for disable_kq functionality in
gfx helper functions. Add special logic for families
that require the clear state setup.
v2: use ring count as per Felix suggestion
v3: fix num_gfx_rings handling in amdgpu_gfx_graphics_queue_acquire()
v4: fix error code (Alex)
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This would be set by IPs which only accept submissions
from the kernel, not userspace, such as when kernel
queues are disabled. Don't expose the rings to userspace
and reject any submissions in the CS IOCTL.
v2: fix error code (Alex)
Reviewed-by: Sunil Khatri<sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
On chips that support user queues, setting this option
will disable kernel queues to be used to validate
user queues without kernel queues.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Similar to KFD, prevent runtime pm while user queues are active.
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
So we can iterate across them when we need to manage
all user queues.
v2: add uq_mgr to adev list in amdgpu_userq_mgr_init
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add the user queue IP support query to the drm_amdgpu_info_device
query.
Cc: marek.olsak@amd.com
Cc: prike.liang@amd.com
Cc: sunil.khatri@amd.com
Cc: yogesh.mohanmarimuthu@amd.com
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add an INFO query to check if user queues are supported.
v2: switch to a mask of IPs (Marek)
v3: move to drm_amdgpu_info_device (Marek)
Cc: marek.olsak@amd.com
Cc: prike.liang@amd.com
Cc: sunil.khatri@amd.com
Cc: yogesh.mohanmarimuthu@amd.com
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add a separate switch statement for the userq callback
assignment so that we can assign the callbacks for each
asic as the firmware becomes available.
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
With the ME details fixed, we can now consolidate
this state. Also split out the userq setup into a separate
switch statement so that we can set them per IP version
when the firmwares are ready.
Reviewed-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The display is freezing because the amdgpu_userq_wait_ioctl()
is waiting for a non-user queue fence(specifically, the PT update fence).
RootCause:
The resume_work is initiated by both amdgpu_userq_suspend and
amdgpu_userqueue_ensure_ev_fence at same time. The amdgpu_userq_suspend
signals a dma-fence and subsequently triggers the resume_work, which is
intended to replace the existing fence by creating new dma-fence. However,
following this, the amdgpu_userqueue_ensure_ev_fence schedules another
resume_work that generates a new dma-fence, thereby replacing the one
created by amdgpu_userq_suspend. Consequently, the original fence will
never be signaled.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Warn if the number of pipes exceeds what the MES supports.
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Move it to amdgpu_mes to align with the compute and
sdma hqd masks. No functional change.
v2: rebase on new changes
v3: misc optimizations
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Sunil Khatri<sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This was leftover from MES bring up when we had MES
user queues in the kernel. It's no longer used so
remove it.
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Leftover from the MES self tests that were removed previously.
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Make sure these are set properly to ensure compatibility if
we ever update the IOCTL interface.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Encountering a taint issue during the unloading of gpu_sched
due to the fence not being released/put. In this context,
amdgpu_vm_clear_freed is responsible for creating a job to
update the page table (PT). It allocates kmem_cache for
drm_sched_fence and returns the finished fence associated
with job->base.s_fence. In case of Usermode queue this finished
fence is added to the timeline sync object through
amdgpu_gem_update_bo_mapping, which is utilized by user
space to ensure the completion of the PT update.
[ 508.900587] =============================================================================
[ 508.900605] BUG drm_sched_fence (Tainted: G N): Objects remaining in drm_sched_fence on __kmem_cache_shutdown()
[ 508.900617] -----------------------------------------------------------------------------
[ 508.900627] Slab 0xffffe0cc04548780 objects=32 used=2 fp=0xffff8ea81521f000 flags=0x17ffffc0000240(workingset|head|node=0|zone=2|lastcpupid=0x1fffff)
[ 508.900645] CPU: 3 UID: 0 PID: 2337 Comm: rmmod Tainted: G N 6.12.0+ #1
[ 508.900651] Tainted: [N]=TEST
[ 508.900653] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ELITE/X570 AORUS ELITE, BIOS F34 06/10/2021
[ 508.900656] Call Trace:
[ 508.900659] <TASK>
[ 508.900665] dump_stack_lvl+0x70/0x90
[ 508.900674] dump_stack+0x14/0x20
[ 508.900678] slab_err+0xcb/0x110
[ 508.900687] ? srso_return_thunk+0x5/0x5f
[ 508.900692] ? try_to_grab_pending+0xd3/0x1d0
[ 508.900697] ? srso_return_thunk+0x5/0x5f
[ 508.900701] ? mutex_lock+0x17/0x50
[ 508.900708] __kmem_cache_shutdown+0x144/0x2d0
[ 508.900713] ? flush_rcu_work+0x50/0x60
[ 508.900719] kmem_cache_destroy+0x46/0x1f0
[ 508.900728] drm_sched_fence_slab_fini+0x19/0x970 [gpu_sched]
[ 508.900736] __do_sys_delete_module.constprop.0+0x184/0x320
[ 508.900744] ? srso_return_thunk+0x5/0x5f
[ 508.900747] ? debug_smp_processor_id+0x1b/0x30
[ 508.900754] __x64_sys_delete_module+0x16/0x20
[ 508.900758] x64_sys_call+0xdf/0x20d0
[ 508.900763] do_syscall_64+0x51/0x120
[ 508.900769] entry_SYSCALL_64_after_hwframe+0x76/0x7e
v2: call dma_fence_put in amdgpu_gem_va_update_vm
v3: Addressed review comments from Christian.
- calling amdgpu_gem_update_timeline_node before switch.
- puting a dma_fence in case of error or !timeline_syncobj.
v4: Addressed review comments from Christian.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Cc: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Le Ma <le.ma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
To align with other headers.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This can be enabled now. We have the firmware checks
in place.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Currently disabled until the firmwares are officially
released.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
s/CONFIG_DRM_AMD_USERQ_GFX/CONFIG_DRM_AMDGPU_NAVI3X_USERQ/
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The feature is not navi3x specific at this point.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
I'd swear this was already fixed, but I guess the patch never
landed. Add it now.
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Take a reference when we create a queue and drop it
when we destroy the queue. We need to keep the device
active while user queues are active.
v2: squash in fix from Sunil
v3: squash in fix from Prike
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Use the IP type to look up the userq functions rather
than hardcoding it.
Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
A deadlock situation has arised between the userq
signal ioctl and the eviction fence. In this scenario,
the function amdgpu_userq_signal_ioctl() has acquired a reservation
lock on the read/write buffer object (BO) through drm_exec.
Subsequently, it calls amdgpu_userqueue_ensure_ev_fence(),
which is in a waiting for the userq resume work.
Meanwhile, the userq suspend worker has initiated the userq resume
work(amdgpu_userqueue_resume_worker). This userq resume work attempts
to validate the vm->done BO, leading to amdgpu_userqueue_validate_bos
also attempting to reservation lock the same write BO that is already
locked by amdgpu_userq_signal_ioctl.
As a result, the resume work becomes stalled, causing
amdgpu_userqueue_ensure_ev_fence to remain in a waiting state.
Call Trace:
[ 242.836469] INFO: task gnome-shel:cs0:1288 blocked for more than 120 seconds.
[ 242.836486] Tainted: G OE 6.12.0-rc2rebased-oct-24+ #4
[ 242.836491] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.836494] task:gnome-shel:cs0 state:D stack:0 pid:1288 tgid:1282 ppid:1180 flags:0x00000002
[ 242.836503] Call Trace:
[ 242.836508] <TASK>
[ 242.836517] __schedule+0x3e0/0xb10
[ 242.836530] ? srso_return_thunk+0x5/0x5f
[ 242.836541] schedule+0x31/0x120
[ 242.836546] schedule_timeout+0x150/0x160
[ 242.836551] ? srso_return_thunk+0x5/0x5f
[ 242.836555] ? sysvec_call_function+0x69/0xd0
[ 242.836562] ? srso_return_thunk+0x5/0x5f
[ 242.836567] ? preempt_count_add+0x7f/0xd0
[ 242.836577] __wait_for_common+0x91/0x180
[ 242.836582] ? __pfx_schedule_timeout+0x10/0x10
[ 242.836590] wait_for_completion+0x28/0x30
[ 242.836595] __flush_work+0x16c/0x290
[ 242.836602] ? __pfx_wq_barrier_func+0x10/0x10
[ 242.836611] flush_delayed_work+0x3a/0x60
[ 242.836621] amdgpu_userqueue_ensure_ev_fence+0x2d/0xb0 [amdgpu]
[ 242.836966] amdgpu_userq_signal_ioctl+0x959/0xec0 [amdgpu]
[ 242.837171] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ 242.837365] drm_ioctl_kernel+0xae/0x100 [drm]
[ 242.837398] drm_ioctl+0x2a1/0x500 [drm]
[ 242.837420] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ 242.837622] ? srso_return_thunk+0x5/0x5f
[ 242.837627] ? srso_return_thunk+0x5/0x5f
[ 242.837630] ? _raw_spin_unlock_irqrestore+0x2b/0x50
[ 242.837635] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[ 242.837811] __x64_sys_ioctl+0x99/0xd0
[ 242.837820] x64_sys_call+0x1209/0x20d0
[ 242.837825] do_syscall_64+0x51/0x120
[ 242.837830] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 242.837835] RIP: 0033:0x7f2f33f1a94f
[ 242.837838] RSP: 002b:00007f2f24ffea30 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 242.837842] RAX: ffffffffffffffda RBX: 00007f2f24ffebd0 RCX: 00007f2f33f1a94f
[ 242.837845] RDX: 00007f2f24ffebd0 RSI: 00000000c0306457 RDI: 000000000000000d
[ 242.837847] RBP: 00007f2f24ffeab0 R08: 0000000000000000 R09: 0000000000000000
[ 242.837849] R10: 00007f2f24ffecd0 R11: 0000000000000246 R12: 00007f2f25000640
[ 242.837851] R13: 00000000c0306457 R14: 000000000000000d R15: 00007fff3b39c1e0
[ 242.837858] </TASK>
[ 242.837865] INFO: task Xwayland:cs0:1517 blocked for more than 120 seconds.
[ 242.837869] Tainted: G OE 6.12.0-rc2rebased-oct-24+ #4
[ 242.837872] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 242.837874] task:Xwayland:cs0 state:D stack:0 pid:1517 tgid:1338 ppid:1282 flags:0x00004002
[ 242.837878] Call Trace:
[ 242.837880] <TASK>
[ 242.837883] __schedule+0x3e0/0xb10
[ 242.837890] schedule+0x31/0x120
[ 242.837894] schedule_preempt_disabled+0x1c/0x30
[ 242.837897] __mutex_lock.constprop.0+0x386/0x6e0
[ 242.837902] ? srso_return_thunk+0x5/0x5f
[ 242.837905] ? __timer_delete_sync+0x81/0xe0
[ 242.837911] __mutex_lock_slowpath+0x13/0x20
[ 242.837915] mutex_lock+0x3b/0x50
[ 242.837919] amdgpu_userqueue_ensure_ev_fence+0x35/0xb0 [amdgpu]
[ 242.838138] amdgpu_userq_signal_ioctl+0x959/0xec0 [amdgpu]
[ 242.838340] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ 242.838531] drm_ioctl_kernel+0xae/0x100 [drm]
[ 242.838559] drm_ioctl+0x2a1/0x500 [drm]
[ 242.838580] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ 242.838778] ? srso_return_thunk+0x5/0x5f
[ 242.838783] ? srso_return_thunk+0x5/0x5f
[ 242.838786] ? _raw_spin_unlock_irqrestore+0x2b/0x50
[ 242.838791] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[ 242.838967] __x64_sys_ioctl+0x99/0xd0
[ 242.838972] x64_sys_call+0x1209/0x20d0
[ 242.838975] do_syscall_64+0x51/0x120
[ 242.838979] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 242.838982] RIP: 0033:0x7f9118b1a94f
[ 242.838985] RSP: 002b:00007f910cdff760 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 242.838989] RAX: ffffffffffffffda RBX: 00007f910cdff910 RCX: 00007f9118b1a94f
[ 242.838991] RDX: 00007f910cdff910 RSI: 00000000c0306457 RDI: 000000000000000c
[ 242.838993] RBP: 00007f910cdff7e0 R08: 0000000000000000 R09: 0000000000000001
[ 242.838995] R10: 00007f910cdff9d4 R11: 0000000000000246 R12: 00007f910ce00640
[ 242.838997] R13: 00000000c0306457 R14: 000000000000000c R15: 00007fff9dd11d10
[ 242.839004] </TASK>
v2: Addressed review comemnts from Christian.
v3/v4: Addressed review comemnts from Christian.
- Move drm_exec drm_exec loop after userq fence create.
- cleanup the newly created userq fence in case of error.
v5 - Addressed review comemnts from Christian.
- Create a new amdgpu_userq_fence_alloc() function for allocation.
- Calling dma_fence_put for cleanup procedure.
- make amdgpu_userq_fence_create() function static.
- drm_exec_init is called after mutex_unlock.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Shashank Sharma <shashank.sharma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The seq64 VM cache policy should be set to UC (Uncached) to
match with userqueue fence address kernel mapped memory's
cache settings.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix out-of-bounds issue in userq fence create when
accessing the userq xa structure. Added a lock to
protect the race condition.
v2:(Christian)
- Allocate memory with GFP_ATOMIC.
v3:
- Moved to 2 xa approach.
v4:(Christian)
- Lock the xa_for_each blocks and memory allocation part
as well to make sure that xa is not modified in between
the 2 xa_for_each blocks.
BUG: KASAN: slab-out-of-bounds in amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[ +0.000006] Call Trace:
[ +0.000005] <TASK>
[ +0.000005] dump_stack_lvl+0x6c/0x90
[ +0.000011] print_report+0xc4/0x5e0
[ +0.000009] ? srso_return_thunk+0x5/0x5f
[ +0.000008] ? kasan_complete_mode_report_info+0x26/0x1d0
[ +0.000007] ? amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[ +0.000405] kasan_report+0xdf/0x120
[ +0.000009] ? amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[ +0.000405] __asan_report_store8_noabort+0x17/0x20
[ +0.000007] amdgpu_userq_fence_create+0x726/0x880 [amdgpu]
[ +0.000406] ? __pfx_amdgpu_userq_fence_create+0x10/0x10 [amdgpu]
[ +0.000408] ? srso_return_thunk+0x5/0x5f
[ +0.000008] ? ttm_resource_move_to_lru_tail+0x235/0x4f0 [ttm]
[ +0.000013] ? srso_return_thunk+0x5/0x5f
[ +0.000008] amdgpu_userq_signal_ioctl+0xd29/0x1c70 [amdgpu]
[ +0.000412] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ +0.000404] ? try_to_wake_up+0x165/0x1840
[ +0.000010] ? __pfx_futex_wake_mark+0x10/0x10
[ +0.000011] drm_ioctl_kernel+0x178/0x2f0 [drm]
[ +0.000050] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ +0.000404] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm]
[ +0.000043] ? __kasan_check_read+0x11/0x20
[ +0.000007] ? srso_return_thunk+0x5/0x5f
[ +0.000007] ? __kasan_check_write+0x14/0x20
[ +0.000008] drm_ioctl+0x513/0xd20 [drm]
[ +0.000040] ? __pfx_amdgpu_userq_signal_ioctl+0x10/0x10 [amdgpu]
[ +0.000407] ? __pfx_drm_ioctl+0x10/0x10 [drm]
[ +0.000044] ? srso_return_thunk+0x5/0x5f
[ +0.000007] ? _raw_spin_lock_irqsave+0x99/0x100
[ +0.000007] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ +0.000006] ? __rseq_handle_notify_resume+0x188/0xc30
[ +0.000008] ? srso_return_thunk+0x5/0x5f
[ +0.000008] ? srso_return_thunk+0x5/0x5f
[ +0.000006] ? _raw_spin_unlock_irqrestore+0x27/0x50
[ +0.000010] amdgpu_drm_ioctl+0xcd/0x1d0 [amdgpu]
[ +0.000388] __x64_sys_ioctl+0x135/0x1b0
[ +0.000009] x64_sys_call+0x1205/0x20d0
[ +0.000007] do_syscall_64+0x4d/0x120
[ +0.000008] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000007] RIP: 0033:0x7f7c3d31a94f
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
VCN and VPE have different offset range, update the doorbell
offset range repsectively.
Doorbell size for VCN and VPE is 32bit.
v1 : add gfx switch case and fix checkpatch warnings (Shashank)
Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Introduce db_info structure to the populate the doorbell
information that is required to be mapped.
Made changes to the doorbell mapping func more generic,
by taking parameters that vary based on IPs and/or usecase
into db_info structure.
v2 - Fix space alignment and checkpatch warnings(Shashank)
Signed-off-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
That needs to be done after grabbing the lock, not before.
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
When applications closes, it triggers the drm_file_free
function which subsequently releases all allocated buffer
objects. Concurrently, the resume_worker thread will attempt
to map the usermode queue. However, since the wptr buffer
object has already been deallocated, this will result in
an Illegal opcode error being raised in the command stream.
Now replacing drm_release() with a new function
amdgpu_drm_release(). This function will set the flag to
prevent the scheduling of any new queue resume/map, stop
all queues and then call drm_release().
V2:
- Replace drm_release with amdgpu_drm_release(Christian).
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Apply sign extension to seq64 va address.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Modify the MES process va end limit to max pfn.
Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The xarray pointer which has the userqueue xarray structure
reference should be cleared when the userqueue gets
destroyed. Otherwise, we may access the freed xa memory and
see the below warnings.
warning 1:
BUG: KASAN: slab-use-after-free in _raw_spin_lock+0x7a/0xe0
[ +0.000044] Call Trace:
[ +0.000017] <TASK>
[ +0.000016] dump_stack_lvl+0x6c/0x90
[ +0.000025] print_report+0xc4/0x5e0
[ +0.000025] ? srso_return_thunk+0x5/0x5f
[ +0.000024] ? kasan_complete_mode_report_info+0x60/0x1d0
[ +0.000030] ? _raw_spin_lock+0x7a/0xe0
[ +0.000023] kasan_report+0xdf/0x120
[ +0.000023] ? _raw_spin_lock+0x7a/0xe0
[ +0.000025] kasan_check_range+0xf7/0x1b0
[ +0.000025] __kasan_check_write+0x14/0x20
[ +0.000024] _raw_spin_lock+0x7a/0xe0
[ +0.000023] ? __pfx__raw_spin_lock+0x10/0x10
[ +0.000024] ? amdgpu_userq_wait_ioctl+0xac0/0x1f30 [amdgpu]
[ +0.000442] amdgpu_userq_wait_ioctl+0x18fc/0x1f30 [amdgpu]
[ +0.000428] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[ +0.000424] ? __pfx_idr_alloc_u32+0x10/0x10
[ +0.000027] ? srso_return_thunk+0x5/0x5f
[ +0.000024] ? __kasan_check_write+0x14/0x20
[ +0.000025] ? srso_return_thunk+0x5/0x5f
[ +0.000024] ? idr_alloc+0x72/0xc0
[ +0.000023] ? srso_return_thunk+0x5/0x5f
[ +0.000023] ? fput+0x1c/0x2f0
[ +0.000025] drm_ioctl_kernel+0x178/0x2f0 [drm]
[ +0.000065] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[ +0.000425] ? __pfx_drm_ioctl_kernel+0x10/0x10 [drm]
[ +0.000064] ? srso_return_thunk+0x5/0x5f
[ +0.000023] ? __kasan_check_write+0x14/0x20
[ +0.000025] drm_ioctl+0x513/0xd20 [drm]
[ +0.000058] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[ +0.000428] ? __pfx_drm_ioctl+0x10/0x10 [drm]
[ +0.000061] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ +0.000027] ? __count_memcg_events+0x11f/0x3a0
[ +0.000027] ? srso_return_thunk+0x5/0x5f
[ +0.001040] ? srso_return_thunk+0x5/0x5f
[ +0.000969] ? _raw_spin_unlock_irqrestore+0x27/0x50
[ +0.000966] amdgpu_drm_ioctl+0xcd/0x1d0 [amdgpu]
[ +0.001352] __x64_sys_ioctl+0x135/0x1b0
[ +0.000966] x64_sys_call+0x1205/0x20d0
[ +0.000968] do_syscall_64+0x4d/0x120
[ +0.000960] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ +0.000962] RIP: 0033:0x7f42af11a94f
warning 2:
WARNING: at lib/xarray.c:1849 __xa_alloc+0x13a/0x150
[ 366.491409] RIP: 0010:__xa_alloc+0x13a/0x150
[ 366.491434] Call Trace:
[ 366.491437] <TASK>
[ 366.491440] ? show_regs+0x6d/0x80
[ 366.491445] ? __warn+0x91/0x140
[ 366.491450] ? __xa_alloc+0x13a/0x150
[ 366.491453] ? report_bug+0x1c9/0x1e0
[ 366.491459] ? handle_bug+0x63/0xa0
[ 366.491463] ? exc_invalid_op+0x1d/0x80
[ 366.491467] ? asm_exc_invalid_op+0x1f/0x30
[ 366.491476] ? __xa_alloc+0x13a/0x150
[ 366.491484] amdgpu_userq_wait_ioctl+0xe0e/0xfe0 [amdgpu]
[ 366.491743] ? idr_alloc_u32+0x97/0xd0
[ 366.491749] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[ 366.491912] drm_ioctl_kernel+0xae/0x100 [drm]
[ 366.491942] drm_ioctl+0x2a1/0x500 [drm]
[ 366.491961] ? __pfx_amdgpu_userq_wait_ioctl+0x10/0x10 [amdgpu]
[ 366.492127] ? srso_return_thunk+0x5/0x5f
[ 366.492132] ? srso_return_thunk+0x5/0x5f
[ 366.492135] ? _raw_spin_unlock_irqrestore+0x2b/0x50
[ 366.492139] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
[ 366.492288] __x64_sys_ioctl+0x99/0xd0
[ 366.492295] x64_sys_call+0x1209/0x20d0
[ 366.492299] do_syscall_64+0x51/0x120
[ 366.492303] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 366.492418] RIP: 0033:0x7f86f3b1a94f
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add the correct fences count variable [num_fences] in the fences
array iteration to handle the userq / non-userq fences.
v2:(Christian)
- All fences in the array either come from some reservation object
or drm_syncobj. If any of those are NULL then there is a bug
somewhere else.
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add mqd for userq compute queue for gfx11/gfx12
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch enables attachment and detachment of eviction fences.
This is just a fork of eviction fence enabling code from the first
patch of the series so that the CI testing can happen on fully
fledged code.
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The basic idea in this redesign is to add an eviction fence only in UQ
resume path. When userqueue is not present, keep ev_fence as NULL
Main changes are:
- do not create the eviction fence during evf_mgr_init, keeping
evf_mgr->ev_fence=NULL until UQ get active.
- do not replace the ev_fence in evf_resume path, but replace it only in
uq_resume path, so remove all the unnecessary code from ev_fence_resume.
- add a new helper function (amdgpu_userqueue_ensure_ev_fence) which
will do the following:
- flush any pending uq_resume work, so that it could create an
eviction_fence
- if there is no pending uq_resume_work, add a uq_resume work and
wait for it to execute so that we always have a valid ev_fence
- call this helper function from two places, to ensure we have a valid
ev_fence:
- when a new uq is created
- when a new uq completion fence is created
v2: Worked on review comments by Christian.
v3: Addressed few more review comments by Christian.
v4: Move mutex lock outside of the amdgpu_userqueue_suspend()
function (Christian).
v5: squash in build fix (Alex)
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arvind Yadav <arvind.yadav@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
- Add a field in struct amdgpu_mqd_prop for userqueue
secure sem fence address since now we have a generic
file for mes_userqueue.c
- Add secure sem fence address mqd support to gfx12 into
their corresponding init functions.
- Enable secure semaphore IRQ handling
V2: Address review comment from Alex:
Use fence_address instead of fenceaddress (Shashank)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
This patch enables Usermode queue support across GFX, Compute
and SDMA IPs on GFX12/SDMA7. It typically reuses Navi3X userqueue
IP functions to create and destroy MQDs.
v2: rebase on proposed changes (Alex)
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Arvind Yadav <arvind.yadav@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Now that all of the IP specific code has been moved into
the IP specific functions, we can make this code generic.
V2: Fixed build errors and porting logics (Shashank)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|