summaryrefslogtreecommitdiff
path: root/drivers/gpu/drm/amd/amdkfd/kfd_interrupt.c
AgeCommit message (Collapse)Author
2024-12-18drm/amdgpu: Show warning message if IH ring overflowPhilip Yang
If IH primary ring and KFD ih fifo overflows, we may miss CP, SDMA interrupts and cause application soft hang. Show warning message with ring name if overflow happens. Add function to get ih ring name to avoid duplicating it. To keep warning message consistent between GPU generations, change all *_ih.c except ASICs older than Vega which has only one ih ring. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdkfd: Queue interrupt work to different CPUPhilip Yang
For CPX mode, each KFD node has interrupt worker to process ih_fifo to send events to user space. Currently all interrupt workers of same adev queue to same CPU, all workers execution are actually serialized and this cause KFD ih_fifo overflow when CPU usage is high. Use per-GPU unbounded highpri queue with number of workers equals to number of partitions, let queue_work select the next CPU round robin among the local CPUs of same NUMA. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdgpu: Optimize gfx v9 GPU page fault handlingPhilip Yang
After GPU page fault, there are lots of page fault interrupts generated at short period even with CAM filter enabled because the fault address is different. Each page fault copy to KFD ih fifo to send event to user space by KFD interrupt worker, this could cause KFD ih fifo overflow while other processes generate events at same time. KFD process is aborted after GPU page fault, we only need one GPU page fault interrupt sent to KFD ih fifo to send memory exception event to user space. Incease KFD ih fifo size to 2 times of IH primary ring size, to handle the burst events case. This patch handle the gfx v9 path, cover retry on/off and CAM filter on/off cases. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-12-18drm/amdkfd: KFD interrupt access ih_fifo data in-placePhilip Yang
To handle 40000 to 80000 interrupts per second running CPX mode with 4 streams/queues per KFD node, KFD interrupt handler becomes the performance bottleneck. Remove the kfifo_out memcpy overhead by accessing ih_fifo data in-place and updating rptr with kfifo_skip_count. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-03-22drm/amdkfd: Cleanup workqueue during module unloadMukul Joshi
Destroy the high priority workqueue that handles interrupts during KFD node cleanup. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-06-09drm/amdkfd: Introduce kfd_node struct (v5)Mukul Joshi
Introduce a new structure, kfd_node, which will now represent a compute node. kfd_node is carved out of kfd_dev structure. kfd_dev struct now will become the parent of kfd_node, and will store common resources such as doorbells, GTT sub-alloctor etc. kfd_node struct will store all resources specific to a compute node, such as device queue manager, interrupt handling etc. This is the first step in adding compute partition support in KFD. v2: introduce kfd_node struct to gc v11 (Hawking) v3: make reference to kfd_dev struct through kfd_node (Morris) v4: use kfd_node instead for kfd isr/mqd functions (Morris) v5: rebase (Alex) Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Morris Zhang <Shiwu.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-07-29drm/amdkfd: use time_is_before_jiffies(a + b) to replace "jiffies - a > b"Yu Zhe
time_is_before_jiffies deals with timer wrapping correctly. Signed-off-by: Yu Zhe <yuzhe@nfschina.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-04-07drm/amdkfd: Improve concurrency of event handlingFelix Kuehling
Use rcu_read_lock to read p->event_idr concurrently with other readers and writers. Use p->event_mutex only for creating and destroying events and in kfd_wait_on_events. Protect the contents of the kfd_event structure with a per-event spinlock that can be taken inside the rcu_read_lock critical section. This eliminates contention of p->event_mutex in set_event, which tends to be on the critical path for dispatch latency even when busy waiting is used. It also eliminates lock contention in event interrupt handlers. Since the p->event_mutex is now used much less, the impact of requiring it in kfd_wait_on_events should also be much smaller. This should improve event handling latency for processes using multiple GPUs concurrently. v2: Reschedule the worker periodically to avoid soft lockup warnings Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Sean Keely <Sean.Keely@amd.com> # v1 Tested-by: Sanjay Tripathi <sanjay.tripathi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-23drm/amdkfd: Use real device for messagesFelix Kuehling
kfd_chardev() doesn't provide much useful information in dev_... messages on multi-GPU systems because there is only one KFD device, which doesn't correspond to any particular GPU. Use the actual GPU device to indicate the GPU that caused a message. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-22drm/amdkfd: Drop IH ring overflow message to dbgKent Russell
When this was first implemented, overflows weren't expected in regular operations, and tests weren't in place to cause said overflow. Now there are cases where overflows occur with real workloads, but we know that the kernel can handle this robustly, so move the message to a debug message. Signed-off-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2022-02-14drm/amdkfd: update SPDX license headerRajneesh Bhardwaj
Update the SPDX License header for all the KFD files. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-12-01drm/amdkfd: add kfd_device_info_init functionGraham Sider
Initializes kfd->device_info given either asic_type (enum) if GFX version is less than GFX9, or GC IP version if greater. Also takes in vf and the target compiler gfx version. Uses SDMA version to determine num_sdma_queues_per_engine. Convert device_info to a non-pointer member of kfd, change references accordingly. Change unsupported asic condition to only probe f2g, move device_info initialization post-switch. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-10-03drm/amdkfd: fix a potential NULL pointer dereference (v2)Allen Pais
alloc_workqueue is not checked for errors and as a result, a potential NULL dereference could occur. v2 (Felix Kuehling): * Fix compile error (kfifo_free instead of fifo_free) * Return proper error code Signed-off-by: Allen Pais <allen.pais@oracle.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2018-07-11drm/amdkfd: fix zero reading of VMID and PASID for HawaiiLan Xiao
Upon VM Fault, the VMID and PASID written by HW are zeros in Hawaii. Instead of reading from ih_ring_entry, read directly from the registers. This workaround fix the soft hang issues caused by mishandled VM Fault in Hawaii. Signed-off-by: Lan Xiao <Lan.Xiao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2018-04-13drm/amdkfd: Remove vlaLaura Abbott
There's an ongoing effort to remove VLAs[1] from the kernel to eventually turn on -Wvla. Switch to a constant value that covers all hardware. [1] https://lkml.org/lkml/2018/3/7/621 Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Laura Abbott <labbott@redhat.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27drm/amdkfd: use a high priority workqueue for IH workAndres Rodriguez
In systems under heavy load the IH work may experience significant scheduling delays. Under load + system workqueue: Max Latency: 7.023695 ms Avg Latency: 0.263994 ms Under load + high priority workqueue: Max Latency: 1.162568 ms Avg Latency: 0.163213 ms Further work is required to measure the impact of per-cpu settings on IH performance. Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27drm/amdkfd: wait only for IH work on IH exitAndres Rodriguez
We don't need to wait for all work to complete in the IH exit function. We only need to make sure the interrupt_work has finished executing to guarantee that ih_kfifo is no longer in use. Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27drm/amdkfd: increase IH num entries to 8192Andres Rodriguez
A larger buffer will let us accommodate applications with a large amount of semi-simultaneous event signals. Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-10-27drm/amdkfd: use standard kernel kfifo for IHAndres Rodriguez
Replace our implementation of a lockless ring buffer with the standard linux kernel kfifo. We shouldn't maintain our own version of a standard data structure. Signed-off-by: Andres Rodriguez <andres.rodriguez@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2017-08-15drm/amdkfd: Clean up KFD style errors and warnings v2Kent Russell
Using checkpatch.pl -f <file> showed a number of style issues. This patch addresses as many of them as possible. Some long lines have been left for readability, but attempts to minimize them have been made. v2: Broke long lines in gfx_v7 get_fw_version Signed-off-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19drm/amdkfd: Add the events moduleAndrew Lewycky
This patch adds the events module (kfd_events.c) and the interrupt handle module for Kaveri (cik_event_interrupt.c). The patch updates the interrupt_is_wanted(), so that it now calls the interrupt isr function specific for the device that received the interrupt. That function(implemented in cik_event_interrupt.c) returns whether this interrupt is of interest to us or not. The patch also updates the interrupt_wq(), so that it now calls the device's specific wq function, which checks the interrupt source and tries to signal relevant events. v2: Increase limit of signal events to 4096 per process Remove bitfields from struct cik_ih_ring_entry Rename radeon_kfd_event_mmap to kfd_event_mmap Add debug prints to allocate_free_slot and allocate_signal_page Make allocate_event_notification_slot return a correct value Add warning prints to create_signal_event Remove error print from IOCTL path Reformatted debug prints in kfd_event_mmap Map correct size (as received from mmap) in kfd_event_mmap v3: Reduce limit of signal events back to 256 per process Fix allocation of kernel memory for signal events Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-05-19drm/amdkfd: Add interrupt handling moduleAndrew Lewycky
This patch adds the interrupt handling module, kfd_interrupt.c, and its related members in different data structures to the amdkfd driver. The amdkfd interrupt module maintains an internal interrupt ring per amdkfd device. The internal interrupt ring contains interrupts that needs further handling. The extra handling is deferred to a later time through a workqueue. There's no acknowledgment for the interrupts we use. The hardware simply queues a new interrupt each time without waiting. The fixed-size internal queue means that it's possible for us to lose interrupts because we have no back-pressure to the hardware. However, only interrupts that are "wanted" by amdkfd, are copied into the amdkfd s/w interrupt ring, in order to minimize the chances for overflow of the ring. Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
2015-01-08drm/amdkfd: Drop interrupt SW ring bufferMichel Dänzer
The work queue couldn't reliably prevent the SW ring buffer from overflowing, so dmesg was spammed by kfd kfd: Interrupt ring overflow, dropping interrupt. messages when running e.g. the Atlantis Substance demo from https://wiki.unrealengine.com/Linux_Demos on Kaveri. Since the SW ring buffer doesn't actually do anything at this point, just remove it for now. When actual interrupt processing code is added to amdkfd, it should try to do things immediately and only defer to work queues when necessary. Signed-off-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>
2014-07-17amdkfd: Add interrupt handling moduleAndrew Lewycky
This patch adds the interrupt handling module, in kfd_interrupt.c, and its related members in different data structures to the amdkfd driver. The amdkfd interrupt module maintains an internal interrupt ring per amdkfd device. The internal interrupt ring contains interrupts that needs further handling. The extra handling is deferred to a later time through a workqueue. There's no acknowledgment for the interrupts we use. The hardware simply queues a new interrupt each time without waiting. The fixed-size internal queue means that it's possible for us to lose interrupts because we have no back-pressure to the hardware. v3: Move amdkfd from drm/radeon/ to drm/amd/ Change device init Made sure spin lock is taken only if init is complete Moved bool field to the end of the structure Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@amd.com>